S2DS final project presentation: building a recommendation engine for RefME

www.refme.com @getrefme

Building a Library Recommendation Engine

Richard Hanson, Daniel Pape, Martina Pugliese

Upload: martina-pugliese

Post on 13-Aug-2015


TRANSCRIPT


What is RefME?

• award-winning referencing tool
• synchronised website + mobile app
• easy to use
• barcode scanning
• many reference types
• import from several formats
• export in 1000s of styles

Who uses RefME?

• 100 000s of users
• mainly undergraduate students
• writing essays, theses, building bibliographies
• other tools focus on the academic market
• RefME is much more general

Our Task

RefME wants to broaden the appeal of their product by offering a tool to build better bibliographies.

“take the search out of research”

Task: Develop a recommendation system

• suggest relevant references to users
• general enough to handle any reference type

The Data

• small subset of the database
• 21 000 projects with 120 000+ references
• references are user-generated and associated via projects
• pre-processing: sort/unify identifiers (ISBN, DOI, URL) + retrieve subject keywords
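The identifier-unification step in the last bullet can be illustrated with a minimal sketch. The slides do not say how identifiers were cleaned, so every function name and rule below (`normalise_isbn`, `normalise_doi`, `normalise_url`, `reference_key`, the ISBN-then-DOI-then-URL preference) is an assumption made for illustration, and no checksum validation is attempted.

```python
import re
from urllib.parse import urlsplit


def normalise_isbn(raw):
    """Strip spaces and hyphens; accept 10- or 13-character ISBNs (no checksum check)."""
    cleaned = re.sub(r"[\s-]", "", raw).upper()
    if re.fullmatch(r"\d{9}[\dX]", cleaned) or re.fullmatch(r"\d{13}", cleaned):
        return cleaned
    return None


def normalise_doi(raw):
    """Drop any resolver prefix (e.g. https://doi.org/) and lower-case the DOI."""
    match = re.search(r"10\.\d{4,9}/\S+", raw.strip())
    return match.group(0).lower() if match else None


def normalise_url(raw):
    """Reduce a URL to host + path so http/https and www variants compare equal."""
    parts = urlsplit(raw.strip())
    if not parts.netloc:
        return None
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    return host + parts.path.rstrip("/")


def reference_key(ref):
    """Return (kind, value) for the first usable identifier: ISBN, then DOI, then URL."""
    for kind, fn in (("isbn", normalise_isbn), ("doi", normalise_doi), ("url", normalise_url)):
        raw = ref.get(kind)
        if raw:
            value = fn(raw)
            if value:
                return (kind, value)
    return None
```

With a key like this, two user-entered copies of the same book (for example, one with a hyphenated ISBN and one without) would collapse to a single reference ID before any recommendation step runs.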

[Diagram: Recommendation Engine, with labels "Project ID" and "Reference IDs"]

Our Recommendation Engine

• Collaborative Filtering: “ask your friends”
• Content-based Recommendations: “look at content information”


Collaborative Filtering

• Project Based: similar projects, nearest neighbours
• Reference Based: interaction patterns, co-occurrence
• multiple similarity measures for robust results
• output: recommendations
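As a rough illustration of the project-based branch, the sketch below finds the nearest-neighbour projects by Jaccard similarity over their reference sets and lets those neighbours vote for references the target project lacks. The slides mention "multiple similarity measures" without naming them, so the choice of Jaccard, the function names and the parameters `k` and `n` are assumptions.

```python
from collections import Counter


def jaccard(a, b):
    """Jaccard similarity between two sets of reference IDs."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend_from_neighbours(target_id, projects, k=2, n=3):
    """projects maps project ID -> set of reference IDs.

    Take the k most similar projects and rank the references they contain
    (and the target does not) by summed neighbour similarity.
    """
    target = projects[target_id]
    scored = sorted(
        ((jaccard(target, refs), pid) for pid, refs in projects.items() if pid != target_id),
        reverse=True,
    )
    neighbours = [(sim, pid) for sim, pid in scored[:k] if sim > 0]
    votes = Counter()
    for sim, pid in neighbours:
        for ref in projects[pid] - target:
            votes[ref] += sim
    # Sort by descending score, then by ID so ties are deterministic.
    ranked = sorted(votes.items(), key=lambda kv: (-kv[1], kv[0]))
    return [ref for ref, _ in ranked[:n]]


projects = {
    "p1": {"r1", "r2", "r3"},
    "p2": {"r1", "r2", "r4"},
    "p3": {"r2", "r3", "r5"},
    "p4": {"r9"},
}
suggestions = recommend_from_neighbours("p1", projects)
```

Here `p2` and `p3` each share two references with `p1`, so their remaining references become the suggestions, while the unrelated `p4` contributes nothing.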

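The reference-based branch can be sketched in a similar way: count how often two references appear in the same project, then score candidates by how often they co-occur with the references already in a bibliography. This is only one plausible reading of "interaction patterns, co-occurrence"; the raw-count scoring and the function names are assumptions, and a production system would likely normalise the counts.

```python
from collections import Counter, defaultdict
from itertools import combinations


def cooccurrence_counts(projects):
    """Count, for each pair of references, the number of projects containing both."""
    counts = defaultdict(Counter)
    for refs in projects.values():
        for a, b in combinations(sorted(refs), 2):
            counts[a][b] += 1
            counts[b][a] += 1
    return counts


def recommend_from_cooccurrence(target_refs, counts, n=3):
    """Rank unseen references by total co-occurrence with the target's references."""
    scores = Counter()
    for ref in target_refs:
        for other, c in counts.get(ref, {}).items():
            if other not in target_refs:
                scores[other] += c
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [ref for ref, _ in ranked[:n]]


projects = {
    "p1": {"r1", "r2"},
    "p2": {"r1", "r2", "r3"},
    "p3": {"r2", "r3"},
    "p4": {"r1", "r4"},
}
counts = cooccurrence_counts(projects)
suggestions = recommend_from_cooccurrence({"r1", "r2"}, counts)
```

In this toy data `r3` co-occurs with the target's references three times and `r4` once, so `r3` is ranked first.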


Content-based Recommendations

Project with no references:
• start from the project title
• compare titles with NLP
• retrieve subjects in matching projects
• extract references based on subject frequency

Project with references:
• start from the subject keywords
• measure similarity
• extract most similar references
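The path for a project with no references can be sketched as follows. The deck lists NLTK among its tools but does not describe the title comparison, so plain token overlap with a tiny hand-written stopword list stands in for the NLP step here. The data layout, the `threshold` parameter and all function names are assumptions.

```python
import re
from collections import Counter

# Tiny illustrative stopword list; a real system would use a fuller one.
STOPWORDS = {"a", "an", "and", "the", "of", "in", "on", "for", "to"}


def tokens(title):
    """Lower-case word set of a title, minus stopwords."""
    return {w for w in re.findall(r"[a-z]+", title.lower()) if w not in STOPWORDS}


def title_similarity(a, b):
    """Jaccard overlap between the token sets of two titles."""
    ta, tb = tokens(a), tokens(b)
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def recommend_for_title(title, projects, ref_subjects, threshold=0.3, n=3):
    """projects: ID -> {"title": str, "refs": set}; ref_subjects: reference ID -> set of subjects."""
    # 1. Find projects whose titles resemble the new project's title.
    matching = [p for p in projects.values() if title_similarity(title, p["title"]) >= threshold]
    # 2. Count how often each subject occurs among their references.
    subject_freq = Counter(
        s for p in matching for r in p["refs"] for s in ref_subjects.get(r, ())
    )
    # 3. Score each candidate reference by the frequency of its subjects.
    candidates = {r for p in matching for r in p["refs"]}
    scores = {r: sum(subject_freq[s] for s in ref_subjects.get(r, ())) for r in candidates}
    ranked = sorted(candidates, key=lambda r: (-scores[r], r))
    return [r for r in ranked if scores[r] > 0][:n]


projects = {
    "p1": {"title": "Television in the Digital Age", "refs": {"r1", "r2"}},
    "p2": {"title": "Digital Television Audiences", "refs": {"r2", "r3"}},
    "p3": {"title": "Organic Chemistry Basics", "refs": {"r4"}},
}
ref_subjects = {
    "r1": {"television"},
    "r2": {"television", "media"},
    "r3": {"audiences"},
    "r4": {"chemistry"},
}
suggestions = recommend_for_title("Digital Television", projects, ref_subjects)
```

Only the two television projects pass the title threshold, so the chemistry reference never enters the candidate pool.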

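For a project that already has references, the "measure similarity" step could look like the sketch below: build a subject-keyword profile from the project's references and rank other references by cosine similarity to it. The slides do not name the similarity measure, so cosine similarity, the profile construction and the function names are assumptions.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two keyword-count mappings."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def project_profile(refs, ref_subjects):
    """Subject-keyword counts over all references in a project."""
    return Counter(s for r in refs for s in ref_subjects.get(r, ()))


def most_similar_references(project_refs, ref_subjects, n=3):
    """Rank references outside the project by similarity to the project's profile."""
    profile = project_profile(project_refs, ref_subjects)
    scored = []
    for ref, subjects in ref_subjects.items():
        if ref in project_refs:
            continue
        sim = cosine(profile, Counter(subjects))
        if sim > 0:
            scored.append((sim, ref))
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [ref for _, ref in scored[:n]]


ref_subjects = {
    "r1": {"television", "media"},
    "r2": {"television"},
    "r3": {"television", "media"},
    "r4": {"chemistry"},
    "r5": {"media", "psychology"},
}
suggestions = most_similar_references({"r1", "r2"}, ref_subjects)
```

Because the score depends only on subject keywords, a reference that appears in very few projects can still be suggested.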

Benefits of two approaches

• Collaborative Filtering: good for items with sufficient frequency
• Content-based Recommendations: independent of item frequency, good for common topics

Project: “Media Research”

[Slide labels: INPUT, OUTPUT]

• Relocating Television: Television in the Digital Context
• Transmedia Television: Audiences, New Media and Daily Life
• Television and Its Audience
• Television Goes Digital. The Economics of Information Communication and Entertainment: The Impacts of Digital Technology in the 21st Century.

• Grown Up Digital How the Net Generation is Changing Your World
• Future Minds: How the Digital Age is Changing Our Minds Why This Matters and What We Can Do About It

• The Television Studies Book
• Television and Everyday Life

Project: “Group Presentation”

[Slide labels: INPUT, OUTPUT]

• Person to Person: Ways of Communicating
• The Psychology of Interpersonal Perception
• Interpersonal Communication
• Body Movement and Interpersonal Communication
• Intergroup Behaviour

• The Social Psychology of Everyday
• Interviews Made Easy How to Get the Psychological Advantage
• Groups at Work: Theory and Research
• Theories of Human Communication (with InfoTrac)

Summary

RefME wants to broaden the appeal of their product by offering a tool to build better bibliographies.

Developed recommendation system

• yields good results for any reference type
• will be incorporated as premium feature soon


Used Tools

• Python NLTK libraries
• openlibrary.org API
• crossref.org API
• mahout.apache.org
• R, Java