DESCRIPTION
AnalogIST/ezPAARSE: Analysing Locally Gathered Logfiles to Determine Users’ Accesses to Subscribed e-Resources (Thomas Jouneau, Université de Lorraine, France). This presentation was one of the 10 most highly ranked at LIBER's Annual Conference 2014 in Riga, Latvia. Learn more: www.libereurope.eu
TRANSCRIPT
LIBER 2014 - RIGA - 3/07/2014
ANALOGIST/EZPAARSE : ANALYSING LOCALLY GATHERED LOGFILES TO DETERMINE USERS’ ACCESSES TO SUBSCRIBED E-RESOURCES
http://ezpaarse.couperin.org
http://analogist.couperin.org
Presentation Outline
1- The Context : A Need for Evaluation
2- Gathering Local Data
3- Parsers and Analyses
4- AnalogIST and ezPAARSE
5- Results and Visualization
6- Project Organization
1 The Context : A Need for Evaluation
1. The Context : A need for evaluation
The Scientific and Technical Information Market
About some well-known facts
5,000 to 10,000 publishers / 23,000 e-journals
$25 billion global revenue in 2012, increasing by 4-5% per year
The 4 biggest publishers account for half the market
For 10 years, the price of most journals has increased by 3% to 5% per year
5,500,000 researchers, increasing by 3.5% per year
1.5 billion articles downloaded per year by 10 million users
We need to assess and evaluate the use of these e-resources
1. The Context : A need for evaluation
What we’ve currently got
Publisher-provided statistics…
… are not available
… are available and COUNTER-compliant
… are available but not COUNTER-compliant
1st limitation : Vendors are the only source → We need to assess these numbers
2nd limitation : Only a partial view, no comparison possible → We need to complete the figures
3rd limitation : These numbers offer mere quantification → We need to qualify them
A possible solution : → locally gathered usage quantification
2 Gathering Usage Data Locally
2. Gathering usage data locally
The reverse proxy
2. Gathering usage data locally with a reverse proxy
Where ezPAARSE comes into play
3 Parsers and Analyses
3. Parsers and analyses
Example of a URL structure
http://pdn.sciencedirect.com/science?_ob=MiamiImageURL&_cid=271664&_user=4046427&_pii=S0001457512000747&_check=y&_origin=browse&_zone=rslt_list_item&_coverDate=2012-07-31&wchp=dGLbVlt-zSkWb&md5=f5d8d157ccda6d597cb466af123dbff3/1-s2.0-S0001457512000747-main.pdf
3. Parsers and analyses
Example of a URL structure
ISSN & type of the downloaded file
http://pdn.sciencedirect.com/science?_ob=MiamiImageURL&_cid=271664&_user=4046427&_pii=S0001457512000747&_check=y&_origin=browse&_zone=rslt_list_item&_coverDate=2012-07-31&wchp=dGLbVlt-zSkWb&md5=f5d8d157ccda6d597cb466af123dbff3/1-s2.0-S0001457512000747-main.pdf
3. Parsers and analyses
Example of a URL structure
http://www.sciencedirect.com/science/journal/00014575
The last path segment is the ISSN. By manually trying the URL, we find an HTML table of contents.
3. Parsers and analyses
Example of a URL structure
http://www.cairn.info/load_pdf.php?ID_ARTICLE=RFG_218_0009
We know it’s a PDF, but we only get a publisher-specific identifier. We need a correspondence table: the Publisher Knowledge Base (ideally a KBART file).
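For publisher-specific identifiers such as the Cairn one on this slide, the correspondence-table lookup might look like the following minimal sketch. The column names are standard KBART fields, but the sample row (title and identifiers) is a placeholder rather than real data, and the rule "the part before the first underscore identifies the journal" is an assumption made for illustration.

```python
import csv
import io
import re

# Hypothetical KBART-style extract (tab-separated). The column names are
# KBART fields; the row values are placeholders, not real data.
KBART_SAMPLE = (
    "publication_title\tprint_identifier\tonline_identifier\ttitle_id\n"
    "Example Journal\t0000-0000\t0000-0001\tRFG\n"
)


def load_knowledge_base(text):
    """Index the KBART rows by the publisher's title_id."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    return {row["title_id"]: row for row in reader}


def resolve(url, knowledge_base):
    """Map an ID_ARTICLE value to its knowledge-base row, or None."""
    # Assumption: the journal code is the part before the first underscore.
    match = re.search(r"ID_ARTICLE=([A-Z0-9]+)_", url)
    if match is None:
        return None
    return knowledge_base.get(match.group(1))


kb = load_knowledge_base(KBART_SAMPLE)
row = resolve("http://www.cairn.info/load_pdf.php?ID_ARTICLE=RFG_218_0009", kb)
print(row["print_identifier"] if row else "unknown title")
```

Keeping the knowledge base as a plain file keyed by `title_id` means it can be corrected by hand when a publisher changes its identifiers.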
3. Parsers and analyses
Parse the URL
http://pdn.sciencedirect.com/science?_ob=MiamiImageURL&_cid=271664&_user=4046427&_pii=S0001457512000747&_check=y&_origin=browse&_zone=rslt_list_item&_coverDate=2012-07-31&wchp=dGLbVlt-zSkWb&md5=f5d8d157ccda6d597cb466af123dbff3/1-s2.0-S0001457512000747-main.pdf
/_pii=S([0-9]{0,7}[0-9X])/i
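As an illustration of this step, here is a minimal Python sketch that applies the regular expression from the slide to the example URL (abridged here for readability). The function names, the length check and the extension-based file type guess are assumptions of this sketch, not the actual ezPAARSE parser code.

```python
import re

# Regular expression from the slide, translated from /.../i to Python.
PII_ISSN = re.compile(r"_pii=S([0-9]{0,7}[0-9X])", re.IGNORECASE)


def extract_issn(url):
    """Return the ISSN embedded in the PII parameter, or None."""
    match = PII_ISSN.search(url)
    if match is None:
        return None
    raw = match.group(1).upper()
    # The pattern also accepts fewer than 8 characters; this sketch keeps
    # only full-length matches (a choice made here, not in the slide).
    if len(raw) != 8:
        return None
    return raw[:4] + "-" + raw[4:]


def extract_file_type(url):
    """Guess the file type from the extension at the end of the URL."""
    match = re.search(r"\.([a-z0-9]+)$", url, re.IGNORECASE)
    return match.group(1).upper() if match else None


# Abridged version of the URL shown on the slide.
url = ("http://pdn.sciencedirect.com/science?_ob=MiamiImageURL"
       "&_pii=S0001457512000747&_check=y"
       "&md5=f5d8d157ccda6d597cb466af123dbff3/1-s2.0-S0001457512000747-main.pdf")
print(extract_issn(url), extract_file_type(url))
```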
3. Parsers and analyses
What do we count?
| Serials | E-books | Law databases | Inst. repositories |
|---|---|---|---|
| Articles (ARTICLE) | Book by title (BOOK) | Law encyclopedia (ENCYCLOPEDIES) | PHD_THESIS |
| Abstract (ABS) | Chapter, section (BOOK_SECTION) | Law memento (FORMULES) | MD_THESIS |
| Table of contents (TOC) | Book series (BOOKSERIE) | Law manual (BROCHES) | MASTER_THESIS |
| Reference (REF) | Manuals, handbooks (HANDBOOK) | Law codes (CODES) | |
| Article preview (e.g. the “Look inside” function of SpringerLink) (PREVIEW) | | | |
| Article in basket/personal folder (BOOKMARK) | | | |

- The availability of these items depends on the elements present in the URL
- The law databases currently covered are only French ones
3. Parsers and analyses
Platforms covered
Each platform has its own URL structure...
...we need one parser for each
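The "one parser per platform" idea can be pictured as a registry keyed by host name. The sketch below is illustrative only: the registry, the function names and the output keys (`rtype`, `mime`, `print_identifier`, `publisher_id`) are assumptions, and the two toy parsers cover just the URL shapes shown in the earlier slides.

```python
import re
from urllib.parse import urlparse


def parse_sciencedirect(url):
    """Toy parser: recognise a journal table-of-contents URL."""
    match = re.search(r"/science/journal/([0-9]{7}[0-9Xx])$", urlparse(url).path)
    if match is None:
        return {}
    raw = match.group(1).upper()
    return {"rtype": "TOC", "mime": "HTML",
            "print_identifier": raw[:4] + "-" + raw[4:]}


def parse_cairn(url):
    """Toy parser: keep the publisher identifier for a later lookup."""
    match = re.search(r"ID_ARTICLE=([A-Za-z0-9_]+)", url)
    if match is None:
        return {}
    return {"rtype": "ARTICLE", "mime": "PDF", "publisher_id": match.group(1)}


# One parser per platform, selected by host name (hypothetical registry).
PARSERS = {
    "www.sciencedirect.com": parse_sciencedirect,
    "www.cairn.info": parse_cairn,
}


def analyse(url):
    """Dispatch a URL to its platform parser; unknown platforms give None."""
    parser = PARSERS.get(urlparse(url).hostname)
    return parser(url) if parser else None


print(analyse("http://www.sciencedirect.com/science/journal/00014575"))
print(analyse("http://www.cairn.info/load_pdf.php?ID_ARTICLE=RFG_218_0009"))
```

Adding a platform then only means writing one more function and registering its host name.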
3. Parsers and analyses
Some limitations apply
- Opaque URLs (session ids, encryption…). Example : the former Springer platform
http://www.springerlink.com/content/j5q872410p510m63/fulltext.pdf
- Publisher IDs that need to be linked to a knowledge base or a reference file. Example : Cairn
http://www.cairn.info/load_pdf.php?ID_ARTICLE=RFG_218_0009
- Knowledge bases that have to be manually edited
4 AnalogIST and ezPAARSE
4. AnalogIST and ezPAARSE
● AnalogIST : the wiki portal
Analyse des Logs de l'IST = Analysing the logs of Scientific and Technical Information
→ The place where we gather the platform analyses and synchronise the new parsers with the local installations
http://analogist.couperin.org
● ezPAARSE : the software
ez : easy / PAARSE : Progiciel d'Analyse des Accès aux RessourceS Electroniques = Software for Analysing the Accesses to Online Resources
● as a local installation
● as an online service (SaaS)
Free (libre) software, multi-platform
http://ezpaarse.couperin.org
4. AnalogIST and ezPAARSE
[Diagram: local installations (Univ 1, Univ 2, ...) connected to AnalogIST (global installation + collaborative space)]
4. AnalogIST and ezPAARSE
● Through a web form
● With the command line (cURL)
(to be updated: new form in English)
Use the web form to create the command line that suits your needs.
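The command-line route amounts to sending the log lines in the body of an HTTP POST request. A scripted equivalent might look like the sketch below. The address, the port and the `Accept` header are assumptions made for illustration; check the ezPAARSE documentation or the command generated by the web form for the options your instance expects.

```python
import urllib.request

# Assumption: an ezPAARSE instance listening locally; adjust to your setup.
EZPAARSE_URL = "http://127.0.0.1:59599/"


def build_request(log_lines, url=EZPAARSE_URL):
    """Build (without sending) a POST request carrying the log lines."""
    body = "\n".join(log_lines).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"Accept": "text/csv"},  # assumed output format selector
    )


def submit(log_lines, url=EZPAARSE_URL):
    """Send the logs and return the response body as text."""
    with urllib.request.urlopen(build_request(log_lines, url)) as response:
        return response.read().decode("utf-8")


request = build_request(["first log line", "second log line"])
print(request.get_method(), request.full_url)
```

Separating `build_request` from `submit` lets the request be inspected before anything is sent to a server.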
5. ezPAARSE : Using the Results
Example of an ezPAARSE output
Text file (CSV format)
KBART fields, geoip fields
Deduplicate consultation events : COUNTER recommendation
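The deduplication of consultation events mentioned on this slide can be sketched as a double-click filter. The 10-second (HTML) and 30-second (PDF) windows follow the COUNTER Code of Practice, Release 4. The event keys, the fallback window and the choice to keep the most recent request of a burst are assumptions of this sketch, not a description of ezPAARSE's own implementation.

```python
from datetime import datetime, timedelta

# Double-click windows in seconds (COUNTER R4: HTML 10 s, PDF 30 s).
WINDOWS = {"HTML": 10, "PDF": 30}
DEFAULT_WINDOW = 10  # assumption for other formats


def deduplicate(events):
    """Collapse repeated requests by the same user for the same resource.

    `events` is a list of dicts with the hypothetical keys 'datetime',
    'user', 'resource' and 'mime', sorted by 'datetime'.
    """
    kept = []
    last_seen = {}  # (user, resource, mime) -> index in `kept`
    for event in events:
        key = (event["user"], event["resource"], event["mime"])
        window = timedelta(seconds=WINDOWS.get(event["mime"], DEFAULT_WINDOW))
        index = last_seen.get(key)
        if index is not None and event["datetime"] - kept[index]["datetime"] <= window:
            # Double click: replace the earlier request with the later one.
            kept[index] = event
        else:
            last_seen[key] = len(kept)
            kept.append(event)
    return kept


start = datetime(2014, 7, 3, 10, 0, 0)
clicks = [
    {"datetime": start + timedelta(seconds=s), "user": "u1",
     "resource": "0001-4575", "mime": "PDF"}
    for s in (0, 20, 60)
]
unique = deduplicate(clicks)
print(len(unique))
```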
5 ezPAARSE : Using the Results
5. ezPAARSE : using the results
(Libre/MS) Office rendering macros
5. ezPAARSE : using the results
Exploiting the Results with
5. ezPAARSE : using the results
Who (student, researcher, staff) consults what? (UL)
Breakdown of consultations of paid content (books, journals, law references…) by user type at the Université de Lorraine
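A breakdown like this one can be computed directly from the CSV output. The sketch below assumes a semicolon-separated file and a hypothetical `user_type` column added by the institution from its own directory; the sample rows are invented for illustration.

```python
import csv
import io
from collections import Counter

# Invented sample in a semicolon-separated layout (column names assumed).
SAMPLE = (
    "datetime;user_type;rtype;mime\n"
    "2014-01-06T09:12:00;student;BOOK_SECTION;PDF\n"
    "2014-01-06T09:15:00;researcher;ARTICLE;PDF\n"
    "2014-01-06T10:02:00;student;ARTICLE;HTML\n"
    "2014-01-07T14:30:00;staff;CODES;HTML\n"
)


def consultations_by(field, csv_text):
    """Count consultation events grouped by the given column."""
    reader = csv.DictReader(io.StringIO(csv_text), delimiter=";")
    return Counter(row[field] for row in reader)


by_user = consultations_by("user_type", SAMPLE)
by_type = consultations_by("rtype", SAMPLE)
for user_type, count in by_user.most_common():
    print(user_type, count)
```

The same function groups by any other column, for example `rtype` to separate articles from book sections.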
5. ezPAARSE : using the results
Consultations by research unit (UL)
Consultations of articles from Jan 2014 to May 2014 by research units at the Université de Lorraine
5. ezPAARSE : using the results
Consultations by teaching unit (UL)
Consultations of articles by teaching unit or faculty at the Université de Lorraine
5. ezPAARSE : using the results
Geolocation of consultations (CNRS)
5. ezPAARSE : using the results
Detection of an anomaly (CNRS)
The consultation peak corresponds to an abuse of an e-resource. Detecting it makes it possible to react promptly to the incident.
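One simple way to spot such a peak automatically is to compare each day's count with the median of the period. This is only an illustrative heuristic with invented figures and an arbitrary factor, not the method used at CNRS.

```python
from statistics import median


def find_peaks(daily_counts, factor=5.0):
    """Return the days whose count exceeds `factor` times the median.

    `daily_counts` maps a day label to a number of consultations.
    """
    if not daily_counts:
        return []
    baseline = median(daily_counts.values())
    return [day for day, count in daily_counts.items()
            if count > factor * baseline]


# Invented daily consultation counts for one platform.
counts = {
    "2014-03-01": 120,
    "2014-03-02": 135,
    "2014-03-03": 110,
    "2014-03-04": 128,
    "2014-03-05": 2400,
    "2014-03-06": 131,
    "2014-03-07": 125,
}
print(find_peaks(counts))
```

The median is used instead of the mean so that the peak itself does not inflate the baseline it is compared against.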
6 Project Organization
6. Project organization : the method
SCRUM : An agile development method
[Scrum process diagram; legible label: PRODUCT VISION]
6. Project organization : the team
In conclusion
● ezPAARSE is free and open source
● Simple to use and test
● State-of-the-art technologies
● Feel free to test it
● Send us log samples
● Give us feedback!
Any Questions?
http://ezpaarse.couperin.org
http://analogist.couperin.org
https://twitter.com/ezpaarse
(tag cloud with appropriate terms)
http://analogist.couperin.org/platforms/analyse-helper/start
The rest is automatically processed
dokuwiki syntax generated
More features : exploiting the results with geolocation