Slide 1: Institutional repositories and document citation
CASLIN 2009, 8th June 2009, Hotel Klášter Teplá
Linda Skolková & Miloslav Nič

Slide 2: Introduction
Terminology:
- Citation formats: intended primarily for machine use
- Citation styles: intended primarily for human use
- Citation manager: a software tool used to process citations and references ("shorter" and "longer" records)
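
The difference between a citation format and a citation style can be illustrated with a minimal sketch. The record below is invented purely for illustration, and the author-year layout in `to_styled` is a made-up style, not a published one. `to_ris` writes the record in RIS, one of the machine-oriented formats discussed later in the talk.

```python
# Hypothetical sample record; every value is invented for illustration.
record = {
    "authors": ["Novak, Jana", "Smith, John"],
    "title": "An example article",
    "journal": "Journal of Examples",
    "year": "2009",
    "volume": "1",
    "pages": ("10", "20"),
}


def to_ris(rec):
    """Citation format (machine use): one 'TAG  - value' pair per line."""
    lines = ["TY  - JOUR"]
    lines += [f"AU  - {author}" for author in rec["authors"]]
    lines.append(f"TI  - {rec['title']}")
    lines.append(f"JO  - {rec['journal']}")
    lines.append(f"PY  - {rec['year']}")
    lines.append(f"VL  - {rec['volume']}")
    lines.append(f"SP  - {rec['pages'][0]}")
    lines.append(f"EP  - {rec['pages'][1]}")
    lines.append("ER  - ")  # end-of-record marker
    return "\n".join(lines)


def to_styled(rec):
    """Citation style (human use): a made-up author-year layout."""
    authors = "; ".join(rec["authors"])
    first, last = rec["pages"]
    return (
        f"{authors} ({rec['year']}). {rec['title']}. "
        f"{rec['journal']}, {rec['volume']}, {first}-{last}."
    )


print(to_ris(record))
print(to_styled(record))
```

The same data yields a tagged, line-oriented record for software and a single readable sentence for people; a citation manager sits between the two.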

Slide 3: Methodology 1/2
- Survey of existing open source software for institutional repositories
- Choice of a particular installation of the software in question
- Study of the software's capability to export citation data (from the user's point of view)
- Study of selected journals, digital libraries and citation databases from the citation data processing point of view

Slide 4: Methodology 2/2
- Choice of open source citation managers based on scientists' demands
- Export of sample citation data and further work with the exported data (import into a second citation manager, comparison)

Slide 5: Institutional repository software
Four institutional repository software tools chosen:
- DSpace (http://www.dspace.org/)
- EPrints (http://www.eprints.org/)
- Fedora (http://www.fedora-commons.org/)
- CDS Invenio (http://cdsware.cern.ch/invenio/index.html)

Slide 6: DSpace: DSpace@MIT
- No particular citation data export formats offered, either for individual records or for record sets
- URI (handle) to identify the document

Slide 7: EPrints: E-LIS
- A variety of citation data export formats available for record sets:
  ASCII Citation, BibTeX, Dublin Core, EP3XML, EndNote, Eprints Application Profile, HTML Citation, ISO Citation, METS, Object IDs, OpenURL Context Object, Refer and Reference Manager
- For individual records:
  Those above plus Full Metadata, DIDL and Simple Metadata

Slide 8: Fedora: ARROW Repository
- No options to export citation data offered
- Possibility to copy the identifier (handle) and the basic (not complete) bibliographic data available

Slide 9: CDS Invenio: CERN Document Server
- Eleven citation data formats available for record sets:
  Excel, HTML brief, BibTeX, HTML address label, HTML detailed, HTML MARC, HTML photo captions only, HTML portfolio, MODS, XML Dublin Core, XML MARC
- Six formats available for individual records:
  BibTeX, MARC, MARCXML, DC, EndNote, NLM

Slide 10: Journals, digital libraries
- Nature: RIS; Connotea
- Science: EndNote, Reference Manager (RIS), ProCite (RIS), BibTeX, RefWorks, Medlars; CiteULike, PubMedCitation; additional formats can be requested
- SpringerLink: RIS, text
- Wiley InterScience: Plain Text, EndNote

Slide 11: Citation databases
- Scopus: RIS (Reference Manager, ProCite, EndNote), Text (ASCII format), RefWorks Direct Export, comma-separated file, .csv (e.g. Excel); BibTeX (citation style!), 2collab bookmarks
- Web of Science: EndNote Web, EndNote/RefMan/ProCite (not RIS!), HTML, Plain Text, tab-delimited files (for Win and for Mac)

Slide 12: Formats chosen for further work
- Exports from EPrints and CDS Invenio:
  - MODS (http://www.loc.gov/standards/mods/)
  - BibTeX (http://www.bibtex.org/)
- Exports from citation managers:
  - RIS (http://www.refman.com/support/risformat_intro.asp)
  - BibTeX (http://www.bibtex.org/)
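
To give a feel for how simple the RIS layout is to process, here is a minimal reader sketch. It assumes well-formed input in which every line is a two-character tag followed by two spaces, a hyphen and the value, and every record ends with an `ER` line; continuation lines and other irregularities found in real exports are simply skipped. The sample data is invented, and `parse_ris` is a hypothetical helper, not part of any tool mentioned in the talk.

```python
import re
from collections import defaultdict

# Two-character tag, two spaces, hyphen, optional space, then the value.
RIS_LINE = re.compile(r"^([A-Z][A-Z0-9])  - ?(.*)$")


def parse_ris(text):
    """Return a list of records, each mapping a RIS tag to a list of values."""
    records = []
    current = defaultdict(list)
    for line in text.splitlines():
        match = RIS_LINE.match(line)
        if not match:
            continue  # blank or continuation lines are ignored in this sketch
        tag, value = match.groups()
        if tag == "ER":  # end of record: store it and start a new one
            records.append(dict(current))
            current = defaultdict(list)
        else:
            current[tag].append(value.strip())
    return records


# Invented sample data with two records.
SAMPLE = "\n".join([
    "TY  - JOUR",
    "AU  - Example, Anna",
    "AU  - Sample, Boris",
    "TI  - First invented title",
    "PY  - 2009",
    "ER  - ",
    "TY  - CONF",
    "AU  - Example, Anna",
    "TI  - Second invented title",
    "PY  - 2008",
    "ER  - ",
])

records = parse_ris(SAMPLE)
print(len(records), records[0]["AU"])
```

Repeated tags such as `AU` are collected into lists, which is why each value is stored as a list rather than a single string.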

Slide 13: Choice of open source citation managers
- Zotero (http://www.zotero.org/)
- Connotea (http://www.connotea.org/)
Reasons:
- Available as open source, free of charge
- Work with a variety of import and export formats

Slide 14: Zotero
- Seven import formats offered (when importing from a local file):
  MODS, MAB2, MARC, RDF, RIS, Refer/BibTeX, BibTeX
- Seven export formats:
  Zotero RDF, MODS, RIS, Refer/BibIX, Unqualified Dublin Core RDF, Wikipedia Citation Templates, BibTeX

Slide 15: Connotea
- Seven import formats (for uploading data from a local file):
  Firefox Bookmarks, RIS, EndNote (Refer), EndNote (XML) (Experimental), BibTeX, ISI Web of Knowledge, MODS, Plain Text (one URL and tags per line)
- Six basic export formats:
  RIS, EndNote, BibTeX, MODS (XML), Word 2007 Bibliography, Simple Text Citations
- Three other export formats:
  RSS, RDF, Plain

Slide 16: Citation data imports to Zotero
- A sample set of citation data was gathered from the E-LIS eprint archive (110 records; BibTeX to Zotero) and from the CERN Document Server (80 records; MODS to Zotero)
- At first sight these steps proceeded without any significant problems
- Minor problems occurred, e.g. when viewing a record representing a document in Greek with characters from the Greek alphabet

Slide 17: Citation data exports from Zotero
- Both data sets were exported from Zotero in RIS format and imported into Connotea
- As in the previous case, the import was successful in general, but problems occurred anyway
- Connotea informs the user about changes being made during import

Slide 18: Citation data export from Connotea
- Export to BibTeX format
- Comparison of the original BibTeX files with these new ones (after two conversions took place)
- The files were significantly different:
  - The Connotea export file did not contain the abstract; other data (volume, number, pages, journal) were also lost
  - Author, title, year, keywords and URL were still present
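
A comparison of this kind can be sketched in a few lines. The snippet below is not the procedure used in the study; it is a minimal illustration that assumes simply formatted BibTeX entries with one `field = {value}` pair per line and no nested braces (real exports would need a proper BibTeX parser). Both entries are invented, and `bibtex_fields` and `lost_fields` are hypothetical helper names.

```python
import re

# Matches one 'name = {value}' or 'name = "value"' pair per line.
FIELD_RE = re.compile(r'^\s*(\w+)\s*=\s*[{"](.*?)[}"]\s*,?\s*$', re.MULTILINE)


def bibtex_fields(entry):
    """Return {field_name: value} for one simply formatted BibTeX entry."""
    return {name.lower(): value for name, value in FIELD_RE.findall(entry)}


def lost_fields(original, converted):
    """Field names present in the original entry but missing after conversion."""
    return sorted(set(bibtex_fields(original)) - set(bibtex_fields(converted)))


# Invented entry as it might look when exported from a repository.
ORIGINAL = """@article{example2009,
  author = {Example, Anna},
  title = {An invented title},
  journal = {Journal of Examples},
  year = {2009},
  volume = {1},
  number = {2},
  pages = {10--20},
  abstract = {An invented abstract.},
  keywords = {citation; repository},
  url = {http://example.org/1}
}"""

# The same invented entry after two hypothetical conversions.
CONVERTED = """@article{example2009,
  author = {Example, Anna},
  title = {An invented title},
  year = {2009},
  keywords = {citation; repository},
  url = {http://example.org/1}
}"""

print(lost_fields(ORIGINAL, CONVERTED))
```

Run over a whole export, a check like this shows at a glance which fields survive a chain of conversions and which are silently dropped.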

Slide 19: Data loss: something to think about?
- Yes, certainly!
- Associated problem: citing different versions of documents (journal vs. institutional repository):
  - Journal article in print
  - Journal article online
  - Preprint of a journal article
  - Postprint of a journal article
  - …
- Issue of trust in the electronic environment (CASLIN 2008)
- "Authoritative" vs. "unauthoritative" versions of documents (citing one, actually using another)

Slide 20: Discussion and conclusions
- A survey of institutional repositories has shown that not all repositories offer satisfactory citation data export options
- The experiments led to rather interesting results: at first sight everything is perfect, but at second sight it becomes apparent that transformations lead to data (information) loss
- Citation export capabilities as one of the selection criteria for institutional repository software?
- Open source and Open Access

Slide 21: Questions?

Slide 22: Thank you for your attention!

Slide 23: Contact details
- Linda Skolková
  Institute of Information Studies and Librarianship, Faculty of Arts, Charles University, Prague (http://uisk.ff.cuni.cz/)
  E-mail: [email protected] (@ff.cuni.cz)
- Miloslav Nič
  Institute of Chemical Technology, Prague (http://www.vscht.cz/)
  E-mail: [email protected] (@vscht.cz)