Open Access Indicator for 2016
Part 2
Technical Description of Data Foundation, Processes and Output
0 Preface
1 Introduction and Main Processes
2 Process 1: Collection of The Data
    2.1 The Universities' Publication Data
        2.1.1 Requirements on Universities – Metadata Format and Method of Collection
        2.1.2 This Year's Universities and Their Research Databases
    2.2 Authority and Auxiliary Data
        2.2.1 Directory of Open Access Journals (DOAJ)
        2.2.2 Sherpa/Romeo (Sh/Ro)
        2.2.3 The Danish Bibliometric Research Indicator (BFI)
        2.2.4 Authority List: Accepted External Repositories (“The Whitelist”)
        2.2.5 Authority List: Journals with extended Embargo (“The Blacklist”)
    2.3 This Year's Complete Data Collection
3 Process 2: Defining the Set of In-Scoped Publications
    3.1 The Set of Scoped Records Including Duplicates
    3.2 The Set of Scoped Records Excluding Duplicates
4 Process 3: Calculation of OA Realization and Potential
    4.1 Open Access Classification – University Level
        4.1.1 Checking for Golden Open Access Potential
        4.1.2 Checking for Green Open Access Potential
        4.1.3 Checking for Unused & Unclear Potential
        4.1.4 Checking Open Access Potential – Combined
    4.2 Open Access Classification – National and Main Research Area Level
5 Process 4: Quality Assurance
6 Process 5: Output
    6.1 Data Reports for download
    6.2 Web Dissemination via The Danish Research Database
7 Appendix A: The Fulltext Download Sub Process
Revision 1 of 20 April 2018
0 Preface

The National Steering Group for Open Access [1] has proposed that the Danish Agency for Science, Technology and Innovation and Denmark's Electronic Research Library develop a Danish Open Access Indicator. The intention is to support the implementation of the national Open Access strategy [2], cf. the strategy's statement on monitoring: “The implementation of Open Access is to be monitored on an ongoing basis to ensure that all parties make a maximum effort to develop and disseminate free accessibility to Danish research findings.”

The Open Access Indicator is calculated once per year with the target field: Scientific and peer-reviewed articles and conference contributions in journals and proceedings with ISSN.

In the context of Horizon 2020 [3], the EU requires that Open Access be established within at most 6 months after publication for the areas of science, technology and health, and within at most 12 months for the social sciences and humanities. This delay is caused by many journals maintaining so-called embargo periods, during which they prevent researchers from establishing Open Access to the articles. As the OA Indicator is calculated once annually for all publications within its target field, it is designed to accept a one-year delay in Open Access to the publications. Consequently, the OA Indicator for 2016 is calculated in early March 2018 in order to accommodate a full-year embargo period also for publications from December 2016. In practice this means that publications from January 2016 could have embargo periods all the way up to 24 months and still be credited by the OA Indicator.

The description of the Open Access Indicator is organized in two parts:
• Part 1: Overview of data foundation, processes and output
• Part 2: Technical description of data foundation, processes and output

Note: In Part 2, the technical description, the notion of the indicator's “target field” is expressed using the term “set of scoped records”.

Queries regarding the indicator may be directed to

Adam Baden / Hanne-Louise Kirkegaard
Danish Agency for Science and Higher Education
Ministry of Higher Education and Science
Bredgade 40
DK-1260 København K
Email: [email protected] / [email protected]
[1] http://ufm.dk/en/research-and-innovation/cooperation-between-research-and-innovation/open-access
[2] http://ufm.dk/en/research-and-innovation/cooperation-between-research-and-innovation/open-access/Publications/denmarks-national-strategy-for-open-access
[3] https://ec.europa.eu/research/participants/data/ref/h2020/grants_manual/hi/oa_pilot/h2020-hi-oa-pilot-guide_en.pdf
1 Introduction and Main Processes

The activities of the OA Indicator can be broken down into these five main processes:

1) Collection of the data
2) Defining the set of in-scoped publications
3) Calculation of OA realisation and potential
4) Quality assurance
5) Output

The five main processes are described in further detail in the sections below.

This description of the Open Access Indicator is aimed at a technically inclined audience and describes in depth how the Indicator works, overall as well as in detail. The description assumes that the reader is familiar with basic XML [4] and the basic parts of the XPath [5] notation for referring to XML elements of an XML document conforming to a certain XML Schema. It also assumes that the reader is familiar with the visualisation of processes as workflow diagrams [6].

[4] https://www.w3.org/TR/xml/
[5] https://www.w3.org/TR/xpath-30/
[6] https://en.wikipedia.org/wiki/Flowchart
2 Process 1: Collection of The Data

The first activity in the OA Indicator is the collection of the complete data foundation used by the indicator. This includes importing six national and international sources. The data foundation is composed of metadata describing the publications of the universities, as well as authority and auxiliary data.

2.1 The Universities' Publication Data

Metadata describing the publications of the universities are used to establish the set of publications in scope of the OA Indicator. These metadata are collected for the OA Indicator once annually. Collection is done directly from the universities, using an XML-based nationally agreed exchange format and a nationally agreed exchange protocol. For fulltexts registered in the collected publication metadata, collection (download) is attempted.
2.1.1 Requirements on Universities – Metadata Format and Method of Collection

A university can be included in the OA Indicator if it meets the following minimum requirements:

• Publications published by researchers employed at the university are collected in a university research database containing publication data, person data, project data etc. of that particular university only.
• This research database of the university must expose its publication data using OAI-PMH (http://www.openarchives.org/OAI/openarchivesprotocol.html).
• The research database must support OAI-PMH selective harvesting using Sets, characterised by their setSpec (code), to harvest only parts of the database.
• A dedicated OAI-PMH Set exposing all publication data held in the research database must exist.
• For this dedicated set, OAI-PMH metadataPrefix “ddf_mxd” must be supported.
• When an OAI-PMH client harvests this dedicated set using metadataPrefix “ddf_mxd”, metadata records must be valid DDF-MXD (http://mx.forskningsdatabasen.dk/mxd/).
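To make the harvesting requirements concrete, the following minimal sketch builds the OAI-PMH ListRecords request that a harvesting client could send for the dedicated set. The function name and the example server URL are illustrative assumptions, not part of the OA Indicator implementation; only the verb, metadataPrefix and set arguments come from the OAI-PMH protocol and the requirements above.

```python
from urllib.parse import urlencode


def list_records_url(base_url, set_spec="publications:all", metadata_prefix="ddf_mxd"):
    """Build an OAI-PMH ListRecords URL for selective harvesting of one set.

    base_url is the OAI-PMH endpoint of a research database
    (the value used below is a hypothetical placeholder).
    """
    query = urlencode({
        "verb": "ListRecords",
        "metadataPrefix": metadata_prefix,
        "set": set_spec,
    })
    return f"{base_url}?{query}"


# Hypothetical endpoint, for illustration only.
print(list_records_url("https://research.example.org/ws/oai"))
```

A real harvester would additionally follow the resumptionToken returned by the server until the complete list of records has been retrieved.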
2.1.2 This Year's Universities and Their Research Databases

The following 8 universities, and their associated research databases, are included in the OA Indicator for 2016:

University   Research Database - OAI-PMH server    OAI-PMH setSpec
AAU          http://vbn.aau.dk/ws/oai              publications:all
AU           https://pure.au.dk/ws/oai             publications:all
CBS          http://research.cbs.dk/ws/oai         publications:all
DTU          http://orbit.dtu.dk/ws/oai            publications:all
ITU          https://pure.itu.dk/ws/oai            publications:all
KU           http://curis.ku.dk/ws/oai             publications:all
RUC          http://rucforsk.ruc.dk/ws/oai         publications:all
SDU          http://heinz.sdu.dk:8080/ws/oai       publications:all
2.2 Authority and Auxiliary Data

Authority and auxiliary data are collected for the OA Indicator from various sources. For each of these sources, the collection is done once annually. Collection method and data formats vary across sources.

2.2.1 Directory of Open Access Journals (DOAJ)

DOAJ is used by the OA Indicator as an authoritative list of Golden Open Access journals, as well as the source of data describing whether a journal requires article processing charges (APC) or not. Parameters of the data collection:

• Protocol: OAI-PMH (server http://www.doaj.org/oai/)
• metadataPrefix: oai_dc
• Data format: Dublin Core (http://dublincore.org/documents/dces/)
• Enrichment: Per-journal lookup using REST API endpoint https://doaj.org/api/v1/journals (cf. https://doaj.org/api/v1/docs#!/CRUD_Journals/get_api_v1_journals_journal_id)
• Data format: JSON

2.2.2 Sherpa/Romeo (Sh/Ro)

Sh/Ro is used by the OA Indicator to determine the policy for Green Open Access by journals, and thereby the Open Access potential of individual journal articles. Parameters of the data collection:

• Protocol: HTTP (GET from http://www.sherpa.ac.uk/downloads/)
• Data format: Proprietary XML-based format (http://sherpa.ac.uk/news/2012-10-08-RoMEO-API-News.html)
2.2.3 The Danish Bibliometric Research Indicator (BFI)

Data from BFI are used by the OA Indicator for two purposes:

• To identify duplicate publication data across universities (these exist for collaborative publications with coauthors employed at different universities, which are therefore registered in multiple research databases)
• To resolve potential conflicts regarding the Main Research Areas registered in the metadata for the publications

Parameters of the data collection:

• Protocol: HTTPS (GET from https://bfi.fi.dk/AnnualReport)
• Format: Compressed Excel spreadsheet – undocumented template
2.2.4 Authority List: Accepted External Repositories (“The Whitelist”)

For fulltexts deposited in external repositories, this authority list is used by the OA Indicator to ensure that only fulltexts deposited in accepted external repositories can demonstrate Realised Open Access potential.

• Protocol: Mail (from authority list maintainers)
• Format: Excel spreadsheet – undocumented template

2.2.5 Authority List: Journals with extended Embargo (“The Blacklist”)

This authority list is used by the OA Indicator to reclassify records from Unused to Unclear Open Access potential for journals registered on the list.

• Protocol: Mail (from authority list maintainers)
• Format: Excel spreadsheet – undocumented template
2.3 This Year's Complete Data Collection

Summary of the data collection for the OA Indicator for 2016:

Source      Protocol   Ver.   Format        Ver.    Collection Date
AAU         OAI-PMH    2.0    DDF-MXD       1.4.0   6/3-2018
AU          OAI-PMH    2.0    DDF-MXD       1.4.0   6/3-2018
CBS         OAI-PMH    2.0    DDF-MXD       1.4.0   6/3-2018
DTU         OAI-PMH    2.0    DDF-MXD       1.4.0   6/3-2018
ITU         OAI-PMH    2.0    DDF-MXD       1.4.0   6/3-2018
KU          OAI-PMH    2.0    DDF-MXD       1.4.0   6/3-2018
RUC         OAI-PMH    2.0    DDF-MXD       1.4.0   6/3-2018
SDU         OAI-PMH    2.0    DDF-MXD       1.4.0   6/3-2018
DOAJ        OAI-PMH    2.0    DC + JSON     %       6/3-2018
Sh/Ro       HTTP       %      Proprietary   %       5/3-2018
BFI         HTTPS      %      Proprietary   %       23/10-2017
Whitelist   Mail       %      Proprietary   %       6/3-2018
Blacklist   Mail       %      Proprietary   %       5/3-2018
3 Process 2: Defining the Set of In-Scoped Publications

After the collection of all data for the OA Indicator, a number of activities are initiated in order to isolate the publication records which are in scope for the OA Indicator. Not all publications are in scope, only a subset of the publications of the universities. The scope is defined as:

• Scientific, peer-reviewed articles and conference contributions published in journals or proceedings with ISSN

Thus, the subset of publication metadata records representing this scope must be isolated from the total set of publication metadata collected. This is done in two ways, in order to facilitate statistics on the national level and on the university level:

• Scoped records including duplicates, for statistics on the university level. For collaborative articles across universities, all registrations from all participating universities are kept.
• Scoped records excluding duplicates, for statistics on the national level. For collaborative articles across universities, only one registration is kept.
3.1 The Set of Scoped Records Including Duplicates

Each of the requirements in the definition of the scope maps directly to a corresponding rule regarding DDF-MXD data elements and their content. The set of scoped publication metadata records is therefore the set that complies with all the rules. The rules are described below.

First of all, the set of scoped records must represent records with a given submission year. The initial rule is therefore:

0) The submission year (indberetningsår) must be marked up in the publication metadata record with the given value.
   Rule applied: Attribute /ddf_doc/@doc_year has the value (year) for the OA Indicator calculation.

Subsequently, the following four rules are applied to all records:

1) The type of the publication must be marked up in the publication metadata record as “Journal Article”, “Review article” or “Conference Contribution” (same definition of “article” as used by BFI).
   Rule applied: Attribute /ddf_doc/@doc_type has value “dja”, “djr” or “dcp”.
2) The review status of the publication must be marked up in the publication metadata record as “Peer-review” (similar demand as for BFI).
   Rule applied: Attribute /ddf_doc/@doc_review has value “pr”.
3) The scientific level of the publication must be marked up in the publication metadata record as “Scientific” (similar demand as for BFI).
   Rule applied: Attribute /ddf_doc/@doc_level has value “sci”.
4) The publication channel of the publication must be marked up in the publication metadata record with an ISSN.
   Rule applied: Element /ddf_doc/publication/*/issn has a value.
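As an illustration, rules 0 to 4 can be expressed as a single predicate over one record. This is a minimal sketch and not the Indicator's actual code: it assumes the record is available as a string with ddf_doc as root element and ignores XML namespaces, and the in_journal element in the sample is a hypothetical child of publication chosen only for the example.

```python
import xml.etree.ElementTree as ET

# doc_type codes accepted by rule 1.
ARTICLE_TYPES = {"dja", "djr", "dcp"}


def in_scope(record_xml, year):
    """Return True if a record satisfies scope rules 0-4 (namespaces ignored)."""
    root = ET.fromstring(record_xml)
    if root.tag != "ddf_doc":
        return False
    # Rule 4: at least one /ddf_doc/publication/*/issn element with a value.
    has_issn = any((e.text or "").strip() for e in root.findall("publication/*/issn"))
    return (
        root.get("doc_year") == str(year)            # rule 0
        and root.get("doc_type") in ARTICLE_TYPES    # rule 1
        and root.get("doc_review") == "pr"           # rule 2
        and root.get("doc_level") == "sci"           # rule 3
        and has_issn                                 # rule 4
    )


# Hypothetical, heavily simplified record for illustration.
SAMPLE = (
    '<ddf_doc doc_year="2016" doc_type="dja" doc_review="pr" doc_level="sci">'
    "<publication><in_journal><issn>1234-5678</issn></in_journal></publication>"
    "</ddf_doc>"
)
print(in_scope(SAMPLE, 2016))
```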
3.2 The Set of Scoped Records Excluding Duplicates

For collaborative publications between the universities, multiple publication metadata records may represent the same publication. As this is impractical when producing statistics on the national level, a set of scoped records without duplicates is produced. This set is produced by exposing the set of scoped records with duplicates to a deduplication process. The ambition of this process is to ensure that, for each publication in the scope of the OA Indicator for which there is at least one record in the set of scoped records including duplicates, there is exactly one record in the set of scoped records excluding duplicates.

The deduplication process creates clusters of records. A cluster contains records that represent the same publication. The full set of scoped records excluding duplicates is ultimately established by producing one record per cluster. The algorithm for producing clusters is:

1) Records that were part of the BFI calculation for the same submission year and were identified by the BFI process as being duplicates are added to the same cluster.
2) Records for which significant metadata elements (DOI, title, subtitle, ISSN, publication year, etc.) match sufficiently well are considered to represent the same publication and are added to the same cluster.

This algorithm respects BFI's deduplication algorithm: Rule (1) ensures that any records identified by BFI as duplicates are also identified by the OA Indicator as duplicates. The scope of BFI and the scope of the OA Indicator differ. This makes it realistic that other, non-BFI-scoped records are part of the OA Indicator scope and are indeed duplicates of other records. Rule (2) ensures that these records are (on a best-effort basis) gathered into clusters as well. Thus, clusters may include

a. Only records which were part of BFI,
b. Both records which were part of BFI and records which were not, or
c. Only records which were not part of BFI.
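The clustering step can be sketched with a union-find (disjoint set) structure. This is an assumption about one possible implementation, not a description of the actual one: the document does not specify how "sufficiently well" matching is computed, so both the BFI duplicate pairs and the metadata match pairs are taken as given inputs, and all names and record identifiers are hypothetical.

```python
def build_clusters(record_ids, bfi_duplicate_pairs, metadata_match_pairs):
    """Group records into clusters by merging every pair flagged as duplicates."""
    parent = {r: r for r in record_ids}

    def find(r):
        # Follow parent links to the cluster representative (with path halving).
        while parent[r] != r:
            parent[r] = parent[parent[r]]
            r = parent[r]
        return r

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    for a, b in bfi_duplicate_pairs:      # rule 1: duplicates identified by BFI
        union(a, b)
    for a, b in metadata_match_pairs:     # rule 2: sufficiently similar metadata
        union(a, b)

    clusters = {}
    for r in record_ids:
        clusters.setdefault(find(r), set()).add(r)
    return sorted(sorted(members) for members in clusters.values())


# Hypothetical record identifiers.
records = ["aau:1", "au:7", "ku:3", "dtu:9", "sdu:2"]
print(build_clusters(records, [("aau:1", "au:7")], [("au:7", "ku:3")]))
```

Because merging is transitive, a rule (2) match can join records that rule (1) placed in separate groups.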
A subtle but important remark: For clusters containing BFI records, (a) and (b) above, the BFI records clustered by rule (2) above may stem from different BFI clusters. OA Indicator clusters may contain BFI records which were not joined by the BFI deduplication algorithm.

Conflict Resolution

The results of the OA Indicator are distributed on Main Research Area (MRA). In order to be able to do this distribution, each cluster must have a unique Main Research Area. BFI's definition of MRA is used by the OA Indicator:

• Science (sci)
• Social Science (soc)
• Humanities (hum)
• Medicine (med)
All DDF-MXD records contain a unique MRA. For records in the set of scoped records including duplicates, these MRAs are used. For records in the set of scoped records excluding duplicates, records in the underlying clusters may disagree on MRA. Using BFI terminology, such a situation is called an MRA-conflict. Such MRA-conflicts must be resolved so that each cluster has a unique MRA. The algorithm for resolving MRA-conflicts in a cluster is:

1) If all the records in a cluster have the same MRA, this is used for the cluster (no conflict).
2) Otherwise, if one or more of the records in the cluster were part of a BFI cluster, the BFI MRA for that cluster is used.
3) If none of the records in the cluster were part of the BFI calculation, or if multiple records were part of different BFI clusters disagreeing on their BFI MRA for those BFI clusters, majority wins: The MRA of the cluster is the MRA represented by most of the records in the cluster.
4) If two or more MRAs are represented by the same number of records in the cluster, the MRA with the highest representation in the entire set of scoped records is chosen for the cluster.

This algorithm ensures that the OA Indicator solves potential MRA-conflicts while respecting, to the largest extent possible, the corresponding MRA-conflict resolutions done by BFI.
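The four resolution rules can be sketched as follows. The data layout is an assumption made for the example: each record in a cluster is a pair of its own MRA and the MRA of the BFI cluster it belonged to (None if it was not part of BFI), and global_counts holds the number of records per MRA in the entire set of scoped records. How a remaining tie in rule 4 is broken is not specified in this description; the sketch simply takes the first candidate.

```python
from collections import Counter


def resolve_mra(cluster, global_counts):
    """Resolve the MRA of one cluster.

    cluster: list of (record_mra, bfi_mra_or_None) pairs (hypothetical layout).
    global_counts: mapping from MRA code to its count in the whole scoped set.
    """
    record_mras = [mra for mra, _ in cluster]
    if len(set(record_mras)) == 1:
        return record_mras[0]                      # rule 1: no conflict

    bfi_mras = {bfi for _, bfi in cluster if bfi is not None}
    if len(bfi_mras) == 1:
        return next(iter(bfi_mras))                # rule 2: follow BFI

    counts = Counter(record_mras)                  # rule 3: majority wins
    top = max(counts.values())
    tied = [mra for mra, n in counts.items() if n == top]
    if len(tied) == 1:
        return tied[0]

    # Rule 4: break the tie using representation in the entire scoped set.
    return max(tied, key=lambda mra: global_counts[mra])


# Hypothetical global distribution, for illustration only.
global_counts = {"sci": 10, "soc": 5, "hum": 3, "med": 8}
print(resolve_mra([("sci", "med"), ("med", None)], global_counts))
```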
4 Process 3: Calculation of OA Realization and Potential

The calculation of OA realisation and potential is done with respect to Green and Golden Open Access. The calculation is done nationally, distributed on Main Research Area (MRA) and distributed on universities.

The Open Access potential, and the realisation of that potential, is initially calculated per university, using a per-publication approach based on the set of scoped records including duplicates. Subsequently, it is also calculated for the national level and the MRA level, also using a per-publication approach, but based on the set of scoped records excluding duplicates.

For both sets, each record/publication belonging to the set is classified according to how the publication realises its Open Access potential. There are three values for this classification, and they are color coded using green, yellow and red (traffic light):

• Realised Open Access potential
• Unused Open Access potential, and
• Unclear Open Access potential

For some in-scoped records, the classification includes attempting a download of a fulltext registered in the record. For technical reasons, the actual download attempts of all potential fulltexts are the first subprocess. Please refer to Appendix A for technical details on how this is done.

For records/publications classified as Realised, the types of realisation are also determined. There are four types of Realised:

• Golden Open Access in journals with APC
• Golden Open Access in journals without APC
• Green Open Access from local repository
• Green Open Access from external repository

Each record/publication may have more than one type of realisation.
4.1 Open Access Classification – University Level

For any record in the set of scoped records including duplicates, the Open Access potential is established through a number of validation steps. As an overview, the classification process can be illustrated as follows:

Please note that although the diagram above indicates that validation for Golden and Green Open Access takes place in parallel, the actual implementation validates Golden before Green. Each of the steps illustrated above is a workflow of its own. They are described individually below.

4.1.1 Checking for Golden Open Access Potential

First, the journal registered in the publication metadata record is checked against DOAJ. If present, the publication is considered one with a (Golden) Open Access potential, and the potential is considered to be Realised. To determine the type of realisation, the DOAJ API is requested for the journal, and the JSON response element apc{average_price} is checked. Below, this element is referred to using the shorthand notation ‘apc_price’. If apc_price has a value greater than zero, the type of realisation is considered to be Golden with APC. Otherwise, it is Golden without APC. The associated, simple workflow can be depicted as follows:
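The Golden check described above can be sketched as a small function. The inputs are assumptions made for the example: doaj_issns stands for the set of ISSNs harvested from DOAJ, and apc_price_by_issn for the apc_price values already fetched from the DOAJ API; a missing price is treated as zero, which is also an assumption.

```python
GOLDEN_APC = "Golden Open Access in journals with APC"
GOLDEN_NO_APC = "Golden Open Access in journals without APC"


def classify_golden(issn, doaj_issns, apc_price_by_issn):
    """Return the Golden realisation type, or None if the journal is not in DOAJ."""
    if issn not in doaj_issns:
        return None
    # Assumption: a missing or null apc_price counts as "no APC".
    price = apc_price_by_issn.get(issn) or 0
    return GOLDEN_APC if price > 0 else GOLDEN_NO_APC


# Hypothetical ISSNs and prices.
doaj = {"1111-1111", "2222-2222"}
prices = {"1111-1111": 1500}
print(classify_golden("1111-1111", doaj, prices))
```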
4.1.2 Checking for Green Open Access Potential

Green Open Access validation of a publication record involves inspecting the element /ddf_doc/oa_link. Below, it will be referred to with the shorthand notation //oa_link. Records may contain zero, one or more //oa_link elements. The combined workflow for validating Green Open Access is as follows:

Three decisions in this workflow have to do with qualification. These three decisions are made following the sub-workflows described below. For each file that passes all three decisions successfully, giving the record status Realised, the type of (Green Open Access) realisation for this file is determined. This fourth decision is also described below.

Decision 1: Does the //oa_link element qualify?

A qualified //oa_link element is a //oa_link element

• with attribute @type having an acceptable value (“loc” for local or “rem” for remote, not “doi” for DOI), and
• with a @url attribute that has a value.

Checking for qualification can be illustrated with the following workflow:

Decision 2: Does the URL qualify?

A qualified URL is either a URL to a local repository or a URL to an external repository that has a prefix (domain name and potentially also path) registered for a repository on the list of accepted external (remote) repositories (the Whitelist). Checking for qualification can be illustrated with the following workflow:

Decision 3: Does the file qualify?

A qualified file is a file

• that can be downloaded by a computer, and
• where the content of the downloaded file has a size greater than zero.

Checking for qualification can be illustrated with the following workflow:

Decision 4: Determining the type of realisation

The type of realisation is determined by attribute //oa_link/@type:

• If this attribute has value “loc”, the type is Green Open Access from local repository,
• otherwise it is Green Open Access from external repository.

Illustrated by the following workflow:
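Taken together, the four decisions can be sketched as one loop over the //oa_link elements of a record. Several things here are assumptions made for the example: links are represented as dictionaries with "type" and "url" keys, local repositories are recognised by a list of URL prefixes (the description does not say how a local URL is identified), and downloaded_size is a hypothetical callable returning the size in bytes of the downloaded file, or 0 if the download failed.

```python
LOCAL = "Green Open Access from local repository"
EXTERNAL = "Green Open Access from external repository"


def green_realisations(oa_links, local_prefixes, whitelist_prefixes, downloaded_size):
    """Return the set of Green realisation types for one record (empty if none)."""
    accepted = tuple(local_prefixes) + tuple(whitelist_prefixes)
    types = set()
    for link in oa_links:
        link_type, url = link.get("type"), link.get("url")
        # Decision 1: acceptable @type and a non-empty @url.
        if link_type not in ("loc", "rem") or not url:
            continue
        # Decision 2: URL points to a local or whitelisted external repository.
        if not url.startswith(accepted):
            continue
        # Decision 3: the file could be downloaded and is not empty.
        if downloaded_size(url) <= 0:
            continue
        # Decision 4: type of realisation follows @type.
        types.add(LOCAL if link_type == "loc" else EXTERNAL)
    return types


# Hypothetical links, prefixes and download results.
sizes = {"https://pure.example.edu/files/1.pdf": 1000}
links = [
    {"type": "loc", "url": "https://pure.example.edu/files/1.pdf"},
    {"type": "doi", "url": "https://doi.org/10.1/x"},
    {"type": "rem", "url": "https://repo.example.org/a.pdf"},      # empty download
    {"type": "rem", "url": "https://unknown.example.net/b.pdf"},   # not whitelisted
]
result = green_realisations(
    links,
    ["https://pure.example.edu/"],
    ["https://repo.example.org/"],
    lambda url: sizes.get(url, 0),
)
print(result)
```

A record is Realised through Green Open Access if the returned set is non-empty.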
4.1.3 Checking for Unused & Unclear Potential

If the record has no Realised Open Access potential, the record is examined to determine whether the potential is Unused or Unclear. The Open Access potential of the publication is derived from the Open Access potential of the journal registered in the publication metadata record, as registered in the Sherpa/Romeo dataset (cf. http://www.sherpa.ac.uk/romeoinfo.html).

Rules applied:

• If the ISSN of the journal is registered in Sherpa/Romeo with color code green, blue or yellow, the journal is considered one with Open Access potential, and the publication metadata record is considered one with an Unused Open Access potential.
  o An exception to this rule applies if the ISSN is registered on the list of accepted journals with extended embargo periods (the Blacklist). If so, the record is reclassified to Unclear.
• If the journal is registered in Sherpa/Romeo with a different color code or not registered at all, the journal does not have a clear Open Access potential, and the publication metadata record is considered to be one with an Unclear Open Access potential.

This validation can be depicted as follows:
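The rules above reduce to a short function. In this sketch, romeo_colour_by_issn and blacklist_issns are hypothetical lookup structures built from the Sherpa/Romeo download and the Blacklist spreadsheet; they are assumptions made for the example.

```python
# Sherpa/Romeo colour codes that indicate an Open Access potential.
POTENTIAL_COLOURS = {"green", "blue", "yellow"}


def classify_unrealised(issn, romeo_colour_by_issn, blacklist_issns):
    """Classify a record without Realised potential as 'Unused' or 'Unclear'."""
    colour = romeo_colour_by_issn.get(issn)  # None if the journal is not registered
    if colour in POTENTIAL_COLOURS and issn not in blacklist_issns:
        return "Unused"
    # Other colour, unregistered journal, or blacklisted (extended embargo).
    return "Unclear"


# Hypothetical ISSNs, for illustration only.
romeo = {"1111-1111": "green", "2222-2222": "white", "3333-3333": "blue"}
blacklist = {"3333-3333"}
print(classify_unrealised("1111-1111", romeo, blacklist))
```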
4.1.4 Checking Open Access Potential – Combined

Thus, the combined decision workflow for determining the Open Access potential of a record is:

4.2 Open Access Classification – National and Main Research Area Level

Publication metadata records in the set of scoped records excluding duplicates correspond to clusters of one or more records from the set of scoped records including duplicates. After classifying each of the records of the set of scoped records including duplicates according to Open Access potential and its realisation, clusters inherit classifications according to a “best-classification-wins” algorithm, using the following decision workflow:

For clusters classified as Realised, the type of realisation for the cluster is also inherited from the records of the cluster. The inheritance is done by union: Any type of realisation associated with any record in the cluster that is classified as Realised is also associated with the cluster as a whole.
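The inheritance can be sketched as follows. The ranking Realised over Unused over Unclear is an assumption about what "best" means here, since this text does not spell out the ordering; the data layout (a list of classification and type-set pairs per cluster) is likewise chosen only for the example.

```python
# Assumed ordering of classifications, best first: Realised, Unused, Unclear.
RANK = {"Realised": 2, "Unused": 1, "Unclear": 0}


def classify_cluster(records):
    """Derive a cluster's classification and realisation types from its records.

    records: list of (classification, set_of_realisation_types) pairs.
    """
    best = max((classification for classification, _ in records), key=RANK.__getitem__)
    types = set()
    if best == "Realised":
        # Inheritance by union over all records classified as Realised.
        for classification, record_types in records:
            if classification == "Realised":
                types |= record_types
    return best, types


cluster = [
    ("Unused", set()),
    ("Realised", {"Green Open Access from local repository"}),
    ("Realised", {"Green Open Access from external repository"}),
]
print(classify_cluster(cluster))
```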
5 Process 4: Quality Assurance

The results of the Open Access Indicator have been subjected to quality assurance. For a description, please refer to the Overview documentation (Part 1).

6 Process 5: Output

As output, the Open Access Indicator produces a number of data reports as well as web-friendly visualisations of the summations of these. The Danish Research Database (http://forskningsdatabasen.dk/) is used as dissemination platform for the visualisations and the reports.
6.1 Data Reports for download

Five data reports are produced:

1) Summations: The sets of scoped records, aggregated and distributed on Realised (and types of realisation), Unused and Unclear Open Access potential
   a. Nationally (set of scoped records excluding duplicates)
   b. Distributed on Main Research Area (set of scoped records excluding duplicates)
   c. Distributed on the universities (set of scoped records including duplicates)
2) Detailed foundation for (a) and (b): Total list of publication records in the set of scoped records excluding duplicates
3) Detailed foundation for (c): Total list of publication records in the set of scoped records including duplicates
4) The list of accepted external repositories (The Whitelist) used for the calculation
5) The list of accepted journals with extended embargoes (The Blacklist) used for the calculation

6.2 Web Dissemination via The Danish Research Database

The summations of the Open Access Indicator are visualised on http://forskningsdatabasen.dk/en/open_access/overview, from where the data reports can be downloaded as well.
7 Appendix A: The Fulltext Download Sub Process

A download is attempted for every fulltext registered (by its URL) in the scoped set of publication metadata records, in a single subprocess. This subprocess is implemented in the following way:

• Fulltexts are downloaded one by one (serially, not in parallel)
• Fulltexts are downloaded in a “University Round Robin” fashion:
  o one fulltext from university 1,
  o one fulltext from university 2,
  o one fulltext from university 3,
  o …,
  o one fulltext from university N,
  o one fulltext from university 1,
  o one fulltext from university 2,
  o …,
  o one fulltext from university N,
  o …
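The round-robin ordering can be sketched with the standard library. The per-university queues and URLs are hypothetical, and the behaviour once a university has no fulltexts left (it is simply skipped in later rounds) is an assumption, as the description does not cover that case.

```python
from itertools import zip_longest


def round_robin(queues):
    """Return fulltext URLs in 'University Round Robin' order.

    queues: mapping from university code to its list of fulltext URLs.
    """
    skip = object()  # sentinel marking an exhausted queue
    order = []
    for one_round in zip_longest(*queues.values(), fillvalue=skip):
        # One URL per university per round; exhausted universities are skipped.
        order.extend(url for url in one_round if url is not skip)
    return order


# Hypothetical queues, for illustration only.
queues = {"AAU": ["a1", "a2", "a3"], "AU": ["b1"], "KU": ["c1", "c2"]}
print(round_robin(queues))
```

The resulting list would then be downloaded serially, one URL at a time.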
All downloads are done automatically by the OA Indicator download robot. Any repository holding the fulltexts (either the research databases of the universities or external repositories) can identify a download by the OA Indicator robot by:

• IP address: 192.38.67.38