NISO Webinar: Authority Control: Are You Who We Say You Are?

NISO Webinar: Authority Control: Are You Who We Say You Are? Wednesday, February 11, 2015. Speakers: Simeon Warner, Director of Repository Development, Cornell University Library; Laura Dawson, Product Manager, ProQuest; Thomas Hickey, Chief Scientist, OCLC. http://www.niso.org/news/events/2015/webinars/authority_control/


Page 1: NISO Webinar:  Authority Control: Are You Who We Say You Are?

NISO Webinar Authority Control:

Are You Who We Say You Are?

Wednesday, February 11, 2015

Speakers:

Simeon Warner, Director of Repository Development, Cornell University Library

Laura Dawson, Product Manager, ProQuest

Thomas Hickey, Chief Scientist, OCLC

http://www.niso.org/news/events/2015/webinars/authority_control/

Page 2: NISO Webinar:  Authority Control: Are You Who We Say You Are?

ORCID identifiers in research

workflows

Simeon Warner, Cornell University Library

with thanks to

Laure Haak, ORCID Executive Director and

Josh Brown, ORCID Regional Director, Europe

for slides and comments

NISO Webinar:

Authority Control: Are You Who We Say You Are?

February 11, 2015

Page 3: NISO Webinar:  Authority Control: Are You Who We Say You Are?

“Use ORCID iDs in research

workflows to solve name

ambiguity and save everyone

a bunch of effort!”

Page 4: NISO Webinar:  Authority Control: Are You Who We Say You Are?

ORCID background

• open - anyone can register, any organization with interest in

research and scholarly communications can join, iDs intended

for reuse, software open source

• non-profit - incorporated in USA, also ORCID EU

• community-driven - where community includes all sectors of

research process including publishers, funders, universities,

and the researchers themselves

two core functions:

1. a registry to obtain a unique identifier and manage a record of activities

2. APIs that support system-to-system communication and authentication

see: http://orcid.org/content/initiative

Page 5: NISO Webinar:  Authority Control: Are You Who We Say You Are?

ORCID status and adoption

A little over 2 years since launch: over 1.1M iDs created, over 190 members from all sectors and around the world.

[Chart: cumulative ORCID iDs created, October 2012 to August 2014, on a scale of 0 to 900,000, broken down by creation method: Creator, Website, Trusted Party]

Members by sector: Universities & Research Orgs 45%, Publishing 25%, Associations 12%, Repositories & Profile Systems 11%, Funders 7%

Members by region: Americas 50%, EMEA 35%, AsiaPac 15%

Page 6: NISO Webinar:  Authority Control: Are You Who We Say You Are?

National integrations and membership

http://openaccess.blogg.kb.se/2013/01/30/slutrapport-fran-projekt-forfattarindentifikatorer/

http://www.jisc.ac.uk/whatwedo/programmes/di_researchmanagement/researchinformation/orcid.aspx

http://orcid.org/blog/2014/09/03/denmark-adopts-orcid-consortium-approach-orcid-implementation

http://orcidpilot.jiscinvolve.org/wp/

Page 7: NISO Webinar:  Authority Control: Are You Who We Say You Are?

ORCID Scope

ORCID = Open RESEARCHER AND CONTRIBUTOR Identifier

o Research activities

o Living people

o There are fewer researchers than the scope of people and

personas covered by ISNI or VIAF

CONTRIBUTOR -- ORCID is intended to be used for the spectrum of actors in the research process, not just authors, and records roles.

o Already supports roles like translator, principal investigator

o 2012 Harvard Workshop http://projects.iq.harvard.edu/attribution_workshop/home

o 2014 Project CRediT Workshop http://www.eventbrite.ca/e/project-credit-workshop-tickets-10314211083

Page 8: NISO Webinar:  Authority Control: Are You Who We Say You Are?

Researcher driven

Creation methods:

• integrations dominate

• website second

• institutional creation

Researcher must be involved to create or activate the ORCID iD,

and can control the privacy settings and/or add information.

Recommend institutions use the trusted party creation method rather than direct record creation. Need to connect with and educate users anyway. Can pre-populate registration fields.

[Chart: cumulative ORCID iDs created, October 2012 to August 2014, by creation method: Creator, Website, Trusted Party]

Page 9: NISO Webinar:  Authority Control: Are You Who We Say You Are?

Leveraging ISNI Organization IDs

ORCID uses Ringgold (an ISNI registrar) organization list to support

connection between individuals and education and employment

affiliations.

Page 10: NISO Webinar:  Authority Control: Are You Who We Say You Are?

Leveraging FundRef identifiers

Funding agency list coordinated with FundRef

Auto-complete based

on FundRef data

Page 11: NISO Webinar:  Authority Control: Are You Who We Say You Are?

Integration of ORCID iDs in research

workflows

Page 12: NISO Webinar:  Authority Control: Are You Who We Say You Are?

Publication round trip

ORCID iDs are intended to be integrated into research and

publication workflows, and become embedded in the

metadata. ORCID iDs will thus be associated with new

works at the time of publication.

[Diagram labels: ORCID record; Manuscript Submission; Review; Publication with DOI & ORCID iD(s); CrossRef DOI assignment; Verified ORCID, update permission; Readers]

Page 13: NISO Webinar:  Authority Control: Are You Who We Say You Are?

Round trip process and implications

Publisher captures ORCID iD during manuscript submission

o Authenticated process, no mistyping, accurate

o User may grant permission to add works later

Publisher includes ORCID iD in metadata when minting DOI

o Will be available to support discovery

o Available in CrossRef search

Publisher/CrossRef writes metadata back to ORCID record

o Holder notified, can control visibility

o Saves effort updating record

o Information flow to other systems such as local profile (e.g.

I've linked my ORCID record with my VIVO profile)

Similar process for datasets, mediated by DataCite

ref: http://orcid.org/blog/2014/11/21/new-functionality-friday-auto-update-your-orcid-record
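
The round trip above can be sketched as a small data flow. This is a minimal illustration under stated assumptions, not ORCID's or CrossRef's API: the class, the function names, the `allows_updates` flag, and the DOI in the usage note are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class OrcidRecord:
    orcid: str
    allows_updates: bool  # permission the holder granted at submission (assumed flag)
    works: list = field(default_factory=list)


def publication_metadata(doi, title, author_orcids):
    """Metadata deposited when the DOI is minted, with ORCID iDs embedded."""
    return {"doi": doi, "title": title, "orcids": list(author_orcids)}


def write_back(metadata, records):
    """Add the work to each listed author's record that allows updates.

    Returns the iDs whose records were updated.
    """
    updated = []
    for orcid in metadata["orcids"]:
        record = records.get(orcid)
        if record is not None and record.allows_updates:
            record.works.append(metadata["doi"])
            updated.append(orcid)
    return updated
```

For example, with a record for 0000-0002-7970-7855 that allows updates and a placeholder DOI such as "10.1234/example", `write_back` appends the DOI to that record's works and leaves any record without permission untouched.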

Page 14: NISO Webinar:  Authority Control: Are You Who We Say You Are?

Funder workflow

• Use for applicants and reviewers

• Profile data reduces applicant/grantee form filling burden

• Improve reporting accuracy

• Pull publications, datasets and other works based on ORCID iD

ref: http://support.orcid.org/knowledgebase/articles/426596-orcid-funder-workflow

Page 15: NISO Webinar:  Authority Control: Are You Who We Say You Are?

An ounce of ambiguity avoidance is worth a

pound of disambiguation

-- with apologies to Benjamin Franklin

• Workflow integration avoids name ambiguity at source

• Resulting data good for disambiguation of older data

• Resulting data good for compilation of authority records

Page 16: NISO Webinar:  Authority Control: Are You Who We Say You Are?

“How much information should my

ORCID record have?”

Page 17: NISO Webinar:  Authority Control: Are You Who We Say You Are?

Minimal record

Registration is really quick and

easy, 30 seconds perhaps

1. name

2. email

3. password

4. agree to privacy policy and

conditions

A minimal ORCID record that is

enough to get an iD and use it in

research workflows

Page 18: NISO Webinar:  Authority Control: Are You Who We Say You Are?
Page 19: NISO Webinar:  Authority Control: Are You Who We Say You Are?

Helpful ORCID record

Reasons to add a little more information:

1. Provide enough information so that someone who follows a

link to your record, or searches for you, can understand which

"John Smith" you are

o alternate names

o education and employment information

o a few works. Everyone likes to show off their best work …

o opens the door for disambiguation of existing data

2. Provide other identifiers so that ORCID can act as a switchboard to connect your identities in different systems.

o local profile id (e.g. my VIVO id at Cornell)

o Scopus Author ID, Researcher ID, ISNI

o (Using the search and link wizards that connect to these

other systems is also the easiest way to add works.)

Page 20: NISO Webinar:  Authority Control: Are You Who We Say You Are?
Page 21: NISO Webinar:  Authority Control: Are You Who We Say You Are?

Expansive ORCID record

There are many import wizards, which allow not only

o connection of an ORCID record to other identifiers

o but also import of works, grants, etc.

o the source is recorded and provides a way to assess trust

ORCID registry has facilities for users to enter works themselves, specify their roles, etc.

ORCID UI groups information about the same work from multiple

sources

o user may select preferred one to display
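
Grouping entries for the same work can be sketched as follows. This is an illustration only: it assumes entries are matched on a shared DOI and that the preferred version is chosen by source name, neither of which is stated on the slide, and the field names are hypothetical rather than ORCID's data model.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class WorkEntry:
    doi: str     # identifier used to recognise "the same work" (assumption)
    title: str
    source: str  # who asserted the entry, e.g. a publisher or the user


def group_by_work(entries):
    """Group entries that share a DOI, comparing DOIs case-insensitively."""
    groups = defaultdict(list)
    for entry in entries:
        groups[entry.doi.lower()].append(entry)
    return dict(groups)


def preferred(entries, preferred_source):
    """Pick the entry from the user's preferred source, else the first one."""
    for entry in entries:
        if entry.source == preferred_source:
            return entry
    return entries[0]
```

Keeping every entry in the group, rather than merging them, preserves the recorded source of each assertion.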

You may make your ORCID record a complete picture of your research contributions if you choose. But a complete record isn't necessary for ORCID to work.

Page 22: NISO Webinar:  Authority Control: Are You Who We Say You Are?

ORCID as a hub identifier

Page 23: NISO Webinar:  Authority Control: Are You Who We Say You Are?

ORCID is a hub

[Diagram: ORCID at the centre of a hub linking Other Identifiers, Funders, Higher Education and Employers, Professional Associations, Repositories, and Publishers. Identifiers shown: DOI, ISBN, Thesis ID, ISNI, Researcher ID, Scopus Author ID, internal identifiers, Member ID, Abstract ID, FundRef, GrantID]

The ORCID identifier connects researchers with their works (papers, grants, datasets, and more), organizations, and other identifiers.

ORCID APIs enable data exchange between research information systems.

Page 24: NISO Webinar:  Authority Control: Are You Who We Say You Are?

Hub identifier linking to other

identifiers and to profiles in

other systems

Page 25: NISO Webinar:  Authority Control: Are You Who We Say You Are?

… and data in machine form too

$ curl -H "Accept: application/orcid+xml" "http://pub.orcid.org/0000-0002-7970-7855/orcid-bio" | grep external-id-url

<external-id-url>http://isni.org/isni/0000000351311901</external-id-url>

<external-id-url>http://vivo.cornell.edu/individual/individual24416</external-id-url>

<external-id-url>http://www.researcherid.com/rid/E-2423-2011</external-id-url>

<external-id-url>http://www.scopus.com/inward/authorDetails.url?authorID=7103063073&amp;partnerID=MN8TOARS</external-id-url>
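
The same extraction can be done with an XML parser instead of grep. A minimal sketch, with assumptions labelled: the `<external-identifiers>` wrapper in the sample is a simplified stand-in for the real response, and `external_id_urls` is an illustrative name. Only the four URLs come from the slide.

```python
import xml.etree.ElementTree as ET

# Simplified stand-in for an ORCID bio response: the <external-identifiers>
# wrapper is hypothetical; the URLs are the ones shown on the slide.
SAMPLE_XML = """\
<external-identifiers>
  <external-id-url>http://isni.org/isni/0000000351311901</external-id-url>
  <external-id-url>http://vivo.cornell.edu/individual/individual24416</external-id-url>
  <external-id-url>http://www.researcherid.com/rid/E-2423-2011</external-id-url>
  <external-id-url>http://www.scopus.com/inward/authorDetails.url?authorID=7103063073&amp;partnerID=MN8TOARS</external-id-url>
</external-identifiers>
"""


def external_id_urls(xml_text: str) -> list[str]:
    """Return the text of every external-id-url element, namespaced or not."""
    root = ET.fromstring(xml_text)
    urls = []
    for element in root.iter():
        # ElementTree writes namespaced tags as "{uri}local"; keep the local part.
        local_name = element.tag.rsplit("}", 1)[-1]
        if local_name == "external-id-url" and element.text:
            urls.append(element.text.strip())
    return urls
```

Unlike grep, the parser decodes `&amp;` to `&`, so the Scopus URL comes back in a directly usable form.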

Page 26: NISO Webinar:  Authority Control: Are You Who We Say You Are?
Page 27: NISO Webinar:  Authority Control: Are You Who We Say You Are?
Page 28: NISO Webinar:  Authority Control: Are You Who We Say You Are?
Page 29: NISO Webinar:  Authority Control: Are You Who We Say You Are?

Thanks for listening!

Page 30: NISO Webinar:  Authority Control: Are You Who We Say You Are?

Pointers

Register at https://orcid.org/register if you haven’t already!

http://orcid.org/

• Research organizations: http://orcid.org/organizations/institutions

• Publishers: http://orcid.org/organizations/publishers

• Associations: http://orcid.org/organizations/associations

• Funders: http://orcid.org/organizations/funders

• Researchers: http://orcid.org/content/initiative

Membership http://orcid.org/about/membership

• Questions: [email protected]

Blog http://orcid.org/category/newsletter/blog

Slides: http://www.slideshare.net/simeonwarner/orcid-identifiers-in-research-workflows

Page 31: NISO Webinar:  Authority Control: Are You Who We Say You Are?

ISNI

Disambiguating Public Identities

Page 32: NISO Webinar:  Authority Control: Are You Who We Say You Are?

What Is ISNI

• ISO Standard, published in 2012

• International Standard Name Identifier

• Numerical representation of a name

– 16 digits

– Assigned to public figures, contributors of content –

researchers, authors, musicians, actors, publishers,

research institutions – and subjects of that content (if

they are people or institutions).

– Example: 0000 0004 1029 5439
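
The last of the 16 characters is a check character. A minimal sketch of validating it, assuming the ISO 7064 MOD 11-2 scheme that ISNI and ORCID iDs use for the final character; the function names are illustrative and not part of any ISNI tooling.

```python
def check_character(base_digits: str) -> str:
    """Compute the ISO 7064 MOD 11-2 check character for the 15 base digits."""
    total = 0
    for ch in base_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_identifier(identifier: str) -> bool:
    """Validate a 16-character ISNI or ORCID iD, ignoring spaces and hyphens."""
    compact = identifier.replace(" ", "").replace("-", "").upper()
    base = compact[:15]
    if len(compact) != 16 or not (base.isascii() and base.isdigit()):
        return False
    return check_character(base) == compact[15]


if __name__ == "__main__":
    print(is_valid_identifier("0000 0004 1029 5439"))  # ISNI example from this slide: True
    print(is_valid_identifier("0000-0002-7970-7855"))  # ORCID iD from the curl example: True
```

A check like this catches most single-character typing errors before an identifier is stored or sent to a lookup service.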

Page 33: NISO Webinar:  Authority Control: Are You Who We Say You Are?

Who is ISNI

• Founding members

– IFRRO (International Federation of Reproduction Rights Organizations)

– CISAC (International Confederation of Societies of Authors and Composers)

– SCAPR (Societies’ Council for the Collective Management of Performers’ Rights)

– OCLC

– CENL (Conference of European National Librarians), represented by the British Library and the National Library of France

– ProQuest, represented by Bowker

Page 34: NISO Webinar:  Authority Control: Are You Who We Say You Are?

ISNI Organizational Structure

[Diagram labels: Members; Board of Directors; Quality Team; Registration Agencies; Ongoing assignments / general public]

Page 35: NISO Webinar:  Authority Control: Are You Who We Say You Are?

How Does ISNI Registration Work

• Publisher submits names for assignment through a Registration Agency

• RA works with the publisher to ensure the data feed is well-formatted, and sends that feed to the Assignment Agency

• AA assigns as many ISNIs to the names in the feed as it can, using complex algorithms and business rules that evolve with each feed

• AA returns a file of names with ISNIs attached to them

– This may not be the full file of names

– Ambiguous names are held for review by Quality Team

– QT assignments and other exceptions (assignments as a result of improvements to the algorithm) are returned to RA quarterly

– Process is not instant. Assignment may be immediate if the name and other information are unique, but frequently assignments take a week or two.

Page 36: NISO Webinar:  Authority Control: Are You Who We Say You Are?

Stage One

Customer submits data to Registration Agency

Registration Agency sends file to Assignment Agency

Assignment Agency assigns as many ISNIs to the names as it can

Page 37: NISO Webinar:  Authority Control: Are You Who We Say You Are?

Stage Two

Assignment Agency sends assigned file to

Registration Agency

Registration Agency sends assigned file to

Customer

Customer reviews, QAs, ingests

Page 38: NISO Webinar:  Authority Control: Are You Who We Say You Are?

Stage Three

Assignment Agency sends updates on a monthly basis

Registration Agency disperses files to appropriate

Customers

Customers ingest updates

Page 39: NISO Webinar:  Authority Control: Are You Who We Say You Are?

Display

• Only minimal metadata is displayed

• Not meant as a comprehensive profile

• ISNI is a tool for linking data sets, collocation, and

disambiguation

• Enhancements to the record can be made but are not required

Page 40: NISO Webinar:  Authority Control: Are You Who We Say You Are?

Sample Public ISNI Record

Page 41: NISO Webinar:  Authority Control: Are You Who We Say You Are?

Bridge identifier linking disparate data sets

ISNI links


Page 42: NISO Webinar:  Authority Control: Are You Who We Say You Are?

Who is using ISNIs?

• Wikipedia/Wikidata

• VIAF

• Access Copyright

• Scholar Universe

• British Library

• JISC

• Musicbrainz

• Macmillan (Digital Science)

• Booknet Canada (piloting)

• Authors Guild (piloting)

• Books in Print ONIX 2.1 extracts (sent to Google, B&N, Chegg and others)

Page 43: NISO Webinar:  Authority Control: Are You Who We Say You Are?

Einstein’s Wikipedia Page

Page 44: NISO Webinar:  Authority Control: Are You Who We Say You Are?

How many names in the ISNI database?

• Over 8,000,000 assigned

• 10,112,931 provisional (awaiting a match from another

data set for corroboration)

• Your author names may well already have ISNIs.

http://www.isni.org/search.

Page 45: NISO Webinar:  Authority Control: Are You Who We Say You Are?

Use Case: Publisher

Page 46: NISO Webinar:  Authority Control: Are You Who We Say You Are?

Use Case: Research Institution

Page 47: NISO Webinar:  Authority Control: Are You Who We Say You Are?

Use Case: University

Page 48: NISO Webinar:  Authority Control: Are You Who We Say You Are?

Use Case: Cross-Domain Linking

Page 49: NISO Webinar:  Authority Control: Are You Who We Say You Are?

Use Case: Cross-Domain Linking

Page 50: NISO Webinar:  Authority Control: Are You Who We Say You Are?

Data Quality

• Based on matching names to existing records in

database (over 17 million names)

• Strict criteria for assigning ISNIs to names

• Quality team oversight (manual edits)

– British Library

– National Library of France

– OCLC


Page 51: NISO Webinar:  Authority Control: Are You Who We Say You Are?

Assignment Criteria

• If on the common surname list:

– Birth date

– Death date

– ISBN(s)

– Title(s)

– Co-authors or institutional affiliation

• If not on the common surname list

– Title(s)

– Birth date

– Death date

– Any other distinguishing factors (“is not”)

• If unique

– Immediate assignment
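
The criteria above can be read as a choice of which evidence fields to consult. The sketch below only encodes that choice. It makes no claim about how much evidence ISNI requires, and the surname set, field names, and function names are hypothetical stand-ins.

```python
# Hypothetical stand-in for ISNI's common surname list.
COMMON_SURNAMES = {"smith", "jones"}

# Evidence fields listed on the slide for each case (field names are illustrative).
COMMON_SURNAME_EVIDENCE = (
    "birth_date", "death_date", "isbns", "titles", "co_authors_or_affiliation",
)
OTHER_EVIDENCE = (
    "titles", "birth_date", "death_date", "other_distinguishing_factors",
)


def evidence_fields(surname: str) -> tuple:
    """Return the evidence fields that apply to this surname."""
    if surname.lower() in COMMON_SURNAMES:
        return COMMON_SURNAME_EVIDENCE
    return OTHER_EVIDENCE


def available_evidence(record: dict) -> list:
    """List the applicable evidence fields that are actually filled in."""
    return [name for name in evidence_fields(record["surname"]) if record.get(name)]
```

A record such as `{"surname": "Smith", "birth_date": "1950"}` would report `["birth_date"]`; deciding whether that is enough to assign an ISNI is left to the real business rules and the Quality Team.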


Page 52: NISO Webinar:  Authority Control: Are You Who We Say You Are?

ISNI and ORCID

• ORCID numbers are a subset of the numbers in ISNI’s

database

• Working towards alignment, with ultimate goal of single

assignment

• There is ISNI representation on the ORCID Technical

Steering Group, and ORCID representation on the ISNI

Technical Committee

• A researcher may have both an ORCID and an ISNI


Page 53: NISO Webinar:  Authority Control: Are You Who We Say You Are?

Do You Have An ISNI?


Page 55: NISO Webinar:  Authority Control: Are You Who We Say You Are?

Thomas Hickey

Chief Scientist, OCLC Research

2015 February

NISO Webinar on Authority Control

VIAF Relations

VIAF

Page 56: NISO Webinar:  Authority Control: Are You Who We Say You Are?

Virtual International Authority File

• Grew out of collaboration with national libraries

• Implemented and run by OCLC

• VIAF Council helps oversee it

• ~36 files, mainly from national authority files

• Everything libraries control other than topical subject headings is in scope

– Personals, corporates, families

– Jurisdictionals, geographics

– Works, expressions

– Imaginary characters, etc.


Page 57: NISO Webinar:  Authority Control: Are You Who We Say You Are?


Page 58: NISO Webinar:  Authority Control: Are You Who We Say You Are?


Page 59: NISO Webinar:  Authority Control: Are You Who We Say You Are?


Page 60: NISO Webinar:  Authority Control: Are You Who We Say You Are?


Page 61: NISO Webinar:  Authority Control: Are You Who We Say You Are?


Page 62: NISO Webinar:  Authority Control: Are You Who We Say You Are?

Why multiple files?

• Different

– Information collected

• Private vs. public

• Identification vs. comprehensive

– Technologies and systems

• APIs

– Time scales

• Batch vs. interactive creation

• Historical vs. contemporary

– Business models


Page 63: NISO Webinar:  Authority Control: Are You Who We Say You Are?

VIAF’s characteristics

• Origins: library authorities

• What is being identified: entities libraries control

• Who creates it: library staff

• Range of entities: very broad

• Priorities and control: libraries

• What can be shared: open

Page 64: NISO Webinar:  Authority Control: Are You Who We Say You Are?

Relationship with ISNI

• Both systems run by OCLC

– VIAF helped get ISNI started

• Problems

– Each absorbs the other’s data

– Feedback loops!

• Who’s in charge?

– ISNI now indicates reviewed records

• Relationships treated as though from xA

• Can both merge and split VIAF clusters

Page 65: NISO Webinar:  Authority Control: Are You Who We Say You Are?

Wikipedia & Wikidata

Page 66: NISO Webinar:  Authority Control: Are You Who We Say You Are?

Wikipedia & Wikidata

Page 67: NISO Webinar:  Authority Control: Are You Who We Say You Are?

Wikipedia & Wikidata

Page 68: NISO Webinar:  Authority Control: Are You Who We Say You Are?

Wikipedia & Wikidata

Page 69: NISO Webinar:  Authority Control: Are You Who We Say You Are?

Wikipedia & Wikidata

Page 70: NISO Webinar:  Authority Control: Are You Who We Say You Are?

Relationship with Wikipedia

• VIAF harvests Wikipedia dumps monthly

• Pages about people that are in VIAF are added

• VIAFbot back-loaded links into Wikipedia

– http://en.wikipedia.org/wiki/User:VIAFbot

Page 71: NISO Webinar:  Authority Control: Are You Who We Say You Are?

Relationship with WorldCat

• One of the main uses of VIAF internally at OCLC is controlling names

• Multilingual Bibliographic Structure project

• Generate ‘xR’ authority records

– Works

– Expressions

Page 72: NISO Webinar:  Authority Control: Are You Who We Say You Are?

[Diagram: three groups labelled OCLC Production Services, External OCLC Research Systems, and Internal OCLC Research Resources. Items shown: enhanced WorldCat, Kindred Works, Classify, Identities, FictionFinder, Cookbook Finder, LCSH, FAST, VIAF, GMGPC, Linked Data Entities, WORKS, GSAFD, GTT, DDC, LCTGM, MeSH]

Page 73: NISO Webinar:  Authority Control: Are You Who We Say You Are?

[Diagram labels: enhanced WorldCat; WORKS; xR; Sandbox; Multi-lingual Bib Records; VIAF; FRBR Clustering]

Page 74: NISO Webinar:  Authority Control: Are You Who We Say You Are?

Unexpected interactions

• Drive towards comprehensiveness

– More information about entities

– More entities

• Importing other files

• Keeping up with updates

• Recognizing source of information

• What to trust

• How to leverage limited staff

Page 75: NISO Webinar:  Authority Control: Are You Who We Say You Are?

Thank you

Page 76: NISO Webinar:  Authority Control: Are You Who We Say You Are?

NISO Webinar • February 11, 2015

Questions?

All questions will be posted with presenter answers on

the NISO website following the webinar:

http://www.niso.org/news/events/2015/webinars/authority_control/

NISO Webinar

Authority Control:

Are You Who We Say You Are?

Page 77: NISO Webinar:  Authority Control: Are You Who We Say You Are?

Thank you for joining us today.

Please take a moment to fill out the brief online survey.

We look forward to hearing from you!

THANK YOU