Data Curation Profiles
AKA your midterm assignment
GRAD 521, Research Data Management Winter 2014 – Lecture 6
Amanda L. Whitmire, Asst. Professor
Roadmap
1. Characteristics & content of a Data Curation Profile (DCP)
2. The components of the DCP Toolkit
3. How these components fit together
4. Discussion: your ideas for the assignment
Characteristics of the DCP
• Tells “the story” of the data
• Focused on a specific data set: provides depth, not breadth
• Interview-based
• Meant to be “discipline neutral” & widely applicable to different types of data
• Modular – allows for flexibility & tailoring to specific situations and uses
Characteristics of the DCP
Represents the researcher’s needs & perspectives
A concise, structured document suitable for sharing & annotation
A resource for librarians, archivists, IT professionals, data managers & others
DCP Sections
Information about the data & its context
1. Summary of data curation needs
2. Overview of the research
   • Focus
   • Intended audience
   • Funding
3. Data types, formats & stages
   • Data narrative (data lifecycle)
   • Target data for sharing
   • Use/re-use value
   • Contextual narrative
Example: Data types & formats
Data Stage | Output | Typical File Size | Format | Other / Notes
Primary Data
Raw | Sensor data | 100k in 1 file per day | Proprietary to the sensor | FTP downloads are mostly automated.
Processing Stage 1 | Sensor data – open/accessible format | Roughly 6kb | .csv / .xls | Data are formatted into .csv before being reformatted into a MySQL database.
Processed | Data vectors | 800 records per intersection per day | SQL / .xls | Data are extracted from the MySQL database for analysis purposes.
Analyzed | Charts/graphs | | .xls / .emf | Charts and graphs used for interpretation.
Published | Charts/graphs | | .ppt | Data are presented via PowerPoint.
Ancillary Data
Image | Stills taken from video | | .gif / .jpg / .ppt | Images generated from video.
DCP Sections
Information about practices & needs
4. Intellectual property
5. Organization & description of data
6. Ingest
7. Sharing & access
8. Discovery
9. Tools
10. Interoperability
11. Measuring impact
12. Data management
13. Preservation
The DCP Toolkit
The Data Curation Profile Toolkit consists of 4 components:
1. User Guide
2. Interviewer’s Manual
3. Interview Worksheet
4. DCP Template
The User Guide
◻ Describes the rationale for the DCP
◻ Describes the process through which a DCP is generated
   Stage 1 – Preparation
   Stage 2 – Worksheet & interviews
   Stage 3 – Constructing the Profile
◻ Provides guidance & advice
Interview Worksheet & Manual
Meant to be used in tandem
• The Interview Worksheet is given to the researcher to fill out.
• The Interviewer’s Manual contains follow-up questions for the interviewer to ask once the researcher has filled out a module.
DCP Template
Provides the structure of the Data Curation Profile
Each section or sub-section contains a brief definition of the information that is needed to populate a Data Curation Profile
Module 2 Example
[Figure: Module 2 as it appears in the Interview Worksheet, the Interview Manual, the DCP Template, and the completed Profile]
Stages of the DCP
Preparation
Interviewing
Connections between the components, pulling it all together
How to Develop a DCP
A Data Curation Profile is developed through 3 stages:
Stage 1 – Preparation
Stage 2 – Interviews
Stage 3 – Constructing the Profile
Stage 1 – Preparing
Investigate researchers’ work and use of data (e.g., a recent article/grant)
• Faculty’s website
• Faculty publications
• Seminars
• Press releases
• Review of grants that have been awarded
Stage 1 – Preparing
Audio recording
• Strongly recommended
• Storage & safe-keeping of audio files
• Transcription
Stage 2 – Interviewing
Introduction to the Interview
• Need for two interviews?
• Time required
• Coverage
Stage 2 – Interviewing
Using the Interview Manual & Worksheet
1. Read any introductory statement listed in the “Interviewer’s Manual” (if any)
2. Then have the researcher complete the list of questions for the module in the “Interview Worksheet”
3. Review the responses and ask the questions listed in the “Interviewer’s Manual” as appropriate
4. Ask any follow-up questions you feel are needed
5. Move on to the next module
Stage 2 – Interviewing
Types of Worksheet Questions:
• Free text
• Short answer (text)
• Selecting from a range of possible responses
• Yes/No
• Likert scale
Stage 2 – Interviewing
Types of Questions: Manual
Explanatory – “Tell me why you selected ‘x’ as your response.”
Clarifying – “Could you explain what you mean by ‘x’?”
Probing – “Could you tell me more about ‘x’?”
Relational – “Could you tell me how ‘x’ relates to your earlier response of ‘y’?”
Transcribing the Interview
Full transcript [example shown on slide]
Transcribing the Interview
Indexed transcript [example shown on slide]
Modules & Sections of DCP
Connections b/w components
[Diagram labels: Interview Worksheet; Interview Manual; DCP Template]
Connections b/w components
[Diagram labels: Interview Worksheet Mod. 13 Q2; Interview Manual Mod. 13 Q2; Section 13.1 of the completed Data Curation Profile]
Discussion
1. Modifications to the DCP
2. Audio recording
3. Biggest challenges?
4. Realistic timeframe
Modifications to the DCP
How many of the profile modules do you want to include?
Required interview modules:
• Background demographic Q’s
• Mod.01: The dataset
• Mod.02: Lifecycle of dataset
• Mod.03: Sharing
• Mod.04: Access
• Mod.06: Organization & description of data
• Mod.12: Data Management
Optional interview modules:
• Mod.05: Data Xfer/ingest
• Mod.07: Discovery
• Mod.08: Intellectual property
• Mod.09: Tools
• Mod.10: Linking/Interop.
• Mod.11: Measuring impact
• Mod.13: Data Preservation
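If it helps to keep track of which modules you have decided to include, the required/optional split can be written down as a small data structure. The Python sketch below is only an illustration and is not part of the DCP Toolkit: the names `REQUIRED_MODULES`, `OPTIONAL_MODULES` and `build_interview_plan` are hypothetical, and the module titles are taken from the lists on this slide.

```python
# Hypothetical sketch: module numbers and titles come from the slide;
# the data structure and function names are assumptions for illustration.
# The background demographic questions are also required but have no
# module number, so they are not listed here.
REQUIRED_MODULES = {
    1: "The dataset",
    2: "Lifecycle of dataset",
    3: "Sharing",
    4: "Access",
    6: "Organization & description of data",
    12: "Data management",
}

OPTIONAL_MODULES = {
    5: "Data transfer/ingest",
    7: "Discovery",
    8: "Intellectual property",
    9: "Tools",
    10: "Linking/interoperability",
    11: "Measuring impact",
    13: "Data preservation",
}


def build_interview_plan(chosen_optional=()):
    """Return the ordered module list: all required plus the chosen optional ones."""
    unknown = set(chosen_optional) - set(OPTIONAL_MODULES)
    if unknown:
        raise ValueError(f"Not an optional module: {sorted(unknown)}")
    plan = dict(REQUIRED_MODULES)
    plan.update({n: OPTIONAL_MODULES[n] for n in chosen_optional})
    # Sort by module number so the interview follows the worksheet order.
    return [f"Mod.{n:02d}: {plan[n]}" for n in sorted(plan)]


if __name__ == "__main__":
    # Example: add the intellectual property and preservation modules.
    for line in build_interview_plan([8, 13]):
        print(line)
```

Calling `build_interview_plan()` with no arguments returns only the six required modules; passing a number that is not in the optional list raises a `ValueError`.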
Audio recording
Is it necessary & possible?
Would you & your interviewee be OK w/it?
Challenges?
What are you concerned about?
Realistic timeframe?
        M   Tu  W   Th  F
WEEK 4  27  28  29  30  31
WEEK 5  3   4   9   10  11
WEEK 6  14  15  16  17  18
WEEK 7  21  22  23  24  25
WEEK 8  28  29  30