Hands-on Exercises for Data Management http://www.dataone.org/education-modules
Hands-on Activity 7: Metadata
Associated DataONE Lecture: Lesson 7: Metadata
Objectives: Students consider the level of detail that is necessary for metadata to adequately describe datasets, and work with a metadata record.
Outcomes: (1) Students can explain why detailed metadata are valuable. (2) Students can provide suggestions for improving metadata descriptions.
Time Needed: 45 minutes in class.
URLs: Morpho (https://knb.ecoinformatics.org/#tools/morpho), DataUp (http://dataup.cdlib.org/)
Additional Files Needed: xlsx, zoop-temp-main.xlsx; zoop-temp.xlsx
Key Reading:
Borer, E.T., Seabloom, E.W., Jones, M.B., Schildhauer, M., 2009. Some Simple Guidelines for Effective Data Management. Bulletin of the Ecological Society of America 90, 205–214.
White, E.P., E. Baldridge, Z.T. Brym, K.J. Locey, D.J. McGlinn, and S.R. Supp. 2013. Nine simple ways to make it easier to (re)use your data. PeerJ PrePrints.

Notes and Instructions for Instructors:
Background: Plankton are microscopic organisms that form the base of many aquatic food webs, fueling the growth of fish and other larger organisms. It’s common to sample them using a net or another container that can be controlled to collect water just from certain depths, so you can see how plankton collected at the surface (0 meters) might be different from plankton at another depth (e.g. 10 meters below the surface). (For more information: http://en.wikipedia.org/wiki/Phytoplankton and http://en.wikipedia.org/wiki/Zooplankton.) They are identified and counted under a microscope, and usually their numbers are reported as individuals per liter or milliliter. Frequently, aquatic scientists collect plankton samples during both day (e.g. noon) and night (e.g. 2 am) because plankton change their distributions from day to night, and not all species alter their distributions in the same way. (For more information, search “diel vertical migration” on the web.)
You should have 3 (fictional) data files: pond2010.xlsx, zoop-temp-main.xlsx; zoop-temp.xlsx. These 3 files were all intended to be part of the same study: the investigators wanted to examine the day-night distribution of 2 species of zooplankton across multiple years. The type of zooplankton they studied is called rotifers generally, and specifically the genus Conochilus, in which groups of individual rotifers stick together in colonies (see http://eol.org/pages/43393/overview). The investigators plan to repeat this study for several more years.
The files have some problems in how they are organized, which you have already discussed in a previous exercise. Now let’s think about writing some good metadata that describes the dataset. Note that Activities 1-4 refer to the gray areas in the metadata record, which is found later in this document.
Activity 1
As individuals or in small groups, look through the files and locate all the information that describes these data: the metadata. Some of this information is found in this handout, and some of it is within the 3 data sheets provided. Describe where you found the information that is needed to populate the metadata record.
Example answer:
Look at the column headers in all the sheets, a brief table on zoop-temp.xlsx, and a second worksheet on zoop-temp-main.xlsx. Some trainees may also suggest that information is online or elsewhere, e.g. the geographical coordinates may be used to locate lake names, and information about the organisms may be published.
Activity 2
Now let’s focus on a metadata description just for pond2010.xlsx. Look at the table contained in the file. Write an appropriate title for this dataset.
Example answer:
There are many good answers here, but we are looking for very descriptive titles. Consider that keywords can be used to complement the titles so that they don’t get too long! Here’s one suggestion: Summer population density and colony size of Conochilus hippocrepis and Conochilus unicornis at multiple pond depths in Littlevick Pond Natural Reserve, Surrey, UK in 2010
Activity 3
“Time Period of Content” represents the time period over which the data were collected. What dates would you enter?
Example answer:
Look to the columns for dates. 5 June 2010 to 18 June 2010 is the time period covered by pond2010.xlsx. In the metadata record the dates would be represented as YYYYMMDD: 20100605 and 20100618.
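For instructors who want to show the date conversion programmatically, the following is a minimal Python sketch. The function names `to_yyyymmdd` and `time_period` are illustrative assumptions and not part of any metadata tool; only the two dates come from the example answer above.

```python
from datetime import date


def to_yyyymmdd(d: date) -> str:
    """Format a date as the compact YYYYMMDD string used in the record."""
    return d.strftime("%Y%m%d")


def time_period(sampling_dates):
    """Return (begin, end) as YYYYMMDD strings for a collection of dates."""
    dates = list(sampling_dates)
    return to_yyyymmdd(min(dates)), to_yyyymmdd(max(dates))


# Only the first and last sampling dates are given in the example answer;
# a real sheet would supply every date in its date column.
begin, end = time_period([date(2010, 6, 18), date(2010, 6, 5)])
print(begin, end)  # 20100605 20100618
```

Taking the minimum and maximum of the date column, rather than reading the first and last rows, guards against sheets that are not sorted by date.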
Activity 4
What would be some appropriate theme keywords for this dataset? Where can you find help for developing keywords?
Example answer:
Again, there are many good answers here. You may find that some of the same terms appear in both the title and keywords section. Words might be taxonomic, like: rotifers, zooplankton, plankton. They may describe the process that the researchers are studying, such as: diel vertical migration. Taxonomic references may include the Cowardin Wetland Classification System and other discipline-specific taxonomies. Place keyword thesauri could include the Geographic Names Information System (GNIS). Discuss relevant taxonomies with participants.
Activity 5
Take a look at the metadata record in this exercise. Note that there are a variety of domain types, and some are noted as “unrepresentable.” What might that mean?
Example answer:
Attributes such as temperature, diameter, and density are listed as “unrepresentable” instead of listing a range of values (i.e., 10-30 cm) because no absolute maximum and minimum value for the attribute is noted anywhere. A “percent” attribute is a good example of a range domain because the values must be greater than or equal to 0 and less than or equal to 100.
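The difference between the three domain types can be made concrete with a short Python sketch. The `Domain` class and its field names are assumptions made for illustration only; they are not taken from Morpho, DataUp, or any metadata standard. The depth values and species codes are the ones listed in the metadata record in this exercise, and the percent domain mirrors the example answer above.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Domain:
    """Hypothetical description of the values an attribute may take."""
    kind: str  # "enumerated", "range", or "unrepresentable"
    allowed: Sequence = ()
    minimum: Optional[float] = None
    maximum: Optional[float] = None

    def accepts(self, value) -> bool:
        if self.kind == "enumerated":
            # Only the listed values are valid.
            return value in self.allowed
        if self.kind == "range":
            # Any value between the documented bounds is valid.
            return self.minimum <= value <= self.maximum
        if self.kind == "unrepresentable":
            # No fixed list or bounds is documented, so the domain
            # itself cannot reject a value.
            return True
        raise ValueError(f"unknown domain kind: {self.kind}")


depth = Domain("enumerated", allowed=(0.5, 5, 10, 25, 50))
species = Domain("enumerated", allowed=("cuni", "chippo"))
percent = Domain("range", minimum=0, maximum=100)
temperature = Domain("unrepresentable")

print(depth.accepts(10), depth.accepts(7))   # True False
print(percent.accepts(101))                  # False
print(temperature.accepts(18.4))             # True
```

An unrepresentable domain accepts everything, which is why documenting a range, where one is known, makes later quality checks on the data possible.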
Pond2010 Metadata
This is some (fictional) information about the (fictional) dataset called pond2010.xlsx. The dataset can be used to fill in metadata fields in a formal record, such as the one below, but note that there may also be additional important metadata within the pond2010 file and its related files, zoop-temp-main.xlsx and zoop-temp.xlsx.
Title of the Dataset:
Originator/Dataset Author: Anna Sassin, Dan D. Lyons
Abstract: This dataset is one of a collection of four population survey datasets documenting colony growth, reproduction, and survival of two rotifer species (Conochilus unicornis and Conochilus hippocrepis) at four time periods of the year. This dataset describes population data for the summer season. Samples of both species were taken at Littlevick pond, Surrey, UK. Measurements taken include depth, temperature, colony density and colony diameter.
Purpose: Data were collected to evaluate how temperature and depth affect the survival of rotifer colonies in ponds within the UK.
Publication:
  Publisher: International Rotifer Recovery Science Center
  Place: Surrey, UK
  Publication_Date: 12/08/2012
  Series Name: Four Season Rotifer Survey
  Name of Issue: Summer Survey
Larger_Work_Citation:
  Originator: Sassin, Anna and Lyons, Dan D.
  Publication_Date: 12/08/2012
  Title: Relationships between population and temperature: Tracking rotifers over the course of four seasons in the United Kingdom.
  Publisher: Rotifer Conservation
  Place: UK
  Volume; Issue; Pages: 4(2):325-340
Time Period of Content:
  Begin Date:
  End Date:
Currentness Reference: Ground Condition
Progress/status: Complete
Maintenance_and_Update_Frequency: None planned
Geographic coverage: Littlevick Pond Natural Reserve, Surrey, UK.
Bounding_Coordinates:
  West_Bounding_Coordinate: -0.92456818028327
  East_Bounding_Coordinate: 0.371818538415
  North_Bounding_Coordinate: 51.511581803063
  South_Bounding_Coordinate: 50.808817656094
Keywords (theme):
Keywords (place): Surrey, UK, International, Littlevick Pond Natural Reserve
Keywords (temporal): summer, June
Data Access_Constraints: No legal or policy restriction for accessing this dataset.
Data Use_Constraints: Must properly cite originator if used in publications, reports, presentations, etc. Please cite dataset according to DataCite.org standards.
Contact_Person_Primary:
  Contact_Person: Tad Pohl (Data steward)
  Contact_Organization: International Rotifer Recovery Science Center
  Address: 5638 Independence Way
  City: Guildford
  State_or_Province: Surrey, UK
  Contact_Telephone: +44 (0)888-8888
Data_Set_Credit: Funding was provided by International Rotifer Foundation
Analytical_Tools: SAS, R, MatLab
Data_Quality_Information
Attribute_Accuracy_Report: Temperature instrument was tested and calibrated for accuracy before each sampling. Density and colony counts were conducted according to the Standard Plate Count procedure. Counts were conducted by two data counters. Each technician’s count was verified by the second technician. Counting accuracy was found to be 95%.
Completeness_Report: The dataset is generally complete, although the temperature for one sample depth could not be recorded due to instrument malfunction. Colony and density counts are also mostly complete, except for two instances where the data are missing and therefore unknown. A statistical summary (box plot) of the data was performed and no outstanding outliers or potentially erroneous values were found.
Positional_Accuracy: Positional accuracy was not assessed.
Process_Step:
  Process_Description: Data were collected by 2 people the first week and by the same 2 people the following week. Water samples and temperature were taken at five different depths. In order to account for variability in sample measurements, 6 water samples were taken at each depth. These 6 samples were later randomly divided into two even groups of three. The two groups were randomly assigned a rotifer species name whereby data counters would perform the density and colony counts for the particular species.
Entity and Attribute Information
Detailed_Description
Entity_Type:
  Entity_Type_Label: pond2010.xlsx
  Entity_Type_Definition: Rotifer population survey at various depths and temperature
Attribute:
  Attribute_Label: z
  Attribute_Definition: Depth in centimeters from the surface
  Attribute_Domain_Values: Enumerated_Domain:
    Enumerated_Domain_Value: 0.5
    Enumerated_Domain_Value_Definition: 0.5 cm below surface
    Enumerated_Domain_Value: 5
    Enumerated_Domain_Value_Definition: 5 cm below surface
    Enumerated_Domain_Value: 10
    Enumerated_Domain_Value_Definition: 10 cm below surface
    Enumerated_Domain_Value: 25
    Enumerated_Domain_Value_Definition: 25 cm below surface
    Enumerated_Domain_Value: 50
    Enumerated_Domain_Value_Definition: 50 cm below surface
Attribute:
  Attribute_Label: Temperature
  Attribute_Definition: Temperature of water in Celsius
  Attribute_Domain_Values: Unrepresentable_Domain
Attribute:
  Attribute_Label: Density
  Attribute_Definition: Number of individuals per colony
  Attribute_Domain_Values: Unrepresentable_Domain
Attribute:
  Attribute_Label: Colony Diameter
  Attribute_Definition: Length of longest colony diameter in millimeters
  Attribute_Domain_Values: Unrepresentable_Domain
Attribute:
  Attribute_Label: Species
  Attribute_Definition: Rotifer species
  Attribute_Domain_Values:
    Enumerated_Domain_Value: cuni
    Enumerated_Domain_Value_Definition: Conochilus unicornis
    Enumerated_Domain_Value: chippo
    Enumerated_Domain_Value_Definition: Conochilus hippocrepis
Distribution Information
Distributor
Contact_Information
Contact_Organization_Primary:
  Contact_Organization: Rotifer Network for Biocomplexity (RNB)
  Contact_Person: Metadata Coordinator
  Address: 6534 Biodata Way
  City: Novel Jersey
  State_or_Province: New Jersey
  Postal_Code: 97564
  Contact_Voice_Telephone: 555-555-1034
  Contact_Email: [email protected]
Distribution_Liability: The Rotifer Network for Biocomplexity (RNB) shall not be held liable for improper or incorrect use of the data described and/or contained herein. It is the responsibility of the data user to use the data appropriately and consistent with the limitations of the data.
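The Process_Description in the record above says that the 6 samples from each depth were randomly divided into two groups of three, and that each group was then randomly assigned a species. One way to make such a step reproducible, and therefore easier to document, is sketched below in Python. This is only an illustration under stated assumptions: the function name `split_samples`, the sample numbering 1 to 6, and the seed are hypothetical, and the record does not say how the investigators actually randomized.

```python
import random


def split_samples(sample_ids, species=("cuni", "chippo"), rng=None):
    """Randomly split samples into two equal groups, one per species code."""
    if rng is None:
        rng = random.Random()
    ids = list(sample_ids)
    if len(species) != 2 or len(ids) % 2 != 0:
        raise ValueError("need two species and an even number of samples")
    rng.shuffle(ids)                      # random division into halves
    half = len(ids) // 2
    groups = [ids[:half], ids[half:]]
    labels = list(species)
    rng.shuffle(labels)                   # random assignment of species
    return dict(zip(labels, groups))


# Hypothetical sample numbers 1-6 for one depth; a fixed seed makes the
# assignment repeatable so it can be recorded in the metadata.
assignment = split_samples(range(1, 7), rng=random.Random(2010))
for code, ids in assignment.items():
    print(code, sorted(ids))
```

Recording the seed and the tool used alongside the Process_Description would let a later reader reproduce the grouping exactly.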
Student Instructions:
Background: Plankton are microscopic organisms that form the base of many aquatic food webs, fueling the growth of fish and other larger organisms. It’s common to sample them using a net or another container that can be controlled to collect water just from certain depths, so you can see how plankton collected at the surface (0 meters) might be different from plankton at another depth (e.g. 10 meters below the surface). (For more information: http://en.wikipedia.org/wiki/Phytoplankton and http://en.wikipedia.org/wiki/Zooplankton.) They are identified and counted under a microscope, and usually their numbers are reported as individuals per liter or milliliter. Frequently, aquatic scientists collect plankton samples during both day (e.g. noon) and night (e.g. 2 am) because plankton change their distributions from day to night, and not all species alter their distributions in the same way. (For more information, search “diel vertical migration” on the web.)
You should have 3 (fictional) data files: pond2010.xlsx, zoop-temp-main.xlsx; zoop-temp.xlsx. These 3 files were all intended to be part of the same study: the investigators wanted to examine the day-night distribution of 2 species of zooplankton across multiple years. The type of zooplankton they studied is called rotifers generally, and specifically the genus Conochilus, in which groups of individual rotifers stick together in colonies (see http://eol.org/pages/43393/overview). The investigators plan to repeat this study for several more years.
The files have some problems in how they are organized, which you have already discussed in a previous exercise. Now let’s think about writing some good metadata that describes the dataset. Note that Activities 1-4 refer to the gray areas in the metadata record, which is found later on in this document.
Activity 1
As individuals or in small groups, look through the files and locate all the information that describes these data: the metadata. Some of this information is found in this handout, and some of it is within the 3 data sheets provided. Describe where you found the information that is needed to populate the metadata record.
Activity 2
Now let’s focus on a metadata description just for pond2010.xlsx. Look at the table contained in the file. Write an appropriate title for this dataset.
Activity 3
“Time Period of Content” represents the time period over which the data were collected. What dates would you enter?
Activity 4
What would be some appropriate theme keywords for this dataset? Where can you find help for developing keywords?
Activity 5
Take a look at the metadata record in this exercise. Note that there are a variety of domain types, and some are noted as “unrepresentable.” What might that mean?
Pond2010 Metadata
This is some (fictional) information about the (fictional) dataset called pond2010.xlsx. The dataset can be used to fill in metadata fields in a formal record, such as the one below, but note that there may also be additional important metadata within the pond2010 file and its related files, zoop-temp-main.xlsx and zoop-temp.xlsx.
Title of the Dataset:
Originator/Dataset Author: Anna Sassin, Dan D. Lyons
Abstract: This dataset is one of a collection of four population survey datasets documenting colony growth, reproduction, and survival of two rotifer species (Conochilus unicornis and Conochilus hippocrepis) at four time periods of the year. This dataset describes population data for the summer season. Samples of both species were taken at Littlevick pond, Surrey, UK. Measurements taken include depth, temperature, colony density and colony diameter.
Purpose: Data were collected to evaluate how temperature and depth affect the survival of rotifer colonies in ponds within the UK.
Publication:
  Publisher: International Rotifer Recovery Science Center
  Place: Surrey, UK
  Publication_Date: 12/08/2012
  Series Name: Four Season Rotifer Survey
  Name of Issue: Summer Survey
Larger_Work_Citation:
  Originator: Sassin, Anna and Lyons, Dan D.
  Publication_Date: 12/08/2012
  Title: Relationships between population and temperature: Tracking rotifers over the course of four seasons in the United Kingdom.
  Publisher: Rotifer Conservation
  Place: UK
  Volume; Issue; Pages: 4(2):325-340
Time Period of Content:
  Begin Date:
  End Date:
Currentness Reference: Ground Condition
Progress/status: Complete
Maintenance_and_Update_Frequency: None planned
Geographic coverage: Littlevick Pond Natural Reserve, Surrey, UK.
Bounding_Coordinates:
  West_Bounding_Coordinate: -0.92456818028327
  East_Bounding_Coordinate: 0.371818538415
  North_Bounding_Coordinate: 51.511581803063
  South_Bounding_Coordinate: 50.808817656094
Keywords (theme):
Keywords (place): Surrey, UK, International, Littlevick Pond Natural Reserve
Keywords (temporal): summer, June
Data Access_Constraints: No legal or policy restriction for accessing this dataset.
Data Use_Constraints: Must properly cite originator if used in publications, reports, presentations, etc. Please cite dataset according to DataCite.org standards.
Contact_Person_Primary:
  Contact_Person: Tad Pohl (Data steward)
  Contact_Organization: International Rotifer Recovery Science Center
  Address: 5638 Independence Way
  City: Guildford
  State_or_Province: Surrey, UK
  Contact_Telephone: +44 (0)888-8888
Data_Set_Credit: Funding was provided by International Rotifer Foundation
Analytical_Tools: SAS, R, MatLab
Data_Quality_Information
Attribute_Accuracy_Report: Temperature instrument was tested and calibrated for accuracy before each sampling. Density and colony counts were conducted according to the Standard Plate Count procedure. Counts were conducted by two data counters. Each technician’s count was verified by the second technician. Counting accuracy was found to be 95%.
Completeness_Report: The dataset is generally complete, although the temperature for one sample depth could not be recorded due to instrument malfunction. Colony and density counts are also mostly complete, except for two instances where the data are missing and therefore unknown. A statistical summary (box plot) of the data was performed and no outstanding outliers or potentially erroneous values were found.
Positional_Accuracy: Positional accuracy was not assessed.
Process_Step:
  Process_Description: Data were collected by 2 people the first week and by the same 2 people the following week. Water samples and temperature were taken at five different depths. In order to account for variability in sample measurements, 6 water samples were taken at each depth. These 6 samples were later randomly divided into two even groups of three. The two groups were randomly assigned a rotifer species name whereby data counters would perform the density and colony counts for the particular species.
Entity and Attribute Information
Detailed_Description
Entity_Type:
  Entity_Type_Label: pond2010.xlsx
  Entity_Type_Definition: Rotifer population survey at various depths and temperature
Attribute:
  Attribute_Label: z
  Attribute_Definition: Depth in centimeters from the surface
  Attribute_Domain_Values: Enumerated_Domain:
    Enumerated_Domain_Value: 0.5
    Enumerated_Domain_Value_Definition: 0.5 cm below surface
    Enumerated_Domain_Value: 5
    Enumerated_Domain_Value_Definition: 5 cm below surface
    Enumerated_Domain_Value: 10
    Enumerated_Domain_Value_Definition: 10 cm below surface
    Enumerated_Domain_Value: 25
    Enumerated_Domain_Value_Definition: 25 cm below surface
    Enumerated_Domain_Value: 50
    Enumerated_Domain_Value_Definition: 50 cm below surface
Attribute:
  Attribute_Label: Temperature
  Attribute_Definition: Temperature of water in Celsius
  Attribute_Domain_Values: Unrepresentable_Domain
Attribute:
  Attribute_Label: Density
  Attribute_Definition: Number of individuals per colony
  Attribute_Domain_Values: Unrepresentable_Domain
Attribute:
  Attribute_Label: Colony Diameter
  Attribute_Definition: Length of longest colony diameter in millimeters
  Attribute_Domain_Values: Unrepresentable_Domain
Attribute:
  Attribute_Label: Species
  Attribute_Definition: Rotifer species
  Attribute_Domain_Values:
    Enumerated_Domain_Value: cuni
    Enumerated_Domain_Value_Definition: Conochilus unicornis
    Enumerated_Domain_Value: chippo
    Enumerated_Domain_Value_Definition: Conochilus hippocrepis
Distribution Information
Distributor
Contact_Information
Contact_Organization_Primary:
  Contact_Organization: Rotifer Network for Biocomplexity (RNB)
  Contact_Person: Metadata Coordinator
  Address: 6534 Biodata Way
  City: Novel Jersey
  State_or_Province: New Jersey
  Postal_Code: 97564
  Contact_Voice_Telephone: 555-555-1034
  Contact_Email: [email protected]
Distribution_Liability: The Rotifer Network for Biocomplexity (RNB) shall not be held liable for improper or incorrect use of the data described and/or contained herein. It is the responsibility of the data user to use the data appropriately and consistent with the limitations of the data.
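The Bounding_Coordinates in the record above are easy to mistype, so a quick consistency check can be worthwhile before a record is submitted. The Python sketch below is an illustration only: the `BoundingBox` class and its method names are assumptions, not part of any metadata editor, and the four coordinate values are copied from the record.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BoundingBox:
    """Hypothetical holder for the four bounding coordinates (decimal degrees)."""
    west: float
    east: float
    north: float
    south: float

    def problems(self) -> list:
        """Return a list of human-readable issues; empty if none are found."""
        issues = []
        if not (-180 <= self.west <= 180 and -180 <= self.east <= 180):
            issues.append("longitude outside [-180, 180]")
        if not (-90 <= self.south <= 90 and -90 <= self.north <= 90):
            issues.append("latitude outside [-90, 90]")
        if self.south > self.north:
            issues.append("south coordinate is greater than north coordinate")
        if self.west > self.east:
            issues.append("west coordinate is greater than east coordinate")
        return issues

    def extent_degrees(self):
        """Return (width, height) of the box in degrees."""
        return self.east - self.west, self.north - self.south


box = BoundingBox(
    west=-0.92456818028327,
    east=0.371818538415,
    north=51.511581803063,
    south=50.808817656094,
)
print(box.problems())  # []
width, height = box.extent_degrees()
print(round(width, 3), round(height, 3))
```

Printing the extent alongside the checks gives a reader a chance to ask whether the box is a sensible size for the place being described.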