Send Me a Disk, Ok?
Sharing Genealogical Information With Your Relatives
Beau Sharbrough
[email protected]
PO Box 3170
Grapevine TX 76099-3170
Thank you.
To the CIG. I’m grateful for the invitation to be here.
To Russ and Birdie Holsclaw. They took care of me the past three days, sharing their home, their cars, their community, and their son Will.
To Roger Ebert. I was starting to worry about my weight.
General Topics
Five steps to understanding what they’re saying
Discussion of software developers’ methods of merging files
Significance of GENTECH Genealogical Data Model
Five Steps to Combining Your Research
Step 1. Determine What Form the Data Is in.
Which program do they use?
What type of disk drives do they have?
What general field usage have they adopted?
Step 2. Exchange Pedigree and Group Sheet Examples.
Look for detail, accuracy, thoroughness.
Are there full or partial dates?
Do the citations for US places include counties? Streets? Cemetery names?
Are nicknames used in place of “real” names?
Are sources cited?
Step 3. Agree on Usage of Fields.
RESIdes or ADDRess?
Will you both use CHRIsten?
How will you document sources?
How will you document the research of others?
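One way to see which fields a relative actually uses is to tally the tags in a GEDCOM export before agreeing on anything. This is a minimal sketch, assuming the usual GEDCOM line layout (a level number, an optional @id@, a tag, then an optional value). The function name `tally_tags` and the sample records are hypothetical.

```python
from collections import Counter

def tally_tags(gedcom_text):
    """Count how often each tag appears in GEDCOM-style text."""
    counts = Counter()
    for line in gedcom_text.splitlines():
        parts = line.split()
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        # Record lines look like "0 @I1@ INDI": the tag follows the id.
        if parts[1].startswith("@") and len(parts) > 2:
            tag = parts[2]
        else:
            tag = parts[1]
        counts[tag] += 1
    return counts

# Hypothetical sample data, not taken from any real file.
sample = """0 @I1@ INDI
1 NAME Jonathan /Sharbrough/
1 BIRT
2 DATE ABT 1734
2 PLAC North Carolina
1 RESI
2 PLAC North Carolina
"""

counts = tally_tags(sample)
print("RESI:", counts["RESI"], "ADDR:", counts["ADDR"])
```

A file that shows many RESI entries and no ADDR entries (or the reverse) tells you which convention the other researcher adopted, which is the starting point for the agreement in this step.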
Step 4. Convert Your Information. Nobody Can Avoid This Step.
Agree with your relative what information you will convert and how
Normally, this means saying things like, "I’ll put in the counties after I get it from you"
Step 5. Exchange Only the Individuals You Want.
NEVER just import the whole family on top of the information you already have.
Computer routines for merging data are improving, but not complete or effective yet.
There Are No Effective Routines for Merging Data Sets at Present.
The problems of …
  Identity
  merging methods and
  data formats
… are too new for generalized solutions to be available in the marketplace
Good theoretical solutions don’t even exist
Merging Data Sets
Customers who just assume that someone will know what they want, and will have it ready the moment they recognize the need, had parents who spoilt them rotten.
WHY?
Family history record-keeping is increasingly becoming a digital process.
Linking one’s information to the information already gathered by other family members and researchers is becoming more and more common.
We Have to Put Our Information Together Somehow
A Few Basics
Computer programs store the data that we enter in FILES
Each genealogical program stores the information in its own way, called a PROPRIETARY FORMAT
Most programs can also read and write in GEDCOM format
A word about exchange …
[Diagram: A and B, linked by an Export Routine, a Possible Intermediate Format, and an Import Routine]
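The exchange described here can be sketched as two small routines that meet at an intermediate format. Everything in this sketch is hypothetical: the field names used for program A and program B, and the intermediate layout, are invented for illustration and do not describe any real product.

```python
def export_from_a(record_a):
    """Export routine: program A's layout -> intermediate format."""
    surname, given = [part.strip() for part in record_a["full_name"].split(",", 1)]
    return {"given": given, "surname": surname, "birth_date": record_a["born"]}

def import_into_b(intermediate):
    """Import routine: intermediate format -> program B's layout."""
    return {
        "name": f'{intermediate["given"]} /{intermediate["surname"]}/',
        "birth": {"date": intermediate["birth_date"]},
    }

# Program A (hypothetical) stores "Surname, Given" and a "born" field.
record_a = {"full_name": "Sharbrough, Jonathan", "born": "circa 1734"}
record_b = import_into_b(export_from_a(record_a))
print(record_b)
```

The appeal of an intermediate format is arithmetic: each program needs one export routine and one import routine, instead of a separate converter for every other program on the market.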
A Few Basics
Merging is copying
From a SOURCE
To a TARGET
Sometimes called the SURVIVING INFORMATION
MERGING DATABASES
merging the files into a single one
merging the duplicated individuals
merging the rest
  sources
  repositories
The database merging process is evolving
More input sources
More freedom to choose the features you like.
GenBridge
Freedom has a price
Enter a name
  Program won’t break it up
Enter a place
  Program won’t break it up
Legacy Trick
You can open two family files at the same time, and copy and paste a person and their descendants from one set into another, like grafting a tree branch from one tree to another.
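Copying a person and their descendants amounts to walking the tree downward from that person. This is a minimal sketch, assuming each family file is a plain dictionary that maps a person id to a record with a `children` list. The layout and the function names are hypothetical and say nothing about how Legacy works internally.

```python
def collect_branch(people, root_id):
    """Return the ids of root_id and all descendants, root first."""
    found = []
    seen = set()
    stack = [root_id]
    while stack:
        pid = stack.pop()
        if pid in seen:
            continue  # guard against a person reached twice
        seen.add(pid)
        found.append(pid)
        stack.extend(people[pid]["children"])
    return found

def graft(source, target, root_id):
    """Copy one branch from source into target under new ids; return the new ids."""
    branch = collect_branch(source, root_id)
    # New ids keep the copy from overwriting people already in the target.
    # Assumption: no existing target id starts with "graft-".
    new_ids = {old: f"graft-{old}" for old in branch}
    for old in branch:
        person = source[old]
        target[new_ids[old]] = {
            "name": person["name"],
            "children": [new_ids[c] for c in person["children"]],
        }
    return [new_ids[old] for old in branch]

source = {
    "1": {"name": "Parent", "children": ["2", "3"]},
    "2": {"name": "Child A", "children": ["4"]},
    "3": {"name": "Child B", "children": []},
    "4": {"name": "Grandchild", "children": []},
    "9": {"name": "Unrelated", "children": []},
}
target = {"1": {"name": "Someone already here", "children": []}}
added = graft(source, target, "2")
print(sorted(target))
```

Only the chosen branch moves; the rest of the source file, and everything already in the target, is left alone.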
Making automatic citations
Legacy – individual level
TMG and FTM – field level
The Current Merging Art
Merging Databases
Merging Individuals
Merging the Rest
Spotting Duplicates
Merging Individuals
If you want to merge duplicates, most programs will make you choose which “tags” to keep and throw the rest away.
MERGING INDIVIDUALS: The old way
Copy the info
Delete one of the people
Type the info into the new one
MERGING INDIVIDUALS: The middle way
View both persons
Select what you want
The program does the rest
MERGING INDIVIDUALS: The future way
Computer spots likely dups
Recommends them to you
You control the process
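The "middle way" can be sketched as a function that takes two records for the same person plus the user's choice for each conflicting field. The record layout, the field names, the sample values, and `merge_person` itself are hypothetical; real programs differ in what they let you select.

```python
def find_conflicts(a, b):
    """Fields where both records have a value and the values differ."""
    return sorted(f for f in a.keys() & b.keys() if a[f] and b[f] and a[f] != b[f])

def merge_person(a, b, choices):
    """Merge b into a. choices maps each conflicting field to 'a' or 'b'."""
    merged = {}
    for field in sorted(a.keys() | b.keys()):
        va, vb = a.get(field), b.get(field)
        if va and vb and va != vb:
            # The user controls the process: no silent winner.
            if field not in choices:
                raise ValueError(f"no choice made for {field}")
            merged[field] = va if choices[field] == "a" else vb
        else:
            merged[field] = va or vb  # take whichever side has a value
    return merged

a = {"name": "Jonathan Sharbrough", "birth_date": "circa 1734", "birth_place": ""}
b = {"name": "Jonathan Sharbrough", "birth_date": "1735", "birth_place": "North Carolina"}

print(find_conflicts(a, b))
print(merge_person(a, b, {"birth_date": "a"}))
```

Note that the value not chosen is simply dropped, which is the "throw the rest away" behaviour described earlier in the talk.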
Merge Sources for most popular software
Their own files
GEDCOM
In some cases, files from other programs
In some cases, CD and internet databases
Still, it ends up being like pouring two cans of paint together.
Merging the Rest
Most programs don’t even import and merge place tables, source tables, etc.
I don’t know of any program that recognizes the same source in two separate datasets.
Merging The Rest
source citations, master sources, repositories, and places
Most programs just combine the tables, creating duplicates
LG will combine a source, with exact spelling
UFT and FTM merge master sources
PAF and TMG merge master sources and repositories
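Combining sources by exact spelling leaves duplicates whenever two people typed the same title slightly differently. The sketch below contrasts exact matching with a simple normalized key. The normalization rule (lowercase, strip punctuation, collapse spaces) and the sample titles are assumptions for illustration, not the behaviour of any program named above.

```python
import string

def normalize(title):
    """Lowercase, drop punctuation, and collapse runs of whitespace."""
    cleaned = title.lower().translate(str.maketrans("", "", string.punctuation))
    return " ".join(cleaned.split())

def merge_sources(table_a, table_b, key=lambda title: title):
    """Combine two lists of source titles, keeping one entry per key."""
    merged = {}
    for title in table_a + table_b:
        merged.setdefault(key(title), title)  # first spelling seen survives
    return list(merged.values())

# Hypothetical source tables from two researchers.
table_a = ["1850 U.S. Census, Wake County", "Family Bible"]
table_b = ["1850 US Census,  Wake County", "Family Bible"]

exact = merge_sources(table_a, table_b)
loose = merge_sources(table_a, table_b, key=normalize)
print(len(exact), len(loose))
```

A looser key removes more duplicates but also risks joining two sources that really are different, so a human review step still seems prudent.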
Limits to Storage
Some programs have really limited storage, and only store conclusions
If you have two birth dates, they put your favorite one in and throw the other away, or store it in a note.
Some programs have a lot of storage, and let you make your own “tags” such as executrix.
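The difference between storing only conclusions and storing conflicting data can be sketched with two tiny record types. Both classes, their field names, and the sample values are hypothetical and do not model any particular program.

```python
from dataclasses import dataclass, field

@dataclass
class ConclusionOnlyPerson:
    """One slot per fact: a second, different value is pushed into a note."""
    name: str
    birth_date: str = ""
    notes: list = field(default_factory=list)

    def add_birth_date(self, value):
        if self.birth_date and self.birth_date != value:
            self.notes.append(f"Alternate birth date: {value}")
        else:
            self.birth_date = value

@dataclass
class TaggedPerson:
    """Any number of tagged values, including user-defined tags."""
    name: str
    tags: list = field(default_factory=list)  # (tag, value, source) tuples

    def add(self, tag, value, source=""):
        self.tags.append((tag, value, source))

    def values(self, tag):
        return [value for t, value, _ in self.tags if t == tag]

limited = ConclusionOnlyPerson("Example Person")
limited.add_birth_date("circa 1734")
limited.add_birth_date("1735")

roomy = TaggedPerson("Example Person")
roomy.add("birth", "circa 1734", "family record")
roomy.add("birth", "1735", "letter from a cousin")
roomy.add("executrix", "named in a will")  # a user-defined tag

print(limited.birth_date, limited.notes)
print(roomy.values("birth"))
```

In the first design the second date survives only as free text; in the second, both dates remain searchable and each keeps its own source.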
SPOTTING DUPLICATES
Some programs have “merging routines” based on:
Soundex
Spelling of name
Birth date
TMG and Legacy use a large variety of match choices
Spotting duplicates
Soundex for names (AQ)
Exact spelling or soundex (PAF 3.0)
Exact spelling and exact birth date (FTM)
Many name compares (TMG and UFT)
Soundex surname and user choice of # of letters in first name (LG)
Warn if duplicate name entered (most)
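For readers who have not seen it, Soundex reduces a surname to a letter and three digits so that similar-sounding spellings share a code. The sketch below follows the standard American Soundex rules; the `names_match` helper is a hypothetical illustration of the "Soundex surname plus a chosen number of first-name letters" idea and is not taken from any program listed above.

```python
CODES = {}
for letters, digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                       ("L", "4"), ("MN", "5"), ("R", "6")):
    for letter in letters:
        CODES[letter] = digit

def soundex(name):
    """Return the four-character American Soundex code for a name."""
    letters = [c for c in name.upper() if "A" <= c <= "Z"]
    if not letters:
        return ""
    result = letters[0]
    previous = CODES.get(letters[0], "")
    for c in letters[1:]:
        if c in "HW":
            continue  # H and W do not separate letters with the same code
        digit = CODES.get(c, "")
        if digit and digit != previous:
            result += digit
        previous = digit  # a vowel resets this, so a repeat is coded again
    return (result + "000")[:4]

def names_match(a, b, given_letters=1):
    """a and b are (given name, surname) pairs."""
    same_surname = soundex(a[1]) == soundex(b[1])
    same_given = a[0][:given_letters].lower() == b[0][:given_letters].lower()
    return same_surname and same_given

print(soundex("Robert"), soundex("Rupert"))
print(names_match(("Jonathan", "Sharbrough"), ("John", "Sharborough")))
```

Setting `given_letters` to 1 compares initials only, which casts a wide net; raising it narrows the candidate list at the cost of missing nicknames and abbreviations.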
Merging tips
Match on parent soundex reduces false positives (Gaylon Findlay)
If your program won’t let you choose initials, but has a number-of-letters, try that with 1.
Beware of people about whom you know very little.
Beware of blank dates.
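The warnings about sparse records and blank dates can be made concrete. A routine that treats "no date" as "no conflict" will pair any two people who share a name. The sketch below counts a blank as missing evidence and asks for a minimum number of agreeing fields. The field names, the threshold, and the sample records are hypothetical, and parents are compared by exact text here for brevity rather than by Soundex.

```python
def compare(a, b, fields=("surname", "given", "birth_date", "father", "mother")):
    """Return (agreeing, conflicting, unknown) field counts for two records."""
    agree = conflict = unknown = 0
    for f in fields:
        va = a.get(f, "").strip().lower()
        vb = b.get(f, "").strip().lower()
        if not va or not vb:
            unknown += 1  # a blank is missing evidence, not a match
        elif va == vb:
            agree += 1
        else:
            conflict += 1
    return agree, conflict, unknown

def likely_duplicate(a, b, min_agree=3):
    """Recommend a merge only with no conflicts and enough agreement."""
    agree, conflict, _ = compare(a, b)
    return conflict == 0 and agree >= min_agree

sparse = {"surname": "Smith", "given": "John"}
known = {"surname": "Smith", "given": "John",
         "birth_date": "1 JAN 1850", "father": "William Smith"}
fuller = {"surname": "Smith", "given": "John", "birth_date": "1 jan 1850",
          "father": "William Smith", "mother": ""}

print(compare(sparse, known), likely_duplicate(sparse, known))
print(compare(known, fuller), likely_duplicate(known, fuller))
```

The sparse record has nothing that contradicts the other one, yet it is not recommended for merging, because two agreeing fields out of five is too little to go on.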
Signs that you can merge better today than you could before
More formats allowed
Easier individual merging
Identifying routines are becoming more sophisticated
More storage of conflicting data allowed
More variety in the software marketplace
Signs that we aren’t getting there yet
No formal studies on known datasets to quantify false positives and false negatives
No implementation of information sciences in commercial products
No implementation of AI in commercial products
No formal discussion of algorithms
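A formal study of the kind described as missing would run a duplicate-spotting routine over a dataset whose true duplicates are already known, then count the errors. This is a minimal sketch of that bookkeeping; the person ids and the pairs are hypothetical.

```python
def score(predicted_pairs, true_pairs):
    """Compare predicted duplicate pairs against the known duplicates."""
    # Order within a pair should not matter, so store pairs as frozensets.
    predicted = {frozenset(p) for p in predicted_pairs}
    truth = {frozenset(p) for p in true_pairs}
    true_pos = len(predicted & truth)
    false_pos = len(predicted - truth)   # recommended, but not duplicates
    false_neg = len(truth - predicted)   # real duplicates that were missed
    precision = true_pos / len(predicted) if predicted else 0.0
    recall = true_pos / len(truth) if truth else 0.0
    return {"false_positives": false_pos, "false_negatives": false_neg,
            "precision": precision, "recall": recall}

true_pairs = [("I1", "I7"), ("I2", "I9"), ("I3", "I8")]
predicted_pairs = [("I7", "I1"), ("I2", "I9"), ("I4", "I5")]
result = score(predicted_pairs, true_pairs)
print(result)
```

With numbers like these in hand, two matching routines could be compared on the same data instead of by anecdote.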
MERGING SUMMARY
Users can merge from a wider variety of data formats than in the past.
Users can merge individuals more easily.
MERGING SUMMARY
Routines to help identify candidates for merging are becoming quite sophisticated.
More programs store conflicting data today.
It’s also encouraging that they are not all doing the same thing.
The resultant diversity and innovation offer us more chances to connect Where-We’ve-Been to Where-We’re-Going than we’ve ever had before.
The GENTECH Genealogical Data Model
Purpose:
To define and communicate the meanings of family history data.
Genealogical Data Model
Request for Comment
Project by genealogists and developers to describe genealogy processes.
Describes the relationships between the various kinds of family history information.
Overview of what genealogists do
Not a genealogy program.
Not a database design
Not a document saying what genealogists SHOULD do.
Every genealogist says that they do research differently.
The GDM describes the process that they do differently.
Stop Starting with Conclusions
Don’t start with conclusions, start with evidence.
Some features of Evidence in the GDM
REPOSITORY
SOURCE
REPRESENTATION TYPE
REPRESENTATION
CITATION
CONCLUSIONS
ASSERTIONS about …
  PERSONA
  EVENTS
  CHARACTERISTICS
  GROUPS
  ASSERTIONS
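To make the evidence-then-conclusions idea concrete, here is a loose sketch that borrows a few of the entity names listed on these slides. It is not the GDM itself: the fields, the relation wording, and the sample values are all hypothetical, and Representation, Characteristic, and Group are left out to keep it short.

```python
from dataclasses import dataclass
from typing import Optional, Union

@dataclass
class Repository:
    name: str

@dataclass
class Source:
    title: str
    repository: Repository

@dataclass
class Citation:
    source: Source
    detail: str  # for example, which page or entry

@dataclass
class Persona:
    name: str

@dataclass
class Event:
    kind: str
    date: str = ""

@dataclass
class Assertion:
    """A statement linking two subjects, optionally backed by a citation."""
    subject: Union[Persona, Event, "Assertion"]
    relation: str
    obj: Union[Persona, Event, "Assertion"]
    citation: Optional[Citation] = None

# Evidence first ...
repo = Repository("County courthouse")
src = Source("Will book", repo)
cite = Citation(src, "page 12")

# ... then a conclusion that points back to it.
persona = Persona("Jonathan Sharbrough")
birth = Event("birth", "circa 1734")
claim = Assertion(persona, "took part in", birth, cite)

print(claim.citation.source.repository.name)
```

Because every assertion carries its citation, a reader can walk from any conclusion back to the source and the repository that holds it.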
XML is eXtensible Markup Language
<TITLE>The Title of My Book</TITLE>
<NAME>Jonathan Sharbrough</NAME>
<BIRTHDATE>circa 1734</BIRTHDATE>
<BIRTH PLACE="North Carolina" DATE="circa 1734">
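The point of tagging is that a program can pull fields out by name rather than by position. This is a small sketch using Python's standard `xml.etree.ElementTree` module. The `<PERSON>` wrapper element is an assumption added so the fragment forms a complete document, and the BIRTH element is written as self-closing for the same reason.

```python
import xml.etree.ElementTree as ET

# Hypothetical wrapper element <PERSON>; the slide shows only the inner tags.
doc = """<PERSON>
  <NAME>Jonathan Sharbrough</NAME>
  <BIRTH PLACE="North Carolina" DATE="circa 1734"/>
</PERSON>"""

person = ET.fromstring(doc)
name = person.findtext("NAME")   # text stored between tags
birth = person.find("BIRTH")     # values stored as attributes
print(name, "|", birth.get("PLACE"), "|", birth.get("DATE"))
```

The same value can live either between tags or in an attribute, which is one of the choices any shared genealogy format would have to settle.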
Future digital research
programs publish pedigrees and registers in some XML format
repositories publish records in the same format
local links, remote sources
external authorities
A new culture
most quoted sites - “authorities”
many link sites - “hubs”
links define culture, tribe, families
The digital future of family history is a virtual library where it is ...
Easy to find the conclusions
Easy to identify the evidence
Easy to identify the thought process that links them.
Missing ingredients
agreement on LexML standard
wide acceptance of LexML standard
wide implementation of LexML
Send Me A Disk, Ok?
Do’s and Don’ts
Merging Technique
GENTECH GDM
Beau Sharbrough
PO Box 3019
Grapevine TX
[email protected]