![Page 1: Requirements of a Taxonomy Database Tcl-DB a Prototype](https://reader035.vdocuments.mx/reader035/viewer/2022062517/56813adf550346895da32b5c/html5/thumbnails/1.jpg)
Requirements of a Taxonomy Database
Tcl-DB: a Prototype
![Page 2: Requirements of a Taxonomy Database Tcl-DB a Prototype](https://reader035.vdocuments.mx/reader035/viewer/2022062517/56813adf550346895da32b5c/html5/thumbnails/2.jpg)
Outline
1. Requirements
• Hierarchy
• Alternative Search Terms: Synonyms and Vernaculars
• Alternative Spellings
• Alternative Classifications
2. Tcl-DB Prototype System
• Tcl-DB Structure
• 2NF
3. Extensible: Adding a new data source, e.g. NCBI
4. Tcl-DB: UID Tracking
5. Tcl-DB: Stats
6. Utility and Further Work
![Page 3: Requirements of a Taxonomy Database Tcl-DB a Prototype](https://reader035.vdocuments.mx/reader035/viewer/2022062517/56813adf550346895da32b5c/html5/thumbnails/3.jpg)
1. Hierarchy
![Page 4: Requirements of a Taxonomy Database Tcl-DB a Prototype](https://reader035.vdocuments.mx/reader035/viewer/2022062517/56813adf550346895da32b5c/html5/thumbnails/4.jpg)
2. Alternative Search Terms: Synonyms and Vernaculars
![Page 5: Requirements of a Taxonomy Database Tcl-DB a Prototype](https://reader035.vdocuments.mx/reader035/viewer/2022062517/56813adf550346895da32b5c/html5/thumbnails/5.jpg)
3. Alternative Spellings: Caenorabditis elegans, C elegans and Caenorhabditis elegans
![Page 6: Requirements of a Taxonomy Database Tcl-DB a Prototype](https://reader035.vdocuments.mx/reader035/viewer/2022062517/56813adf550346895da32b5c/html5/thumbnails/6.jpg)
4. Alternative Classifications:
![Page 7: Requirements of a Taxonomy Database Tcl-DB a Prototype](https://reader035.vdocuments.mx/reader035/viewer/2022062517/56813adf550346895da32b5c/html5/thumbnails/7.jpg)
Tcl-DB Prototype System. Proposed Architecture
![Page 8: Requirements of a Taxonomy Database Tcl-DB a Prototype](https://reader035.vdocuments.mx/reader035/viewer/2022062517/56813adf550346895da32b5c/html5/thumbnails/8.jpg)
Tcl-DB: Logical Structure
![Page 9: Requirements of a Taxonomy Database Tcl-DB a Prototype](https://reader035.vdocuments.mx/reader035/viewer/2022062517/56813adf550346895da32b5c/html5/thumbnails/9.jpg)
Tcl-DB Physical Database Structure
![Page 10: Requirements of a Taxonomy Database Tcl-DB a Prototype](https://reader035.vdocuments.mx/reader035/viewer/2022062517/56813adf550346895da32b5c/html5/thumbnails/10.jpg)
Assertion: Resolving the M:M with an association entity
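A minimal sketch of the association-entity idea, written in Python with the standard `sqlite3` module rather than the Oracle schema of the prototype. It assumes the many-to-many relationship is between names and data sources, which is suggested by the NAME_ID and SOURCE_ID columns of ASSERTION but not stated on the slide. The `source` table, its `source_name` column and the helper names are hypothetical.

```python
import sqlite3

def build_assertion_demo():
    """Create a tiny name/source/assertion schema in memory (illustrative only)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE name   (name_id INTEGER PRIMARY KEY, name_text TEXT UNIQUE);
        CREATE TABLE source (source_id INTEGER PRIMARY KEY, source_name TEXT UNIQUE);
        -- Association entity: one row per (name, source) pairing,
        -- so the M:M becomes two 1:M relationships.
        CREATE TABLE assertion (
            assertion_id INTEGER PRIMARY KEY,
            name_id   INTEGER NOT NULL REFERENCES name(name_id),
            source_id INTEGER NOT NULL REFERENCES source(source_id),
            UNIQUE (name_id, source_id)
        );
        INSERT INTO name   VALUES (1, 'Caenorhabditis elegans'), (2, 'Homo sapiens');
        INSERT INTO source VALUES (1, 'ITIS'), (2, 'NCBI');
        INSERT INTO assertion (name_id, source_id) VALUES (1, 1), (1, 2), (2, 2);
    """)
    return conn

def sources_for(conn, name_text):
    """List the sources that assert a given name, via the association table."""
    rows = conn.execute("""
        SELECT s.source_name
        FROM name n
        JOIN assertion a ON a.name_id = n.name_id
        JOIN source s    ON s.source_id = a.source_id
        WHERE n.name_text = ?
        ORDER BY s.source_name
    """, (name_text,)).fetchall()
    return [r[0] for r in rows]
```

One name can then be asserted by several sources, and one source can assert many names, without repeating name or source data.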
![Page 11: Requirements of a Taxonomy Database Tcl-DB a Prototype](https://reader035.vdocuments.mx/reader035/viewer/2022062517/56813adf550346895da32b5c/html5/thumbnails/11.jpg)
Node: Hierarchical Queries
Nested Set, Path and Connect by
>select count(name_id) from node
start with name_id = '100891'
connect by prior name_id = parent_name_id;
>select count(name_id) from node
where path like '/%';
>select count(name_id) from node
where left_id between 1 and 9290;
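The three query styles above can be compared on a toy tree. This is a sketch under assumptions: it uses Python's `sqlite3`, where a recursive common table expression stands in for Oracle's `CONNECT BY`, and the four-row tree, the `right_id` column and the function names are invented for illustration.

```python
import sqlite3

def build_node_demo():
    """Tiny tree: 1 is the root, 2 and 4 are its children, 3 is a child of 2."""
    conn = sqlite3.connect(":memory:")
    conn.execute("""
        CREATE TABLE node (
            name_id        INTEGER PRIMARY KEY,
            parent_name_id INTEGER,
            path           TEXT,     -- materialised path from the root
            left_id        INTEGER,  -- nested-set bounds
            right_id       INTEGER
        )""")
    conn.executemany("INSERT INTO node VALUES (?, ?, ?, ?, ?)", [
        (1, None, "/1",     1, 8),
        (2, 1,    "/1/2",   2, 5),
        (3, 2,    "/1/2/3", 3, 4),
        (4, 1,    "/1/4",   6, 7),
    ])
    return conn

def count_by_parent_link(conn, root_id):
    """Walk parent links recursively (the role CONNECT BY plays in Oracle)."""
    sql = """
        WITH RECURSIVE subtree(name_id) AS (
            SELECT name_id FROM node WHERE name_id = ?
            UNION ALL
            SELECT n.name_id FROM node n
            JOIN subtree s ON n.parent_name_id = s.name_id
        )
        SELECT count(name_id) FROM subtree"""
    return conn.execute(sql, (root_id,)).fetchone()[0]

def count_by_path(conn, root_id):
    """Count the node itself plus every node whose path extends its path."""
    (root_path,) = conn.execute(
        "SELECT path FROM node WHERE name_id = ?", (root_id,)).fetchone()
    sql = "SELECT count(name_id) FROM node WHERE path = ? OR path LIKE ? || '/%'"
    return conn.execute(sql, (root_path, root_path)).fetchone()[0]

def count_by_nested_set(conn, root_id):
    """Count nodes whose left_id falls inside the root's [left_id, right_id]."""
    left, right = conn.execute(
        "SELECT left_id, right_id FROM node WHERE name_id = ?", (root_id,)).fetchone()
    sql = "SELECT count(name_id) FROM node WHERE left_id BETWEEN ? AND ?"
    return conn.execute(sql, (left, right)).fetchone()[0]
```

All three functions should return the same subtree size for any node. The path and nested-set versions avoid recursion at query time, at the cost of keeping `path` or `left_id`/`right_id` up to date when the tree changes.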
![Page 12: Requirements of a Taxonomy Database Tcl-DB a Prototype](https://reader035.vdocuments.mx/reader035/viewer/2022062517/56813adf550346895da32b5c/html5/thumbnails/12.jpg)
synonym_name and vernacular: subtypes, multi-valued attributes or weak entities
![Page 13: Requirements of a Taxonomy Database Tcl-DB a Prototype](https://reader035.vdocuments.mx/reader035/viewer/2022062517/56813adf550346895da32b5c/html5/thumbnails/13.jpg)
Tcl-DB: 2NF
Kingdom
• KINGDOM_ID
• NAME_ID, NAME_TEXT, SOURCE_ID
Rank
• RANK_ID
• RANK_NAME, SOURCE_ID
ASSERTION
• PK: ASSERTION_ID
• I2,I1: NAME_ID
• I1: SOURCE_ID
• I1: DBSOURCE_ID
• AID, NID, RANK_ID, KINGDOM_ID
ASSERTION
• PK: ASSERTION_ID
• I1,I2: NAME_ID
• I1: SOURCE_ID
• I1: DBSOURCE_ID
• AID, NID, RANK, KINGDOM
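The two ASSERTION layouts on this slide differ in whether rank and kingdom are stored as values (RANK, KINGDOM) or as keys into lookup tables (RANK_ID, KINGDOM_ID). The sketch below shows one way such a move could be done, assuming the text-valued layout is the starting point. It is written in Python with `sqlite3`, and the table names `assertion_flat`, `tax_rank` and `tax_kingdom` and the columns `rank_text`, `kingdom_text` and `kingdom_name` are hypothetical simplifications, not the prototype's schema.

```python
import sqlite3

def build_flat_demo():
    """Assertion rows that carry rank and kingdom as repeated text values."""
    conn = sqlite3.connect(":memory:")
    conn.execute("""
        CREATE TABLE assertion_flat (
            assertion_id INTEGER PRIMARY KEY,
            name_id      INTEGER,
            source_id    INTEGER,
            rank_text    TEXT,
            kingdom_text TEXT
        )""")
    conn.executemany("INSERT INTO assertion_flat VALUES (?, ?, ?, ?, ?)", [
        (1, 10, 1, "species", "Animalia"),
        (2, 11, 1, "genus",   "Animalia"),
        (3, 12, 2, "species", "Plantae"),
    ])
    conn.commit()
    return conn

def normalise_rank_and_kingdom(conn):
    """Move the repeated text values into lookup tables and keep only their keys."""
    conn.executescript("""
        CREATE TABLE tax_rank    (rank_id INTEGER PRIMARY KEY, rank_name TEXT UNIQUE);
        CREATE TABLE tax_kingdom (kingdom_id INTEGER PRIMARY KEY, kingdom_name TEXT UNIQUE);
        INSERT INTO tax_rank (rank_name)
            SELECT DISTINCT rank_text FROM assertion_flat ORDER BY rank_text;
        INSERT INTO tax_kingdom (kingdom_name)
            SELECT DISTINCT kingdom_text FROM assertion_flat ORDER BY kingdom_text;
        CREATE TABLE assertion AS
            SELECT f.assertion_id, f.name_id, f.source_id, r.rank_id, k.kingdom_id
            FROM assertion_flat f
            JOIN tax_rank r    ON r.rank_name    = f.rank_text
            JOIN tax_kingdom k ON k.kingdom_name = f.kingdom_text;
    """)
```

After the call, each rank and kingdom string is stored once, and the original text can still be recovered by joining `assertion` back to the lookup tables.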
![Page 14: Requirements of a Taxonomy Database Tcl-DB a Prototype](https://reader035.vdocuments.mx/reader035/viewer/2022062517/56813adf550346895da32b5c/html5/thumbnails/14.jpg)
Adding a new data source, e.g. NCBI
Tcl-DB: Procedures, Packages and Functions:
![Page 15: Requirements of a Taxonomy Database Tcl-DB a Prototype](https://reader035.vdocuments.mx/reader035/viewer/2022062517/56813adf550346895da32b5c/html5/thumbnails/15.jpg)
Step 1: Build views: which names are already in the database?
![Page 16: Requirements of a Taxonomy Database Tcl-DB a Prototype](https://reader035.vdocuments.mx/reader035/viewer/2022062517/56813adf550346895da32b5c/html5/thumbnails/16.jpg)
Step 2: Move names from view to Tcl schema
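Steps 1 and 2 can be sketched as a view that isolates the incoming names not yet present, followed by an insert from that view. This is an illustration in Python with `sqlite3`, not the prototype's Oracle procedures. The `staging_names` table, the `new_names_v` view and the function names are hypothetical.

```python
import sqlite3

def build_load_demo():
    """One name already loaded, plus a staging table of incoming source names."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE name (
            name_id   INTEGER PRIMARY KEY,
            name_text TEXT UNIQUE,
            source_id INTEGER
        );
        CREATE TABLE staging_names (name_text TEXT);
        INSERT INTO name (name_text, source_id) VALUES ('Homo sapiens', 1);
        INSERT INTO staging_names VALUES
            ('Homo sapiens'), ('Mus musculus'), ('Mus musculus');
    """)
    return conn

def load_new_names(conn, source_id):
    """Step 1: view of names not yet in the database. Step 2: insert them."""
    conn.execute("""
        CREATE VIEW IF NOT EXISTS new_names_v AS
            SELECT DISTINCT s.name_text
            FROM staging_names s
            LEFT JOIN name n ON n.name_text = s.name_text
            WHERE n.name_id IS NULL""")
    # Count before inserting, because the view shrinks once the names exist.
    new_count = conn.execute("SELECT count(*) FROM new_names_v").fetchone()[0]
    conn.execute(
        "INSERT INTO name (name_text, source_id) "
        "SELECT name_text, ? FROM new_names_v",
        (source_id,))
    conn.commit()
    return new_count
```

Because the view is defined against the live `name` table, running the load a second time inserts nothing, which keeps the step safe to repeat.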
![Page 17: Requirements of a Taxonomy Database Tcl-DB a Prototype](https://reader035.vdocuments.mx/reader035/viewer/2022062517/56813adf550346895da32b5c/html5/thumbnails/17.jpg)
Step 3: Fill the nodes table in tcl schema
![Page 18: Requirements of a Taxonomy Database Tcl-DB a Prototype](https://reader035.vdocuments.mx/reader035/viewer/2022062517/56813adf550346895da32b5c/html5/thumbnails/18.jpg)
Step 4: Fill synonym_name table in tcl schema
Step 5: Fill vernacular table in tcl schema
![Page 19: Requirements of a Taxonomy Database Tcl-DB a Prototype](https://reader035.vdocuments.mx/reader035/viewer/2022062517/56813adf550346895da32b5c/html5/thumbnails/19.jpg)
Tcl-DB: UID Tracking
After name data load:
1. Run two joins on name and nids_mv
• Nids – name_id when the name_text exists
• Null – name_id when the name_text does not exist
2. Update name and give all new names a NID
3. Update name and give all names their original NID
4. Refresh the NID_view
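The four steps above can be sketched as follows. This is a hedged illustration in Python with `sqlite3`, not the prototype's Oracle code: SQLite has no materialised views, so `nids_mv` is modelled as an ordinary table and "refresh" as an insert of the new mappings. The column layout, the "highest existing NID plus one" numbering rule and the function names are all assumptions.

```python
import sqlite3

def build_uid_demo():
    """nids_mv remembers earlier NIDs; name has just been reloaded without NIDs."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE name (name_id INTEGER PRIMARY KEY, name_text TEXT UNIQUE, nid INTEGER);
        CREATE TABLE nids_mv (name_text TEXT PRIMARY KEY, nid INTEGER);
        INSERT INTO nids_mv VALUES ('Homo sapiens', 7), ('Pan troglodytes', 3);
        INSERT INTO name (name_id, name_text) VALUES (1, 'Mus musculus'), (2, 'Homo sapiens');
    """)
    return conn

def track_nids(conn):
    """Give reloaded names their original NID and new names a fresh one."""
    # Step 1: two joins of name against nids_mv.
    known = conn.execute("""
        SELECT n.name_id, m.nid
        FROM name n JOIN nids_mv m ON m.name_text = n.name_text""").fetchall()
    unknown = conn.execute("""
        SELECT n.name_id
        FROM name n LEFT JOIN nids_mv m ON m.name_text = n.name_text
        WHERE m.nid IS NULL
        ORDER BY n.name_id""").fetchall()
    # Step 2: new names get NIDs above the highest one seen so far (assumed rule).
    next_nid = conn.execute(
        "SELECT COALESCE(MAX(nid), 0) + 1 FROM nids_mv").fetchone()[0]
    for offset, (name_id,) in enumerate(unknown):
        conn.execute("UPDATE name SET nid = ? WHERE name_id = ?",
                     (next_nid + offset, name_id))
    # Step 3: names seen before get their original NID back.
    conn.executemany("UPDATE name SET nid = ? WHERE name_id = ?",
                     [(nid, name_id) for name_id, nid in known])
    # Step 4: "refresh" the mapping so the next load sees the new NIDs.
    conn.execute("""
        INSERT INTO nids_mv (name_text, nid)
        SELECT name_text, nid FROM name
        WHERE name_text NOT IN (SELECT name_text FROM nids_mv)""")
    conn.commit()
    return len(known), len(unknown)
```

Names that drop out of a load (here 'Pan troglodytes') keep their entry in the mapping, so a NID is never reused for a different name.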
![Page 20: Requirements of a Taxonomy Database Tcl-DB a Prototype](https://reader035.vdocuments.mx/reader035/viewer/2022062517/56813adf550346895da32b5c/html5/thumbnails/20.jpg)
Tcl-DB: Utility and Further Work
Computing Interesting Stats:
• How much overlap between ITIS and NCBI?
• How many names unique to NCBI?
• How many of these are binomials vs. ‘environmental sample 256’?
• How many of these names can be matched allowing for 1–3 letter mismatches?
• NCBI taxonomy – data quality, integrity and usability?
Transitively closing the Synonyms Table and Vernacular Table
Building an interface.
• Spell checkers
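For the question about matching names with 1 to 3 letter mismatches, one common choice is Levenshtein edit distance. The slides do not say which measure was intended, so the following Python sketch is only an assumption about how "mismatch" could be defined. The function names are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance: minimum single-character inserts, deletes or substitutions."""
    prev = list(range(len(b) + 1))          # distances from "" to each prefix of b
    for i, ca in enumerate(a, 1):
        curr = [i]                          # distance from a[:i] to ""
        for j, cb in enumerate(b, 1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,         # delete ca
                            curr[j - 1] + 1,     # insert cb
                            prev[j - 1] + cost)) # substitute or keep
        prev = curr
    return prev[-1]

def fuzzy_matches(query, known_names, max_mismatches=3):
    """Known names that differ from the query by 1..max_mismatches edits (case-insensitive)."""
    return [name for name in known_names
            if 1 <= edit_distance(query.lower(), name.lower()) <= max_mismatches]
```

With this definition the misspelling "Caenorabditis elegans" from the Alternative Spellings slide is one edit away from "Caenorhabditis elegans". Comparing every query against every stored name is quadratic in practice, so a real system would likely need an index or a pre-filter.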
![Page 21: Requirements of a Taxonomy Database Tcl-DB a Prototype](https://reader035.vdocuments.mx/reader035/viewer/2022062517/56813adf550346895da32b5c/html5/thumbnails/21.jpg)
Lots of Questions?
How do we use this to build taxonomically aware databases?
How about updates to the data?
Database links, Web services, Simple DB Cross References?
Use GenBank Model?
Open to Suggestions/Ideas!
Do we need to think about:
PhyloCode?
Type Specimens?