A Dissertation Defense Presented to the Department of Computer Science
David A. Gaitros, Dissertation Defense, FSU, December 2006
Overview: Acknowledgements; Problem Definition; Research Statement; Goals and Challenges; Semantic Associations; Ontology in Semantic Associations; MorphBank Architecture; MorphBank Object Relations; Annotation and Collections; Semantic Annotations Example; MorphBank Semantic Association Results; Future Work; Questions.
I am going to cover the topic of semantic associations using annotations and the application of this idea to MorphBank. MorphBank is a very large and complex project with many facets. Much of the background is covered in the dissertation, and additional information can be obtained by going directly to the web site at http://morphbank.net.
THE REPRESENTATION OF ASSOCIATION SEMANTICS WITH ANNOTATIONS IN A BIODIVERSITY INFORMATICS SYSTEM
A dissertation defense presented to the Department of Computer Science in partial fulfillment of the requirements for the degree of Doctor of Philosophy
David A. Gaitros
December 8th, 2006
Dissertation Committee: Dr. Greg Riccardi, Dr. Fredrik Ronquist, Dr. Robert van Engelen, Dr. Ashok Srinivasan

The Representation of Association Semantics with Annotations in a Biodiversity Informatics System. Supported by the National Science Foundation (NSF) Biological Database Informatics (BDI) program, Grant DBI ). $2.25 million, 3-year project. Welcome, committee and distinguished guests.
Overview
Acknowledgements
Problem Definition
Research Statement
Goals and Challenges
Semantic Associations
Ontology in Semantic Associations
MorphBank Architecture
MorphBank Object Relations
Annotation and Collections
Semantic Annotations Example
MorphBank Semantic Association Results
Future Work
Questions

I am going to cover the topic of semantic associations using annotations and the application of this idea to MorphBank. MorphBank is a very large and complex project with many facets. Much of the background is covered in the dissertation, and additional information can be obtained by going directly to the web site at http://morphbank.net.
Acknowledgement
MorphBank Primary Investigators: Dr. Fredrik Ronquist, Dr. Greg Riccardi, Dr. Austin Mast, Dr. Robert van Engelen, Dr. Corinne Jörgensen, Dr. Peter Jörgensen, Dr. Greg Erickson
MorphBank Development Team: Mr. Wilfredo Blanco, Mrs. Neelima Jammigumpula, Mr. Steve Winner, Mrs. Karolina Maneva-Jakimoska, Mrs. Cynthia Gaitros, Mrs. Debbie Paul, Ms. Katja Seltmann, Mr. Chris Cprek

I am showing the acknowledgements first because of the magnitude of the project, the contribution made to the whole project by a large number of people, and my wish to express my appreciation to them.
Acknowledgement (continued)
Research Associates: Dr. Gordon Erlebacher, Dr. Andy Deans, Dr. Matthew Buffington, Mr. Shayne Steele
Student Research Associates: Mr. Gabriel Logan, Mr. Jason Simmons, Mr. Stanislov Ustymenko, Mr. Wei Zhang, Ms. Allison von Eberstein, Ms. Janet Capps

The list of individuals does not include the many other participants who have also contributed to MorphBank and indirectly to this research. Thanks also to the Spring 2004 Software Engineering class for their work with me on the original MorphBank requirements document.
Problem Statement
Scientists can produce large amounts of data but cannot always process or search it.
In biodiversity, specimens can be dissected, cataloged, photographed, analyzed, and stored in a variety of media.
Much of the detailed knowledge of these specimens is still kept in personal journals, scientific logs, hand-written notes, and human memory.
Such informal methods of storing and retrieving information present a problem when other biologists attempt to search for biodiversity subject matter.
How can we help solve this problem?

What would a dissertation be without a problem to solve? Scientists can and do produce terabytes of data. Some, such as meteorologists, can do this daily. Making sense of this data is becoming a very difficult problem. Biodiversity information is no different. However, much of the data is very informal and in non-digital media. Many scientific collections consist of specimens in boxes, locked in cabinets, that have never been cataloged in a central data repository.
Research Statement
This research adds value to image repositories by collecting and publishing semantically rich, user-specified associations among images and other objects.

Read statement. I wanted to increase the productivity of a biodiversity web site by creating a method that would allow biologists, and eventually other scientists, to search and discover new relationships among data. I am going to show that I not only conducted the research but was able to take the demonstration of its utility past the prototype stage. This was a very successful venture.
Research Goals
Gather available data standards for biodiversity and semantic associative systems.
Develop models.
Transform models into a relational database.
Develop data retrieval methods.
Research and develop methods to expose MorphBank data.
Develop a prototype semantic associative annotation tool.
Research automated object association.
Show that a semantically rich environment is useful to research scientists.

Look at the databases, including MorphBank, and the literature for any proposed standards on naming and structure. Come up with a schema and review the ideas with the community. Create a relational database. How can we get data? We want to make the data easily available to the world. Build the annotation tool. Research what is being done in the area of data mining. Are people going to use the system?

Here I want to make a comment concerning the initial perception of the research. When I started the PhD program here at FSU, a few of the professors told me I would find out two things:
1. My initial idea of the outcome of my research and the final version would be very different.
2. I would probably have to scale back the scope of what I wanted to accomplish.
#1 was true. #2, however, was not: as it turns out, I was able to accomplish much more.
Research Challenges
Finding consensus on data naming standards.
Finding a flexible and reliable taxonomic name server.
Developing a model for semantic associations.
Developing a prototype of a functional semantic association annotation tool.
The magnitude of the work that must be accomplished:
  Management of a development team
  Creation of a development environment
  Creation of a commercial-quality web site
  Populating the database
  Maintenance of a biodiversity system
Attracting sufficient users to determine the feasibility of such a system.

How mature are the data standards out there? Naming of specimens is a tedious task. Many scientists avoid this problem by just allowing users to type in the names, which is too prone to errors and difficult to search. How is the data associated? I did not know when we started. The annotation tool I need does not exist, and I really don't want to write one. Then there is the massive amount of work that must be done.

Semantic Associations
Represents a very complex set of relations among objects.
Allows users to gain insight or query for interesting relationships among large amounts of data.
Inside a semantically rich environment, ontologies and context are preserved.
The novel approach is integrating ad-hoc annotation data with semantic associations, with tools that allow for the discovery of the relationships.

Semantic associations are complex relationships built upon the ontology of the terms used in the data. Semantic associations are dynamic, not static, so they change over time. We want scientists to use the ontology that best describes their perception of their research. Let's say that you had two mathematicians describing proofs of the same problem, each using a different type of mathematics to describe the solution. This would be an example of using different ontologies. There are research projects that involve semantic associations, and there are research projects that involve annotations. The novel approach here is the marrying of the two to create an environment where ontologies are preserved and, using annotations, new semantic associations can be created, searched, and found.

Semantic Associations
Associations that have a direct relation are easy to find; others are not.
View Information: Head posterior, cleaned in alcohol
Locality Information: Europe
Specimen Data: Female, indeterminate, adult, Diplolepis rosae
Contributor: Johan Liljeblad and Fredrik Ronquist
General Comments: 12 records
Determinations: 15 records
Related Phylogenetic Characters: 1 record
External Data Sources: 15 sources

Some of the data is easy to store and retrieve: view, locality, specimen, contributor, etc. For all of these we can create static tables and queries to retrieve the data quite easily. However, ad-hoc comments, determination annotations, different physical characteristics, and even an unspecified number of external sources all prove difficult to store in a data repository. They can be stored, but linking them with other objects is often not attempted.

Semantic Associations
What we would like to be able to find (diagram centered on Image):
Other images by this contributor
Related specimens
Other images that use this view
Contributor
Data about the view
Specimen data
Other related objects
Related comments
Place where this specimen was collected
The nature of the relationships

Here is an example. These are some of the things we wanted to do with MorphBank. Note that any object that is one edge away is easy to find; objects two or more edges away become difficult. An image can find a contributor, a view, or a specimen. But let's say that I wanted to find the related specimen, where the specimen was located, and then other related images collected at that site, and perhaps the contributors. NOT ALL PATHS ARE DEFINED.

Semantic Associations
What we would like to be able to find:
Any phylogenetic characters/states
Other taxonomic descriptions
All related images
Associated publications
Specimen data
All annotations
Annotation contributors
All determination annotations
All related images
Other objects contributed
External data links

Let's take the problem another step and show the complexity of the situation. On the previous slide we noted the relationship between image and specimen. We can take any relation (here, specimen) and expand the desired paths in an almost infinite series.
Ontology in Semantic Associations
Ontology is a specification of a specialization (Charles Canton).
Ontologies represent a community consensus among participants, and there is pressure against change.
Issues:
  What if someone desires to use a different taxonomic structure to describe a specimen?
  What if an error is discovered in the current ontology?
  How do you deviate from the current ontology without distorting the data and relationships?

Before we go any further, let's talk about ontology. It is a funny word, and its definition causes some problems. It is usually used in philosophy, but in this case we use it to mean a consensus among participants on meanings and definitions. Change is hard, but for science to progress change is often necessary, so in MorphBank we must allow scientists to use their own ontologies.

Ontology in Semantic Associations
MorphBank has several software and internal features that address this problem.
Through the use of semantic associations with annotations, users can preserve the use of their own ontologies without inhibiting anyone else or corrupting data.
MorphBank allows for local modifications on external data references.

We have several features that allow this. Most notable is the use of collections and annotations, which give scientists a forum for discussion and agreement or disagreement.

MorphBank Architecture
[Architecture diagram labels: Working Set, Under Review, Released; MorphBank Version 2.5; Data Service; Browse, Search, Upload, Admin, Annotation; Read Only ITIS; MorphBank Security Service; Login, About, News, Help; Contributor, Unregistered User, Scientist, Lead, Group Coordinator, Administrator]

I want to give you a little background on MorphBank. There was considerable effort at the beginning of the research project to create a valid and correct architecture. In a software engineering sense, this project was done correctly. A tremendous amount of work was accomplished early in the discovery and analysis of the requirements, and this early work eased the production effort later on. There are things we would change, but that is always the case.

MorphBank Object Model
The early idea of creating a valid data model with a centralized catalog was absolutely paramount to the success of the project. Since all objects in MorphBank (image, specimen, view, locality, publication, user, group, annotation, collection) have a unique serial number and inherit from a base object, relationships among these objects can be built with maximum flexibility.

MorphBank Inheritance Relationships
Here we see an example. The idea of a collection is central to MorphBank. A collection is a group of related items that have an idea in common. A collection can hold any number of related objects, including the relation myCollection, which is a restricted subset of a Collection used internally within the database. A collection can have many objects, and likewise an object can be in many collections. Here is the idea: by putting related objects in a collection, I can show their relationship to other objects regardless of the distance.

MorphBank Object Relationships
Another viewpoint. Note that there are hard-coded relationships between specimen and locality, image and view, and user and group. They all, including annotation and myCollection, inherit the base object. So, by going through the baseObject, I can find any relationship of distance 1. Using the concept of a Collection, I can use annotations and myCollection to find any relationship of distance N.

MorphBank Annotation Architecture
Here is how the annotation tool works. As I stated earlier, the idea of an annotation has changed. I originally concentrated on the graphic nature of the tool and spent quite a bit of time on it. Although this is important, I was somewhat surprised at the importance that the annotation of data relationships took on later in the project.

MorphBank Object Relationship
So important were the relationship annotation and the ability to include external data references (different data models and ontologies) that we included in the research the ability to import and store XML documents inside MorphBank.
Example: Image Annotation Overview Using an XML Schema

Annotations, Collections, and Associations
The research program started with a concentration on annotations. However, the idea of a collection, and of building a relationship between the two, evolved over time.
Annotation: A note that describes, explains, and/or evaluates the contents of a book, article, video, image, etc. This information is always accompanied by a citation.
Collection: Several things grouped together or considered as a whole. [Webster's Dictionary]
Associations: Phrases that lend meaning to information, making it understandable and actionable, and provide new and possibly unexpected insights. [Boanerges Aleman-Meza]
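The object model described in the preceding slides (every object inherits a base object with a unique serial number, objects are grouped into collections, and collections link objects that have no direct relationship) can be illustrated with a small in-memory sketch. This is only an illustration under stated assumptions: MorphBank itself stores these relations in a relational database, and the names BaseObject, Image, Specimen, Collection, and distance are hypothetical, chosen for this example.

```python
from collections import deque
from dataclasses import dataclass, field
from itertools import count

_next_id = count(1)  # stand-in for the central catalog's serial numbers


@dataclass
class BaseObject:
    """Hypothetical base object: every object gets a unique serial number."""
    name: str
    id: int = field(default_factory=lambda: next(_next_id))


@dataclass
class Image(BaseObject):
    pass


@dataclass
class Specimen(BaseObject):
    pass


@dataclass
class Collection(BaseObject):
    """A collection is itself a base object, so it could be placed in another collection."""
    member_ids: set = field(default_factory=set)

    def add(self, obj: BaseObject) -> None:
        self.member_ids.add(obj.id)


def distance(start, goal, collections):
    """Breadth-first search over 'shares a collection' links.

    Returns the number of hops between two objects, or None if no path exists.
    """
    neighbours = {}
    for c in collections:
        for a in c.member_ids:
            neighbours.setdefault(a, set()).update(c.member_ids - {a})
    seen, queue = {start.id}, deque([(start.id, 0)])
    while queue:
        node, hops = queue.popleft()
        if node == goal.id:
            return hops
        for nxt in neighbours.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, hops + 1))
    return None


wing = Image("wing photo")
head = Image("head photo")
other = Image("unrelated photo")
wasp = Specimen("wasp specimen")

first = Collection("wing study")
first.add(wing)
first.add(wasp)
second = Collection("specimen views")
second.add(wasp)
second.add(head)
all_collections = [first, second]
```

Here the wing image and the head image share no collection, yet they are two hops apart through the specimen, which is the "distance N" idea from the slides. The unrelated image has no path to either.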
Collection
This is a partial screen shot of a collection screen. At this time, we limit a collection to images, but internally we are able to put any valid MorphBank object in a Collection, including another Collection. If we have time I can demonstrate this.

Annotation
[Screen shot callouts: Select Related; New Determination; Taxon name; Annotations]
This is an annotation screen that shows Determination Annotations, a specialized annotation that allows biologists to assign a taxonomic determination to a specimen and to agree, disagree, or agree with qualification on other determinations. This particular screen is full of the complex relations that we have described.
[Screen shot callouts: Title, comments, and image; Annotation; Associate Related Materials]
Annotation
Looking at the show feature, we start to see these relationships. Note the references to the specimen and related annotations. Related objects are also present but are not shown because of size limitations.
Annotation
Here is another type of annotation, which allows users to import XML data and also place a marker on an image.
Annotation
This is the annotation tool that allows the user to insert specific markers on an image. Note that the image is not altered: the marker and annotations are stored separately. There is no restriction on the number of annotations an image or any other object may have. The original data is never altered.
Annotation
Text version of previous annotation:
Specimen record of an adult female of form Indeterminate Pteroceraphron mirablipennis gathered by D. C. Darling of the institute CNCI. The specimen was gathered on August 4th. The specimen was gathered near Indiana: Porter Co.: Cowles Bog: Dune Acres, United States of America. This particular specimen is of class Insecta, order Hymenoptera, family Ceraphronidae, genus and species Pteroceraphron mirablipennis. This particular image (104272) was submitted by Dr. Andy Deans on August 8th, 2006 and released November 12th. The view of the image is of the body, with a lateral view, using auto-montage photography. No particular preparation. There are six related images of this same specimen. There are two related determination annotations: (1) one which identifies that the wings and antennae match the key, and (2) one which states that this diagnosis is for the genus Pteroceraphron.

If we were to attempt to write out the complete annotation of the previous image, it would look something like this. Note that this is not a complete description. Finding information in this document and parsing it would be very difficult.
Annotation
[XML document; markup lost in extraction. Surviving values: 104272; lanceolot wings; 25.2352.1; This is a lanceolot wing; 67572; Andy Deans; 3; HymAtol]

This represents the XML version of the annotation that can be produced by MorphBank. Used in communicating with external sources, it is in a very machine-readable format but is less useful to humans. It is also very verbose, and I was only able to place a small portion of the document on the slide.
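To give a feel for what such a machine-readable annotation might look like, here is a minimal sketch using Python's standard xml.etree.ElementTree module. The element and attribute names (annotation, imageId, title, comment, contributor, userId) are assumptions made for this example, as is the pairing of each value with an element; the actual MorphBank schema did not survive in this transcript. Only the values themselves come from the slide.

```python
import xml.etree.ElementTree as ET


def build_annotation_xml(image_id, title, comment, user_id, contributor):
    """Serialize one annotation as XML. All tag names here are hypothetical."""
    root = ET.Element("annotation")
    ET.SubElement(root, "imageId").text = str(image_id)
    ET.SubElement(root, "title").text = title
    ET.SubElement(root, "comment").text = comment
    user = ET.SubElement(root, "contributor", {"userId": str(user_id)})
    user.text = contributor
    return ET.tostring(root, encoding="unicode")


# Values taken from the slide; their mapping to elements is a guess.
xml_text = build_annotation_xml(
    104272, "lanceolot wings", "This is a lanceolot wing", 67572, "Andy Deans"
)

# A consumer can read individual fields back without parsing free text.
parsed = ET.fromstring(xml_text)
title = parsed.findtext("title")
contributor_id = parsed.find("contributor").get("userId")
```

Compared with the prose version of the annotation, each field can be retrieved by name, which is what makes the format useful for exchange with external systems.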
Semantic Annotation
Semantic Annotation
Research Results
Collections are a form of annotation, because the items in a collection define a relationship.
Inheritance is a strategy for annotation, whereby we now know we can extend this model into new meanings.
Through the baseObject class we can form complex relationships, and through annotations we can provide meaning to those relationships.
Through Collections we can form relationships among objects that would otherwise have no direct links to each other, and through annotations we can provide meaning to those relationships.
There is no limit to how this capability can be extended.
Research Results
Through inheritance we restrict the semantics that are used with objects to improve context searches.
  Fields of data are not open to interpretation.
  Fields are distinct to the objects they reference.
  Example: Determination annotations inherit from Annotations and further restrict the meaning of that type of annotation.
Tools can now be built that allow for more extensive and elaborate building of relationships.
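The idea that a determination annotation inherits from a general annotation and narrows its meaning can be sketched as follows. This is a hedged illustration, not MorphBank's implementation: the class names, field names, and the Opinion enumeration are assumptions for this example. The three allowed opinions (agree, disagree, agree with qualification) are the ones named earlier in the talk.

```python
from dataclasses import dataclass
from enum import Enum


class Opinion(Enum):
    AGREE = "agree"
    DISAGREE = "disagree"
    AGREE_WITH_QUALIFICATION = "agree with qualification"


@dataclass
class Annotation:
    """General annotation: free-form title and comment on any object."""
    object_id: int
    title: str
    comment: str


@dataclass
class DeterminationAnnotation(Annotation):
    """Specialized annotation whose extra fields have one fixed meaning."""
    taxon_name: str
    opinion: Opinion

    def __post_init__(self):
        # Restricting the fields is what keeps them "not open to interpretation".
        if not isinstance(self.opinion, Opinion):
            raise ValueError("opinion must be one of the Opinion values")
        if not self.taxon_name.strip():
            raise ValueError("a determination needs a taxon name")


def determinations_for(annotations, taxon_name):
    """Context search: only determination annotations, matched on a typed field."""
    return [
        a for a in annotations
        if isinstance(a, DeterminationAnnotation) and a.taxon_name == taxon_name
    ]


notes = [
    Annotation(104272, "general note", "Pteroceraphron mentioned in passing"),
    DeterminationAnnotation(104272, "genus", "wings match key",
                            "Pteroceraphron", Opinion.AGREE),
]
matches = determinations_for(notes, "Pteroceraphron")
```

The general note mentions the same name in free text but is not returned, because the search is restricted to the typed field of the specialized annotation.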
Research Results
Versions 2.2 and 2.5 of MorphBank documented and released
  Currently working on subsequent versions
  Updating documentation
  Under configuration control
hits on the web site per day [count missing]
3 accepted conference papers, 1 biodiversity journal publication, 3 Taxonomic Data Working Group presentations, 1 ATOL/PBI presentation
Over 100,000 data items
Over 60,000 images
98 groups
121 registered users from 85 organizations (e.g., FSU, UF, Harvard, Yale, USC, American Museum of Natural History, Duke, Johns Hopkins)
350 annotations

As stated before, the results were significant. Multiple versions of the database were produced and documented for release. A simple search on the web shows the site being used throughout the community, and it has received rave reviews. In particular, the capability of collections and annotations, and the uniqueness of that capability, are drawing more attention to the site. Images, data, users, and groups are constantly being added, and as the capability of the system grows, so will its use. Efforts are underway to require that biologists publish their images in MorphBank to be used as references in publications.
Research Results
336 determination annotations
1,544 distinct objects contained in 384 collections
Received very positive feedback from trial participants
Received praise from the National Science Foundation for the quality and quantity of work accomplished to date
First biodiversity system to offer semantic association annotations, general annotations, legacy annotations, and determination annotations
Currently being used by organizations for collaboration on specimen determinations

MorphBank, through the use of semantic associations and annotations, offers the ability to increase our understanding of the relationships among distant entities. The more MorphBank is used (even if data is not added), the greater our understanding of relationships.
Research Results
Developed a prototype for semantic search on internally stored XML documents
External objects are exposed through LSIDs in an RDF format
XML documents are being exposed and used by other organizations and data repositories:
  MorphoBank
  GenBank
Provide direct links using the MorphBank Show function as URLs used in conference and journal papers
As the amount of data grows in MorphBank, so does the wealth of semantic associations

We were able to accomplish more. The ideas on collections, the extended use of XML data, Life Science Identifiers, exposing objects as RDF documents, and the proliferation of the use of the site are beyond our original expectations.
Research Results: David Gaitros Contribution
Analysis of the problem
  Analysis of the original MorphBank version 1.0.
  Analysis of data requirements and gathering of initial MorphBank requirements.
  Research of the current state of knowledge of annotations in scientific systems.
  Research of available taxonomic name servers.
Modeling
  Creation of the MorphBank security model.
  Creation of the MorphBank data model and schema.
  Creation of the semantic association annotation model.
Project Manager
  Leadership of the design team for the MorphBank system.
  Management of the production of MorphBank versions 2.2 and 2.5.
  Procurement of hardware and software licenses.
  Management of the MorphBank NSF/BDI grant under the direction of the Primary Investigators.
  Oversight of the functional and design review meetings with users and primary investigators.
  Presentations of the project at conferences and workshops.

MorphBank is a very large project, and there were lots of people involved with it. Besides the basic research on MorphBank, my particular contribution involved much more. Remember that in order to get the results, MorphBank had to work.
Research Results
Software Design and Development
  Design and implementation of the initial MorphBank Administration Model.
  Design and implementation of the initial version of the Taxonomic name selection module.
  Design and implementation of the MorphBank Annotation Software.
  Design and implementation of the initial version of the MorphBank Collection module.
  Design of the external search and exposure feature for the release of MorphBank images in response to MorphoBank external references requirements.
  Design of the software test plans.
  Contributor to the MorphBank user's manual.
Future Work
Continue to extend the capability of Annotations and Collections
  Turn on the feature that allows for the annotation of any object
  Turn on the feature that allows for any object to be in a collection
Research more efficient search techniques for semantic associations
Complete development and release of phylogenetic character state software
Research the possibility of further developing the extensible schema capability
Analysis of the complexity of relationships of the objects associated through collections and annotations
Expand and mature the use of Life Science Identifiers
Implement a security strategy that is separate from the implementation of the software
Map the current data schema to the ABCD standard for the purpose of exporting data
Publish results in a high-quality journal
Continued exposure at conferences and workshops

Future work.
QUESTIONS

Environment Requirements
One of the major problems with semantic associations is the complexity and reliability of the relationships.
Allowing unqualified individuals to make contributions to the data repositories introduces errors that make the data unreliable.
Relationship connections are easily corrupted if heuristics are not followed.
Environment Requirements
Features of MorphBank that satisfy environment requirements:
Secure login of ALL contributors
Restriction of contributors to the area of their expertise
Group membership and data ownership
Categories of data:
  In-progress
  Under review
  Released (cannot be altered, only annotated)
Strict adherence to add, update, view, and delete heuristics
All objects are centrally cataloged and uniquely identifiable
All objects can be accessed via a globally unique identifier
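The data categories listed above, and in particular the rule that released data cannot be altered but can still be annotated, can be sketched as a small state check. This is an assumption-laden illustration: the class and method names are hypothetical, and the strict ordering of the three states (in-progress, then under review, then released) is assumed here rather than stated in the talk.

```python
from enum import Enum


class State(Enum):
    IN_PROGRESS = "in-progress"
    UNDER_REVIEW = "under review"
    RELEASED = "released"


class Record:
    """Hypothetical cataloged object with a lifecycle state."""

    _ORDER = [State.IN_PROGRESS, State.UNDER_REVIEW, State.RELEASED]

    def __init__(self, object_id, data):
        self.object_id = object_id
        self.data = dict(data)
        self.state = State.IN_PROGRESS
        self.annotations = []

    def update(self, **changes):
        # Released objects are frozen: the original data may no longer change.
        if self.state is State.RELEASED:
            raise PermissionError("released objects cannot be altered, only annotated")
        self.data.update(changes)

    def advance(self):
        # Move to the next state; a released record stays released.
        i = self._ORDER.index(self.state)
        if i < len(self._ORDER) - 1:
            self.state = self._ORDER[i + 1]

    def annotate(self, text):
        # Annotations are kept apart from the data, so they are always allowed.
        self.annotations.append(text)


record = Record(104272, {"view": "lateral"})
record.update(view="dorsal")   # allowed while in progress
record.advance()               # now under review
record.advance()               # now released
try:
    record.update(view="ventral")
    blocked = False
except PermissionError:
    blocked = True
record.annotate("wings match key")
```

Keeping annotations in a separate list is what lets discussion continue on a released object without touching the contributed data.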
Semantic Annotation
We want multiple annotations per MorphBank object, to allow scientists to add ad-hoc data to the database without specifically creating new tables or new columns in existing tables.
How do we store and retrieve this information in an efficient and reliable manner?
How do we relate these annotations correctly to all other objects?
Semantic Annotation
Most disciplines have a common language and phrases that they use in describing articles in their area.
Example: communication of a pilot to a control tower.
Pilot: "Tallahassee ground control, this is Cessna 3245 Yankee on ramp, ready for taxi to active runway with information Bravo."
We can pick out specific information that appears in an exact order.
This system of formal semantics in aviation communication allows the participants to communicate efficiently and effectively without misunderstanding.
We can schematize this conversation.

We found during the course of the research that biologists, like those in other disciplines, have their own common language and ontologies. If we can capture these types of phrases and dissect them, we can schematize them for storage.
Semantic Annotation
[XML example; markup lost in extraction. Surviving text: Tallahassee Ground Control Cessna 32345/Taxittoramp>]

This is a simple illustration of how we were able to formulate the different annotation and collection schemas.
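The step of dissecting a formal phrase into named parts can be sketched with a regular expression. This is a minimal illustration, assuming the fixed word order of the pilot's call quoted in the talk; the field names (recipient, sender, position, request, information) and the pattern itself are hypothetical and are not the schema shown on the original slide.

```python
import re

# Assumed phrase order: <recipient> this is <sender> on <position>
# ready for <request> with information <code>
CALL_PATTERN = re.compile(
    r"^(?P<recipient>.+?),? this is (?P<sender>.+?) on (?P<position>.+?),? "
    r"ready for (?P<request>.+?) with information (?P<information>\w+)\.?$",
    re.IGNORECASE,
)


def schematize(call):
    """Split a radio call into named fields, or return None if it does not fit."""
    match = CALL_PATTERN.match(call.strip())
    return match.groupdict() if match else None


call = ("Tallahassee ground control this is Cessna 3245 Yankee on ramp "
        "ready for taxi to active runway with information Bravo")
fields = schematize(call)
```

Once the phrase is split this way, each part can be stored and searched as its own field instead of as one block of text. A phrase that does not follow the expected order simply returns None rather than being forced into the schema.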
Semantic Annotation
With a biological image annotation we have several distinct parts:
Specimen (biological item of interest)
Image (a specimen may have more than one image)
Type of annotation
Text description of annotation
Title of annotation
Date (time stamp)
Location (X/Y coordinate of the area on the image)
Associated MorphBank object (Image, Specimen, Publication, Group, User, Annotation, Location, View)
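The parts listed above map naturally onto a record type. The sketch below is one possible shape, assuming hypothetical names (ImageAnnotation, AnnotationStore) and made-up identifiers other than the image number 104272 that appears elsewhere in the talk. It also reflects the earlier point that annotations are stored separately from the image, with no limit on how many an image may have.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Tuple


@dataclass(frozen=True)
class ImageAnnotation:
    """One annotation on an image; field names are illustrative only."""
    specimen_id: int
    image_id: int
    annotation_type: str
    title: str
    text: str
    location: Tuple[float, float]               # X/Y coordinate on the image
    related_object_ids: Tuple[int, ...] = ()    # other associated objects
    created: datetime = field(default_factory=lambda: datetime.now(timezone.utc))


class AnnotationStore:
    """Keeps annotations apart from image data, keyed by image id."""

    def __init__(self):
        self._by_image = {}

    def add(self, annotation):
        self._by_image.setdefault(annotation.image_id, []).append(annotation)

    def for_image(self, image_id):
        return list(self._by_image.get(image_id, []))


store = AnnotationStore()
# The specimen id 5001 and object id 7002 are invented for this example.
store.add(ImageAnnotation(5001, 104272, "general", "wing shape",
                          "This is a lanceolot wing", (25.0, 52.0)))
store.add(ImageAnnotation(5001, 104272, "determination", "genus",
                          "wings and antennae match key", (10.0, 12.0), (7002,)))
```

Because the store is keyed by image id and never touches image content, any number of annotations can accumulate on one image while the contributed data stays unchanged.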
Semantic Annotation
All aspects of the annotation can be placed into a schema and searched accurately.
Searches on plain text present a problem. Example: a web search for "Fruit Fly".
Solution:
  Allow researchers to use restricted semantic annotation in writing the text description.
  Place data in an XML document.
  Items can be searched quickly and efficiently.
  No restrictions on content.
  New semantics can be added at any time.

In MorphBank and in this research we want to make several improvements to searching for information. We want to make it more accurate: unlike a Google search, we want a search for a specific string to return only the related objects. We also need it to be fast. There are several ways to accomplish this: using the specific relationships built into the system, searching only related objects and not the whole database, and using the power of XML documents by specifying the attributes.
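The difference between a plain-text search and a search restricted to a named XML field can be shown with a short sketch using Python's standard xml.etree.ElementTree module. The two sample documents and the tag names (annotation, commonName, comment) are invented for this illustration and are not MorphBank data or its schema.

```python
import xml.etree.ElementTree as ET

# Two made-up annotation documents keyed by a made-up object id.
DOCS = {
    1: "<annotation><commonName>fruit fly</commonName>"
       "<comment>wing venation visible</comment></annotation>",
    2: "<annotation><commonName>honey bee</commonName>"
       "<comment>caught in the same trap, clearly not a fruit fly</comment></annotation>",
}


def plain_text_search(docs, phrase):
    """Match the phrase anywhere in the document, as a web search might."""
    phrase = phrase.lower()
    return sorted(key for key, xml in docs.items() if phrase in xml.lower())


def field_search(docs, tag, value):
    """Match only when the named element's text equals the value."""
    hits = []
    for key, xml in docs.items():
        root = ET.fromstring(xml)
        if any((el.text or "").strip().lower() == value.lower()
               for el in root.iter(tag)):
            hits.append(key)
    return sorted(hits)


loose = plain_text_search(DOCS, "fruit fly")
exact = field_search(DOCS, "commonName", "fruit fly")
```

The plain-text search returns both documents because the phrase also appears in a comment about a honey bee, while the field search returns only the document whose commonName is actually "fruit fly".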
Semantic Annotation
[Figure; surviving labels: A, B, Red, Green, Black]