Bioinformatics and Data Management
[Title slide graphic: raw DNA sequence read containing many ambiguous (N) base calls]
Jeff [email protected]
Bioinformatics and Data Management
Lecture Goals
• Define Bioinformatics
• Explore NCBI’s website
• Introduce some useful tools available in Microsoft Excel
• Review other computer software used in Molecular Epidemiology
Bioinformatics Defined
A systematic approach to storing, classifying, and analyzing
data, information, and metadata
that allows for the acquisition of knowledge
and enhances the understanding of biological systems.
Data
Numbers derived from observations, experiments or calculations
Examples: Positive/Negative; A, T, G, C; ill/healthy
Information
Data in context.
Data with associated explanations and interpretations.
Examples: Publications, publicly available sequence data, and other freely available material.
Metadata
• Data about data
• Context in which information is used.
• One application's metadata is another application's data
Examples: Descriptive studies, reference lists, sources, GenBank accession numbers.
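To make the distinction between data and metadata concrete, here is a minimal Python sketch. The record layout, the field names, and the accession and source values are hypothetical illustrations and are not taken from the lecture.

```python
from dataclasses import dataclass, field


@dataclass
class SequenceRecord:
    """One observation plus the context needed to interpret it."""
    sequence: str                                  # data: the raw base calls
    metadata: dict = field(default_factory=dict)   # data about the data


def count_bases(sequence: str) -> dict:
    """Tally each symbol in the sequence (N marks an ambiguous base call)."""
    counts = {}
    for base in sequence.upper():
        counts[base] = counts.get(base, 0) + 1
    return counts


# Hypothetical record; the accession and source values are placeholders.
record = SequenceRecord(
    sequence="CTCAAGGGGTNAGNNN",
    metadata={"accession": "EXAMPLE0001", "source": "illustrative isolate"},
)

# A sequence-analysis tool treats the bases as its data ...
base_counts = count_bases(record.sequence)

# ... while a catalogue of records treats the accession (metadata above) as its data.
catalogue = {record.metadata["accession"]: len(record.sequence)}
```

The same accession string is metadata for the analysis step and data for the catalogue, which is the point of "one application's metadata is another application's data".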
Knowledge and Understanding
• This is what we’re all doing here!
• Advancement of Science
• Solving of problems
Bioinformatics Defined
Data, Information, Metadata
Knowledge & Understanding
Data Analysis Tools
• NCBI
– Bibliographic information
– Sequence analysis (nucleotide, protein)
– Other
• Data scanning using Microsoft Excel
• Other tools
– Gel comparisons
– Spatial data (GIS)
– Temporal data
– Statistical considerations
On-line Tools Available at NCBI
Blast Searches
Microsoft Excel
• Easily available
• Filters
• Pivot Table Reports and Charts
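For readers who want to see what a filter and a pivot table of counts do to a small data set, the following Python sketch mimics both with the standard library. It is an illustration only, not how Excel implements them; the column names and rows are made up.

```python
from collections import Counter

# Hypothetical isolate records: one row per isolate, one value per column.
rows = [
    {"site": "Farm A", "result": "Positive"},
    {"site": "Farm A", "result": "Negative"},
    {"site": "Farm B", "result": "Positive"},
    {"site": "Farm A", "result": "Positive"},
    {"site": "Farm B", "result": "Negative"},
]


def filter_rows(rows, column, value):
    """Keep only rows whose column equals value (like a spreadsheet filter)."""
    return [row for row in rows if row[column] == value]


def pivot_counts(rows, row_key, col_key):
    """Count rows for each (row_key, col_key) pair (like a pivot table of counts)."""
    return Counter((row[row_key], row[col_key]) for row in rows)


positives = filter_rows(rows, "result", "Positive")
pivot = pivot_counts(rows, "site", "result")

# Print one line per cell of the pivot table.
for (site, result), n in sorted(pivot.items()):
    print(f"{site}\t{result}\t{n}")
```

Both operations depend on each column holding exactly one kind of value, which is why tidy spreadsheets matter for scanning data this way.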
Other Software
• ClustalW multiple sequence alignments
– www.ebi.ac.uk/clustalw/index.html
• BioNumerics
– Fingerprint types
– Character types
– Sequence types
– 2-D gel types
– Matrix types
• Geographic Information System (GIS) software
Cardinal Rules of Data Management
I. Save your data
II. Back-up your data
III. Record the databases and program versions used
IV. Write down sequence numbers
V. Record program parameters
VI. Use E-values
VII. Double check your results visually
VIII. In spreadsheets, one entry per column
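Rules III, V, and VI can be sketched in a few lines of Python. This is one possible way to keep such a record, assuming a simple dictionary layout; the function names, field names, version string, cutoff, and hit values are all hypothetical placeholders rather than real search output.

```python
import json
from datetime import date


def make_run_record(program, version, database, parameters, run_date=None):
    """Bundle what is needed to repeat an analysis (rules III and V)."""
    return {
        "program": program,
        "version": version,
        "database": database,
        "parameters": parameters,
        "date": (run_date or date.today()).isoformat(),
    }


def filter_by_evalue(hits, max_evalue):
    """Keep hits whose E-value is at or below the cutoff (rule VI)."""
    return [hit for hit in hits if hit["evalue"] <= max_evalue]


# All values below are placeholders, not real search results.
run = make_run_record(
    program="blastn",
    version="x.y.z",
    database="nt",
    parameters={"max_evalue": 1e-5},
    run_date=date(2000, 1, 1),
)
hits = [
    {"subject": "hit_1", "evalue": 1e-50},
    {"subject": "hit_2", "evalue": 0.2},
    {"subject": "hit_3", "evalue": 3e-8},
]
kept = filter_by_evalue(hits, run["parameters"]["max_evalue"])

# Store the record next to the results so the analysis can be repeated later.
record_text = json.dumps({"run": run, "kept": kept}, indent=2)
print(record_text)
```

Saving the cutoff inside the run record means the filtered results and the parameter that produced them cannot drift apart.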