Bioinformatics and Data Management

Jeff LeJeune [email protected] 330-263-3739

Upload: jacqueline-frederick

Post on 04-Jan-2016


Page 1: Bioinformatics  and  Data Management

Jeff LeJeune

Bioinformatics and

Data Management

Page 2: Bioinformatics  and  Data Management

Lecture Goals

• Define Bioinformatics

• Explore NCBI’s website

• Introduce some useful tools available in Microsoft Excel

• Review other computer software used in Molecular Epidemiology

Page 3: Bioinformatics  and  Data Management

Bioinformatics Defined

A systematic approach to storing, classifying, and analyzing data, information, and metadata that allows for the acquisition of knowledge and the enhancement of the understanding of biological systems.

Page 4: Bioinformatics  and  Data Management

Data

Numbers derived from observations, experiments or calculations

Examples: Positive/Negative; A, T, G, C; ill/healthy

Page 5: Bioinformatics  and  Data Management

Information

Data in context.

Data with associated explanations and interpretations.

Examples: Publications, available sequence data, and other freely available material.

Page 6: Bioinformatics  and  Data Management

Metadata

• Data about data

• Context in which information is used.

• One application's metadata is another application's data

Examples: Descriptive studies, reference lists, sources, GenBank accession numbers.

Page 7: Bioinformatics  and  Data Management

Knowledge and Understanding

• This is what we’re all doing here!

• Advancement of Science

• Solving of problems

Page 8: Bioinformatics  and  Data Management

Bioinformatics Defined

Data, Information, Metadata

Knowledge & Understanding

Page 9: Bioinformatics  and  Data Management

Data Analysis Tools

• NCBI
– Bibliographic information
– Sequence analysis (nucleotide, protein)
– Other

• Data scanning using Microsoft Excel

• Other tools
– Gel comparisons
– Spatial data (GIS)
– Temporal data
– Statistical considerations

Page 10: Bioinformatics  and  Data Management

On-line Tools Available at NCBI

Page 11: Bioinformatics  and  Data Management

Blast Searches

Page 12: Bioinformatics  and  Data Management

Microsoft Excel

• Easily available

• Filters

• Pivot Table Reports and Charts
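As a rough illustration of what a filter and a pivot table report compute, the sketch below reproduces both in plain Python. The records, the column names (`farm`, `species`, `result`) and the values are invented for illustration only and do not come from the lecture.

```python
from collections import Counter

# Hypothetical sample records (invented for illustration), one entry per column.
ROWS = [
    {"farm": "A", "species": "cattle", "result": "Positive"},
    {"farm": "A", "species": "cattle", "result": "Negative"},
    {"farm": "A", "species": "swine", "result": "Positive"},
    {"farm": "B", "species": "cattle", "result": "Positive"},
    {"farm": "B", "species": "swine", "result": "Negative"},
    {"farm": "B", "species": "cattle", "result": "Positive"},
]


def filter_rows(rows, column, value):
    """Keep only rows whose `column` equals `value` (similar to an Excel filter)."""
    return [row for row in rows if row[column] == value]


def pivot_count(rows, row_field, col_field):
    """Count rows for each (row_field, col_field) pair (similar to a pivot table of counts)."""
    counts = Counter((row[row_field], row[col_field]) for row in rows)
    row_keys = sorted({r for r, _ in counts})
    col_keys = sorted({c for _, c in counts})
    # Nested dict: table[row_key][col_key] -> count, with 0 for empty cells.
    return {r: {c: counts.get((r, c), 0) for c in col_keys} for r in row_keys}


if __name__ == "__main__":
    positives = filter_rows(ROWS, "result", "Positive")
    print(len(positives), "positive samples")
    for farm, cells in pivot_count(ROWS, "farm", "result").items():
        print(farm, cells)
```

The same layout (one record per row, one kind of entry per column) is what makes filters and pivot tables work reliably in a spreadsheet as well.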

Page 13: Bioinformatics  and  Data Management

Other Software

• ClustalW: multiple sequence alignments
– www.ebi.ac.uk/clustalw/index.html

• BioNumerics

Page 14: Bioinformatics  and  Data Management

• Fingerprint types

• Character types

• Sequence types

• 2-D gel types

• Matrix types

Page 15: Bioinformatics  and  Data Management

Geographic Information System (GIS) software

Page 16: Bioinformatics  and  Data Management

Cardinal Rules of Data Management

I. Save your data

II. Back-up your data

III. Record the databases and program versions used

IV. Write down sequence numbers

V. Record program parameters

VI. Use E-values

VII. Double check your results visually

VIII. In spreadsheets, one entry per column
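Rules III, V and VI can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the file name `run_record.json`, the field names, the program and database labels, the hit list and the 1e-5 cutoff are all hypothetical and are not taken from the lecture.

```python
import json
import tempfile
from datetime import date
from pathlib import Path


def make_run_record(program, version, database, parameters):
    """Bundle what was run, against which database, and with which settings."""
    return {
        "date": date.today().isoformat(),
        "program": program,
        "version": version,
        "database": database,
        "parameters": parameters,
    }


def save_run_record(record, path):
    """Write the record to disk so it can be stored alongside the results."""
    Path(path).write_text(json.dumps(record, indent=2, sort_keys=True))


def filter_hits(hits, max_evalue):
    """Keep hits whose E-value is at or below the cutoff (smaller means less likely by chance)."""
    return [hit for hit in hits if hit["evalue"] <= max_evalue]


# Hypothetical search hits (invented identifiers and E-values).
HITS = [
    {"subject": "hit_1", "evalue": 3e-50},
    {"subject": "hit_2", "evalue": 2e-6},
    {"subject": "hit_3", "evalue": 0.8},
]

if __name__ == "__main__":
    record = make_run_record(
        program="blastn",                # assumed program name
        version="x.y.z",                 # placeholder: copy from the program's own output
        database="example_db",           # placeholder database label
        parameters={"max_evalue": 1e-5}, # assumed cutoff, chosen for illustration
    )
    out = Path(tempfile.gettempdir()) / "run_record.json"
    save_run_record(record, out)
    kept = filter_hits(HITS, record["parameters"]["max_evalue"])
    print([hit["subject"] for hit in kept])
```

Keeping the record as a small text file next to the results means the database, version and parameters travel with the data when it is saved and backed up.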