1_introduction to bioinformatics


Post on 03-Apr-2018


  • 7/29/2019 1_introduction to Bioinformatics

    1/28

    341: Introduction to Bioinformatics

    Dr. Nataša Pržulj & Prof. Yike Guo

    Department of Computing

    Imperial College London

    [email protected]

    Winter 2011


    Course overview

    Explosion in the availability of biological data:

        Sequences and microarrays (Prof. Guo)

        Networks: expected to be as useful as sequence data in uncovering new biology (Dr. Pržulj)

    The goal of systems biology:

        Systems-level understanding of biological systems, e.g. the cell

        Analyze not only the individual components, but also their interactions and the functioning of the system as a whole

        E.g.: learn new biology from the topology of such interaction networks

    However, biological network research faces considerable challenges:

        Incomplete and noisy data

        Computational infeasibility of many graph-theoretic problems


    Course overview

    We will cover:

    1. Biological aspects:

        Basic biological concepts (e.g., DNA, genes, proteins, gene expression, etc.)

        Different types of biological networks

        Experimental techniques for acquiring the data, and their biases

        Public databases and other sources of biological network data

    2. Sequence analysis (Prof. Yi-Ke Guo)

    3. Microarray analysis (Prof. Yi-Ke Guo)

    4. Graph-theoretic aspects:

        Fundamental topics in graph theory (e.g., basic graph notation, graph representation, and special graph types)

        Basic graph algorithms (e.g., graph search/traversal algorithms and running-time analysis)

        Important computational complexity concepts (e.g., complexity classes, subgraph isomorphism, and NP-completeness), which pose challenges for analyzing biological networks

    5. Existing approaches for analyzing and modeling biological networks:

        Structural properties of large networks

        Network models

        Network clustering

        Graph alignment

        Software tools for network analysis

    6. Applications: interplay of topology and biology

        Learn how the above methods have been applied

        Discuss valuable insights that have been gained into biological function, evolution, complex diseases (e.g., cancer), and drug discovery
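    To make the graph-theoretic items in point 4 concrete, here is a minimal sketch of one graph representation (an adjacency list) and one traversal algorithm (breadth-first search). The toy network and the name bfs_distances are made up for illustration and are not taken from the course material.

```python
from collections import deque

# Hypothetical toy interaction network stored as an adjacency list:
# each node maps to the set of its neighbours (undirected edges).
toy_network = {
    "A": {"B", "C"},
    "B": {"A", "D"},
    "C": {"A", "D"},
    "D": {"B", "C", "E"},
    "E": {"D"},
    "F": set(),  # isolated node, unreachable from the others
}


def bfs_distances(graph, source):
    """Return the distance (number of edges) from source to every reachable node."""
    distances = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in graph[node]:
            if neighbour not in distances:  # first visit gives the shortest distance
                distances[neighbour] = distances[node] + 1
                queue.append(neighbour)
    return distances


distances = bfs_distances(toy_network, "A")
print(sorted(distances.items()))
```

    Each node enters the queue at most once and each adjacency set is scanned once, so the running time is linear in the number of nodes plus edges.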


    Course overview

    Grading scheme:

        Two homework assignments

            Each assignment carries equal weight

            Due at the beginning of class

        Written exam

        The standard College grading scheme will be used


    Course overview

    Course organization:

    1. Lectures

        Relevant theoretical concepts and examples

    2. Tutorials

        Exercises on the concepts covered in class

    3. Two homework assignments

        Opportunity to solve practical problems using the methods learned in class

    4. Written exam

        Tests students' understanding of the concepts learned in lectures


    Course overview

    Textbooks and readings

    Recommended textbooks:

        Junker and Schreiber, Analysis of Biological Networks, Wiley, 2008.

        West, Introduction to Graph Theory, 2nd edition, Prentice Hall, 2001, or T. Cormen et al., Introduction to Algorithms, 3rd edition, MIT Press, 2009.

        A list of up-to-date research papers selected by the instructor.

    Recommended readings:

        F. Kepes (Author, Editor), Biological Networks (Complex Systems and Interdisciplinary Science), World Scientific Publishing Company, 1st edition, 2007.

        Bornholdt and Schuster (Editors), Handbook of Graphs and Networks: From the Genome to the Internet, Wiley, 2003, or Dorogovtsev and Mendes (Authors), Evolution of Networks: From Biological Nets to the Internet and WWW (Physics), Oxford University Press, 2003.

        Chapter 17 from: Chen and Lonardi (Editors), Biological Data Mining, Chapman and Hall/CRC Press, 2009.

        Chapter 4 from: Jurisica and Wigle (Editors), Knowledge Discovery in Proteomics, CRC Press, 2005.

        Mehlhorn and Näher, LEDA: A Platform for Combinatorial and Geometric Computing, Cambridge University Press, 1999.


    Course overview

    When and where:

    Fridays, 2-5 pm (2 hours of lecture, 1 hour of tutorial)

    145 Huxley

    Contact:

    E-mail: [email protected]

    Subject: 341 Bioinformatics

    Office hours:

    Fridays after class, 5 pm

    Office: 407 A Huxley


    Course overview

    Prerequisites: none

    Basic programming skills are desirable

    An introduction to biological concepts will be provided

    Course website (curriculum, class material, etc.):

    http://www.doc.ic.ac.uk/~natasha/course/index.html

    Academic code of honor


    Topics

    Introduction: biology (Dr. Pržulj)

    Sequence analysis (Prof. Guo, 2 lectures)

    Microarray analysis (Prof. Guo, 3 lectures)

    Network biology (Dr. Pržulj):

        Introduction to graph theory

        Network properties

        Network/node centralities

        Network motifs

        Network models

        Network/node clustering

        Network comparison/alignment

        Software tools for network analysis

        Interplay between topology and biology


    Course overview

    Any questions so far?


    Course overview

    About you


    Introduction: biology


    Introduction: biology

    Cell: the building block of life

    Cytoplasm and organelles separated by membranes:

    Mitochondria, nucleus, etc.


    Introduction: biology

    Distinguish between:

    Prokaryotes

        Single-celled, with no cell nucleus or any other membrane-bound organelles

        The genetic material in prokaryotes is not membrane-bound

        The bacteria and the archaea

        Model organism: E. coli

    Eukaryotes

        Have "true" nuclei containing their DNA

        May be unicellular, as in amoebae

        May be multicellular, as in plants and animals

        Model organism: S. cerevisiae (baker's yeast)


    Introduction: biology

    Nucleus contains DNA

    Deoxyribonucleic acid

    DNA nucleotides: A, T, C, and G (A pairs with T, C pairs with G)

    DNA structure: double helix
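    The A/T and C/G pairing means that one strand of the double helix determines the other. A minimal sketch, with the function names and the example sequence chosen here purely for illustration:

```python
# Watson-Crick base pairing: A pairs with T, C pairs with G.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complement(strand):
    """Return the base-by-base complement of a DNA strand."""
    return "".join(COMPLEMENT[base] for base in strand)


def reverse_complement(strand):
    """Return the opposite strand, written in its own 5'->3' direction."""
    return complement(strand)[::-1]


example = "ATGCCG"  # hypothetical sequence
print(complement(example))
print(reverse_complement(example))
```

    The reversal reflects that the two strands of the helix run in opposite directions.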


    Introduction: biology

    Chromosomes

    RNA: similar to DNA, except that T is replaced by U and RNA is single-stranded


    Introduction: biology

    Main role of DNA: long-term storage of genetic information

    Genes: DNA segments that carry this information

    Intron: part of a gene that is not translated into protein; spliced out of the mRNA

    Exon: the mRNA translated into protein consists only of exon-derived sequences

    Genome: total set of (unique) genes in an organism

    Every cell (except sex cells and mature red blood cells) contains the complete genome of an organism
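    The intron/exon distinction can be illustrated with a toy splicing step. This is only a sketch: the sequence and exon coordinates are invented, and in a real cell the splice sites are recognized from sequence signals rather than supplied as indices.

```python
def splice(pre_mrna, exons):
    """Join the exon segments of a transcript.

    exons is a list of (start, end) index pairs, half-open and in order;
    everything outside these intervals is treated as intron and dropped.
    """
    return "".join(pre_mrna[start:end] for start, end in exons)


# Hypothetical transcript: exon 1, one intron, exon 2.
exon_1 = "AUGGCC"
intron = "GUAAGUUUUCAG"
exon_2 = "UUUUAA"
pre_mrna = exon_1 + intron + exon_2

exons = [(0, 6), (18, 24)]  # assumed coordinates of the two exons
mature_mrna = splice(pre_mrna, exons)
print(mature_mrna)
```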


    Introduction: biology

    Codons: sets of three nucleotides

    4 nucleotides → 4^3 = 64 possible codons

    Each codon codes for an amino acid (or signals the end of translation)

    64 codons produce 20 different amino acids

        More than one codon can stand for the same amino acid

    Polypeptide:

    String of amino acids, composed from a 20-character alphabet

    Proteins: String composed of one or more polypeptides (70-3000 amino acids)

    Sequence of amino acids is defined by a gene

    Gene expression: information transmission from DNA to proteins

    Proteome: total set of proteins in an organism
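    The codon counts on this slide can be checked by enumeration. The sketch below builds the standard genetic code from a compact 64-letter string (one-letter amino acid codes, "*" for stop, bases in the order T, C, A, G); the variable names are illustrative.

```python
from collections import Counter
from itertools import product

BASES = "TCAG"
# Standard genetic code, one letter per codon in TCAG order; "*" marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# All ordered triples of the 4 bases: 4**3 codons.
codons = ["".join(triple) for triple in product(BASES, repeat=3)]
CODON_TABLE = dict(zip(codons, AMINO_ACIDS))

# How many codons map to each amino acid (or to stop).
codons_per_amino_acid = Counter(CODON_TABLE.values())

print(len(codons))                                      # number of possible codons
print(len(codons_per_amino_acid) - 1)                   # distinct amino acids, stop excluded
print(codons_per_amino_acid["L"], codons_per_amino_acid["M"])  # leucine vs. methionine
```

    The counts show the redundancy directly: leucine is encoded by six codons, while methionine has only one.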


    Introduction: biology

    The 20 amino acids


    Introduction: biology

    Levels of protein structure:


    Introduction: biology

    Genes vs. proteins

    Genes passive; proteins active

    Protein synthesis: from genes to proteins

    Transcription (in nucleus)

    Splicing (eukaryotes)

    Translation (in cytoplasm)


    Introduction: biology

    Transcription (in nucleus)

    The RNA polymerase enzyme builds an RNA strand from a gene (the DNA is "unzipped")

    The gene is transcribed to messenger RNA (mRNA)

    Transcription is regulated by proteins called transcription factors
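    As a string operation, transcription can be sketched in two equivalent ways: complementing the template strand with U in place of T, or copying the coding strand with T replaced by U. The function names and the example gene are assumptions made for illustration; regulation by transcription factors is not modeled.

```python
# Pairing used when RNA is built on a DNA template: A->U, T->A, C->G, G->C.
DNA_TO_RNA = {"A": "U", "T": "A", "C": "G", "G": "C"}


def transcribe_template(template_3_to_5):
    """Build the mRNA (5'->3') from the template strand given in 3'->5' order."""
    return "".join(DNA_TO_RNA[base] for base in template_3_to_5)


def transcribe_coding(coding_5_to_3):
    """Equivalent shortcut: the mRNA equals the coding strand with T replaced by U."""
    return coding_5_to_3.replace("T", "U")


coding_strand = "ATGGCCTTTTAA"    # hypothetical gene, 5'->3'
template_strand = "TACCGGAAAATT"  # its complement, written 3'->5'

mrna = transcribe_template(template_strand)
print(mrna)
```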


    Introduction: biology

    Translation (in cytoplasm)

    Ribosomes synthesize proteins from mRNA

    mRNA is decoded and used as a template to guide the synthesis of a chain of amino acids that forms a protein

    Translation: the process of converting the mRNA codon sequence into an amino acid polypeptide chain
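    A minimal sketch of translation as codon-by-codon decoding, using the standard genetic code with one-letter amino acid codes. It assumes reading starts at the first AUG and ends at the first stop codon; the function name and example sequences are illustrative only.

```python
from itertools import product

RNA_BASES = "UCAG"
# Standard genetic code in UCAG order; "*" marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(triple): amino_acid
    for triple, amino_acid in zip(product(RNA_BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna):
    """Translate an mRNA string into a polypeptide (one-letter codes).

    Simplifying assumptions: start at the first AUG, read non-overlapping
    codons, and stop at the first stop codon or at the end of the sequence.
    """
    start = mrna.find("AUG")
    if start == -1:
        return ""  # no start codon found
    polypeptide = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        polypeptide.append(amino_acid)
    return "".join(polypeptide)


print(translate("AUGGCCUUUUAA"))
```

    In the example, AUG, GCC, and UUU decode to M, A, and F, and UAA ends the chain.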


    Introduction: biology

    Microarrays:

    Measure mRNA abundance for each gene

    The amount of transcribed mRNA correlates with gene expression: the rate at which a gene produces the corresponding protein

    It is hard to measure protein levels directly!


    Introduction: biology

    Every cell* contains the complete genome of an organism

    How is the variety of different tissues encoded and expressed?


    Introduction: biology


    Introduction: biology

    -ome and -omics

    Genome and genomics

    Proteome and proteomics
