gene, proteins, and genetic code. protein synthesis in a cell
![Page 1: Gene, Proteins, and Genetic Code. Protein Synthesis in a Cell](https://reader031.vdocuments.mx/reader031/viewer/2022013004/56649f4f5503460f94c714e5/html5/thumbnails/1.jpg)
Gene, Proteins, and Genetic Code
![Page 2: Gene, Proteins, and Genetic Code. Protein Synthesis in a Cell](https://reader031.vdocuments.mx/reader031/viewer/2022013004/56649f4f5503460f94c714e5/html5/thumbnails/2.jpg)
Protein Synthesis in a Cell
![Page 3: Gene, Proteins, and Genetic Code. Protein Synthesis in a Cell](https://reader031.vdocuments.mx/reader031/viewer/2022013004/56649f4f5503460f94c714e5/html5/thumbnails/3.jpg)
A protein sequence
>gi|7228451|dbj|BAA92411.1| EST AU055734(S20025) corresponds to a region …
MCSYIRYDTPKLFTHVTKTPPKNQVSNSINDVGSRRATDRSVASCSSEKSVGTMSVKNASSISFEDIEKSISNWKIPKVN
IKEIYHVDTDIHKVLTLNLQTSGYELELGSENISVTYRVYYKAMTTLAPCAKHYTPKGLTTLLQTNPNNRCTTPKTLKWD
EITLPEKWVLSQAVEPKSMDQSEVESLIETPDGDVEITFASKQKAFLQSRPSVSLDSRPRTKPQNVVYATYEDNSDEPSI
SDFDINVIELDVGFVIAIEEDEFEIDKDLLKKELRLQKNRPKMKRYFERVDEPFRLKIRELWHKEMREQRKNIFFFDWYE
SSQVRHFEEFFKGKNMMKKEQKSEAEDLTVIKKVSTEWETTSGNKSSSSQSVSPMFVPTIDPNIKLGKQKAFGPAISEEL
VSELALKLNNLKVNKNINEISDNEKYDMVNKIFKPSTLTSTTRNYYPRPTYADLQFEEMPQIQNMTYYNGKEIVEWNLDG
FTEYQIFTLCHQMIMYANACIANGNKEREAANMIVIGFSGQLKGWWNNYLNETQRQEILCAVKRDDQGRPLPDRDGNGNP
TELKEGFHMEEKDEPIQEDDQVVGTIQKYTKQKWYAEVMYRFIDGSYFQHITLIDSGADVNCIREDEILDQLVQTKREQV
VNSIYLHDNSFPKSMDLPDQKITEKRAKLQDIPHHEERLLDYREKKSRDGQDKLPMEVEQSMATNKNTKILLRAWLLST
A protein sequence may contain from a few hundred to several thousand amino acids.
![Page 4: Gene, Proteins, and Genetic Code. Protein Synthesis in a Cell](https://reader031.vdocuments.mx/reader031/viewer/2022013004/56649f4f5503460f94c714e5/html5/thumbnails/4.jpg)
Protein synthesis
![Page 5: Gene, Proteins, and Genetic Code. Protein Synthesis in a Cell](https://reader031.vdocuments.mx/reader031/viewer/2022013004/56649f4f5503460f94c714e5/html5/thumbnails/5.jpg)
Genetic code
..ATT CAC AGT GGA..
ATT -> I, CAC -> H, AGT -> S, GGA -> G
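The codon-to-amino-acid mapping on this slide can be reproduced with a small lookup table. The following is a minimal Python sketch, assuming the standard genetic code; the names `CODON_TABLE` and `translate` are illustrative and do not come from the slides.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, one letter per codon, with codons enumerated in
# TCAG order (first base varies slowest, third base fastest). '*' is a stop.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a DNA string codon by codon, starting at index 0.

    Trailing bases that do not fill a whole codon are ignored.
    """
    dna = dna.upper()
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


print(translate("ATTCACAGTGGA"))  # IHSG
```

Starting the translation at index 1 or 2 instead of 0 gives the other two reading frames of the same strand.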
![Page 6: Gene, Proteins, and Genetic Code. Protein Synthesis in a Cell](https://reader031.vdocuments.mx/reader031/viewer/2022013004/56649f4f5503460f94c714e5/html5/thumbnails/6.jpg)
Notes on translation
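• Three reading frames (per strand)
• Third base is often not important (many amino acids are encoded by codons that differ only in the third base)
• Read in the 5’ -> 3’ direction
• Start and stop codons
• Open Reading Frame (ORF)
• Each gene is an ORF, but not all ORFs are genes.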
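The notes above on reading frames, start/stop codons, and ORFs can be sketched as a simple scan. This is a minimal illustration, assuming ATG as the only start codon, the three standard stop codons, and the forward strand only (the reverse strand would add three more frames); `find_orfs` and `min_codons` are hypothetical names chosen for this example.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(dna, min_codons=1):
    """Return (frame, start, end) for each ATG...stop stretch.

    start is the index of the A in ATG; end is the index just past the
    stop codon. min_codons is the minimum number of codons before the
    stop codon, counting the ATG itself.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first in-frame start codon opens the ORF
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((frame, start, i + 3))
                start = None  # look for the next ORF in this frame
    return orfs


print(find_orfs("CCATGAAATAGGATGCCCTGA"))
```

Raising `min_codons` filters out short ORFs that are likely to occur by chance, at the cost of missing genuinely short genes.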
![Page 7: Gene, Proteins, and Genetic Code. Protein Synthesis in a Cell](https://reader031.vdocuments.mx/reader031/viewer/2022013004/56649f4f5503460f94c714e5/html5/thumbnails/7.jpg)
The Central Dogma of Molecular Biology
DNA -> RNA -> Protein (transcription, then translation)
DNA -> DNA (replication)
genotype -> phenotype
![Page 8: Gene, Proteins, and Genetic Code. Protein Synthesis in a Cell](https://reader031.vdocuments.mx/reader031/viewer/2022013004/56649f4f5503460f94c714e5/html5/thumbnails/8.jpg)
Exception – retroviruses
RNA -> DNA (reverse transcription), then DNA -> RNA -> Protein (transcription, then translation)
DNA -> DNA (replication)
genotype -> phenotype
![Page 9: Gene, Proteins, and Genetic Code. Protein Synthesis in a Cell](https://reader031.vdocuments.mx/reader031/viewer/2022013004/56649f4f5503460f94c714e5/html5/thumbnails/9.jpg)
Protein (Phenotype)
DNA (Genotype)
Biology
![Page 10: Gene, Proteins, and Genetic Code. Protein Synthesis in a Cell](https://reader031.vdocuments.mx/reader031/viewer/2022013004/56649f4f5503460f94c714e5/html5/thumbnails/10.jpg)
Genes
• One gene encodes one protein (or sometimes RNA).
• Like a program, a gene starts with a start codon (e.g. ATG); each following group of three bases (a codon) then encodes one amino acid. A stop codon (e.g. TGA) signifies the end of the gene.
• Genes are dense in prokaryotes and sparse in eukaryotes.
• In the middle of a eukaryotic gene, there are introns that are spliced out (as junk) after transcription. The parts that are kept are called exons. Telling exons from introns is part of the task of gene finding.
![Page 11: Gene, Proteins, and Genetic Code. Protein Synthesis in a Cell](https://reader031.vdocuments.mx/reader031/viewer/2022013004/56649f4f5503460f94c714e5/html5/thumbnails/11.jpg)
Gene related diseases
• Hemophilia: gene on the X chromosome.
• Sickle-cell anemia: single-nucleotide mutation in the first exon of the beta-globin gene (removes a cutting site). 1 in 12 African Americans is a carrier; homozygotes are affected.
• BRCA1 gene (chr. 17q): responsible for ½ of inherited breast cancer (10% of breast cancer).
• Fragile X syndrome (intellectual disability): 1 in 1250 males, 1 in 2500 females (dominant, but females have a partially expressed good copy of the gene). FMR-1 gene: more than 200 tri-nucleotide repeats cause the disease.
• P53 gene: chr. 17p, encodes a tumor suppressor protein.
![Page 12: Gene, Proteins, and Genetic Code. Protein Synthesis in a Cell](https://reader031.vdocuments.mx/reader031/viewer/2022013004/56649f4f5503460f94c714e5/html5/thumbnails/12.jpg)
Gene Prediction and Annotation
Prokaryotes
1. Start/stop codon (ORF)
2. Promoters
3. Content
4. Sequence similarity
![Page 13: Gene, Proteins, and Genetic Code. Protein Synthesis in a Cell](https://reader031.vdocuments.mx/reader031/viewer/2022013004/56649f4f5503460f94c714e5/html5/thumbnails/13.jpg)
![Page 14: Gene, Proteins, and Genetic Code. Protein Synthesis in a Cell](https://reader031.vdocuments.mx/reader031/viewer/2022013004/56649f4f5503460f94c714e5/html5/thumbnails/14.jpg)
Start Codon
• May miss short genes.
• Do not know which start codon to use.
• Overlapping ORFs at different reading frames.
![Page 15: Gene, Proteins, and Genetic Code. Protein Synthesis in a Cell](https://reader031.vdocuments.mx/reader031/viewer/2022013004/56649f4f5503460f94c714e5/html5/thumbnails/15.jpg)
Promoters
<-- upstream downstream -->
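5'-XXXXPPPPPPXXXXXXXXXPPPPPPXXXXGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGXXXX-3'
First P block: -35 region; second P block: -10 region; G: gene to be transcribed
Consensus base and its frequency at each position:
-10 (Pribnow box): T 77%, A 76%, T 60%, A 61%, A 56%, T 82%
-35: T 69%, T 79%, G 61%, A 56%, C 54%, A 54%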
In prokaryotes, the promoter consists of two short sequences at positions -10 and -35 upstream of the gene, that is, before the gene in the direction of transcription. The sequence at -10 is called the Pribnow box and usually consists of the six nucleotides TATAAT. The Pribnow box is absolutely essential to start transcription in prokaryotes. The other sequence, at -35, usually consists of the six nucleotides TTGACA. Its presence allows a very high transcription rate.
These rules are only approximately correct.
![Page 16: Gene, Proteins, and Genetic Code. Protein Synthesis in a Cell](https://reader031.vdocuments.mx/reader031/viewer/2022013004/56649f4f5503460f94c714e5/html5/thumbnails/16.jpg)
Scoring a 6-mer as Pribnow box
• We need a “score function” to measure the likelihood that a 6-mer is a Pribnow box
![Page 17: Gene, Proteins, and Genetic Code. Protein Synthesis in a Cell](https://reader031.vdocuments.mx/reader031/viewer/2022013004/56649f4f5503460f94c714e5/html5/thumbnails/17.jpg)
An example function for evaluating Pribnow box fitness
log()
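One plausible log-based score, not necessarily the function intended on this slide, is a log-odds sum over the six positions: for each position, take the log of the probability of the observed base in a Pribnow box divided by its background probability. The sketch below uses the -10 consensus frequencies given earlier (77%, 76%, 60%, 61%, 56%, 82%). Two things are assumptions made only for illustration: the remaining probability at each position is split evenly among the three non-consensus bases, and the background is uniform (0.25 per base). The name `pribnow_score` is hypothetical.

```python
import math

CONSENSUS = "TATAAT"
# Frequency of the consensus base at each -10 position (from the table above).
CONSENSUS_FREQ = [0.77, 0.76, 0.60, 0.61, 0.56, 0.82]
BACKGROUND = 0.25  # assumption: uniform base composition


def pribnow_score(sixmer):
    """Log-odds score of a 6-mer against the -10 consensus.

    Assumption: a non-consensus base gets one third of the probability
    left over after the consensus base.
    """
    if len(sixmer) != 6:
        raise ValueError("expected a 6-mer")
    score = 0.0
    for base, cons, p in zip(sixmer.upper(), CONSENSUS, CONSENSUS_FREQ):
        prob = p if base == cons else (1.0 - p) / 3.0
        score += math.log(prob / BACKGROUND)
    return score


for candidate in ("TATAAT", "TATGAT", "GCGCGC"):
    print(candidate, round(pribnow_score(candidate), 3))
```

Under these assumptions a positive score means the 6-mer looks more like a Pribnow box than like background sequence, and each mismatch with the consensus lowers the score.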
![Page 18: Gene, Proteins, and Genetic Code. Protein Synthesis in a Cell](https://reader031.vdocuments.mx/reader031/viewer/2022013004/56649f4f5503460f94c714e5/html5/thumbnails/18.jpg)
Content I – codon bias
• A codon XYZ occurs with different frequencies in coding regions and non-coding regions
• Different amino acids have different frequencies.
• Different codons for the same amino acid have different frequencies.
• In non-coding regions, the frequency of XYZ is approximately p(X)*p(Y)*p(Z)
![Page 19: Gene, Proteins, and Genetic Code. Protein Synthesis in a Cell](https://reader031.vdocuments.mx/reader031/viewer/2022013004/56649f4f5503460f94c714e5/html5/thumbnails/19.jpg)
http://www.kazusa.or.jp/codon/
![Page 20: Gene, Proteins, and Genetic Code. Protein Synthesis in a Cell](https://reader031.vdocuments.mx/reader031/viewer/2022013004/56649f4f5503460f94c714e5/html5/thumbnails/20.jpg)
Codon bias
• First, use many known genes of the organism or similar organisms to train a codon frequency table.
• Each codon ci has a frequency f(ci).
• Second, compute the background frequency bf(X) of each base X = A, C, G, T.
• The “significance” of a codon c = XYZ is then log( f(c) / (bf(X)*bf(Y)*bf(Z)) ), which is positive when the codon is more frequent in genes than the background predicts.
• A high average significance in a region is an indication of a gene.
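The procedure above can be sketched in a few functions. This is a minimal illustration, not a production gene finder: the function names, the tiny pseudo-frequency for codons never seen in training, and the sign convention (log ratio, so that high values point to coding sequence) are all choices made for this example.

```python
import math
from collections import Counter


def codon_frequencies(genes):
    """Train f(c): relative frequency of each codon in known genes."""
    counts = Counter()
    for gene in genes:
        for i in range(0, len(gene) - 2, 3):
            counts[gene[i:i + 3]] += 1
    total = sum(counts.values())
    return {codon: n / total for codon, n in counts.items()}


def base_frequencies(sequence):
    """Compute bf(X) for X in A, C, G, T from a background sequence."""
    counts = Counter(sequence)
    total = sum(counts[b] for b in "ACGT")
    return {b: counts[b] / total for b in "ACGT"}


def codon_significance(codon, f, bf, pseudo=1e-6):
    """log( f(c) / (bf(X)*bf(Y)*bf(Z)) ); unseen codons get a pseudo-frequency."""
    expected = bf[codon[0]] * bf[codon[1]] * bf[codon[2]]
    return math.log(max(f.get(codon, 0.0), pseudo) / expected)


def average_significance(region, f, bf):
    """Mean codon significance over a region read in frame 0."""
    codons = [region[i:i + 3] for i in range(0, len(region) - 2, 3)]
    if not codons:
        raise ValueError("region is shorter than one codon")
    return sum(codon_significance(c, f, bf) for c in codons) / len(codons)


f = codon_frequencies(["ATGGCTGCTGCTTAA", "ATGGCTGCTTGA"])
bf = base_frequencies("ACGTACGTACGT")
print(average_significance("GCTGCT", f, bf) > average_significance("CCCCCC", f, bf))
```

In practice the average would be computed in a sliding window and in each reading frame, and the training set would contain many genes rather than the two toy sequences used here.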
![Page 21: Gene, Proteins, and Genetic Code. Protein Synthesis in a Cell](https://reader031.vdocuments.mx/reader031/viewer/2022013004/56649f4f5503460f94c714e5/html5/thumbnails/21.jpg)
![Page 22: Gene, Proteins, and Genetic Code. Protein Synthesis in a Cell](https://reader031.vdocuments.mx/reader031/viewer/2022013004/56649f4f5503460f94c714e5/html5/thumbnails/22.jpg)
Content II - Hidden Markov Model (HMM)
![Page 23: Gene, Proteins, and Genetic Code. Protein Synthesis in a Cell](https://reader031.vdocuments.mx/reader031/viewer/2022013004/56649f4f5503460f94c714e5/html5/thumbnails/23.jpg)
Eukaryotes
• Basic idea similar to Prokaryotes
• Difference:
![Page 24: Gene, Proteins, and Genetic Code. Protein Synthesis in a Cell](https://reader031.vdocuments.mx/reader031/viewer/2022013004/56649f4f5503460f94c714e5/html5/thumbnails/24.jpg)
DNA-specific transcription factors
• These are the basis of gene-regulatory networks
• Another hot area in bioinformatics
![Page 25: Gene, Proteins, and Genetic Code. Protein Synthesis in a Cell](https://reader031.vdocuments.mx/reader031/viewer/2022013004/56649f4f5503460f94c714e5/html5/thumbnails/25.jpg)
Splicing
• Consensus sequences have been identified as necessary but not sufficient for splicing. In vertebrates, these sequences are (the slash identifies the exon-intron or intron-exon junction):
• C(orA)AG/GTA(orG)AGT "donor" splice site
• T(orC)nNC(orT)AG/G "acceptor" splice site.
• A third sequence, which in yeast is TACTAAC, is necessary within the intron sequence.
These rules are only approximately correct.
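As a rough illustration of how such a consensus can be used, the donor pattern C(orA)AG/GTA(orG)AGT translates directly into a regular expression. The sketch below only handles the donor site and only exact matches to the consensus; since the rules are approximate, a real gene finder would score near-matches rather than require an exact hit. The name `donor_sites` is hypothetical.

```python
import re

# Donor consensus: [CA]AG on the exon side, GT[AG]AGT on the intron side.
# A lookahead is used so that overlapping candidates are all reported.
DONOR = re.compile(r"(?=[CA]AGGT[AG]AGT)")


def donor_sites(dna):
    """Return the index of the first intron base (the G of GT) for each
    exact match to the donor consensus."""
    return [m.start() + 3 for m in DONOR.finditer(dna.upper())]


print(donor_sites("TTCAGGTAAGTCC"))  # [5]
```

Every reported position is only a candidate exon-intron junction, because the consensus is necessary but not sufficient for splicing.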
![Page 26: Gene, Proteins, and Genetic Code. Protein Synthesis in a Cell](https://reader031.vdocuments.mx/reader031/viewer/2022013004/56649f4f5503460f94c714e5/html5/thumbnails/26.jpg)
![Page 27: Gene, Proteins, and Genetic Code. Protein Synthesis in a Cell](https://reader031.vdocuments.mx/reader031/viewer/2022013004/56649f4f5503460f94c714e5/html5/thumbnails/27.jpg)
![Page 28: Gene, Proteins, and Genetic Code. Protein Synthesis in a Cell](https://reader031.vdocuments.mx/reader031/viewer/2022013004/56649f4f5503460f94c714e5/html5/thumbnails/28.jpg)