genomic signal processing

MADE BY- SHOBHIT SRIVASTAVA 1109131125


Post on 16-Jul-2015


A gene is a protein-coding region of the genome.

The genome is the entirety of an organism's hereditary information.

Question??

The human genome contains 3164.7 million chemical nucleotide bases (A, C, T, and G).

The average gene consists of 3000 bases, but sizes vary greatly, with the largest known human gene containing 2.4 million bases.

The total number of genes is estimated at 30,000 to 35,000.

Less than 2% of the genome is used in protein coding.

At least 50% of the genome consists of repetitive sequences that do not code for proteins.

GENOME doesn’t generate any signal.

REMEMBER-

GENOME

•DNA

•PROTEINS

PROPERTY

•PERIOD 3 BEHAVIOR

•ALTERNATIVE SPLICING

Gene to protein

DNA → (transcription) → mRNA → (translation) → Protein

DNA: CCTGAGCCAACTATTGATGAA

mRNA: CCUGAGCCAACUAUUGAUGAA

Protein: PEPTIDE
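The example sequence above can be checked with a few lines of code. The following is a minimal sketch, assuming the DNA string is the coding strand (so transcription amounts to replacing T with U) and using the standard genetic code with one-letter amino acid symbols. The names `transcribe`, `translate`, and `CODON_TABLE` are illustrative and not part of the slides.

```python
# Standard genetic code, listed in T, C, A, G order for each codon position.
# '*' marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}

def transcribe(dna):
    """Coding-strand DNA to mRNA (assumption: simply replace T with U)."""
    return dna.upper().replace("T", "U")

def translate(mrna):
    """Read the mRNA three bases at a time and return one-letter amino acids."""
    seq = mrna.upper().replace("U", "T")
    # Ignore any incomplete codon at the end of the sequence.
    codons = [seq[i:i + 3] for i in range(0, len(seq) - 2, 3)]
    return "".join(CODON_TABLE[codon] for codon in codons)

dna = "CCTGAGCCAACTATTGATGAA"
mrna = transcribe(dna)
protein = translate(mrna)
print(mrna)     # CCUGAGCCAACUAUUGAUGAA
print(protein)  # PEPTIDE
```

Each group of three bases (a codon) yields one amino acid, which is why the 21-base example produces a 7-letter result.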

Protein-coding regions of DNA have been found to have a peak at frequency 2π/3 in their Fourier spectra. This is called the period-3 property.

The period-3 property is related to the different statistical distributions of codons between protein-coding and noncoding DNA sections.

The period-3 property can be used as a basis for identifying the coding and non-coding regions in a DNA sequence.
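One common way to measure the period-3 peak is to turn the sequence into four binary sequences, one per base, and add up their spectral power at frequency 2π/3. The sketch below assumes that approach; the function name `period3_power` and the two toy sequences are illustrative only and are not real genomic data.

```python
import cmath

def period3_power(dna):
    """Sum over bases of |sum_n u_b[n] * exp(-j*2*pi*n/3)|^2,
    where u_b[n] is 1 if position n holds base b and 0 otherwise."""
    dna = dna.upper()
    total = 0.0
    for base in "ACGT":
        coeff = sum(
            cmath.exp(-2j * cmath.pi * n / 3)
            for n, ch in enumerate(dna)
            if ch == base
        )
        total += abs(coeff) ** 2
    return total

# Toy inputs: a perfectly 3-periodic sequence and a 4-periodic one, both 60 bases.
periodic3 = "ATG" * 20
periodic4 = "ACGT" * 15
print(period3_power(periodic3))
print(period3_power(periodic4))
```

The 3-periodic toy sequence concentrates its power at 2π/3, while the 4-periodic one has essentially none there. Real coding regions show a weaker but still visible peak.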

Identification of protein coding regions

Prediction of the proper reading frame

Compared with traditional methods, signal processing methods are much quicker and, in some cases, can be even more accurate.

By mapping the chemical bases of DNA to a number set, we give ourselves an effective “DNA signal”.
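The slides do not say which number set is used. As one possibility, the sketch below uses binary indicator sequences (one 0/1 sequence per base), a widely used mapping; the function name `indicator_sequences` is an assumption made for illustration.

```python
def indicator_sequences(dna):
    """Map a DNA string to four 0/1 sequences, one per base.
    The sequence for base b is 1 wherever the DNA holds b, else 0."""
    dna = dna.upper()
    return {base: [1 if ch == base else 0 for ch in dna] for base in "ACGT"}

signals = indicator_sequences("CCTGAGCCAACTATTGATGAA")
for base, values in signals.items():
    print(base, values)
```

At every position exactly one of the four sequences is 1, so no information about the original string is lost, and each sequence can be fed to ordinary numeric tools such as a Fourier transform or a digital filter.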

A properly defined Fourier transform is a powerful predictor of both the existence and the reading frame of protein coding regions in DNA sequences.
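To locate regions rather than score a whole sequence, the period-3 measure can be evaluated in a window that slides along the DNA. The sketch below illustrates only the "existence" part of the claim (it does not attempt reading-frame prediction). The window length, step size, function names, and the synthetic test sequence are all assumptions chosen for illustration.

```python
import cmath

def period3_power(window):
    """Spectral power at frequency 2*pi/3, summed over the four base indicators."""
    total = 0.0
    for base in "ACGT":
        coeff = sum(
            cmath.exp(-2j * cmath.pi * n / 3)
            for n, ch in enumerate(window)
            if ch == base
        )
        total += abs(coeff) ** 2
    return total

def sliding_period3(dna, window=60, step=3):
    """Return (start, power) pairs for each window position along the sequence."""
    dna = dna.upper()
    return [
        (start, period3_power(dna[start:start + window]))
        for start in range(0, len(dna) - window + 1, step)
    ]

# Synthetic sequence: a 3-periodic stretch (positions 120-239) between 4-periodic flanks.
sequence = "ACGT" * 30 + "ATG" * 40 + "ACGT" * 30
profile = sliding_period3(sequence)
best_start, best_power = max(profile, key=lambda item: item[1])
print(best_start, round(best_power, 1))
```

Plotting the power against the window start gives a curve that rises over the 3-periodic stretch and falls over the flanks. On real DNA the contrast is smaller, and a threshold has to be chosen empirically.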

Color mapping schemes based on these transforms can help in visually identifying protein-coding regions.

FILTERS

ANTI-NOTCH FILTERS

INFINITE IMPULSE RESPONSE FILTERS

MULTISTAGE FILTERS
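An anti-notch filter passes a narrow band around one frequency and attenuates the rest, which makes it a natural fit for the 2π/3 peak. The slides do not give a filter design, so the sketch below substitutes a plain second-order IIR resonator with poles at radius `r` and angle 2π/3. The pole radius, the function names, and the toy sequences are assumptions, and this is not necessarily the design the slides have in mind.

```python
import math

def antinotch(x, r=0.95, theta=2 * math.pi / 3):
    """Second-order IIR resonator: y[n] = x[n] + 2*r*cos(theta)*y[n-1] - r^2*y[n-2].
    Its poles sit at r*exp(+/- j*theta), so it emphasizes frequencies near theta."""
    y = []
    y1 = y2 = 0.0  # previous two outputs
    for sample in x:
        out = sample + 2 * r * math.cos(theta) * y1 - r * r * y2
        y.append(out)
        y2, y1 = y1, out
    return y

def filtered_energy(dna):
    """Filter each base indicator sequence and sum the squared outputs."""
    dna = dna.upper()
    energy = 0.0
    for base in "ACGT":
        indicator = [1.0 if ch == base else 0.0 for ch in dna]
        energy += sum(v * v for v in antinotch(indicator))
    return energy

# Toy inputs of equal length: 3-periodic versus 4-periodic.
print(filtered_energy("ATG" * 40))
print(filtered_energy("ACGT" * 30))
```

Because the filter runs sample by sample, it avoids computing a Fourier transform for every window position. A pole radius closer to 1 gives a sharper peak but a longer transient.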

Magnitude response of lowpass filter

Multiband response

Highpass filter

AGRICULTURE

MODERN MEDICINE

FORENSICS

EVOLUTIONARY STUDIES.

Challenges and Future Work

• Genomic signal processing opens a new signal processing frontier

• Sequence analysis: a DNA sequence is a symbolic (categorical) signal, so classical signal processing methods are not directly applicable

• Increasingly high dimensionality of genetic data sets and the complexity involved call for fast and high throughput implementations of genomic signal processing algorithms

• Future work: spectral analysis of DNA sequence and data clustering of microarray data. Modify classical signal processing methods, and develop new ones.