An algorithm for detecting TPP riboswitches in archaea
Department of Biology, Department of Math and Computer Science, Denison University, Granville, OH 43023
Motif sequence | Level of conservation | Location in sequence
GGGG | High | Towards 5’ end of the [...]
UGAGA | Perfect conservation in all riboswitches | No more than 30 bases from [...]
CCCU | Fair, some point mutations | Usually same distance away from UGAGA as GGGG is from [...]
AACCUGA | Low, most sequences had point mutations for this motif | Usually at the center of the [...]
AGGGA | Fair, some point mutations observed | Towards 3’ end of the [...]
Riboswitch mediated gene regulation by a) prevention of translation initiation b) prevention of proper splicing and c) premature transcription termination.
Characteristic secondary structure of the TPP riboswitch in the presence and absence of TPP
Secondary structures of the riboswitches predicted in K. cryptofilum (a, b) and C. maquilingensis (c).
Source genome of hit | fRNAdb match? | Nearby proteins
E. coli | Yes | 3’ side: thiamin biosynthesis protein thiC
E. coli | Yes | 3’ side: thiamine-binding periplasmic protein precursor
E. coli | Yes | 3’ side: hydroxyethylthiazole kinase
A. thaliana | Yes | Within an open reading frame. Gene regulation via splicing
T. volcanium | Yes | 3’ side: Major facilitator superfamily permease
T. volcanium | Yes | 3’ side: Major facilitator superfamily permease
T. acidophilum | Yes | 3’ side: Major facilitator superfamily permease
T. acidophilum | Yes | 3’ side: cationic amino acid transporter related protein
K. cryptofilum | None | Hypothetical protein Kcr_0861. Sequence is part of the coding region although it does not code for any conserved protein domain. Sequence is located near 3’ end of the coding region
K. cryptofilum | None | 3’ side: permease for cytosine uracil thiamine allantoin
C. maquilingensis | None | 3’ side: Nucleoside diphosphate kinase
• Miranda-Rios J, Navarro M and Soberon M. 2001. A conserved RNA structure (THI box) is involved in regulation of thiamin biosynthetic gene expression in bacteria. Proc. Natl. Acad. Sci. USA. 98: 9736 – 9741.
• Nudler E and Mironov AS. 2004. The riboswitch control of bacterial metabolism. Trends in Biochemical Sciences. 29(1): 11 – 17.
• Serganov A, Polonskaia A, Phan AT, Breaker RR and Patel DJ. 2006. Structural basis for gene regulation by a thiamin pyrophosphate-sensing riboswitch. Nature. 441: 1167 – 1171.
• Winkler WC and Breaker RR. 2003. Genetic control by metabolite-binding riboswitches. Chembiochem. 4: 1024 – 1032.
Riboswitches are short sequences of non-coding RNA (100-200 nt in length) located in the untranslated regions (UTRs) of genes.
Riboswitches consist of highly specialized aptamer regions which recognize and bind to specific metabolites (Winkler and Breaker, 2003).
The TPP riboswitch binds to thiamin pyrophosphate (TPP).
Upon binding to a metabolite, the riboswitch changes its structural conformation, which results in regulation of gene expression (Nudler & Mironov, 2004).
The TPP Riboswitch
• Has the characteristic structure displayed below (Miranda-Rios et al., 2001; Serganov et al., 2006).
• Has the motif sequence UGAGA, which is conserved 97% of the time.
• Has been detected in all three domains of life, but among archaea only in two species of the order Thermoplasmatales (Miranda-Rios et al., 2001).
Step 1: Identify motifs in TPP riboswitches.
• Obtained sequences of 355 TPP riboswitches from the fRNA database (fRNAdb).
• Performed multiple sequence alignment using ClustalX2.
• Identified six highly conserved motif sequences.
Step 2: Fragment whole genome of target for scanning.
• Fragments are 700 nt long, with a 200 nt overlap between consecutive fragments.
• Each fragment is scanned for the motif sequences.
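The fragmentation in Step 2 can be sketched as a sliding window. This is a minimal illustration, not the poster's code: the function name `fragment_genome` and its parameters are hypothetical, and the genome is assumed to be a plain string. The defaults are the 700 nt fragment size and 200 nt overlap stated above.

```python
def fragment_genome(genome, size=700, overlap=200):
    """Yield (start, fragment) pairs that cover the genome with overlapping windows.

    Assumption: `genome` is a plain string; the last fragment may be shorter
    than `size`.
    """
    if not 0 <= overlap < size:
        raise ValueError("overlap must be non-negative and smaller than size")
    step = size - overlap  # distance between consecutive fragment starts
    start = 0
    while True:
        yield start, genome[start:start + size]
        if start + size >= len(genome):
            break  # this fragment already reaches the end of the genome
        start += step


# Usage: a toy 1700 nt sequence gives fragments starting at 0, 500 and 1000.
toy = "ACGU" * 425
for start, frag in fragment_genome(toy):
    print(start, len(frag))
```

With these values, any stretch of up to 200 nt lies entirely inside at least one fragment, so an element of that length is not split by a fragment boundary.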
Step 3: Modified Smith-Waterman algorithm.
• Find all alignments of the motifs within each fragment that score above a threshold.
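The poster does not say how Smith-Waterman was modified. As a rough sketch of the idea in Step 3, the code below runs a standard Smith-Waterman local alignment and reports every start position whose best alignment reaches a threshold, rather than only the single best alignment. The function name `local_hits`, the scoring values (match +2, mismatch -1, gap -2) and the one-hit-per-start rule are all assumptions made for illustration.

```python
def local_hits(motif, fragment, threshold, match=2, mismatch=-1, gap=-2):
    """Return sorted (start, end, score) hits of motif in fragment (end exclusive).

    Only the best-scoring alignment per start position is kept, and only if
    its score is at least `threshold` (assumed positive).
    """
    rows, cols = len(motif) + 1, len(fragment) + 1
    score = [[0] * cols for _ in range(rows)]
    # origin[i][j]: fragment index where the alignment ending at cell (i, j) starts
    origin = [list(range(cols)) for _ in range(rows)]
    best = {}  # start position -> (score, end)
    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if motif[i - 1] == fragment[j - 1] else mismatch
            candidates = [
                (0, j),                                              # restart (empty alignment)
                (score[i - 1][j - 1] + sub, origin[i - 1][j - 1]),   # match or mismatch
                (score[i - 1][j] + gap, origin[i - 1][j]),           # motif base left unaligned
                (score[i][j - 1] + gap, origin[i][j - 1]),           # fragment base left unaligned
            ]
            score[i][j], origin[i][j] = max(candidates, key=lambda c: c[0])
            s = origin[i][j]
            if score[i][j] >= threshold and score[i][j] > best.get(s, (0, 0))[0]:
                best[s] = (score[i][j], j)
    return sorted((s, end, sc) for s, (sc, end) in best.items())


# Usage: an exact UGAGA match, and a copy with one point mutation.
print(local_hits("UGAGA", "CCCCUGAGACCCC", threshold=10))
print(local_hits("UGAGA", "AAUGCGAAA", threshold=7))
```

Lowering the threshold admits hits with point mutations, which matters for the less conserved motifs.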
Step 4: Infer the best sequence of motifs.
• A dynamic programming algorithm determines the ordered series of individual motif hits in each fragment that gives the best total score.
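One way to picture Step 4 is as a chaining problem: pick at most one hit per motif, in 5’ to 3’ motif order and without overlaps, so that the summed alignment score is maximal. The sketch below is an assumption-laden illustration, not the authors' implementation. The name `best_chain`, the `(start, end, score)` hit format, and the rule that motifs may be skipped are hypothetical, and distance constraints between motifs are not modeled.

```python
def best_chain(hits_per_motif):
    """Pick the best-scoring ordered chain of motif hits.

    hits_per_motif[k] is a list of (start, end, score) hits for the k-th motif
    in 5'-to-3' order (end exclusive, scores assumed positive).
    Returns (total_score, chain), where chain lists (motif_index, start, end, score).
    """
    # Flatten the hits; nodes are grouped by increasing motif index.
    nodes = [(k, s, e, sc)
             for k, hits in enumerate(hits_per_motif)
             for (s, e, sc) in hits]
    if not nodes:
        return 0, []
    best = [0] * len(nodes)     # best total score of a chain ending at each node
    prev = [None] * len(nodes)  # predecessor node in that chain
    for a, (k, s, e, sc) in enumerate(nodes):
        best[a] = sc
        for b in range(a):
            kb, _, eb, _ = nodes[b]
            # Predecessor must be an earlier motif that ends before this hit starts.
            if kb < k and eb <= s and best[b] + sc > best[a]:
                best[a] = best[b] + sc
                prev[a] = b
    a = max(range(len(nodes)), key=best.__getitem__)
    total, chain = best[a], []
    while a is not None:        # walk the predecessors back to the first hit
        chain.append(nodes[a])
        a = prev[a]
    return total, chain[::-1]


# Usage with made-up hits for three motifs:
hits = [[(10, 14, 8), (300, 304, 8)], [(40, 45, 10)], [(20, 24, 6), (80, 84, 8)]]
print(best_chain(hits))
```

The quadratic scan over predecessor hits is adequate here because a 700 nt fragment yields only a handful of hits per motif.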
Step 5: Predict secondary structure and function.
• Candidate sequences were folded using the RNAfold server and compared with the characteristic structure.
• Putative riboswitches have a strong resemblance to the characteristic structure.
• Nearby genes were determined using NCBI BLAST and the UCSC Genome Browser.
Possible new methods of gene regulation in K. cryptofilum:
• One of the two predictions in the K. cryptofilum genome was found to be located within an ORF.
• No information is available about whether the ORF is actually a gene.
• No information is available about possible introns in the coding region.
• A novel method of gene regulation, such as ribosome shunting, may be employed here.
Generalizing the algorithm:
• Success of the devised algorithm suggests that it is possible to apply it to other kinds of riboswitches.
• The algorithm could be primed in a similar manner with the motif sequences and characteristic structures of other riboswitches.
• Further improvements to the algorithm include a more efficient means of comparing secondary structures than the one employed here, and automated detection of motif sequences in known riboswitch sequences.
Results
Putative TPP riboswitches predicted by the algorithm
TPP riboswitch conformation with and without TPP
Testing on known riboswitches:
• Tested on genomes known to possess at least one TPP riboswitch.
• Detected all known TPP riboswitches in each genome.
• Secondary structures of predicted riboswitches were similar to the characteristic structure.
Scanning other archaea:
• Executed the algorithm on the genomes of 12 archaeal species outside the order Thermoplasmatales.
• Three putative riboswitches were detected in the genomes of Caldivirga maquilingensis and Korarchaeum cryptofilum.
Chinmoy I.S. Bhatiya, Jessen T. Havill, Jeffrey S. Thompson