rnai as a bioinformatics consumer

17
Ravi Sachidanandam works in Cold Spring Harbor Laboratory and is a physicist who is currently interested in computational biology. He is working on several computational aspects in RNAi and its application as a tool for biological research. He is also working on various aspects of splicing, as well as modelling and simulating systems involved in apoptosis, splicing, RNAi and live-cell imaging. Keywords: RNAi, miRNA, PTGS, TGS, gene silencing, chromatin Ravi Sachidanandam 1 Bungtown Road, Cold Spring Harbor Laboratory, NY 11724, USA Tel: þ1 516 367 8864 Fax: þ1 516 367 8389 E-mail: [email protected] RNAi as a bioinformatics consumer Ravi Sachidanandam Date received (in revised form): 17th March 2005 Abstract RNAi has shown great potential for use as a tool for biological discovery, analysis and therapeutics. The involvement of the RNAi pathway in post-transcription silencing, transcriptional silencing and epigenetic silencing as well as its use as a tool for forward genetics and therapeutics throws up several bioinformatics challenges. This paper delineates several areas of research and reviews work that has already been done, the tools that are available and the challenges that lie ahead. INTRODUCTION RNAi, or RNA interference, is the disruption of a gene’s expression by a double stranded RNA (dsRNA) in which one strand is complementary (either perfectly or imperfectly) to a section of the gene’s mRNA. 1 A dsRNA can enter the cytoplasm, through the expression of a hairpin (or inverted repeats), through viral gene expression or through artificial constructs that enter the cell through the cell membrane. The disruption can take the form of mRNA degradation, translational repression or transcriptional repression through epigenetic modifications. 2–5 The pathway can also lead to message modification by cleavage of mRNAs. The RNAi pathway is conserved across eukaryotic species, from yeasts to humans. Interestingly, in yeasts, Schizosaccharomyces pombe has portions of the RNAi pathway while Saccharomyces cerevisiae seems to have lost most of it. 6 Thus, it is a universal mechanism in biological systems and is amenable to use as a tool for forward genetics, where the expression of genes can be regulated by external triggers and phenotypic consequences can be observed. 7 The RNAi pathway and its various roles, depicted in Figure 1, are post- transcriptional gene silencing (PTGS) and transcriptional gene silencing (TGS). PTGS is the degradation of transcribed and processed mRNA in the cytoplasm. It also includes translational silencing. Degradation of mRNA. Short interfering RNAs (siRNAs) are 22 nt double-stranded RNA with 2 nucleotide 39 overhangs on both strands and a phosphate on the 59 end. 9 siRNAs in the cytoplasm get unwound and a single strand gets incorporated into the RNA-induced silencing complex (RISC). 10 The RISC along with the siRNA is guided to the complementary mRNA, by an unknown process. The mRNA can get silenced through degradation by Argonaute 2 slicing 11,12 when the siRNA strand shows perfect complementarity. Inhibition of translation. In plants and animals, short (22 nt) non-coding RNAs called microRNAs (miRNAs) regulate developmental pathways by the regulation of protein expression. 13,14 miRNAs usually block translation. Certain viruses have also been shown to use miRNAs to control gene expression, using the hosts’ machinery. 15 Cleavage of RNA. mRNAs get cleaved by miRNAs that have near- perfect matches to them. 16,17 146 & HENRY STEWART PUBLICATIONS 1467-5463. BRIEFINGS IN BIOINFORMATICS. VOL 6. NO 2. 146–162. JUNE 2005 by guest on May 14, 2011 bib.oxfordjournals.org Downloaded from

Upload: independent

Post on 28-Nov-2023

0 views

Category:

Documents


0 download

TRANSCRIPT

Ravi Sachidanandam

works in Cold Spring Harbor

Laboratory and is a physicist

who is currently interested in

computational biology. He is

working on several

computational aspects in RNAi

and its application as a tool for

biological research. He is also

working on various aspects of

splicing, as well as modelling

and simulating systems involved

in apoptosis, splicing, RNAi and

live-cell imaging.

Keywords: RNAi, miRNA,PTGS, TGS, gene silencing,chromatin

Ravi Sachidanandam

1 Bungtown Road,

Cold Spring Harbor Laboratory,

NY 11724, USA

Tel: þ1 516 367 8864

Fax: þ1 516 367 8389

E-mail: [email protected]

RNAi as a bioinformaticsconsumerRavi SachidanandamDate received (in revised form): 17th March 2005

AbstractRNAi has shown great potential for use as a tool for biological discovery, analysis and

therapeutics. The involvement of the RNAi pathway in post-transcription silencing,

transcriptional silencing and epigenetic silencing as well as its use as a tool for forward genetics

and therapeutics throws up several bioinformatics challenges. This paper delineates several

areas of research and reviews work that has already been done, the tools that are available and

the challenges that lie ahead.

INTRODUCTIONRNAi, or RNA interference, is the

disruption of a gene’s expression by a

double stranded RNA (dsRNA) in which

one strand is complementary (either

perfectly or imperfectly) to a section of

the gene’s mRNA.1 A dsRNA can enter

the cytoplasm, through the expression of

a hairpin (or inverted repeats), through

viral gene expression or through artificial

constructs that enter the cell through the

cell membrane.

The disruption can take the form of

mRNA degradation, translational

repression or transcriptional repression

through epigenetic modifications.2–5 The

pathway can also lead to message

modification by cleavage of mRNAs.

The RNAi pathway is conserved across

eukaryotic species, from yeasts to humans.

Interestingly, in yeasts, Schizosaccharomyces

pombe has portions of the RNAi pathway

while Saccharomyces cerevisiae seems to have

lost most of it.6 Thus, it is a universal

mechanism in biological systems and is

amenable to use as a tool for forward

genetics, where the expression of genes

can be regulated by external triggers and

phenotypic consequences can be

observed.7

The RNAi pathway and its various

roles, depicted in Figure 1, are post-

transcriptional gene silencing (PTGS) and

transcriptional gene silencing (TGS).

PTGS is the degradation of transcribed

and processed mRNA in the cytoplasm. It

also includes translational silencing.

• Degradation of mRNA. Short

interfering RNAs (siRNAs) are 22 nt

double-stranded RNA with 2

nucleotide 39 overhangs on both

strands and a phosphate on the 59

end.9 siRNAs in the cytoplasm get

unwound and a single strand gets

incorporated into the RNA-induced

silencing complex (RISC).10 The

RISC along with the siRNA is guided

to the complementary mRNA, by an

unknown process. The mRNA can

get silenced through degradation by

Argonaute 2 slicing11,12 when the

siRNA strand shows perfect

complementarity.

• Inhibition of translation. In plants and

animals, short (22 nt) non-coding

RNAs called microRNAs (miRNAs)

regulate developmental pathways by

the regulation of protein

expression.13,14 miRNAs usually block

translation. Certain viruses have also

been shown to use miRNAs to

control gene expression, using the

hosts’ machinery.15

• Cleavage of RNA. mRNAs get

cleaved by miRNAs that have near-

perfect matches to them.16,17

1 4 6 & HENRY STEWART PUBLICATIONS 1467-5463. B R I E F I N G S I N B I O I N F O R M A T I C S . VOL 6. NO 2. 146–162. JUNE 2005

by guest on May 14, 2011

bib.oxfordjournals.orgD

ownloaded from

TGS is the silencing of gene expression

in the nucleus through changes in

chromatin structure directed in a

sequence-specific manner by dsRNA.

This is a heritable effect and probably

shares some of the pathway with the

PTGS mechanism. Two forms of TGS

are heterochromatin formation and

methylation of promoters:

• Heterochromatin formation.5 DNA in

the nucleus is wrapped around histone

proteins in a periodic arrangement,

which is then organised as a

condensed structure called

chromatin.18 This helps in packaging

the DNA into a small volume and

allows for control of expression

through the modification of the

histones via methylation, acetylation,

ubiquitination or phosphorylation of

specific residues.19 Heterochromatin

(as opposed to euchromatin) is

condensed and transcriptionally silent

chromatin.20 One pathway for its

formation involves the methylation of

the residue lysine 9 of the histone H3

(H3K9). In S. pombe it has been

shown that H3K9 methylation is

initiated by the association of a strand

of siRNA in the nucleus with the

RNA-induced transcriptional

dsRNA directedtranscriptional genesilencing (TGS)

Figure 1: RNAi pathway and its role in post-transcriptional and transcriptional gene silencing.Hairpin RNAs, from either microRNA (miRNA) genes or short hairpin RNAs (shRNAs)constructs, are transcribed and then processed in the nucleus by the Rnase III enzyme Drosha.These stem loop products contain a two-nucleotide 39 overhang and a 59 phosphate at thebase of the stem and are exported to the cytoplasm by exportin 5. In the cytoplasm, theDrosha product is further processed by Dicer into a 22-nucleotide RNA duplex with two-nucleotide 39 overhangs and 59 phosphates on both ends. One strand of these 22 nucleotideproducts (siRNAs or miRNAs) is incorporated into the RISC, composed of proteins includingArgonaute (Ago), FXRP, VIG, Tudor SN and Gemin 3 and 4. RISC, with a single 22-nucleotidesiRNA strand, can slice/cleave or block translation of an mRNA complementary to the siRNAstrand. Chemically synthesised siRNAs and shRNAs can also be introduced through the cellmembrane and become incorporated in the appropriate processing step. In transcriptionalgene silencing (TGS), double-stranded RNA transcribed from inverted repeats in repetitiveDNA induces heterochromatin formation and epigenetic silencing through H3K9 methylation.In S. pombe, siRNAs are components of the RNA-induced transcriptional silencing (RITS)complex, which includes Chp1, Tas3 and Ago. The RITS complex is involved in TGS. The RNAipathway can also induce promoter silencing through DNA methylation (a process not fullyunderstood). miRNAs are also implicated in programmed DNA elimination in organisms suchas Tetrahymena8

& HENRY STEWART PUBLICATIONS 1467-5463. B R I E F I N G S I N B I O I N F O R M A T I CS . VOL 6. NO 2. 146–162. JUNE 2005 1 4 7

RNAi as a bioinformatics consumer

by guest on May 14, 2011

bib.oxfordjournals.orgD

ownloaded from

siRNA and shRNA cansilence genes

silencing (RITS) complex, consisting

of Chp1, Ago1 and Tas3.21 This

complex targets homologous regions

on the chromatin for H3K9

methylation. Other histone residues

can also be acetylated, methylated or

ubiquitinated and the pattern of such

modifications probably signifies the

heterochromatic or euchromatic state

of the chromatin.22

• Methylation of promoters. The

cytosine (C) of CpG (C followed by

G) motifs on the sequence can get

methylated by DNA methyl

transferase (DMTase) enzymes.23 This

can silence the underlying region of

the genome. RNAi can initiate the

methylation of gene promoters,

thereby blocking transcription.24,25

The exact mechanism of this process is

not understood.

The introduction of large dsRNA into

mammalian cells results in a general

response (interferon or protein kinase

PKR response) that leads to cell death.26

It was discovered that using shorter

dsRNA (,29 nt) can bypass this

response.27 Short double-stranded RNA

with 2 nt 39 overhangs and a 59 phosphate

group, called siRNAs, that mimic the

product of Dicer activity, can get

incorporated directly into the RISC and

result in silencing activity.28 This is a

popular method of silencing genes in cells.

Another method of inducing RNAi is

to insert hairpin constructs into the

genome using vectors which can get

expressed stably (Figure 2).29 The

expressed hairpins are processed by

Drosha and exported to the cytoplasm,

where Dicer acts on them to create

siRNAs, which then get incorporated

into the RISC. These constructs are

called short hairpin RNAs or shRNAs.29

shRNAs can also be chemically

synthesised and introduced into the

cytoplasm,30,31 in which case, it is

important to mimic the product of

Drosha, which has a 2 nt 39 overhang.

RNAi is also being proposed for

therapeutic uses,32 for example, by

targeting viral genes for silencing.

Applying this to a disease such as HIV or

polio is problematic since the viruses

mutate very rapidly and escape message

degradation by RNAi.33,34 There are trials

using siRNAs in the human eye against

vascular endothelial growth factor

� ������������ ��

��������� � �������

������������� ��

� �� ����

��� ��������

�� ���������

��� ���

����� � ������������ ����

Figure 2: A shRNA hairpin construct. Additional elements in the construct are used forengineering purposes. The U6 promoter ensures that the gene is expressed and theterminator ensures that only the hairpin gets expressed, that is, there is no transcriptional runthrough. The barcode at the end is a random 60 mer that is unique to each hairpin allowingidentification of the hairpin, either via microarrays or via the use of polymerase chain reaction(PCR). The oligo is synthesised as a DNA, and then cloned into a plasmid using PCR, grownand then inserted into a retroviral vector. For transient effects, the plasmid can be transfectedinto the cell with the effects seen about 48 hours after transfection. For permanent insertion,the vector is used to infect the cell. There are two possible orientations of the hairpin senseloop antisense (SAS) or antisense loop sense (ASS). If a 29 mer is used for the hairpin as shownin the figure, then the first 22 mer from the end is believed to be used for creating the siRNA.siRNA design considerations are used to design 19 mers that are then extended nine basepairs at the 59 end and one at the 39 end to obtain the 29 mer. Variations on this constructionare possible, depending on the biological needs

1 4 8 & HENRY STEWART PUBLICATIONS 1467-5463. B R I E F I N G S I N B I O I N F O R M A T I C S . VOL 6. NO 2. 146–162. JUNE 2005

Sachidanandam

by guest on May 14, 2011

bib.oxfordjournals.orgD

ownloaded from

(VEGF) to treat wet age-related macular

degeneration (wet AMD).35

The exact details of the biology

delineated here are bound to change as this

is a rapidly evolving field, but some of the

broad themes and bioinformatics concerns

are bound to be relatively longer lasting.

Listed below are several areas in RNAi

research and the role of bioinformatics in

them. Each section will conclude with a

short list of bioinformatics challenges that

are definitely skewed by the author’s

preferences.

RNAi-RELATED PROTEINCODING GENE/FUNCTIONDISCOVERYThe RNAi pathway has not been fully

elucidated. Many more proteins than are

currently known may be involved in the

RNAi pathway. Experimentally, many of

the genes have been discovered using

mutational studies in Caenorhabditis

elegans, as well as screening all genes of the

genome using RNAi against them.36

Comparative genomics has helped with

RNAi research in several ways. The

discovery that Argonaute exists in S.

pombe but not in S. cerevisiae6 led to the

discovery of RNAi-controlled

heterochromatin silencing through

histone H3 lysine 9 methylation.37 The

prediction of an Argonaute homologue in

the archae Pyrococcus furiosus allowed the

crystallisation of the protein. This, in turn,

led to the determination of its crystal

structure, as well as the identification of

the function of the PIWI domain.10,11

A computational approach to

predicting novel RNAi genes involves

looking for domains that are peculiar to

proteins involved in RNAi. For example,

Argonaute has the PAZ and PIWI

domains,38 and Dicer has a DexH box

helicase domain,39 two RNaseIII

domains,40 a PAZ domain and a dsRBD

(double-stranded RNA binding)

domain.41 Thus proteins with PAZ and

PIWI domains are worthy of

experimental investigation. This kind of

analysis usually involves searching for

motifs using hidden Markov models

(HMMs)42 and phylogenetic analysis.43

Phylogenetic analysis has also helped with

understanding the functions of various

homologues of Argonaute.12,38 The study

of protein families involved in RNAi can

help reveal new members of these families

and uncover new mechanisms of the

pathway.

Bioinformatics challenges• Identification of all the counterparts to

RNAi proteins in each of the model

organisms.

• Understanding the evolution of the

pathway through phylogenetic

analysis, taking into account structural

elements of the RNAi proteins (such

as the differences in catalytic activity

between various forms of Argonaute).

The double-stranded RNA binding

domain is used in several other

pathways: how did it evolve to be

used in the RNAi pathway?41

• Identifying critical components of the

RNAi pathway through the analysis of

the topology of the pathway. There

are several experiments where RNAi

is used to silence parts of the pathways.

This process needs to be understood

and its temporal behaviour explained

(or predicted, since there are no data

out there).

miRNA PATHWAYSmiRNA pathways are developmentally

important in almost all eukaryotic

organisms,44 but may also have a major

role to play in the nervous systems of

mammals.45,46 miRNA genes are small

non-coding genes (,150 bp) whose

transcripts form hairpins called primary

miRNA (pri-miRNA). The pri-miRNAs

get processed by an RNaseIII enzyme

called Drosha in the nucleus into

precursor miRNAs (pre-miRNAs),

which have a two-nucleotide 39

overhang. The pre-miRNAs get exported

into the cytoplasm, where they are

processed by Dicer and then get

associated with the RISC (Figure 1).47

RNAi proteins

& HENRY STEWART PUBLICATIONS 1467-5463. B R I E F I N G S I N B I O I N F O R M A T I CS . VOL 6. NO 2. 146–162. JUNE 2005 1 4 9

RNAi as a bioinformatics consumer

by guest on May 14, 2011

bib.oxfordjournals.orgD

ownloaded from

miRNA genes

This RISC-associated single-strand

miRNA blocks translation of the

complementary mRNA. The level of

match to target need not be very high, as

shown in Figure 3. miRNAs that have

near-perfect matches to an mRNA can

also direct cleaving (or message

destruction) of the target.16,17

MicroRNAs control the timing of gene

expression, especially during

development.44,48

Occurrence, expression andprediction of miRNA genesMany miRNA genes have been

experimentally identified in a variety of

organisms and can be accessed at the

miRNA registry website.49,50 The registry

was constructed by sequencing small

RNAs (� 22 nt) and mapping them to

the genome, and delineating the hairpins

that contain them.

miRNAs can occur in intergenic

regions, in introns of protein coding

regions or in exons and introns of non-

coding genes.51 The intronic miRNAs

probably rely on Pol II mediated pre-

mRNA transcription and splicing52 for

expression, while the intergenic ones are

believed to be transcribed independently

by Pol II polymerases.53

miRNAs can also occur in clusters

Figure 3: The first three hairpin structures are let 7 miRNA hairpins from Caenorhabditiselegans, Drosophila melanogaster and Homo sapiens. The shaded portions are the maturesequences that result from processing of the miRNA gene. The last two alignments are formature let 7 against two targets in the 39 UTR of lin 41 in C. elegans44

1 5 0 & HENRY STEWART PUBLICATIONS 1467-5463. B R I E F I N G S I N B I O I N F O R M A T I C S . VOL 6. NO 2. 146–162. JUNE 2005

Sachidanandam

by guest on May 14, 2011

bib.oxfordjournals.orgD

ownloaded from

(separated by a few kilobases) that can be

identified by mapping the miRNA genes

to the genome. These clusters can be

transcribed together and are believed to

be miRNAs that control mRNAs from

pathways that are involved in related

functions. Not much is known about

their promoters, though in the case of C.

elegans some conserved motifs have been

identified.54 Identification of such

promoter sequences can help with the

prediction of miRNA genes.

It is almost impossible to identify

miRNA genes by scanning the genome of

any organism, as the number of putative

hairpins far outnumbers the number of

known genes. Additional signals are

necessary. One method of analysis uses

the known biases in targeting strands, but

this is not necessarily easy to use in the

miRNA case, as the level of match

between the miRNA single strand and

the mRNA is usually not very high.

There have been suggestions that miRNA

precursors form more stable hairpins,55

but this may not be sufficient to make

reasonably useful predictions. Another

method that has been used is the

conservation of sequences between

closely related species such as rice

(O. sativa) and Arabidopsis (A. thaliana).56

Discovery of new miRNA genes can

help identify critical features that control

the silencing behaviour of dsRNAs.

MiRscan57 and MiRseeker58 are

programs that use the conservation of

stem loop structures across species to

identify novel miRNA genes.

Target predictionThe binding sites of miRNAs are

presumed to be in the 39 UTRs and

usually occur in multiple copies (Figure

3).44,59 Combinatorial control, with

multiple miRNAs targeting a single gene,

is probably an important feature of

miRNA targeting, very similar to the

mode of transcription factor control of

genes.60–62

One of the first miRNAs to be

identified is let 7, which targets the gene

lin 41 (Figure 3).44 The alignments of let

7 and its target regions on the 39 UTR of

lin 41 show that several bulges as well as

G:U wobble base pairing are allowed.

Thus, the number of potential targets of

any miRNA is quite large, unless some

restrictions can be placed on these

matches.

Some rules for miRNA target matches

have been inferred from a mutational

study,60 which found that the strong

binding of the 59 end (the first eight base

pairs) of the mature miRNA to the 39

UTR sequence was very important and

G:U wobble pairing reduced the silencing

efficiency. In addition, it is thought that

having multiple binding sites for an

miRNA on the 39 UTR increases the

efficiency of silencing.60 This implies

multiple binding sites for different

miRNAs on the same UTR allows for

combinatorial control of the silencing

process. The mechanism of targeting has

not been settled. There have been several

attempts (for example, Rhoades et al.,63

John et al.64) at predicting targets, but the

agreement between the different attempts

at this for the same set of miRNAs is not

as good as one would expect, presumably

because there could be a large number of

targets for each miRNA65 and each group

uses their own criteria to limit results.

Models have been developed for miRNA

target interactions, based on

experiments.59

miRNAs could also target other

miRNAs for silencing.66 As discussed

below, miRNAs show tissue-specific

expression patterns; thus correlation of

mRNA with miRNA expression can help

narrow down targets. It should also be

possible to cluster miRNAs and mRNAs

based on expression patterns. There are a

number of websites67,68 and

papers63,64,69–71 that deal with miRNA

target predictions.

miRNA microarraysMicroarrays that can detect miRNAs have

been constructed to study the expression

of miRNAs. These experiments have

shown that miRNA expression shows

tissue72 and temporal specificity, being

miRNA targetprediction

& HENRY STEWART PUBLICATIONS 1467-5463. B R I E F I N G S I N B I O I N F O R M A T I CS . VOL 6. NO 2. 146–162. JUNE 2005 1 5 1

RNAi as a bioinformatics consumer

by guest on May 14, 2011

bib.oxfordjournals.orgD

ownloaded from

predominantly found in developmental

pathways73 and brain tissue.45,46 Some of

these studies also show that there is a

correlation between mRNA and miRNA

expression in various tissues.73 This is an

important piece of evidence for validating

miRNA target predictions.

miRNA librariesPrecursor miRNAs can be artificially

synthesised and either transfected or stably

expressed in the genome (as explained in

the caption for Figure 2) which can then

be expressed using inducible promoters.

This can allow study of miRNA function.

The design of such libraries is not

difficult, but analysis of results can be

interesting. The analysis of mRNA

networks due to silencing of components

can help elucidate the topology of

pathways.8

Message modification, DNAsuppression, viral control,DNA methylationThese constitute a class of miRNA-

directed activities that have been recently

discovered, or seen only in a few cases so

far. The developmental gene HOXB8 is

cleaved by miRNA mir 196b in mouse.16

This seems to imply that target mRNAs

that have a high degree of

complementarity to miRNAs are likely to

be cleaved by them, and the location of

the target sequence need not be in the

UTRs. Viruses can use miRNAs and the

host’s RNAi machinery to silence

genes.15 There have also been reports of

HIV virus containing miRNAs.74

Genomic rearrangements in

Tetrahymena are believed to be RNAi-

controlled and are probably the result of

miRNA action.75,76 miRNAs have also

been shown to be involved in DNA

methylation.77

Bioinformatics challenges• Identification and prediction of

miRNA promoters.

• Identification of miRNA targets. This

is not a solved problem, many

underlying assumptions in miRNA

target discovery (such as homologies,

UTR location of target sites) can help

identify good targets, but may not

cover all possible sites, some of which

may be unique to certain species and

may not follow these rules.

• Genomes of viruses can be scanned for

possible miRNAs that can target host

genes.

• miRNAs must act in concert on

members of pathways, so it might be

useful to consider function in

analysing targets of miRNA. Thus

study of the topology of genetic

networks and tissue specificity of

genes can help with miRNA target

studies.

• Identification of miRNA-directed

cleavage targets. These might be an

important feature of biological systems

and can easily be probed by

bioinformatics means.

SHORT INTERFERING RNA(siRNA)Short interfering RNAs are short (22 nt)

double-stranded RNAs (dsRNA) with

two nucleotide 39 overhangs on both

strands and a phosphate on each 59 end.9

siRNAs are created by the action of Dicer

on dsRNAs. These can get incorporated

into RISC and silence genes that are

complementary to the RISC incorporated

strand.

Design principles for functionalsiRNAsIt was discovered that the strand whose 59

end is easier to open, or more unstable, is

favoured for inclusion in the RISC.78,79

RISC has a putative helicase activity of

unknown origin which probably unwinds

the dsRNA from the unstable end. This

probably explains the strand bias of RISC.

Several studies, including studies of

miRNA hairpins,80 have revealed biases

of base composition along the active

Function of miRNAs

1 5 2 & HENRY STEWART PUBLICATIONS 1467-5463. B R I E F I N G S I N B I O I N F O R M A T I C S . VOL 6. NO 2. 146–162. JUNE 2005

Sachidanandam

by guest on May 14, 2011

bib.oxfordjournals.orgD

ownloaded from

strand of the siRNA. Using these has

allowed for rational design of siRNAs for

more efficient silencing. This is done

using a weight matrix, very similar to

how splice sites are scored.

The partial explanation for the weight

matrix is that it agrees with observations

on strand usage bias. The compositional

bias at different positions on functional

siRNAs are derived from studies on large

populations of siRNAs. It is not clear

how the underlying physical phenomena

are related to this. This is reminiscent of

the rules for detection of good splice sites

in genomic sequences.

There could be other unknown factors

that affect siRNA performance that have

not been discovered yet. In addition, it

has been observed that siRNAs tend to

show off target effects, that is targeting of

genes for silencing that were not the

intended targets.81,82 These factors can

confound good designs in actual

laboratory use.

siRNAs are synthesised with 2 nt

overhangs (usually TT) on the 39 ends of

each strand and transfected into cells.

siRNAs can also be produced by the

action of Dicer on dsRNA and by the

transfection of siRNA vectors into cells

for constant production.

Construction of ansiRNA librarysiRNAs are now routinely used to silence

genes in vitro. They can be ordered off the

shelf for common genes and a lot of effort

has gone into understanding their action.

Building an siRNA library involves

designing silencing constructs for each

mRNA using the biases in base

composition at each position as discussed

above.

The constructs have to be unique, in

that they cannot target other genes. To

ensure this, a database of non-redundant

mRNAs (excluding splice variants) needs

to be built, so that the siRNA constructs

can be checked for uniqueness. In

addition, designs that do not have GC

content within 30–60 per cent or contain

the sequences AAAA, TTTT, GGGG or

CCCC are eliminated. siRNAs can be

ordered from companies,83,84 which

synthesise them with TT overhangs on

the 39 ends of each strand. Using several

siRNAs against each gene of interest

ensures that you have different off-targets

for each siRNA and there is a chance that

at least one might work. There are several

websites that allow design of siRNAs.85,86

Since siRNAs cause silencing by

binding to targets that are

complementary, specificity is achieved by

low concentrations of siRNAs in the cell.

At higher concentrations there is a greater

likelihood of binding to targets that do

not show exact complementarity and,

hence, off-target effects can occur more

frequently.

Well-characterised siRNAs for several

genes can be found in the literature.

Several companies sell well-characterised

off-the-shelf siRNAs for genes. This takes

the guesswork from experiments.

However, siRNAs are expensive and the

use of high dosage (each cell usually

receives several siRNAs) makes induction

of other effects, such as off-target

silencing, more likely.

Use of siRNA librariesIn C. elegans, several genome-wide

screens have been used to identify

pathways important in fat metabolism87

and in the role of mitochondrion in

longevity.88 In higher organisms, siRNA

screens can be done, but they are

prohibitively expensive. In addition, this

work is most easily done in cell cultures,

though some medical applications of

siRNAs have been proposed in living

organisms.34

Bioinformatics challenges• A better algorithm for siRNA design

is still an open problem. There is room

for new ideas, consideration of factors

hitherto unconsidered, such as

secondary structure.89

• Designing an algorithm to find off-

targets would be a very useful tool in

designing siRNAs for genome-wide

Construction of siRNAlibrary

& HENRY STEWART PUBLICATIONS 1467-5463. B R I E F I N G S I N B I O I N F O R M A T I CS . VOL 6. NO 2. 146–162. JUNE 2005 1 5 3

RNAi as a bioinformatics consumer

by guest on May 14, 2011

bib.oxfordjournals.orgD

ownloaded from

studies. There have been several

attempts, for example Yamada and

Morishita,90 but there is work to be

done here.

• Experiments to mimic effects of drugs

can be analysed using siRNAs. A

perturbative analysis to understand the

effects on genetic networks is

essential.73 This field is just starting, as

the expense of large-scale screens

comes down.

• A resource that would list all the data

on siRNAs that are known to

function, the conditions under which

they function and the design failures,

from the literature as well as anecdotal

information. It should allow easy

access to such data and would

definitely help improve design

algorithms. This would at least save

the labour of wasted effort on siRNAs

that do not work under certain

conditions.

• siRNA designs can also be improved

by using knowledge gained from

miRNA studies and analysis of

miRNA structures.80 A lot more work

needs to be done in this area, which

would allow fine-tuning siRNA

designs.

SHORT HAIRPINRNA (shRNA)Short hairpin RNAs result from the

expression of inverted repeats or

constructs that can fold into short

hairpins. shRNAs or short hairpin RNAs

are artificial constructs that can be inserted

into a genome and expressed

endogenously.29 The expressed hairpins

can then fold to form dsRNA, and

Drosha and Dicer can act on these

hairpins to create mature sequence, used

by the RISC to target the genes. The

hairpin constructs can be synthesised and

inserted into the genome, or transfected

into the cell, as explained in Figure 2.

Design principlesThe basic idea is to use the lessons learnt

from the siRNA designs to ensure that

the desirable strand is picked. For a 29

mer based design, the 22 mer from the 39

end of the shRNA hairpin is believed to

be picked by the action of Dicer on the

shRNA. The siRNA based design is used

to design the 21 mer, extended by 1 base

in the 39 direction and 7 base pairs in the

59 direction to make a 29 mer, which is

then synthesised as a hairpin with a 4 base

pair loop.

Another method of constructing these

is to use the context of a known miRNA.

An miRNA with a target strand of length

22 is picked, and the target sequence is

replaced with the antisense strand from

the design above. The complementary

strand is also replaced, taking care to

preserve the bulges, loops and types of

mismatches. This will probably ensure

that the hairpin gets processed by Drosha,

is properly and efficiently processed by

the Dicer and gets associated properly

with the RISC.

These ideas have been implemented

and are available as a web-based resource

for public use.85 shRNAs are cheaper

than siRNAs for large-scale studies.

shRNAs do not suffer from dosage

problems as they are expressed in small

numbers in each cell in contrast to most

applications of siRNAs where a large

number of siRNAs are transfected into

any given cell. They can be inserted into

germlines, allowing for stable suppression

of gene expression in organisms, and are

easier to manipulate than knockouts.

However, constructing and validating

libraries is labour intensive and time

consuming.

Building a library of shRNAsThe considerations for shRNA libraries

are identical to the ones for siRNA

libraries. There are several libraries that

are now commercially available.91,92

Use of the shRNA librariesThe libraries allow large-scale studies that

are prohibitively expensive with siRNAs.

shRNA construction

1 5 4 & HENRY STEWART PUBLICATIONS 1467-5463. B R I E F I N G S I N B I O I N F O R M A T I C S . VOL 6. NO 2. 146–162. JUNE 2005

Sachidanandam

by guest on May 14, 2011

bib.oxfordjournals.orgD

ownloaded from

We describe below one method of using

an shRNA library.

• Design the study using the annotation

of the genes. For example, to design a

study on the effect of kinases on

cancer cells, find all the shRNAs that

silence kinases and get them from the

library.

• The list of shRNAs will form the pool

that can be obtained from the

laboratories that have constructed the

libraries. The relevant shRNAs are

infected into cells at a dilution that

ensures that, statistically, there is only

one shRNA per cell.

• The cells are grown for a few

generations.

• The shRNAs are extracted from the

cells using restriction enzymes, and

microarrays designed to detect either

the barcodes or the shRNAs are used

to find shRNAs that have been

depleted.

• Find the mRNAs that are targets of

the shRNAs that have been depleted

and design further studies which may

then involve traditional biological

work (knockouts etc).

Bioinformatics challenges• Simulate and model behaviours of

libraries of shRNAs when used in

populations of cells. This can help

understand outcomes of library

screens.

• A central repository of shRNA

constructs. Such a resource can act as a

clearing-house that can track results,

identify patterns in shRNA

performance and allow users to find

constructs from a variety of sources.

There exists one now called the

RNAi Codex,93 but more work needs

to be done.

• Design microarray probes for more

efficient study of shRNA expression

in cell populations. Since the shRNA

constructs can either be probed with

the hairpin, or with barcodes that

maybe inserted into the shRNAs,

there is not much leeway in terms of

sequence length and freedom for

design, so designing multiple probes

per construct that will allow

distinguishing different shRNAs is

crucial.

• Perturbative analysis of genetic

networks. There is a need to develop

techniques that can analyse results of

silencing of genes either on a one by

one basis or in groups. Such

techniques have been developed to

study the action of drugs to infer

genetic networks,8 but a lot of work

remains to be done.

EPIGENETICSEpigenetics is the study of heritable

modifications of the genome that leave

the DNA sequence unchanged,94 but can

have phenotypic consequences. A

consequence of this is the phenomenon of

imprinting, which is the dependence of

the expression of an allele of a gene on the

parent from whom it was inherited.95,96

Epigenetic modifications can silence

regions of the genome, and can be of two

types:

• Histone modifications. Chromatin

consists of DNA wrapped around

histones, which have long tails.

Residues in these tails can be modified

by attachment of various groups, such

as acetyl groups (acetylation), methyl

groups (methylation), ubiquitin groups

(ubiquitination) and phosphate groups

(phosphorylation).22 DNA is

negatively charged, while histones are

positively charged. Addition of

positive acetyl groups can prevent

heterochromatin formation while

adding a methyl group to lysine 9 on

the histone H3 tail (H3K9) reduces

Bioinformatics shRNAlibrary

& HENRY STEWART PUBLICATIONS 1467-5463. B R I E F I N G S I N B I O I N F O R M A T I CS . VOL 6. NO 2. 146–162. JUNE 2005 1 5 5

RNAi as a bioinformatics consumer

by guest on May 14, 2011

bib.oxfordjournals.orgD

ownloaded from

the positive charge which can then

lead to compaction of DNA and the

formation of heterochromatin.20 Thus

a ‘histone code’, involving different

combinations of histone

modifications, has been postulated to

regulate the state of the chromatin.19

• DNA methylation. DNA methyl

transferases (DMTases) attach methyl

groups to C residues that are followed

by Gs (CpG dinucleotides).23 DNA

methylation occurs in plants, mammals

and fungi but is absent or nearly absent

in Drosophila, S. cerevisiae, S. pombe and

C. elegans.97 There is a connection

between DNA methylation and

histone modification as Methyl CpG

binding protein (meCP2) binds to

methylated regions, which can then

recruit histone de-acetylase complexes

(HDAC) which can remove acetyl

groups from histone tails and hence

change the state of the chromatin.98 In

Arabidopsis, miRNAs have been shown

to be involved in DNA methylation.77

The RNAi pathway plays a critical role

in the regulation of epigenetic silencing

by establishing heritable sequence specific

patterns of DNA methylation and histone

modifications.5 We have already discussed

possible mechanisms in the introduction.

These modifications are replicated on

newly synthesised DNA; thereby they

become heritable.99 The epigenome is the

combination of the genome along with its

epigenetic state. The big challenge is to

find the connection, if there is one,

between the sequence and the underlying

epigenetic state.

Histone modifications can be identified

using chromatin immunoprecipitation

(ChIP).100 In the ChIP assay DNA is

linked to its associated proteins using

formaldehyde and then sheared into small

fragments. The DNA protein complex

can then be probed for particular protein

modifications using antibodies

(immunoprecipitation). The DNA

associated with these proteins can be

delinked and hybridised against

microarrays (ChIP on chip), or analysed

using the polymerase chain reaction

(PCR).5,101

DNA methylation can be identified

using a methylation-sensitive restriction

enzyme (mcrBC) that can cut methylated

DNA but leaves unmethylated DNA

untouched,102 or else by bisulphite

treatment103 that will convert cytosine to

uracil and leave methylated Cs

untouched. Both techniques in

combination with PCR and microarray-

based hybridisation can help detect

methylation states. Comparison of

genomic regions between wild type cells

and cells with DMTases knocked out can

also be used to identify regions that are

methylated.104

In mammals, the occurrence of the

CpG dinucleotide is lower than expected,

probably because of the conversion of

methylated Cs into Ts, but there are

regions that show enrichment of CpGs

called CpG islands.23 CpG islands are

usually not methylated. About 60 per cent

of promoters in the mouse and human

genome contain CpG islands, signifying

their euchromatic state.105 CpG islands

have been cloned and sequenced106 and

these have been used to recognise CpG

islands computationally (using HMMs or

other pattern recognition techniques).107

In maize and other plants, the genome

has a high repeat content, which is usually

methylated. Thus, using an Escherichia coli

host strain that harbours a mcrBC system,

that cuts methylated DNA, methylated

DNA can be eliminated and reads

enriched in genes, called methyl-filtered

reads, can be obtained.102 The reads can be

computationally analysed to predict genes.

Bioinformatics challenges• Evolution of DNA methylation. Since

DNA methylation is not present in

yeasts, but is present in other

organisms and there seems to be a

connection between methylation and

histone modification in other

organisms, it will be useful to

understand the evolution of the

proteins and the pathway.97,108–111

RNAi in epigenetics

1 5 6 & HENRY STEWART PUBLICATIONS 1467-5463. B R I E F I N G S I N B I O I N F O R M A T I C S . VOL 6. NO 2. 146–162. JUNE 2005

Sachidanandam

by guest on May 14, 2011

bib.oxfordjournals.orgD

ownloaded from

• Signals in genomic sequence for

methylation states. CpG islands are

an example of such signals, but there

might be other signals. It is known

that there are correlations between the

nature of the sequence and the

functional state of chromatin. For

example, tandem repeats and

transposons are silenced.

• Insulators. These are sequence

elements that isolate or insulate a

region of the chromatin from a

neighbouring region and perhaps

could block the spread of methylation.

For example, analysis of CpG islands

may yield binding motifs for factors

that block DNA methylation. The

zinc finger-binding protein CTCF

(CCCTC binding factor) regulates the

genomic imprinting of the H19 and

Igf2 genes.23,112 CTCF binds the

differentially methylated domain

(DMD) on the maternally inherited

chromosome preventing its

methylation and subsequently the

expression of Igf2.

• Better identification of CpG

islands.105 This problem was

suggested by Prof. Denise Barlow,

who has observed that certain

experimentally verified CpG islands in

mouse are not annotated in the mouse

genome viewer on Ensembl.113 There

have been some efforts directed

towards genome-wide CpG island

annotations, such as on the human chr

21 and chr22.107,114 Current

computational techniques based on

HMMs or average sequence statistics

cannot identify short CpG islands107

and there might be a need for species-

specific CpG island predictors, which

in turn might point to differences in

the mechanisms by which such islands

are generated.

• Dynamics of epigenetic processes

and modelling. Heterochromatic

DNA is compact and silent but not

static. Transcription from

heterochromatic DNA does occur and

this transcription is important for

maintaining the silenced state.37 The

dynamics of this process is not well

known and resolving this is one of the

big challenges. There are experimental

challenges, but bioinformatics

challenges involve modelling the

dynamics and making predictions that

can direct biology. There is also histone

exchange going on, and it is important

to observe these processes in vivo and

measure rate constants in a quantitative

manner. There are systems available to

study these processes and these promise

to allow real-time visualisation, which

will then allow modelling.115

Modelling these systems would also

allow ranking these processes in terms

of relative importance in any pathway.

• Data collection and curation.

Several epigenomic projects are

underway;116–118 in addition an

exploration of the role of epigenetic

states, such as DNA methylation, in

cancer119 is being carried out.

Collecting these data in a central

repository and allowing intuitive

browsing of the data is an interesting

problem to solve. Having multiple

centres with diverse methods of

analysis and display would be the key

to solving this problem. Data on genes

and regions that are maternally or

paternally imprinted would also need

to be collected. This will allow

discerning patterns in a location-

specific, or gene/tissue-specific,

manner. Patterns of this nature will

not be obvious at first glance, but will

become clear when looked at in a

genome-wide manner. It is quite

possible that there are no universal

rules when the epigenome is

concerned, but different rules in

different regions of the genome.

Acknowledgments

Susan Janicki discussed the manuscript critically,

edited it and pointed out a number of references.

Bioinformaticschallenges inepigenetics

& HENRY STEWART PUBLICATIONS 1467-5463. B R I E F I N G S I N B I O I N F O R M A T I CS . VOL 6. NO 2. 146–162. JUNE 2005 1 5 7

RNAi as a bioinformatics consumer

by guest on May 14, 2011

bib.oxfordjournals.orgD

ownloaded from

Leemor Joshua Tor and Jidong Liu kindly shared

figures from their papers and talks. Mike Ronemus

and Matt Vaughn gave critical comments and freely

shared their knowledge of the field. Dhruv Pant

helped with Figure 3. The anonymous reviewers

helped with several comments, corrections and

suggestions.

References

1. Meister, G. and Tuschl, T. (2004),‘Mechanisms of gene silencing by doublestranded RNA’, Nature, Vol. 431(7006),pp. 343–349.

2. Bartel, D. P. (2004), ‘MicroRNAs:Genomics, biogenesis, mechanism, andfunction’, Cell, Vol. 116(2), pp. 281–297.

3. Reinhart, B. J., Slack, F. J., Basson, M. et al.(2000), ‘The 21 nucleotide let 7 RNAregulates developmental timing inCaenorhabditis elegans’, Nature, Vol.403(6772), pp. 901–906.

4. Scott, R. J. and Spielman, M. (2004),‘Epigenetics: Imprinting in plants andmammals – the same but different?’, Curr.Biol., Vol. 14(5), pp. 201–203.

5. Lippman, Z. and Martienssen, R. (2004),‘The role of RNA interference inheterochromatic silencing’, Nature, Vol.431(7006), pp. 364–370.

6. Aravind, L., Watanabe, H., Lipman, D. J. andKoonin, E. V. (2000), ‘Lineage specific lossand divergence of functionally linked genesin eukaryotes’, Proc. Natl Acad. Sci. USA, Vol.97(21), pp. 11319–11324.

7. Silva, J., Chang, K., Hannon, G. J. and Rivas,F. V. (2004), ‘RNA interference basedfunctional genomics in mammalian cells:Reverse genetics coming of age’, Oncogene,Vol. 23(51), pp. 8401–8409.

8. Mochizuki, K. and Gorovsky, M. A. (2005),‘A Dicer-like protein in Tetrahymena hasdistinct functions in genome rearrangement,chromosome segregation, and meioticprophase’, Genes Dev., Vol. 19(1), pp. 77–89.

9. Nykanen, A., Haley, B. and Zamore, P. D.(2001), ‘ATP requirements and smallinterfering RNA structure in the RNAinterference pathway’, Cell, Vol. 107(3),pp. 309–321.

10. Sontheimer, E. J. (2005), ‘Assembly andfunction of RNA silencing complexes’, Nat.Rev. Mol. Cell Biol., Vol. 6(2), pp. 127–138.

11. Song, J. J., Smith, S. K., Hannon, G. J. andJoshua-Tor, L. (2004), ‘Crystal structure ofArgonaute and its implications for RISCslicer activity’, Science, Vol. 305(5689), pp.1434–1437.

12. Liu, J., Carmell, M. A., Rivas, F. V. et al.(2004), ‘Argonaute2 is the catalytic engine of

mammalian RNAi’, Science, Vol. 305(5689),pp. 1437–1441.

13. Ambros, V. (2004), ‘The functions of animalmicroRNAs’, Nature, Vol. 431(7006), pp.350–355.

14. Dugas, D. V. and Bartel, B. (2004),‘MicroRNA regulation of gene expression inplants’, Curr. Opin. Plant Biol., Vol. 7(5), pp.512–520.

15. Pfeffer, S., Zavolan, M., Grasser, F. A. et al.(2004), ‘Identification of virus encodedmicroRNAs’, Science, Vol. 304(5671), pp.734–736.

16. Yekta, S., Shih, I. H. and Bartel, B. P.(2004), ‘MicroRNA directed cleavage ofHOXB8 mRNA’, Science, Vol. 304(5670),pp. 594–596.

17. Llave, C., Xie, Z., Kasschau, K. D. andCarrington, J. C. (2002), ‘Cleavage ofscarecrow-like mRNA targets directed by aclass of Arabidopsis miRNA’, Science, Vol.297(5589), pp. 2053–2056.

18. Alberts, B., Bray, D., Lewis, J. et al. (2002),‘Molecular Biology of the Cell’, 4th edn,Garland Publishing, New York.

19. Jenuwein, T. and Allis, C. D. (2001),‘Translating the histone code’, Science, Vol.293(5532), pp. 1074–1080.

20. Grewal, S. I. and Rice, J.C. (2004),‘Regulation of heterochromatin by histonemethylation and small RNAs’, Curr. Opin.Cell Biol., Vol. 16(3), pp. 230–238.

21. Verdel, A., Jia, S., Gerber, S. et al. (2004),‘RNAi mediated targeting ofheterochromatin by the RITS complex’,Science, Vol. 303(5658), pp. 672–676.

22. Peterson, C. L. and Laniel, M. A. (2004),‘Histones and histone modifications’, Curr.Biol., Vol. 14(14), pp. R546–551.

23. Bird, A. (2002), ‘DNA methylation patternsand epigenetic memory’, Genes Dev., Vol.16(1), pp. 6–21.

24. Kawasaki, H. and Taira, K, (2004),‘Induction of DNA methylation and genesilencing by short interfering RNAs inhuman cells’, Nature, Vol. 431(7005), pp.211–217.

25. Morris, K. V., Chan, S. W., Jacobsen, S. E.and Looney, D. J. (2004), ‘Small interferingRNA induced transcriptional gene silencingin human cells’, Science, Vol. 305(5688), pp.1289–1292.

26. Gil, J. and Esteban, M. (2000), ‘Induction ofapoptosis by the dsRNA dependent proteinkinase (PKR): Mechanism of action’,Apoptosis, Vol. 5(2), pp. 107–114.

27. Manche, L., Green, S. R., Schmedt, C. andMathews, M. B. (1992), ‘Interactionsbetween double stranded RNA regulators

1 5 8 & HENRY STEWART PUBLICATIONS 1467-5463. B R I E F I N G S I N B I O I N F O R M A T I C S . VOL 6. NO 2. 146–162. JUNE 2005

Sachidanandam

by guest on May 14, 2011

bib.oxfordjournals.orgD

ownloaded from

and the protein kinase DAI’, Mol. Cell Biol.,Vol. 12(11), pp. 5238–5248.

28. Elbashir, S. M., Harborth, J., Lendeckel, W.et al. (2001), ‘Duplexes of 21 nucleotideRNAs mediate RNA interference in culturedmammalian cells’, Nature, Vol. 411(6836), pp.494–498.

29. Paddison, P. J., Caudy, A. A., Bernstein, E.et al. (2002), ‘Short hairpin RNAs (shRNAs)induce sequence specific silencing inmammalian cells’, Genes Dev., Vol. 16(8),pp. 948–958.

30. Siolas, D., Lerner, C., Burchard, J. et al.(2005), ‘Synthetic shRNAs as potent RNAitriggers’, Nat. Biotechnol., Vol. 23(2), pp.227–231.

31. Kim, D. H., Behlke, M. A., Rose, S. D. et al.(2005), ‘Synthetic dsRNA Dicer substratesenhance RNAi potency and efficacy’, Nat.Biotechnol., Vol. 23, pp. 222–226.

32. Hannon, G. J. and Rossi, J. J. (2004),‘Unlocking the potential of the humangenome with RNA interference’, Nature,Vol. 431(7006), pp. 371–378.

33. Lee, N. S. and Rossi, J. J. (2004), ‘Control ofHIV 1 replication by RNA interference’,Virus Res., Vol. 102(1), pp. 53–58.

34. Gitlin, L., Stone, J. K. and Andino, R.(2005), ‘Poliovirus escape from RNAinterference: Short interfering RNA targetrecognition and implications for therapeuticapproaches’, J. Virol., Vol. 79(2), pp. 1027–1035.

35. Acuity Pharmaceuticals, ‘siRNA use forAMD treatment’ (URL: http://www.acuitypharma.com).

36. Tabara, H., Sarkissian, M., Kelly, W. G. et al.(1999), ‘The rde 1 gene, RNA interference,and transposon silencing in C. elegans’, Cell,Vol. 99(2), pp. 123–132.

37. Volpe, T. A., Kidner, C., Hall, I. M. et al.(2002), ‘Regulation of heterochromaticsilencing and histone H3 lysine 9 methylationby RNAi’, Science, Vol. 297(5588), pp.1833–1837.

38. Carmell, M. A., Xuan, Z., Zhang, M. Q. andHannon, G. J. (2002), ‘The Argonautefamily: Tentacles that reach into RNAi,developmental control, stem cellmaintenance, and tumorigenesis’, Genes Dev.,Vol. 16(21), pp. 2733–2742.

39. Silverman, E., Edwalds, G. G. and Lin, R. J.(2003), ‘DExD/H box proteins and theirpartners: Helping RNA helicases unwind’,Gene, Vol. 312, pp. 1–16.

40. Lamontagne, B., Larose, S., Boulanger, J. andElela, S. A. (2001), ‘The RNase III family: Aconserved structure and expanding functionsin eukaryotic dsRNA metabolism’, Curr.Issues Mol. Biol., Vol. 3(4), pp. 71–78.

41. Tian, B., Bevilacqua, P. C., DiegelmanParente, A. and Mathews, M. B. (2004), ‘Thedouble stranded RNA binding motif:Interference and much more’, Nat. Rev. Mol.Cell Biol., Vol. 5(12), pp. 1013–1023.

42. Eddy, S. R., ‘Hmmer: Open source HMMtools’ (URL: http://hmmer.wustl.edu).

43. Felsenstein, J., ‘Phylip website’ (URL:http://evolution.genetics.washington.edu/phylip.html).

44. Pasquinelli, A. E., Reinhart, B. J., Slack, F.et al. (2000), ‘Conservation of the sequenceand temporal expression of let 7heterochronic regulatory RNA’, Nature, Vol.408(6808), pp. 86–89.

45. Sempere, L. F., Freemantle, S., Pitha Rowe,I. et al. (2004), ‘Expression profiling ofmammalian microRNAs uncovers a subset ofbrain expressed microRNAs with possibleroles in murine and human neuronaldifferentiation’, Genome Biol., Vol. 5(3),p. R13.

46. Miska, E. A., Alvarez Saavedra, E.,Townsend, M. et al. (2004), ‘Microarrayanalysis of microRNA expression in thedeveloping mammalian brain’, Genome Biol.,Vol. 5(9), p. 68.

47. Cullen, B. R. (2004), ‘Transcription andprocessing of human microRNA precursors’,Mol. Cell, Vol. 16(6), pp. 861–865.

48. Pasquinelli, A. E. and Ruvkun, G. (2002),‘Control of developmental timing bymicroRNAs and their targets’, Annu. Rev.Cell Dev. Biol., Vol. 18, pp. 495–513.

49. Griffiths Jones, S. (2004), ‘The microRNARegistry’, Nucleic Acids Res., Vol. 32, pp.D109–111.

50. miRNA Registry (URL: http://www.sanger.ac.uk/Software/Rfam/mirna/index.shtml).

51. Rodriguez, A., Griffiths Jones, S., Ashurst,J. L. and Bradley, A. (2004), ‘Identification ofmammalian microRNA host genes andtranscription units’, Genome Res., Vol.14(10A), pp. 1902–1910.

52. Ying, S. Y. and Lin, S. L. (2005), ‘IntronicmicroRNAs’, Biochem. Biophys. Res.Commun., Vol. 326(3), pp. 515–520.

53. Lee, Y., Kim, M., Han, J. et al. (2004),‘MicroRNA genes are transcribed by RNApolymerase II’, EMBO J., Vol. 23(20), pp.4051–4060.

54. Ohler, U., Yekta, S., Lim, L. P. et al. (2004),‘Patterns of flanking sequence conservationand a characteristic upstream motif formicroRNA gene identification’, RNA, Vol.10(9), pp. 1309–1322.

55. Bonnet, E., Wuyts, J., Rouze, P. and dePeer, Y. V. (2004), ‘Evidence thatmicroRNA precursors, unlike other non-

& HENRY STEWART PUBLICATIONS 1467-5463. B R I E F I N G S I N B I O I N F O R M A T I CS . VOL 6. NO 2. 146–162. JUNE 2005 1 5 9

RNAi as a bioinformatics consumer

by guest on May 14, 2011

bib.oxfordjournals.orgD

ownloaded from

coding RNAs, have lower folding freeenergies than random sequences’,Bioinformatics, Vol. 20(17), pp. 2911–2917.

56. Bonnet, E., Wuyts, J., Rouze, P. and dePeer, Y. V. (2004), ‘Detection of 91 potentialconserved plant microRNAs in Arabidopsisthaliana and Oryza sativa identifies importanttarget genes’, Proc. Natl Acad. Sci. USA, Vol.101(31), pp. 11511–11516.

57. Mawson, M., Lim, L. P. and Lau, N. (2003),MiRscan website (URL: http://genesmitedu/mirscan/).

58. Tomancak, P. (2003), miRseeker website(URL: http://toylblgov:9050/cgi bin/miRseekerpl/).

59. Vella, M. C., Reinert, K. and Slack, F. J.(2004), ‘Architecture of a validatedmicroRNA::target interaction’, Chem. Biol.,Vol. 11(12), pp. 1619–1623.

60. Doench, J. G. and Sharp, P. A. (2004),‘Specificity of microRNA target selection intranslational repression’, Genes Dev., Vol.18(5), pp. 504–511.

61. Hobert, O. (2004), ‘Common logic oftranscription factor and microRNA action’,Trends Biochem. Sci., Vol. 29(9), pp. 462–468.

62. Xie, X., Lu, J., Kulbokas, E. J. et al. (2005),‘Systematic discovery of regulatory motifs inhuman promoters and 39 UTRs bycomparison of several mammals’, Nature, Vol.434(7031), pp. 338–345.

63. Rhoades, M. W., Reinhart, B. J., Lim, L. P.et al. (2002), ‘Prediction of plant microRNAtargets’, Cell, Vol. 110(4), pp. 513–520.

64. John, B., Enright, A. J., Aravin, A. et al.(2004), ‘Human microRNA targets’, PLoSBiol., Vol. 2(11), p. e363.

65. Lewis, B. P., Burge, C. B. and Bartel, D. P.(2005), ‘Conserved seed pairing, oftenflanked by adenosines, indicates thatthousands of human genes are microRNAtargets’, Cell, Vol. 120(1), pp. 15–20.

66. Lai, E. C., Wiel, C. and Rubin, G. M.(2004), ‘Complementary miRNA pairssuggest a regulatory role for miRNA:miRNAduplexes’, RNA, Vol. 10(2), pp. 171–175.

67. Rehmsmeier, M., Steffen, P., Hochsmann,M. and Giegerich, R., ‘RNA hybridwebserver’ (URL: http://bibiserv.techfak.uni�bielefeld.de/rnahybrid/).

68. MSKC Center, miRanda website (URL:http://www.microrna.org).

69. Rehmsmeier, M., Steffen, P., Hochsmann,M. and Giegerich, R. (2004), ‘Fast andeffective prediction of microRNA/targetduplexes’, RNA, Vol. 10(10), pp. 1507–1517.

70. Rajewsky, N. and Socci, N. D. (2004),

‘Computational identification of microRNAtargets’, Dev. Biol., Vol. 267(2), pp. 529–535.

71. Kiriakidou, M., Nelson, P. T., Kouranov, A.et al. (2004), ‘A combined computationalexperimental approach predicts humanmicroRNA targets’, Genes Dev., Vol. 18(10),pp. 1165–1178.

72. Babak, T., Zhang, W., Morris, Q. et al.(2004), ‘Probing microRNAs withmicroarrays: Tissue specificity and functionalinference’, RNA, Vol. 10(11), pp. 1813–1819.

73. Mansfield, J. H., Harfe, B. D., Nissen, R.et al. (2004), ‘MicroRNA responsive‘‘sensor’’ transgenes uncover Hox-like andother developmentally regulated patterns ofvertebrate microRNA expression’, Nat.Genet., Vol. 36(10), pp. 1079–1083.

74. Gardner, T. S., di Bernardo, D., Lorenz, D.and Collins, J. J. (2003), ‘Inferring geneticnetworks and identifying compound mode ofaction via expression profiling’, Science, Vol.301(5629), pp. 102–105.

75. Bennasser, Y., Le, S. Y., Yeung, M. L. andJeang, K. T. (2004), ‘HIV 1 encodedcandidate micro RNAs and their cellulartargets’, Retrovirology, Vol. 1(1), p. 43.

76. Liu, Y., Mochizuki, K. and Gorovsky, M. A.(2004), ‘Histone H3 lysine 9 methylation isrequired for DNA elimination in developingmacronuclei in Tetrahymena’, Proc. Natl Acad.Sci. USA, Vol. 101(6), pp. 1679–1684.

77. Bao, N., Lye, K. W. and Barton, M. K.(2004), ‘MicroRNA binding sites inArabidopsis class III HD ZIP mRNAs arerequired for methylation of the templatechromosome’, Dev. Cell, Vol. 7(5), pp.653–662.

78. Khvorova, A., Reynolds, A. and Jayasena, S.D. (2003), ‘Functional siRNAs and miRNAsexhibit strand bias’, Cell, Vol. 115(2), pp.209–216.

79. Schwarz, D. S., Hutvagner, G., Du, T. et al.(2003), ‘Asymmetry in the assembly of theRNAi enzyme complex’, Cell, Vol. 115(2),pp. 199–208.

80. Silva, J. M., Sachidanandam, R. and Hannon,G. J. (2003), ‘Free energy lights the pathtoward more effective RNAi’, Nat. Genet.,Vol. 35(4), pp. 303–305.

81. Jackson, A. L., Bartz, S. R., Schelter, J. et al.(2003), ‘Expression profiling reveals off-targetgene regulation by RNAi’, Nat. Biotechnol.,Vol. 21(6), pp. 635–637.

82. Jackson, A. L. and Linsley, P. S. (2004),‘Noise amidst the silence: Off-target effects ofsiRNAs?’, Trends Genet., Vol. 20(11), pp.521–524.

83. Ambion Company website (URL: http://

1 6 0 & HENRY STEWART PUBLICATIONS 1467-5463. B R I E F I N G S I N B I O I N F O R M A T I C S . VOL 6. NO 2. 146–162. JUNE 2005

Sachidanandam

by guest on May 14, 2011

bib.oxfordjournals.orgD

ownloaded from

www.ambion.com/catalog/supp/silencer_lib.html).

84. Dharmacon Company website (URL: http://www.dharmacon.com).

85. Lee, J., Olson, A. and Sachidanandam, R.,‘A website to design siRNAs’ (URL: http://katahdin.cshl.org:9331/siRNA/RNAi.cgi?type¼siRNA).

86. Dharmacon, siDesign Center (URL: http://design.dharmacon.com/rnadesign/default.aspx?sid¼731615029/).

87. Ashrafi, K., Chang, F. Y., Watts, J. L. et al.(2003), ‘Genome-wide RNAi analysis ofCaenorhabditis elegans fat regulatory genes’,Nature, Vol. 421(6920), pp. 268–272.

88. Lee, S. S., Lee, R. Y., Fraser, A. G. et al.(2003), ‘A systematic RNAi screen identifiesa critical role for mitochondria in C. eleganslongevity’, Nat. Genet., Vol. 33(1), pp.40–48.

89. Yiu, S. M., Wong, P. W., Lam, T. W. et al.(2005), ‘Filtering of ineffective siRNAs andimproved siRNA design tool’, Bioinformatics,Vol. 21(2), pp. 144–151.

90. Yamada, T. and Morishita, S. (2005),‘Accelerated off-target search algorithm forsiRNA’, Bioinformatics, Vol. 21(8), pp. 1316–1324.

91. Paddison, P. J., Silva, J. M., Conklin, D. S.et al. (2004), ‘A resource for largescale RNAibased screens in mammals’, Nature, Vol.428(6981), pp. 427–431.

92. Berns, K., Hijmans, E. M., Mullenders, J.et al. (2004), ‘A large-scale RNAi screen inhuman cells identifies new components of thep53 pathway’, Nature, Vol. 428(6981), pp.431–437.

93. Olson, A., Sheth, N., Lee, J. et al., ‘A websiteto access and manage shRNA resources’(URL: http://codex.cshl.org).

94. Epigenetics (URL: http://en.wikipedia.org/wiki/Epigenetics/).

95. da Rocha, S. T. and Ferguson-Smith, A. C.(2004), ‘Genomic imprinting’, Curr. Biol.,Vol. 14(16), pp. R646–649.

96. Delaval, K. and Feil, R. (2004), ‘Epigeneticregulation of mammalian genomicimprinting’, Curr. Opin. Genet. Dev., Vol.14(2), pp. 188–195.

97. Tweedie, S., Charlton, J., Clark, V. and Bird,A. P. (1997), ‘Methylation of genomes andgenes at the invertebrate–vertebrateboundary’, Mol. Cell Biol., Vol. 17(3), pp.1469–1475.

98. Nan, X., Ng, H. H., Johnson, C. A. et al.(1998), ‘Transcriptional repression by themethyl CpG binding protein MeCP2involves a histone deacetylase complex’,Nature, Vol. 393(6683), pp. 386–389.

99. McNairn, A. J. and Gilbert, D. M. (2003),‘Epigenomic replication: Linking epigeneticsto DNA replication’, Bioessays, Vol. 25(7),pp. 647–656.

100. Das, P. M., Ramachandran, K., vanWert, J.and Singal, R. (2004), ‘Chromatinimmunoprecipitation assay’, Biotechniques,Vol. 37(6), pp. 961–969.

101. Robyr, D., Suka, Y., Xenarios, I. et al.(2002), ‘Microarray deacetylation mapsdetermine genome wide functions for yeasthistone deacetylases’, Cell, Vol. 109(4), pp.437–446.

102. Rabinowicz, P. D. (2003), ‘Constructinggene enriched plant genomic libraries usingmethylation filtration technology’, MethodsMol. Biol., Vol. 236, pp. 21–36.

103. Yang, A. S., Estecio, M. R. H., Doshi, K.et al. (2004), ‘A simple method for estimatingglobal DNA methylation using bisulfite PCRof repetitive DNA elements’, Nucleic AcidsRes., Vol. 32(3), p. e38.

104. Hattori, N., Abe, T., Hattori, N. et al.(2004), ‘Preference of DNAmethyltransferases for CpG islands in mouseembryonic stem cells’, Genome Res., Vol.14(9), pp. 1733–1740.

105. Antequera, F. (2003), ‘Structure, functionand evolution of CpG island promoters’, CellMol. Life Sci., Vol. 60(8), pp. 1647–1658.

106. Cross, S. H., Charlton, J. A., Nan, X. andBird, A. P. (1994), ‘Purification of CpGislands using a methylated DNA bindingcolumn’, Nat. Genet., Vol. 6(3), pp.236–244.

107. Takai, D. and Jones, P. A. (2002),‘Comprehensive analysis of CpG islands inhuman chromosomes 21 and 22’, Proc. NatlAcad. Sci. USA, Vol. 99(6), pp. 3740–3745.

108. Bird, A. P. (1995), ‘Gene number, noisereduction and biological complexity’, TrendsGenet., Vol. 11(3), pp. 94–100.

109. Hendrich, B. and Tweedie, S. (2003), ‘Themethyl CpG binding domain and theevolving role of DNA methylation inanimals’, Trends Genet., Vol. 19(5), pp.269–277.

110. Bestor, T. H. (1990), ‘DNA methylation:Evolution of a bacterial immune functioninto a regulator of gene expression andgenome structure in higher eukaryotes’,Philos. Trans R. Soc. London B Biol. Sci., Vol.326(1235), pp. 179–187.

111. Colot, V. and Rossignol, J. L. (1999),‘Eukaryotic DNA methylation as anevolutionary device’, Bioessays, Vol. 21(5),pp. 402–411.

112. Lewis, A. and Murrell, A. (2004), ‘Genomicimprinting: CTCF protects the boundaries’,Curr. Biol., Vol. 14(7), pp. 284–286.

& HENRY STEWART PUBLICATIONS 1467-5463. B R I E F I N G S I N B I O I N F O R M A T I CS . VOL 6. NO 2. 146–162. JUNE 2005 1 6 1

RNAi as a bioinformatics consumer

by guest on May 14, 2011

bib.oxfordjournals.orgD

ownloaded from

113. Barlow, D. (2005), ‘CpG islands not wellannotated in the mouse genome’, PrivateCommunication, January.

114. Cross S. H., Clark, V. H., Simmen, M. W.et al. (2000), ‘CpG island libraries fromhuman chromosomes 18 and 22: Landmarksfor novel genes’, Mamm. Genome, Vol. 11(5),pp. 373–383.

115. Janicki, S. M., Tsukamoto, T., Salghetti, S. E.et al. (2004), ‘From silencing to geneexpression: Real time analysis in single cells’,Cell, Vol. 116, pp. 683–698.

116. Eckhardt, F., Beck, S., Gut, I. G. and Berlin,K. (2004), ‘Future potential of the Human

Epigenome Project’, Expert Rev. Mol. Diagn.,Vol. 4(5), pp. 609–618.

117. Novik, K. L., Nimmrich, I., Genc, B. et al.(2002), ‘Epigenomics: Genome-wide study ofmethylation phenomena’, Curr. Issues Mol.Biol., Vol. 4(4), pp. 111–128.

118. Bradbury, J. (2003), ‘Human epigenomeproject – up and running’, PLoS Biol., Vol.1(3), p. e82.

119. Yang, H. H. and Lee, M. P. (2004),‘Application of bioinformatics in cancerepigenetics’, Ann. New York Acad. Sci., Vol.1020, pp. 67–76.

1 6 2 & HENRY STEWART PUBLICATIONS 1467-5463. B R I E F I N G S I N B I O I N F O R M A T I C S . VOL 6. NO 2. 146–162. JUNE 2005

Sachidanandam

by guest on May 14, 2011

bib.oxfordjournals.orgD

ownloaded from