RNAi as a bioinformatics consumer
Ravi Sachidanandam
works at Cold Spring Harbor Laboratory and is a physicist
who is currently interested in computational biology. He is
working on several computational aspects of RNAi
and its application as a tool for biological research. He is also
working on various aspects of splicing, as well as modelling
and simulating systems involved in apoptosis, splicing, RNAi and
live-cell imaging.
Keywords: RNAi, miRNA, PTGS, TGS, gene silencing, chromatin
Ravi Sachidanandam
1 Bungtown Road,
Cold Spring Harbor Laboratory,
NY 11724, USA
Tel: +1 516 367 8864
Fax: +1 516 367 8389
E-mail: [email protected]
RNAi as a bioinformatics consumer
Ravi Sachidanandam
Date received (in revised form): 17th March 2005
Abstract
RNAi has shown great potential for use as a tool for biological discovery, analysis and
therapeutics. The involvement of the RNAi pathway in post-transcription silencing,
transcriptional silencing and epigenetic silencing as well as its use as a tool for forward genetics
and therapeutics throws up several bioinformatics challenges. This paper delineates several
areas of research and reviews work that has already been done, the tools that are available and
the challenges that lie ahead.
INTRODUCTION
RNAi, or RNA interference, is the
disruption of a gene’s expression by a
double-stranded RNA (dsRNA) in which
one strand is complementary (either
perfectly or imperfectly) to a section of
the gene’s mRNA.1 A dsRNA can enter
the cytoplasm through the expression of
a hairpin (or inverted repeats), through
viral gene expression or through artificial
constructs that cross the
cell membrane.
The disruption can take the form of
mRNA degradation, translational
repression or transcriptional repression
through epigenetic modifications.2–5 The
pathway can also lead to message
modification by cleavage of mRNAs.
The RNAi pathway is conserved across
eukaryotic species, from yeasts to humans.
Interestingly, in yeasts, Schizosaccharomyces
pombe has portions of the RNAi pathway
while Saccharomyces cerevisiae seems to have
lost most of it.6 Thus, it is a nearly universal
mechanism in biological systems and is
amenable to use as a tool for forward
genetics, where the expression of genes
can be regulated by external triggers and
phenotypic consequences can be
observed.7
The RNAi pathway has two main
roles, depicted in Figure 1:
post-transcriptional gene silencing (PTGS) and
transcriptional gene silencing (TGS).
PTGS is the degradation of transcribed
and processed mRNA in the cytoplasm. It
also includes translational silencing.
• Degradation of mRNA. Short
interfering RNAs (siRNAs) are 22 nt
double-stranded RNA with 2
nucleotide 3′ overhangs on both
strands and a phosphate on the 5′
end.9 siRNAs in the cytoplasm get
unwound and a single strand gets
incorporated into the RNA-induced
silencing complex (RISC).10 The
RISC along with the siRNA is guided
to the complementary mRNA, by an
unknown process. The mRNA can
get silenced through degradation by
Argonaute 2 slicing11,12 when the
siRNA strand shows perfect
complementarity.
• Inhibition of translation. In plants and
animals, short (22 nt) non-coding
RNAs called microRNAs (miRNAs)
regulate developmental pathways by
the regulation of protein
expression.13,14 miRNAs usually block
translation. Certain viruses have also
been shown to use miRNAs to
control gene expression, using the
hosts’ machinery.15
• Cleavage of RNA. mRNAs get
cleaved by miRNAs that have near-
perfect matches to them.16,17
© HENRY STEWART PUBLICATIONS 1467-5463. BRIEFINGS IN BIOINFORMATICS. VOL 6. NO 2. 146–162. JUNE 2005
TGS is the silencing of gene expression
in the nucleus through changes in
chromatin structure directed in a
sequence-specific manner by dsRNA.
This is a heritable effect and probably
shares some of the pathway with the
PTGS mechanism. Two forms of TGS
are heterochromatin formation and
methylation of promoters:
• Heterochromatin formation.5 DNA in
the nucleus is wrapped around histone
proteins in a periodic arrangement,
which is then organised as a
condensed structure called
chromatin.18 This helps in packaging
the DNA into a small volume and
allows for control of expression
through the modification of the
histones via methylation, acetylation,
ubiquitination or phosphorylation of
specific residues.19 Heterochromatin
(as opposed to euchromatin) is
condensed and transcriptionally silent
chromatin.20 One pathway for its
formation involves the methylation of
the residue lysine 9 of the histone H3
(H3K9). In S. pombe it has been
shown that H3K9 methylation is
initiated by the association of a strand
of siRNA in the nucleus with the
RNA-induced transcriptional
silencing (RITS) complex, consisting
of Chp1, Ago1 and Tas3.21 This
complex targets homologous regions
on the chromatin for H3K9
methylation. Other histone residues
can also be acetylated, methylated or
ubiquitinated and the pattern of such
modifications probably signifies the
heterochromatic or euchromatic state
of the chromatin.22
• Methylation of promoters. The
cytosine (C) of CpG (C followed by
G) motifs on the sequence can get
methylated by DNA methyltransferase
(DMTase) enzymes.23 This
can silence the underlying region of
the genome. RNAi can initiate the
methylation of gene promoters,
thereby blocking transcription.24,25
The exact mechanism of this process is
not understood.
Figure 1: RNAi pathway and its role in post-transcriptional and transcriptional gene silencing. Hairpin RNAs, from either microRNA (miRNA) genes or short hairpin RNA (shRNA) constructs, are transcribed and then processed in the nucleus by the RNase III enzyme Drosha. These stem loop products contain a two-nucleotide 3′ overhang and a 5′ phosphate at the base of the stem and are exported to the cytoplasm by exportin 5. In the cytoplasm, the Drosha product is further processed by Dicer into a 22-nucleotide RNA duplex with two-nucleotide 3′ overhangs and 5′ phosphates on both ends. One strand of these 22-nucleotide products (siRNAs or miRNAs) is incorporated into the RISC, composed of proteins including Argonaute (Ago), FXRP, VIG, Tudor SN and Gemin 3 and 4. RISC, with a single 22-nucleotide siRNA strand, can slice/cleave or block translation of an mRNA complementary to the siRNA strand. Chemically synthesised siRNAs and shRNAs can also be introduced through the cell membrane and become incorporated in the appropriate processing step. In transcriptional gene silencing (TGS), double-stranded RNA transcribed from inverted repeats in repetitive DNA induces heterochromatin formation and epigenetic silencing through H3K9 methylation. In S. pombe, siRNAs are components of the RNA-induced transcriptional silencing (RITS) complex, which includes Chp1, Tas3 and Ago. The RITS complex is involved in TGS. The RNAi pathway can also induce promoter silencing through DNA methylation (a process not fully understood). miRNAs are also implicated in programmed DNA elimination in organisms such as Tetrahymena.8
The introduction of large dsRNA into
mammalian cells results in a general
response (interferon or protein kinase
PKR response) that leads to cell death.26
It was discovered that using shorter
dsRNA (<29 nt) can bypass this
response.27 Short double-stranded RNA
with 2 nt 3′ overhangs and a 5′ phosphate
group, called siRNAs, that mimic the
product of Dicer activity, can get
incorporated directly into the RISC and
result in silencing activity.28 This is a
popular method of silencing genes in cells.
Another method of inducing RNAi is
to insert hairpin constructs into the
genome using vectors which can get
expressed stably (Figure 2).29 The
expressed hairpins are processed by
Drosha and exported to the cytoplasm,
where Dicer acts on them to create
siRNAs, which then get incorporated
into the RISC. These constructs are
called short hairpin RNAs or shRNAs.29
shRNAs can also be chemically
synthesised and introduced into the
cytoplasm,30,31 in which case, it is
important to mimic the product of
Drosha, which has a 2 nt 3′ overhang.
RNAi is also being proposed for
therapeutic uses,32 for example, by
targeting viral genes for silencing.
Applying this to viruses such as HIV or
poliovirus is problematic since they
mutate very rapidly and escape message
degradation by RNAi.33,34 There are trials
using siRNAs in the human eye against
vascular endothelial growth factor
(VEGF) to treat wet age-related macular
degeneration (wet AMD).35
Figure 2: An shRNA hairpin construct. Additional elements in the construct are used for engineering purposes. The U6 promoter ensures that the gene is expressed and the terminator ensures that only the hairpin gets expressed, that is, there is no transcriptional run-through. The barcode at the end is a random 60 mer that is unique to each hairpin, allowing identification of the hairpin, either via microarrays or via the use of polymerase chain reaction (PCR). The oligo is synthesised as a DNA, and then cloned into a plasmid using PCR, grown and then inserted into a retroviral vector. For transient effects, the plasmid can be transfected into the cell with the effects seen about 48 hours after transfection. For permanent insertion, the vector is used to infect the cell. There are two possible orientations of the hairpin: sense loop antisense (SAS) or antisense loop sense (ASS). If a 29 mer is used for the hairpin as shown in the figure, then the first 22 mer from the end is believed to be used for creating the siRNA. siRNA design considerations are used to design 19 mers that are then extended nine base pairs at the 5′ end and one at the 3′ end to obtain the 29 mer. Variations on this construction are possible, depending on the biological needs.
The exact details of the biology
delineated here are bound to change as this
is a rapidly evolving field, but some of the
broad themes and bioinformatics concerns
are bound to be relatively longer lasting.
Listed below are several areas in RNAi
research and the role of bioinformatics in
them. Each section will conclude with a
short list of bioinformatics challenges that
are definitely skewed by the author’s
preferences.
RNAi-RELATED PROTEIN CODING GENE/FUNCTION DISCOVERY
The RNAi pathway has not been fully
elucidated. Many more proteins than are
currently known may be involved in the
RNAi pathway. Experimentally, many of
the genes have been discovered using
mutational studies in Caenorhabditis
elegans, as well as by screening all genes of the
genome using RNAi against them.36
Comparative genomics has helped with
RNAi research in several ways. The
discovery that Argonaute exists in S.
pombe but not in S. cerevisiae6 led to the
discovery of RNAi-controlled
heterochromatin silencing through
histone H3 lysine 9 methylation.37 The
prediction of an Argonaute homologue in
the archaeon Pyrococcus furiosus allowed the
crystallisation of the protein. This, in turn,
led to the determination of its crystal
structure, as well as the identification of
the function of the PIWI domain.10,11
A computational approach to
predicting novel RNAi genes involves
looking for domains that are peculiar to
proteins involved in RNAi. For example,
Argonaute has the PAZ and PIWI
domains,38 and Dicer has a DexH box
helicase domain,39 two RNaseIII
domains,40 a PAZ domain and a dsRBD
(double-stranded RNA binding)
domain.41 Thus proteins with PAZ and
PIWI domains are worthy of
experimental investigation. This kind of
analysis usually involves searching for
motifs using hidden Markov models
(HMMs)42 and phylogenetic analysis.43
Phylogenetic analysis has also helped with
understanding the functions of various
homologues of Argonaute.12,38 The study
of protein families involved in RNAi can
help reveal new members of these families
and uncover new mechanisms of the
pathway.
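The domain-based screen described above can be sketched in a few lines. This is a minimal illustration, assuming that domain annotations (for example from an HMM-based search) are already available as a mapping from protein name to a set of domain names. The protein names, the annotation table and the helper `candidates_with_domains` are hypothetical and not part of any published tool.

```python
from typing import Dict, List, Set


def candidates_with_domains(annotations: Dict[str, Set[str]],
                            required: Set[str]) -> List[str]:
    """Return the proteins whose annotated domains include every required domain."""
    return sorted(name for name, domains in annotations.items()
                  if required <= domains)


# Hypothetical annotation table; real input would come from a domain search.
annotations = {
    "protein_A": {"PAZ", "PIWI"},               # Argonaute-like architecture
    "protein_B": {"PAZ", "RNaseIII", "dsRBD"},  # Dicer-like architecture
    "protein_C": {"PIWI"},
}

rnai_candidates = candidates_with_domains(annotations, {"PAZ", "PIWI"})
```

Proteins returned by such a filter are only candidates for experimental investigation; phylogenetic analysis would still be needed to place them within a family.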
Bioinformatics challenges
• Identification of all the counterparts to
RNAi proteins in each of the model
organisms.
• Understanding the evolution of the
pathway through phylogenetic
analysis, taking into account structural
elements of the RNAi proteins (such
as the differences in catalytic activity
between various forms of Argonaute).
The double-stranded RNA binding
domain is used in several other
pathways: how did it evolve to be
used in the RNAi pathway?41
• Identifying critical components of the
RNAi pathway through the analysis of
the topology of the pathway. There
are several experiments where RNAi
is used to silence parts of the pathways.
This process needs to be understood
and its temporal behaviour explained
(or predicted, since no data are
available yet).
miRNA PATHWAYS
miRNA pathways are developmentally
important in almost all eukaryotic
organisms,44 and may also have a major
role to play in the nervous systems of
mammals.45,46 miRNA genes are small
non-coding genes (<150 bp) whose
transcripts form hairpins called primary
miRNA (pri-miRNA). The pri-miRNAs
get processed by an RNaseIII enzyme
called Drosha in the nucleus into
precursor miRNAs (pre-miRNAs),
which have a two-nucleotide 3′
overhang. The pre-miRNAs get exported
into the cytoplasm, where they are
processed by Dicer and then get
associated with the RISC (Figure 1).47
This RISC-associated single-strand
miRNA blocks translation of the
complementary mRNA. The level of
match to target need not be very high, as
shown in Figure 3. miRNAs that have
near-perfect matches to an mRNA can
also direct cleaving (or message
destruction) of the target.16,17
MicroRNAs control the timing of gene
expression, especially during
development.44,48
Occurrence, expression and prediction of miRNA genes
Many miRNA genes have been
experimentally identified in a variety of
organisms and can be accessed at the
miRNA registry website.49,50 The registry
was constructed by sequencing small
RNAs (~22 nt) and mapping them to
the genome, and delineating the hairpins
that contain them.
miRNAs can occur in intergenic
regions, in introns of protein coding
regions or in exons and introns of non-
coding genes.51 The intronic miRNAs
probably rely on Pol II mediated pre-
mRNA transcription and splicing52 for
expression, while the intergenic ones are
believed to be transcribed independently
by Pol II polymerases.53
Figure 3: The first three hairpin structures are let 7 miRNA hairpins from Caenorhabditis elegans, Drosophila melanogaster and Homo sapiens. The shaded portions are the mature sequences that result from processing of the miRNA gene. The last two alignments are for mature let 7 against two targets in the 3′ UTR of lin 41 in C. elegans.44
miRNAs can also occur in clusters
(separated by a few kilobases) that can be
identified by mapping the miRNA genes
to the genome. These clusters can be
transcribed together and are believed to
contain miRNAs that control mRNAs from
pathways that are involved in related
functions. Not much is known about
their promoters, though in the case of C.
elegans some conserved motifs have been
identified.54 Identification of such
promoter sequences can help with the
prediction of miRNA genes.
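Identifying clusters from mapped miRNA genes, as described above, amounts to grouping genes that lie close together on the same chromosome. The sketch below is only illustrative: the gene names and coordinates are invented, the function `cluster_genes` is hypothetical, and the 5 kb gap is an assumed stand-in for "a few kilobases".

```python
from typing import List, Tuple

Gene = Tuple[str, str, int]  # (name, chromosome, start coordinate)


def cluster_genes(genes: List[Gene], max_gap: int = 5000) -> List[List[str]]:
    """Group genes on the same chromosome whose starts lie within max_gap bases."""
    clusters: List[List[Gene]] = []
    for gene in sorted(genes, key=lambda g: (g[1], g[2])):
        last = clusters[-1][-1] if clusters else None
        if last and last[1] == gene[1] and gene[2] - last[2] <= max_gap:
            clusters[-1].append(gene)   # extends the current cluster
        else:
            clusters.append([gene])     # starts a new cluster
    # Only groups with at least two genes count as clusters.
    return [[g[0] for g in c] for c in clusters if len(c) > 1]


# Invented coordinates for illustration only.
toy_genes = [
    ("mir-a", "chr1", 1000),
    ("mir-b", "chr1", 3000),
    ("mir-c", "chr1", 50000),
    ("mir-d", "chr2", 2000),
]
toy_clusters = cluster_genes(toy_genes)
```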
It is almost impossible to identify
miRNA genes by scanning the genome of
any organism for hairpins alone, as the
number of putative hairpins far exceeds the number of
known genes. Additional signals are
necessary. One method of analysis uses
the known biases in targeting strands, but
this is not necessarily easy to use in the
miRNA case, as the level of match
between the miRNA single strand and
the mRNA is usually not very high.
There have been suggestions that miRNA
precursors form more stable hairpins,55
but this may not be sufficient to make
reasonably useful predictions. Another
method that has been used is the
conservation of sequences between
closely related species such as rice
(O. sativa) and Arabidopsis (A. thaliana).56
Discovery of new miRNA genes can
help identify critical features that control
the silencing behaviour of dsRNAs.
MiRscan57 and MiRseeker58 are
programs that use the conservation of
stem loop structures across species to
identify novel miRNA genes.
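To see why hairpin scanning alone yields so many candidates, it helps to look at how permissive even a strict scan is. The following is a deliberately naive sketch, not the method used by MiRscan or MiRseeker: it reports every position where a short stem is followed, after a fixed loop, by its exact reverse complement. The stem and loop lengths and the toy sequence are assumptions chosen for illustration; real precursors have longer stems with bulges and G:U pairs, which only increases the number of hits.

```python
def reverse_complement(dna: str) -> str:
    """Reverse complement of a DNA string over the alphabet ACGT."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_perfect_hairpins(seq: str, stem: int = 6, loop: int = 4) -> list:
    """Return start positions of stem-loop-stem windows with a perfectly paired stem."""
    span = 2 * stem + loop
    hits = []
    for i in range(len(seq) - span + 1):
        left = seq[i:i + stem]
        right = seq[i + stem + loop:i + span]
        if right == reverse_complement(left):
            hits.append(i)
    return hits


# Toy sequence: stem GATTAC, loop TTTT, then the reverse complement GTAATC.
toy_sequence = "CCGATTACTTTTGTAATCCC"
hairpin_starts = find_perfect_hairpins(toy_sequence)
```

Filters such as cross-species conservation of the stem loop would be applied on top of a scan like this to reduce the candidate list.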
Target prediction
The binding sites of miRNAs are
presumed to be in the 3′ UTRs and
usually occur in multiple copies (Figure
3).44,59 Combinatorial control, with
multiple miRNAs targeting a single gene,
is probably an important feature of
miRNA targeting, very similar to the
mode of transcription factor control of
genes.60–62
One of the first miRNAs to be
identified is let 7, which targets the gene
lin 41 (Figure 3).44 The alignments of let
7 and its target regions on the 3′ UTR of
lin 41 show that several bulges as well as
G:U wobble base pairing are allowed.
Thus, the number of potential targets of
any miRNA is quite large, unless some
restrictions can be placed on these
matches.
Some rules for miRNA target matches
have been inferred from a mutational
study,60 which found that the strong
binding of the 5′ end (the first eight base
pairs) of the mature miRNA to the 3′
UTR sequence was very important and
G:U wobble pairing reduced the silencing
efficiency. In addition, it is thought that
having multiple binding sites for an
miRNA on the 3′ UTR increases the
efficiency of silencing.60 This implies that
multiple binding sites for different
miRNAs on the same UTR allow for
combinatorial control of the silencing
process. The mechanism of targeting has
not been settled. There have been several
attempts (for example, Rhoades et al.,63
John et al.64) at predicting targets, but the
predictions of different groups
for the same set of miRNAs agree less
well than one would expect, presumably
because there could be a large number of
targets for each miRNA65 and each group
uses its own criteria to limit results.
Models have been developed for miRNA
target interactions, based on
experiments.59
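The rule that the first eight bases at the 5′ end of the miRNA should pair strongly with the 3′ UTR can be turned into a simple site scan. The sketch below is a simplification under stated assumptions: it demands perfect Watson-Crick pairing over the first eight bases (no G:U wobble, no bulges), ignores the rest of the miRNA, and counts sites so that UTRs with multiple copies stand out. The function names are hypothetical, the UTR is invented, and the miRNA string is intended to represent mature let 7 and should be checked against the registry before any real use.

```python
def reverse_complement_rna(rna: str) -> str:
    """Reverse complement of an RNA string over the alphabet ACGU."""
    return rna.translate(str.maketrans("ACGU", "UGCA"))[::-1]


def seed_sites(mirna: str, utr: str, seed_len: int = 8) -> list:
    """Positions in the UTR that pair perfectly with the miRNA 5' seed."""
    site = reverse_complement_rna(mirna[:seed_len])
    positions = []
    start = utr.find(site)
    while start != -1:
        positions.append(start)
        start = utr.find(site, start + 1)  # allow overlapping sites
    return positions


mirna = "UGAGGUAGUAGGUUGUAUAGUU"      # assumed mature let 7 sequence (5' to 3')
toy_utr = "AAACUACCUCAGGGCUACCUCAUU"  # invented 3' UTR fragment with two sites
sites = seed_sites(mirna, toy_utr)
```

Because each group tends to relax or tighten such rules differently, two scans of this kind with different settings can return quite different target lists.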
miRNAs could also target other
miRNAs for silencing.66 As discussed
below, miRNAs show tissue-specific
expression patterns; thus correlation of
mRNA with miRNA expression can help
narrow down targets. It should also be
possible to cluster miRNAs and mRNAs
based on expression patterns. There are a
number of websites67,68 and
papers63,64,69–71 that deal with miRNA
target predictions.
miRNA microarrays
Microarrays that can detect miRNAs have
been constructed to study the expression
of miRNAs. These experiments have
shown that miRNA expression shows
tissue72 and temporal specificity, being
predominantly found in developmental
pathways73 and brain tissue.45,46 Some of
these studies also show that there is a
correlation between mRNA and miRNA
expression in various tissues.73 This is an
important piece of evidence for validating
miRNA target predictions.
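One way to use expression data for validating target predictions is to correlate the expression profile of an miRNA with that of each predicted target across tissues. The sketch below is illustrative only: the profiles are invented, and ranking the most negative correlations first reflects an assumption that silencing lowers the level of the target mRNA, which need not hold when an miRNA blocks translation without degrading the message.

```python
from math import sqrt
from typing import Dict, List, Tuple


def pearson(x: List[float], y: List[float]) -> float:
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def rank_targets(mirna_profile: List[float],
                 mrna_profiles: Dict[str, List[float]]) -> List[Tuple[str, float]]:
    """Rank predicted targets, most anticorrelated with the miRNA first."""
    scored = [(gene, pearson(mirna_profile, profile))
              for gene, profile in mrna_profiles.items()]
    return sorted(scored, key=lambda item: item[1])


# Invented expression levels across four tissues.
mirna_profile = [1.0, 2.0, 3.0, 4.0]
mrna_profiles = {"geneA": [4.0, 3.0, 2.0, 1.0], "geneB": [1.0, 2.0, 3.0, 5.0]}
ranked = rank_targets(mirna_profile, mrna_profiles)
```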
miRNA libraries
Precursor miRNAs can be artificially
synthesised and either transfected or stably
inserted into the genome (as explained in
the caption for Figure 2), from where they can
be expressed using inducible promoters.
This can allow study of miRNA function.
The design of such libraries is not
difficult, but analysis of results can be
interesting. The analysis of mRNA
networks due to silencing of components
can help elucidate the topology of
pathways.8
Message modification, DNA suppression, viral control, DNA methylation
These constitute a class of miRNA-directed
activities that have been recently
discovered, or seen only in a few cases so
far. The mRNA of the developmental gene HOXB8 is
cleaved by the miRNA mir 196b in mouse.16
This seems to imply that target mRNAs
that have a high degree of
complementarity to miRNAs are likely to
be cleaved by them, and the location of
the target sequence need not be in the
UTRs. Viruses can use miRNAs and the
host’s RNAi machinery to silence
genes.15 There have also been reports of
HIV containing miRNAs.74
Genomic rearrangements in
Tetrahymena are believed to be RNAi-
controlled and are probably the result of
miRNA action.75,76 miRNAs have also
been shown to be involved in DNA
methylation.77
Bioinformatics challenges
• Identification and prediction of
miRNA promoters.
• Identification of miRNA targets. This
is not a solved problem. Many
underlying assumptions in miRNA
target discovery (such as homologies and
UTR location of target sites) can help
identify good targets, but may not
cover all possible sites, some of which
may be unique to certain species and
may not follow these rules.
• Genomes of viruses can be scanned for
possible miRNAs that can target host
genes.
• miRNAs must act in concert on
members of pathways, so it might be
useful to consider function in
analysing targets of miRNA. Thus
study of the topology of genetic
networks and tissue specificity of
genes can help with miRNA target
studies.
• Identification of miRNA-directed
cleavage targets. These might be an
important feature of biological systems
and can easily be probed by
bioinformatics means.
SHORT INTERFERING RNA (siRNA)
Short interfering RNAs are short (22 nt)
double-stranded RNAs (dsRNA) with
two nucleotide 3′ overhangs on both
strands and a phosphate on each 5′ end.9
siRNAs are created by the action of Dicer
on dsRNAs. These can get incorporated
into RISC and silence genes that are
complementary to the RISC incorporated
strand.
Design principles for functional siRNAs
It was discovered that the strand whose 5′
end is easier to open, or more unstable, is
favoured for inclusion in the RISC.78,79
RISC has a putative helicase activity of
unknown origin which probably unwinds
the dsRNA from the unstable end. This
probably explains the strand bias of RISC.
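The strand bias can be illustrated with a crude proxy for end stability. The sketch below counts G and C bases in a short window at the 5′ end of each strand and predicts that the strand with the less stable (lower GC) 5′ end is favoured. This is an assumption-laden simplification: published approaches use thermodynamic free energies rather than GC counts, the four-base window is arbitrary, overhangs are ignored, and `favoured_strand` is a hypothetical helper.

```python
def gc_count(seq: str) -> int:
    """Number of G or C bases in a sequence."""
    return sum(base in "GC" for base in seq.upper())


def favoured_strand(sense: str, window: int = 4) -> str:
    """Guess which strand of a duplex enters RISC from 5' end stability.

    The 5' end of the antisense strand pairs with the 3' end of the sense
    strand, so its GC count equals that of the last `window` sense bases.
    """
    sense_5p = gc_count(sense[:window])
    antisense_5p = gc_count(sense[-window:])
    if sense_5p < antisense_5p:
        return "sense"
    if antisense_5p < sense_5p:
        return "antisense"
    return "either"


# Invented duplex cores (sense strand written 5' to 3', overhangs omitted).
choice_a = favoured_strand("AAUAGCGCGCGCGCGCGGCC")  # weak sense 5' end
choice_b = favoured_strand("GGCCGCGCGCGCGCGCAUAA")  # weak antisense 5' end
```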
Several studies, including studies of
miRNA hairpins,80 have revealed biases
of base composition along the active
strand of the siRNA. Using these has
allowed for rational design of siRNAs for
more efficient silencing. This is done
using a weight matrix, very similar to
how splice sites are scored.
A partial explanation for the weight
matrix is that it agrees with observations
on strand usage bias. The compositional
biases at different positions on functional
siRNAs are derived from studies on large
populations of siRNAs. It is not clear
how the underlying physical phenomena
are related to this. This is reminiscent of
the rules for detection of good splice sites
in genomic sequences.
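A weight matrix of the kind described above can be built from a set of functional siRNAs and then used to score new designs. The sketch below is a generic log-odds matrix against a uniform background, analogous to splice-site scoring; it is not the matrix used by any particular design tool. The two-base training sequences, the pseudocount and the function names are assumptions made to keep the example small.

```python
from math import log2
from typing import Dict, List

BASES = "ACGU"


def build_weight_matrix(functional: List[str],
                        pseudocount: float = 1.0) -> List[Dict[str, float]]:
    """Per-position log2 odds of each base relative to a uniform background."""
    matrix = []
    for pos in range(len(functional[0])):
        counts = {base: pseudocount for base in BASES}
        for seq in functional:
            counts[seq[pos]] += 1
        total = sum(counts.values())
        matrix.append({base: log2((counts[base] / total) / 0.25)
                       for base in BASES})
    return matrix


def score(seq: str, matrix: List[Dict[str, float]]) -> float:
    """Sum of the position-specific weights along a candidate strand."""
    return sum(column[base] for base, column in zip(seq, matrix))


# Invented two-base "siRNAs" standing in for a large functional population.
toy_functional = ["AU", "AU", "AC", "AG"]
toy_matrix = build_weight_matrix(toy_functional)
good_score = score("AU", toy_matrix)
poor_score = score("CG", toy_matrix)
```

A candidate that matches the observed biases scores above zero, while one that goes against them scores below zero; the matrix itself says nothing about the physical cause of the bias.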
There could be other unknown factors
that affect siRNA performance that have
not been discovered yet. In addition, it
has been observed that siRNAs tend to
show off-target effects, that is, silencing of
genes that were not the
intended targets.81,82 These factors can
confound good designs in actual
laboratory use.
siRNAs are synthesised with 2 nt
overhangs (usually TT) on the 3′ ends of
each strand and transfected into cells.
siRNAs can also be produced by the
action of Dicer on dsRNA and by the
transfection of siRNA vectors into cells
for constant production.
Construction of an siRNA library
siRNAs are now routinely used to silence
genes in vitro. They can be ordered off the
shelf for common genes and a lot of effort
has gone into understanding their action.
Building an siRNA library involves
designing silencing constructs for each
mRNA using the biases in base
composition at each position as discussed
above.
The constructs have to be unique, in
that they cannot target other genes. To
ensure this, a database of non-redundant
mRNAs (excluding splice variants) needs
to be built, so that the siRNA constructs
can be checked for uniqueness. In
addition, designs that do not have GC
content within 30–60 per cent or that contain
the sequences AAAA, TTTT, GGGG or
CCCC are eliminated. siRNAs can be
ordered from companies,83,84 which
synthesise them with TT overhangs on
the 3′ ends of each strand. Using several
siRNAs against each gene of interest
ensures that each siRNA has different
off-targets and improves the chance that
at least one will work. There are several
websites that allow design of siRNAs.85,86
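The filters just described (GC content within 30–60 per cent, no runs of four identical bases, and uniqueness against a non-redundant mRNA set) translate directly into code. The sketch below is a minimal version under stated assumptions: candidates are written as DNA target-site sequences, the GC bounds are treated as inclusive, uniqueness means no exact occurrence in any other transcript, and the function names and toy transcripts are hypothetical.

```python
from typing import Dict

HOMOPOLYMERS = ("AAAA", "TTTT", "GGGG", "CCCC")


def gc_fraction(seq: str) -> float:
    """Fraction of bases that are G or C."""
    return sum(base in "GC" for base in seq) / len(seq)


def passes_filters(seq: str, gc_min: float = 0.30, gc_max: float = 0.60) -> bool:
    """Apply the GC-content and homopolymer-run filters to a candidate."""
    seq = seq.upper()
    if any(run in seq for run in HOMOPOLYMERS):
        return False
    return gc_min <= gc_fraction(seq) <= gc_max


def is_unique(seq: str, transcripts: Dict[str, str], intended: str) -> bool:
    """True if the site occurs in no transcript other than the intended gene."""
    return all(seq not in mrna
               for gene, mrna in transcripts.items() if gene != intended)


# Invented 19-mer candidate and transcript set.
candidate = "GATCGATCGATCGATCGAT"
toy_transcripts = {
    "geneA": "CC" + candidate + "CC",
    "geneB": "TTTTGGGGAAAACCCC",
}
keep = passes_filters(candidate) and is_unique(candidate, toy_transcripts, "geneA")
```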
Since siRNAs cause silencing by
binding to targets that are
complementary, specificity is achieved by
low concentrations of siRNAs in the cell.
At higher concentrations there is a greater
likelihood of binding to targets that do
not show exact complementarity and,
hence, off-target effects can occur more
frequently.
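Since off-target effects arise from binding to sites that are not exactly complementary, a uniqueness check based on exact matches is not sufficient. A simple extension is to scan transcripts for near matches, as in the sketch below. The mismatch threshold, the function names and the sequences are assumptions for illustration; a realistic scan would also weight mismatch positions and would need indexing to run genome-wide.

```python
from typing import List, Tuple


def hamming(a: str, b: str) -> int:
    """Number of mismatched positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def near_matches(site: str, transcript: str,
                 max_mismatches: int = 2) -> List[Tuple[int, int]]:
    """Return (position, mismatches) for windows within the mismatch limit."""
    hits = []
    for i in range(len(transcript) - len(site) + 1):
        mismatches = hamming(site, transcript[i:i + len(site)])
        if mismatches <= max_mismatches:
            hits.append((i, mismatches))
    return hits


# Invented short site and transcript, kept small so the result is easy to check.
toy_site = "GATTACA"
toy_transcript = "CCGATTACACCGATAACACC"
toy_hits = near_matches(toy_site, toy_transcript, max_mismatches=1)
```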
Well-characterised siRNAs for several
genes can be found in the literature.
Several companies sell well-characterised
off-the-shelf siRNAs for genes. This takes
the guesswork out of experiments.
However, siRNAs are expensive and the
use of high dosage (each cell usually
receives several siRNAs) makes induction
of other effects, such as off-target
silencing, more likely.
Use of siRNA libraries
In C. elegans, several genome-wide
screens have been used to identify
pathways important in fat metabolism87
and to study the role of mitochondria in
longevity.88 In higher organisms, siRNA
screens can be done, but they are
prohibitively expensive. In addition, this
work is most easily done in cell cultures,
though some medical applications of
siRNAs have been proposed in living
organisms.34
Bioinformatics challenges
• A better algorithm for siRNA design
is still an open problem. There is room
for new ideas and for consideration of factors
not yet taken into account, such as
secondary structure.89
• Designing an algorithm to find off-
targets would be a very useful tool in
designing siRNAs for genome-wide
studies. There have been several
attempts, for example Yamada and
Morishita,90 but there is work to be
done here.
• Experiments to mimic effects of drugs
can be analysed using siRNAs. A
perturbative analysis to understand the
effects on genetic networks is
essential.73 This field is just starting, as
the expense of large-scale screens
comes down.
• A resource that would list all the data
on siRNAs that are known to
function, the conditions under which
they function and the design failures,
from the literature as well as anecdotal
information. It should allow easy
access to such data and would
definitely help improve design
algorithms. This would at least save
the effort wasted on siRNAs
that do not work under certain
conditions.
• siRNA designs can also be improved
by using knowledge gained from
miRNA studies and analysis of
miRNA structures.80 A lot more work
needs to be done in this area, which
would allow fine-tuning siRNA
designs.
SHORT HAIRPIN RNA (shRNA)
Short hairpin RNAs (shRNAs) result from the
expression of inverted repeats or
constructs that can fold into short
hairpins. shRNAs
are artificial constructs that can be inserted
into a genome and expressed
endogenously.29 The expressed hairpins
can then fold to form dsRNA, and
Drosha and Dicer can act on these
hairpins to create the mature sequence, which is used
by the RISC to target the genes. The
hairpin constructs can be synthesised and
inserted into the genome, or transfected
into the cell, as explained in Figure 2.
Design principles
The basic idea is to use the lessons learnt
from the siRNA designs to ensure that
the desirable strand is picked. For a 29
mer based design, the 22 mer from the 3′
end of the shRNA hairpin is believed to
be picked by the action of Dicer on the
shRNA. The siRNA based design is used
to design the 21 mer, extended by 1 base
in the 3′ direction and 7 base pairs in the
5′ direction to make a 29 mer, which is
then synthesised as a hairpin with a 4 base
loop.
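Once a 29 mer has been chosen, assembling the hairpin oligo is a matter of joining the stem, a loop and the reverse complement of the stem. The sketch below builds the sense loop antisense (SAS) orientation as DNA, since the oligo is synthesised as DNA. The stem sequence and the four-base loop are placeholders invented for illustration, not recommended sequences, and `build_hairpin` is a hypothetical helper that leaves out the promoter, terminator and barcode.

```python
def reverse_complement(dna: str) -> str:
    """Reverse complement of a DNA string over the alphabet ACGT."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def build_hairpin(sense: str, loop: str) -> str:
    """Join sense stem, loop and antisense stem (SAS orientation)."""
    return sense + loop + reverse_complement(sense)


# Placeholder 29 mer and placeholder 4-base loop, for illustration only.
toy_stem = "GATTACA" * 4 + "G"
toy_loop = "ACGT"
toy_hairpin = build_hairpin(toy_stem, toy_loop)
```

Swapping the two stems in `build_hairpin` would give the antisense loop sense orientation instead.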
Another method of constructing these
is to use the context of a known miRNA.
An miRNA with a target strand of length
22 is picked, and the target sequence is
replaced with the antisense strand from
the design above. The complementary
strand is also replaced, taking care to
preserve the bulges, loops and types of
mismatches. This will probably ensure
that the hairpin gets processed by Drosha,
is properly and efficiently processed by
Dicer and gets associated properly
with the RISC.
These ideas have been implemented
and are available as a web-based resource
for public use.85 shRNAs are cheaper
than siRNAs for large-scale studies.
shRNAs do not suffer from dosage
problems as they are expressed in small
numbers in each cell in contrast to most
applications of siRNAs where a large
number of siRNAs are transfected into
any given cell. They can be inserted into
germlines, allowing for stable suppression
of gene expression in organisms, and are
easier to manipulate than knockouts.
However, constructing and validating
libraries is labour intensive and time
consuming.
Building a library of shRNAs
The considerations for shRNA libraries
are identical to the ones for siRNA
libraries. There are several libraries that
are now commercially available.91,92
Use of the shRNA libraries
The libraries allow large-scale studies that
are prohibitively expensive with siRNAs.
We describe below one method of using
an shRNA library.
• Design the study using the annotation
of the genes. For example, to design a
study on the effect of kinases on
cancer cells, find all the shRNAs that
silence kinases and get them from the
library.
• The list of shRNAs will form the pool
that can be obtained from the
laboratories that have constructed the
libraries. The relevant shRNAs are
infected into cells at a dilution that
ensures that, statistically, there is only
one shRNA per cell.
• The cells are grown for a few
generations.
• The shRNAs are extracted from the
cells using restriction enzymes, and
microarrays designed to detect either
the barcodes or the shRNAs are used
to find shRNAs that have been
depleted.
• Find the mRNAs that are targets of
the shRNAs that have been depleted
and design further studies which may
then involve traditional biological
work (knockouts etc).
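The read-out step of the screen above, finding shRNAs that have been depleted, can be sketched as a comparison of barcode (or hairpin) signal before and after growth. The example below is a minimal illustration: the intensities are invented, the log2 fold change threshold of -1 (a two-fold drop) and the pseudocount are arbitrary assumptions, and a real analysis would need normalisation and replicates.

```python
from math import log2
from typing import Dict


def depleted_shrnas(before: Dict[str, float], after: Dict[str, float],
                    threshold: float = -1.0,
                    pseudocount: float = 1.0) -> Dict[str, float]:
    """Return shRNAs whose log2 fold change falls at or below the threshold."""
    depleted = {}
    for shrna, start_signal in before.items():
        end_signal = after.get(shrna, 0.0)  # missing signal counts as zero
        fold_change = log2((end_signal + pseudocount) /
                           (start_signal + pseudocount))
        if fold_change <= threshold:
            depleted[shrna] = fold_change
    return depleted


# Invented microarray intensities for three shRNAs.
signal_before = {"sh1": 99.0, "sh2": 99.0, "sh3": 99.0}
signal_after = {"sh1": 99.0, "sh2": 24.0, "sh3": 39.0}
hits = depleted_shrnas(signal_before, signal_after)
```

The genes targeted by the shRNAs in `hits` would then be the starting point for follow-up studies.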
Bioinformatics challenges
• Simulate and model behaviours of
libraries of shRNAs when used in
populations of cells. This can help
understand outcomes of library
screens.
• A central repository of shRNA
constructs. Such a resource can act as a
clearing-house that can track results,
identify patterns in shRNA
performance and allow users to find
constructs from a variety of sources.
There exists one now called the
RNAi Codex,93 but more work needs
to be done.
• Design microarray probes for more
efficient study of shRNA expression
in cell populations. The shRNA
constructs can be probed either with
the hairpin or with barcodes that
may be inserted into the shRNAs, so
there is not much leeway in terms of
sequence length and freedom for
design. Designing multiple probes
per construct that can
distinguish different shRNAs is
therefore crucial.
• Perturbative analysis of genetic
networks. There is a need to develop
techniques that can analyse results of
silencing of genes either on a one by
one basis or in groups. Such
techniques have been developed to
study the action of drugs to infer
genetic networks,74 but a lot of work
remains to be done.
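As a rough illustration of the first challenge in the list above (simulating a library in a population of cells), the sketch below makes two simplifying assumptions that are not taken from the text: infection events follow a Poisson distribution with a given multiplicity of infection, and each shRNA changes only the probability that a cell divides in one generation. The function names and parameter values are hypothetical.

```python
import math
import random


def fraction_single_infection(moi):
    """Among infected cells, the Poisson fraction carrying exactly one construct."""
    if moi <= 0:
        raise ValueError("moi must be positive")
    p_zero = math.exp(-moi)
    p_one = moi * math.exp(-moi)
    return p_one / (1.0 - p_zero)


def simulate_screen(fitness, cells_per_shrna=200, generations=5, seed=0):
    """Simulate growth of a pool in which every cell carries one shRNA.

    fitness maps an shRNA id to the probability that a cell carrying it
    divides in a generation (cells that do not divide simply persist).
    Returns the final fraction of the pool occupied by each shRNA.
    """
    rng = random.Random(seed)
    counts = {shrna: cells_per_shrna for shrna in fitness}
    for _ in range(generations):
        for shrna, p_divide in fitness.items():
            divisions = sum(
                1 for _ in range(counts[shrna]) if rng.random() < p_divide
            )
            counts[shrna] += divisions
    total = sum(counts.values())
    return {shrna: count / total for shrna, count in counts.items()}


# Hypothetical pool: one neutral shRNA and one that silences a gene
# needed for proliferation.
fitness = {"sh_neutral": 1.0, "sh_essential": 0.2}
fractions = simulate_screen(fitness)
single_at_low_moi = fraction_single_infection(0.1)
```

Under these assumptions a low multiplicity of infection leaves most infected cells with a single construct, and the shRNA against the proliferation gene ends up as a small fraction of the pool, which is the depletion signal the arrays are meant to detect.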
EPIGENETICS
Epigenetics is the study of heritable
modifications of the genome that leave
the DNA sequence unchanged,94 but can
have phenotypic consequences. A
consequence of this is the phenomenon of
imprinting, which is the dependence of
the expression of an allele of a gene on the
parent from whom it was inherited.95,96
Epigenetic modifications can silence
regions of the genome, and can be of two
types:
• Histone modifications. Chromatin
consists of DNA wrapped around
histones, which have long tails.
Residues in these tails can be modified
by attachment of various groups, such
as acetyl groups (acetylation), methyl
groups (methylation), ubiquitin groups
(ubiquitination) and phosphate groups
(phosphorylation).22 DNA is
negatively charged, while histones are
positively charged. Acetylation of
lysine residues neutralises their positive
charge and can prevent
heterochromatin formation, while
adding a methyl group to lysine 9 on
the histone H3 tail (H3K9) leaves the
charge unchanged but creates a binding
site for heterochromatin proteins, which can then
lead to compaction of DNA and the
formation of heterochromatin.20 Thus
a ‘histone code’, involving different
combinations of histone
modifications, has been postulated to
regulate the state of the chromatin.19
• DNA methylation. DNA
methyltransferases (DMTases) attach methyl
groups to C residues that are followed
by Gs (CpG dinucleotides).23 DNA
methylation occurs in plants, mammals
and fungi but is absent or nearly absent
in Drosophila, S. cerevisiae, S. pombe and
C. elegans.97 There is a connection
between DNA methylation and
histone modification: methyl-CpG-binding
protein 2 (MeCP2) binds to
methylated regions and can then
recruit histone deacetylase (HDAC)
complexes, which remove acetyl
groups from histone tails and hence
change the state of the chromatin.98 In
Arabidopsis, miRNAs have been shown
to be involved in DNA methylation.77
The RNAi pathway plays a critical role
in the regulation of epigenetic silencing
by establishing heritable, sequence-specific
patterns of DNA methylation and histone
modifications.5 We have already discussed
possible mechanisms in the introduction.
These modifications are replicated on
newly synthesised DNA; thereby they
become heritable.99 The epigenome is the
combination of the genome along with its
epigenetic state. The big challenge is to
find the connection, if there is one,
between the sequence and the underlying
epigenetic state.
Histone modifications can be identified
using chromatin immunoprecipitation
(ChIP).100 In the ChIP assay, DNA is
cross-linked to its associated proteins using
formaldehyde and then sheared into small
fragments. The DNA–protein complexes
can then be probed for particular protein
modifications using antibodies
(immunoprecipitation). The DNA
associated with these proteins can be
released by reversing the cross-links and hybridised against
microarrays (ChIP on chip), or analysed
using the polymerase chain reaction
(PCR).5,101
DNA methylation can be identified
using a methylation-dependent restriction
enzyme (McrBC) that cuts methylated
DNA but leaves unmethylated DNA
untouched,102 or else by bisulphite
treatment,103 which converts unmethylated cytosine to
uracil and leaves methylated Cs
untouched. Both techniques in
combination with PCR and microarray-
based hybridisation can help detect
methylation states. Comparison of
genomic regions between wild type cells
and cells with DMTases knocked out can
also be used to identify regions that are
methylated.104
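The logic of the bisulphite readout can be made concrete with a short sketch. It assumes an idealised experiment (complete conversion, no sequencing errors, a read aligned base for base to the reference strand) and uses the fact that uracil is read as T after PCR; the function names and the example sequence are made up for illustration.

```python
def bisulphite_convert(seq, methylated_positions):
    """Idealised bisulphite treatment followed by PCR.

    Unmethylated C becomes U and is read as T; methylated C stays C.
    methylated_positions is a set of 0-based indices into seq.
    """
    return "".join(
        "T" if base == "C" and i not in methylated_positions else base
        for i, base in enumerate(seq)
    )


def call_methylation(reference, converted_read):
    """Return 0-based positions where a reference C is still a C in the read."""
    if len(reference) != len(converted_read):
        raise ValueError("reference and read must have the same length")
    return [
        i
        for i, (ref_base, read_base) in enumerate(zip(reference, converted_read))
        if ref_base == "C" and read_base == "C"
    ]


reference = "ACGTCGACCG"
read = bisulphite_convert(reference, methylated_positions={1, 8})
methylated = call_methylation(reference, read)
```

Here the read comes out as `ACGTTGATCG`, and comparing it with the reference recovers positions 1 and 8 as methylated. Real data would also need handling of incomplete conversion and of the opposite strand, which this sketch leaves out.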
In mammals, the occurrence of the
CpG dinucleotide is lower than expected,
probably because of the conversion of
methylated Cs into Ts, but there are
regions that show enrichment of CpGs
called CpG islands.23 CpG islands are
usually not methylated. About 60 per cent
of promoters in the mouse and human
genome contain CpG islands, signifying
their euchromatic state.105 CpG islands
have been cloned and sequenced106 and
these have been used to recognise CpG
islands computationally (using HMMs or
other pattern recognition techniques).107
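The depletion of CpG dinucleotides and their enrichment in islands is usually quantified as an observed/expected ratio. The sketch below computes that ratio and the GC content for a sequence and applies simple thresholds. The thresholds used as defaults (at least 200 bp, GC content of at least 50 per cent, observed/expected of at least 0.6) are commonly quoted values and are an assumption here rather than something stated in the text; stricter values are also in use.

```python
def cpg_stats(seq):
    """Return (GC content, observed/expected CpG ratio) for a DNA sequence.

    Expected CpG count is (#C * #G) / length, i.e. what independent
    base frequencies would give.
    """
    seq = seq.upper()
    n = len(seq)
    if n == 0:
        return 0.0, 0.0
    c_count = seq.count("C")
    g_count = seq.count("G")
    observed = seq.count("CG")  # "CG" cannot overlap itself, so count() is exact
    expected = c_count * g_count / n
    obs_exp = observed / expected if expected else 0.0
    return (c_count + g_count) / n, obs_exp


def looks_like_cpg_island(seq, min_len=200, min_gc=0.5, min_obs_exp=0.6):
    """Apply length, GC-content and observed/expected thresholds (assumed values)."""
    gc_content, obs_exp = cpg_stats(seq)
    return len(seq) >= min_len and gc_content >= min_gc and obs_exp >= min_obs_exp


island_like = "CG" * 100    # toy sequence, extremely CpG rich
cpg_depleted = "CAGT" * 50  # same length, 50% GC, but no CpG at all
```

With these toy inputs `looks_like_cpg_island(island_like)` is true, while `cpg_depleted` fails on the observed/expected ratio despite having the same length and sufficient GC content. A genome-wide scan would apply the same test to sliding windows.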
In maize and other plants, the genome
has a high repeat content, which is usually
methylated. Thus, using an Escherichia coli
host strain that harbours the McrBC system,
which cuts methylated DNA, methylated
DNA can be eliminated and reads
enriched in genes, called methyl-filtered
reads, can be obtained.102 The reads can be
computationally analysed to predict genes.
Bioinformatics challenges
• Evolution of DNA methylation. Since
DNA methylation is not present in
yeasts, but is present in other
organisms and there seems to be a
connection between methylation and
histone modification in other
organisms, it will be useful to
understand the evolution of the
proteins and the pathway.97,108–111
• Signals in genomic sequence for
methylation states. CpG islands are
an example of such signals, but there
might be other signals. It is known
that there are correlations between the
nature of the sequence and the
functional state of chromatin. For
example, tandem repeats and
transposons are silenced.
• Insulators. These are sequence
elements that isolate or insulate a
region of the chromatin from a
neighbouring region and perhaps
could block the spread of methylation.
For example, analysis of CpG islands
may yield binding motifs for factors
that block DNA methylation. The
zinc-finger protein CTCF
(CCCTC-binding factor) regulates the
genomic imprinting of the H19 and
Igf2 genes.23,112 CTCF binds the
differentially methylated domain
(DMD) on the maternally inherited
chromosome, preventing its
methylation and subsequently the
expression of Igf2.
• Better identification of CpG
islands.105 This problem was
suggested by Prof. Denise Barlow,
who has observed that certain
experimentally verified CpG islands in
mouse are not annotated in the mouse
genome viewer on Ensembl.113 There
have been some efforts directed
towards genome-wide CpG island
annotations, such as those on human
chromosomes 21 and 22.107,114 Current
computational techniques based on
HMMs or average sequence statistics
cannot identify short CpG islands107
and there might be a need for species-
specific CpG island predictors, which
in turn might point to differences in
the mechanisms by which such islands
are generated.
• Dynamics of epigenetic processes
and modelling. Heterochromatic
DNA is compact and silent but not
static. Transcription from
heterochromatic DNA does occur and
this transcription is important for
maintaining the silenced state.37 The
dynamics of this process are not well
understood, and resolving them is one of the
big challenges. There are experimental
challenges, but bioinformatics
challenges involve modelling the
dynamics and making predictions that
can direct biology. There is also histone
exchange going on, and it is important
to observe these processes in vivo and
measure rate constants in a quantitative
manner. There are systems available to
study these processes and these promise
to allow real-time visualisation, which
will then allow modelling.115
Modelling these systems would also
allow ranking these processes in terms
of relative importance in any pathway.
• Data collection and curation.
Several epigenomic projects are
underway;116–118 in addition an
exploration of the role of epigenetic
states, such as DNA methylation, in
cancer119 is being carried out.
Collecting these data in a central
repository and allowing intuitive
browsing of the data is an interesting
problem to solve. Having multiple
centres with diverse methods of
analysis and display would be the key
to solving this problem. Data on genes
and regions that are maternally or
paternally imprinted would also need
to be collected. This will allow
discerning patterns in a location-
specific, or gene/tissue-specific,
manner. Patterns of this nature will
not be obvious at first glance, but will
become clear when looked at in a
genome-wide manner. It is quite
possible that there are no universal
rules where the epigenome is
concerned, but rather different rules in
different regions of the genome.
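The HMM-based approach to CpG island identification mentioned in the list above can be illustrated with a deliberately small model. The sketch below is a two-state HMM (background and island) decoded with the Viterbi algorithm in log space. It emits single nucleotides rather than modelling dinucleotides, which is a simplification, and all probabilities are made-up illustrative values, not trained parameters.

```python
import math

STATES = ("background", "island")

# Illustrative parameters only (assumptions, not trained values).
START = {"background": 0.9, "island": 0.1}
TRANS = {
    "background": {"background": 0.99, "island": 0.01},
    "island": {"background": 0.01, "island": 0.99},
}
EMIT = {
    "background": {"A": 0.3, "C": 0.2, "G": 0.2, "T": 0.3},
    "island": {"A": 0.1, "C": 0.4, "G": 0.4, "T": 0.1},
}


def viterbi(seq, start=START, trans=TRANS, emit=EMIT):
    """Return the most probable state path for seq (log-space Viterbi)."""
    if not seq:
        return []
    # score[i][s]: best log probability of any path ending in state s at i
    score = [{s: math.log(start[s]) + math.log(emit[s][seq[0]]) for s in STATES}]
    back = [{}]
    for base in seq[1:]:
        row, pointers = {}, {}
        for s in STATES:
            best_prev = max(
                STATES, key=lambda p: score[-1][p] + math.log(trans[p][s])
            )
            row[s] = (score[-1][best_prev] + math.log(trans[best_prev][s])
                      + math.log(emit[s][base]))
            pointers[s] = best_prev
        score.append(row)
        back.append(pointers)
    # Trace back from the best final state.
    state = max(STATES, key=lambda s: score[-1][s])
    path = [state]
    for pointers in reversed(back[1:]):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


toy_sequence = "AT" * 20 + "CG" * 20 + "AT" * 20
path = viterbi(toy_sequence)
```

On the toy sequence the decoded path labels the central 40 bases as island and the flanks as background. The low switching probability is what makes short islands hard to call with such a model, which is one way to see the limitation on short CpG islands noted above.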
Acknowledgments
Susan Janicki discussed the manuscript critically,
edited it and pointed out a number of references.
Leemor Joshua-Tor and Jidong Liu kindly shared
figures from their papers and talks. Mike Ronemus
and Matt Vaughn gave critical comments and freely
shared their knowledge of the field. Dhruv Pant
helped with Figure 3. The anonymous reviewers
helped with several comments, corrections and
suggestions.
References
1. Meister, G. and Tuschl, T. (2004), ‘Mechanisms of gene silencing by double-stranded RNA’, Nature, Vol. 431(7006), pp. 343–349.
2. Bartel, D. P. (2004), ‘MicroRNAs: Genomics, biogenesis, mechanism, and function’, Cell, Vol. 116(2), pp. 281–297.
3. Reinhart, B. J., Slack, F. J., Basson, M. et al. (2000), ‘The 21-nucleotide let-7 RNA regulates developmental timing in Caenorhabditis elegans’, Nature, Vol. 403(6772), pp. 901–906.
4. Scott, R. J. and Spielman, M. (2004), ‘Epigenetics: Imprinting in plants and mammals – the same but different?’, Curr. Biol., Vol. 14(5), pp. 201–203.
5. Lippman, Z. and Martienssen, R. (2004), ‘The role of RNA interference in heterochromatic silencing’, Nature, Vol. 431(7006), pp. 364–370.
6. Aravind, L., Watanabe, H., Lipman, D. J. and Koonin, E. V. (2000), ‘Lineage-specific loss and divergence of functionally linked genes in eukaryotes’, Proc. Natl Acad. Sci. USA, Vol. 97(21), pp. 11319–11324.
7. Silva, J., Chang, K., Hannon, G. J. and Rivas, F. V. (2004), ‘RNA interference-based functional genomics in mammalian cells: Reverse genetics coming of age’, Oncogene, Vol. 23(51), pp. 8401–8409.
8. Mochizuki, K. and Gorovsky, M. A. (2005), ‘A Dicer-like protein in Tetrahymena has distinct functions in genome rearrangement, chromosome segregation, and meiotic prophase’, Genes Dev., Vol. 19(1), pp. 77–89.
9. Nykanen, A., Haley, B. and Zamore, P. D. (2001), ‘ATP requirements and small interfering RNA structure in the RNA interference pathway’, Cell, Vol. 107(3), pp. 309–321.
10. Sontheimer, E. J. (2005), ‘Assembly and function of RNA silencing complexes’, Nat. Rev. Mol. Cell Biol., Vol. 6(2), pp. 127–138.
11. Song, J. J., Smith, S. K., Hannon, G. J. and Joshua-Tor, L. (2004), ‘Crystal structure of Argonaute and its implications for RISC slicer activity’, Science, Vol. 305(5689), pp. 1434–1437.
12. Liu, J., Carmell, M. A., Rivas, F. V. et al. (2004), ‘Argonaute2 is the catalytic engine of mammalian RNAi’, Science, Vol. 305(5689), pp. 1437–1441.
13. Ambros, V. (2004), ‘The functions of animal microRNAs’, Nature, Vol. 431(7006), pp. 350–355.
14. Dugas, D. V. and Bartel, B. (2004), ‘MicroRNA regulation of gene expression in plants’, Curr. Opin. Plant Biol., Vol. 7(5), pp. 512–520.
15. Pfeffer, S., Zavolan, M., Grasser, F. A. et al. (2004), ‘Identification of virus-encoded microRNAs’, Science, Vol. 304(5671), pp. 734–736.
16. Yekta, S., Shih, I. H. and Bartel, B. P. (2004), ‘MicroRNA-directed cleavage of HOXB8 mRNA’, Science, Vol. 304(5670), pp. 594–596.
17. Llave, C., Xie, Z., Kasschau, K. D. and Carrington, J. C. (2002), ‘Cleavage of scarecrow-like mRNA targets directed by a class of Arabidopsis miRNA’, Science, Vol. 297(5589), pp. 2053–2056.
18. Alberts, B., Bray, D., Lewis, J. et al. (2002), ‘Molecular Biology of the Cell’, 4th edn, Garland Publishing, New York.
19. Jenuwein, T. and Allis, C. D. (2001), ‘Translating the histone code’, Science, Vol. 293(5532), pp. 1074–1080.
20. Grewal, S. I. and Rice, J. C. (2004), ‘Regulation of heterochromatin by histone methylation and small RNAs’, Curr. Opin. Cell Biol., Vol. 16(3), pp. 230–238.
21. Verdel, A., Jia, S., Gerber, S. et al. (2004), ‘RNAi-mediated targeting of heterochromatin by the RITS complex’, Science, Vol. 303(5658), pp. 672–676.
22. Peterson, C. L. and Laniel, M. A. (2004), ‘Histones and histone modifications’, Curr. Biol., Vol. 14(14), pp. R546–551.
23. Bird, A. (2002), ‘DNA methylation patterns and epigenetic memory’, Genes Dev., Vol. 16(1), pp. 6–21.
24. Kawasaki, H. and Taira, K. (2004), ‘Induction of DNA methylation and gene silencing by short interfering RNAs in human cells’, Nature, Vol. 431(7005), pp. 211–217.
25. Morris, K. V., Chan, S. W., Jacobsen, S. E. and Looney, D. J. (2004), ‘Small interfering RNA-induced transcriptional gene silencing in human cells’, Science, Vol. 305(5688), pp. 1289–1292.
26. Gil, J. and Esteban, M. (2000), ‘Induction of apoptosis by the dsRNA-dependent protein kinase (PKR): Mechanism of action’, Apoptosis, Vol. 5(2), pp. 107–114.
27. Manche, L., Green, S. R., Schmedt, C. and Mathews, M. B. (1992), ‘Interactions between double-stranded RNA regulators and the protein kinase DAI’, Mol. Cell Biol., Vol. 12(11), pp. 5238–5248.
28. Elbashir, S. M., Harborth, J., Lendeckel, W. et al. (2001), ‘Duplexes of 21-nucleotide RNAs mediate RNA interference in cultured mammalian cells’, Nature, Vol. 411(6836), pp. 494–498.
29. Paddison, P. J., Caudy, A. A., Bernstein, E. et al. (2002), ‘Short hairpin RNAs (shRNAs) induce sequence-specific silencing in mammalian cells’, Genes Dev., Vol. 16(8), pp. 948–958.
30. Siolas, D., Lerner, C., Burchard, J. et al. (2005), ‘Synthetic shRNAs as potent RNAi triggers’, Nat. Biotechnol., Vol. 23(2), pp. 227–231.
31. Kim, D. H., Behlke, M. A., Rose, S. D. et al. (2005), ‘Synthetic dsRNA Dicer substrates enhance RNAi potency and efficacy’, Nat. Biotechnol., Vol. 23, pp. 222–226.
32. Hannon, G. J. and Rossi, J. J. (2004), ‘Unlocking the potential of the human genome with RNA interference’, Nature, Vol. 431(7006), pp. 371–378.
33. Lee, N. S. and Rossi, J. J. (2004), ‘Control of HIV-1 replication by RNA interference’, Virus Res., Vol. 102(1), pp. 53–58.
34. Gitlin, L., Stone, J. K. and Andino, R. (2005), ‘Poliovirus escape from RNA interference: Short interfering RNA target recognition and implications for therapeutic approaches’, J. Virol., Vol. 79(2), pp. 1027–1035.
35. Acuity Pharmaceuticals, ‘siRNA use for AMD treatment’ (URL: http://www.acuitypharma.com).
36. Tabara, H., Sarkissian, M., Kelly, W. G. et al. (1999), ‘The rde-1 gene, RNA interference, and transposon silencing in C. elegans’, Cell, Vol. 99(2), pp. 123–132.
37. Volpe, T. A., Kidner, C., Hall, I. M. et al. (2002), ‘Regulation of heterochromatic silencing and histone H3 lysine 9 methylation by RNAi’, Science, Vol. 297(5588), pp. 1833–1837.
38. Carmell, M. A., Xuan, Z., Zhang, M. Q. and Hannon, G. J. (2002), ‘The Argonaute family: Tentacles that reach into RNAi, developmental control, stem cell maintenance, and tumorigenesis’, Genes Dev., Vol. 16(21), pp. 2733–2742.
39. Silverman, E., Edwalds, G. G. and Lin, R. J. (2003), ‘DExD/H-box proteins and their partners: Helping RNA helicases unwind’, Gene, Vol. 312, pp. 1–16.
40. Lamontagne, B., Larose, S., Boulanger, J. and Elela, S. A. (2001), ‘The RNase III family: A conserved structure and expanding functions in eukaryotic dsRNA metabolism’, Curr. Issues Mol. Biol., Vol. 3(4), pp. 71–78.
41. Tian, B., Bevilacqua, P. C., Diegelman-Parente, A. and Mathews, M. B. (2004), ‘The double-stranded RNA-binding motif: Interference and much more’, Nat. Rev. Mol. Cell Biol., Vol. 5(12), pp. 1013–1023.
42. Eddy, S. R., ‘Hmmer: Open source HMM tools’ (URL: http://hmmer.wustl.edu).
43. Felsenstein, J., ‘Phylip website’ (URL: http://evolution.genetics.washington.edu/phylip.html).
44. Pasquinelli, A. E., Reinhart, B. J., Slack, F. et al. (2000), ‘Conservation of the sequence and temporal expression of let-7 heterochronic regulatory RNA’, Nature, Vol. 408(6808), pp. 86–89.
45. Sempere, L. F., Freemantle, S., Pitha-Rowe, I. et al. (2004), ‘Expression profiling of mammalian microRNAs uncovers a subset of brain-expressed microRNAs with possible roles in murine and human neuronal differentiation’, Genome Biol., Vol. 5(3), p. R13.
46. Miska, E. A., Alvarez-Saavedra, E., Townsend, M. et al. (2004), ‘Microarray analysis of microRNA expression in the developing mammalian brain’, Genome Biol., Vol. 5(9), p. 68.
47. Cullen, B. R. (2004), ‘Transcription and processing of human microRNA precursors’, Mol. Cell, Vol. 16(6), pp. 861–865.
48. Pasquinelli, A. E. and Ruvkun, G. (2002), ‘Control of developmental timing by microRNAs and their targets’, Annu. Rev. Cell Dev. Biol., Vol. 18, pp. 495–513.
49. Griffiths-Jones, S. (2004), ‘The microRNA Registry’, Nucleic Acids Res., Vol. 32, pp. D109–111.
50. miRNA Registry (URL: http://www.sanger.ac.uk/Software/Rfam/mirna/index.shtml).
51. Rodriguez, A., Griffiths-Jones, S., Ashurst, J. L. and Bradley, A. (2004), ‘Identification of mammalian microRNA host genes and transcription units’, Genome Res., Vol. 14(10A), pp. 1902–1910.
52. Ying, S. Y. and Lin, S. L. (2005), ‘Intronic microRNAs’, Biochem. Biophys. Res. Commun., Vol. 326(3), pp. 515–520.
53. Lee, Y., Kim, M., Han, J. et al. (2004), ‘MicroRNA genes are transcribed by RNA polymerase II’, EMBO J., Vol. 23(20), pp. 4051–4060.
54. Ohler, U., Yekta, S., Lim, L. P. et al. (2004), ‘Patterns of flanking sequence conservation and a characteristic upstream motif for microRNA gene identification’, RNA, Vol. 10(9), pp. 1309–1322.
55. Bonnet, E., Wuyts, J., Rouze, P. and de Peer, Y. V. (2004), ‘Evidence that microRNA precursors, unlike other non-coding RNAs, have lower folding free energies than random sequences’, Bioinformatics, Vol. 20(17), pp. 2911–2917.
56. Bonnet, E., Wuyts, J., Rouze, P. and de Peer, Y. V. (2004), ‘Detection of 91 potential conserved plant microRNAs in Arabidopsis thaliana and Oryza sativa identifies important target genes’, Proc. Natl Acad. Sci. USA, Vol. 101(31), pp. 11511–11516.
57. Mawson, M., Lim, L. P. and Lau, N. (2003), MiRscan website (URL: http://genes.mit.edu/mirscan/).
58. Tomancak, P. (2003), miRseeker website (URL: http://toy.lbl.gov:9050/cgi-bin/miRseeker.pl/).
59. Vella, M. C., Reinert, K. and Slack, F. J. (2004), ‘Architecture of a validated microRNA::target interaction’, Chem. Biol., Vol. 11(12), pp. 1619–1623.
60. Doench, J. G. and Sharp, P. A. (2004), ‘Specificity of microRNA target selection in translational repression’, Genes Dev., Vol. 18(5), pp. 504–511.
61. Hobert, O. (2004), ‘Common logic of transcription factor and microRNA action’, Trends Biochem. Sci., Vol. 29(9), pp. 462–468.
62. Xie, X., Lu, J., Kulbokas, E. J. et al. (2005), ‘Systematic discovery of regulatory motifs in human promoters and 3′ UTRs by comparison of several mammals’, Nature, Vol. 434(7031), pp. 338–345.
63. Rhoades, M. W., Reinhart, B. J., Lim, L. P. et al. (2002), ‘Prediction of plant microRNA targets’, Cell, Vol. 110(4), pp. 513–520.
64. John, B., Enright, A. J., Aravin, A. et al. (2004), ‘Human microRNA targets’, PLoS Biol., Vol. 2(11), p. e363.
65. Lewis, B. P., Burge, C. B. and Bartel, D. P. (2005), ‘Conserved seed pairing, often flanked by adenosines, indicates that thousands of human genes are microRNA targets’, Cell, Vol. 120(1), pp. 15–20.
66. Lai, E. C., Wiel, C. and Rubin, G. M. (2004), ‘Complementary miRNA pairs suggest a regulatory role for miRNA:miRNA duplexes’, RNA, Vol. 10(2), pp. 171–175.
67. Rehmsmeier, M., Steffen, P., Hochsmann, M. and Giegerich, R., ‘RNA hybrid webserver’ (URL: http://bibiserv.techfak.uni-bielefeld.de/rnahybrid/).
68. MSKC Center, miRanda website (URL: http://www.microrna.org).
69. Rehmsmeier, M., Steffen, P., Hochsmann, M. and Giegerich, R. (2004), ‘Fast and effective prediction of microRNA/target duplexes’, RNA, Vol. 10(10), pp. 1507–1517.
70. Rajewsky, N. and Socci, N. D. (2004), ‘Computational identification of microRNA targets’, Dev. Biol., Vol. 267(2), pp. 529–535.
71. Kiriakidou, M., Nelson, P. T., Kouranov, A. et al. (2004), ‘A combined computational-experimental approach predicts human microRNA targets’, Genes Dev., Vol. 18(10), pp. 1165–1178.
72. Babak, T., Zhang, W., Morris, Q. et al. (2004), ‘Probing microRNAs with microarrays: Tissue specificity and functional inference’, RNA, Vol. 10(11), pp. 1813–1819.
73. Mansfield, J. H., Harfe, B. D., Nissen, R. et al. (2004), ‘MicroRNA-responsive “sensor” transgenes uncover Hox-like and other developmentally regulated patterns of vertebrate microRNA expression’, Nat. Genet., Vol. 36(10), pp. 1079–1083.
74. Gardner, T. S., di Bernardo, D., Lorenz, D. and Collins, J. J. (2003), ‘Inferring genetic networks and identifying compound mode of action via expression profiling’, Science, Vol. 301(5629), pp. 102–105.
75. Bennasser, Y., Le, S. Y., Yeung, M. L. and Jeang, K. T. (2004), ‘HIV-1 encoded candidate micro-RNAs and their cellular targets’, Retrovirology, Vol. 1(1), p. 43.
76. Liu, Y., Mochizuki, K. and Gorovsky, M. A. (2004), ‘Histone H3 lysine 9 methylation is required for DNA elimination in developing macronuclei in Tetrahymena’, Proc. Natl Acad. Sci. USA, Vol. 101(6), pp. 1679–1684.
77. Bao, N., Lye, K. W. and Barton, M. K. (2004), ‘MicroRNA binding sites in Arabidopsis class III HD-ZIP mRNAs are required for methylation of the template chromosome’, Dev. Cell, Vol. 7(5), pp. 653–662.
78. Khvorova, A., Reynolds, A. and Jayasena, S. D. (2003), ‘Functional siRNAs and miRNAs exhibit strand bias’, Cell, Vol. 115(2), pp. 209–216.
79. Schwarz, D. S., Hutvagner, G., Du, T. et al. (2003), ‘Asymmetry in the assembly of the RNAi enzyme complex’, Cell, Vol. 115(2), pp. 199–208.
80. Silva, J. M., Sachidanandam, R. and Hannon, G. J. (2003), ‘Free energy lights the path toward more effective RNAi’, Nat. Genet., Vol. 35(4), pp. 303–305.
81. Jackson, A. L., Bartz, S. R., Schelter, J. et al. (2003), ‘Expression profiling reveals off-target gene regulation by RNAi’, Nat. Biotechnol., Vol. 21(6), pp. 635–637.
82. Jackson, A. L. and Linsley, P. S. (2004), ‘Noise amidst the silence: Off-target effects of siRNAs?’, Trends Genet., Vol. 20(11), pp. 521–524.
83. Ambion Company website (URL: http://www.ambion.com/catalog/supp/silencer_lib.html).
84. Dharmacon Company website (URL: http://www.dharmacon.com).
85. Lee, J., Olson, A. and Sachidanandam, R., ‘A website to design siRNAs’ (URL: http://katahdin.cshl.org:9331/siRNA/RNAi.cgi?type=siRNA).
86. Dharmacon, siDesign Center (URL: http://design.dharmacon.com/rnadesign/default.aspx?sid=731615029/).
87. Ashrafi, K., Chang, F. Y., Watts, J. L. et al. (2003), ‘Genome-wide RNAi analysis of Caenorhabditis elegans fat regulatory genes’, Nature, Vol. 421(6920), pp. 268–272.
88. Lee, S. S., Lee, R. Y., Fraser, A. G. et al. (2003), ‘A systematic RNAi screen identifies a critical role for mitochondria in C. elegans longevity’, Nat. Genet., Vol. 33(1), pp. 40–48.
89. Yiu, S. M., Wong, P. W., Lam, T. W. et al. (2005), ‘Filtering of ineffective siRNAs and improved siRNA design tool’, Bioinformatics, Vol. 21(2), pp. 144–151.
90. Yamada, T. and Morishita, S. (2005), ‘Accelerated off-target search algorithm for siRNA’, Bioinformatics, Vol. 21(8), pp. 1316–1324.
91. Paddison, P. J., Silva, J. M., Conklin, D. S. et al. (2004), ‘A resource for large-scale RNAi-based screens in mammals’, Nature, Vol. 428(6981), pp. 427–431.
92. Berns, K., Hijmans, E. M., Mullenders, J. et al. (2004), ‘A large-scale RNAi screen in human cells identifies new components of the p53 pathway’, Nature, Vol. 428(6981), pp. 431–437.
93. Olson, A., Sheth, N., Lee, J. et al., ‘A website to access and manage shRNA resources’ (URL: http://codex.cshl.org).
94. Epigenetics (URL: http://en.wikipedia.org/wiki/Epigenetics/).
95. da Rocha, S. T. and Ferguson-Smith, A. C. (2004), ‘Genomic imprinting’, Curr. Biol., Vol. 14(16), pp. R646–649.
96. Delaval, K. and Feil, R. (2004), ‘Epigenetic regulation of mammalian genomic imprinting’, Curr. Opin. Genet. Dev., Vol. 14(2), pp. 188–195.
97. Tweedie, S., Charlton, J., Clark, V. and Bird, A. P. (1997), ‘Methylation of genomes and genes at the invertebrate–vertebrate boundary’, Mol. Cell Biol., Vol. 17(3), pp. 1469–1475.
98. Nan, X., Ng, H. H., Johnson, C. A. et al. (1998), ‘Transcriptional repression by the methyl-CpG-binding protein MeCP2 involves a histone deacetylase complex’, Nature, Vol. 393(6683), pp. 386–389.
99. McNairn, A. J. and Gilbert, D. M. (2003), ‘Epigenomic replication: Linking epigenetics to DNA replication’, Bioessays, Vol. 25(7), pp. 647–656.
100. Das, P. M., Ramachandran, K., vanWert, J. and Singal, R. (2004), ‘Chromatin immunoprecipitation assay’, Biotechniques, Vol. 37(6), pp. 961–969.
101. Robyr, D., Suka, Y., Xenarios, I. et al. (2002), ‘Microarray deacetylation maps determine genome-wide functions for yeast histone deacetylases’, Cell, Vol. 109(4), pp. 437–446.
102. Rabinowicz, P. D. (2003), ‘Constructing gene-enriched plant genomic libraries using methylation filtration technology’, Methods Mol. Biol., Vol. 236, pp. 21–36.
103. Yang, A. S., Estecio, M. R. H., Doshi, K. et al. (2004), ‘A simple method for estimating global DNA methylation using bisulfite PCR of repetitive DNA elements’, Nucleic Acids Res., Vol. 32(3), p. e38.
104. Hattori, N., Abe, T., Hattori, N. et al. (2004), ‘Preference of DNA methyltransferases for CpG islands in mouse embryonic stem cells’, Genome Res., Vol. 14(9), pp. 1733–1740.
105. Antequera, F. (2003), ‘Structure, function and evolution of CpG island promoters’, Cell Mol. Life Sci., Vol. 60(8), pp. 1647–1658.
106. Cross, S. H., Charlton, J. A., Nan, X. and Bird, A. P. (1994), ‘Purification of CpG islands using a methylated DNA binding column’, Nat. Genet., Vol. 6(3), pp. 236–244.
107. Takai, D. and Jones, P. A. (2002), ‘Comprehensive analysis of CpG islands in human chromosomes 21 and 22’, Proc. Natl Acad. Sci. USA, Vol. 99(6), pp. 3740–3745.
108. Bird, A. P. (1995), ‘Gene number, noise reduction and biological complexity’, Trends Genet., Vol. 11(3), pp. 94–100.
109. Hendrich, B. and Tweedie, S. (2003), ‘The methyl-CpG binding domain and the evolving role of DNA methylation in animals’, Trends Genet., Vol. 19(5), pp. 269–277.
110. Bestor, T. H. (1990), ‘DNA methylation: Evolution of a bacterial immune function into a regulator of gene expression and genome structure in higher eukaryotes’, Philos. Trans R. Soc. London B Biol. Sci., Vol. 326(1235), pp. 179–187.
111. Colot, V. and Rossignol, J. L. (1999), ‘Eukaryotic DNA methylation as an evolutionary device’, Bioessays, Vol. 21(5), pp. 402–411.
112. Lewis, A. and Murrell, A. (2004), ‘Genomic imprinting: CTCF protects the boundaries’, Curr. Biol., Vol. 14(7), pp. 284–286.
113. Barlow, D. (2005), ‘CpG islands not well annotated in the mouse genome’, Private Communication, January.
114. Cross, S. H., Clark, V. H., Simmen, M. W. et al. (2000), ‘CpG island libraries from human chromosomes 18 and 22: Landmarks for novel genes’, Mamm. Genome, Vol. 11(5), pp. 373–383.
115. Janicki, S. M., Tsukamoto, T., Salghetti, S. E. et al. (2004), ‘From silencing to gene expression: Real-time analysis in single cells’, Cell, Vol. 116, pp. 683–698.
116. Eckhardt, F., Beck, S., Gut, I. G. and Berlin, K. (2004), ‘Future potential of the Human Epigenome Project’, Expert Rev. Mol. Diagn., Vol. 4(5), pp. 609–618.
117. Novik, K. L., Nimmrich, I., Genc, B. et al. (2002), ‘Epigenomics: Genome-wide study of methylation phenomena’, Curr. Issues Mol. Biol., Vol. 4(4), pp. 111–128.
118. Bradbury, J. (2003), ‘Human epigenome project – up and running’, PLoS Biol., Vol. 1(3), p. e82.
119. Yang, H. H. and Lee, M. P. (2004), ‘Application of bioinformatics in cancer epigenetics’, Ann. New York Acad. Sci., Vol. 1020, pp. 67–76.