functional and structural genomics study of singapore ... · as early as 1898, friedrich loeffler...
TRANSCRIPT
FUNCTIONAL AND STRUCTURAL GENOMICS
STUDY OF SINGAPORE GROUPER IRIDOVIRUS
CHEN LI MING
(B.SC., XIAMEN UNIVERSITY)
A THESIS SUBMITTED FOR
THE DEGREE OF DOCTOR OF PHILOSOPHY
DEPARTMENT OF BIOLOGICAL SCIENCES
NATIONAL UNIVERSITY OF SINGAPORE
2008
I
Acknowledgements
First of all, I would like to thank my supervisor, Professor Hew Choy Leong,
for giving me the opportunity to pursue my PhD study and for his guidance,
and mentorship.
Second, I would like to thank Dr Lance Miller for the discussion on the chip
set-up. I also appreciate Dr Jin Hua Han for helpful discussions on the data
analysis. I also appreciate Dr. Jayaraman Sivaraman for his advice on
crystallization, Dr. Song Jianxing and Dr. Yang Daiwen for their advices on
NMR. I thank Dr. Lin Qingsong for his advice on iTRAQ and proteome related
works. I thank Dr. Gong Zhiyuan for his advice on transgenetic fish studies.
Third, I appreciate the Genomic Institute of Singapore for the provision of the
facilities for DNA microarray and real-time RT-PCR work. I am grateful for Kun
Yan's assistance with chip spotting. I thank Dr Zhen Jun Li for her advice on
the molecular biological techniques. I thank Dr Yan Liu for his useful
suggestions and kind helps on the structural study of my project. I thank Mr.
Wang Fan for his suggestions on my project.
Finally, I would like to thank my other lab mates: Dr. Song Wenjun, Dr. Wu
Jinlu, Dr. Tang Xuhua, Ms Chen Jing, and Ms Tran Bich Ngoc for their
valuable discussion and friendship.
II
Summary
Singapore grouper iridovirus (SGIV), an iridovirus in the genus Ranavirus, is a major
pathogen that results in significant economic losses in grouper aquaculture. In this
thesis, first, we report the temporal and differential gene expression of SGIV using
SGIV viral DNA microarray; Second, we report the first proteomics study of grouper
embryonic cells (GEC) infected by Singapore grouper iridovirus (SGIV) to take an
insight into the interaction of SGIV and its host cell at proteome scale by iTRAQ;
third, we report that a novel viral coding protein ORF158L is involved in the
regulation of host histone H3 K79 methylation; finally, we report the structural study
of ORF158L. Our work provides important insights into the pathogenesis of
iridoviruses.
III
Table of Contents
Acknowledgements………………………………………………… …..I
Summary………………………………………………………………………II
Table of Contents………………………………………………….…………III
List of Tables………………………………………………………………….VIII
List of Figures………………………………………………………………....IX
Chapter 1 Introduction & Literature Review……………………………..1
1.1 Overview……………………………………………………………………..2
1.2 Introduction of Iridovirus…………………………………………………..2
1.3 Introduction of Singapore grouper iridovirus (SGIV)…………………....2
1.4 Transcriptional regulation & Replication cycle of the iridovirus………..3
1.5 Functional genomics………………………………………………………. .5
1.5.1 Introduction of Functional genomics………………………………...5
1.5.2 Gene expression profile and differential gene expression of the
iridovirus……………………………………………………………………..5
1.5.3 Functional genomic studies at proteomic scale by using iTRAQ…..7
1.6 Structural genomics………………………………………………………...10
1.6.1 Introduction of Structural genomics………………………………...10
1.6.2 NMR spectroscopy………………………………………………….…10
1.6.3 X-Ray crystallography……………………………………………......11
1.7 Scope of thesis……………………………………………………………….12
IV
Chapter 2 An investigation of temporal and differential gene expression
of Singapore grouper iridovirus by DNA microarray……………….….16
2.1 Summary………………………………………………………………….....17
2.2 Introduction………………………………………………………………....18
2.3 Materials and Methods……………………………………………………..20
2.3.1 Cell lines……………………………………………………………….20
2.3.2 Positive controls for the SGIV DNA microarray…………………....20
2.3.3 Preparation of amplicons for the SGIV DNA microarray………….20
2.3.4 Virus infection and CHX and aphidicoline treatments………….…21
2.3.5 Total RNA preparation, reverse transcription and labeling….…...21
2.3.6 Real-time PCR………………………………………………….……..22
2.4 Results…………………………………………………………………….….23
2.4.1 Viral microarray for grouper iridovirus………………………….…23
2.4.2 Temporal gene-expression analysis of the SGIV genome……..……23
2.4.3 SGIV viral gene expression with different concentrations of
CHX…………………………………………………………………….……24
2.4.4 SGIV viral gene expression with aphidicoline treatment…….…….24
2.5 Discussions …………………………………………………………….……27
Chapter 3 iTRAQ study of Grouper embryonic cells infected by
Singapore grouper Iridovirus: an insight into the Singapore grouper
iridovirus and host cell interaction……………………………….…....45
3.1 Summary……………………………………………………………….……46
3.2 Introduction………………………………………………………………....47
3.3 Materials and Methods……………………………………………………..47
3.3.1 Cell and virus infection……………………………………………….47
V
3.3.2 iTRAQ Labeling and Two Dimensional (2D) LC-MALDI MS….…47
3.3.3 RT PCR………………………………………………………….….….49
3.3.4 Western blotting………………………………………………….……49
3.4 Results………………………………………………………………….….…50
3.4.1 Identification of viral proteins by using iTRAQ………………….…50
3.4.2 Identification of differential expression host proteins by using
iTRAQ………………………………………………………………….……50
3.4.3 RT-PCR and western blot analysis of the viral proteins…………...51
3.4.4 Up regulation of host Histone H3 Lysine 79 (K79) methylation upon
SGIV infection……………………………………………………….……..52
3.5 Discussions ……………………………………………………………...….53
Chapter 4 A novel SGIV coding protein ORF158L is involved in the
regulation of K79 methylation of histione H3……………………....…81
4.1 Summary……………………………………………………………….……82
4.2 Introduction…………………………………………………………….……82
4.3 Materials and Methods………………………………………………….…..82
4.3.1 Cell and virus infection……………………………………….…….…82
4.3.2 Antibodies……………………………………………………….……..82
4.3.3 Knockdown of ORF158L……………………………………….……..83
4.3.4 Electron Microscopy……………………………………………….….83
4.3.5 DNA Microarray and Real Time RT PCR…………………….…….83
4.3.6 Immunofluorescence …………………………………………….……83
4.4 Results……………………………………………………………………85
4.4.1 Western blot against ORF158L………………………………………85
4.4.2 Knockdown of ORF158L……………………………………………..85
VI
4.4.3 DNA Microarray and Real time RT-PCR investigation of
transcriptsomes of viral genes when ORF158L was knocked down….….85
4.4.4 Subcellular localization of ORF158L and ORF158L & Histone H3
colocalization………………………………………………………………...86
4.4.5 iTRAQ study the effect of knockdown ORF158L at proteome
scale..................................................................................................................86
4.4.6 ORF158L involved in Histone H3 lysine 79 (K79) methylation
regulation…………………………………………………………………….86
4.5 Discussions…………………………………………………………………...88
Chapter 5 Structural study of ORF158L…………………………….97
5.1 Summary & Introduction……………………………………………………98
5.2 Materials and Methods………………………………………………………99
5.2.1 Construction of the expression plasmid…………………………….…99
5.2.2 Expression and purification of ORF158L…………………………..…99
5.2.3 Mass spectrometry analysis………………………………………….…99
5.2.4 Dynamic light scattering (DLS) study……………………………….…99
5.2.5 Circular dichroism (CD) study…………………………………………99
5.2.6 NMR sample preparation……………………………………………...100
5.2.7 NMR Experiments and data process. …………………………...........100
5.2.8 SeMet ORF158L preparation………………………………………….100
5.2.9 Crystallization …………………….……………………………………100
5.2.10 Data collection, structure solution and refinement..…………...........101
5.2.11 Expression and purification of histone H3 and H4 complex…….…102
5.2.11 Surface Plasmon Resonance (SPR)……………………………...........102
5.3 Results…………………………………………………………………………103
VII
5.3.1 Protein purification profiles of ORF158L ……………………………103
5.3.2 Mass spectrometry analysis……………………………………………103
5.3.3 Dynamic light scattering (DLS) study…………………………………103
5.3.4 Circular dichroism study…………………………………………….…103
5.3.5 1D NMR……………………………………………………………....…104
5.3.6 1H-15N HSQC study of ORF158L………………………………..........104
5.3.7 Overall Structure of ORF158L…………………………………………105
5.3.8 Sequence and structural homology….…………………………………106
5.3.9 Putative Histone binding region….…………………………………….106
Chapter 6 Achievements & Future experiments…………………...119
6.1 Achievements……………………………………………………………….…120
6.2 Future experiments……………………………………………………………120
References………………………………………………………………………….122
VIII
List of Tables
Table 2.1. Kinetic class of SGIV ORF expression. …………………………...31
Table 2.2 SGIV primers………………………………………………...............36
Table 2.3. Partial cDNA sequences of β-actin and GAPDH…………….........39
Table 2.4. Primers for Real time PCR………………………………………....40
Table 2.5. SGIV genes with temporal expression on array………………..…41
Table 3.1. 38 viral proteins were consistent with those that were reported…60
Table 3.2. 11 viral proteins were newly identified and first reported………...62
Table 3.3. 12 host proteins were up-regulated during SGIV infection………..63
Table 3.4. 5 host proteins were down-regulated during SGIV infection…..…..64
Table 3.5. GEC host proteins identified from iTRAQ analysis………………..65
Table 3.6 Primers for the 11 newly identified viral proteins…………………..80
Table 5.1 Data collection and refinement statistics of ORF158L……………..118
IX
List of Figures
Figure 1.1 The diagram of iridovirus replication cycle……………………....13
Figure 1.2 The chemical structure of iTRAQ reagent……………………..…14
Figure 1.3 Workflows of iTRAQ experiments………………………………...15
Figure 2.1. Hierarchical clustering gene tree of SGIV temporal gene expression
data……………………………………………………………………………….32
Figure 2.2. Validation DNA microarray result with real time RT-PCR…….33
Figure 2.3. Effect of CHX treatment on SGIV gene expression…………….34
Figure 2.4. SGIV gene expression profiles with aphidicoline treatment……35
Figure 3.1 Full length amplification of 11 novel genes of SGIV via RT-
PCR………………………………………………………………………………57
Figure 3.2 Presence of SGIV proteins expressed by ORF018R, 026R, 093L and 2
newly identified proteins encoded by ORF135L and 140R…………………..58
Figure 3.3 Histone H3 K79 methylation analysis of SGIV-infected GECs
compared to mock SGIV-infected GECs by western blot……………………59
Figure 4.1 Identification and knockdown of ORF158L in cell culture………90
Figure 4.2 Comparison of transcriptional profiles of SGIV with and without
ORF158L knock down…………………………………………………………..91
Figure 4.3 Subcellular localization of ORF158L………………………………92
X
Figure 4.4. The co localization study between ORF158L and histone H3…...93
Figure 4.5. iTRAQ study of ORF158L knockdown effects to histone H3……94
Figure 4.6. Western blot validation of histone H3 modification………………95
Figure 4.7. iTRAQ study of ORF158L knockdown effects in SGIV-infected
GECs………………………………………………………………………………96
Figure 5.1 Purification of ORF158L…………………………………………..107
Figure 5.2 Molecular weight determination by MS……………………….…108
Figure 5.3 DLS result of ORF158L……………………………………….…..109
Figure 5.4 CD results of ORF158L…………………………………………....109
Figure 5.5 Characterization of protein structures by one-dimensional NMR
spectroscopy…………………………………………………………………….111
Figure 5.6 Characterization of protein structures by two-dimensional NMR
spectroscopy……………………………………………………….……………112
Figure 5.7 Crystal picture of SeMET ORF158L………………….………….113
Figure 5.8 Purification of histone H3 & H4 complex………………….…….114
Figure 5.8 Structure of ORF158L and models of ORF158L and histones
complex …………………………………..…………………………….………115
Figure 5.9 Structure comparison and functional implication………….…....116
Figure 5.10 SPR study of ORF158L binding to histone H3 & H4 complex..117
2
1.1 Overview
As early as 1898, Friedrich Loeffler and Paul Frosch raised the first clue on the nature
of virus by demonstrating that the cause of the foot-and-mouth disease in livestock
was an infectious particle smaller than any known bacteria. Later on, people found
that viruses had protein coats named virus capsids and genomic contents inside the
virus. The materials for storage of the genomic information of the virus could be
either DNA or RNA. Based on the materials of the genomic content of the viruses, the
viruses can be classified into two major groups, i.e. DNA viruses and RNA viruses.
Viruses are further classified into single-stranded DNA (ssDNA), single-stranded
RNA (ssRNA) viruses and double-stranded DNA (dsDNA) viruses. The dsDNA
viruses are further classified into more than 20 families.
1.2 Introduction of Iridovirus
The Iridoviridae Family is one member of the DNA virus families. Iridoviruses are
large cytoplasmic DNA viruses and infect either insects or vertebrates. In 1954, the
first iridovirus was discovered by Smith and Xeros (Smith and Xeros, 1954). To date,
more than 100 iridoviruses belonging to the four genera of the iridovirus family have
been isolated. The four genera of the iridovirus family are Iridovirus, Chloriridovirus
and, Lymphocystivirus and Ranavirus (Chinchar et al., 2005).
1.3 Introduction of Singapore grouper iridovirus (SGIV)
In 1994, a novel member of Ranavirus, Singapore grouper iridovirus (SGIV), causing
significant economic losses in Singapore marine net cage farm was reported (Chua et
al., 1994). In 1998, SGIV was isolated from brown groupers (Qin et al., 2001). In
2004, the genomic DNA of SGIV was sequenced and 162 open reading frames (ORFs)
were predicted based on genomic sequence (Song et al., 2004). Until 2006, a total of
3
61 SGIV structural proteins had been identified by using proteomics methods (Song et
al., 2004; Song et al., 2006).
1.4 Transcriptional regulation and replication cycle of iridovirus
Transcriptional regulation of viral gene expression is one of the important subjects in
viral research. In 2001, the gene expression of Chilo iridescent virus (CIV), was
investigated and 137 viral transcripts were detected (D’Costa et al., 2001). Two years
later, Costa and coworkers further investigated the transcriptional mapping of the CIV
and found that 90 percent of the CIV genome encoding genes were transcripted
(D’Costa et al., 2004 ). The transcriptional program of CIV was studied by Costa and
coworkers using traditional methods, such as Northern blot (D’Costa et al., 2001;
D’Costa et al., 2004 ). Compared to Northern blot, the DNA microarray technology is
a newly emerging high through put methodology. DNA microarray can be used to
investigate hundreds, even thousands of genes simultaneously. Therefore, DNA
microarray can provide a better understanding of viral transcriptional programs. DNA
microarray has been widely used in viral gene transcriptional studies in the past
decade. Recently, in 2005, Lua and coworkers applied DNA microarray technology to
reveal the transcriptional programs of red sea beam iridovirus (RSIV) by monitoring
92 putative ORFs simultaneously (Lua et al., 2005 ). In 2007, similar work was
reported on the transcriptional profile of RSIV using DNA microarray and found that
97-99% of the RSIV ORFs were expressed (Lua et al., 2007). However, the gene
expression and transcriptional program of SGIV has not been investigated either by
Northern blot or DNA microarray. The study of the transcriptional profile of SGIV
may provide a profound insight into the replication and pathogenesis of the iridovirus
family.
4
The replication cycle of the iridovirus is very complicated and not fully understood.
Murti and coworkers proposed a model to elucidate the replication cycle of the
iridovirues(Murti et al., 1985). The first step of the replication cycle of iridovirus
particles is that iridovirus particles attach to the plasma membranes of host cells. After
attaching to membrane, the viral particles enter the host cells by phagocytosis (the
cellular process by which cell membrane engulfs solid particles to form an internal
phagosome.). Subsequent to the entry, the iridovirus particles are delivered to the
lysosomes of host cells, and they are uncoated inside the lysosomes. As a result, the
genome of iridoviruses is released. The released iridovirus genome is transported to
the host nucleus and the first stage of iridovirus genome replication is initiated. At the
same time, the iridovirus immediate early genes are transcribed, and the iridovirus
genome is transported to the cytoplasm of the cell and the iridovirus commences its
second stage genome replication in the cytoplasm. Besides, the early and late genes of
the iridoviruse are transcripted and translated. The genomes and structural proteins of
the iridoviruse begin to assemble to form new iridoviruses, and are released from the
host cells. The virus is now ready to initiate new replication cycles (Figure 1.1) (Murti
et al., 1985).
5
1.5 Functional genomics
1.5.1 Introduction of Functional genomics
Genomic projects, such as genome sequencing projects, have produced a vast wealth
of data. Functional genomics is a field of molecular biology that attempts to describe
large scale gene/protein functions and interactions by using the data produced by
genomic projects. Functional genomics focuses on the dynamic aspects such as gene
transcription, translation, and protein-protein interactions.
1.5.2 Gene expression profile and differential gene expression of the iridovirus
In the iridovirus replication cycle, the iridovirus genes are differentially expressed
(Williams, 1996). On one hand, some iridovirus genes are expressed at the first stage
of iridovirus genome replication in the nucleus of host cells. On the other hand, some
genes are expressed after the first stage of iridovirus genome replication. According to
the differential gene expression of iridoviruses, the iridoviruses genes are divided into
three groups: the immediate early genes, the early genes and the late genes. The
immediate early genes are expressed immediately after the primary infection, and
these genes encode proteins which play important roles in the trans-activation of
iridoviruses. Following the expression of the immediate early genes, the early genes
are expressed, and they encode proteins which play important roles in the iridovirus
genome replication. After the onset of the iridovirus genome replication, the late
genes are expressed, and they encode structural proteins of iridovirus particles. Given
the different functions of the three groups of iridovirus genes, the study on the
temporal and differential gene expression of iridoviruses can provide important
information on the pathogenesis mechanism of iridoviruses.
The gene expression profile and differential gene expression of iridoviruses are poorly
studied. In 2001, D’ Coasta and coworkers studied the differential gene expression of
6
Chilo iridescent virus (D’Costa et al., 2001). They reported 137 detectable transcripts ,
and classified these 137 transcripts into 3 groups: 38 immediate-early gene transcripts,
34 early gene transcripts and 65 late gene transcripts. Three years later, D’ Coasta and
coworkers further reported that more than 90 percent of the Chilo iridescent virus
genome was transcriptional active (D’Costa et al., 2004). The differential gene
expression of Chilo iridescent virus was studied by using the Northern blot (D’Costa
et al., 2001; D’Costa et al., 2004 ). The Northern blot can only study one gene per
time, and it is a low-throughput method used in gene expression studies. Compared to
Northern blot, DNA microarray is a newly emerging high-throughput technology.
DNA microarray can be used to study hundreds, even thousands, of genes
simultaneously. DNA microarray was used to study differential gene expression of 87
genes of the red sea beam iridovirus in grunt fin cells in 2005 (Lua et al., 2005). By
using DNA microarray, Lua and coworkers reported that as early as 3 hour
postinfection, some genes of the red sea beam iridovirus commenced expression.
After 8 hour postinfection, a rapid escalation of gene expression of the red sea beam
iridovirus was described. Furthermore, the gene expression of red sea beam iridovirus
was differentiated by using a de novo protein synthesis inhibitor (cycloheximide) and
a viral DNA replication inhibitor (phosphonoacetic acid). The differential gene
expression of the red sea beam iridovirus revealed that the 87 RSIV transcripts were 9
immediate-early gene transcripts, 40 early gene transcripts and 38 late gene transcripts.
In addition, in 2007, Lua and coworkers reported the gene expression profiles of the
red sea beam iridovirus in infected fish (Lua et al., 2007). Although RSIV transcripts
had been investigated by DNA microarray, many redundant data entries were
included. Additionally, the genomic sequence of RSIV can not be found in the NCBI
database. From the literature (Lua et al. 2005; Lua et al., 2007), we can only identify
7
around 10 unique conserved ORFs between RSIV and SGIV based on the functional
description of viral ORFs. Based on the limited descriptions about RSIV from the
Lua's paper, the RSIV could be similar to Rock bream iridovirus (RBIV) which is a
member of the Megalocytivirus genus, while SGIV is a member of Ranavirus.
Besides the 26 iridoviridae core genes, which are conserved among iridovirus, only
two other ORFs in Megalocytivirus have orthologs in Ranavirus (Eaton et al., 2007).
Thus, it would be valuable to investigate the gene expression profile as well as
differential gene expression of SGIV.
1.5.3 Functional genomics studies at proteomics scale by using iTRAQ
Functional genomics studies provide information, which should be useful to
understand complex biological systems. These studies, quite often, involve
differential comparison of expression with reference to a control state. Although tools
like nucleic acid microarrays are widely used to perform the differential comparison
of expression in gene transcriptional levels, these tools do not directly give indications
in the gene translational levels. Differences in gene expression do not necessarily
correspond directly to differences in protein expression, due to the following
arguments: I) Expression of proteins is regulated by the combination of transcription
rates and mRNA degradation rates, and is not evidently obvious from mRNA levels;
II) The protein concentrations are not always consistent with mRNA concentrations;
III) Many differential effects on proteins themselves come from post-translational
modifications such as phosphorylation or glycosylation, and these effects cannot be
measured or identified by looking at the mRNA levels (Zieske, 2006). More
importantly, compared to nucleic acid expression, proteins are effector molecules. As
a result, the investigation of differential expression at protein levels will contribute to
a better understanding of biological systems.
8
There are several technologies used to assess more accurately protein levels between
different biological states. I ) 2D-gel: In the 2D-gel technique, differentially expressed
spots are excised and analyzed by mass spectrometry (MS). The disadvantages of 2D-
gel technique are: ① Not all types of proteins are amenable to gels; ② The dynamic
range is somewhat limited; ③The low-abundance proteins are difficult to identify; ④
The resolution is restricted; ⑤ The throughput is relatively low. II ) chip-based MS:
the chip-based MS approaches have a relatively higher throughput, but the actual
identification of the proteins of interest is time-consuming, often relying on off-line
techniques to purify the potential marker(s) implied by the qualitative information
from the MS analyses. III ) Chromatographic approaches: Chromatographic
approaches are subject to diminished sample throughput as well as reproducibility
between samples and replicates. IV ) ICAT: The ICAT stands for Isotope Coded
Affinity Tags (Gygi et al., 1999). In this approach, two samples are labeled with
chemically identical tags that differ only in isotopic composition (heavy and light
pairs) and contain a thiol-reactive group (which covalently links to cysteine residues)
and a biotin moiety. The limitations of ICAT are: ① The ICAT reagents can only be
used to analyze protein peptides that contain a cysteine residue; As a result, many
important proteins, including those with post-translational modifications, are
overlooked by this technique; ② The ICAT can only be used to compare only two
samples. V ) SILAC, which stands for Stable Isotope Labeling by Amino Acids in
Cell Culture (Ong et al., 2002). This method incorporates isotopic labels into proteins
via metabolic labeling in the cell culture itself, rather than using a covalently linked
tag. Thus, cell samples to be compared are grown separately in media containing
either a heavy or light form of an essential amino acid (e.g. one that cannot be
synthesized by the cell). The advantages of SILAC are that it has a higher fidelity than
9
ICAT (incorporating nearly 100% efficiency) and does not require multiple chemical
processing and purification steps, thus ensuring that the samples to be compared have
been subjected to similar conditions throughout the experiment. This approach,
however, requires viable active cell lines to allow for the incorporation of the
respective heavy/light amino acids into the protein samples, and may not always be
available for all experimental samples.
Despite the broad range of biological questions that the above approaches have
successfully addressed, there is still a need for additional technologies that can carry
out global peptide labeling, retention of post-translational modification (PTM)
information, and simultaneous multiplexed (more than two samples) analyses. iTRAQ,
a new technique stands for Isobaric Tags for Relative and Absolute Quantitation, was
firstly developed by Darryl Pappin and colleagues at Applied Biosystems in 2004
(Ross et al., 2004), and can be used to label four samples simultaneously . This
unique approach labels samples with four independent reagents of the same mass that,
upon fragmentation in MS/MS, give rise to four unique reporter ions (m/z =114–117)
that are subsequently used to quantify the four different samples, respectively. To
date, iTRAQ can label samples with eight independent reagents of the same mass that,
upon fragmentation in MS/MS, give rise to eight unique reporter ions (m/z=113-121)
that are subsequently used to quantify the eight different samples. Figure 1.2 shows
the chemical design of iTRAQ reagents. Figure 1.3 shows the workflow of iTRAQ
experiments to investigate differential protein expression among samples. The amine
specificity of these reagents makes most peptides in a sample amenable to this
labeling strategy with no loss of information even from samples involving post-
translational modifications, such as the scrutiny of signal transduction pathways that
often involve phosphorylation. In addition, the multiplexing capacity of these reagents
10
allows for information replication within certain LC-MS/MS experimental regimes,
providing additional statistical validation within any given experiment.
1.6 Structural genomics
1.6.1 Introduction of structural genomics
Structural genomics is the determination of the three dimensional structures of all
proteins of a given organism, by experimental methods such as X-ray crystallography,
NMR spectroscopy or computational approaches such as homology modeling. The
complete sequencing of several genomes has provided us the protein repertories of
diverse organisms from all kingdoms (Green, 2001; Aparicio et al., 2002; Marsden et
al., 2006.). The challenge of understanding these gene products has led to the
development of functional genomics methods, which collectively aim to annotate the
raw sequence with biological understanding (Brenner, 2001). Structural genomics is
one such approach, with a unique promise to reveal the molecular function of protein
domains (Ashburner et al., 2000).
1.6.2 NMR spectroscopy
Nuclear magnetic resonance (NMR), which is a physical phenomenon based upon the
quantum mechanical magnetic properties of an atom's nucleus, also commonly refers
to the methods that exploit nuclear magnetic resonance to study molecules. In 1938,
Isidor Rabi first described and measured the nuclear magnetic resonance in molecular
beams (Rabi et al., 1938). Eight years later, in 1946, Felix Bloch and Edward Mills
Purcell refined the technique for use on liquids and solids (Bloch et al., 1946; Purcell
et al., 1946), for which they shared the Nobel Prize in physics in 1952. In 1953,
Overhauser defined the concept of nuclear overharser effect (NOEs) as the basis for
structural determination by NMR (Overhauser, 1953). Three decades later, in 1985,
the first solution-state protein structure was reported by applying the NMR in solving
11
protein structure (Williamson et al., 1985). Since then, NMR has become an
alternative method to X-ray crystallography for the structural determination of small
to medium sized proteins (<25kDa) in aqueous or micellar solutions. In recent years,
an impressive number of advances in biomolecular NMR spectroscopy have been
reported (Wider and Wüthrich, 1999; Fiaux et al., 2002; Fernandez and Wider, 2003).
NMR has emerged as a powerful probe for the study of protein structure (Clore and
Gronenborn, 1991; Sattler et al., 1999; Bax, 2003) and dynamics (Ishima and Torchia,
2000; Bruschweiler, 2003). In particular, studies of proteins with molecular mass on
the order of 100 kDa are now possible at a level of detail that was previously reserved
for much smaller systems (Kay, 2005). In general, the main procedures of structure
determination by NMR are sample preparation, data acquisition, data processing, data
assignments and structural calculations.
1.6.3 X-Ray crystallography
X-ray crystallography is the science of determining the arrangement of atoms within a
crystal from the manner in which a beam of X-rays is scattered by the electrons within
the crystal. The method produces a three-dimensional density of electrons within the
crystal, from which the mean atomic positions, their chemical bonds, their disorder
and much other information can be derived. By definition, a crystal is a solid in which
a desired minimum volume containing particular arrangement of atoms (its unit cell)
is repeated indefinitely along three principal directions known as the basis (or lattice)
vectors.
Three dimensional protein structures at atomic level can be determined by X-ray
crystallography. Growth of single well-defined diffracting crystal forms the basic and
essential prerequisite for X-ray crystallography to determine protein structures (Blow,
2002). The bottleneck to structure determination by X-ray crystallography is the
12
generation of high quality crystals (Chayen, 2004). In the late 1950’s, crystal
structures of proteins began to be solved, beginning with the structure of sperm whale
myoglobin by Max Perutz and Sir John Cowdery Kendrew, for which they were
awarded the Nobel Prize in Chemistry in 1962 (Kendrew et al., 1958). To date,
~39000 crystal structures of proteins have been determined by X-ray crystallography
(http://www.rcsb.org/pdb/statistics/holdings.do). For comparison, the nearest
competing method, NMR spectroscopy has produced roughly 6000 structures
(http://www.rcsb.org/pdb/statistics/holdings.do). X-ray crystallography is now used
routinely by scientists to determine how a pharmaceutical compound interacts with its
protein target and what changes might be advisable to improve it (Scapin, 2006).
1.7 Scope of thesis
In this thesis, we will present
1) The study of temporal and differential gene expression of SGIV using DNA
microarray.
2) The proteomics study of grouper embryonic cells infected by SGIV using iTRAQ.
3) The functional study of ORF158L, a novel protein coded by SGIV genome.
4) The structural study of ORF158L using NMR and X-ray crystallography.
14
Figure 1.2. The chemical structure of an iTRAQ reagent. A) Design of the iTRAQ
reagents-4plex consists of a charged reporter group, a peptide reactive group, and a
neutral balance portion to maintain an overall mass of 145 (Zieske and Lynn, 2006). B)
Design of the iTRAQ Reagents - 8plex consists of a charged reporter group, a peptide
reactive group, and a neutral balance portion to maintain an overall mass of 305
(http://www.appliedbiosystems.com.sg). (Isobaric, by definition, implies that any two
or more species have the same atomic mass but different arrangements.)
A
B
15
Fig. 1.3. Workflows of iTRAQ experiments. Once labeled with the iTRAQ Reagents,
the individual samples are then be pooled for further processing and analysis. During
subsequent MS/MS of the peptides, each isobaric tag produces a unique reporter ion
that identifies which samples the peptide originated and its relative abundance. A)
General scheme of a multiplex reaction of four different samples (S1–S4), designated
by four different colors(Zieske and Lynn, 2006). B) General scheme of a multiplex
reaction of eight different samples (S1–S8), designated by eight different
colors(http://www.appliedbiosystems.com.sg).
B
A
16
Chapter Two
An investigation of temporal and differential
gene expression of Singapore grouper
iridovirus by DNA microarray
Published:
Chen et al. 2006. Temporal and differential gene expression of Singapore grouper
iridovirus. J Gen Virol. 87:2907-15.
17
2.1 Summary
Singapore grouper iridovirus (SGIV), an iridovirus in the genus Ranavirus, is a major
pathogen that results in significant economic losses in grouper aquaculture. To
investigate further its infective mechanisms, for the first time, a viral DNA microarray
was generated for the SGIV genome to measure the expression of its predicted open
reading frames simultaneously in vitro. By using the viral DNA microarray, the
temporal gene expression of SGIV was characterized and the DNA microarray data
were consistent with the results of real-time RT-PCR studies. Furthermore, different-
stage viral genes (i.e. immediate-early, early and late genes) of SGIV were uncovered
by combining drug treatments and DNA microarray studies. These results should offer
important insights into the replication and pathogenesis of iridoviruses.
18
2.2 Introduction
Singapore grouper iridovirus (SGIV) (Chinchar et al., 2005; Williams et al., 2005;
Song et al., 2004), a novel iridovirus of the genus Ranavirus, is a large, icosahedral,
cytoplasmic DNA virus. The virus, which causes sleepy grouper disease (SGD), has
resulted in significant economic losses in marine net-cage farms in Singapore. It was
isolated successfully in 1998 from diseased brown-spotted grouper, Epinephelus
tauvina (Chua et al., 1994; Qin et al., 2001). The entire SGIV genome consists of
140,131 bp, and 162 open reading frames (ORFs), encoding polypeptides varying
from 41 to 1268 aa, were predicted from the sense and antisense DNA strands (Song
et al., 2004). These viruses are causative pathogens of serious systematic diseases in
farms of both feral and cultured groupers. So far, genomic sequences of two grouper
iridoviruses have been published: SGIV (Song et al., 2004)and grouper iridovirus
(GIV) (Tsai et al., 2005), with whole-genomic sequence similarity of >90 %. Willis et
al. (1977) designated 10 ‘early’ RNAs (of 47 mRNAs), expressed from 1 to 1.5 h after
frog virus 3 (FV-3) infection of fathead minnow cells by using isotopic labelling of
virus-specific RNA. The RNA transcriptional map of the Wiseana iridescent virus
(WIV) has been studied by using a combination of [35S]methionine pulse-labelling
and Northern blotting with WIV DNA probes (McMillan and Kalmakoff, 1994).
Similarly, the transcriptional map and temporal cascade of Chilo iridescent virus (CIV)
have been studied by carrying out Northern analyses with several putative CIV gene-
specific probes (D’Costa et al., 2001 ; D’Costa et al., 2004). However, at present, the
transcriptional program of viral genes in SGIV is still unclear. DNA microarrays
provide a potential tool for the simultaneous measurement of gene expression in all of
these viral genes. In this study, we constructed, for the first time, a viral DNA
microarray covering 127 predicted ORFs of the SGIV genome. By using this SGIV
19
DNA microarray, the transcriptional program of SGIV was uncovered. The temporal
expression of SGIV genes was further confirmed by real-time RT-PCR. By using
cycloheximide (CHX) and aphidicoline as inhibitors, the immediate-early (IE), early
(E) and late (L) viral genes were characterized.
20
2.3 Materials and methods
2.3.1 Cell lines.
Grouper embryonic (GE) cells from the brown-spotted grouper E. tauvina (Chew et
al., 1994) were cultured in Eagle's minimum essential medium containing 10 % fetal
bovine serum, 0.116 M NaCl, 100 IU penicillin G ml–1 and 100 µl streptomycin
sulfate ml–1. Culture media were equilibrated with HEPES to a final concentration of
5 mM and adjusted to pH 7.4 with NaHCO3.
2.3.2 Positive controls for the SGIV DNA microarray.
Given that there is no grouper cDNA library available, we partially sequenced cDNAs
of β-actin and glyceraldehyde-3-phosphate dehydrogenase (GAPDH) from the GE
cells and designed unique amplicons for β-actin and GAPDH in the SGIV DNA
microarray as positive controls. The partial cDNA sequences of grouper β-actin and
GAPDH are shown in Table 2.3. β-Actin was also used for data normalization.
2.3.3 Preparation of amplicons for the SGIV DNA microarray.
One hundred and sixty-two SGIV ORFs were predicted on the basis of the published
SGIV sequence (Song et al., 2004). Two rounds of PCR were used to generate the
amplicons for the microarray. In the first round, specific primers with sizes ranging
from 18 to 22 bp were generated on the basis of the SGIV full-length genome [8 bp
universal sequences (TGACCATG), added to the 5' terminal of the forward primers,
were designed (Table 2.2). The amplicon sizes varied from 200 to 400 bp. Amplicons
whose BLAST scores against other ORFs exceeded 400 were excluded. The genomic
DNA of SGIV was used as template in the first round of PCR. In the second round,
the DNA fragments from the first round were used as template and 5'-amino-modified
universal primer 5'-GCTGAACAGCTATGACCATG-3' and ORF-specific reverse
primer were applied. AmpliTaq DNA Polymerase (Applied Biosystems) was used in
21
both rounds of PCR. Each PCR fragment was confirmed to be a single band and of
the correct size by running on a 2 % agarose gel (data not shown). The final 129
amplicons, representing 127 viral ORFs and two host housekeeping genes, β-actin and
GAPDH (purified with a QIAquick 96 PCR Purification Kit; Qiagen), were spotted
onto lysine-coated slides in duplicate.
2.3.4 Virus infection and CHX and aphidicoline treatments.
GE cells were mock-infected or infected with SGIV at an m.o.i. of 3 p.f.u. per cell. To
investigate the temporal expression of viral genes, total RNA was harvested from
mock-infected and SGIV-infected GE cells at 0, 1, 4, 8, 16, 32, 48, 72 and 96 h post-
infection (p.i.). CHX, a protein synthesis inhibitor that prevents de novo protein
synthesis by preventing translation, was used to study the transcription of viral IE
genes. To assess IE gene transcription, SGIV mock-infected and SGIV-infected
cultures were treated with different concentrations of CHX (50, 100, 200 or
500 µg/ml) 1 h before infection. Aphidicoline is a specific inhibitor of DNA
polymerase α. In the presence of aphidicoline, viral DNA replication is inhibited.
Given that the L genes were expressed after viral DNA replication, the expression of
L viral genes would be downregulated compared with those without aphidicoline
treatment. To examine the viral E genes, the transcriptomes from the cultures with
aphidicoline treatment and SGIV-infected at 3 p.f.u. per cell were compared with
those from the culture with mock aphidicoline treatment and SGIV-infected at 3 p.f.u.
per cell. In the aphidicoline treatment, aphidicoline at a final concentration of
30 µg ml–1 was added to the culture 1 h prior to SGIV infection.
2.3.5 Total RNA preparation, reverse transcription and labelling.
Total RNA was extracted and purified by using a Qiagen RNeasy Mini Kit. RNAsin
(10 units; Promega), 100 units DNase I and 10 µl enzyme buffer 3 (Roche) were
22
added to the total RNA solution, mixed well and incubated at room temperature for
20 min. The RNA samples were later purified with a Qiagen RNeasy column and
stored at –70 °C. For reverse-transcription reactions, 10 mM dATP, 10 mM dGTP,
10 mM dCTP, 2 mM dTTP (Invitrogen) and 8 mM aa-dUTP (Ambion) were used. For
each reverse-transcription reaction, 10 µg total RNA was reverse-transcribed by using
PowerScript Reverse Transcriptase (BD Clontech) with random primers [d(N)6,
0.5 µg/µl] (Life Technologies). After reverse transcription, the unused aa-dUTPs were
removed with Microcon YM-30 columns (Amicon). The cDNAs were coupled with
mono-functional NHS-ester Cy dyes (Amersham Biosciences). After removing
unincorporated/quenched Cy dyes with a QIAquick PCR purification kit (Qiagen), the
mixtures were hybridized to the SGIV DNA chip by using the MAUI hybridization
system (BioMicro Systems) and incubated overnight at 42 °C. The hybridizations
were repeated on duplicate arrays with independently prepared RNA. The data
obtained from the different arrays were consistent. The mean correlation coefficient of
127 viral elements of duplicates was 0.9865. The mean correlation coefficient of 127
viral elements between repeats was 0.9750.
2.3.6 Real-time PCR.
In order to validate the DNA microarray data, semi-quantitative real-time RT-PCR
was applied and β-actin was used as the control. The specific primers for real-time
RT-PCR were checked after PCR and showed a single, specific band after running on
2 % agarose gel. Information on the real-time PCR primers is provided in Table 2.4,
available in the JGV Online website. The total RNA samples were reverse-transcribed
using PowerScript Reverse Transcriptase (BD Clontech) with random primers [d(N)6,
0.5 µg µl–1]. cDNA (50 ng) was subsequently subjected to real-time PCR by using a
QuantiTect SYBR Green PCR kit (Qiagen) in the Lightcycler 2.0 system (Roche).
23
The real-time data were collected and analysed with the 2–∆∆CT method (Livak and
Schmittgen, 2001).
24
2.4 Results
2.4.1 Viral microarray for grouper iridovirus
To date, two grouper iridovirus genomes have been sequenced completely (Song et al.,
2004; Tsai et al., 2005). The specificity of the arrays was validated with cDNA probes
prepared from mock-infected and SGIV-infected GE cells. The cDNA probes from
uninfected cells detected only β-actin and GAPDH, while cDNA probes from infected
cells detected all SGIV DNA targets, as well as β-actin and GAPDH (data not shown).
2.4.2 Temporal gene-expression analysis of the SGIV genome
Total RNA was harvested from mock-infected cells and SGIV-infected cells at 0, 1, 4,
8, 16, 32, 48, 72 and 96 h p.i.
Of the 127 viral elements on the SGIV array, 16 (13 %) of the 127 investigated viral
ORFs commenced expression at 1 h p.i., 106 (83 %) commenced expression at 4 h p.i.
and five (4 %) ORFs commenced expression at 8 h p.i. (Table 2.5).
In our viral DNA microarray, 68 (53.5 %), 43 (34 %), two (1.5 %) and 14 (11 %) of
the 127 investigated viral ORFs were detected to reach maximum expression at 32, 48,
72 and 96 h p.i., respectively (Table 2.5).
Hierarchical clustering, in which the expression of each gene at every time point was
compared and grouped according to the similarity in gene-expression profiles, was
applied to examine the relationship between the genes and their expression patterns. A
coloured mosaic matrix, in which each column represents a time point and each row
indicates the expression pattern of a single ORF, was used to feature the temporal
viral gene-expression data generated from our viral DNA microarray. The ordered and
varied patterns of viral gene expression are illustrated in (Figure 2.1).
In order to validate the DNA microarray results, semi-quantitative real-time RT-PCR
was carried out separately to investigate the expression profile of one IE viral gene
25
(ORF086R), one E viral gene (ORF006R) and one L viral gene (ORF072R – major
capsid protein), with β-actin as controls. The results of real-time RT-PCR were
consistent with the DNA microarray data (Figure 2.2A–C). The consistency between
the real-time RT-PCR and viral DNA microarray data supports the general
applicability and utility of our array approach.
2.4.3 SGIV viral gene expression with different concentrations of CHX
We used the DNA microarrays to compare the gene expression of SGIV-infected GE
cells with that of mock-infected GE cells, both under CHX treatment. The normalized
data show that SGIV gene expression decreased with increasing concentration of
CHX (Figure 2.3A–D). This phenomenon suggests that the expression of SGIV viral
genes does indeed depend on the presence of one or more viral proteins. When the
viral gene expression was analysed in the presence of 500 µg CHX ml–1, we found
that 41 (32.3 %) ORFs displayed a 1.3-fold upregulation (listed in Table 2.1). These
41 ORFs were not sensitive to CHX treatment and were considered as strong IE gene
candidates (Table 2.1). As expected, the two putative IE genes, namely ORF086R and
ORF162L, were included in this candidate IE gene list (Table 2.1).
2.4.4 SGIV viral gene expression with aphidicoline treatment
In order to classify the remaining SGIV viral genes (except for the SGIV IE gene
candidates) into E and L genes, aphidicoline treatment was carried out together with
the simultaneous analysis of all viral transcriptomes. In this study, we compared the
transcriptomes of SGIV-infected cultures with aphidicoline treatment against those of
SGIV-infected cultures mock-treated with aphidicoline across time. We found that
several genes were upregulated (fold >1) and a number of viral genes showed down
regulation (fold <1) (Figure 2.4A-D). Viral genes that consistently displayed twofold
down regulation from 16 to 48 h p.i. with aphidicoline treatment were considered to
26
be L gene candidates. After analyzing the aphidicoline treatment across time, we
found that 50 (38.1 %) ORFs consistently displayed twofold down regulation and
were considered to be candidates for viral L genes (Table 2.1).
27
2.5 Discussion
Our investigation focused on the expression patterns of SGIV genes with both known
and unknown functions and offers new insights into virus replication and pathogenesis.
When the double time (DT), which means the time (h p.i.) at which the expression of a
viral gene, for the first time, showed a twofold upregulation compared with baseline
expression (0 h p.i.) (Paulose et al., 2001), was analysed, ORF086R, a putative IE
gene, commenced its expression as early as 1 h p.i., DNA replication- and
transcription-related genes, for example DNA polymerase (ORF128R) and the two
largest subunits of DNA-dependent RNA polymerase II (ORF073L and ORF104L),
increased their expression levels at 4 h p.i. and the major capsid protein (ORF072R)
commenced expression as late as 8 h p.i. (Table 2.5). These results indicate that SGIV
replication may proceed through a strictly temporally ordered transcriptional program.
These results are consistent with the notion, based on FV-3, that one or more IE
proteins are needed to activate viral E gene transcription and that one or more viral E
proteins are required to switch on viral L gene transcription (Willis et al., 1977;
Williams et al., 2005).
Another interesting finding is that SGIV genes vary in their peak time (PT), which is
defined as the time (h p.i.) at which the transcript of a viral gene accumulates to its
maximum amount. The PTs of SGIV genes range from 32 to 96 h p.i. (Table 2.5). No
relationship was found between the functions of SGIV genes and their PTs. Although
IE and E genes were expressed earlier than L genes in the SGIV replication cycle, the
abundances of all SGIV genes' transcriptomes in the host cell after 8 h p.i. were
substantial. These results are consistent with the earlier observations in FV-3 and CIV
(Chinchar and Yu, 1992; Chinchar et al., 1994; D’Costa et al., 2001).
28
When the gene tree was analysed, some of the viral ORFs encoding viral structural
proteins clustered together at the top of the gene tree. These include ORF072R,
encoding the viral major capsid protein, ORF019R, encoding a myristylated
membrane protein, ORF141R, encoding a glycoprotein, and another two ORFs,
ORF009L and ORF007L, encoding two proteins of unknown function that have been
identified from the mature viral particles by mass spectrometry (Song et al., 2006).
The clustering gene tree also shows a tendency for genes with similar functions, such
as ORF029L and ORF131R, both of which encode homologues of the Ig-like domain,
to be clustered together, despite being located apart from each other in the viral
genome. In the SGIV genome, a number of viral genes are novel and their function is
unknown. It has been reported that the co-expression of genes of known function with
novel genes may provide a relatively simple means to postulate the functions of these
poorly characterized ones (Eisen et al., 1998).
It has been reported that the IE, E and L transcripts of FV-3 were synthesized in three
coordinated phases (Willis et al., 1977.; Willis and Granoff, 1978). Similarly, SGIV
genes can be classified as IE genes, E genes and L genes. CHX-insensitive SGIV
genes are suggested to be IE genes. Aphidicoline-sensitive SGIV genes are suggested
to be L genes. When combining the results of CHX and aphidicoline treatments, the
127 SGIV elements on the microarray included 28 (22.1 %) IE genes, 49 (38.6 %) E
genes, 37 (29.1 %) L genes and 13 (10.2 %) unclassified genes (Table 2.1).
Early viral transcripts contain IE and E viral genes. It has been proposed that E
transcripts in FV-3 encode regulatory proteins and key catalytic enzymes (Goorha et
al., 1978; Goorha, 1982; Williams et al., 2005). Similar observations were made for
SGIV. SGIV E transcripts contain replication-related genes, e.g. DNA polymerase
29
(ORF128R), as well as transcription-related genes, such as the second-largest subunit
of DNA-directed RNA polymerase II (ORF073L).
Although combining DNA microarrays and drug treatments can provide a wealth of
information concerning the expression profile of different viral genes, the approach
has some inherent limitations. For example, in the list of unclassified ORFs,
ORF019R and ORF141L, which encode two structural proteins (a myristylated
membrane protein and a glycoprotein, respectively), are insensitive to the CHX
treatments, even at high concentrations (500 µg ml–1) and ORF146L, encoding
NTPase/helicase, shows a high sensitivity to the aphidicoline treatment. The possible
mechanisms behind drug sensitivity or resistance of these unclassified SGIV genes
need further investigation.
When investigating the temporal expression of different-stage genes, we found that
the IE genes commenced expression between 1 and 4 h p.i., most of the E genes
commenced expression at 4 h p.i. and most of the L genes commenced expression
between 4 and 8 h p.i. The expression of three of the E genes (ORFs 83R, 099R and
111R) and seven of the L genes (ORFs 008L, 010L, 021L, 055R, 089L, 116R and
154R) was found to increase as early as 1 h p.i. The functions of these viral E and L
genes are still unknown.
We also found several interesting phenomena in SGIV. First, ORF030L, which was
predicted to be a virus tegument protein (a structural protein), showed insensitivity to
both CHX and aphidicoline. ORF030L might have other functions besides being a
tegument protein of SGIV. Second, the SGIV genome contained: (i) ORF144R,
encoding a homologue of the FGF (fibroblast growth factor) 22 of rat, a major active
species of presynaptic organizing molecule (Umemori et al., 2004), (ii) ORF145R,
30
encoding a homologue of the mouse FGF 10, which is related closely to FGF 22
(Tagashira et al., 1997; Okazaki et al., 2002; Strausberg et al., 2002; Umemori et al.,
2004). FGF 22 and FGF 10 play important roles in presynaptic differentiation
(Umemori et al., 2004). Expression of FGF homologues by SGIV may play an
important role in forming the clinical symptoms of SGIV-infected groupers.
Although it has been reported that SGIV is an enveloped virus that enters cells by
endocytosis to start the viral infection cycle and buds from the plasma membrane in
the late infection phage (Qin et al., 2001), little is known about the processes that
occur during SGIV infection. By combining the results of SGIV temporal gene-
expression profiles and different-stage viral genes, the patterns of different-stage viral
gene expression are uncovered. Our results should provide new insights into the
processes of the SGIV-infected cascade and the pathogenesis and replication
strategies of SGIV. Our SGIV DNA microarrays coupled with global biochemical and
genetic strategies might greatly accelerate the functional analysis of a number of
functionally unknown genes in the SGIV genome. Given that most ORFs from GIV,
as well as a number of ORFs from other iridovirus genomes, such as Ambystoma
tigrinum virus (Jancovich et al., 2003), Chilo iridescent virus (Jakob et al., 2001),
Infectious spleen and kidney necrosis virus (He et al., 2001), Lymphocystis disease
virus 1 (Tidona and Darai, 1997), orange-spotted grouper iridovirus (Lu et al., 2005)
and tiger frog virus (He et al., 2002), are homologous to those of SGIV (Lu et al.,
2005; Tsai et al., 2005; Song et al., 2004), our results should also be valuable to
research on these viruses.
31
Table 2.1. K
inetic class of SGIV
ORF exp
ression.
ORF
CHX treatment
Aph
idicolin
e treatm
ent
Class
ORF
CHX treatment
Aph
idicolin
e treatm
ent
Class
ORF
CHX treatment
Aph
idicolin
e treatm
ent
Class
ORF001L
E/L
IE/E
E
ORF057L
E/L
L
L
ORF114L
E/L
L
L
ORF003R
E/L
IE/E
E
ORF059L
E/L
L
L
ORF116R
E/L
L
L
ORF004L
IE
IE/E
IE
ORF060R
E/L
IE/E
E
ORF118R
E/L
L
L
ORF005L
IE
IE/E
IE
ORF061R
E/L
L
L
ORF119R
E/L
IE/E
E
ORF006R
E/L
IE/E
E
ORF064R
E/L
L
L
ORF120L
IE
L
? ORF007L
E/L
L
L
ORF065R
IE
IE/E
IE
ORF121R
IE
IE/E
IE
ORF008L
E/L
L
L
ORF066R
IE
IE/E
IE
ORF122L
E/L
IE/E
E
ORF009L
E/L
L
L
ORF067L
IE
IE/E
IE
ORF123L
E/L
IE/E
E
ORF010L
E/L
L
L
ORF068L
E/L
L
L
ORF124R
E/L
IE/E
E
ORF012L
E/L
L
L
ORF069L
E/L
L
L
ORF125R
IE
IE/E
IE
ORF014L
E/L
L
L
ORF070R
E/L
L
L
ORF126R
IE
L
? ORF016L
E/L
IE/E
E
ORF071R
IE
IE/E
IE
ORF127R
E/L
L
L
ORF018R
E/L
L
L
ORF072R
E/L
L
L
ORF128R
E/L
IE/E
E
ORF019R
IE
L
? ORF073L
E/L
IE/E
E
ORF131R
E/L
IE/E
E
ORF020L
IE
L
? ORF075R
E/L
L
L
ORF132R
IE
IE/E
IE
ORF021L
E/L
L
L
ORF076L
E/L
IE/E
E
ORF133R
IE
IE/E
IE
ORF024L
IE
IE/E
IE
ORF077L
IE
IE/E
IE
ORF134L
E/L
IE/E
E
ORF025L
E/L
L
L
ORF078L
E/L
IE/E
E
ORF135L
E/L
IE/E
E
ORF026R
E/L
L
L
ORF081L
E/L
IE/E
E
ORF136R
IE
IE/E
IE
ORF029L
IE
IE/E
IE
ORF082L
E/L
IE/E
E
ORF137R
IE
IE/E
IE
ORF030L
IE
IE/E
IE
ORF083R
E/L
IE/E
E
ORF139R
IE
IE/E
IE
ORF031L
IE
IE/E
IE
ORF084L
E/L
L
L
ORF140R
E/L
L
L
ORF033L
E/L
IE/E
E
ORF085R
E/L
IE/E
E
ORF141R
IE
L
? ORF034L
E/L
IE/E
E
ORF086R
IE
IE/E
IE
ORF143L
IE
L
? ORF035L
E/L
IE/E
E
ORF087R
IE
IE/E
IE
ORF144R
E/L
IE/E
E
ORF036L
E/L
IE/E
E
ORF088L
E/L
L
L
ORF145R
E/L
IE/E
E
ORF037L
E/L
IE/E
E
ORF089L
E/L
L
L
ORF146L
IE
L
? ORF038L
E/L
L
L
ORF090L
E/L
L
L
ORF147L
IE
IE/E
IE
ORF039L
E/L
L
L
ORF091L
E/L
IE/E
E
ORF148R
E/L
IE/E
E
ORF041L
IE
IE/E
IE
ORF092L
IE
L
? ORF149R
IE
IE/E
IE
ORF042R
E/L
IE/E
E
ORF093L
E/L
L
L
ORF150L
E/L
IE/E
E
ORF043R
E/L
IE/E
E
ORF096R
E/L
IE/E
E
ORF151L
E/L
IE/E
E
ORF045L
E/L
L
L
ORF097L
E/L
L
L
ORF152R
E/L
IE/E
E
ORF046L
E/L
L
L
ORF098R
IE
L
? ORF154R
E/L
L
L
ORF047L
E/L
IE/E
E
ORF099R
E/L
IE/E
E
ORF155R
IE
L
? ORF048L
IE
IE/E
IE
ORF101R
E/L
IE/E
E
ORF156L
IE
L
? ORF049L
E/L
IE/E
E
ORF102L
E/L
IE/E
E
ORF157R
E/L
IE/E
E
ORF050L
E/L
IE/E
E
ORF103R
IE
L
? ORF158L
E/L
IE/E
E
ORF051L
IE
IE/E
IE
ORF104L
IE
IE/E
IE
ORF159R
E/L
IE/E
E
ORF052L
E/L
IE/E
E
ORF107R
IE
L
? ORF160L
IE
IE/E
IE
ORF054R
E/L
IE/E
E
ORF110L
E/L
IE/E
E
ORF162L
IE
IE/E
IE
ORF055R
E/L
L
L
ORF111R
E/L
IE/E
E
ORF056R
E/L
L
L
ORF112R
E/L
IE/E
E
“IE” means the immediate early gene. “E” means the early stage viral gene. “L” means the late stage viral gene. “?” m
eans the
ORF is an unclassified viral gene. “IE/E” indicates the ORF could be an immediately early gene or an early stage viral gene. “E/L”
indicates the ORF could be an early stage viral gene or an late stage viral gene.
32
Figure 2.1. Hierarchical clustering gene tree of SGIV temporal gene expression data.
Pearson’s correlation coefficient, in which the temporal expression ratios of genes
were compared pairwise and grouped according to their similarity, were applied to
cluster the viral genes by using genespring. Each column indicates a time point and
each row indicates the expression profile of a viral gene. Different colors were used to
illustrate the different expression level. Green color indicates a low expression level.
Red color indicates a high expression level. Black color indicates a intermediate
expression level. The color intensities mean the magnitude of down regulation or up
regulation from the mean, as the color bar indicates.
33
Figure 2.2. Validation DNA microarray result with real time RT-PCR. There genes
(ORF086R, ORF006R and ORF072R) were selected for studies. Gene expression
level (Y axis) was plotted against time (X axis). The left panel presents the DNA
microarray data and the right panel presents the real time RT-PCR data.
34
Figure 2.3. Effect of CHX treatment on SGIV gene expression. The vertical axis
represents the normalized expression level of viral genes in the presence of different
concentration of CHX with 3PFU/cell SGIV-infection at 2h p.i. and the horizontal
axis represents the normalized expression level of viral genes with mock SGIV-
infection and different concentration CHX treatment. a). 50µg/ml CHX; b). 100µg/ml
CHX; c). 200µg/ml CHX; d). 500µg/ml CHX.
35
Figure 2.4. SGIV gene expression profiles with aphidicoline treatment. Vertical axis
represents the viral expression level with aphidicoline treatment and 3PFU/cell SGIV-
infection and horizontal axis represents the viral expression level with mock
aphidicoline treatment and 3PFU/cell SGIV.
36
Table 2.2 SGIV primers
ORF Forward primer sense primer
ORF001L TGACCATGAAGAGTTTCCCGCTGCAA ACCTAACGCGCCACAAGCT ORF003R TGACCATGAGCTACGATAGATTCACCATGG GGCCTTGTTCGTCTTGACTGTA ORF004L TGACCATGTTACCCATCTTTCTAGCGTCG TGATCTCTTCCCACTACGCTG ORF005L TGACCATGTTTGCTTATCGGGTCGGC GCCGACGATGACAAGACCA ORF006R TGACCATGAGAGGTAAAAATTCCCATCGTC TTCTTCTTGCTCGTCGTCTG ORF007L TGACCATGTGACGATAACTATAAGCCCGG AGGGTATATCTATCGGTTCGGC ORF008L TGACCATGAAAGGGCGAAAAGGGACC CCCCTATGGGTCCTTGGTCT ORF009L TGACCATGGGCTACAAACGACCCACTG GAGCCAATTCCTGTCCCGT ORF010L TGACCATGCTTTTGGTTTCGAGCCCG CCCGAATCGTGGCAACAA ORF012L TGACCATGCCGAATGTTTGACCCGAA GAGCGCGGACAAGATGGAT ORF014L TGACCATGACTACGGGTTCGATGGCTG CACCCGTTGTCGCAGTTT ORF016L TGACCATGCGTTTTAGACGAGTGGCTGAC ATCTCCATGACATTTTCGCG ORF018R TGACCATGGTCAGAGACCAAGGCGTTCG CACGTCACATCTGGTCTACACA ORF019R TGACCATGAACCCTGACTCTTTGGATTGC TTTAACGTCGTCCATCACCA ORF020L TGACCATGTGCTAAACCTCAAGACTCTGGA CTCTTTTGGGGGCTCTGTATTG ORF021L TGACCATGTCAATTTGCTCCGCGGAA TACCTAGTCTGCCAAAGTCTGC ORF023R TGACCATGACTCTTGCGTCTTGTGGGG GCAATGGACACGCTATGGT ORF024L TGACCATGAAATCGGGGAAACCAAACT ATGTCCATATTGTACCAGTCCG ORF025L TGACCATGAAACCAAAAGACCTGCAGTGT AAGAGGTCTTCTGGCCAATTGT ORF026R TGACCATGGGACATCATGCCAGCAGAG CGTCGGCGTTATATCCTTG ORF029L TGACCATGAAACAAAGGCGGGCTCTATA CTCGTTTAGTTTGGGCAGTTT ORF030L TGACCATGCAAGCACATGCAAACGTACAC ACGCGTCTGATGTATCTTGTCT ORF031L TGACCATGTTCGTGCGAGGGTAGCTACT TCGTTTCGGACAGGTTCG ORF033L TGACCATGGGGAATCCGTAACCATAACACA GGTATTTCGTACGCTCTCAGGT ORF034L TGACCATGGATCCGAGCGTGTACATTTG TCCGGTATTTTTGTCGATCTGT ORF035L TGACCATGAATAAAAAAGACAGCGCCCT TGCAACACATGTATATCAACGG ORF036L TGACCATGGGAAAAATAACGTCTCGCTTGA AACTTTCTTGCAGGTATTCTCG ORF037L TGACCATGCGTGGTTTACATCGACAGTGAC ATAATGTTTTCCGCGTTTGC ORF038L TGACCATGGCAATATAGCCGAAACAGGG CGTTAAGTCCGTTGTTGTTGC ORF039L TGACCATGAATGGCGCGGAGATGTTT ATCTATCGATGAGCTGTTGACC ORF041L TGACCATGACGGCAGAGAAATTCGCG GACAATCACGCCTCCGGTT ORF042R TGACCATGTATTCTAATATGGACCGCGA ATTTTATGCACGGAACGGA ORF043R TGACCATGTGGGGCAGCTAAAGGAATT CGTTCTTCGCATACTTTTGA ORF045L TGACCATGGCATCCAAACGTAACCTCAGT CCGTCCGCACCCTTTATT ORF046L TGACCATGGGACGCGAACGATATAGTGA GAAGTCCTTGAGGGCCTGGT ORF047L TGACCATGCTTATTGTCCCGCGTTGG TTATTCGTACGTCCAGCTGC ORF048L TGACCATGTGTAGGCATCATCAAACGCA TACTCAAGTTCCATCAAGACGG ORF049L TGACCATGTGGCAGTACGTCACGGAATA CATTTTTTTTGCCTAACTCCGG ORF050L TGACCATGGAGAAATGCGGCAAAGGTAC CAATATATTGAAGTAGGGGCGC ORF051L TGACCATGGGTTATCAGTCCGAATGCGA TAATCCATTTTGCGGTTCTG ORF052L TGACCATGTCTTTAGGCAGTTTTACGACGA TTAAACTTGTTCCGCCGC ORF054R TGACCATGATAGTGGGCATGTGCATTTTG GCATAAATACCGCGGCAT ORF055R TGACCATGCGAGATAAACGGCAGCACT GCCGTCTTTACCCATAGCG ORF056R TGACCATGAGCGACGGAACCGAATTA TTAATTTGTAGAGCCTTGAGGG ORF057L TGACCATGACGTGTGGACCGAAAAGAAT GCTTATCGTAACCCGCTTTGT
Continued on following page
37
ORF Forward primer sense primer
ORF059L TGACCATGCGCCCACCGGTAAATGAGTA TTACAGGTAAGAATGCGCCC ORF060R TGACCATGTAGGCTGATGAAAATAAACGGC CCATGAGATCGTTTTCGGA ORF061R TGACCATGCAAAGATGTGGTGAAGGCG TCATCGCAAGGTGTAAAAGC ORF062R TGACCATGCCACCCAGGAGCCCTCTT GTTCTATGGGGTCTGGCGT ORF063L TGACCATGGGGCTCCTGGGTGATATCAT CGGCGGCGAATTGATAGA ORF064R TGACCATGTGTACCGTACCGCGTTTGAA GGTCCTGTTTGCTTTGGTTC ORF065R TGACCATGTACTGTCTGATATTGAAA TATGATATTTTAGTCGAA ORF066R TGACCATGATGACCCCAAAGTTTACGGG TTAGCCGCCGAACGAGTCT ORF067L TGACCATGCATTCACGCTAAACTCGACG TCCACAACTTCTTCTTTAGGCG ORF068L TGACCATGAAGAAGTGGCGTTTGGACA AGTCCTATTTTCGGTACGTTCG ORF069L TGACCATGAATGGAAAACGCGCAGTC TACCGCCATTATTCAGTTGACA ORF070R TGACCATGACGAAGCGTGGGAACATTT TGTATCTAATCCGTAAAGGCGT ORF071R TGACCATGTAACGGTGAACCAAACAG GAACGTCTACCTCTTTTGT ORF072R TGACCATGGGTATTTTTCAGCCTCCATTCC CACTACCAACGCGAACCTCT ORF073L TGACCATGTCGGAGAGATGGAACAGTGG TACCTATGGCCTTCAGTTCTTG ORF075R TGACCATGGCCACTCGTAAGATCGCCA GCGTATGGAACCGTAACCTT ORF076L TGACCATGGTAGGAATGAGCACAGCGC CTCGCCGCGAATTTTCTCTA ORF077L TGACCATGAACGAGTACGATTGGGCAA AAGCGAGCGTTGATTTTCC ORF078L TGACCATGTCCAAACACGCCAGAACAT GCCCGCGTCAATGAGTTT ORF081L TGACCATGCGCTAGATGGTCCGTGAAC GTCTGATCCGTACGCCTGTG ORF082L TGACCATGTTTACCCAGGAGCCACGT ACATTCTTCGTCGTCGCC ORF083R TGACCATGCAGAGAACCGGTGATGCT GCAATCGTTCCGTTTTCC ORF084L TGACCATGGTCCAATACCTCGCAAGCTT GTATGTCTTTTTCGCCACCAAC ORF085R TGACCATGCTATTACGAACTATGGTCCGCG AAGACGGTCATGGGTTCG ORF086R TGACCATGGATCGGGACGTTCGTTGT GAAAATCGTTGCCCGGTG ORF087R TGACCATGACCTGATGGGTATACGTGCAA CTCCGTTATGCGCCCAAT ORF088L TGACCATGAGAGTGCAGGGGACGTTGTG TGCAGCTTCTCCAACTTCG ORF089L TGACCATGGCCCATGAGTAATTTGTCCAAC TTAATCCATCCTGCGCCT ORF090L TGACCATGTGAAAAAGTGGCCGCAGA TACGGTCTTTAAATCCTCGAGC ORF091L TGACCATGAAACAAATGTGCTACGCTTCC TGGTTTGCCCACGTTTTGTAC ORF092L TGACCATGCAAATATGCTACGCGGAATC TTTGTTTTCGCGCGATAATC ORF093L TGACCATGTTATGCCCGAAAACTACGACT AATTTGGTCGTCGTTTGCA ORF096R TGACCATGGTATATGAACGCGTGCAC CTCCTATTCTCTACATTTGC ORF097L TGACCATGTGATACGAGCGCACGGAAG ACCTAGCGGCTACAAACAGTCT ORF098R TGACCATGACCGGGCGAAAGTTTATCA CGTTTGGCGCAGTCTCTT ORF099R TGACCATGGACACTGAAAAAGATTGCGTTG CATCTTTCACAATTGGACAACG ORF101R TGACCATGACGAAGACAGCTGGGCCAT CGATCTAAAGGTTCCACGACG ORF102L TGACCATGAAAACCATCACGATTGACGTC TTACACTCCTCCTCTCAATCGC ORF103R TGACCATGAATGCTAGTCATGTATCTCGGG TGAGAAGAGAATGGTTGGGC ORF104L TGACCATGAGACCCACGCCTATAGACAGAT CGAAGACGGTTCCTGTTTGTAT ORF105R TGACCATGTGTACCCTGTGCCCATTTT AGTAGACAGGATGATGCGCA ORF107R TGACCATGCAGAGGGCAACGGGACCT TATTATTCCGGCCGCCACG ORF108L TGACCATGGGTGGAGGACTCGGGAGA CCCTGATCCCTACGTTCTCTC ORF110L TGACCATGCTATTTTTCTCGCGACCGTAT TACAAGTTAAGATGCAGCTGGC ORF111R TGACCATGTAGAAACCATGAACATAGGCGA AGTTTACGGGAGTGCCATTT ORF112R TGACCATGAGAAAGGAGAGACGGGGATG CGACGTTTTCTTCTCCTTATGC ORF114L TGACCATGGCGACGGCGGTGTATTTT CGCTATGGCTCCGCTTTT
Continued on following page
38
ORF Forward primer sense primer
ORF115R TGACCATGTTTCGGACGGCAACTTGA AGCGTTGGTTACCGAAGAGT ORF116R TGACCATGTTGACCGATAAAAGAGACGACG CACCCCAAATGATCGTATAGTT ORF118R TGACCATGAACGGAAACCCCTCTAGAAAA CGAGGGACAAAACGAAGTAAGT ORF119R TGACCATGGTTTTTAACAACGGCTATTCGC TTTGCACCACGGAAGGAT ORF120L TGACCATGGTCAAAGCGTTCGCCGTT GAGGTGTCCCATATGGCGT ORF121R TGACCATGTGGACTCTTGCACGATATATCA TCGGGTATGGAAAATCTCAGT ORF122L TGACCATGGGATACGCTCGCTGCAAA CCATTTTTTCACCAACGCC ORF123L TGACCATGCCGAGCGAGTAATAGTGTTTTC TCATAAATTGGCGAACTCGTTG ORF124R TGACCATGGTGCCGTGCGTAGGATTT CGTGCCTCCTATAAAAGCGT ORF125R TGACCATGCCGTGCGTGGGTATGTGTA CATTGACCTCCTGAGAACGTTC ORF126R TGACCATGGGGTGGCCGTGTACAAAT TGAGTATATCCCCACGAGCCT ORF127R TGACCATGTGTTCTGCCGAAAATCAATGTC ATCCCCATTTACCTCCGTTA ORF128R TGACCATGAGGCTCGAGTATGTGGTCTTAA CCTTTCTGTATGCCACCCTC ORF131R TGACCATGGATTTTTCGGCGGTGCTT CAATAAACGGTAAGCGCCA ORF132R TGACCATGGAAAAACCGTTTAGGAGCGTAT TGAGATATGCGCCTTGTACTGT ORF133R TGACCATGGTATACTGCGAAGGACCTCCC GCGCCTCTTCTTCCATATCG ORF134L TGACCATGGGCAAATAAAACTGGATGTGGA CGTTAAAAAATCTGGCGCGT ORF135L TGACCATGCGGAGCAACACCACGTCTATA GGTTCCGGCGTCTTCTTT ORF136R TGACCATGGGATGAATCAAGAAATGCAGAC GAAAAGGGATGCAGCAACA ORF137R TGACCATGGGTATGCCCGATGGAAACT CCCCGCAGTTGAATGGAA ORF139R TGACCATGCGTATCAAAACGACAAACTCCA TTTTAAATACTACGCCGCGA ORF140R TGACCATGGAAAAAGCTTTAGAGACGGCG TTCTTTTCCGCCTCTTCG ORF141R TGACCATGCGTTGGTGATCGCGACTAT GCGAGGTTGTTGGGATAGTAAT ORF142L TGACCATGTTCTGGTCCGCCTCTGTTG AAACTGTTGGCACTGGCG ORF143L TGACCATGATGGACGAGTTTCTCCGTACTA ACGCCCAAAAAGATACCCAAC ORF144R TGACCATGTGTGAGAGCGGGCCAGATA TTCCCGCCTTGTCTATGG ORF145R TGACCATGACGTGTGTATGTACTCCGGCG CGGGAGGTAAGGCTTTCTG ORF146L TGACCATGGCTGGGAATGCTTGAGAAA GTCGGCAATGTCAACAATCTG ORF147L TGACCATGCGTGATGACCGTGAGCAAATT TTACCTTAACCACGATGCGA ORF148R TGACCATGTTTACGGTTACGAGGATACACG GCTTCCGAAAGACATACGTGC ORF149R TGACCATGCCCTCACAGTGCAAGACAAG TTGTTTCGACGTCGTACCAT ORF150L TGACCATGGCGGAGATGCTTGTTCAGAT AATGATTGGCCCTTTCCG ORF151L TGACCATGGGGTACGGTGTTGGAAATTATA CTTCAAAAAGCAATTCCACCG ORF152R TGACCATGGGATAAGACGGCGTGGTGT CCACCGCGCAGTTTAGAT ORF154R TGACCATGGATTTCATCCGTTTCTGGTACA AAGCGCGTCGAAAAGATC ORF155R TGACCATGCGACATTTCCGGGTGCAA CGTCTTTCTTCTCCGTTTCGT ORF156L TGACCATGTACAATGCCATCTCGCGA CTAATAATTTTTCAGCTGCGCG ORF157R TGACCATGAGGCGGCGAAGAATGGTT CGGTTTCCAAAGTATTCGGC ORF158L TGACCATGTACGCACCTAAACCACGGG TTAAACGGCCGCCAAAGT ORF159R TGACCATGGCCCCGGTCACAAACAAA CATACCTTTTCCTCGCTTGTG ORF160L TGACCATGTGCACTACGGTCACATTCGA AGGCACGCTCACAACGAT ORF162L TGACCATGAAACGCGTGGTAAAGATGGT TCACTCTTCTTCGTCGGGA
39
Table 2.3. Partial cDNA sequences of β-actin and GAPDH
Gene name cDNA Sequence β-actin GAGAAGAGCT ACGAGCTGCC TGACGGACAG
GTCATCACCA TCGGCAATGA GAGGTTCCGT TGCCCAGAGG CCCTCTTCCA GCCTTCCTTC CTCGGTATGG AGTCCTGCGG AATCCACGAG ACCACCTACA ACAGCATCAT GAAGTGCGAC GTCGACATCC GTAAGGACCT GTACGCCAAC ACCGTGCTGT CTGGAGGTAC CACCATGTAC CCAGGCATCG CTGACAGGAT GCAAAAGGAG ATCACAGCCC TGGCCCCATC CAC
GAPDH GGACACCACC TGGTCCTCGG TGTAACCCAA CACTCCCTTC ATGGGCCCAT GTGCGGCCTT CTTGCAGGCT TCCTTAATGT CAGCATAAGA TGCAGGCTTG GACAGACGGC ATGTCAGGTC CACCACTGAC ACATCAGCAA CTGGCACCCT GAATGCCATG CCTGTCAGCT TACCGTTGAG CTCAGGGATG ACTTTGCCCA CTGCCTTGGC GGCACCGGTG GAGGCTGGAA TGATGTTTGT GTGGGCACAG CGGCCATCAC GCCAGGCCTT GGCGCTGGG
40
Table 2.4. Primers for Real time PCR
Gene Forward primer Reverse primer
β-actin TACGAGCTGCCTGACGGACA GGCTGTGATCTCCTTTTGCA ORF086R GATCGGGACGTTCGTTGT GAAAATCGTTGCCCGGTG ORF006R AGAGGTAAAAATTCCCATCGTC TTCTTCTTGCTCGTCGTCTG ORF072R GGTATTTTTCAGCCTCCATTCC CACTACCAACGCGAACCTCT
41
Table 2.5. S
GIV
genes with temporal expression on array
a
Continued
on follow
ing p
age
ORF
Putative Fun
ction
DTb
PTc
0H P.I.
1H P.I.
4H P.I.
8H
P.I.
16H
P.I.
32H
P.I.
48H
P.I.
72H
P.I.
96H
P.I.
ORF00
1L
Unknown
4 32
0.0676
0.0486
0.753
1.998
7.471
22.15
20.17
10.55
8.324
ORF00
3R
3-beta-hydroxy-delta-5-C27
-steroid oxidoredu
ctase
4 32
0.01
0.01
1.1
2.144
8.419
20.35
19.62
10.2
7.823
ORF00
4L
Unknown
4 96
0.01
0.01
0.763
2.062
7.603
16.49
17.55
16.04
21.98
ORF00
5L
Unknown
4 32
0.0624
0.119
0.423
1.926
8.427
23.18
22.17
12.91
8.006
ORF00
6R
EARLY protein
4 48
0.01
0.01
0.631
2.1
6.742
16.8
19.73
12.83
11.73
ORF00
7L
Unknown
4 32
0.0362
0.047
0.108
1.211
13.49
32.47
27.13
12.31
8.244
ORF00
8L
Unknown
1 48
0.0621
0.165
0.106
0.953
9.051
26.89
35.49
14.58
11.15
ORF00
9L
Unknown
8 32
0.0275
0.0267
0.0328
1.145
16.82
32.8
27.22
9.999
6.402
ORF01
0L
Unknown
1 48
0.082
0.171
0.128
1.367
9.598
27.19
27.4
9.325
5.572
ORF01
2L
Capsid protein
4 32
0.0563
0.051
0.162
1.339
11.72
25.64
23.05
9.238
5.819
ORF01
4L
Unknown
4 32
0.0488
0.0424
0.71
2.104
9.42
23.28
29.83
12.25
9.543
ORF01
6L
Unknown
4 32
0.01
0.01
0.505
2.552
12.12
24.91
23.62
12.96
11.66
ORF01
8R
Unknown
4 32
0.021
0.01
0.0934
2.01
15.78
26.17
19.02
9.831
5.658
ORF01
9R
Myristylated mem
brane protein
8 48
0.0492
0.0636
0.0912
1.604
9.501
25.6
14.47
9.851
7.464
ORF02
0L
Unknown
4 48
0.02
0.01
0.895
1.816
13.69
27.78
32.57
8.795
8.28
ORF02
1L
Unknown
1 48
0.0609
0.21
0.366
1.775
9.658
23.62
23.91
9.112
8.518
ORF02
4L
Unknown
4 48
0.165
0.0469
1.617
1.319
4.127
12.25
15.15
9.985
10.69
ORF02
5L
Coding SAP m
otif
4 32
0.0207
0.01
0.423
2.838
15.51
26.23
22.48
8.822
6.466
ORF02
6R
Unknown
4 32
0.0384
0.0255
0.146
1.602
13.2
32.56
32.26
12.47
7.526
ORF02
9L
Hom
olog to im
munoglobu
lin (Ig)-like domains
4 96
0.01
0.01
0.524
0.912
5.452
16
21.16
27.62
32.77
ORF03
0L
Tegum
ent p
rotein
4 32
0.0208
0.0213
0.54
1.537
9.892
22.33
16.16
10.07
10.7
ORF03
1L
Hom
olog to im
munoglobu
lin (Ig)-like domains
4 32
0.0272
0.0263
0.5
2.181
7.568
18.74
18.43
14.85
14.12
ORF03
3L
Hom
olog to im
munoglobu
lin (Ig)-like domains
4 96
0.01
0.01
0.756
5.528
12.19
14.66
13.23
16.83
29.18
ORF03
4L
Unknown
4 48
0.0263
0.01
0.754
2.515
11.86
26.78
27.13
12.37
13.66
ORF03
5L
Hom
olog to im
munoglobu
lin (Ig)-like domains
4 32
0.01
0.01
0.471
2.036
13.04
29.14
18.47
11.79
10.3
ORF03
6L
Unknown
4 96
0.01
0.01
0.829
1.483
6.465
18.43
23.26
18.28
24.67
ORF03
7L
Unknown
4 32
0.0366
0.052
0.706
1.858
8.713
21.37
16.85
10.4
10.75
ORF03
8L
Capsid protein
4 32
0.0211
0.01
0.0532
1.171
13.25
33.76
24.09
8.172
5.549
ORF03
9L
Unknown
4 32
0.0212
0.0286
0.0951
1.473
13.9
26.7
20.69
8.49
6.631
ORF04
1L
Unknown
4 72
0.0825
0.0663
0.914
1.505
6.3
15.74
21.53
22
20.21
ORF04
2R
Unknown
4 32
0.0568
0.0884
0.377
1.842
9.633
24.85
21.41
11.61
8.757
ORF04
3R
Unknown
4 32
0.01
0.01
0.0644
2.403
13.3
24.19
21.89
10.44
5.475
ORF04
5L
Unknown
4 32
0.0245
0.0224
0.0794
0.714
13.09
32.39
24.11
16.45
12.05
ORF04
6L
Unknown
4 32
0.0296
0.01
0.0978
0.55
12.45
36.83
33.87
18.44
15.28
ORF04
7L
Ribonucleoside-diph
osphate reductase beta subunit
4 32
0.0731
0.0879
0.482
1.458
8.156
21.12
17.71
9.143
10.66
42
ORF
Putative Fun
ction
DTb
PTc
0H P.I.
1H P.I.
4H P.I.
8H
P.I.
16H
P.I.
32H
P.I.
48H
P.I.
72H
P.I.
96H
P.I.
ORF04
8L
CARD-like Caspase
4 48
0.01
0.01
0.652
1.992
9.251
21.88
26.09
17.89
17.49
ORF04
9L
dUTPase
4 48
0.01
0.01
0.672
2.034
8.252
18.85
20.8
19.09
19.04
ORF05
0L
Unknown
4 48
0.0322
0.0218
0.817
1.183
5.301
17.54
24.26
21.71
22.6
ORF05
1L
Unknown
1 48
0.01
0.0244
1.785
1.981
4.91
15.91
16.68
10.66
9.028
ORF05
2L
D5 family NTPase
4 48
0.0918
0.0547
0.416
1.153
9.033
26.19
28.84
9.909
7.188
ORF05
4R
Unknown
4 32
0.01
0.01
0.322
1.794
8.647
23.03
20.35
10.31
7.666
ORF05
5R
Unknown
1 32
0.01
0.0233
0.108
1.63
14.78
33.7
26.64
12.86
7.446
ORF05
6R
Unknown
4 32
0.025
0.01
0.111
1.275
9.603
23.79
21.52
10.7
7.876
ORF05
7L
Unknown
4 48
0.0388
0.039
0.13
1.705
12.5
29.86
31.98
11.56
7.81
ORF05
9L
Unknown
4 48
0.0774
0.057
0.233
1.331
10.52
30.67
31.84
10.77
8.641
ORF06
0R
NTPase/helicase
4 48
0.049
0.0404
0.102
1.642
11.24
26.16
32.79
10.74
6.805
ORF06
1R
Unknown
4 32
0.0309
0.0371
0.132
1.923
12.73
27.58
24.01
11.04
7.011
ORF06
4R
Ribonucleoside-diph
osphate reductase alph
a subunit
4 48
0.01
0.01
0.306
1.278
6.551
34.87
42.32
29.13
22.34
ORF06
5R
Unknown
4 96
0.0381
0.047
0.48
1.523
5.881
25.25
24.26
28.64
36.1
ORF06
6R
Unknown
4 48
0.01
0.01
0.558
1.78
6.129
21.35
22.22
15.83
12.21
ORF06
7L
Deoxynu
cleoside kinase
4 48
0.01
0.01
0.93
2.278
8.441
20
20.67
11.68
9.749
ORF06
8L
Immun
oglobu
lins and m
ajor histocompatibility com
plex
proteins signature
4 48
0.0305
0.0214
0.207
1.98
11.6
31.02
38.9
10.41
7.365
ORF06
9L
Unknown
4 48
0.01
0.01
0.543
1.761
14.82
31.31
33.09
15.71
11.79
ORF07
0R
Thiol oxidoredu
ctase
4 32
0.043
0.0555
0.24
1.825
9.335
24.46
23.95
10.22
8.029
ORF07
1R
Unknown
4 48
0.064
0.0328
1.119
2.007
4.2
11.29
16.66
13.54
16.09
ORF07
2R
Major capsid protein
8 32
0.021
0.0284
0.0283
1.222
17.27
30.08
20.96
8.718
5.25
ORF07
3L
DNA-directed RNA polym
erase II second largest subunit
4 48
0.0738
0.0546
0.489
1.485
7.783
26.29
33.79
13.45
11.39
ORF07
5R
Unknown
4 48
0.01
0.01
0.14
1.083
13.18
27.97
28.75
16.94
11.27
ORF07
6L
Purine nucleoside phosphorylase
4 48
0.0299
0.0294
0.231
1.232
10.27
29.73
31.43
12.88
11.45
ORF07
7L
Unknown
4 48
0.0463
0.0396
0.512
1.65
9.402
26.18
28.96
13.71
12.68
ORF07
8L
Tyrosine kinase
4 32
0.102
0.0598
0.42
1.507
10.02
20.32
18.95
9.266
9.412
ORF08
1L
Tyrosine kinase
4 32
0.0566
0.0525
0.193
1.904
11.37
31.35
30.6
10.18
7.478
ORF08
2L
Unknown
4 48
0.078
0.0644
0.505
1.467
10.6
25.71
28
14.33
12.86
ORF08
3R
Unknown
1 48
0.0521
0.124
0.333
1.341
8.063
25.42
29
11.8
10.63
ORF08
4L
RNase III
4 32
0.0576
0.0679
0.371
1.476
10.41
26.03
23.15
10.88
9.993
ORF08
5R
Transcription elongation factor TFIIS
4 48
0.0612
0.0594
0.762
2.242
6.375
14.81
18.15
11.99
12.92
ORF08
6R
Immediate-early protein ICP-18
1 96
0.01
0.0213
1.252
2.721
4.249
8.056
12.99
13.17
18.25
ORF08
7R
Unknown
4 96
0.01
0.01
0.715
2.341
9.216
17.87
18.71
18.68
24.32
ORF08
8L
Myristylated mem
brane protein
4 48
0.0565
0.0444
0.159
1.161
13.32
29.55
32.1
11.92
9.186
Continued
on follow
ing p
age
43
ORF
Putative Fun
ction
DTb
PTc
0H P.I.
1H P.I.
4H P.I.
8H
P.I.
16H
P.I.
32H
P.I.
48H
P.I.
72H
P.I.
96H
P.I.
ORF08
9L
Unknown
1 32
0.01
0.0204
0.202
1.45
17.46
34.46
28.87
11.23
7.823
ORF09
0L
Unknown
4 32
0.033
0.0264
0.373
1.572
13.69
31.21
26.46
9.846
8.51
ORF09
1L
Unknown
4 48
0.0348
0.0562
0.894
1.139
4.283
13.93
38.91
12.84
13.11
ORF09
2L
Unknown
4 32
0.01
0.01
0.412
1.502
10.76
26.74
21.91
9.616
5.963
ORF09
3L
Unknown
4 32
0.01
0.01
0.13
1.773
14.08
23.74
21.03
7.382
5.424
ORF09
6R
Unknown
4 32
0.031
0.026
0.283
1.794
12.58
32.71
31.14
15.21
14.27
ORF09
7L
DNA repair protein RAD2
4 32
0.0225
0.01
0.635
2.402
9.911
26.63
19.38
13.39
9.514
ORF09
8R
Unknown
4 32
0.01
0.01
0.205
2.236
13.87
35.29
32.98
13.4
8.163
ORF09
9R
Unknown
1 32
0.01
0.0307
0.262
2.374
9.022
25.88
23.92
9.494
5.497
ORF10
1R
Unknown
4 32
0.01
0.01
0.394
1.559
14.97
34.59
20.56
9.695
8.43
ORF10
2L
Ubiqutin/ribosomal protein
4 96
0.01
0.01
0.831
1.148
3.968
13.22
19.46
32.81
43.58
ORF10
3R
Unknown
4 32
0.097
0.0838
0.404
2.18
11.1
20.2
16.7
8.654
7.217
ORF10
4L
DNA-dependent RNA polym
erase II largest subunit
4 48
0.0932
0.0665
0.278
1.583
7.875
23.96
31.07
13.21
11.6
ORF10
7R
Unknown
4 48
0.0581
0.0727
0.304
1.23
6.632
16.14
16.99
11.33
9.846
ORF11
0L
Unknown
4 32
0.0236
0.0258
0.327
1.231
11.62
28.13
22.28
17.28
16.49
ORF11
1R
Unknown
1 32
0.0255
0.0941
0.623
4.472
7.711
12.54
12.23
9.275
10.05
ORF11
2R
Unknown
4 32
0.01
0.01
0.225
2.294
21.49
35
24.18
15.41
15.29
ORF11
4L
Unknown
4 32
0.0233
0.0202
0.0607
1.545
16.16
34.98
27.62
11.02
6.353
ORF11
6R
Putative replication factor
1 48
0.0399
0.0966
0.617
2.471
10.55
26.74
30.36
12.82
11.1
ORF11
8R
Unknown
4 32
0.01
0.01
0.34
3.044
12.51
23.46
20.94
9.413
5.483
ORF11
9R
Unknown
4 32
0.0273
0.01
0.116
1.563
13.15
28.87
25.62
11.43
7.243
ORF12
0L
Unknown
4 32
0.01
0.01
0.0825
1.222
19.32
33.36
22.25
9.247
7.148
ORF12
1R
Unknown
4 32
0.01
0.01
0.0808
1.47
20.49
29.24
22.94
9.974
6.213
ORF12
2L
Unknown
4 48
0.01
0.01
0.614
1.927
9.268
18.3
20.29
14.28
12.15
ORF12
3L
Unknown
4 48
0.061
0.0441
0.674
1.897
7.898
22.55
23.5
15.89
13.47
ORF12
4R
Unknown
4 32
0.01
0.01
0.514
3.841
17.11
22.87
17.72
14.33
15.29
ORF12
5R
Unknown
4 96
0.0275
0.0255
0.437
2.593
11.96
17.19
31.53
22.17
38.52
ORF12
6R
Unknown
8 32
0.01
0.01
0.01
0.391
14.03
44.12
36.53
23.62
21.31
ORF12
7R
Unknown
4 32
0.01
0.01
0.148
0.791
22.69
79.21
33.57
17.47
15.88
ORF12
8R
DNA polym
erase
4 32
0.01
0.01
0.854
1.615
7.494
25.33
22.54
12.61
8.829
ORF13
1R
Hom
olog to im
munoglobu
lin (Ig)-like domains
4 96
0.0539
0.0754
0.918
1.203
5.303
18.53
21.35
20.99
23.78
ORF13
2R
Unknown
1 32
0.0295
0.0985
0.339
1.303
9.76
20.66
15.43
10.86
10.5
ORF13
3R
Unknown
4 32
0.01
0.01
1.021
1.875
7.432
16.13
14.4
9.866
8.449
ORF13
4L
ATPase
4 32
0.0235
0.01
0.425
2.598
13.28
25.22
24.5
11.49
8.952
ORF13
5L
Unknown
4 32
0.01
0.01
1.192
2.829
9.361
22.13
18.94
10.3
7.891
ORF13
6R
Unknown
4 96
0.0301
0.01
0.493
2.151
7.686
19.47
22.58
21.63
23.64
ORF13
7R
Unknown
4 96
0.0375
0.0502
0.392
1.95
6.597
13.74
14.71
12.94
15.2
ORF13
9R
Unknown
4 32
0.01
0.01
0.362
1.495
9.121
18.45
17.81
8.388
6.771
Continued
on follow
ing p
age
44
ORF
Putative Fun
ction
DTb
PTc
0H P.I.
1H P.I.
4H P.I.
8H
P.I.
16H
P.I.
32H
P.I.
48H
P.I.
72H
P.I.
96H
P.I.
ORF14
0R
Unknown
8 32
0.01
0.01
0.01
0.723
13.38
27.84
26.64
12.96
9.252
ORF14
1R
Glycoprotein
4 32
0.01
0.01
0.0387
1.171
31.62
51.5
33.57
15.68
11.44
ORF14
3L
Unknown
1 48
0.01
0.0279
0.12
2.067
7.829
20.15
22.41
11.44
10.79
ORF14
4R
Fibroblast grow
th factor
4 96
0.0279
0.0289
0.785
3.053
5.551
13.37
15.79
19.05
27.75
ORF14
5R
Fibroblast grow
th factor
4 72
0.01
0.01
0.652
2.563
5.48
12.72
14.26
16.01
13.25
ORF14
6L
NTPase/helicase
1 32
0.01
0.0248
0.951
1.361
7.03
20.28
19.98
20.21
15.73
ORF14
7L
Unknown
4 48
0.01
0.01
0.281
0.91
9.568
28.73
30.69
17.97
14.65
ORF14
8R
Unknown
4 32
0.0486
0.044
0.618
1.897
9.069
20.49
16.8
9.829
8.848
ORF14
9R
Unknown
4 96
0.01
0.01
1.026
2.106
3.765
9.37
14.54
18.28
19.66
ORF15
0L
Unknown
4 32
0.0302
0.01
0.498
3.45
8.257
19.03
17.93
8.851
6.952
ORF15
1L
Unknown
4 32
0.01
0.01
0.26
7.408
12.19
16.55
12.89
7.02
4.962
ORF15
2R
Helicase
4 32
0.0367
0.0612
0.654
1.434
8.278
22.94
19.35
13.58
11.46
ORF15
4R
Unknown
1 48
0.0241
0.0514
0.691
1.393
8.316
23.51
30.19
13.48
11.05
ORF15
5R
Hom
olog to mam
malian semaphorin
1 32
0.01
0.026
0.246
1.958
17.64
31.76
29.41
12.62
10.63
ORF15
6L
Unknown
4 48
0.01
0.01
0.367
1.449
10.5
27.01
27.92
14.12
9.041
ORF15
7R
Unknown
4 32
0.0306
0.0224
0.295
3.237
8.808
23.21
20.38
11.13
9.478
ORF15
8L
Unknown
4 48
0.0686
0.0571
0.302
1.262
8.138
18.93
27.84
16.71
13.81
ORF15
9R
Unknown
4 48
0.0598
0.0567
0.474
1.415
9.523
25.76
27.08
13.15
11.37
ORF16
0L
Unknown
4 32
0.0416
0.055
0.245
1.675
10.52
25.74
24.39
11.67
8.834
ORF16
2L
Immediate-early protein ICP-46
4 48
0.114
0.0389
0.766
1.421
6.881
22.91
28.38
12.17
8.785
a The DNA M
icroarray data from the M
icroarray im
ages were norm
alized using “per spot divided by control channel and per chip normalized to
β-actin”.
b DT m
eans the tim
e (hours post infection) at which the exp
ression of a viral gene is, for first tim
e, showed two fold up-regulating com
pared to
baseline exp
ression (0h p.i.).
c PT m
eans the tim
e (hours post infection) at which the transcriptome of a viral gene accumulates to its maxim
um amount.
45
Chapter Three
iTRAQ study of Grouper embryonic cells
infected by Singapore Grouper Iridovirus: an
insight into pathogen and host cell interaction
Chen et al. 2008. iTRAQ analysis of Singapore grouper iridovirus infection in a
grouper embryonic cell line. J Gen Virol.. (Accepted).
46
3.1 Summary
Here we report the first proteomics study of grouper embryonic cells (GEC) infected
by Singapore grouper iridovirus (SGIV). The differential proteomes of GEC with and
without virus infection were studied and quantitated with iTRAQ labeling followed
by LC-MS-MS. There are forty nine viral proteins recognized and eleven of them
were identified for the first time. Moreover, approximately seven hundreds host
proteins were revealed. In SGIV-infected grouper embryonic cells, fourteen host
proteins were up-regulated and five host proteins were down-regulated. In this chapter,
we present a new insight into the interaction of the virus and the host cell at a
proteome scale. This study might significantly contribute to the understanding of
infection mechanism and pathogenesis of SGIV.
47
3.2 Introduction
Recently, Chen and coworkers discovered that 127 ORFs of SGIV were
transcriptionally active (Chen et al., 2006). Although a total of 51 SGIV proteins were
identified, the translational products of the remaining 111 ORFs are unknown (Song
et al., 2004; Song et al., 2006).
Isobaric Tags for Relative and Absolute Quantification (iTRAQ) can be used for the
comparison of up to four different samples simultaneously (Ross et al., 2004. ). In this
study, iTRAQ was used to investigate the quantitative proteomics analysis of SGIV-
infected cells compared to the SGIV mock-infected cells. To our knowledge, this is
the first proteomics study on iridovirus and host interaction.
3.3 Materials and methods
3.3.1 Cell and virus infection
Protocols for cell and virus infection were described earlier in Chapter 2.
3.3.2 iTRAQ Labeling and Two Dimensional (2D) LC-MALDI MS
The mock-infected cells and infected cells (48 hours post infection of SGIV) were
lysed with lysis buffer (0.5 M triethylammonium bicarbonate, pH 8.5, containing 1 %
SDS). One hundred micrograms of total proteins from SGIV-infected cell lysate and
equivalent amount of proteins from mock-infected cell lysate were used for iTRAQ
experiments. Firstly, the proteins were reduced and cysteines blocked according to the
iTRAQ kit‘s protocol (Applied Biosystems). Then, the samples were 10X diluted with
0.5 M triethylammonium bicarbonate, pH 8.5. 25 microliters of a 1 µg/µL trypsin
(Applied Biosystems) and incubated at 37 °C overnight. The samples were vacuum-
dried and reconstituted with 30 µL 0.5 M triethylammonium bicarbonate, pH 8.5. The
proteins from the mock-infected cells were labeled with iTRAQ 114 reagents, and the
proteins from the SGIV-infected cells were labeled with iTRAQ 115 reagent. After
48
that, the labeled samples were pooled and purified using a strong cation exchange
(SCX) column (Applied Biosystems). The bound peptides were eluted with 5 %
NH4OH in 30 % methanol.
After drying, the iTRAQ-labeled peptides were resuspended with 20 µL 5 mM
KH2PO4 buffer containing 5 % ACN, pH 3.0, and separated using an Ultimate dual-
gradient LC system (Dionex-LC Packings). At the first dimension separation, we used
a 0.3 × 150-mm SCX column (FUS-15-CP, Poros 10S, Dionex-LC Packings). The
mobile phase A and B was 5 mM KH2PO4 buffer, pH 3.0, containing 5 % ACN and 5
mM KH2PO4 buffer, pH 3.0, containing 5 % ACN and 500 mM KCl, respectively,
with a flow rate of 6 µL/min. The eluants with step gradients of mobile phase B
(unbound, 0-5, 5-10, 10-15, 15-20, 20-30, 30-40, 40-50, and 50-100 %) were captured
alternatively with two 0.3 × 1-mm trap columns (3-µm C18 PepMap, 100 Å, Dionex-
LC Packings) and washed with 0.05 % trifluoroacetic acid (TFA) to remove salts. At
the second dimension separation, we used a 0.2 × 50-mm reverse-phase column
(Monolithic PS-DVB, Dionex-LC Packings) by using 2 % ACN with 0.05 % TFA as
mobile phase A and 80 % ACN with 0.04 % TFA as mobile phase B, respectively,
with a gradient of 0-60 % mobile phase B in 15 min and a flow rate of 2.7 µL/min.
The LC fractions were mixed with MALDI matrix solution in a flow rate of 5.4
µL/min through a 25-nL mixing tee (Upchurch Scientific, Oak Harbor, WA) and
spotted onto 192-well MALDI target plates (Applied Biosystems) with a Probot
Micro Fraction collector (Dionex-LC Packings).
The samples were analyzed by using an ABI 4700 Proteomics Analyzer MALDI
TOF/TOF mass spectrometer (Applied Biosystems, Foster City, CA). GPS Explorer
software version 3.5 (Applied Biosystems) employing MASCOT search engine
(version 2.1; Matrix Science) was used for peptide and protein identification and
49
iTRAQ quantification. The NCBI database was used as the database. Cysteine
methanethiolation, N-terminal iTRAQ labeling and iTRAQ labeled lysine were
selected as fixed modifications during iTRAQ data analysis.
3.3.3 RT PCR
Total RNA was extracted from infected cell culture at late stage (48 hours post
infection) using an RNaesy mini kit (QIAGEN). The extracted RNA was treated with
RNase-free DNase I (QIAGEN). RT-PCR was conducted as described in one step RT
PCR kit (QIAGEN) protocol using treated RNA and gene specific primers. Briefly, in
reverse transcription step, cDNA was synthesized from total RNA at 50○C for 30 min.
The reverse transcriptase was inactivated by a heating step at 95○C for 15 min. The
PCR amplification of target genes was started after activation of the HotStarTaq DNA
polymerase. The amplification reactions were carried out for 30 cycles under
conditions of 95○C for 30s, 55○C for 30s and 72○C for 1.5min per cycle. The RT-PCR
products then were analyzed with 1.2% agarose gel.
3.3.4 Western blotting
Infected cells (48 hour post infection) and mock-infected cells (for mock-infected
control) were harvested. The cell lysate was treated with 5X SDS loading dye and
boiled for 10 min. The proteins were separated by SDS-PAGE 15% gel and blotted
onto a Nitrocellulose membrane (Amersham) in blotting buffer (3.21 g Tris base,
14.25 g glycine and 200 ml methanol per liter) for 1h at 70V. The blotted membrane
was then blocked in 5% skim milk, 1% BSA in TBST (20 mM Tris pH 7.4, 154 mM
NaCl, 0.1% Tween 20) for 1hour, washed 3 × 10 min with TBST. The membranes
were incubated with the first antibodies (different dilution for different antibodies) for
1 hour, washed with 3 × 10 min TBST, treated with second antibodies for 1hour: anti-
mouse antibody (Chemicon) or anti-rabbit antibody (Amersham), washed 3 × 10 min.
50
After that, the membranes were treated with a mixture of Super Signal West Pico
Stable Peroxide Solution and luminol enhancer solution for 5 min and ready for
exposure to X-ray film and image developing.
3.4 Results
3.4.1 Identification of viral proteins by using iTRAQ
By using iTRAQ, a total of 49 viral proteins were identified. The total ion score
confidence interval calculation total ion (total ion score C.I. %) of 48 viral proteins is
larger than 95% (Table 3.1. & Table 3.2). Only ORF022L, which has been reported
previously, showed a total ion score C.I. % of 86.351%. A total of 38 viral proteins
have been reported earlier (Song et al., 2004; Song et al., 2006), and these proteins
are listed in Table 3.1. The remaining 11 viral proteins identified from iTRAQ all
showed a total ion score C.I. % >95% and were first reported in this study. These 11
newly identified viral proteins are listed in Table 3.2. Among these 11 newly
identified viral proteins, only ORF049L was predicted to a dUTPase-like protein. The
other 10 proteins are functionally unknown and are considered as novel proteins.
Given that 162 open reading frames was predicted from the SGIV genome, only 49
viral proteins were identified by iTRAQ. The reasons for this phenomenon could be: 1)
not all 162 open reading frames have translational products in cell culture; 2) iTRAQ
is not sensitive enough to identify the low abundant proteins; 3) some of viral proteins
might be expressed at a different stage of viral infection.
3.4.2 Identification of differential expression host proteins by using iTRAQ
Given the unknown nature of the grouper Epinephelus tauvina genome, when we
analyzed the iTRAQ data against NCBI, only 743 host proteins were identified (Table
3.5). Among these host proteins, only 726 host protein showed a total icon score C.I.
% >95% (Table 3.5). When we compared the SGIV-infected GECs (iTRAQ 115
51
labeled) to the mock SGIV-infected GECs (iTRAQ 114 labeled), 12 host proteins
(total icon score C.I. % >95%) were >1.5 fold up-regulated, and 5 host proteins (total
icon score C.I. % >95%) were >1.5 fold down-regulated (Table 3.3. & Table 3.4.).
Among the up-regulated host proteins, ATP synthase F1 alpha subunit in ATP
synthesis coupled proton transport; dihydrofolate synthase in folic acid and derivative
biosynthetic process (Tettelin et al., 2005); mucin-associated surface protein in the
innate immune system; alpha-2-macroglobulin isoform 1 is involved in protease
inhibition; transposase is an enzyme that binds to single-stranded DNA and can
incorporate it into genomic DNA. In the down-regulated proteins, Chain A Rotamer
Strain As A Determinant Of Protein Structural Specificity and ribosomal protein S27a
are involved in ubiquitination; histone H3 plays critical roles in genomic DNA
packaging and gene transcriptional regulation; Rho GTPase is involved in complex
mechanical processes such as cell motility and phagocytosis (Wherlock and Mellor,
2002). Given that two up-regulated proteins, Matrix protein and putative head-tail
adaptor, do not match to any of the 162 ORFs from the SGIV genome, we classified
them into host proteins.
3.4.3 RT-PCR and western blot analysis of the viral proteins
The transcriptsomes of the 11 newly identified proteins were characterized by using
RT-PCR (Table 3.6). Figure 3.1 shows that these 11 new viral proteins have full
length transcriptional products. The transcriptional products of the other 38 viral
proteins have been reported (Song et al., 2004; Song et al., 2006). We generated
specific antibodies against 5 viral proteins (i.e. ORF018R, ORF026R, ORF093L,
ORF135L, and ORF140R) by using the purified recombinant viral expressed in E.
coli. The full length protein products for ORF018R, ORF026R, and ORF093L have
not been reported before. Here, by using specific antibodies, we demonstrate that
52
ORF018R, ORF026R and ORF093L encode full length proteins. In addition, we
further demonstrated that the two newly identified viral proteins, ORF135L and
ORF140R, code for two full length viral proteins.
3.4.4 Up regulation of host Histone H3 Lysine 79 (K79) methylation upon SGIV
infection
iTRAQ data showed that host histone H3 was down-regulated upon SGIV infection
(Table 3.4). To validate this finding, we carried out western blot against histone H3
by using the commercial polyclonal antibodies against histone H3. We were surprised
to find that total histone H3 was not down-regulated upon SGIV-infection (Data not
shown). Because we did not include any kinds of post translational modification when
we analyzed the iTRAQ data, the iTRAQ data only showed that the non modified
form of histone H3 was down-regulated upon SGIV infection. Given that total histone
H3 did not change upon SGIV infection, we postulated that the modified form of
histone H3 might be up-regulated. Histone H3 can undergo many kinds of post
translational modifications, such as methylation, acetylation, and phosphorylation
(Zhang and Reinberg, 2001; Lacoste et al., 2003; Li et al., 2007). In order to figure
out what kind of modifications that could account for the findings presented above,
we reexamined the iTRAQ raw data, and found that the peptide “EIAQDFKTDLR”
of histone H3 was the only peptide identified from iTRAQ analysis and the non-
modified form of this histone H3 peptide showed down regulation upon SGIV
infection. This histone H3 peptide sequence “EIAQDFKTDLR” contains a lysine
residue (i.e. K), which is referred as K79 in histone H3, and K79 can undergo
methylations. We proposed that the K79 methylation form of histone H3 could be up-
regulated. Combination of the up regulation of the K79 methylated form of histone
H3 and the down-regulation of the K79 non-methylated histone H3 would make the
53
total histone H3 unchanged. To test this hypothesis, we carried out western blots
against K79 methylated histone H3 by using the specific polyclonal antibodies against
the K79 methylated form of histone H3. The results are shown in Figure 3. The first
panel shows that the K79 methylated form of histone H3 was up-regulated upon
SGIV infection; the middle panel shows that the total histone H3 was not changed
upon SGIV infection; the bottom panel shows that the beta actins, which were used as
controls, were not changed (Figure 3.3). By combining the iTRAQ and western blot
data, we conclude that upon SGIV infection, the host histone H3 K79 methylation is
up-regulated, and the total host histone H3 is not changed.
3.5 Discussion
In this study, for the first time, we have examined the infection of SGIV with host cell
at proteome scale by using iTRAQ. We have identified 49 SGIV viral proteins (Table
3.1 & Table 3.2). Eleven out of the 49 viral proteins were first reported (Table 3.1).
The transcriptional products for these 11 viral ORFs were validated by RT-PCR
(Figure 3.1). Two newly identified viral proteins, ORF135L and ORF140R, were also
validated by western blot (Figure 3.2). In addition, the translational products of
another three viral proteins, ORF018R, ORF026R, and ORF093L, were confirmed by
western blot (Figure 3.2). Among the 49 SGIV viral proteins, only 6 viral proteins are
predicted to share homologies with known functional proteins: ORF049L coded the
dUTPase-like proteins, ORF072R for the major capsid proteins, ORF081L, the
tyrosine kinase-like proteins, ORF084L, the RNase III like proteins, ORF086R, the
putative immediate-early proteins, and ORF102L, the ubiqutin/ribosomal protein like
proteins. The remaining 43 viral proteins are all functionally unknown. Our studies
identified that these viral proteins were real viral coding proteins. These results would
be valuable for the functional studies of these viral proteins.
54
The proteomics scale analysis of SGIV-infected GECs by iTRAQ also provides
important insight into the SGIV and host cell interactions by providing information on
the differential expression of host proteins. These differential expressions of host
proteins are involved in multi cellular processes, and are closely linked together to
form a complex host response to SGIV infection. The dihydrofolate synthase, which
is involved in the folic acid biosynthetic process, was up-regulated (Tettelin et al.,
2005). The up regulation of dihydrofolate synthase may lead to the up regulation of
folic acid that is needed to replicate DNA (Jennings, 1995), and the folic acid
biosynthetic process needs ATPs as an energy source. It is not surprising to find that
the ATP synthase F1 alpha subunit, which is involved in ATP synthesis, was also up-
regulated in SGIV-infected GECs. The up regulation of dihydrofolate synthase and
ATP synthase F1 alpha subunit may play important roles in the replication of SGIV
genomic DNA in the host cells. When the host cells were infected by SGIV, host
mucin-associated surface protein, which is involved in the innate immune regulation,
and Rho GTPase, which is involved in complex mechanical processes such as cell
motility and phagocytosis, was up-regulated. This phenomenon may account for the
regulation of the host immune responses, when SGIV infects host cells. In addition,
the up-regulated of alpha-2-macroglobulin isoform 1, which is involved in protease
inhibition, could play roles in the protection of newly synthesized viral proteins
during SGIV replication inside host cells. The up regulation of transposase, which is
involved in the incorporation of DNA fragments into genomic DNA, could play
important role in integrating SGIV genomic DNA into host genome. We also found
that two proteins, Chain A Rotamer Strain As A Determinant Of Protein Structural
Specificity and ribosomal protein S27a, which are involved in ubiquitination process,
were down-regulated upon SGIV infection. Given that SGIV genome contains an
55
open reading frame coding the ubiquitin like protein, the down regulation of host
unbiquitination related proteins could play very important role in the regulation of
host protein degradation.
Another interesting finding is that the K79 methylation of histone H3 was up-
regulated during SGIV infection (Figure 3.3). Histone H3 plays an very important role
in genomic package. Histone H3 together with H2A, H2B, and H4, assembles to form
one octameric nucleosome core particle by wrapping 146 base pairs of DNA around
the protein spool in 1.65 left-handed super-helical turn (Luger et al., 1997). Histones
are subjected to an enormous number of posttranslational modifications, including
acetylation and methylation of lysines (K) and arginines (R), phosphorylation of
serines (S) and threonines (T), ubiquitylation and sumoylation of lysines, as well as
ribosylation (Peterson and Laniel, 2004). Since chromatin is used as the physiological
template for all DNA-mediated processes, histone modifications play very important
roles in controlling the structure and/or function of the chromatin fiber, and different
modifications can yield distinct functional consequences. For histone H3, the
posttranslational modification could be the acetylation of lysines (K), methylation of
lysines (K) and arginines (R), and phosphorylation of serines (S) and threonines (T).
The acetylation of histone H3 could happen at K4, K9, K14, K18, K23, and K27. K4,
K9 and K27 can also be subjected to methylation. Besides K4, K9 and K27,
methylation of histone H3 can also happen at R27, K36 and K79. Besides acetylating
and methylation, histone H3 can also be subjected to phosphorylations. The
phosphorylation of histone H3 can occur at T3, S10, T11, and S28. In addition, the
different modifications of histone H3 may lead to different functional consequences.
Histone H3 lysine methylations are closely related to the gene silencing and gene
activations. It has been reported that G9a/GLP-mediated H3-K9 methylation functions
56
in the silencing of individual genes (Tachibana et al., 2005). Besides H3-K9
methylation, the methylation of histone H3 K27 has been reported to be linked to
several silencing phenomena including homeotic-gene silencing, X inactivation and
genomic imprinting (Cao and Zhang, 2004). However, the methylations on the sites of
histone H3 K4, K36 and K79 are associated with gene activations (Martin and Zhang,
2005). Compared to histone H3 K4 methylation, the functions of histone H3 K36 and
K79 methylation are not well defined (Martin and Zhang, 2005). Recent study showed
that histone H3 K79 methylation could be involved in DNA repair (Huyen et al.,
2004). Additionally, histone H3 K79 methylation was also reported to be involved in
leukemic transformation (Okada et al., 2005). Our results showed that histone H3
K79 methylation was up-regulated when host cells were infected by virus. Given that
little is known about the function of histone H3 K79 methylation, our finding may
open a new clue to the function of histone H3 K79 methylation.
57
Figure 3.1 Full length amplification of 11 novel genes of SGIV via RT-PCR. Total
RNA (harvested after 48h of infection) was isolated by using the One Step RT-PCR
kit. Lane C: control, lane M: Marker 100 bp DNA ladder (Axygen)
58
Figure 3.2 Presence of SGIV proteins expressed by ORF018R, 026R, 093L and 2
newly identified proteins encoded by ORF135L and 140R. Lane C: Mock-infected for
control.
59
Figure 3.3 Histone H3 K79 methylation analysis of SGIV-infected GECs compared to
mock SGIV-infected GECs by western blot. Left lane represents the sample from the
mock SGIV-infected GEC. Right lane represents the sample from the SGIV-infected
GEC. First panel represents K79 methylated form of histone H3. Middle panel
represents histone H3. Bottom panel represents the beta actin.
Mock-infection Infection
GEC
60
Table 3.1. T
hirty eight viral proteins that were repo
rted earlier.
Protein
Name
Accession
Number
Gene
Function
Protein
MW
Protein
PI
Peptide
Count
Total Ion
Score
Total Ion
Score
C.I. %
Best Ion
Score
Best Ion
Score C.I.
%
ORF006R
gi|56692643
Unknown
33730.1
5.27
3 253.23
100
100.56
99.9999
ORF007L
gi|56692644
Unknown
32309.3
6.75
2 219.9
100
154.44
100
ORF008L
gi|56692645
Unknown
23577.4
9.52
6 522.32
100
159.33
100
ORF012L
gi|56692649
Unknown
129769
9.46
26
2068.35
100
165.88
100
ORF015L
gi|56692652
Unknown
6854.2
9.1
2 114.44
100
62.66
99.6127
ORF016L
gi|56692653
Unknown
49104.1
5.08
1 63.29
99.665
63.29
99.665
ORF018R
gi|56692655
Unknown
34786.3
6.01
10
769.15
100
146.11
100
ORF019R
gi|56692656
Unknown
40453.8
7.3
3 237.67
100
98.86
99.9999
ORF022L
gi|56692659
Unknown
22143.3
12.2
1 47.19
86.351
47.19
86.351
ORF025L
gi|56692662
Unknown
63935.3
7.2
11
1012.99
100
150.64
100
ORF026R
gi|56692663
Unknown
67740.6
6.38
3 190.93
100
69.82
99.9255
ORF038L
gi|56692675
Unknown
21278.9
9.72
1 84.04
99.9972
84.04
99.9972
ORF039L
gi|56692676
Unknown
129475
8.37
8 697.47
100
124.08
100
ORF043R
gi|56692680
Unknown
82574
5.36
5 354.99
100
108.62
100
ORF045L
gi|56692682
Unknown
23793.4
9.8
5 558.93
100
150.42
100
ORF046L
gi|56692683
Unknown
26298.3
9.87
3 321.98
100
126.78
100
ORF055R
gi|56692692
Unknown
24630
10.65
7 703.56
100
179.01
100
ORF056R
gi|56692693
Unknown
24647.8
6.26
4 512.39
100
187
100
ORF057L
gi|56692694
Unknown
141855
8.83
1 52.48
95.9626
52.48
95.9626
ORF067L
gi|56692704
deox
ynucleoside kinase
23673.6
6.43
5 374.87
100
94.9
99.9998
Table
to b
e co
nti
nued
61
Protein
Name
Accession
Number
Gene
Function
Protein
MW
Protein
PI
Peptide
Count
Total Ion
Score
Total Ion
Score
C.I. %
Best Ion
Score
Best Ion
Score C.I.
%
ORF069L
gi|56692706
Unknown
68132.5
9.62
9 671.82
100
106.27
100
ORF072R
gi|56692709
major capsid protein
52794.1
6.31
10
887.63
100
161.91
100
ORF075R
gi|56692712
Unknown
21389.3
4.5
3 182.42
100
71.43
99.9486
ORF081L
gi|56692715
tyrosine kinase
94045.7
8.75
3 165.6
100
62.39
99.5878
ORF082L
gi|56692719
Unknown
26561.7
9.02
3 219.63
100
96.44
99.9998
ORF084L
gi|56692721
RNase III
45984.4
9.17
1 72.63
99.961
72.63
99.961
ORF086R
gi|56692723
putative immediate-
early protein
18446.8
7.88
4 437.01
100
137.22
100
ORF088L
gi|56692725
Unknown
57860.9
4.89
1 62.11
99.5604
62.11
99.5604
ORF090L
gi|56692727
Unknown
47281.4
7.62
5 344.54
100
85.45
99.998
ORF093L
gi|56692730
Unknown
52020.3
7.18
2 106.61
100
54.39
97.3992
ORF098R
gi|56692735
Unknown
33404.9
9.49
2 103.72
100
56.4
98.3628
ORF101R
gi|56692738
Unknown
36873.3
6.11
6 443.26
100
148.07
100
ORF102L
gi|56692739
ubiqutin/ribosom
al
protein
9690.69
6.71
4 378.43
100
141.97
100
ORF119R
gi|56692756
Unknown
10275.3
9.36
1 77.65
99.9877
77.65
99.9877
ORF124R
gi|56692761
Unknown
22127.3
6.49
1 110.32
100
110.32
100
ORF125R
gi|56692762
Unknown
23111.8
5.91
2 98.56
99.9999
49.62
92.1999
ORF151L
gi|56692788
Unknown
24116.5
6.43
1 100.48
99.9999
100.48
99.9999
ORF156L
gi|56692793
Unknown
33642.7
5.6
8 683.34
100
126.85
100
Table 3.1. C
ond.
62
Table 3.2. E
leven newly identified viral proteins.
Protein
Name
Accession
Number
Gene
Function
Protein
MW
Protein
PI
Peptide
Count
Total Ion
Score
Total Ion
Score C.I.
%
Best Ion
Score
Best Ion
Score C.I.
%
ORF049L
gi|56692686
dUTPase
18678.9
6.51
3 322.16
100
140.84
100
ORF099R
gi|56692736
Unknown
10650.5
5.13
3 221.15
100
92.47
99.9996
ORF107R
gi|56692744
Unknown
41572.6
3.91
1 84
99.9972
84
99.9972
ORF111R
gi|56692748
Unknown
32503.8
5.47
1 90.83
99.9994
90.83
99.9994
ORF118R
gi|56692755
Unknown
41293.7
8.76
4 303.46
100
112.84
100
ORF127R
gi|56692764
Unknown
21925.3
7.01
1 57.35
98.6845
57.35
98.6845
ORF135L
gi|56692772
Unknown
15150
5.95
3 248.72
100
94.23
99.9997
ORF136R
gi|56692773
Unknown
13116.6
7.52
1 80.94
99.9942
80.94
99.9942
ORF140R
gi|56692777
Unknown
36630.9
4.85
14
1408.35
100
181.16
100
ORF155R
gi|56692792
Unknown
68875.1
5.03
2 136.06
100
82.44
99.9959
ORF162L
gi|56692799
Unknown
49351.3
6.49
3 158.02
100
61.4
99.4823
63
Table 3.3. T
welve host proteins that were up
-regulated during SGIV
infection.
Protein
Name
Accession
Number
Protein
MW
Protein
PI
Peptide
Count
Total
Ion
Score
Total
Ion
Score
C.I. %
Best
Ion
Score
Best Ion
Score
C.I. %
Avg
iTRAQ
ratio *
(115/114)
ATP synthase F1, alpha subunit
gi|67920483
57766.36
4.86
1 55.95
98.18407
55.95
98.18407
1.55494
Dihydrofolate synthase
gi|76797895
48305.22
5.66
1 53.35
96.69554
53.35
96.69554
2.466123
Hypothetical protein CHGG_03067
gi|116207548
26253.87
8.71
1 55.46
97.96717
55.46
97.96717
2.174742
Matrix protein
gi|549374
40992.82
8.78
1 63.26004 99.66264
64.68
99.75672
2.925198
mucin-associated surface protein
(MASP)
gi|71659996
43040.39
4.89
1 55.46
97.96717
55.46
97.96717
2.174742
PREDICTED: hypothetical protein
gi|118103728
110544.7
6.86
1 53.35
96.69554
53.35
96.69554
2.466123
PREDICTED: similar to alpha-2-
macroglobulin isoform
1
gi|119893035
177819.5
5.72
7 615.15
100
126.37
100
1.639291
PREDICTED: similar to CG16944-
PA, isoform
A
gi|91092844
36679.73
9.8
1 51.61
95.06712
51.61
95.06712
1.539152
procollagen, type IV
, alpha 1, isoform
CRA_a
gi|149057575
114733.1
8.68
1 51.78
95.25648
51.78
95.25648
2.672385
putative head-tail adaptor
gi|15830847
13401.89
9.99
1 53.35
96.69554
53.35
96.69554
2.466123
putative signal peptide protein
gi|121528555
150867.5
9.15
1 53.98
97.14175
53.98
97.14175
2.466123
transposase
gi|84501238
68937.66
9.19
1 53.35
96.69554
53.35
96.69554
2.466123
64
Table 3.4. F
ive host proteins that were do
wn-regulated during SGIV
infection.
Protein
Name
Accession
Number
Protein
MW
Protein
PI
Peptide
Count
Total
Ion
Score
Total
Ion
Score
C.I. %
Best
Ion
Score
Best Ion
Score
C.I. %
Avg
iTRAQ
ratio *
(115/114)
Chain A, R
otam
er Strain As A Determinant
Of Protein Structural Specificity
gi|5821952
9711.81
6.56
2 137.49
100
69.48
99.91944
0.630019
histone 3
gi|73761832
13666.76
11.02
1 73.3 99.96657
73.3 99.96657
0.524929
histone H3 gi|119690002
15074.91
11.04
1 73.3 99.96657
73.3 99.96657
0.524929
hypothetical protein LOC449552
gi|53933246
13450.3
7.78
1 67.99
99.88647
67.99
99.88647
0.636123
PREDICTED: similar to ribosomal protein
S27a
gi|82934837
5322.599
5.05
1 125.64
100
125.64
100
0.627329
Rho GTPase
gi|66816373
24570.32
7.56
1 51.65
95.11235
51.65
95.11235
0.660471
65
Table 3.5. GEC host proteins identified from iTRAQ analysis
Protein Name
Chain , Modified Beta Trypsin (Monoisopropylphosphoryl Inhibited) (E.C.3.4.21.4) (Neutron Data) Chain , Structure Of Hydrolase (Serine Proteinase) [Segment 3 of 4] Pollen allergen Sal k 1 14-3-3 epsilon2 isoform [Schistosoma bovis] 14-3-3.a protein [Fundulus heteroclitus] 26S proteasome regulatory subunit S4 like AAA ATpase [Cryptosporidium parvum Iowa II] 3-deoxy-7-phosphoheptulonate synthase [Lactococcus lactis subsp. lactis Il1403] 3-deoxy-7-phosphoheptulonate synthase [Planctomyces maris DSM 8797] 3-hydroxy-3-methylglutaryl-Coenzyme A synthase 1 (soluble) [Danio rerio] 3-monooxygenase/tryptophan 5-monooxygenase activation protein, gamma polypeptide [Danio rerio] 40S ribosomal protein S11 [Stizostedion vitreum] 40S ribosomal protein S12 40S ribosomal protein S13 40S ribosomal protein S16 40S ribosomal protein S18 40S ribosomal protein S19 40S ribosomal protein S3a 40S ribosomal protein S3a [Siniperca chuatsi] 40S ribosomal protein S6 40S ribosomal protein S6 [Pseudopleuronectes americanus] 40S ribosomal protein S7 40S ribosomal protein S8 40S ribosomal protein Sa-like protein [Sparus aurata] 40S ribosomal small subunit protein S10 [Xenopus laevis] 4SNc-Tudor domain protein short form [Danio rerio] 50S ribosomal protein L22 [Flavobacterium sp. MED217] 60S acidic ribosomal protein P1 [Platichthys flesus] 60S ribosomal protein L10a 60S ribosomal protein L18 60S ribosomal protein L22 [Scophthalmus maximus] 60S ribosomal protein L23 [Ustilago maydis 521] 60S ribosomal protein L28 (L29) (YL24) (RP44) (L27a) (RP62) 60S ribosomal protein L31 60S ribosomal protein L34 60S ribosomal protein L35 [Scophthalmus maximus] 60S ribosomal protein L6 [Pagrus major] 60S ribosomal protein L7a 60S ribosomal protein L8 ABC-type transport system, ATPase component [Uncultured methanogenic archaeon RC-I] ABR117Cp [Ashbya gossypii ATCC 10895] acetylcholinesterase - rabbit acetyl-CoA C-acyltransferase [Pyrobaculum aerophilum str. IM2] acidic ribosomal phosphoprotein P0 [Mus musculus] actin [Aspergillus nidulans FGSC A4] actin [Gossypium hirsutum] actin [Gregarina polymorpha] actin [Haemaphysalis longicornis]
66
actin [Haliotis rufescens] actin [Japetella diaphana] actin [Kluyveromyces polysporus] actin [Leucocryptos marina] actin [Mastigoteuthis magna] actin [Micronuclearia podoventralis] actin [Monodonta perplexa perplexa] actin [Nuclearia simplex] actin [Oxystele tigrina] actin [Panax ginseng] actin [Phaseolus acutifolius] actin [Phoma nigrificans] actin [Pterosperma cristatum] actin [Saccharomyces kunashirensis] actin [Saccharomyces spencerorum] actin [Symbiodinium sp. clade C] actin 1 [Paxillus involutus] actin 2 [Strongylocentrotus purpuratus] actin 5 [Aedes aegypti] actin related protein 2/3 complex subunit 2 [Homo sapiens] Actin, muscle actinin, alpha 3 [Homo sapiens] actinin, alpha 3 [Xenopus laevis] actin-like protein [Homo sapiens] actin-related protein - Japanese pufferfish acyl carrier protein (ACP) [Pseudomonas entomophila L48] adenine nucleotide translocator s254 [Takifugu rubripes] adenosylhomocysteinase [Solibacter usitatus Ellin6076] adenylate kinase [Nitrobacter winogradskyi Nb-255] alanyl-tRNA synthetase [Danio rerio] Aldehyde dehydrogenase [Caulobacter sp. K31] aldolase c, fructose-bisphosphate [Danio rerio] alpha actinin 4 [Homo sapiens] alpha tropomyosin [Gillichthys mirabilis] alpha tubulin [Gillichthys mirabilis] alpha tubulin [Notothenia coriiceps] alpha tubulin 1 [Rhynchopus sp. ATCC 50230] Alpha-actinin-2 (Alpha actinin skeletal muscle isoform 2) (F-actin cross-linking protein) Alpha-enolase (2-phospho-D-glycerate hydro-lyase) (Non-neural enolase) (NNE) (Enolase 1) alpha-internexin alpha-tubulin [Bombyx mori] alpha-tubulin [Dicyema japonicum] alpha-tubulin [Jakoba incarcerata] alpha-tubulin [Medicago truncatula] alpha-tubulin [Spathidium sp.] alpha-tubulin [Strobilidium sp.] alpha-tubulin [Vorticella microstoma] alpha-tubulin 2 [Homarus gammarus] annexin 11a isoform 2 [Danio rerio] annexin A2 [Monopterus albus] annexin max3 [Oryzias latipes] annexin max3 [Pagrus major] Asparagine synthetase [glutamine-hydrolyzing] (Glutamine-dependent asparagine synthetase) (Cell cyc
67
aspartate-tRNA ligase [Cryptococcus neoformans var. neoformans JEC21] ATCXXS1 (C-TERMINAL CYSTEINE RESIDUE IS CHANGED TO A SERINE 1); thiol-disulfide exchange intermedia ATP synthase alpha-subunit [Ciona intestinalis] ATP synthase beta chain, mitochondrial precursor, putative [Plasmodium falciparum 3D7] ATP synthase beta chain, mitochondrial, putative [Arabidopsis thaliana] ATP synthase beta subunit [Pachysandra procumbens] ATP synthase F1 subunit alpha [Reclinomonas americana] ATP synthase F1, alpha subunit [Crocosphaera watsonii WH 8501] ATP synthase F1, beta subunit [Frankia sp. EAN1pec] ATP synthase subunit B [Brucella melitensis 16M] ATP synthase subunit B [Mycoplasma hyopneumoniae 232] ATP synthase subunit B [Mycoplasma mycoides subsp. mycoides SC str. PG1] atpase ATPase associated with various cellular activities, AAA_5 [Nocardioides sp. JS614] ATPase, AAA family protein [Tetrahymena thermophila SB210] AtRABA1e (Arabidopsis Rab GTPase homolog A1e); GTP binding [Arabidopsis thaliana] beta actin [Atherina boyeri] beta tubulin [Apusomonas proboscidea] beta tubulin [Culex pipiens pipiens] beta tubulin [Gillichthys mirabilis] beta tubulin [Glomus claroideum] beta tubulin [Glomus sp. BEG19] beta-2 tubulin [Aedes aegypti] beta-2 tubulin [Gadus morhua] beta-actin [Macrobrachium rosenbergii] beta-actin [Monopterus albus] beta-actin [Mustela putorius furo] beta-actin [Pomacentrus moluccensis] beta-actin [Vanessa cardui] beta-tubulin [Basidiobolus microsporus] beta-tubulin [Bombyx mori] beta-tubulin [Bombyx mori] beta-tubulin [Bombyx mori] beta-tubulin [Chaetosphaeria chlorotunicata] beta-tubulin [Crassostrea gigas] beta-tubulin [Dicyema japonicum] beta-tubulin [Esslingeriana idahoensis] beta-tubulin [Halocynthia roretzi] beta-tubulin [Microbotryum violaceum] beta-tubulin [Monosiga brevicollis] beta-tubulin [Phytophthora palmivora] beta-tubulin [Schistosoma haematobium] beta-tubulin [Schistosoma japonicum] beta-tubulin [Sclerospora graminicola] biliverdin reductase B (flavin reductase (NADPH)) [Mus musculus] BiP, heat shock protein 3 Branched-chain amino acid aminotransferase II [Xylella fastidiosa Dixon] calcyclin-associated protein, CAP50=Ca2+/phospholipid-binding protein L-14 fragment [rabbits, lung, calmodulin calmodulin [Gonothyraea loveni] calmodulin [Homo sapiens] calreticulin [Paralichthys olivaceus] capping protein (actin filament) muscle Z-line, alpha 2 [Mus musculus]
68
capping protein (actin filament) muscle Z-line, alpha 2 [Takifugu rubripes] carboxyl-terminal processing protease [Prochlorococcus marinus str. MIT 9211] cation transport ATPase, HAD family [Methanobrevibacter smithii ATCC 35061] cation transporter p-type ATPase a CtpA [Mycobacterium ulcerans Agy99] CCTgamma CG5384-PA [Drosophila melanogaster] CG8036-PB, isoform B [Drosophila melanogaster] Chain A, Beta-Actin-Profilin Complex Chain A, Complex Between Rabbit Muscle Alpha-Actin: Human Gelsolin Domain 1 Chain A, Crystal Structure Of Bovine Liver Glutamate Dehydrogenase Complexed With Gtp, Nadh, And L- Chain A, Crystal Structure Of The D3b Subcomplex Of The Human Core Snrnp Domain At 2.0a Resolution Chain A, Crystal Structure Of Two D-Glyceraldehyde-3-Phosphate Dehydrogenase Complexes: A Case Of A Chain A, Rotamer Strain As A Determinant Of Protein Structural Specificity Chain A, Structure Of Hyper-Vil-Trypsin Chain A, Structure Of The Dead Domain Of Human Eukaryotic Initiation Factor 4a, Eif4a Chain A, Uncomplexed Rat Trypsin Mutant With Asp 189 Replaced With Ser (D189s) Chain B, Tubulin Alpha-Beta Dimer, Electron Diffraction chaperone protein DnaJ [Geobacter lovleyi SZ] chaperone protein DnaK [Shewanella amazonensis SB2B] chaperonin [Mesobuthus gibbosus] chaperonin [Mesobuthus gibbosus] chaperonin containing TCP1, subunit 2 (beta) [Danio rerio] chaperonin containing TCP1, subunit 4 (delta) [Homo sapiens] chaperonin containing TCP1, subunit 8 (theta) [Homo sapiens] chaperonin containing TCP1, subunit 8 (theta) [Xenopus tropicalis] chaperonin GroEL , truncated [Syntrophus aciditrophicus SB] chaperonin GroEL [Myxococcus xanthus DK 1622] chaperonin subunit 6a (zeta) [Mus musculus] chaperonin subunit 7 [Pagrus major] COG0366: Glycosidases [Bacillus anthracis str. A2012] conserved hypothetical protein [Coccidioides immitis RS] conserved hypothetical protein [Gibberella zeae PH-1] conserved hypothetical protein [Stigmatella aurantiaca DW4/3-1] conserved hypothetical protein TIGR00157 [Methylococcus capsulatus str. Bath] core-binding factor, beta subunit isoform 2 [Homo sapiens] CypA protein [Xenopus laevis] cytoplasmic actin CyII [Heliocidaris erythrogramma] cytoskeletal actin IIIa [Strongylocentrotus purpuratus] cytosolic heat shock protein 70 [Trimastix marina] cytosolic malate dehydrogenase A [Oryzias latipes] cytosolic malate dehydrogenase thermostable form [Sphyraena idiastes] dihydrofolate synthase [Streptococcus agalactiae 18RS21] Dihydrolipoyl dehydrogenase, mitochondrial precursor (Dihydrolipoamide dehydrogenase) (Glycine clea DNA excision repair protein Rad2 [Neosartorya fischeri NRRL 181] DNA polymerase dnak protein BiP [Tetrahymena thermophila SB210] dnaK-type molecular chaperone hsc74 - yeast (Pichia angusta) (fragments) elongation factor 1 alpha [Euzercon latum] elongation factor 1a [Antonina graminis] Elongation factor 1-alpha (EF-1-alpha) elongation factor 1-alpha [Rhincalanus nasutus]
69
Elongation factor 1-gamma (EF-1-gamma) (eEF-1B gamma) elongation factor 2 [Halichondria sp. AR-2003] elongation factor 2 [Nematostella vectensis] elongation factor-1 alpha [Micromegistus bakeri] elongation factor-1 alpha [Ostrinia nubilalis] elongation factor-2 [Echiniscus viridissimus] enolase 1, (alpha) [Danio rerio] Enoyl Coenzyme A hydratase, short chain, 1, mitochondrial [Bos taurus] enoyl Coenzyme A hydratase, short chain, 1, mitochondrial [Mus musculus] ENSANGP00000010754 [Anopheles gambiae str. PEST] ENSANGP00000012302 [Anopheles gambiae str. PEST] ENSANGP00000021214 [Anopheles gambiae str. PEST] ENSANGP00000024287 [Anopheles gambiae str. PEST] eukaryotic translation elongation factor 2 [Mus musculus] eukaryotic translation elongation factor 2 [Rana sylvatica] Eukaryotic translation elongation factor 2, like [Danio rerio] eukaryotic translation initiation factor 2, subunit 1 alpha [Rattus norvegicus] Eukaryotic translation initiation factor 3 subunit 7 (eIF-3 zeta) (eIF3 p66) (eIF3d) eukaryotic translation initiation factor 3, subunit 3 (gamma) [Danio rerio] eukaryotic translation initiation factor 3, subunit 3 gamma, 40kDa [Homo sapiens] eukaryotic translation initiation factor 4A2 [Mus musculus] ezrin/radixin/moesin (ERM)-like protein [Ciona intestinalis] F1-ATPase beta subunit=H(+)-transporting ATPase beta subunit [cattle, heart, Peptide Mitochondrial Partial, 19 aa] fatty acid binding protein 7, brain, a [Danio rerio] Fe-S protein assembly chaperone HscA [Polaromonas naphthalenivorans CJ2] Fe-S protein assembly chaperone HscA [Stigmatella aurantiaca DW4/3-1] fibrillarin [Danio rerio] Fibrillarin CG9888-PA [Drosophila melanogaster] Filamin-B (FLN-B) (Beta-filamin) (Actin-binding-like protein) (ABP-280-like protein) flagellar protein FlaG protein [Shewanella sp. W3-18-1] flightless I homolog [Homo sapiens] formyl transferase domain protein [Campylobacter jejuni RM1221] fructose bisphosphate aldolase [Polypterus senegalus] fructose-bisphosphate aldolase A-2 [Acipenser baerii] Fructose-bisphosphate aldolase C (Brain-type aldolase) Fructose-bisphosphate aldolase, non-muscle type fubp1 [Xenopus tropicalis] GA10751-PA [Drosophila pseudoobscura] gamma actin-like protein [Mus musculus] Genome polyprotein [Contains: Coat protein (CP)] glucosamine-6-phosphate deaminase [Cytophaga hutchinsonii ATCC 33406] glucose regulated protein, 58kDa [Xenopus laevis] glucose-regulated protein 78 [Paralichthys olivaceus] glucose-regulated protein 78, putative [Leishmania major] glucose-regulated protein 94 [Paralichthys olivaceus] glutamate oxaloacetate transaminase 2, mitochondrial [Mus musculus] glutathione S-transferase [Micropterus salmoides] glyceraldehyde-3-phosphate dehydrogenase [Oncorhynchus tshawytscha] glyceraldehyde-3-phosphate dehydrogenase, type I [Flavobacteria bacterium BAL38] Glycine dehydrogenase subunit [Geobacillus thermodenitrificans NG80-2] GTPase ObgE [Caminibacter mediatlanticus TB-2] GTP-binding nuclear protein Ran (GTPase Ran) GTP-binding protein LepA [Tropheryma whipplei TW08/27]
70
guanine nucleotide binding protein (G protein), beta polypeptide 2-like 1 [Danio rerio] H2A histone family, member Z [Homo sapiens] hCG1642531 [Homo sapiens] heart-type fatty acid-binding protein [Fundulus heteroclitus] heat shock 60 kD protein 1 [Danio rerio] heat shock 60kDa protein 1 (chaperonin) [Gallus gallus] Heat shock 70 kDa protein cognate 4 (Hsc 70-4) heat shock cognate 70 [Rhynchosciara americana] heat shock cognate 70 [Xiphophorus hellerii] heat shock cognate 70 kDa protein [Pimephales promelas] heat shock cognate 70 protein [Spodoptera frugiperda] heat shock protein [Numida meleagris] Heat shock protein [Plasmodium berghei strain ANKA] heat shock protein 10 [Monopterus albus] heat shock protein 60 [Trichinella spiralis] heat shock protein 60 kDa [Paralichthys olivaceus] heat shock protein 70 [Anatolica polita borealis] heat shock protein 70 [Bombyx mori] heat shock protein 70 [Capsaspora owczarzaki] heat shock protein 70 [Cryptosporidium andersoni] heat shock protein 70 [Cyclospora cercopitheci] heat shock protein 70 [Euplotes aediculatus] heat shock protein 70 [Geodia cydonium] heat shock protein 70 [Laternula elliptica] heat shock protein 70 [Latimeria chalumnae] heat shock protein 70 [Lucilia cuprina] heat shock protein 70 [Mizuhopecten yessoensis] heat shock protein 70 [Omphisa fuscidentalis] heat shock protein 70 [Oxytricha sp. LPJ-2005] heat shock protein 70 [Paralichthys olivaceus] heat shock protein 70 [Spumella uniguttata] heat shock protein 70 isoform 3 [Oryzias latipes] heat shock protein 70 isoform 5 [Oryzias latipes] heat shock protein 70 precursor [Neocallimastix patriciarum] heat shock protein 70, hsp70A2 heat shock protein 70-B cytosolic isoform [Bodo saltans] heat shock protein 70kDa [Pholcus phalangioides] heat shock protein 9 [Mus musculus] heat shock protein 90 heat shock protein 90 [Bufo gargarizans] heat shock protein 90 beta [Paralichthys olivaceus] Heat shock protein DnaJ-like protein [Lentisphaera araneosa HTCC2155] Heat shock protein Hsp70 [Xylella fastidiosa Dixon] heat shock protein, Hsp70 [Cryptosporidium parvum Iowa II] heat-shock protein [Littorina plena] heat-shock protein Hsp70 [Oopsacas minuta] heat-shock protein Hsp70 [Petrosia ficiformis] heavy metal translocating P-type ATPase [Caldicellulosiruptor saccharolyticus DSM 8903] hexokinase 1 [Danio rerio] high density lipoprotein-binding protein (vigilin) [Danio rerio] histone 3 [Frankliniella sp. KY-2004] Histone H2A histone h2a variant [Cryptococcus neoformans var. neoformans JEC21] histone H2B (ISS) [Ostreococcus tauri]
71
histone H2B [Anacropora matthai] Histone H2B 1.2 (H2B1.2) histone H3 [Anelosimus eximius] histone H4 histone H4 [Cucumis sativus] host actin-1 [Rhodomonas chrysoidea] HSC71 [Rivulus marmoratus] HSP60 [Carassius auratus] HSP70 [Chlamys farreri] Hsp70 [Hydrogenobacter hydrogenophilus] HSP70 [Oxyuranus scutellatus scutellatus] Hsp70 protein [Cetorhinus maximus] Hsp70 protein [Odontaspis ferox] Hsp70Bbb CG5834-PA [Drosophila melanogaster] hypothetical protein [Ictalurus punctatus] hypothetical protein [Neurospora crassa OR74A] hypothetical protein [Paramecium tetraurelia] hypothetical protein [Paramecium tetraurelia] hypothetical protein [Paramecium tetraurelia] hypothetical protein [Paramecium tetraurelia] hypothetical protein [Vitis vinifera] hypothetical protein [Vitis vinifera] hypothetical protein [Vitis vinifera] hypothetical protein A - mouse (fragment) hypothetical protein Aave_1075 [Acidovorax avenae subsp. citrulli AAC00-1] hypothetical protein An07g09990 [Aspergillus niger] hypothetical protein BA3462 [Bacillus anthracis str. Ames] Hypothetical protein CBG03435 [Caenorhabditis briggsae] Hypothetical protein CBG06421 [Caenorhabditis briggsae] Hypothetical protein CBG18934 [Caenorhabditis briggsae] hypothetical protein CdifQ_04001593 [Clostridium difficile QCD-32g58] hypothetical protein CHGG_03067 [Chaetomium globosum CBS 148.51] hypothetical protein Chro.80400 [Cryptosporidium hominis TU502] hypothetical protein CNBC3800 [Cryptococcus neoformans var. neoformans B-3501A] hypothetical protein CNBG1220 [Cryptococcus neoformans var. neoformans B-3501A] hypothetical protein DR_1619 [Deinococcus radiodurans R1] hypothetical protein FG03504.1 [Gibberella zeae PH-1] hypothetical protein GLP_456_20463_41396 [Giardia lamblia ATCC 50803] hypothetical protein GTNG_2000 [Geobacillus thermodenitrificans NG80-2] hypothetical protein lmo2611 [Listeria monocytogenes EGD-e] hypothetical protein LOC335798 [Danio rerio] hypothetical protein LOC337595 [Danio rerio] hypothetical protein LOC393186 [Danio rerio] hypothetical protein LOC394635 [Xenopus tropicalis] hypothetical protein LOC432005 [Xenopus laevis] hypothetical protein LOC436741 [Danio rerio] hypothetical protein LOC436954 [Danio rerio] hypothetical protein LOC449552 [Danio rerio] hypothetical protein LOC494995 [Xenopus laevis] hypothetical protein LOC495253 [Xenopus laevis] hypothetical protein LOC496627 [Xenopus tropicalis] hypothetical protein LOC548683 [Xenopus tropicalis] hypothetical protein LOC550333 [Danio rerio] hypothetical protein LOC641421 [Danio rerio]
72
hypothetical protein LOC734647 [Xenopus laevis] hypothetical protein LOC767746 [Danio rerio] hypothetical protein LOC779096 [Xenopus laevis] hypothetical protein MGG_05095 [Magnaporthe grisea 70-15] hypothetical protein MtrDRAFT_AC149032g20v2 [Medicago truncatula] hypothetical protein OsI_002780 [Oryza sativa (indica cultivar-group)] hypothetical protein OsI_004782 [Oryza sativa (indica cultivar-group)] hypothetical protein PC301050.00.0 [Plasmodium chabaudi chabaudi] hypothetical protein Pfl_3860 [Pseudomonas fluorescens PfO-1] hypothetical protein PH0744 [Pyrococcus horikoshii OT3] hypothetical protein PMM1513 [Prochlorococcus marinus subsp. pastoris str. CCMP1986] hypothetical protein Ppro_1975 [Pelobacter propionicus DSM 2379] hypothetical protein PST_0570 [Pseudomonas stutzeri A1501] hypothetical protein RUMTOR_00917 [Ruminococcus torques ATCC 27756] hypothetical protein SareDRAFT_0666 [Salinispora arenicola CNS205] hypothetical protein Sbal195DRAFT_0316 [Shewanella baltica OS195] hypothetical protein SNOG_03450 [Phaeosphaeria nodorum SN15] hypothetical protein SNOG_14343 [Phaeosphaeria nodorum SN15] hypothetical protein SPAC18G6.03 [Schizosaccharomyces pombe 972h-] hypothetical protein Sputw3181_0469 [Shewanella sp. W3-18-1] hypothetical protein Tc00.1047053509601.30 [Trypanosoma cruzi strain CL Brener] hypothetical protein TTHERM_00145010 [Tetrahymena thermophila SB210] hypothetical protein TTHERM_00349000 [Tetrahymena thermophila SB210] hypothetical protein TVAG_346800 [Trichomonas vaginalis G3] hypothetical protein UM03791.1 [Ustilago maydis 521] hypothetical protein UM05511.1 [Ustilago maydis 521] hypothetical protein WH5701_07924 [Synechococcus sp. WH 5701] hypothetical protein, conserved [Leishmania braziliensis] ibosomal protein L1p [Plasmodium yoelii yoelii str. 17XNL] isocitrate dehydrogenase (NADP+) [Nicotiana tabacum] Isocitrate dehydrogenase 1 (NADP+), soluble [Homo sapiens] isocitrate dehydrogenase, NADP-dependent [Sphingopyxis alaskensis RB2256] K18, simple type I keratin [Oncorhynchus mykiss] keratin 15 [Danio rerio] keratin 18 [Danio rerio] Keratin, type I cytoskeletal 18 (Cytokeratin-18) (CK-18) (Keratin-18) (K18) lactate dehydrogenase-A [Epinephelus coioides] lactate dehydrogenase-A; NAD+:lactate oxidoreductase; LDH-A [Chromis punctipinnis] lamin b2 [Danio rerio] late histone L3 H2b [Strongylocentrotus purpuratus] lipoprotein [Sinorhizobium meliloti] L-lactate dehydrogenase A chain (LDH-A) L-lactate dehydrogenase A chain (LDH-A) LOC397850 protein [Xenopus laevis] M116.5 [Caenorhabditis elegans] Matrix protein methyl-accepting chemotaxis sensory transducer [Pseudoalteromonas atlantica T6c] methylenetetrahydrofolate dehydrogenase (NADP+ dependent) 1, methenyltetrahydrofolate cyclohydrolas Methyltransferase type 12 [Shewanella denitrificans OS217] methyltransferase, FkbM family protein [Campylobacter fetus subsp. fetus 82-40] MGC108369 protein [Xenopus tropicalis] MGC75629 protein [Xenopus tropicalis] MGC79035 protein [Xenopus laevis]
73
minichromosome maintenance complex component 7 [Mus musculus] mitochondrial isocitrate dehydrogenase 2-like [Oreochromis mossambicus] mitochondrial malate dehydrogenase 2, NAD [Tamandua tetradactyla] M-TAXREB107 [Mus musculus] mucin-associated surface protein (MASP) [Trypanosoma cruzi strain CL Brener] muscle actin [Lethenteron japonicum] myosin regulatory light chain [Scophthalmus maximus] myosin, heavy polypeptide 9, non-muscle [Homo sapiens] N-acetylneuraminate lyase subunit [Staphylococcus saprophyticus subsp. saprophyticus ATCC 15305] nascent polypeptide-associated complex (NAC) domain-containing protein [Arabidopsis thaliana] nascent polypeptide-associated complex alpha subunit [Homo sapiens] natural killer enhancing factor [Paralichthys olivaceus] natural killer enhancing factor [Scophthalmus maximus] NHP2 non-histone chromosome protein 2-like 1 [Danio rerio] NHP2 non-histone chromosome protein 2-like 1 [Ictalurus punctatus] Nonspecific lipid-transfer protein (Propanoyl-CoA C-acyltransferase) (NSL-TP) (Sterol carrier prote nucleolin 2 [Cyprinus carpio] nucleoside diphosphate kinase [Oreochromis mossambicus] Nucleoside diphosphate kinase [Prochlorococcus marinus str. MIT 9211] nucleotide-sugar transporter, putative [Aspergillus fumigatus Af293] omega class glutathione-S-transferase [Takifugu rubripes] Os01g0805900 [Oryza sativa (japonica cultivar-group)] P4hb-prov protein [Xenopus tropicalis] p68 RNA helicase [Xenopus laevis] P68-LIKE PROTEIN (DEAD BOX FAMILY OF RNA HELICASES) [Encephalitozoon cuniculi GB-M1] paREP2b [Pyrobaculum aerophilum str. IM2] PAS/PAC sensor signal transduction histidine kinase [Flavobacterium johnsoniae UW101] Peptidyl-prolyl cis-trans isomerase B1 precursor (PPIase B1) (Rotamase B1) peroxiredoxin 3 [Bos taurus] peroxiredoxin 4 [Bos taurus] phenylacetate-coenzyme A ligase [Desulfovibrio vulgaris subsp. vulgaris str. Hildenborough] phosphogluconate dehydrogenase [Mus musculus] Phosphoglycerate kinase [Rhodobacter sphaeroides ATCC 17025] phosphoglycerate mutase [Chelon labrosus] phosphoglycerate mutase 1 family [Acidovorax sp. JS42] phosphoglyceromutase [Bacteroides fragilis YCH46] P-II family protein [Chlorobium tepidum TLS] Pkm2 protein [Danio rerio] polyubiquitin [Fragaria x ananassa] PPAT5 [Hyaloperonospora parasitica] Precorrin-6y C5,15-methyltransferase, subunit CbiE [Sagittula stellata E-37] predicted protein [Ostreococcus lucimarinus CCE9901] PREDICTED: AHNAK nucleoprotein 2 [Pan troglodytes] PREDICTED: hypothetical protein [Gallus gallus] PREDICTED: hypothetical protein [Gallus gallus] PREDICTED: hypothetical protein [Monodelphis domestica] PREDICTED: hypothetical protein [Monodelphis domestica] PREDICTED: hypothetical protein [Monodelphis domestica] PREDICTED: similar to 40S ribosomal protein S10 [Macaca mulatta] PREDICTED: similar to 40S ribosomal protein S20 [Macaca mulatta]
74
PREDICTED: similar to 60 kDa heat shock protein, mitochondrial precursor (Hsp60) (60 kDa chaperonin PREDICTED: similar to 60S ribosomal protein L23a [Canis familiaris] PREDICTED: similar to 60S ribosomal protein L23a [Macaca mulatta] PREDICTED: similar to 60S ribosomal protein L32 [Canis familiaris] PREDICTED: similar to 60S ribosomal protein L35 [Bos taurus] PREDICTED: similar to actinin, partial [Danio rerio] PREDICTED: similar to Activator 1 140 kDa subunit (Replication factor C large subunit) (A1 140 kDa PREDICTED: similar to alpha-1 tubulin, partial [Strongylocentrotus purpuratus] PREDICTED: similar to alpha-2-macroglobulin isoform 1 [Bos taurus] PREDICTED: similar to CG16944-PA, isoform A [Tribolium castaneum] PREDICTED: similar to CG4046-PA [Tribolium castaneum] PREDICTED: similar to CG5826-PA [Tribolium castaneum] PREDICTED: similar to CG9277-PB, isoform B [Tribolium castaneum] PREDICTED: similar to dynein, cytoplasmic, heavy polypeptide 1 isoform 6 [Canis familiaris] PREDICTED: similar to expressed in non-metastatic cells 1, protein (NM23A) (nucleoside diphosphate PREDICTED: similar to Gsn protein isoform 1 [Danio rerio] PREDICTED: similar to H2A histone family, member Z [Mus musculus] PREDICTED: similar to Heat shock cognate 71 kDa protein (Heat shock 70 kDa protein 8) [Canis famili PREDICTED: similar to heat shock protein 84b [Monodelphis domestica] PREDICTED: similar to heat shock protein protein [Strongylocentrotus purpuratus] PREDICTED: similar to histone 1, H2ai (predicted) isoform 1 [Canis familiaris] PREDICTED: similar to histone H2b-616 [Bos taurus] PREDICTED: similar to hoip-prov protein [Strongylocentrotus purpuratus] PREDICTED: similar to Keratin, type I cytoskeletal 18 (Cytokeratin 18) (K18) (CK 18) [Canis familiaris] PREDICTED: similar to KIAA0152, [Danio rerio] PREDICTED: similar to KIAA1641 protein isoform 6 [Pan troglodytes] PREDICTED: similar to LOC494641 protein [Strongylocentrotus purpuratus] PREDICTED: similar to Myelin P2 protein [Gallus gallus] PREDICTED: similar to nonmuscle myosin heavy chain [Danio rerio] PREDICTED: similar to OTTHUMP00000028706 [Danio rerio] PREDICTED: similar to OTTHUMP00000028706, partial [Danio rerio] PREDICTED: similar to PLC alpha [Monodelphis domestica] PREDICTED: similar to Probable leucyl-tRNA synthetase, mitochondrial precursor (Leucine--tRNA ligas PREDICTED: similar to ribosomal protein isoform 1 [Pan troglodytes] PREDICTED: similar to ribosomal protein S15a [Rattus norvegicus] PREDICTED: similar to ribosomal protein S2 [Macaca mulatta] PREDICTED: similar to ribosomal protein S27a [Mus musculus] PREDICTED: similar to SMT3 suppressor of mif two 3 homolog 2 isoform 1 [Rattus norvegicus] PREDICTED: similar to solute carrier family 14 (urea transporter), member 1 [Canis familiaris] PREDICTED: similar to squamous cell carcinoma antigen recognized by T cells 3 [Apis mellifera] PREDICTED: similar to Thioredoxin-dependent peroxide reductase, mitochondrial precursor (Peroxiredo PREDICTED: similar to tubulin alpha 6 [Danio rerio] PREDICTED: similar to Tubulin alpha-1 chain [Tribolium castaneum] PREDICTED: similar to Tubulin alpha-2 chain (Alpha-tubulin 2) [Canis familiaris] PREDICTED: similar to Tubulin beta-6 chain (Beta-tubulin class-VI) [Danio rerio]
75
PREDICTED: similar to ubiquitin A-52 residue ribosomal protein fusion product 1 isoform 1 [Mus musc probable acyl-CoA dehydrogenase [Cellulophaga sp. MED134] probable serine/threonine protein kinase [Rhodopirellula baltica SH 1] probableacetyltransferase/hydrolase [delta proteobacterium MLMS-1] procollagen, type IV, alpha 1, isoform CRA_a [Rattus norvegicus] Proliferating cell nuclear antigen (PCNA) (Cyclin) proliferating cell nuclear antigen [Bombyx mori] proliferation-associated 2G4, a [Danio rerio] protease, serine, 3 [Mus musculus] proteasome (prosome, macropain) subunit, alpha type 1 [Rattus norvegicus] proteasome (prosome, macropain) subunit, beta type, 3 [Danio rerio] proteasome 26S non-ATPase subunit 11 [Homo sapiens] proteasome alpha 3 subunit isoform 1 [Homo sapiens] proteasome beta 1 subunit [Homo sapiens] protein disulfide isomerase-related protein (provisional) [Danio rerio] protein disulfide-isomerase (EC 5.3.4.1) precursor - Caenorhabditis elegans (fragment) protein disulfide-isomerase [Ictalurus punctatus] protein Y6B3B.8 [imported] - Caenorhabditis elegans putative Aspartyl-tRNA synthetase [Oryza sativa (japonica cultivar-group)] putative DNA mismatch repair protein [Vibrio cholerae B33] putative eukaryotic translation initiation factor 4A [Silene latifolia] putative head-tail adaptor [Escherichia coli O157:H7 str. Sakai] putative primase [Lactococcus bacteriophage phi31] putative response regulator [marine gamma proteobacterium HTCC2207] putative ribosomal protein L7 protein [Oncorhynchus mykiss] putative signal peptide protein [Ralstonia pickettii 12J] putative TonB-dependent outer membrane receptor protein [Bacteroides fragilis NCTC 9343] putative transient receptor protein 2 [Pagrus major] Putative transposase, mutator type [Burkholderia xenovorans LB400] quail calpain [Coturnix coturnix] RAB18, member RAS oncogene family [Homo sapiens] RAB18, member RAS oncogene family variant [Homo sapiens] ras-related GTP-binding protein RAB10 [Canis lupus familiaris] Ras-related protein ORAB-1 rCG64140 [Rattus norvegicus] Rho GTPase [Dictyostelium discoideum AX4] ribosomal protein homologous to yeast S24 [Homo sapiens] Ribosomal protein L10a [Danio rerio] ribosomal protein L11 [Branchiostoma belcheri] ribosomal protein L11 [Mus musculus] ribosomal protein L13 [Branchiostoma lanceolatum] ribosomal protein L13a [Danio rerio] ribosomal protein L14 - human ribosomal protein L17 [Homo sapiens] ribosomal protein L19 [Xenopus laevis] ribosomal protein L21 [Siniperca chuatsi] ribosomal protein L23 [Canis lupus familiaris] ribosomal protein L27a [Plasmodium yoelii yoelii str. 17XNL] ribosomal protein L34 [Branchiostoma belcheri tsingtaunese] ribosomal protein L36 [Gallus gallus] ribosomal protein L4 [Pagrus major] ribosomal protein L5 [Mus musculus] Ribosomal protein L6 [Canis familiaris]
76
Ribosomal protein L6 [Danio rerio] ribosomal protein L7 [Oryzias latipes] ribosomal protein Large P0 [Protopterus dolloi] ribosomal protein S13 [Xenopus tropicalis] ribosomal protein S14 [Rattus norvegicus] ribosomal protein S20 [Solea senegalensis] Ribosomal protein S20 CG15693-PA [Drosophila melanogaster] ribosomal protein S27 [Homo sapiens] ribosomal protein S29 [Danio rerio] ribosomal protein S3 [Danio rerio] ribosomal protein S3 [Moniliophthora perniciosa] ribosomal protein S4, X-linked [Monodelphis domestica] ribosomal protein S8 [Gillichthys mirabilis] ribosomal protein S8 [Protopterus dolloi] ribosomal protein S9 [Homo sapiens] ribosomal protein S9 [Plutella xylostella] ribosomal protein, large, P0 [Danio rerio] RNA binding protein p37 AUF1 [Rattus norvegicus] rpL14 protein [Takifugu rubripes] RPS28 (RIBOSOMAL PROTEIN S28); structural constituent of ribosome [Arabidopsis thaliana] rs23 protein [Ciona intestinalis] S11 ribosomal protein [Sparus aurata] S14e ribosomal protein-like protein [Philodina sp. NPS-2005] S-adenosyl-L-homocysteine hydrolase [Blastopirellula marina DSM 3645] S-adenosyl-L-homocysteine hydrolase [Prosthecochloris aestuarii DSM 271] SEC22 vesicle trafficking protein-like 1 [Mus musculus] Serum albumin precursor (Allergen Bos d 6) (BSA) SET translocation [Mus musculus] signal transduction histidine kinase, nitrate/nitrite-specific, NarQ [Methylobacillus flagellatus KT] similar to alpha-Tubulin at 84B [Xenopus laevis] simple type II keratin K8b (S2) [Oncorhynchus mykiss] SMT3 supressor of mif two 3 homolog 2 [Mus musculus] Sodium/potassium-transporting ATPase alpha-3 chain (Sodium pump 3) (Na(+)/K(+) ATPase 3) (Alpha(III solute carrier family 25, member 5 [Mus musculus] SSR gamma subunit [Rattus norvegicus] stratifin [Homo sapiens] stress protein HSC70 [Xiphophorus maculatus] suppressor [Xenopus laevis] t-complex polypeptide 1 chaperonin delta chain - Japanese pufferfish t-complex protein 4a - bovine (fragments) Temporarily Assigned Gene name family member (tag-210) [Caenorhabditis elegans] tetra-ubiquitin [Saccharum hybrid cultivar H32-8560] Tln1 protein [Danio rerio] TPA: TPA_inf: HDC03427 [Drosophila melanogaster] TPR repeat [Delftia acidovorans SPH-1] Transcriptional regulator, MarR/EmrR family [Clostridium acetobutylicum ATCC 824] transfer RNA-Ala synthetase [Bombyx mori] translation elongation factor 2 [Spodoptera exigua] transporter, NRAMP family protein [Flavobacteriales bacterium HTCC2170] transposase [Oceanicola batsensis HTCC2597] triose phosphate isomerase [Polypterus ornatipinnis] triose-phosphate isomerase [Cyanidioschyzon merolae]
77
triosephosphate isomerase [Saccoglossus kowalevskii] triosephosphate isomerase B [Xiphophorus maculatus] tropomyosin [Fundulus heteroclitus] tropomyosin 2, beta [Mus musculus] tropomyosin4-1 [Takifugu rubripes] troponin C-like protein Try10-like trypsinogen [Mus musculus] trypsin (EC 3.4.21.4) precursor - bovine TUA1 (ALPHA-1 TUBULIN) [Arabidopsis thaliana] tubulin [Geodia cydonium] tubulin alpha 6 [Danio rerio] Tubulin alpha chain tubulin beta chain - rat (fragments) Tubulin beta chain (Beta tubulin) Tubulin beta-2 chain (Beta-tubulin 2) tubulin, alpha 4, isoform CRA_b [Homo sapiens] tubulin, alpha 4b [Xenopus tropicalis] tubulin, alpha 8 [Gallus gallus] tubulin, alpha 8 like 3 [Danio rerio] tubulin, beta 2a [Xenopus tropicalis] Tubulin, beta 2C [Homo sapiens] Tubulin, Beta family member (tbb-4) [Caenorhabditis elegans] type 2 actin [Pleurochrysis carterae] type II keratin E3 [Oncorhynchus mykiss] type II keratin subunit protein ubiquitin ubiquitin ubiquitin [Scyliorhinus torazame] Ubiquitin family protein [Tetrahymena thermophila SB210] ubiquitin/actin fusion protein [Gymnochlora stellata] ubiquitin/actin fusion protein 2 [Bigelowiella natans] ubiquitin-activating enzyme E1 (A1S9T and BN75 temperature sensitivity complementing) [Danio rerio] ubiquitin-like protein fubi and ribosomal protein S30 precursor [Homo sapiens] ubiquitin-like protein Ublp94.4 [Acanthamoeba castellanii] ubquitin/ribosomal protein S27Ae fusion protein [Curculio glandium] Unknown (protein for MGC:160678) [Xenopus laevis] Unknown (protein for MGC:82390) [Xenopus laevis] unknown [Homo sapiens] unknown [Schistosoma japonicum] unknown [Trichinella spiralis] unknown protein encoded by prophage CP-933K [Escherichia coli UTI89] unnamed protein product [Candida glabrata] unnamed protein product [Candida glabrata] unnamed protein product [Homo sapiens] unnamed protein product [Homo sapiens] unnamed protein product [Macaca fascicularis] unnamed protein product [Macaca fascicularis] unnamed protein product [Mus musculus] unnamed protein product [Mus musculus] unnamed protein product [Mus musculus] unnamed protein product [Tetraodon nigroviridis] unnamed protein product [Tetraodon nigroviridis] unnamed protein product [Tetraodon nigroviridis]
78
unnamed protein product [Tetraodon nigroviridis] unnamed protein product [Tetraodon nigroviridis] unnamed protein product [Tetraodon nigroviridis] unnamed protein product [Tetraodon nigroviridis] unnamed protein product [Tetraodon nigroviridis] unnamed protein product [Tetraodon nigroviridis] unnamed protein product [Tetraodon nigroviridis] unnamed protein product [Tetraodon nigroviridis] unnamed protein product [Tetraodon nigroviridis] unnamed protein product [Tetraodon nigroviridis] unnamed protein product [Tetraodon nigroviridis] unnamed protein product [Tetraodon nigroviridis] unnamed protein product [Tetraodon nigroviridis] unnamed protein product [Tetraodon nigroviridis] unnamed protein product [Tetraodon nigroviridis] unnamed protein product [Tetraodon nigroviridis] unnamed protein product [Tetraodon nigroviridis] unnamed protein product [Tetraodon nigroviridis] unnamed protein product [Tetraodon nigroviridis] unnamed protein product [Tetraodon nigroviridis] unnamed protein product [Tetraodon nigroviridis] unnamed protein product [Tetraodon nigroviridis] unnamed protein product [Tetraodon nigroviridis] unnamed protein product [Tetraodon nigroviridis] unnamed protein product [Tetraodon nigroviridis] unnamed protein product [Tetraodon nigroviridis] unnamed protein product [Tetraodon nigroviridis] unnamed protein product [Tetraodon nigroviridis] unnamed protein product [Tetraodon nigroviridis] unnamed protein product [Tetraodon nigroviridis] unnamed protein product [Tetraodon nigroviridis] unnamed protein product [Tetraodon nigroviridis] unnamed protein product [Tetraodon nigroviridis] unnamed protein product [Tetraodon nigroviridis] unnamed protein product [Tetraodon nigroviridis] unnamed protein product [Tetraodon nigroviridis] unnamed protein product [Tetraodon nigroviridis] unnamed protein product [Tetraodon nigroviridis] unnamed protein product [Tetraodon nigroviridis] unnamed protein product [Tetraodon nigroviridis] unnamed protein product [Tetraodon nigroviridis] unnamed protein product [Tetraodon nigroviridis] unnamed protein product [Tetraodon nigroviridis] unnamed protein product [Tetraodon nigroviridis] unnamed protein product [Tetraodon nigroviridis] unnamed protein product [Tetraodon nigroviridis] unnamed protein product [Tetraodon nigroviridis] unnamed protein product [Tetraodon nigroviridis] unnamed protein product [Tetraodon nigroviridis] unnamed protein product [Tetraodon nigroviridis] unnamed protein product [Tetraodon nigroviridis] unnamed protein product [Tetraodon nigroviridis] unnamed protein product [Tetraodon nigroviridis] unnamed protein product [Tetraodon nigroviridis]
79
unnamed protein product [Tetraodon nigroviridis] unnamed protein product [Tetraodon nigroviridis] unnamed protein product [Tetraodon nigroviridis] unnamed protein product [Tetraodon nigroviridis] unnamed protein product [Tetraodon nigroviridis] unnamed protein product [Tetraodon nigroviridis] unnamed protein product [Tetraodon nigroviridis] uracil-DNA glycosylase [Borrelia burgdorferi B31] UvrD/REP helicase [uncultured Prochlorococcus marinus clone ASNC2150] valosin containing protein [Danio rerio] vf14-3-3c protein [Vicia faba] V-Fos transformation effector [Scophthalmus maximus] vitronectin precursor [Homo sapiens] voltage-dependent anion channel [Paralichthys olivaceus] voltage-dependent anion channel 2 [Gallus gallus] YlACT1 [Yarrowia lipolytica] Zgc:112271 protein [Danio rerio]
Zgc:153867 protein [Danio rerio]
80
Table 3.6 Primers for the 11 newly identified viral proteins
ORFs Left primer Right primer
ORF049L ATGTACGTAT ACCCCGCAAT TCATTTTTTT TGCCTAACTC C ORF099R ATGCCCGTAT CAGATAAAGA TCATCTTTCA CAATTGGACA ORF107R ATGCAACGTT CCATAAATTT G TTACATGGCA GACTGCGCG ORF111R ATGGTGAATG AAATGTATAC A CTATAAAAGC GGGATGTTAA ORF118R ATGACCAGCC GGCGTTTC CTAAGCGAAT CCGAGCGTT ORF127R ATGTTAACCG CAATTGTAAC TTATTCGTTA AACGAATATC T ORF135L ATGGAAGAAC TCGTACAAGC TTACGGTTCC GGCGTCTT ORF136R ATGTACCCTG TATTGAAAAG TTATACAATT TTATAAACGG C ORF140R ATGTCAGAAT TTAGGGGTCA CTACGCTTTT TTATCCGCT ORF155R ATGGCACTAT TGATATCTAT TTACTCGCAA CTATCAATCA ORF162L ATGGCTTTCG TCACAGACAA TCACTCTTCT TCGTCGGGAA AC
81
Chapter Four
A novel SGIV-coding protein ORF158 is involved
in the regulation of K79 methylation of histone
H3
82
4.1 Sumarry
ORF158L, one of the 162 open reading frames (ORFs) predicted from the SGIV
genome, is a functionally unknown gene. In this chapter, we report that: ORF158L (i)
encodes a full length protein (~16Kd); (ii) is not essential for SGIV replication in cell
culture; (iii) is not engaged in the transcriptional regulation of SGIV gene expression
in cell culture; (iv) does not affect viral protein translation; (v) is highly accumulated
in the host nucleus and viral assembly center; (vi) is co-localized with histone H3 of
the host cells; (vii) is involved in the regulation of lysine 79 methylation of host
histone H3.
83
4.2 Introduction
Singapore Grouper Iridovirus (SGIV), an iridovirus of the ranavirus genus, was
isolated from brown-spotted grouper in 1998 after they caused significant economic
losses in Singapore marine net cage farms in 1994 (Chua et al., 1994; Qin et al.,
2001). In 2004, the genome of SGIV was sequenced, 162 Open Reading Frames
(ORFs) were predicted based on the 140,131 base pair length viral genome, and 26
SGIV proteins were identified (Song et al., 2004). Two years later, in 2006, additional
25 SGIV viral proteins were reported by proteomics methods (Song et al., 2006).
Although the proteins of 51 ORFs out of the 162 ORFs are reported, the translational
products of the remaining 117 ORFs are unknown (Song et al., 2004; Song et al.,
2006). Recently, Chen and coworkers reported that 127 ORFs of SGIV were
transcriptionally active (Chen et al., 2006). ORF158L was one of the 162 ORFs
predicted in the SGIV genome. Although ORF158L was reported to be
transcriptionally active and was classified as one of the SGIV early genes, the
translational product and function of the ORF158L remain unknown (Chen et al.,
2006). The present study aims to elucidate the biological function of the ORF158L
encoded protein.
4.3 Materials and Methods
4.3.1 Cell and virus infection
Protocols for cell and virus infection were described earlier in Chapter 2.
4.3.2 Antibodies
The ORF158L gene was PCR amplified and cloned into pET-15b (Novagen). The
recombinant protein was over expressed at 18 °C overnight with 0.1mM IPTG by
using a BL21 strain (DE3; Novagen). The recombinant protein was first purified by
using a Ni-NTA column (Qiagen) and was further purified by gel filtration
84
chromatography using a Superdex 75 column (Amersham). The purified protein was
used for generation of monoclonal antibodies. Monoclonal antibodies against beta
actin, polyclonal antibodies against histone H3 and K79 monomethylated histone H3
were purchased from ABCAM.
4.3.3 Knockdown of ORF158L
Morpholinos sequence against ORF158L (158_MO) is
5'-AGTTTGCTACGATGGCCCACCCCAT-3'. Control MO (Ctr_MO) sequence is
5'-CCTCTTACCTCAGTTACAATTTATA-3'. MOs were delivered into GEC cells
by electrophoresis using Nucleofactor (Amaxa). Forty hours after transfection, 5 MOI
SGIV were added into the cells. And the samples were harvested at 48h p.i. for further
investigations.
4.3.4 Electron Microscopy
The embryonic cells were fixed in 2.5% glutaraldehyde, 2% paraformaldehyde-PBS,
postfixed with 1% osmium tetroxide, followed by dehydration and infiltration. The
samples were sliced into ultra thin sections and stained with 2% uranyl acetate and
1% lead citrate. The samples were examined with a Philips CM10 electron
microscopy at 100 kv.
4.3.5 DNA Microarray and Real Time RT PCR
The methods for DNA Microarray and real time RT-PCR were described in chapter 2.
4.3.6 Immunofluorescence
Cells either treated or untreated with SGIV and morpholinos were cultured in Lab-
Tek® II CC2™ Chamber Slide™ (Nalge). In the case of SGIV infection, 5MOI SGIV
were used and the cells were investigated 48h p.i. later. The procedure of immunolabeling
is described next: first, the cells were washed with 1 × PBS; followed by the addition of
fixation buffer (3% w/v) and 100% methanol (pre-cooled at -20°C) and subsequently
85
the samples were washed with 1 × PBS; after adding blocking buffer (1% BSA, 1 ×
PBS), samples were incubated with antibodies. 1 hour later, the samples were washed
with 1 × PBS, followed by incubation with Alexa Fluor probes (Invitrogen) for 1 hour;
subsequently, trihydrochloride (Invitrogen) was added to samples and the samples
were washed with 1 × PBS. The chamber was removed and the sample was sealed
with cover slides. Finally, the samples were observed using Olympus Fluoview
FV500.
4.4 Results
4.4.1 Western blot against ORF158L When 158 mAbs was used to do western blots,
no protein could be detected in the SGIV mock-infected cell lysate (Fig 4.1. A.
Middle lane) and only one protein band was detected in the SGIV-infected cell lysate
(Fig 4.1. A. First lane). The ORF158L expressed in SGIV-infected cells was the same
molecular size as the tag-removed recombinant ORF158L proteins expressed in E.
coli (Fig 4.1. A. First lane and third lane). This study demonstrated that ORF158L
encoded a full length viral protein upon infection.
4.4.2 Knockdown of ORF158L Fig 4.1. B shows no detectable ORF158L in the
158_MO treated GEC with SGIV-infected when compared to the ORF158L in the
CTR_MO treated GEC with SGIV-infected (Fig 4.1. B).This evidence shows that
ORF158L was efficiently knocked down by anti-ORF158 MO (i.e. 158_MO).
Under EM, there was minimal alteration of virus assembly, the morphology of mature
SGIV particles did not change and the budding of SGIV particles was normal when
ORF158L was knocked down (Fig 4.1. C).
4.4.3 DNA Microarray and Real time RT-PCR investigation of transcriptomes of
viral genes when ORF158Lwas knocked down There was no obvious variation in
viral transcription upon ORF158L knockdown (Fig 4.2. A). Real time RT PCR results
86
of 7 ORFs were consistent with DNA microarray data showing that the transcription
of viral genes was not significantly affected when we knocked down ORF158L (Fig
4.2. B&C).
4.4.4 Subcellular localization of ORF158L and ORF158L & Histone H3
colocalization At late SGIV-infected stage (i.e. 48h p.i.), the morphology of nucleus
of host cells (i.e. GECs) was significantly altered (Fig 4.3.). The inter chromosome
spaces in host nucleus were dramatically increased at the late stage of SGIV-infected
and ORF158L accumulated in the host nuclear and viral assembly center (Fig 4.3). In
the 158_MO treated cells, no fluorescent signals corresponding to ORF158L were
detected (Fig 4.3. N). This phenomenon confirmed that ORF158L was successfully
knocked down.
Furthermore, the colocalization of ORF158L and histone H3 at host cell nucleus was
observed using immunofluorecence (Fig 4.4).
4.4.5 iTRAQ study of the effect of ORF158L knockdown at proteome scale
iTRAQ data identified 51 viral proteins and all these viral proteins were not
significantly changed when we knocked down ORF158L during SGIV-infection (Fig
4.7). Given the unknown nature of the host genome, only 700 plus host proteins were
identified from the iTRAQ data and most of the identified host proteins were not
notably changed when we knocked down ORF158L during SGIV infection compared
to those without ORF158L knockdown during SGIV infection (Data not shown).
4.4.6 ORF158L is involved in Histone H3 lysine 79 (K79) methylation regulation
iTRAQ data showed that host histone H3 were up-regulated when we knocked down
ORF158L during SGIV-infection (Figure 4.5). To validate this finding, we carried out
western blot against histone H3 by using the commercial polyclonal antibodies
against histone H3 . Surprisingly, the total histone H3 was not up-regulated when we
87
knocked down ORF158L during SGIV-infection (Figure 4.6). Because we did not
include any kind of post translational modification when we analyzed the iTRAQ data,
the iTRAQ data only showed that the non modified form of histone H3 was down-
regulated upon SGIV infection. Given that total histone H3 did not change when we
knocked down ORF158L during SGIV infection, we suspected that the modified form
of histone H3 might be down-regulated. Histone H3 can undergo many kinds of post
translational modifications, such as methylation, acetylation, and phosphorylation. To
determine what kind of modification could account for the findings presented above,
we reexamined the iTRAQ raw data, and found that the peptide “EIAQDFKTDLR”
of histone H3 was the only peptide identified from iTRAQ analysis and the non
modified form of this histone H3 peptide was up-regulated when we knocked down
ORF158L during SGIV infection. This histone H3 peptide sequence
“EIAQDFKTDLR” contains a lysine (i.e. K), which is referred as K79 in histone H3 ,
and this K79 can undergo methylation. We proposed that K79 methylated form of
histone H3 could be down-regulated. A combination of up regulation of the K79
methylation form of histone H3 and down-regulation of K79 non-methylated histone
H3 would make the total histone H3 unchanged. To test this hypothesis, we carried
out western blots against K79 methylated histone H3 by using the specific polyclonal
antibodies against K79 methylation form of histone H3. The results are shown in
Figure 4.6. The first panel shows that the K79 methylation form of histone H3 was
up-regulated upon SGIV-infected and down-regulated when we knocked down
ORF158L during SGIV infection; the second panel shows that total histone H3 was
not changed; the third panel shows that ORF158L was successfully knocked down;
the bottom panel shows beta actin, which was used as controls, were not changed
(Figure 4.6). By combining the iTRAQ and western blot data, we can conclude that
88
K79 monomethylation of histone H3 was up-regulated when GECs were infected by
SGIV. When ORF158L was knocked down in SGIV-infected GECs, the K79
monomethylation of histone H3 was down-regulated compared to those SGIV-
infected GECs without ORF158L knockdown (Figure 4.6).
4.5 Discussion
A total of 12 iridovirus genomes including SGIV has been sequenced (Tidona and
Darai, 1997; He et al., 2001; Jakob et al., 2001; He et al., 2002; Jancovich et al.,
2003; Do et al., 2004; Tan et al., 2004; Zhang et al., 2004; Song et al., 2004; Tsai et
al., 2005; Lu et al., 2005; Delhon et al., 2006). Genomic annotations have predicted a
number of ORFs based on the genomic sequences. In SGIV, a total of 162 ORFs was
predicted on the basis of the viral genomic sequence, and only 23 are highly
homologous to other iridovirus proteins with specific biological functions (Song et al.,
2004). To date, only 45 viral proteins out of the 162 ORFs have been detected at the
protein level using mass spectrometry (Song et al., 2004; Song et al., 2006). Recently,
127 ORFs out of the 162 ORFs have been characterized at the transcriptional level
and are classified into three groups (i.e. immediate early, early and late genes) (Chen
et al., 2006). Although ORF158L has been classified to be an early gene, the
corresponding proteins and its functions remain unknown (Chen et al., 2006). In this
study, we found that ORF158L was involved in the regulation of histone H3 K79
methylation.
Besides methylation, histone H3 can be acetylated and phosphorylated, and these
modifications affect gene expression (Zhang and Reinberg, 2001; Lacoste et al., 2003;
Li et al., 2007). The different modifications of histone H3 , including acetylation,
phosphorylation, and methylation, have different functions. For example, acetylations
of histone H3 are generally related to transcriptional activation, while, methylations of
89
histone H3 are connected with both transcriptional activation and transcriptional
repression (Roth et al., 2001. ; Zhang and Reinberg, 2001). The methylation of
histone H3 can either be lysine methylation or arginine methylation. The K4, K9, K27,
K36 and K79 of H3 have been reported to have potential to be methylated (Li et al.,
2007). Generally, the methylation of K9 and K27 of histone H3 are related to
transcriptional repression, whereas, K4, K36 and K79 of histone H3 are linked to
transcriptional activation (Martin and Zhang, 2005). Although K79 methylation of
histone H3 was regarded to be linked with gene transcriptional activation, the viral
coding protein regulated the host histone H3 K79 methylation were still unknown. We
proposed that ORF158L could benefit the SGIV by regulating host gene transcription.
90
Figure 4.1 Identification and knockdown of ORF158L in cell culture. A.) Upper panel
is western blot by using monoclonal antibodies against ORF158L. Lower panel is
western blot by using beta actin monoclonal antibodies. First lane is the cell lysate of
GECs which are 48 hours post infection with 5MOI SGIV. Second lane is the cell
lysate of mock SGIV-infected GECs. Third lane is the purified ORF158L protein with
his-tag cleaved by thrombin. B.) First lane is the cell lysate of anti 158 morpholinos
(158_MO) treated GECs which are 48 hours post infection with 5MOI SGIV. Second
lane is the cell lysate of control morpholinos (Ctr_MO) treated GECs which are 48
hours post infection with 5MOI SGIV. C.) Left is the EM picture of a section of
158_MO treated GECs which are 48 hours post infection with 5MOI SGIV. Right is
the EM picture of a section of Ctr_MO treated GECs which are 48 hours post
infection with 5MOI SGIV. ‘ ’ points to mature SGIV particles.
91
0
5000
10000
15000
20000
ORF009L ORF033L ORF041L ORF049L ORF051L ORF120L ORF155R
Signal Intensity
CTR_MO
158_MO
0
0.5
1
1.5
2
ORF009L ORF033L ORF041L ORF049L ORF051L ORF120L ORF155R
2-∆∆Ct
Ctr_MO_2-∆∆Ct
158_MO_2-∆∆Ct
B
C
Figure 4.2 A comparison of transcriptional profiles of SGIV with and without
ORF158L knock down. A.) X axis is the DNA microarray data of Control
morpholinos (Ctr_MO) treated GECs which are 48 hours post infection with 5MOI
SGIV normalized to beta actin. Y axis is the DNA microarray data of anti 158
morpholinos (158_MO) treated GECs which are 48 hours post infection with 5MOI
SGIV normalized to beta actin. B.) Fold changes between 158_MO treated and
Ctr_MO treated DNA microarray data of random selected ORFs. X axis lists the
random selected ORFs.Y axis is the fold changes of selected ORFs when we
compared the normalized DNA microarray data of 158_MO to Ctr_MO. C.) The
investigations of the fold changes between 158_MO treated and Ctr_MO treated
samples of random selected ORFs in Figure 2 B by Real time RT-PCR.
92
Figure 4.3 Subcellular localization of ORF158L. Blue colors represent DAPI staining ds DNA.
Green color represents the ORF158L.
94
00.20.40.60.81
1.21.41.61.82
CTR_MO 158_MO
Figure 4.5. iTRAQ study of ORF158L knockdown effects to histone H3. When the
SGIV-infection GECs with 158_MO treatment was compared to those with
CTR_MO treatment at proteomics scale by iTRAQ, non modification form of histone
H3 was found to be up-regulated when ORF158L was knocked down.
95
Figure 4.6. Western blot of histone H3 modification. First lane represents SGIV mock-
infected GEC; second lane represents SGIV-infected GEC; third lane represents SGIV-
infected GEC with CTR_MO treatment; fourth lane represents SGIV-infection GEC with
158_MO treatment. First panel represents K79 monomethylation form of histone 3; second
panel represents histone 3; third panel represents ORF158L; fourth panel represents beta
actin.
96
0 0.5 1 1.5 2
ORF162LORF156LORF155RORF151LORF140RORF136RORF135LORF127RORF125RORF124RORF119RORF118RORF111RORF107RORF102LORF101RORF099RORF098RORF093LORF090LORF088LORF086RORF084LORF082LORF081LORF075RORF072RORF069LORF067LORF057LORF056RORF055RORF049LORF046LORF045LORF043RORF039LORF038LORF026RORF025LORF022LORF019RORF018RORF016LORF015LORF012LORF008LORF007LORF006R
Fold changes (158_MO/CTR_MO)
Figure 4.7. iTRAQ study of ORF158L knockdown effects in SGIV-infected GECs. Viral protein
comparison between SGIV-infection GECs with 158_MO treatment and those with CTR_MO
treatment at proteomic scale by iTRAQ shows that the synthesis of viral proteins were not inhibited
when ORF158L was knock down during SGIV-infection of GECs.
98
5.1 Summary & Introduction
The functional studies show that ORF158L is involved in the regulation of host
histone H3 K79 methylation. In this chapter, we report the structural study of
ORF158L. After we failed to solve the structure of ORF158L by NMR, we solved the
structure of ORF158L by X-ray crystallography at 1.8 Å resolution. The structure of
ORF158L is highly similar to ASF1, which is an anti-silencing factor involved in
nucleosome assembly. We also verified that ORF158L can bind to H3 & H4 complex
with sub-micromolar binding affinity using SPR.
99
5.2 Materials and Methods
5.2.1 Construction of the expression plasmid
Full length DNA coding sequence of ORF158L, which coded 138 amino acids, was
amplified by PCR from the SGIV genome and the PCR products were further cloned
into the pET-15b vector. The sequence was verified by DNA sequencing. The
resulting construct was transformed into DE3-star strain for protein expression.
5.2.2 Expression and purification of ORF158L
Large scale expression of ORF158L was done by the following procedure: the
transformed cells were cultured in LB with 100µg/ml of ampicillin; 0.2mM IPTG was
added when OD600 value of cells reached 1.0; ORF158L was induced to overexpress
by culturing the cells at 18°C over night. The gene product of ORF158L was purified
under native conditions by Ni-NTA chromatography (Qiagen) and his-tag was
cleaved by thrombin. The elutes from Ni-NTA column were further purified by
SuperdexTM 75, followed by mono Q ion exchange chromatography.
5.2.3 Mass spectrometry analysis
Purified ORF158L was subjected to MALDI-TOF mass spectrometry to verify its
molecular mass.
5.2.4 Dynamic light scattering (DLS) study
In order to assess the homogeneity of ORF158L, the purified protein was
characterized using DLS with Dynapro instrument (Protein Solutions Inc). The data
were analyzed by using Dynamics 5.0 (Moradian-Oldak, 1998 ).
5.2.5 Circular dichroism (CD) study
The secondary structure of ORF158L was characterized by CD. ORF158L was
prepared in 10-20 mM sodium phosphate buffer, pH6.8. The experiment was
100
conducted in the far UV range (190-260 nm) with Jasco spectropolarimeter J-720 at
25°C.
5.2.6 NMR sample preparation
15N and 15N/13C-labeled ORF158L were prepared in M9 medium with the addition of
(15NH4)2SO4 for 15N labeling and (15NH4)2SO4 and [
13C] glucose for 15N/13C double
labeling, respectively. Samples for NMR spectroscopy were concentrated to 0.5 mM
in the buffer which contained 20 mM sodium phosphate buffer with pH6.6, 1 mM
Mg2Cl2, 5 mM 2-ME, 150 mM NaCl. Finally, 450 µl of protein solution was mixed
with 50µl of D2O (final concentration is 10%) and transferred to an NMR tube
(Cambridge Isotope Laboratories, Inc.). All samples were examined by 15% SDS-
PAGE after purification.
5.2.7 NMR Experiments and data processing.
All NMR spectra were acquired at 298K on a Bruker 800MHz spectrometer. All
NMR data were processed with NMRPipe (Delaglio 1995.) and analyzed with
NMRview (Johnson 1994.). Structure calculation could be performed with the
program CYANA (Herrmann 2002.) and the phi and psi angle constraints could be
generated from TALOS program (Cornilescu 1999). Ten conformers with the lowest
final values of the CYANA target function could be chosen to represent the most
probable structures.
5.2.8 SeMet ORF158L preparation
SeMet ORF158L was prepared by using M9 SeMET growth media kits and
following the manufacturer’s protocol (Medicilon).
5.2.9 Crystallization
Purified native ORF158L was concentrated to approximately 8 mg/ml in the buffer
consisting of 20 mM Tris pH8.5, 150 mM NaCl, 1 mM MgCl2, 1 mM DTT. Initial
101
crystallization screening was performed using the sparse matrix approach using kits
from Hampton Research (Screen I and II). The protein was crystallized by the
hanging drop vapor diffusion method at 25°C. The best crystals appeared when
equilibrating against reservoir solution containing 3% glycerol, 18% PEG3350, 0.1 M
Tris pH8.5, 0.3 M NaAc with a drop size of 1 µl protein solution mixed with 1 µl
reservoir solution. Similar to the native ORF158, the recombinant SelMet ORF158
was concentrated to approximately 8mg/ml in the same buffer as the native protein.
SelMet substituted crystals were grown under essentially the same conditions as the
native ORF158. In order to obtain bigger SelMet crystals macroseeding was
employed. The crystals grew to a final size of approximately ~0.2×0.2×0.05mm3
(Figure 5.7).
5.2.10 Data collection, structure solution and refinement
Prior to data collection, both native and SelMet crystals were soaked in the reservoir
solution with step wise increased concentrations of glycerol (5% glycerol, 10%
glycerol, 15% glycerol, 20% glycerol, 25% glycerol) as cryo-protectant. Crystals were
picked up in a nylon loop, and frozen at 100K in N2 (gas) cold stream. Synchrotron
data were measured at beamline X29, NSLS, Brookhaven National Laboratory.
Diffraction data were processed using the program HKL2000. For phase
determination, the resolution range from 50 to 2.5Å was chosen. Two of the three
expected Se sites of one asymmetric unit were found using the program BNP. BNP
was further used to automatically build 90% of the total model. Remaining parts of
the model were built manually using the program O. Further cycles of model building,
alternating with refinement using the program CNS resulted in the final model, with
an R-factor of 0.207 (Rfree=0.245). The data collection and refinement statistics of
ORF158 was listed in Table 5.1.
102
5.2.11 Expression and purification of H3 and H4 complex
The plasmid, which co-expresses histone H3 and H4, was kindly provided by Dr.
James T. Kadonaga. Large scale expression of recombinant H3 and H4 was
performed by the following procedure: transformed cells were cultured in LB with
100µg/ml of ampicillin; 0.4mM IPTG was added when OD600 value of cells reached
0.6; Recombinant H3 and H4 was induced to overexpress by culturing the cells at
37°C for 2 hours. Unless stated otherwise, all subsequent operations were performed
at 4 °C. The cells were pelleted by centrifugation (4,000g, 17min). The pellets were
resuspended in the buffer (20 ml/liter bacterial cell culture), which contains 10 mM
Hepes (pH7.6), 0.1 M NaCl, 10% (v/v) glycerol, 1 mM dTT, and lysed by three
cycles of sonication of 30 s each. The lysate was subjected to centrifugation (10,000 g,
30 min), and the supernatant was discarded. The pellet was suspended in 0.25 N HCl
(10ml/liter bacterial cell culture) and dispersed with a dounce homogenizer. The
resulting suspension was incubated at -20 °C for 30 min and subjected to
centrifugation (10,000 g, 60 min). The supernatant was collected and neutralized with
0.17 volumes of 2M Tris base. The H3-H4 tetramers were purified by monoS
chromatography (flow rate = 0.8ml/min; fraction size = 0.5 ml). Protein was eluted
with a linear gradient (20 ml) from 0.1 to 2 M NaCl in the buffer, which contains 10
mM Hepes (pH7.6), 0.1 M NaCl, 10% (v/v) glycerol, 1 mM dTT. The H3 and H4
complex was eluted at about 1 M NaCl. (Figure 5.8)
5.2.12 Surface Plasmon Resonance (SPR)
Surface plasmon resonance (SPR) was carried out on a BIACORE 3000 instrument,
which uses SPR technology to investigate the interactions of macromolecules with
each other and with small ligands by immobilizing one of the ligands on a gold
surface followed by flowing the second partner (analyte) over the immobilized ligand
103
surface. One hundred microliters of 25µg/ml histone H3 and H4 complex were
captured on the CM5 (Biacore). After ligand immobilization on the analyte cell, both
the analyte and the reference cells were washed with the histone binding buffer (10
mM HEPES [pH 7.4], 150 mM NaCl, and 0.005% Surfactant P20). Serial dilution
from 0.1 µM to 25 µM of ORF158L protein sample was injected across the chip.
Binding proteins were removed by 10 microliters of 100 mM HCl after each
dissociation cycle. The data were fitted to a 1:1 binding mode with mass transfer
effect using the BIAevaluation 3.1 software (BIAcore).
5.3 Results & Discussion
5.3.1 Protein purification profiles of ORF158L
ORF158L was cloned into the pET-15b vector, expressed as a soluble form, purified
with AF, SEC, and IEC to achieve the appropriate homogeneity level for downstream
studies. The chromatography profile of ORF158L is shown in Figure 5.1.
5.3.2 Mass spectrometry analysis
The molecular mass of ORF158L is 16050 Dalton, which matched the theoretical
M.W. of 16069 Dalton. The SeMET ORF158L is 16240 Da (Figure 5.2), which
means all 4 methionines in ORF158L were replaced by Se-Met. The sequence here is
referring to the full length of ORF158L (15770 Da) plus GSH residues (299 Da) at the
N-terminal of recombinant ORF158L after thrombin cleavage of the his tag.
5.3.3 Dynamic light scattering (DLS) study
The DLS data shows that ORF158L behaves as a 81.8 kDa protein by aggregating to
be ORF158L polymers (Figure 5.3).
5.3.4 Circular dichroism study
The CD spectrum measured in the far UV region (190-260nm) showed that the
secondary structure of ORF158L is β sheet-dominant (Figure 5.4).
104
5.3.5 1D NMR
A simple one-dimensional proton experiment is the most basic spectrum in NMR
spectroscopy that can be acquired in a short time (usually not longer than a few
minutes) for samples as dilute as 0.01 mM. It provides us the folding status of
proteins (i.e. a protein is folded or unfolded). The protein folding status is important
for any further functional or structural studies on the protein as only folded proteins
retain their functional activity and three-dimensional structure. Figure 5.5 A shows a
typical one-dimensional proton NMR spectrum of a folded protein (upper panel) and a
typical one-dimensional proton NMR spectrum of an unfolded protein (lower panel).
The folded protein shows signal dispersion downfield (left) of 8.5 ppm and upfield (to
the right) of 1 ppm in one-dimensional proton NMR spectrum. With comparison, an
unfolded protein sample shows strong signals around 8.3 ppm, which is the region
characteristic for amide groups in random coil conformation, and no signal dispersion
below approximately 8.5 ppm, and also no further signals to the right of the strong
methyl peak at 0.8 ppm (Figure 5.5 A). Figure 5.5 B shows the 1D NMR spectra of
ORF158L. The 1D NMR spectra of ORF158L identifies that ORF158L is a well folded
protein (Figure 5.5 B). The signal dispersion upfield (to the right) of 0 ppm in one-
dimension proton NMR spectrum indicates that ORF158L is a well folded and beta
sheet-dominant protein.
5.3.6 1H-15N HSQC study of ORF158L
A one-dimensional spectrum of the protein molecule contains overlapping signals from
many hydrogen atoms because the differences in chemical shifts are often smaller than
the resolving power of the experiment. This problem has been bypassed by designing
experimental conditions that yield a two-dimensional NMR spectrum. Compared to 1H-
15N HSQC spectrum of ORF129, which is another protein coded in the SGIV genome,
105
1H-15N HSQC spectrum of ORF158L shows much better dispersions (Figure 5.6).
However, about 20% of the backbone amino acids signals were undetectable in the
1H-15N HSQC spectrum of ORF158L. This phenomenon indicates that ORF158L may
have dynamic regions. As a result, we failed to solve the structure of ORF158L by
NMR.
5.3.7 Overall structure of ORF158L
The structure of recombinant SelMet ORF158 of SGIV was solved by the
multiwavelength anomalous dispersion (MAD) method from synchrotron data and
refined to a present R-factor of 0.207 (Rfree=0.245) at 1.8 Å resolution. Statistics for
the Ramachandran plot from an analysis using PROCHECK gave over 86% of
nonglycine residues in the most favored region and 0% residues in the disallowed
region. The asymmetric unit consists of an ORF158 molecule comprising 133
residues from Ile5 to Ala137. Neither the N-terminal four residues nor the last
residues at the C-terminal had interpretable density and were not modeled. The
organization of ORF158 molecules in the crystal is consistent with it being a
monomer in the solution studies such as gel filtration experiments. Figure 5.9 A
shows the ribbon representation of the crystal structure of ORF158.
ORF158 molecule consists mainly of two β-sheets each consists of four and seven
antiparallel β-strands respectively and forms an elongated β- sandwich domain. In
addition, two short α-helices on one surface of the molecule with several loops
between the secondary structures. The overall architecture of ORF158 resembles an
immunoglobulin-like fold. The cavity at the β-sandwich domain is predominantly
hydrophobic in nature. We suggest that the cavity may recognize a specific small
hydrophobic ligand and interact with side chains of the cavity residues. However the
exact roles of this cavity are not yet established.
106
The structure of ORF158 has a concave surface on one side of the β-sandwich domain.
This concave surface spread across strands from β9 to β11 of the β-sheet. The
calculated electrostatic surface potential shows strong negative charge in the concave
region. It is worth mentioning here that several conserved residues of ORF158 are
located in this region and would suggest an important functional role for this surface
feature. Our functional data suggest that ORF158 is a histone binding protein and we
speculate that this function could be related to this structural feature.
5.3.8 Sequence and structural homology
Blast searches revealed that ORF158 has no significant sequence homology with any
protein with known function. However it exhibits a maximum identity of less than
18% with two hypothetical proteins from Ostreococcus lucimarinus and Laccaria
bicolor. A search for topologically similar proteins within the PDB data base was
performed with the program DALI. Interestingly, significant structural similarity was
found between Asf1 and ORF158. It is worth mentioning here that the BLAST search
against all the protein sequences did not identify any similarity with the Asf1 protein.
The sequence identity among these proteins was only 13%, however their three
dimensional structures are highly homologous. The highest structural similarity is
observed between ORF158 and the Asf1, yielding an rmsd of 3.4 Å for 73 Cα atoms,
with 13% sequence identity. This observation suggests that there may be an
evolutionarily conserved structural and functional relationship among these proteins,
while at the same time the amino acid sequences of these proteins have diverged.
5.3.9 Putative histone binding region
Taking clues from the structural resemblance of ORF158 with Asf1, we hypothesize
that ORF158 could be a histone binding protein and this property is required to
regulate host histone H3 K79 methylation during SGIV replication. To verify our
107
hypothesis we inferred the ORF158 Histone proteins complex by superimposing
ORF158 on Asf1 of the heterotrimer complex with Histone 3 (H3) and Histone 4 (H4)
proteins. In the inferred complex, the Histone proteins, particularly H3 protrudes into
the negatively charged surface concave region and established several hydrogen
bonding contacts with the nearby conserved residues. The key residue Val94 of Asf1,
Ala 92 in the case of ORF158, occupies in the same pocket. Figure 5.9 shows the
inferred ORF158 complex model. Further to verify this hypothesis, we have
expressed and purified Histone proteins (Figure 5.8) and examined the formation of
complex with ORF158. The histone proteins are shown to co-localize with ORF158 in
the nucleus of SGIV-infected cells (Figure 4.4). Furthermore we have also carried out
Surface Plasmon Resonance (SPR) to obtain the binding affinities between them.
ORF158 binds with Histone which in the order of 0.362µM indicates the formation of
heterotrimer complex (Figure 5.10). Based on our experimental observations and
inferred Histone complex model, we propose that ORF158 binds with histone proteins
in a way similar to Asf1. Figure 5.9 E shows the electrostatic surface potential of
ORF158. The surface concave is located in a strong negative patch. It has been shown
for Asf1 that a mutation of Val94 with Arg completely abolished the histone binding
property of Asf1. This may be due to the introduction of large side chain residues in
this pocket. A similar argument holds true for ORF158, mutating Ala 92 with a large
side chain amino acid may affect its histone recognition. Future studies will be
directed to understand the role of key residues in ORF158 to recognize histone
proteins.
108
Figure 5.1 Purification of ORF158L. A) Gel filtration profile of ORF158L on
Superdex-30 gel filtration column (Amersham); B) Ion exchange chromatography of
ORF158 on HiTraq 5ml Q column (Amersham): C) SDS page of final purified
ORF158L.
109
Figure 5.2 Molecular Mass determination by MS. A) MS results of Native ORF158L;
B) MS results of SeMET ORF158L.
112
Figure 5.5 Characterization of protein structures by one-dimensional NMR
spectroscopy: A). Upper panel: A typical one-dimensional proton NMR spectrum of a
folded protein. Lower panel: A typical one-dimensional proton NMR spectrum of an
unfolded protein sample. B). One-dimensional proton NMR spectrum of ORF158L.
113
Figure 5.6 Characterization of protein structures by two-dimensional NMR
spectroscopy A) the 1H-15N HSQC spectrum of an unfolded protein ORF129;
B) the 1H-15N HSQC spectrum of ORF158L;
115
Figure 5.8 Purification of histone H3 & H4 complex. A) Chromatograph profile of
histone H3 & H4 complex monoS column (Amersham); B) SDS page of purified
histone H3 & H4 complex.
116
Figure 5.9 Structure of ORF158L and models of ORF158L and histones complex. A)
Overall structure of ORF158L; B) Structure comparison between ORF158L and Asf1;
C) The model of ORF158L and histone H3 complex; D) The model of ORF158L and
histone H3 & H4 complex; E) The electrostatic surface potential of ORF158L, blue
color represents positive charge, red color represents negative charge.
118
Table 5.1 Data collection and refinement statistics of ORF158L
Data collection and refinement statistics
Space group C2 Unit cell dimensions (Å) a=59.98, b= 41.01, c= 52.33 α=90, β= 95.32, γ=90.0 Data set Peak Inflection
Resolution range (Å) 50-1.8 Wavelength (Å) 0.979 0.9792 Observed hkl 86250 73264 Unique hkl 11855 10223 Completeness (%) 99.9 99.9 Overall (I/oI) 14.8 12.3 Rsym a (%) 6.2 6 Refinement and quality Resolution range [I>o(I)] 1.8 -20.0 Rwork b (no. of reflections) 0.2074(21617) Rfree c (no. of reflections) 0.2447(1093) rmsd bond lengths (A) 0.006 rmsd bond angles (deg) 1.3 Ramachandran plot Most favored regions (%) 86 Additional allowed regions (%) 13.2 Generously allowed regions (%) 0.9 Disallowed regions (%) 0
120
6.1 Achievements
In this thesis, we have presented the following achievements:
1) We are the first to study the temporal and differential gene expression of
Singapore grouper iridovirus (SGIV) by DNA microarray. To our knowledge,
this is one of the first studies where DNA microarray technique and also real-
time RT-PCR were used for the investigation of fish viruses. (Chen et al.,
2006)
2) We are the first to study the SGIV and host cell interaction at proteomics scale
by using iTRAQ. (Chen et al., 2008)
3) We are the first to demonstrate that ORF158L is a real protein encoded by
SGIV.
4) We are the first to characterize the function of ORF158L and show that
ORF158L is involved in the regulation of histone H3 K79 methylation.
5) We are the first to elucidate the crystal structure of ORF158L.
6.2 Future experiments
In future, we are going to further investigate the following scientific questions:
1) We are going to study the mechanism about how ORF158L regulates host
histone H3 K79 methylation and the consequences of the regulation of host
histone H3 K79 methylation.
2) By combining knockdown, DNA microarray and iTRAQ, we are going to
carry functional studies of SGIV genes.
3) With the help of NMR and X-ray crystallography, we are going to continue
the structural studies of both viral structural proteins and viral non structural
proteins.
121
4) Based on the functional genomics and structural genomics studies of SGIV,
we are going to uncover the SGIV pathogenesis and provide valuable
knowledge.
5) Based on the functional genomics and structural genomics studies of SGIV,
we are going to design new drugs and raise new strategies in the treatment of
SGIV infection on grouper.
122
References:
Aparicio, S., J. Chapman, E. Stupka et al. 2002. Whole-genome
shotgun assembly and analysis of the genome of Fugu rubripes.
Science 297:1301-1310.
Ashburner, M., C. A. Ball, J. A. Blake et al. 2000. Gene ontology: tool
for the unification of biology. The Gene Ontology Consortium. Nature
Genet. 25:25-29.
Bax, A. 2003. Weak alignment offers new NMR opportunities to study
protein structure and dynamics. Protein Sci. 12:1-16.
Bloch, F., W. W. Hansen, and M. E. Packard. 1946. Nuclear induction.
Phys.Rev. 69:127-150.
Blow, D. M. 2002. Rearrangement of Cruickshank's formulae for the
diffraction-component precision index. Acta Crystallogr D Biol
Crystallogr. 58:792-797.
Brenner, S. E. 2001. A tour of structural genomics. Nat Rev Genet.
2:801-809.
Bruschweiler, R. 2003. New approaches to the dynamic interpretation
and prediction of NMR relaxation data from proteins. Curr. Opin. Struct.
Biol. 13:175-183.
Cao, R., and Y. Zhang. 2004. The functions of E(Z)/EZH2-mediated
methylation of lysine 27 in histone H3. Curr. Opin. Genet. Dev. 14:155-
164.
Chayen, N. E. 2004. Turning protein crystallization from an art into a
science. Curr. Opin. Struct. Biol. 14:577-583.
Chen, L. M., B. N. Tran, Q. Lin, T. K.Lim, F. Wang, C. L. Hew. 2008.
123
iTRAQ analysis of Singapore grouper iridovirus infection in a grouper
embryonic cell line. J Gen Virol. (Accepted).
Chen, L. M., F. Wang, W. J. Song, and C. L. Hew. 2006. Temporal
and differential gene expression of Singapore grouper iridovirus. J Gen
Virol. 87:2907-2915.
Chew, L. M., G. H. Ngoh, M. K. Ng, J. M. Lee, P. Chew, J. Li, Y. C.
Chan, and J. L. C. Howe. 1994. Grouper cell line for propagating
grouper viruses. Singap J Prim Ind. 22:113-116.
Chinchar, V. G., S. Essbauer, J. G. He, A. Hyatt, T. Miyazaki, V.
Seligy, and T. Williams. 2005. Family Iridoviridae. In Virus Taxonomy:
Eighth Report of the International Committee on Taxonomy of Viruses,
pp. 150–162. Edited by C. M. Fauquet, M. A. Mayo, J. Maniloff, U.
Desselberger & L. A. Ball. San Diego: Elsevier/Academic Press.
Chinchar, V. G., J. Han, J. Mao, I. Brooks, and K. Srivastava. 1994.
Instability of frog virus 3 mRNA in productively infected cells. Virology
203: 187-192.
Chinchar, V. G., and W. Yu. 1992. Metabolism of host and viral
mRNAs in frog virus 3-infected cells. Virology 186:435-443.
Chua, F. H. C., M. L. Ng, K. L. Ng, J. J. Loo, and J. Y. Wee. 1994.
Investigation of outbreaks of a novel disease, ‘sleepy grouper disease,’
affecting the brown-spotted grouper, Epinephelus tauvina Forskal. J
Fish Dis 17:417-427.
Clore, G. M., and A. M. Gronenborn. 1991. Structures of larger
proteins in solution: three- and four-dimensional heteronuclear NMR
spectroscopy. Science 252:1390-1399.
124
Cornilescu, G., Delaglio, E & Bax, A. . 1999. Protein backbone angle
restraints from searching a database for chemical shift and sequence
homology. J Biomol. NMR. 13:289-302.
D’Costa, S. M., H. Yao, and S. L. Bilimoria. 2001. Transcription and
temporal cascade in Chilo iridescent virus infected cells. Arch Virol
146:2165–2178.
D’Costa, S. M., H. Yao, and S. L. Bilimoria. 2004. Transcriptional
mapping in Chilo iridescent virus infections. Arch Virol 149:723-742.
Delaglio, F., S. Grzesiek, et al. 1995. NMRPIPE - A Multidimensional
Spectral Processing System Based on UNIX Pipes. J Biomol. NMR.
6:277-293.
Delhon, G., E. R. Tulman, C. L. Afonso, Z. Lu, J. J. Becnel, B. A.
Moser, G. F. Kutish, and D. L. Rock. 2006. Genome of invertebrate
iridescence virus type 3 (Mosquito iridescent virus). J Virol 80:8439-
8449.
Do, J. W., C. H. Moon, H. J. Kim et al. 2004. Complete genomic DNA
sequence of rock bream iridovirus. Virology 325:351-363.
Eaton, H. E., J. Metcalf, E. Penny, V. Tcherepanov, C. Upton, and C.
R. Brunetti. 2007. Comparative genomic analysis of the family
Iridoviridae: re-annotating and defining the core set of iridovirus genes.
Virol J. 4: 11.
Eisen, M. B., P. T. Spellman, P. O. Brown, and D. Botstein. 1998.
Cluster analysis and display of genome-wide expression patterns. Proc
Natl Acad Sci U S A 95:14863-14868.
Fernandez, C., and G. Wider. 2003. TROSY in NMR studies of the
125
structure and function of large biological macromolecules. Curr. Opin.
Struct. Biol. 13:570-580.
Fiaux, J., E.B. Bertelsen, A. L. Horwich, and K. Wüthrich. 2002.
NMR analysis of a 900 K GroEL-GroES complex. Nature 418:207-221.
Goorha, R. 1982. Frog virus 3 DNA replication occurs in two stages. J
Virol. 43:519-528.
Goorha, R., G. Murti, A. Granoff, and R. Tirey. 1978. Macromolecular
synthesis in cells infected by frog virus 3. VIII. The nucleus is a site of
frog virus 3 DNA and RNA synthesis. Virology 84:32-50.
Green, E. D. 2001. Strategies for the systematic sequencing of
complex genomes. Nat Rev Genet. 2:573-583.
Gygi, S. P., B. Rist, S. A. Gerber, F. Turecek, M. H. Gelb, and R.
Aerbersold. 1999. Quantitative analysis of complex protein mixtures
using isotope-coded affinity tags. Nature Biotechnology 17:994-999.
He, J. G., M. Deng, S. P. Weng, Z. Li, S. Y. Zhou, Q. X. Long, X. Z.
Wang, and S. M. Chan. 2001. Complete genome analysis of the
mandarin fish infectious spleen and kidney necrosis iridovirus. Virology
291:126-139.
He, J. G., L. Lu, M. Deng et al. 2002. Sequence analysis of the
complete genome of an iridovirus isolated from the tiger frog. Virology
292:185-197.
Herrmann, T., Giintert,P.and Wuthrich,K. 2002. Protein NMR
structure determination with automated NOE assignment using the new
software CANDID and the torsion angle dynamics algorithm DYANA.
J.Mol.Biol. 319:209-227.
126
Huyen, Y., O. Zgheib, R. A. J. Ditullio et al. 2004. Methylated lysine
79 of histone H3 targets 53BP1 to DNA double-strand breaks. Nature
432:406-411.
Ishima, R., and D. A. Torchia. 2000. Protein dynamics from NMR.
Nat. Struct. Biol. 7:740-743.
Jakob, N. J., K. Müller, U. Bahr, and G. Darai. 2001. Analysis of the
first complete DNA sequence of an invertebrate iridovirus: coding
strategy of the genome of Chilo iridescent virus. Virology 286:182-196.
Jancovich, J. K., J. H. Mao, V. G. Chinchar et al. 2003. Genomic
sequence of a ranavirus (family Iridoviridae) associated with
salamander mortalities in North America. Virology 316:90-103.
Jennings, E. 1995. Folic acid as a cancer preventing agent. Medical
Hypotheses 45:297-303.
Johnson, B. a. B., R.. . 1994. NMR VIEW - A Computer-Program for
the Visualization and Analysis of NMR Data. Journal of Biomolecular
NMR 4:603-614.
Kay, L. E. 2005. NMR studies of protein structure and dynamics.
Journal of Magnetic Resonance 173:193-207.
Kendrew, J. C., G. Bodo, H. M. Dintzis, R. G. Parrish, H. Wyckoff,
and D. C. Phillips. 1958. A Three-Dimensional Model of the Myoglobin
Molecule Obtained by X-Ray Analysis. Nature 181:662-666.
Lacoste, N., R. T. Utley, J. M. Hunter, and T. Jenuwein. 2003. An
epigenetic road map for histone lysine methylation. J. Cell.Sci.
116:2117-2124.
Li, B., M. Carey, and J. L. Workman. 2007. The role of chromatin
127
during transcription. Cell 128:707-719.
Livak, K. J., and T. D. Schmittgen. 2001. Analysis of relative gene
expression data using real-time quantitative PCR and the 2-CT method.
Methods 25:402-408.
Lu, L., S. Y. Zhou, C. Chen, S. P. Weng, S. M. Chan, and J. G. He.
2005. Complete genome sequence analysis of an iridovirus isolated
from the orange-spotted grouper, Epinephelus coioides. Virology
339:81-100.
Lua, D. T., M. Yasuike, I. Hirono, and T. Aoki. 2005. Transcription
Program of Red Sea Bream Iridovirus as Revealed by DNA
Microarrays. J. Virol. 79:15151-15164.
Lua, D. T., M. Yasuike, I. Hirono, H. Kondo, and T. Aoki. 2007.
Transcriptional profile of red seabream iridovirus in a fish model as
revealed by viral DNA microarrays. Virus Genes 35:449-461.
Luger, K., A. W. Mäder, R .K. Richmond, D. F. Sargent, and T. J.
Richmond. 1997. Crystal structure of the nucleosome core particle at
2.8 A resolution. Nature 389:251-260.
Marsden, R. L., J. A. Ranea, A. Sillero et al. 2006. Exploiting protein
structure data to explore the evolution of protein function and biological
complexity. Philos Trans R Soc Lond B Biol Sci. 361:425-440.
Martin, C., and Y. Zhang. 2005. The diverse functions of histone
lysine methylation. Nat Rev Mol Cell Biol. 6:838-849.
McMillan, N. A., and J. Kalmakoff. 1994. RNA transcript mapping of
the Wiseana iridescent virus genome. Virus Res. 32:343-352.
Moradian-Oldak, J., Leung, W., Fincham, A. G.. 1998 Temperature
128
and pH-dependent supramolecular self-assembly of amelogenin
molecules: a dynamic light-scattering analysis. J. Struct. Biol. 122:320-
327.
Murti, K. G., R. Goorha, and A. Granoff. 1985. An unusual
replication strategy of an animal iridovirus. Adv. Virus Res. 30:1-19.
Okada, Y., Q. Feng, Y. Lin et al. 2005. hDOT1L links histone
methylation to leukemogenesis. Cell 121:167-178.
Okazaki, Y., M. Furuno, T. Kasukawa et al. 2002. Analysis of the
mouse transcriptome based on functional annotation of 60,770 full-
length cDNAs. Nature 420:563-573.
Ong, S. E., B. Blagoev, I. Kratchmarova, D. B. Kristensen, H. Steen,
A. Pandey, and M. Mann. 2002. Stable isotope labeling by amino
acids in cell culture, SILAC, as a simple and accurate approach to
expression proteomics. Molecular and Cellular Proteomics 1:376-386.
Overhauser, A. W. 1953. Nuclear Overhauser effect. Phys.Rev.
91:476.
Paulose, M. M., N. K. Ha, C. Xiang et al. 2001. Transcription program
of human herpesvirus 8 (Kaposi's sarcoma-associated herpersvirus). J
Virol. 75:4843-4853.
Peterson, C. L., and M. A. Laniel. 2004. Histones and histone
modifications. Curr. Biol. 27:R546-551.
Purcell, E. M., H. C. Torrey, and R.V. Pound. 1946. Resonance
Absorption by Nuclear Magnetic Moments in a Solid. Phys. Rev. 69:37-
38.
Qin, Q. W., T. J. Lam, Y. M. Sin, H. Shen, S. F. Chang, G. H. Ngoh,
129
and C. L. Chen. 2001. Electron microscopic observations of a marine
fish iridovirus isolated from brown-spotted grouper, Epinephelus
tauvina. J. Virol. Methods 98:17–24.
Rabi, I. I., J. R. Zacharias, S. Millman, and P. Kusch. 1938. A New
Method of Measuring Nuclear Magnetic Moment. Physical Review
53:318.
Ross, P. L., Y. N. Huang, J. N. Marchese et al. 2004. Multiplexed
protein quantitation in Saccharomyces cerevisiae using aminereactive
isobaric tagging reagents. Molecular and Cellular Proteomics 3:1154-
1169.
Roth, S. Y., Denu, J. M., Allis, C. D. 2001. Histone acetyltransferases.
Annu. Rev. Biochem. 70: 81-120.
Sattler, M., J. Schleucher, and C. Griesinger. 1999. Heteronuclear
multidimensional NMR experiments for the structure determination of
proteins in solution employing pulsed field gradients. Prog. Nucl. Magn.
Reson. Spectrosc. 34:93-158.
Scapin, G. 2006. Structural biology and drug discovery. Curr. Pharm.
Des. 12:2087-2097.
Smith, K. M., and N. Xeros. 1954. An unusual virus disease of a
dipterous larva. Nature 173:866–867.
Song, W., Q. Lin, S. B. Joshi, T. K. Lim, and C. L. Hew. 2006.
Proteomic studies of the Singapore grouper iridovirus. Mol Cell
Proteomics 5:256–264.
Song, W. J., Q. W. Qin, J. Qiu, C. H. Huang, F. Wang, and C. L. Hew.
2004. Functional genomics analysis of Singapore grouper iridovirus:
130
complete sequence determination and proteomic analysis. J. Virol.
78:12576–12590.
Strausberg, R. L., E. A. Feingold, L. H. Grouse et al. 2002.
Generation and initial analysis of more than 15,000 full-length human
and mouse cDNA sequences. Proc Natl Acad Sci U S A 99:16899-
16903.
Tachibana, M., J. Ueda, M. Fukuda et al. 2005. Histone
methyltransferases G9a and GLP form heteromeric complexes and are
both crucial for methylation of euchromatin at H3-K9. Genes Dev
19:815-826.
Tagashira, S., H. Harada, T. Katsumata, N. Itoh, and M. Nakatsuka.
1997. Cloning of mouse FGF10 and up-regulation of its gene
expression during wound healing. Gene 197:399-404.
Tan, W. G., T. J. Barkman, C. V. Gregory, and K. Essani. 2004.
Comparative genomic analyses of frog virus 3, type species of the
genus Ranavirus (family Iridoviridae). Virology 323:70-84.
Tettelin, H., V. Masignani, M. J. Cieslewicz et al. 2005. Genome
analysis of multiple pathogenic isolates of Streptococcus agalactiae:
implications for the microbial 'pan-genome'. Proc. Natl. Acad. Sci.
U.S.A. 102:13950-13955.
Tidona, C. A., and G. Darai. 1997. The complete DNA sequence of
lymphocystis disease virus. Virology 230:207-216.
Tsai, C. T., J. W.Ting, M. H. Wu, M. F.Wu, I. C. Guo, and C. Y.
Chang. 2005 Complete genome sequence of the grouper iridovirus
and comparison of genomic organization with those of other
131
iridoviruses. J Virol. 79:2010-2023.
Umemori, H., M. W. Linhoff, D. M. Ornitz, and J. R. Sanes. 2004.
FGF22 and its close relatives are presynaptic organizing molecules in
the mammalian brain. Cell 118:257-270.
Wherlock, M., and H. Mellor. 2002. The Rho GTPase family: a Racs
to Wrchs story. Journal of Cell Science 115:239-240.
Wider, G., and K. Wüthrich. 1999. NMR spectroscopy of large
molecules and multimolecular assemblies in solution. Curr. Opin. Struct.
Biol. 9:594-601.
Williamson, M. P., Havel, T. F., and Wüthrich, K. 1985. Solution
conformation of proteinase inhibitor IIA from bull seminal plasma by 1H
nuclear magnetic resonance and distance geometry. J. Mol. Biol. 182, 295–
315.
Williams, T., S. V. Barbosa, and V. G. Chinchar. 2005. A decade of
advances in iridovirus research. Adv Virus Res. 65:173–248.
Williams, T. 1996. The iridoviruses. Adv. Virus Res. 46:345-412.
Williams T., C. V. G., Darai G., Hyatt A., Kalmakoff J. & Seligy V.
2000. Iridoviridae. Seventh report of the International Committee on
the Taxonomy of Viruses.
Willis, D. B., and A. Granoff. 1978. Macromolecular synthesis in cells
infected by frog virus 3. IX. Two temporal classes of early viral RNA.
Virology 86:443-453.
Willis, D. B., R. Goorha, M. Miles, and A. Granoff. 1977.
Macromolecular synthesis in cells infected by frog virus 3. VII.
Transcriptional and post-transcriptional regulation of virus gene
132
expression. J Virol. 24:326-342.
Zhang, Q. Y., F. Xiao, J. Xie, Z. Q. Li and J. F. Gui. 2004. Complete
genome sequence of lymphocystis disease virus isolated from China. J
Virol. 78:6982-6994.
Zhang, Y., and D. Reinberg. 2001. Transcription regulation by histone
methylation: interplay between different co-valent modifications of the
core histone tails. Genes Dev. 15:2343-2360.
Zhu, J. W., Z.. . 2005. FAST: a novel protein structure alignment
algorithm. Proteins. 58:618-627.
Zieske, R. L. 2006. A perspective on the use of iTRAQTM reagent
technology for protein complex and profiling studies. J. Exp Bot
57:1501-1508.