36 Mukherjee
Int. J. Biosci. 2012
RESEARCH PAPER
The putative synaptotagmin protein encoded by the SYT1 gene
of the picoplanktonic alga Micromonas is a novel member of
C2-domain containing proteins: evidence from in silico
characterization and homology modeling
Ashutosh Mukherjee
Department of Botany, Dinabandhu Mahavidyalaya, Bongaon, North 24 Parganas - 743235, West
Bengal, India
Received: 14 September 2012; Revised: 21 September 2012; Accepted: 22 September 2012
Key words: Disorder, template, dendrogram, Ramachandran plot, flexibility, electrostatic potential.
Abstract
Synaptotagmins are a class of membrane trafficking proteins that control endocytosis and exocytosis of
synaptic vesicles in animals. A growing body of plant nucleotide and protein data shows that they are also present
in plants. Micromonas pusilla is a picophytoplanktonic alga belonging to Prasinophyceae, which is believed to be
an ancient member of the green plant lineage and is thus very useful in evolutionary studies. The SYT1 gene of
this alga encodes a putative synaptotagmin which shows novel features. In this study, this protein has been
characterized with several bioinformatic tools. The protein contains several novel motifs and domains besides the
C2 domain. The three-dimensional structure has been predicted in silico by homology modeling to gather
knowledge about the structure of ancient forms of the plant synaptotagmin protein. The C2 domain of this
protein is itself somewhat different from the known structures. The spatial distribution of the active site amino
acids around the calcium ion showed that some amino acids outside the C2 domain are also involved in calcium
binding, which is a novel feature of this protein.
Corresponding Author: Ashutosh Mukherjee, ashutoshcaluniv@gmail.com
International Journal of Biosciences (IJB), ISSN: 2220-6655 (Print) 2222-5234 (Online)
Vol. 2, No. 10(1), p. 36-52, 2012, http://www.innspub.net
Introduction
Synaptotagmins are a group of membrane
trafficking proteins characterized by the presence of
an N-terminal transmembrane region (TMR), a
linker of variable length and two tandemly arranged
C-terminal C2 domains (Craxton, 2004), called C2A
and C2B. The C2 domain is a Ca2+-binding protein
domain, approximately 130-145 amino acids long,
which is found in many membrane-associated
signaling proteins in a large number of organisms
(Nalefski and Falke, 1996). It is considered that Ca2+
neutralizes negatively charged residues in the loop
regions of the C2 domain and permits its interaction
with phospholipids in the membrane which leads to
trafficking (Rizo and Sudhof, 1998). In mammals,
there are 15 members of the synaptotagmin family, and
many of these proteins act in the regulated synaptic
vesicle exocytosis required for efficient
neurotransmission (Craxton, 2004). They are
calcium sensors and regulate exocytosis and
endocytosis of synaptic vesicles. Although they were
thought to be exclusive to animals, they have also
been identified in plants (Lewis and Lazarowitz,
2010). From the sequenced plant genomes, many
synaptotagmin genes have been identified by
several computational procedures (Craxton, 2004).
The picoplanktonic alga Micromonas pusilla is an
important model organism in developmental
biology and evolutionary biology, as it belongs to
Prasinophyceae, which is thought to be an anciently
diverged sister clade to land plants (Worden et al.,
2009). Analyses of the genome of this small
unicellular eukaryote offer valuable insights into the
dynamic nature of early plant evolution. The
genome of this picoplankton contains one SYT1
gene, which encodes a C2-domain containing
protein annotated as a putative synaptotagmin
(Worden et al., 2009). The protein is 1053 amino
acids long and the C2 domain spans 214 amino
acids, which is much longer than the average length
of a C2 domain (130-145 amino acids). Additionally,
an initial BLAST (Altschul et al., 1990) search against
the NCBI non-redundant protein database revealed
several plant synaptotagmins with high sequence
similarity in the C2-domain region, but outside the
C2 domain no sequence similarity was found with
any other protein. As the protein is 1053 amino acids
long and the C2 domain spans only 214 of them,
a large portion of the protein is
uncharacterized. Thus, further characterization,
including a search for known or novel domains
and motifs in this region, is needed for a better
understanding of the structural and functional
properties of this ancient form of C2-domain
containing putative synaptotagmin protein.
The biological function of a protein is a
manifestation of its tertiary structure, and
knowledge of the structural organization of the
protein is a prerequisite for understanding its
functional aspects (Paital et al., 2011). However, no
three-dimensional structure of this C2 domain
containing protein from Micromonas is known.
Thus, it would be useful to predict the 3D
structure of this protein in order to understand its
functional aspects. In the absence of a crystal structure,
homology modeling, which is done in silico,
provides a faster way to obtain structural insight
into the protein (Dolan et al., 2012). Additionally,
identification of the Ca2+ binding residues and
knowledge about their interaction with the ligand
are necessary for understanding its functional
properties. This study was conducted with the help
of several bioinformatics approaches, including
homology modeling, to a) investigate the
physicochemical, structural and functional
properties of this protein, b) analyze the structure of
the whole protein and the C2 domain and c) study
the interaction of the active site amino acid residues
with the Ca2+ ion.
Materials and methods
Sequence retrieval
The Micromonas pusilla Ca2+-lipid binding protein
sequence containing a C2 domain, i.e. the putative
synaptotagmin (GenBank accession
XP_002504251; GI: 255082530; hereafter called
SYT1), was downloaded from the NCBI
RefSeq (Pruitt et al., 2007) database
(http://www.ncbi.nlm.nih.gov/projects/RefSeq/).
Fig. 1. Dendrogram showing the phylogenetic relationship of the SYT1 from Micromonas with other C2-domain
containing proteins. The SYT1 from Micromonas is shown in a grey box.
The protein sequence was predicted by conceptual
translation from an mRNA sequence of
Micromonas sp. RCC299 (Worden et al., 2009).
The protein is 1053 amino acids long and the C2
domain (COG5038) spans residues 282-495.
A three-dimensional crystal structure of this protein
was not yet available in the Protein Data Bank. This
sequence was further utilized for characterization
and structure prediction.
Fig. 2. Multiple sequence alignment of the templates and the target protein as visualized with Jalview.
Phylogenetic analysis
Protein sequences related to SYT1 were searched
using the NCBI BLASTP (Altschul et al., 1990)
program. To evaluate the phylogenetic
relationship, the resulting sequences (excluding
hypothetical and predicted sequences) were aligned
using the alignment explorer in Mega 5.0 (Tamura et
al., 2011) with default parameters. An unrooted
phylogenetic tree of these sequences was
constructed by the neighbor-joining (NJ) method in
the Mega 5 program. The level of confidence was
estimated using bootstrap analysis with 1000
replications.
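The bootstrap step can be illustrated with a minimal sketch. It is not the Mega 5 implementation: the helper names (`p_distance`, `bootstrap_alignment`) and the toy sequences are hypothetical, and a simple p-distance stands in for whatever distance model Mega applied. The sketch only shows how alignment columns are resampled with replacement to produce 1000 pseudo-replicates, each of which would then be used to rebuild a neighbor-joining tree.

```python
import random


def p_distance(seq_a, seq_b):
    """Fraction of aligned, gap-free positions at which two sequences differ."""
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    if not pairs:
        return 0.0
    return sum(a != b for a, b in pairs) / len(pairs)


def bootstrap_alignment(alignment, rng):
    """Resample alignment columns with replacement (one bootstrap replicate)."""
    length = len(next(iter(alignment.values())))
    columns = [rng.randrange(length) for _ in range(length)]
    return {name: "".join(seq[i] for i in columns)
            for name, seq in alignment.items()}


# Toy alignment: hypothetical sequences, not the proteins used in the study.
toy = {"seqA": "MKV-LAGE", "seqB": "MKVTLSGE", "seqC": "MRVTLSGD"}

rng = random.Random(0)  # fixed seed so the sketch is reproducible
replicates = [bootstrap_alignment(toy, rng) for _ in range(1000)]

# Distance between seqA and seqB in every replicate; the spread of these
# values reflects how strongly the signal depends on individual columns.
distances = [p_distance(r["seqA"], r["seqB"]) for r in replicates]
```

In a full analysis, a tree would be built from each replicate and the bootstrap support of a clade would be the fraction of replicate trees that contain it.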
Fig. 3. Ramachandran plot of the modeled SYT1 protein.
Fig. 4. Details of the modeled three-dimensional structure of SYT1 protein. A) Ribbon diagram of the protein as
shown in Chimera. The alpha helices are shown in orange, beta sheets are shown in yellow and loops are coloured
in cyan; B) Position of the C2-domain (orange) within the protein.
Physicochemical analysis
Various physicochemical parameters, such as
amino acid composition,
isoelectric point (pI), total number of negatively and
positively charged residues, instability index,
aliphatic index and Grand Average of Hydropathy
(GRAVY), were computed using the ProtParam tool
(Gasteiger et al., 2005) available at
http://us.expasy.org/tools/protparam.html.
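Some of these parameters are simple functions of amino acid composition. The sketch below is not ProtParam itself. It assumes the standard Kyte-Doolittle hydropathy scale for GRAVY and the usual aliphatic index formula, X(Ala) + 2.9·X(Val) + 3.9·(X(Ile) + X(Leu)) with X in mole percent. The function names are illustrative only.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence):
    """Grand average of hydropathy: mean Kyte-Doolittle value per residue."""
    return sum(KYTE_DOOLITTLE[aa] for aa in sequence) / len(sequence)


def aliphatic_index(sequence):
    """Relative volume occupied by aliphatic side chains (Ala, Val, Ile, Leu)."""
    n = len(sequence)

    def mole_percent(aa):
        return 100.0 * sequence.count(aa) / n

    return (mole_percent("A")
            + 2.9 * mole_percent("V")
            + 3.9 * (mole_percent("I") + mole_percent("L")))


def charged_counts(sequence):
    """Return (negative, positive) counts: (Asp + Glu, Arg + Lys)."""
    negative = sequence.count("D") + sequence.count("E")
    positive = sequence.count("R") + sequence.count("K")
    return negative, positive
```

For a real analysis the full SYT1 sequence (XP_002504251) would be passed to these functions. The short strings used to exercise them here are arbitrary.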
Fig. 5. Topology of the modeled SYT1 protein as predicted by PDBsum. Helices and strands outside the C2-
domain are shown in red and pink, respectively. The helices and strands of C2-domain are shown in blue and
green, respectively.
Fig. 6. Flexibility of modeled three-dimensional structure of SYT1. A) Flexibility to rigidity as shown in a
gradient of red to white in the 3D model; B) Flexibility along the length of the protein as indicated by peaks; C)
Flexibility as indicated in a red white gradient over the entire sequence.
Fig. 7. A) Protein disorder (disordered regions are indicated as blue regions); B) Interacting surface (shown as
red regions) and C) Surface electrostatic potential of SYT1 (Red portions are electronegative and blue portions are
electropositive. White portions are neutral).
Fig. 8. Interaction of Ca2+ ion with the SYT1 protein. a. Three-dimensional orientation of side chains of active
site residues surrounding Ca2+ ion (cyan ribbon represents part of C2-domain and orange ribbon includes
important amino acids for Ca2+ binding outside the C2-domain); b. LIGPLOT of SYT1 complexed with Ca2+.
Structural and functional characterization
Secondary structure prediction was carried out with
SOPMA (Geourjon and Deleage, 1995). The CDD
database (Marchler-Bauer et al., 2011) was searched
for domains using CD search (Marchler-Bauer and
Bryant, 2004). Motifs were predicted using the Multiple
Em for Motif Elicitation (MEME) suite (Bailey et
al., 2009) with default parameters to
gain insight into the function of the protein. Motifs found with
MEME were further searched with the MAST tool for
known matches. The Motif Scan (Pagni et
al., 2007; Sigrist et al., 2010) server
(http://hits.isb-sib.ch/cgi-bin/PFSCAN) and the SMART (Schultz et al.,
1998; Letunic et al., 2012) server
(http://smart.embl-heidelberg.de/) were also used
for scanning signature domains with the default
parameters, including outlier homologs and
homologs of known structures, Pfam domains,
signal peptides and internal repeats. The SOSUI
(Hirokawa et al., 1998) program
(http://bp.nuap.nagoya-u.ac.jp/sosui/sosui_submit.html)
was employed to predict the presence of any
transmembrane region. Subcellular localization was
predicted using the TargetP 1.1 (Emanuelsson et al., 2000)
server
(http://www.cbs.dtu.dk/services/TargetP/abstract.php).
Protein disorder was predicted using the
Disopred (Ward et al., 2004) server
(http://bioinf.cs.ucl.ac.uk/disopred/).
Homology modeling
Initially, the HHpred (Söding et al., 2005) server
(http://toolkit.tuebingen.mpg.de/hhpred) as well as
the PSI-BLAST (Altschul et al., 1997) server
(http://blast.ncbi.nlm.nih.gov/Blast.cgi?PAGE=Proteins)
were used to identify suitable
templates from the PDB protein structure database
(Berman et al., 2000). However, HHpred
identified only some templates with a coiled coil region
aligned with a very small region (approximately
from the 400th to the 650th residue) of the target protein.
PSI-BLAST, on the other hand, could not find any
significant match. The Phyre2 (Kelley and
Sternberg, 2009) web server
(http://www.sbg.bio.ic.ac.uk/phyre2/html/page.cgi?id=index)
was also employed for modeling.
However, only 31% of the protein could be modeled
in the normal mode. The intensive mode could not be
employed on Phyre2 as it requires a protein less than
1000 amino acids long.
Finally, I-TASSER (Zhang, 2007; Roy et al., 2010),
the iterative threading assembly refinement server
(http://zhanglab.ccmb.med.umich.edu/I-TASSER/),
was chosen to generate the homology
models because it is automated and easy to use,
its algorithm incorporates multiple templates, and
it has a high degree of accuracy based on blind
CASP experiments (Roy et al., 2010). Rather than
specifying one single template for homology
modeling, I-TASSER was allowed to incorporate
multiple templates since it is recommended that
multiple templates should be used in order to avoid
biasing the model toward one protein or one set
of side chain conformations (Ginalski, 2006;
Rhodes, 2006). Sequence alignments of the target
protein and the templates were performed using
CLUSTALW (http://www.ch.embnet.org/software/ClustalW.html)
(Larkin et al., 2007) and visualized
with Jalview (Clamp et al., 2004; Waterhouse et al.,
2009). I-TASSER generated five predicted
structures for the protein, of which the model with
the highest C-score was chosen for further analysis.
Validation and analysis of the 3D model
After modeling, the modeled structure was validated
using the Protein Structure
Validation Suite (PSVS) tool (Bhattacharya et al.,
2007) available at http://psvs-1_4-dev.nesg.org/.
Within PSVS, the model was analyzed by
PROCHECK (Laskowski et al., 1993) and
Molprobity (Lovell et al., 2003). 3D structures of
the protein and the protein-calcium complex were
visualized with Chimera (Pettersen et al., 2004).
For an at-a-glance overview of the topology of the
modeled protein, the PDBsum (Laskowski, 2009) web
server was used (http://www.ebi.ac.uk/pdbsum/).
Molecular surface area and contact volume were
calculated with the web-based tool Voss Volume
Voxelator (http://www.molmovdb.org/cgi-bin/3v.cgi)
(Voss, 2007; Voss et al., 2006). To obtain
the secondary structure and topology of the protein,
the 3D structure was submitted to the PDBsum
(Laskowski, 2009) server
(http://www.ebi.ac.uk/pdbsum/). B-factor profiles
of the modeled protein were investigated using
FlexServ, a web-based tool for the analysis of protein
flexibility (http://mmb.pcb.ub.es/FlexServ/) (Camps
et al., 2009), with Normal Mode Analysis employed.
This server incorporates protocols for the
coarse-grained determination of protein dynamics
using different algorithms. For further annotation
and identification of protein interfaces,
the structure was analysed with the Polyview (Porollo
and Meller, 2007) server
(http://polyview.cchmc.org/). To identify the likely
biochemical function of the protein from its
three-dimensional structure, the ProFunc (Laskowski et al.,
2005a; Laskowski et al., 2005b) server
(http://www.ebi.ac.uk/thornton-srv/databases/profunc/)
was employed. Binding
site prediction was performed with I-TASSER, which
also generated a ligand-protein complex. The ligand
(Ca2+) bound to active site residues was plotted
with LIGPLOT (Wallace et al., 1995) within
PDBsum.
Protein structure accession numbers
The homology model of the protein was submitted
to the Protein Model Data Base, i.e. PMDB
(Castrignanò et al., 2006), at
http://mi.caspur.it/PMDB/ and assigned the
identifier PM0078184.
Results and discussion
Phylogenetic relationship of SYT1 with other
members of C2 domain containing proteins
A BLAST search of SYT1 identified several C2 domain
containing proteins, including some hypothetical
and predicted proteins. These hypothetical and
predicted proteins were excluded from dendrogram
preparation. Finally, Micromonas SYT1 and
57 other related proteins (supplementary material,
table S1) were used for phylogenetic tree
construction. All of them had either one or two C2
domains (table 3). Besides plant synaptotagmins,
these proteins included several membrane proteins
with a single C2 domain, calcium-dependent
lipid-binding domain-containing proteins, CLB1 and
other C2 domain containing proteins. The
dendrogram showed that SYT1 of Micromonas is
distinctly different from all the other 57 proteins
(Fig. 1).
Table 1. ProtParam table showing different physicochemical properties of the C2 domain containing protein.
Parameter | Value | Explanation
pI | 5.10 | Indicates that the protein is acidic.
Total number of negatively charged residues (Asp + Glu) | 155 | Total number of negatively charged residues is greater than total number of positively charged residues. This indicates that the protein is intracellular.
Total number of positively charged residues (Arg + Lys) | 125 |
Instability index (II) | 39.61 | This classifies the protein as stable.
Aliphatic index | 82.74 | Indicates that this globular protein is thermostable.
Grand average of hydropathicity (GRAVY) | -0.263 | A negative GRAVY score indicates that the protein is hydrophilic.
Table 2. Secondary structure of the C2 domain containing protein as predicted by SOPMA.
Parameter | Number of amino acids | Percentage of amino acids
Alpha helix (Hh) | 386 | 36.66
310 helix (Gg) | 0 | 0.00
Pi helix (Ii) | 0 | 0.00
Beta bridge (Bb) | 0 | 0.00
Extended strand (Ee) | 187 | 17.16
Beta turn (Tt) | 89 | 8.45
Bend region (Ss) | 0 | 0.00
Table 3. Motifs predicted using MEME.
Motif | Width | Sites | E-value | Start position | p-value | Sequence
Motif 1 | 8 | 2 | 4.4e-001 | 234 | 7.72e-11 | FMGWQQSK
 | | | | 453 | 1.16e-12 | WMVWPRCI
Motif 2 | 6 | 2 | 1.4e+001 | 504 | 3.10e-08 | LQVRWP
 | | | | 550 | 4.16e-10 | LCVRWY
Motif 3 | 6 | 2 | 7.4e+001 | 387 | 1.10e-08 | EFECSF
 | | | | 404 | 1.97e-08 | VFPCFG
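The two sites listed for each motif in Table 3 can be collapsed into a bracketed pattern of the kind used to describe motifs in the text. The sketch below is an illustration only and is not part of MEME: `sites_to_pattern` is a hypothetical helper that takes equal-length sites, keeps a single letter where all sites agree, and lists the alternatives alphabetically in square brackets where they differ.

```python
import re


def sites_to_pattern(sites):
    """Collapse equal-length motif sites into a bracketed consensus pattern."""
    pattern = []
    for column in zip(*sites):
        residues = sorted(set(column))
        if len(residues) == 1:
            pattern.append(residues[0])
        else:
            pattern.append("[" + "".join(residues) + "]")
    return "".join(pattern)


# Site sequences as listed in Table 3.
motif_sites = {
    "Motif 1": ["FMGWQQSK", "WMVWPRCI"],
    "Motif 2": ["LQVRWP", "LCVRWY"],
    "Motif 3": ["EFECSF", "VFPCFG"],
}

patterns = {name: sites_to_pattern(sites) for name, sites in motif_sites.items()}

# Each pattern is also a valid regular expression that matches its own sites.
for name, sites in motif_sites.items():
    regex = re.compile(patterns[name])
    assert all(regex.fullmatch(site) for site in sites)
```

Because the patterns are ordinary regular expressions, they could be used to scan other sequences for exact occurrences, although such a scan ignores the position-specific weights that MEME and MAST use.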
Physicochemical properties
The physicochemical properties of the C2 domain
containing protein from Micromonas were predicted
using the ExPASy ProtParam server
(http://expasy.org/cgi-bin/protparam) with the
protein sequence as input, and the results are shown in table
1. The most frequent amino acid in the
sequence was alanine (157 residues,
14.9%) and the least frequent was cysteine (5 residues,
0.5%). The total number of negatively charged
residues (Asp + Glu) was 155 and the total number
of positively charged residues (Arg + Lys) was 125,
which indicates that the protein is intracellular, as
intracellular proteins have a higher fraction of
negatively charged residues. The isoelectric point
(pI) is the pH at which the surface of the protein is
covered with charge but the net charge of the protein is
zero; at the isoelectric point, solubility is lowest and
mobility in an electric field is zero. The calculated pI was
5.10, which indicates that the
protein is acidic. The high aliphatic index (82.74)
indicates that this protein is stable over a wide
temperature range. This is important for withstanding
various stressful environments, which is natural for
a signaling protein. The instability index (39.61, below
the threshold of 40) also provides evidence that the protein is stable.
The Grand Average of Hydropathicity (GRAVY) value
is negative (-0.263), which indicates better
interaction of the protein with water. The SOSUI
program also showed an average hydrophobicity
of -0.263343, confirming that the protein is
soluble. Prediction of its subcellular localization
with TargetP showed that the protein is localized in
the chloroplast, with a 63 amino acid long target
peptide.
Structural and functional properties
Table 2 presents the results of the secondary structure
prediction by SOPMA, from which it is clear
that random coil predominates (37.13%),
followed by alpha helix (36.66%) and extended
strand (17.16%). SOPMA also predicted the
presence of beta turns (8.45%).
Table 2 (continued).
Random coil (Cc) | 391 | 37.13
Ambiguous states | 0 | 0.00
Other states | 0 | 0.00

The Conserved Domain Database showed only the
presence of the C2 domain (COG5038). No other
domains were found. The scan for motifs with
MEME showed the presence of three motifs (table
3). The sequences of the motifs are
[FW]M[GV]W[PQ][QR][CS][IK], L[CQ]VRW[PY]
and [EV]F[EP]C[FS][FG]. All the motifs were
present in two copies in the sequence. Of these,
motifs 1 and 3 were part of the C2 domain.
A search for these motifs in other
proteins with MAST revealed some interesting
results. For motif 1, MAST returned 26 proteins
with E-values less than 10, which include several
FAB fragments, FV fragments, a few indolicidin
(antimicrobial cationic peptide) and membrane
glycoprotein entries and one cytochrome. For motif 2, 12
sequences were identified with E-values less than 10;
these include a few S-phase kinase-associated proteins
and many dienelactone hydrolases. Only 5 proteins
were identified with motif 3, which include bacterial
toxins. SMART identified several low complexity
regions as well as two coiled coil regions (table 4).
No low-complexity region fell within the C2 domain.
It also identified two SCOP domains (d1i19a1, i.e.
FAD-linked oxidase, C-terminal domain, and
d1hcia4, i.e. spectrin repeat). The Motif Scan tool
identified one amidation site, two N-glycosylation
sites, fourteen casein kinase II phosphorylation
sites, eighteen N-myristoylation sites, sixteen
protein kinase C phosphorylation sites, one cell
attachment sequence, one each of alanine-rich,
arginine-rich and glycine-rich regions, as well as one
octapeptide repeat (table 5).
An initial BLAST search against the NCBI non-redundant
protein database showed many plant
synaptotagmins and some other C2 domain
containing proteins among the top BLAST hits. Also,
ProFunc identified several plant synaptotagmin genes
related to SYT1 of Micromonas.
Surprisingly, these proteins showed similarity only
in the C2 domain region. The C-terminal and
N-terminal regions outside the C2 domain did not
show any sequence similarity with any other
proteins. The CD search showed that the C2 domain
spans from Asp282 to Gly495. The results also showed
the presence of another small domain of the
superfamily cl01482. As shown in CDD, this
superfamily represents bacterial proteins related to
CpxP, a periplasmic protein that forms part of a
two-component system which acts as a global
modulator of cell-envelope stress in Gram-negative
bacteria. In this protein, this domain spans from
Gly816 to Arg874.
Disordered regions of a protein facilitate
interactions of the protein and allow more
modification sites in the protein (Paital et al., 2011).
Disopred predicted a total of 378 disordered amino
acid residues (35.89%), spread over
the protein in 14 regions. The
longest disordered region extended from Glu47 to
Thr181. However, the C2 domain was not
disordered, as no amino acid within this region was
found to be disordered. Disordered regions play
significant roles in protein interaction (Paital et al.,
2011). From these results, it seems that this protein
interacts with other proteins with novel properties.
Table 4. Motifs identified with SMART.
Motif | No. of sites | Amino acid positions | E-value
Low complexity | 11 | 28-42, 56-77, 82-103, 117-134, 165-176, 241-252, 607-621, 636-655, 693-707, 742-760, 1018-1035 | ---
Coiled coil | 2 | 816-865, 940-980 | ---
SCOP: d1i19a1 (FAD-linked oxidases, C-terminal domain) | 1 | 198-321 | 2.20e+00
SCOP: d1hcia4 (Spectrin repeat) | 1 | 800-851 | 1.40e-01
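The positions in Table 4 can be compared with the C2 domain boundaries (residues 282-495) by a simple interval test. The sketch below is illustrative. The helper `overlaps` is a hypothetical name and the coordinates are copied from Table 4 and from the CD search result.

```python
C2_DOMAIN = (282, 495)  # Asp282 to Gly495, from the CD search

# Low-complexity regions reported by SMART (Table 4).
LOW_COMPLEXITY = [
    (28, 42), (56, 77), (82, 103), (117, 134), (165, 176), (241, 252),
    (607, 621), (636, 655), (693, 707), (742, 760), (1018, 1035),
]

# SCOP domain matches reported by SMART (Table 4).
SCOP_MATCHES = {"d1i19a1": (198, 321), "d1hcia4": (800, 851)}


def overlaps(region_a, region_b):
    """True if two inclusive residue ranges share at least one position."""
    return region_a[0] <= region_b[1] and region_b[0] <= region_a[1]


low_complexity_in_c2 = [r for r in LOW_COMPLEXITY if overlaps(r, C2_DOMAIN)]
scop_in_c2 = sorted(name for name, region in SCOP_MATCHES.items()
                    if overlaps(region, C2_DOMAIN))
```

With these coordinates no low-complexity region intersects the C2 domain, while the d1i19a1 match (198-321) extends into its N-terminal end.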
Each of the top five models generated by I-TASSER
was assigned a C-score. The C-score is a
confidence score for estimating the quality of a
predicted model: a high C-score signifies a model
with high confidence and vice versa. Models with
a C-score > -1.5 generally have a correct fold (Roy et
al., 2010). The structure with the highest C-score
(-0.9) was used for further studies. The
template-modeling score (TM-score) provides a sensitive
measure of the overall topology difference between a
predicted structure and template, with a higher
score indicating a better structural match. A
TM-score >0.5 indicates correct overall topology for a
modeled structure. The TM-score for the modeled
protein of this study was 0.60 ± 0.14, which indicates
that the model had correct overall topology.
Additionally, the normalized Z-score for each
threading alignment between the target and a given
template indicates the significance of the alignment
compared to the average. I-TASSER documentation
advises that a threading alignment with a
normalized Z-score >1 reflects a confident
alignment. In this study, the normalized Z-scores for the
top 10 templates used by I-TASSER ranged from
1.02 to 3.53, which reflects the confidence of
the alignments.
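The three acceptance criteria quoted above can be summarized in a few lines. This is a sketch only. The function name `assess_model` is hypothetical, the thresholds are those stated in the text (C-score > -1.5, TM-score > 0.5, normalized Z-score > 1), and only the reported minimum and maximum Z-scores are used because the individual values for the ten templates are not listed.

```python
def assess_model(c_score, tm_score, z_scores):
    """Check a predicted model against the thresholds quoted in the text."""
    return {
        "fold_confident": c_score > -1.5,
        "topology_correct": tm_score > 0.5,
        "all_alignments_confident": all(z > 1.0 for z in z_scores),
    }


# Values reported for the selected SYT1 model; the Z-score list holds only
# the reported extremes of the top 10 templates.
report = assess_model(c_score=-0.9, tm_score=0.60, z_scores=[1.02, 3.53])
```

Note that the lower bound of the reported TM-score interval (0.60 - 0.14 = 0.46) lies below 0.5, so the topology criterion is met by the central estimate rather than by the whole interval.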
Table 5. Motifs identified with Motif Scan.
Motif information | No. of sites | Amino acid residues | E-value
Amidation site | 1 | 81-84 | ---
N-glycosylation site | 2 | 79-82, 519-522 | ---
Casein kinase II phosphorylation site | 14 | 44-47, 65-68, 162-165, 198-201, 261-264, 349-352, 385-388, 535-538, 711-714, 847-850, 950-953, 985-988, 999-1002, 1043-1046 | ---
N-myristoylation site | 18 | 88-93, 126-131, 227-232, 335-340, 372-377, 404-409, 430-435, 467-472, 531-536, 627-632, 644-649, 654-659, 663-638, 684-689, 758-763, 802-807, 826-821, 995-1000 | ---
Protein kinase C phosphorylation site | 16 | 23-25, 33-35, 37-39, 46-48, 81-83, 127-129, 241-243, 394-396, 535-537, 565-567, 670-672, 723-725, 795-797, 807-809, 844-846, 950-952 | ---
Cell attachment sequence | 1 | 74-175 | ---
Alanine-rich region | 1 | 123-145 | 0.07
Arginine-rich region | 1 | 25-69 | 5.7
Glycine-rich region | 1 | 627-706 | 0.00059
Octapeptide repeat | 1 | 473-480 | 3.6
The PSVS suite analyzed the protein structure with
the help of several tools. A Ramachandran plot of the
model was generated with the PROCHECK program
(figure 3); the
shading represents the different regions of the plot.
The darker the area, the more favorable is the φ-ψ
combination. Residues in most favored regions,
additionally allowed regions and generously allowed
regions were 79%, 14.7% and 4.9%, respectively. Only
1.3% of residues were in disallowed regions. Molprobity
evaluates the stereochemical quality of a structure
by calculating phi and psi torsion angles, backbone
bond lengths and backbone bond angles.
Molprobity provides a clashscore as a result of an
all-atom contact analysis which is performed after
adding hydrogen atoms to a structure. When
non-donor-acceptor atoms overlap by more than 0.4 Å,
at least one of the two atoms must be modeled
incorrectly. A clash at this location is noted and
incorporated into the clashscore, which is simply
the number of clashes per 1000 atoms (Lovell et al.,
2003). In this study, the clashscore was quite low
(169.39). All these quality evaluation measures
showed that the modeled structure was quite
reliable.
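Two of the quantities used in this validation are easy to state explicitly. The sketch below is not PROCHECK or Molprobity code. It shows a standard way to compute a torsion angle from four atom positions, which is how phi (C of the previous residue, N, CA, C) and psi (N, CA, C, N of the next residue) are obtained, and the clashscore normalization as defined in the text. The function names and the test coordinates are illustrative assumptions.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral_degrees(p0, p1, p2, p3):
    """Torsion angle (degrees) about the p1-p2 bond for four points."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    b1 = tuple(x / norm for x in b1)
    # Components of b0 and b2 perpendicular to the central bond.
    v = _sub(b0, tuple(_dot(b0, b1) * x for x in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * x for x in b1))
    return math.degrees(math.atan2(_dot(_cross(b1, v), w), _dot(v, w)))


def clashscore(n_clashes, n_atoms):
    """Number of clashes per 1000 atoms, as defined in the text."""
    return 1000.0 * n_clashes / n_atoms


# Simple geometric checks with made-up coordinates (not atoms of the model).
cis = dihedral_degrees((1, 0, 0), (0, 0, 0), (0, 0, 1), (1, 0, 1))
trans = dihedral_degrees((1, 0, 0), (0, 0, 0), (0, 0, 1), (-1, 0, 1))
```

A Ramachandran plot is then simply the scatter of (phi, psi) pairs for all residues that have both angles defined.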
Overall three-dimensional structure of the protein
The modeled protein belongs to the α/β structural
class (Chou and Zhang, 1995), as is evident from
figure 4A. It is also notable that the protein formed
a V-shaped structure. One arm of this V-shaped
structure is dominated by beta sheets and the
other by alpha helices. The
volume of the protein was 149974 Å³. The C2
domain lies in the beta-sheet-rich arm (figure
4B). The modeled structure was submitted to
PDBsum to show the secondary structures
graphically. This showed the presence of 21 helices,
31 strands (which formed 10 sheets) and 10
beta hairpins. The topology (figure 5) showed that
the N-terminal part consisted primarily of beta
sheets, while the C-terminal portion was made up
primarily of alpha helices along with some small
beta strands. Of the 31 beta strands, 23 were present
in the N-terminal region. The B-factor, which
reflects spatial uncertainty, was calculated using the
web-based tool for the analysis of protein flexibility,
FlexServ. The minimum B-factor for a residue was
4.663 Å² and the maximum B-factor
was 304.671 Å². The protein has six regions, in the form
of six peaks, with B-factor values above
100 Å² (figure 6A). In general, several loop regions
showed more flexibility, as shown in figure 6B.
Maximum flexibility was shown by Pro119, Leu120,
Pro121, Thr482, Ala483, Pro718 and Leu719 (figure
6C). As loops do not form any rigid structure in the
protein, these flexible regions seem to be vital for
structural modifications of the protein.
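Identifying the flexible regions amounts to finding contiguous runs of residues whose B-factor exceeds a cutoff. The sketch below assumes a plain list of per-residue B-factors. The function name `regions_above` and the toy profile are hypothetical, and the 100 Å² cutoff is the one used in the text.

```python
def regions_above(values, threshold):
    """Return (start, end) residue ranges, 1-based and inclusive, where
    consecutive values all exceed the threshold."""
    regions = []
    start = None
    for position, value in enumerate(values, start=1):
        if value > threshold and start is None:
            start = position
        elif value <= threshold and start is not None:
            regions.append((start, position - 1))
            start = None
    if start is not None:
        regions.append((start, len(values)))
    return regions


# Toy B-factor profile in square angstroms (made-up values, not FlexServ output).
toy_profile = [12.0, 150.2, 180.5, 90.0, 101.0, 30.0, 250.0]
flexible_regions = regions_above(toy_profile, threshold=100.0)
```

Applied to the per-residue B-factors exported from FlexServ, the same function would list the peaks that rise above 100 Å².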
The disordered regions were mainly situated in the
loop regions of the protein (figure 7A). Nineteen beta
strands contained disordered regions, in
contrast to only 4 alpha helices. The longest
disordered region ran from Glu47 to Thr181 and
contained 6 beta strands and only 1 alpha helix. The
Polyview 3D program was used to predict the interacting
residues of the protein. In total, 275 residues were
predicted to be interacting, i.e. interfacial (figure 7B).
Comparison of the disordered regions with the
interacting residues showed that 30 interacting
residues were predicted to be disordered.
Comparing the results of FlexServ and Polyview, it
was evident that all of the amino acids that
contribute most to the flexibility of the protein, except
Pro121, lie on the interacting surfaces of the protein.
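The cross-comparison of the residue lists can be expressed as simple set operations. The sketch below uses only residue numbers stated in the text: the seven most flexible residues, the six of them reported as interfacial, and the longest disordered region (Glu47 to Thr181). The full list of 275 interfacial residues is not reproduced here, so the interfacial set is a partial, illustrative subset, and the function names are hypothetical.

```python
# Minimal sketch: compare residue sets from different predictors.
# Residue numbers are those quoted in the text; the interfacial set is only
# the subset the text explicitly names, not the full 275-residue prediction.

FLEXIBLE = {119, 120, 121, 482, 483, 718, 719}       # highest B-factors
INTERFACIAL_SUBSET = {119, 120, 482, 483, 718, 719}  # flexible AND interfacial
LONGEST_DISORDERED = range(47, 182)                  # Glu47..Thr181 inclusive


def flexible_not_interfacial(flexible, interfacial):
    """Residues that are flexible but not predicted to be interfacial."""
    return sorted(flexible - interfacial)


def flexible_in_region(flexible, region):
    """Flexible residues whose number falls inside a given region."""
    return sorted(r for r in flexible if r in region)


if __name__ == "__main__":
    print(flexible_not_interfacial(FLEXIBLE, INTERFACIAL_SUBSET))
    print(flexible_in_region(FLEXIBLE, LONGEST_DISORDERED))
```

With these inputs, the first call isolates Pro121 as the only highly flexible residue not on an interacting surface, and the second shows that residues 119-121 also fall within the longest disordered region.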
The distribution of electrostatic potential
(figure 7C) showed that the C2 domain is primarily
neutral, with some negatively charged regions and a
few positively charged regions. It is also notable that
the highly flexible regions of the protein have either
positive or negative electrostatic potential. The
presence of charged residues in the highly flexible
loop regions suggests their participation in
dynamic charge-mediated interactions with other
molecules.
Structure of the C2 domain and Ca2+-binding
residues
The C2 domain consisted of 4 sheets (9
strands). Of these 9 strands, one very small strand
(Asp425-Arg427) was not rendered as a strand in the
I-TASSER generated model as viewed in Chimera,
but did appear in the PDBsum topology (figure 5).
Otherwise, the topology generated by PDBsum
matched the modeled structure. The C2
domain also contains three small alpha helices.
However, the C2 domain is not composed entirely of
helices and strands: 125 of 214 residues (58.41%)
did not form any helix or sheet. Usually, the C2
domain forms a beta-sheet scaffold of eight
antiparallel strands connected by loops (Reddy and
Reddy, 2004). Loops 1-3 are placed on top of the
sheets and are involved in Ca2+ binding (Sutton et
al., 1995). Binding of the Ca2+ ion to the C2 domain
facilitates its interaction with negatively charged
phospholipids. The protein studied here, however,
interacts with the Ca2+ ion through amino acids
within the C2 domain as well as amino acids
outside the C2 domain (Asp545, Pro546, Lys547,
Ala548 and Gln549), as predicted by I-TASSER. The
Ca2+ ion is surrounded by nine amino acids (figure
8A). Surprisingly, a similar binding site was
found in the integrin alphaXbeta2
ectodomain from human (PDB ID: 3K6S) (Xie et al.,
2010). The Ca2+-bound model was submitted to
PDBsum, and the LIGPLOT diagram showed bonding of the
Ca2+ ion with the backbone nitrogen of Phe424. The
Ca2+ ion formed hydrogen bonds with Leu422,
Asp545 and Pro546 (figure 8B).
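Identifying the residues that surround a bound ion amounts to a distance search around the ion position. The following is a minimal sketch of such a search, not the procedure used by I-TASSER or LIGPLOT. The coordinates and the 3.5 Å cutoff are invented for illustration and are not taken from the modeled structure; the residue labels are borrowed from the text only to make the output readable, and the function name is hypothetical.

```python
import math

# Minimal sketch: list residues with at least one atom within a cutoff
# distance of an ion. Coordinates are INVENTED for illustration only.


def residues_near_ion(ion_xyz, atoms, cutoff):
    """Return residue labels (sorted by residue number) within cutoff of the ion.

    atoms: iterable of (residue name, residue number, atom name, (x, y, z)).
    """
    near = {
        (res_seq, res_name)
        for res_name, res_seq, _atom_name, xyz in atoms
        if math.dist(ion_xyz, xyz) <= cutoff
    }
    return [f"{name}{seq}" for seq, name in sorted(near)]


# Hypothetical ion position and neighbouring atoms (angstroms).
CA_ION = (0.0, 0.0, 0.0)
TOY_ATOMS = [
    ("PHE", 424, "N", (1.0, 2.0, 2.0)),    # 3.0 from the ion
    ("ASP", 545, "OD1", (2.0, 1.0, 0.0)),  # about 2.24
    ("ASP", 545, "OD2", (0.0, 2.4, 0.0)),  # 2.4
    ("PRO", 546, "O", (0.0, 0.0, 2.5)),    # 2.5
    ("GLY", 10, "CA", (10.0, 0.0, 0.0)),   # 10.0, outside the cutoff
]

if __name__ == "__main__":
    print(residues_near_ion(CA_ION, TOY_ATOMS, cutoff=3.5))
```

A residue is reported once even when several of its atoms lie within the cutoff, which is why a set of (number, name) pairs is collected before formatting.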
Conclusion
The putative synaptotagmin protein from the
picoeukaryotic plankton Micromonas investigated in
this study is a novel member of the C2-domain
containing protein family, as it did not show any
sequence similarity with other members of the
family outside the C2 domain in the
NCBI BLAST search. The NJ tree built on the
basis of the sequence alignment also showed that the
protein is distinct from other C2-domain
containing proteins from the plant
kingdom. Finally, this analysis provides insight into
the unique structural properties of the protein and its
novel mode of interaction with calcium. The predicted
model of the protein may be useful for
experimental work on the
signaling mechanisms involving this protein. The
interaction between the protein and the Ca2+ ion
proposed in this study may help in understanding
the potential mechanism of action of this protein
and its evolutionary significance.
Acknowledgement
The facility situated at the Department of Botany,
Dinabandhu Mahavidyalaya is gratefully
acknowledged.
References
Altschul SF, Gish W, Miller W, Myers EW,
Lipman DJ. 1990. Basic local alignment search
tool. Journal of Molecular Biology 215(3), 403-410.
Altschul SF, Madden TL, Schäffer AA, Zhang
J, Zhang Z, Miller W, Lipman DJ. 1997.
Gapped BLAST and PSI-BLAST: a new generation
of protein database search programs. Nucleic Acids
Research 25, 3389-3402.
Bailey TL, Boden M, Buske FA, Frith M,
Grant CE, Clementi L, Ren J, Li WW, Noble
WS. 2009. MEME SUITE: tools for motif
discovery and searching. Nucleic Acids Research
37, W202-W208.
Berman HM, Westbrook J, Feng Z, Gilliland
G, Bhat TN, Weissig H, Shindyalov IN,
Bourne PE. 2000. The Protein Data Bank.
Nucleic Acids Research 28, 235-242.
Bhattacharya A, Tejero R, Montelione GT.
2007. Evaluating protein structures determined by
structural genomics consortia. Proteins 66, 778-
795.
Camps J, Carrillo O, Emperador A, Orellana
L, Hospital A, Rueda M, Cicin-Sain D,
D'Abramo M, Gelpí JL, Orozco M. 2009.
FlexServ: an integrated tool for the analysis of
protein flexibility. Bioinformatics 25(13), 1709-1710.
Castrignanò T, De Meo PD, Cozzetto D,
Talamo IG, Tramontano A. 2006. The PMDB
Protein Model Database. Nucleic Acids Research
34, D306-D309.
Cedano J, Aloy P, Pérez-Pons JA, Querol E.
1997. Relation between amino acid composition
and cellular location of proteins. Journal of
Molecular Biology 266(3), 594-600.
Chou KC, Zhang CT. 1995. Prediction of protein
structural classes. Critical Reviews in Biochemistry
and Molecular Biology 30, 275-349.
Clamp M, Cuff J, Searle SM, Barton GJ.
2004. The Jalview Java Alignment Editor.
Bioinformatics 20, 426-427.
Craxton M. 2004. Synaptotagmin gene content of
the sequenced genomes. BMC Genomics 5, 43.
Dolan MA, Noah JW, Hurt D. 2012.
Comparison of common homology modeling
algorithms: application of user-defined alignments.
In: Orry AJW, Abagyan R, eds. Homology
Modeling: Methods and Protocols, Methods in
Molecular Biology, vol. 857, Humana Press, USA,
399-414.
Emanuelsson O, Nielsen H, Brunak S, von
Heijne G. 2000. Predicting Subcellular
localization of proteins based on their N-terminal
amino acid sequence. Journal of Molecular Biology
300(4), 1005-1016.
Gasteiger E, Hoogland C, Gattiker A, Duvaud
S, Wilkins MR, Appel RD, Bairoch A. 2005.
Protein Identification and Analysis Tools on the
ExPASy Server. In: Walker JM, ed. The Proteomics
Protocols Handbook. Humana Press, Totowa, New
Jersey, USA, 571-607.
Geourjon C, Delage G. 1995. SOPMA:
Significant improvements in protein secondary
structure prediction by consensus prediction from
multiple alignments. Computer Applications in the
Biosciences 11, 681-684.
Ginalski K. 2006. Comparative modeling for
protein structure prediction. Current Opinion in
Structural Biology 16(2), 172-177.
Hirokawa T, Boon-Chieng S, Mitaku S. 1998.
SOSUI: classification and secondary structure
prediction system for membrane proteins.
Bioinformatics 14, 378-379.
Kelley LA, Sternberg MJE. 2009. Protein
structure prediction on the web: a case study using
the Phyre server. Nature Protocols 4, 363-371.
Larkin MA, Blackshields G, Brown NP,
Chenna R, McGettigan PA, McWilliam H,
Valentin F, Wallace IM, Wilm A, Lopez R,
Thompson JD, Gibson TJ, Higgins DG. 2007.
ClustalW and ClustalX version 2. Bioinformatics
23(21), 2947-2948.
Laskowski RA. 2009. PDBsum new things.
Nucleic Acids Research 37, D355-D359.
Laskowski RA, MacArthur MW, Moss DS,
Thornton JM. 1993. PROCHECK: a program to
check the stereochemistry of protein structures.
Journal of Applied Crystallography 26, 283-291.
Laskowski RA, Watson JD, Thornton JM.
2005a. ProFunc: a server for predicting protein
function from 3D structure. Nucleic Acids Research
33, W89-W93.
Laskowski RA, Watson JD, Thornton JM.
2005b. Protein function prediction using local 3D
templates. Journal of Molecular Biology 351, 614-
626.
Letunic I, Doerks T, Bork P. 2012. SMART 7:
recent updates to the protein domain annotation
resource. Nucleic Acids Research 40(D1), D302-
D305.
Lewis JD, Lazarowitz SG. 2010. Arabidopsis
synaptotagmin SYTA regulates endocytosis and
virus movement protein cell-to-cell transport.
Proceedings of the National Academy of Sciences
USA 107(6), 2491-2496.
Lovell SC, Davis IW, Arendall WB, de Bakker
PIW, Word JM, Prisant MG, Richardson JS,
Richardson DC. 2003. Structure validation by
Cα geometry: φ, ψ and Cβ deviation. Proteins 50,
437-450.
Marchler-Bauer A, Bryant SH. 2004. CD-
Search: protein domain annotations on the fly.
Nucleic Acids Research 32, W327-W331.
Marchler-Bauer A, Lu S, Anderson JB,
Chitsaz F, Derbyshire MK, Deweese-Scott C,
Fong JH, Geer LY, Geer RC, Gonzales NR,
Gwadz M, Hurwitz DI, Jackson JD, Ke Z,
Lanczycki CJ, Lu F, Marchler GH,
Mullokandov M, Omelchenko MV,
Robertson CL, Song JS, Thanki N, Yamashita
RA, Zhang D, Zhang N, Zheng C, Bryant SH.
2011. CDD: a Conserved Domain Database for the
functional annotation of proteins. Nucleic Acids
Research 39, D225-D229.
Nalefski EA, Falke JJ. 1996. The C2 domain
calcium-binding motif: Structural and functional
diversity. Protein Science 5, 2375-2390.
Pagni M, Ioannidis V, Cerutti L, Zahn-Zabal
M, Jongeneel CV, Hau J, Martin O,
Kuznetsov D, Falquet L. 2007. MyHits:
improvements to an interactive resource for
analyzing protein sequences. Nucleic Acids
Research 35, W433-W437.
Paital B, Kumar S, Farmer R, Tripathy NK,
Chainy GBN. 2011. In silico Prediction and
characterization of 3D structure and binding
properties of catalase from the commercially
important crab, Scylla serrata. Interdisciplinary
Sciences: Computational Life Science 3, 110-120.
Pettersen EF, Goddard TD, Huang CC, Couch
GS, Greenblatt DM, Meng EC, Ferrin TE.
2004. UCSF Chimera - a visualization system for
exploratory research and analysis. Journal of
Computational Chemistry 25(13), 1605-1612.
Porollo A, Meller J. 2007. Versatile Annotation
and Publication Quality Visualization of Protein
Complexes Using POLYVIEW-3D. BMC
Bioinformatics 8, 316.
Pruitt KD, Tatusova T, Maglott DR. 2007.
NCBI reference sequences (RefSeq): a curated non-
redundant sequence database of genomes,
transcripts and proteins. Nucleic Acids Research
35, D61-D65.
Reddy VS, Reddy ASN. 2004. Proteomics of
calcium-signaling components in plants.
Phytochemistry 65, 1745-1776.
Rhodes G. 2006. Crystallography Made Crystal
Clear. 3rd ed., Academic Press, Burlington, MA.
Rizo J, Südhof TC. 1998. C2-domains, structure
and function of a universal Ca2+-binding domain.
Journal of Biological Chemistry 273, 15879-15882.
Roy A, Kucukural A, Zhang Y. 2010. I-
TASSER: a unified platform for automated protein
structure and function prediction. Nature Protocols
5(4), 725-738.
Schultz J, Milpetz F, Bork P, Ponting CP.
1998. SMART, a simple modular architecture
research tool: Identification of signaling domains.
Proceedings of the National Academy of Sciences
USA 95, 5857-5864.
Sigrist CJA, Cerutti L, de Castro E,
Langendijk-Genevaux PS, Bulliard V,
Bairoch A, Hulo N. 2010. PROSITE, a protein
domain database for functional characterization and
annotation. Nucleic Acids Research 38, D161-D166.
Söding J, Biegert A, Lupas AN. 2005. The
HHpred interactive server for protein homology
detection and structure prediction. Nucleic Acids
Research 33, W244-W248.
Sutton RB, Davletov BA, Berghuis AM,
Südhof TC, Sprang SR. 1995. Structure of the
first C2 domain of synaptotagmin I: a novel
Ca2+/phospholipid-binding fold. Cell 80, 929-938.
Tamura K, Peterson D, Peterson N, Stecher
G, Nei M, Kumar S. 2011. MEGA5: Molecular
Evolutionary Genetics Analysis using Maximum
Likelihood, Evolutionary Distance, and Maximum
Parsimony Methods. Molecular Biology and
Evolution 28, 2731-2739.
Voss NR, Gerstein M, Steitz TA, Moore PB.
2006. The geometry of the ribosomal polypeptide
exit tunnel. Journal of Molecular Biology 360(4),
893-906.
Voss NR. 2007. Geometric Studies of RNA and
Ribosomes, and Ribosome Crystallization. PhD
dissertation, Yale University.
Wallace AC, Laskowski RA, Thornton JM.
1995. LIGPLOT: a program to generate schematic
diagrams of protein-ligand interactions. Protein
Engineering, Design & Selection 8(2), 127-134.
Ward JJ, McGuffin LJ, Bryson K, Buxton BF,
Jones DT. 2004. The DISOPRED server for the
prediction of protein disorder. Bioinformatics 20,
2138-2139.
Waterhouse AM, Procter JB, Martin DMA,
Clamp M, Barton GJ. 2009. Jalview version 2: A
Multiple Sequence Alignment and Analysis
Workbench. Bioinformatics 25(9), 1189-1191.
Worden AZ, Lee J-H, Mock T, Rouzé P,
Simmons MP, Aerts AL, Allen AE, Cuvelier
ML, Derelle E, Everett MV, Foulon E,
Grimwood J, Gundlach H, Henrissat B,
Napoli C, McDonald SM, Parker MS,
Rombauts S, Salamov A, Von Dassow P,
Badger JH, Coutinho PM, Demir E, Dubchak
I, Gentemann C, Eikrem W, Gready JE, John
U, Lanier W, Lindquist EA, Lucas S, Mayer
KF, Moreau H, Not F, Otillar R, Panaud O,
Pangilinan J, Paulsen I, Piegu B, Poliakov A,
Robbens S, Schmutz J, Toulza E, Wyss T,
Zelensky A, Zhou K, Armbrust EV,
Bhattacharya D, Goodenough UW, Van de
Peer Y, Grigoriev IV. 2009. Green evolution
and dynamic adaptations revealed by the genomes
of the marine picoeukaryote Micromonas. Science
324, 268-272.
Xie C, Zhu J, Chen X, Mi L, Nishida N,
Springer TA. 2010. Structure of an integrin with
an alphaI domain, complement receptor type 4.
EMBO Journal 29(3), 666-679.
Zhang Y. 2007. Template-based modeling and
free modeling by I-TASSER in CASP7. Proteins
69(S8), 108-117.