author's personal copy - upv/ehu · cg31997 102b1 (iv) not detected in embryos. low level...

8
This article was published in an Elsevier journal. The attached copy is furnished to the author for non-commercial research and education use, including for instruction at the author’s institution, sharing with colleagues and providing to institution administration. Other uses, including reproduction and distribution, or selling or licensing copies, or posting to personal, institutional or third party websites are prohibited. In most cases authors are permitted to post their version of the article (e.g. in Word or Tex form) to their personal website or institutional repository. Authors requiring further information regarding Elsevier’s archiving and manuscript policies are encouraged to visit: http://www.elsevier.com/copyright

Upload: others

Post on 21-Aug-2020

0 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Author's personal copy - UPV/EHU · CG31997 102B1 (IV) Not detected in embryos. Low level trachea? Fat body from larvae challenged with gram+/ bacteriab adult head[25]d embryo[24]g

This article was published in an Elsevier journal. The attached copyis furnished to the author for non-commercial research and

education use, including for instruction at the author’s institution,sharing with colleagues and providing to institution administration.

Other uses, including reproduction and distribution, or selling orlicensing copies, or posting to personal, institutional or third party

websites are prohibited.

In most cases authors are permitted to post their version of thearticle (e.g. in Word or Tex form) to their personal website orinstitutional repository. Authors requiring further information

regarding Elsevier’s archiving and manuscript policies areencouraged to visit:

http://www.elsevier.com/copyright

Page 2: Author's personal copy - UPV/EHU · CG31997 102B1 (IV) Not detected in embryos. Low level trachea? Fat body from larvae challenged with gram+/ bacteriab adult head[25]d embryo[24]g

Author's personal copy

A novel family of single VWC-domain proteins in invertebrates

T.J. Sheldona,*, I. Miguel-Aliagab, A.P. Gouldb, W.R. Taylora, D. Conklinc

a Division of Mathematical Biology, National Institute for Medical Research, The Ridgeway, Mill Hill, London NW7 1AA, UKb Division of Developmental Neurobiology, National Institute for Medical Research, The Ridgeway, Mill Hill, London NW7 1AA, UK

c Department of Computing, City University, London EC1V 0HB, UK

Received 20 August 2007; revised 4 October 2007; accepted 10 October 2007

Available online 18 October 2007

Edited by Takashi Gojobori

Abstract Using a combination of sequence analysis tools, anovel family of 13 short Drosophila melanogaster proteins withsimilarity to a single von Willebrand factor C-domain (SVC pro-teins) has been identified. SVCs are predicted to be secreted anda structural model for the family is proposed, using a known vonWillebrand factor C-domain (VWC) structure as template. AllSVCs possess eight cysteines, consistent with the loss of onebonded pair from the 10-cysteine canonical pattern of mostVWC domains. Unlike most Drosophila genes, many SVCsare not expressed during development, and misexpression hasno apparent effect on development or viability. SVCs appear tobe restricted to arthropods. A role for SVCs in response to envi-ronmental challenges is proposed.� 2007 Federation of European Biochemical Societies. Pub-lished by Elsevier B.V. All rights reserved.

Keywords: von Willebrand factor; VWC; Vago; Drosophila

1. Introduction

One of the most successful ways to determine the function of

novel protein families is to find sequence similarity to a protein

that has already been well-characterised. In the absence of

appreciable sequence similarity, especially when the target

sequence is short, a cysteine scaffold can provide evidence of

an evolutionary relationship.

Analysis of the Drosophila melanogaster genome previously

identified a group of eight related proteins with no apparent

sequence similarity to any other known proteins [1]. That

study made use of structural information encoded in contact

matrices to improve pairwise alignments and thus identify

more distantly related sequences, thereby speculating that the

eight proteins were structurally related to insulin-like growth

factor (IGF). Here, protein motif analysis is used as an alter-

native methodology to re-evaluate the same protein family

and identify additional members. This approach reveals that

the family possesses a disulfide arrangement more similar to

the C-domain of von Willebrand factor (VWC) than to IGF.

Members of this new protein family are thus named single

VWC domain proteins (SVC). Expression and functional anal-

ysis of the Drosophila members of this family are consistent

with a role as response proteins induced by certain environ-

mental stresses.

2. Materials and methods

2.1. Homology searching and alignmentA multiple alignment of the original family of Drosophila proteins

was used to create a motif model [2], based on the noted conservationof an 8-cysteine pattern within the family. Eight positions were chosento represent the model, comprising 6 cysteine profiles of width 1, one ofwidth 5 to the motif [FY]PxCC, and one of width 1 to a conservedhydrophobic residue. The maximum and minimum distances betweenthese profiles were taken from the multiple alignment, and the resultingmodel was used as a probe to search the D. melanogaster known pro-teome as listed in FlyBase [3]. Following the discovery of geneCG32667, Blast [4] was run against SwissProt [5] using default para-meters to identify a potential homology to human BMP-regulatingprotein Q8N8U9.

2.2. Extending the SVC familyThe 13 D. melanogaster sequences were aligned and a position-spe-

cific scoring matrix (PSSM; [4]) was constructed. Using this, PsiBlast[4] was run to convergence with an E-value of 0.001 against the NCBInon-redundant database.1 Every homologue identified was aligned tothe original profile, and this new profile was used to re-search the samedatabase. To ensure that no homologues were missed, the procedurewas repeated using HMMER v2.3.2.2

2.3. Modelling CG32667An alignment of the original SVC proteins was built using ClustalW

[6]. The amino acid sequence of collagen IIA VWC domain (PDB code1U5M) was aligned with nine of its own VWC homologues, as identi-fied by PsiBlast. ClustalW was used again to align these two profiles,and the resulting pairwise alignment between CG32667 and 1U5Mwas used as input to Modeller 8v2 [7]. Ten models were built, andthe best scoring one (according to Modeller’s own scoring function)was selected.

Further details on methods used can be found in Supplementarymaterial, Appendix E.

3. Results

3.1. Identification of a novel family of 13 single-domain

Drosophila proteins

The eight proteins speculated to be IGF-like by Kleinjung

et al. [1] were initially used as the basis for the SVC family

(Table 1, asterisked). The family displays the consensus pattern

C–f17; 32g–C–f4; 5g–C–f10; 11g–C–f6; 10g–C–f12; 14g–½FY�–P–X–C–C–f2; 5g–C

where {x,y} represents between x and y residues inclusive.

Initial runs of PsiBlast [4] using this group did not yield any

*Corresponding author. Fax: +44 20 8816 2460.E-mail address: [email protected] (T.J. Sheldon).

1http://www.ncbi.nlm.nih.gov/.2http://hmmer.janelia.org.

0014-5793/$32.00 � 2007 Federation of European Biochemical Societies. Published by Elsevier B.V. All rights reserved.

doi:10.1016/j.febslet.2007.10.016

FEBS Letters 581 (2007) 5268–5274

Page 3: Author's personal copy - UPV/EHU · CG31997 102B1 (IV) Not detected in embryos. Low level trachea? Fat body from larvae challenged with gram+/ bacteriab adult head[25]d embryo[24]g

Author's personal copy

further family members with an E-value lower than the stan-

dard threshold of 0.001. Therefore, a motif model [2] which

looks only for core motifs in specified ordering and ranges

was applied to this group. Application of this method to the

D. melanogaster proteome identified four new family members

CG6301, CG1552, CG31997 and CG32667 as the highest scor-

ing results. A subsequent run of Blast using CG6301 against

the full NCBI non-redundant database identified a further se-

quence BK003543. All 13 of these proteins have the requisite

consensus pattern, and all are predicted by SignalP [8] to have

signal peptides, strongly suggesting that they are secreted. See

Supplementary material, Appendix A for details of individual

sequences.

Notably, it was found that CG32667 lies at cytological re-

gion 10A7 of chromosome 1, within the same 90 kb interval

containing five of the original eight genes. Multiple alignment

using ClustalW [6] and secondary structure prediction using

Psipred [9] further suggested that CG32667 belongs to the ori-

ginal family of peptides. The splice sites and intron phases were

also found to match with those from the original group

(Fig. 1A). When the four family members that occupy a single

5 kb segment of the genome are clustered by sequence similar-

ity, CG32667 and CG15199 are found to be closest to

CG15201 and CG15202, respectively suggesting two gene

duplication events. Their arrangement in the genome supports

this idea (Fig. 2), as does the finding that one duplicate pair

(CG15199 and CG15202) display marked similarity in their

expression patterns, suggesting a shared mechanism of regula-

tion (Table 1).

3.2. A structural model for the SVC family

The databases SMART3 and InterPro4 both annotate

CG32667 as the only one of the SVC proteins to contain a

VWC domain. However, a run of Blast against SwissProt [5]

using the amino acid sequence of CG32667 revealed no close

similarity to any known sequence. The best-scoring alignment

was to residues 166–223 of human BMP-regulating protein

Q8N8U9 with an E-value of 0.41, the two sequences sharing

33% identity across this region. Swiss-Prot annotates this re-

gion in Q8N8U9 as being similar to VWC.

To investigate this link to the structure of the VWC domain,

a 3-dimensional model of a representative SVC protein was

constructed. The profile of the 13 D. melanogaster SVC se-

quences was aligned with the sequence of the VWC domain

from collagen IIA. This revealed that the VWC domain con-

tains 10 disulfide-bonding cysteine residues [10], while the

SVC proteins only have eight. But the two missing cysteines

in SVC form a bonded pair in VWC, suggesting that they

could have been lost together through evolution (Fig. 3A).

This is also arguably the least important disulfide bond in

Table 1Summary of expression data for the 13 D. melanogaster SVC genes

Gene Location In situ developmental expression Expression from interrogation of literature/public databases

CG15203* 10A4 (X) Embryonic fat body and PNS at the endof embryogenesis, not detected in any tissueduring larval development

Embryoa larval fat body [21]

CG2081 (Vago)* 10A4 (X) Not detected in embryos or duringlarval development

Fat body from larvae challenged withgram+/� bacteriab adult head [24]c [25]d larvalfat body [21]

CG32677 10A7 (X) Not tested Adult head [24]c [25]e Larval/pupal library [25]f

CG15199* 10A7 (X) Not detected except for embryonicsalivary glands

Pupal salivary glands [18]

CG15201* 10A8 (X) Not detected in embryos Adult head [24]c

CG15202* 10A8 (X) Not detected except for embryonicsalivary glands

Pupal salivary glands [18]

CG2444* 10D5 (X) Not detected in embryos or duringlarval development

Fat body from larvae challenged withgram+/� bacteriab pupal salivary glands[18] larval fat body [21]

CG12677* 24F2 (2L) Not detected in embryos Fat body from larvae challenged withgram+/� bacteriab

CG31997 102B1 (IV) Not detected in embryos. Low level trachea? Fat body from larvae challenged withgram+/� bacteriab adult head [25]d embryo [24]g

CG14132* 68D3 (3L) Not tested Fat body from larvae challenged withgram+/� bacteriab

CG6301 53D 11 (2R) Not tested becoming 2 separate genes N/ACG1552 10A3 (X) Not tested Adult head [24]c [25]d

BK003543 (2L) Not tested N/A

In situ hybridisation of nine out of the 13 D. melanogaster genes reveals that many of them are not expressed during embryonic and/or larvaldevelopment. Analysis of their representation in cDNA collections (available from FlyBase under cDNA clones) shows that most of these genes areexpressed in adults and/or challenged tissues, consistent with their possible role as ‘‘response genes’’. The genes previously described by Kleinjunget al. [1] as putative IGF-like genes are marked by asterisks.aBDGP in situ database (http://www.fruitfly.org/cgi-bin/ex/insitu.pl).bEC library, FlyBase. Chen, F. et al. (2005). FlyTag ML01 EST Project.cRH (Riken Head) library, FlyBase.dGH library, FlyBase.eHL library, FlyBase.fLP library, FlyBase.gRE (Riken Embryo) library, FlyBase.

3http://smart.embl-heidelberg.de/.4http://www.ebi.ac.uk/interpro/.

T.J. Sheldon et al. / FEBS Letters 581 (2007) 5268–5274 5269

Page 4: Author's personal copy - UPV/EHU · CG31997 102B1 (IV) Not detected in embryos. Low level trachea? Fat body from larvae challenged with gram+/ bacteriab adult head[25]d embryo[24]g

Author's personal copy

the VWC molecule, as it connects two beta strands within a

sheet; the four remaining (conserved) cysteine pairs stabilise

loop regions.

Using Psiblast, the SVCs were expanded into a wider family

comprising 43 short, single domain secreted proteins. They are

diverse in sequence, but eight cysteine residues are conserved

throughout the group. In the canonical 10-cysteine arrange-

ment of VWC, two motifs are known to be well-conserved:

C2–X–X–C3–X–C4 and C8–C9–X–X–C10 [10]. C3 is missing

from the first of these motifs in the SVC family. This cysteine

forms half of the missing disulfide bridge (Fig. 3A); C2 and C4

remain separated by four amino acids in almost every case

(Fig. 1A). The second VWC motif is conserved across the

SVCs as part of the C-terminal motif [FYHW]–P–X–C–C–

{2,5}–C,where X is usually polar. It is notable that the first res-

idue in this motif always possesses an aromatic side chain.

Five VWC domains are also found in Drosophila Hemolectin

(CG7002), a large, multi-domain protein similar in arrange-

ment to von Willebrand factor [11]. Notably, the two C-termi-

nal VWCs of hemolectin are missing a single disulfide-bonded

pair of cysteines, the same pair that is missing in the SVC fam-

ily.

An examination of the spacing between cysteine residues

found the SVCs to be consistent with the 679 10-cysteine

Fig. 1. Alignments of the SVC proteins. (A) Shows the alignment of all 43 SVC sequences in the same order as Table 2. All cysteines are colouredyellow. Splice sites in the D. melanogaster sequences are coloured blue (phase 1) and green (phase 2). The positions of the five b strands from themodel of CG32667 are shown at the top. Signal peptides and long terminus regions have been omitted for brevity. (B) Shows the alignment of the twoVWC domains of collagen IIA (1U5M) and human BMP-regulating protein Q8N8U9 to SVC protein CG32667. Figures were rendered using Jalview[26].

Gene spanCG18292Klp10A G32667 CG15199 CG15202 CG15201 ran

10982k 10983k 10984k 10985k 10986k 10987k 10988k 10989k 10990k 10991k

Fig. 2. Gene region map of four of the SVC genes from D. melanogaster. The arrangement of these four genes on the genome suggests that the groupis a result of a double duplication event, supporting the idea that the important gene CG32667 should be included in the SVC family. Image sourcedfrom FlyBase (http://flybase.bio.indiana.edu).

5270 T.J. Sheldon et al. / FEBS Letters 581 (2007) 5268–5274

Page 5: Author's personal copy - UPV/EHU · CG31997 102B1 (IV) Not detected in embryos. Low level trachea? Fat body from larvae challenged with gram+/ bacteriab adult head[25]d embryo[24]g

Author's personal copy

VWC domains listed in the Pfam database [12] (Supplemental

Table 1 and Appendix C). In addition, the secondary structure

prediction tools Psipred [9] and Yaspin [13] both predict beta

strands, precisely consistent with the VWC model, but not with

an IGF structure (Fig. 3B). Furthermore, the sequence thread-

ing tool GenThreader [14] also predicts a VWC domain (pdb

code 1U5M) as the highest scoring structure for one of the

SVC sequences (AAM93661). The P-value of this result is only

0.045, but this is the same order of magnitude as the GenTh-

reader result for the 8-cysteine VWC5 domain in Hemolectin

(P = 0.029 against 1U5M), while Hemolectin VWC4 (also 8-

cysteine) is not threaded on to a VWC domain at all.

Fig. 3B shows a proposed model of a representative SVC

family member, CG32667, built with Modeller 8v2 [7] using

the collagen IIA VWC domain as a template (PDB code

1U5M). Fig. 1B shows the alignment of CG32667 to this

VWC domain and that of Q8N8U9. On the basis that the 13

proteins all contain a single domain of 88–192 amino acids re-

lated to the VWC domain, they are called single VWC-domain

(SVC) proteins.

3.3. SVC genes appear to be restricted to arthropods

Further runs of PsiBlast using CG32667 have identified

orthologues in a wide set of species (Table 2; Fig. 1A). In addi-

tion to the 13 in D. melanogaster, there are a further 30 in other

arthropods, making a total of 43 proteins across 12 species.

Blood-sucking insects account for 18 of these sequences. All

but one of the 43 proteins are from insects, the exception being

AAK61816 which is from another arthropod, the scorpion

Buthus martensii. An analysis of the SVC protein sequences

using SignalP indicates that they are secreted. See Supplemen-

tary material, Appendix D for details.

3.4. Many Drosophila SVC genes are not developmentally

expressed, nor does their misexpression interfere with

development

The expression of eight out of the 13 D. melanogaster SVCs

was analyzed in Drosophila (Supplementary material, Appen-

dix B). Surprisingly, no expression was detected in any tissue

during embryonic or larval development for five of these eight

(Fig. 4A–F and Table 1). Two other genes, CG15199 and

CG15202, only showed expression in salivary glands, a non-

essential tissue [15] (Table 1; Fig. 4I and not shown). Clear,

spatially restricted expression was only observed for

CG15203, first expressed at embryonic stage 16 in exit glia.

At stage 17, the fat body and PNS structures such as the dorsal

bipolar dendrite (dbd) neurons and one of the lateral chordo-

tonal organs also expressed CG15203 (Fig. 4G and H). How-

ever, this expression was transient and could no longer be

detected in larvae (Fig. 4D). A number of control experiments

were carried out to rule out technical problems as the reason

for our inability to detect expression (Fig. 4J–L and Supple-

mentary material, Appendix B).

Ubiquitous expression of SVC genes from transgenes did not

interfere with normal development (see Supplementary mate-

rial, Appendix B for details).

3.5. Some SVC proteins are induced by environmental stress

Several Drosophila transcriptome-wide libraries ([16–19],

and Flybase) were examined. Consistent with our in situ

hybridisation analysis, most of the SVC genes are not repre-

sented in the embryonic or larval collections of RNAs (Table

1). Instead, eight of them appear to be expressed in adult flies

or during metamorphosis (CG2081, CG32677, CG15199,

CG15201, CG15202, CG2444, CG12677, CG31997).

Interestingly, several SVCs were present in larval RNA

libraries obtained following bacterial infection, an environ-

mental stress (CG2081, CG2444, CG12677, CG31997 and

CG14132). At least one of these three genes was also found

to be highly induced by viral infection. This gene is CG2081,

also known as Vago. Furthermore, Vago, CG15199 and

CG14132 were also found to be nutritionally regulated, and

the expression levels of CG2444, another infection-induced

gene, were regulated during salivary gland apoptosis (see Table

2 for summary). Thus, SVC gene expression can be regulated

by a number of environmental stresses such as infection or

starvation.

4. Discussion

The SVC family was initially identified by Kleinjung et al. [1]

as having a possible insulin-like growth factor (IGF) protein

fold. Here, the family has been extended to 13 members using

a motif model which looks for conservation only in core mo-

tifs. One of these additional family members had sufficient se-

quence similarity to provide the link to a modelling template

which implies an entirely different fold prediction (Fig. 1B).

The alternative structure reported here is that of a single

Fig. 3. A proposed structure for SVC. The arrangement of the fourdisulfide bridges can be seen in schematic diagram (A). Existing pairsA, B, C and D are in yellow, and the missing bridge M is in red. Image(B) shows a stereoscopic view of a SVC family member, CG32667. Thefour disulfide bonds in yellow are labelled A, B, C and D; the positionof the lost Cys-Cys pair (pale yellow) is labelled M. The two red sidechains are Tyr-Pro which make up the first two residues of the[FYHW]–P–X–C–C–{2,5}–C motif found throughout the SVCs.Actual beta strands are depicted as arrows. Regions predicted as bstrands have been coloured green.

T.J. Sheldon et al. / FEBS Letters 581 (2007) 5268–5274 5271

Page 6: Author's personal copy - UPV/EHU · CG31997 102B1 (IV) Not detected in embryos. Low level trachea? Fat body from larvae challenged with gram+/ bacteriab adult head[25]d embryo[24]g

Author's personal copy

VWC domain, whose all-beta arrangement is in contrast to the

helical fold of IGF.

The evidence for the suitability of VWC as a template for the

SVC proteins can be summarised as follows. First, the second-

ary structure predictions provided by two separate programs

fit extremely well with the b-strand model (Fig. 3B); they do

not fit with the IGF model of Kleinjung et al. [1]. Second,

the cysteine spacings observed in the SVC family are consistent

with those of the VWC domains described in Pfam. Third, the

two missing cysteines in the 8-cysteine SVCs form a disulfide-

bonded pair in 10-cysteine VWC domains; exactly the same

pattern is found in the two 8-cysteine VWC domains of Dro-

sophila Hemolectin. Fourth, GenThreader predicts VWC as

the top-scoring template, with a P-value comparable with that

of a Hemolectin VWC domain; there was no comparable IGF

prediction.

Because of their clear orthology to D. melanogaster genes

(Table 2), the three D. pseudoobscura genes GA12782,

GA13565 and GA13562, which are speculated to be incom-

plete sequences in the databases, may also encode signal pep-

tides. If this is the case, then all SVC proteins are likely to

be small, secreted, apparently single-domain molecules of

Table 2The 43 proteins of the arthropodic SVC family

Protein Species Comments

CG32667 Drosophila melanogasterCG12677 Induced by bacterial infectiona

See Supplementary material, Appendix ACG14132 Induced by bacterial infectiona

Nutritionally regulated [19]See Supplementary material, Appendix A

CG15199 Nutritionally regulated [19]CG15201CG15202CG15203CG1552CG2081 Vago gene: upregulated following viral and bacterial infectiona [17]

Strongly upregulated in aging fliesb

Nutritionally regulated [19]CG2444 Regulated during salivary gland metamorphosis [18]

Induced by bacterial infectiona

CG31997 Induced by bacterial infectiona

CG6301 See Supplementary material, Appendix ABK003543 See Supplementary material, Appendix AGA12782 D. pseudoobscura Orthologous to CG14132

Predicted unsecretedGA13562 Orthologous to CG15199

Originally predicted unsecreted, but upstream signal foundGA13565 Orthologous to CG15201

Predicted unsecretedGA13566 Orthologous to CG15202GA13567 Orthologous to CG15203GA15228 Orthologous to CG2081GA15367 Orthologous to CG2444GA16601 Orthologous to CG31997AAM93660 Ixodes scapularisAAM93661AAV80782AAY66509AAY66510AAY66701AAY66712AAY66742AAY66767AAT92107 Ixodes pacificusAAT92199 Appears to be missing one cysteineEAT48541 Aedes aegyptiEAT48820EAT48821EAT48822AAV90667 Aedes albopictusAAF16720 Manduca sextaXP 967519 Tribolium castaneumXP320959 Anopheles gambiaeXP395092 Apis mellifera Predicted similar to CG31997AAN46117 CulicoidesAAK61816 Buthus martensii Appears to be missing first cys but probably truncated at the N-term

Table entries are grouped by species, and comments are included where further information is known. The expression of several of the Drosophilamembers is regulated by environmental conditions such as infection or nutritional status.aEC library, FlyBase: Chen et al. (2005), FlyTag ML01 EST Project (unpublished).bhttp://www.ncbi.nlm.nih.gov/projects/geo/gds/gds_browse.cgi?gds=582.

5272 T.J. Sheldon et al. / FEBS Letters 581 (2007) 5268–5274

Page 7: Author's personal copy - UPV/EHU · CG31997 102B1 (IV) Not detected in embryos. Low level trachea? Fat body from larvae challenged with gram+/ bacteriab adult head[25]d embryo[24]g

Author's personal copy

101–192 amino acids. In contrast, the VWC domains found

abundantly in hemolectin, VWF and elsewhere are always

found to be combined with other domains to form much larger

molecules. The single exception to this is Lymnaea Granularin

(AAS20460) which is also a single VWC domain protein, but

which possesses the full complement of 10 cysteines [20].

Five of the eight D. melanogaster SVCs that were examined

are not expressed during development, and two others were

only detected in the salivary glands, a non-essential tissue

[15]; only CG15203 was detected in the embryonic fat body

(Table 1; Supplementary material, Appendix B). These results

are generally consistent with the recently available microarray

expression data collated as FlyAtlas [21], where nine out of the

12 SVC genes show no or very low expression in their larval

collections. However, along with CG15203, FlyAtlas reports

expression of a further two SVC genes (CG2444 and

CG2081) in larval fat body, the central organ that regulates

immune and nutritional responses. It is conceivable that differ-

ences in the exact third-instar stage analyzed and/or the nutri-

tional/immune status of the starting tissue account for this

apparent discrepancy.

The general absence of SVC gene expression rules out a

developmental or housekeeping role for the family. Instead,

several of these genes have been found to be nutritionally reg-

ulated, or expressed in response to bacterial infection [3,19].

This, together with the presence of signal peptides across the

SVC family, supports the hypothesis that SVCs are secreted

into the circulation (hemolymph) in response to certain envi-

ronmental stresses. Further support for this idea is provided

by the Vago gene (CG2081), the best characterized Drosophila

SVC gene. As with many Drosophila SVCs, Vago is not ex-

pressed during embryonic or larval development. Instead, its

expression is strongly induced in the fat body of adult flies

24 h after they are infected with Drosophila C virus [17]. The

fat body is both a nutrient sensor and the main organizer of

the immune response in Drosophila [22,23] Interestingly, Vago

expression is also strongly upregulated in aging flies5 which

would be consistent with its induction upon oxidative stress.

Vago expression was also activated when Drosophila larvae

were fed a sugar-only diet [[19], Supplementary data]. Thus,

at least in this case, it would appear that the expression of

the same SVC can be induced upon a number of stress stimuli.

In further support of the SVC response function hypothesis,

Lymnaea Granularin was recently shown to be secreted by

brain connective tissue as part of an internal defence response

against schistosome infection [20].

In this paper, a novel family of genes with a possible re-

sponse function has been identified. Due to their high sequence

Fig. 4. mRNA expression of Drosophila SVC genes. Most SVC genes, such as CG2081(A), CG2444 (B) and CG15201 (C) are not expressed inembryos. Similarly, none of the SVC genes examined showed larval expression, as shown in D–F for third-instar tissue (fat body, CNS and discs, andthe alimentary canal, respectively). Only CG15203 shows clear expression in the fat body of late embryos (G). At this stage, CG15203 is alsotransiently expressed in the PNS (H, note expression in the dorsal bipolar dendrite neuron and LCh5). However, this expression pattern is no longerapparent in larvae (D and not shown). CG15199 (D) and CG15202 (not shown) are expressed in embryonic salivary glands. To confirm the integrityof the probes, ectopic expression of SVCs was induced in larval tissues (J,K) and embryos (L) from UAS transgenes. In these control experiments,SVC gene expression was readily detected in the expected pattern.

5http://www.ncbi.nlm.nih.gov/projects/geo/gds/gds_browse.cgi?gds=582.

T.J. Sheldon et al. / FEBS Letters 581 (2007) 5268–5274 5273

Page 8: Author's personal copy - UPV/EHU · CG31997 102B1 (IV) Not detected in embryos. Low level trachea? Fat body from larvae challenged with gram+/ bacteriab adult head[25]d embryo[24]g

Author's personal copy

variability, the conserved intron structure of the SVC genes

could be exploited in the future to identify remotely homolo-

gous SVC sequences using computational gene finding meth-

ods. Further experimental work to determine their mode of

action and which aspects of the structure are functionally

important would be of great interest.

Acknowledgements: We thank Sharon Gorski for providing cDNAclones, and Patricia Serpente and Penny Sarchet for technical assis-tance. T.S., A.G. and W.T. were funded by the Medical ResearchCouncil. I.M.-A. was funded by a Marie Curie Intra-European Fellow-ship.

Appendix A. Supplementary data

Supplementary data associated with this article can be

found, in the online version, at doi:10.1016/j.febslet.2007.

10.016.

References

[1] Kleinjung, J., Romein, J., Lin, K. and Heringa, J. (2004) Contact-based sequence alignment. Nucleic Acids Res. 32, 2464–2473.

[2] Conklin, D. (2004) Recognition of the helical cytokine fold. J.Comput. Biol. 11, 1189–1200.

[3] Drysdale, R.A. and Crosby, M.A. (2005) FlyBase: genes and genemodels. Nucleic Acids Res. 33, D390–D395.

[4] Altschul, S.F., Madden, T.L., Schaffer, A.A., Zhang, J., Zhang,Z., Miller, W. and Lipman, D.J. (1997) Gapped BLAST and PSI-BLAST: a new generation of protein database search programs.Nucleic Acids Res. 25, 3389–3402.

[5] Bairoch, A. and Apweiler, R. (1998) The SWISS-PROT proteinsequence data bank and its supplement TrEMBL in 1998. NucleicAcids Res. 26, 38–42.

[6] Thompson, J.D., Higgins, D.G. and Gibson, T.J. (1994) CLUS-TAL W: improving the sensitivity of progressive multiplesequence alignment through sequence weighting, position-specificgap penalties and weight matrix choice. Nucleic Acids Res. 22,4673–4680.

[7] Sali, A. and Blundell, T.L. (1993) Comparative protein modellingby satisfaction of spatial restraints. J. Mol. Biol. 234, 779–815.

[8] Nielsen, H., Engelbrecht, J., Brunak, S. and von Heijne, G. (1997)Identification of prokaryotic and eukaryotic signal peptides andprediction of their cleavage sites. Protein Eng. 10, 1–6.

[9] McGuffin, L.J., Bryson, K. and Jones, D.T. (2000) The PSIPREDprotein structure prediction server. Bioinformatics 16, 404–405.

[10] O’Leary, J.M., Hamilton, J.M., Deane, C.M., Valeyev, N.V.,Sandell, L.J. and Downing, A.K. (2004) Solution structure anddynamics of a prototypical chordin-like cysteine-rich repeat (vonWillebrand Factor type C module) from collagen IIA. J. Biol.Chem. 279, 53857–53866.

[11] Goto, A., Kumagai, T., Kumagai, C., Hirose, J., Narita, H.,Mori, H., Kadowaki, T., Beck, K. and Kitagawa, Y. (2001) A

Drosophila haemocyte-specific protein, hemolectin, similar tohuman von Willebrand factor. Biochem. J. 359, 99–108.

[12] Bateman, A., Coin, L., Durbin, R., Finn, R.D., Hollich, V.,Griffiths-Jones, S., Khanna, A., Marshall, M., Moxon, S.,Sonnhammer, E.L., et al. (2004) The Pfam protein familiesdatabase. Nucleic Acids Res. 32, D138–D141.

[13] Lin, K., Simossis, V.A., Taylor, W.R. and Heringa, J. (2005) Asimple and fast secondary structure prediction method usinghidden neural networks. Bioinformatics 21, 152–159.

[14] Jones, D.T. (1999) GenTHREADER: an efficient and reliableprotein fold recognition method for genomic sequences. J. Mol.Biol. 287, 797–815.

[15] Jones, N.A., Kuo, Y.M., Sun, Y.H. and Beckendorf, S.K. (1998)The Drosophila Pax gene eye gone is required for embryonicsalivary duct development. Development 125, 4163–4174.

[16] Tomancak, P., Beaton, A., Weiszmann, R., Kwan, E., Shu, S.,Lewis, S.E., Richards, S., Ashburner, M., Hartenstein, V.,Celniker, S.E. and Rubin, G.M. (2002) Systematic determinationof patterns of gene expression during Drosophila embryogenesis.Genome Biol. 3, (RESEARCH0088).

[17] Dostert, C., Jouanguy, E., Irving, P., Troxler, L., Galiana-Arnoux, D., Hetru, C., Hoffmann, J.A. and Imler, J.L. (2005) TheJak-STAT signaling pathway is required but not sufficient for theantiviral response of drosophila. Nat. Immunol. 6, 946–953.

[18] Gorski, S.M., Chittaranjan, S., Pleasance, E.D., Freeman, J.D.,Anderson, C.L., Varhol, R.J., Coughlin, S.M., Zuyderduyn, S.D.,Jones, S.J. and Marra, M.A. (2003) A SAGE approach todiscovery of genes involved in autophagic cell death. Curr. Biol.13, 358–363.

[19] Zinke, I., Schutz, C.S., Katzenberger, J.D., Bauer, M. andPankratz, M.J. (2002) Nutrient control of gene expression inDrosophila: microarray analysis of starvation and sugar-depen-dent response. Embo. J. 21, 6162–6173.

[20] Smit, A.B., De Jong-Brink, M., Li, K.W., Sassen, M.M., Spijker,S., Van Elk, R., Buijs, S., Van Minnen, J. and Van Kesteren, R.E.(2004) Granularin, a novel molluscan opsonin comprising a singlevWF type C domain is up-regulated during parasitation. Faseb J.18, 845–847.

[21] Chintapalli, V.R., Wang, J. and Dow, J.A. (2007) Using FlyAtlasto identify better Drosophila melanogaster models of humandisease. Nat. Genet. 39, 715–720.

[22] Colombani, J., Raisin, S., Pantalacci, S., Radimerski, T.,Montagne, J. and Leopold, P. (2003) A nutrient sensor mecha-nism controls Drosophila growth. Cell 114, 739–749.

[23] Hoffmann, J.A. and Reichhart, J.M. (2002) Drosophila innateimmunity: an evolutionary perspective. Nat. Immunol. 3, 121–126.

[24] Stapleton, M., Liao, G., Brokstein, P., Hong, L., Carninci, P.,Shiraki, T., Hayashizaki, Y., Champe, M., Pacleb, J., Wan, K.,et al. (2002) The Drosophila gene collection: identification ofputative full-length cDNAs for 70% of D. melanogaster genes.Genome Res. 12, 1294–1300.

[25] Rubin, G.M., Hong, L., Brokstein, P., Evans-Holm, M., Frise, E.,Stapleton, M. and Harvey, D.A. (2000) A Drosophila comple-mentary DNA resource. Science 287, 2222–2224.

[26] Clamp, M., Cuff, J., Searle, S.M. and Barton, G.J. (2004) TheJalview Java alignment editor. Bioinformatics 20, 426–427.

5274 T.J. Sheldon et al. / FEBS Letters 581 (2007) 5268–5274