molecular evolution of integrins: genes encoding …pat3,flyb ps, and most vertebrate bsubunits...

6
Proc. Natl. Acad. Sci. USA Vol. 94, pp. 9182–9187, August 1997 Evolution Molecular evolution of integrins: Genes encoding integrin b subunits from a coral and a sponge DANNY L. BROWER* ² ,SHARON M. BROWER* ² ,DAVID C. HAYWARD*, AND ELDON E. BALL* *Research School of Biological Sciences, Australian National University, Canberra, ACT, 2601, Australia; and ² Department of Molecular and Cellular Biology, Life Sciences South Building, University of Arizona, Tucson, AZ 85721 Communicated by Judith Kimble, University of Wisconsin, Madison, WI, June 2, 1997 (received for review April 30, 1997) ABSTRACT The integrin family of cell surface receptors is strongly conserved in higher animals, but the evolutionary history of integrins is obscure. We have identified and se- quenced cDNAs encoding integrin b subunits from a coral (phylum Cnidaria) and a sponge (Porifera), indicating that these proteins existed in the earliest stages of metazoan evolution. The coral b Cn1 and, especially, the sponge b Po1 sequences are the most divergent of the ‘‘b 1 -class’’ integrins and share a number of features not found in any other vertebrate or invertebrate integrins. Perhaps the greatest difference from other b subunits is found in the third and fourth repeats of the cysteine-rich stalk, where the generally conserved spacings between cysteines are highly variable, but not similar, in b Cn1 and b Po1 . Alternatively spliced cDNAs, containing a stop codon about midway through the full-length translated sequence, were isolated from the sponge library. These cDNAs appear to define a boundary between functional domains, as they would encode a protein that includes the globular ligand-binding head but would be missing the stalk, transmembrane, and cytoplasmic domains. These and other sequence comparisons with vertebrate integrins are discussed with respect to models of integrin structure and function. The integrins comprise a large family of cell surface receptors involved in diverse adhesion and signaling events in animals (1). In vertebrates, at least 20 different ab heterodimeric integrins are important for processes such as embryonic mor- phogenesis, leukocyte migration, platelet aggregation, and regulation of cell proliferation and differentiation. Integrins in Drosophila melanogaster and Caenorhabditis elegans also are known to be required for a host of morphogenetic events (2, 3). While some integrins bind directly to other transmembrane proteins, most are receptors for extracellular matrix compo- nents. Each integrin is composed of nonhomologous a and b subunits, both of which are transmembrane proteins of ap- proximately 100 kDa (1). The large extracellular domains form a globular head, connected by stalks from each subunit to a pair of short cytoplasmic tails. Integrin polypeptides from nematodes, insects, and vertebrates display a remarkable degree of sequence conservation, which extends throughout the a and b subunits (2, 3). The worm b pat3 , fly b PS , and most vertebrate b subunits contain 56 conserved cysteine residues, suggesting that the proteins all fold into a stereotypical struc- ture. The high level of structural conservation may be a reflection of the fact that in response to extracellular ligand binding or intracellular signals integrins can undergo confor- mational changes that are propagated along the molecule (4). That is, extensive associations between the a and b subunits, as well as interactions with a host of other proteins inside and outside the cell, probably have placed significant selective constraints on integrin structure in higher animals. Unlike the cadherins or immunoglobulin domain receptors, integrins are not part of a superfamily whose members contain repeating structural units arranged in various ways. Recently, however, an argument has been made that seven short repeats in the globular heads of integrin a subunits are folded into a b-propeller domain similar to that found in the b subunits of heterotrimeric G proteins (5). This b-propeller associates with a 180 amino acid structure known as the I domain in some a subunits, and the head of the integrin b subunits also appears to contain an I domain-like structure (6, 7), including a characteristic metal ion-dependent adhesion site (MIDAS). Natural mutations and experimental studies of b subunits have identified a number of specific I domain amino acid residues required for integrin ligand binding (7–16). While progress is being made in defining integrin structure, and similarities to very distantly related proteins are becoming evident, very little information is available concerning the evolutionary origins of integrins as cell surface receptors. Previously, the most primitive organism from which an inte- grin-encoding gene had been isolated was the nematode C. elegans (3). There have been suggestions of integrin-like proteins in the cnidarian Hydra, based on immunological data and inhibition studies using Arg-Gly-Asp (RGD) peptides (17–19). In the most primitive metazoan phylum, the Porifera (sponges), there have been no good experimental suggestions of integrins, but some extracellular matrix proteins similar to those from vertebrates have been identified, including colla- gens and a tenascin-like molecule (20–24). With these facts in mind, we set out to identify genes encoding integrin-related proteins from primitive metazoans. Here, we report the iso- lation of cDNAs encoding integrin b subunits from two of the most primitive animal phyla, the Cnidaria (represented by the coral Acropora millepora) and the Porifera (represented by the marine sponge Ophlitaspongia tenuis). These sequences rep- resent divergent members of what we term the ‘‘b 1 -class’’ of integrin subunits. MATERIALS AND METHODS cDNA Libraries. The coral cDNA library was made from presettlement embryos of A. millepora. Fertilized eggs were collected from the surface of wading pools containing A. millepora colonies at Magnetic Island, Queensland, Australia (19° 089S, 146° 509E), and raised in bulk in mesh-covered containers suspended at sea until the embryos reached pre- settlement stage. Embryos were then placed in cryo tubes and immediately frozen in liquid nitrogen. Specimens of the marine sponge O. tenuis were collected off Jervis Bay, New South Wales, Australia (35° 039S, 150° 449E). Each sponge was separated from the substratum, shaken off, The publication costs of this article were defrayed in part by page charge payment. This article must therefore be hereby marked ‘‘advertisement’’ in accordance with 18 U.S.C. §1734 solely to indicate this fact. © 1997 by The National Academy of Sciences 0027-8424y97y949182-6$2.00y0 PNAS is available online at http:yywww.pnas.org. Data deposition: The sequences reported in this paper have been deposited in the GenBank database [accession nos. AF005356 (b Cn1 ) and AF005357 (b Po1 )]. 9182 Downloaded by guest on January 22, 2020

Upload: others

Post on 29-Dec-2019

1 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Molecular evolution of integrins: Genes encoding …pat3,flyb PS, and most vertebrate bsubunits contain 56 conserved cysteine residues, suggesting that the proteins all fold into a

Proc. Natl. Acad. Sci. USAVol. 94, pp. 9182–9187, August 1997Evolution

Molecular evolution of integrins: Genes encoding integrin bsubunits from a coral and a sponge

DANNY L. BROWER*†, SHARON M. BROWER*†, DAVID C. HAYWARD*, AND ELDON E. BALL**Research School of Biological Sciences, Australian National University, Canberra, ACT, 2601, Australia; and †Department of Molecular and Cellular Biology,Life Sciences South Building, University of Arizona, Tucson, AZ 85721

Communicated by Judith Kimble, University of Wisconsin, Madison, WI, June 2, 1997 (received for review April 30, 1997)

ABSTRACT The integrin family of cell surface receptorsis strongly conserved in higher animals, but the evolutionaryhistory of integrins is obscure. We have identified and se-quenced cDNAs encoding integrin b subunits from a coral(phylum Cnidaria) and a sponge (Porifera), indicating thatthese proteins existed in the earliest stages of metazoanevolution. The coral bCn1 and, especially, the sponge bPo1sequences are the most divergent of the ‘‘b1-class’’ integrinsand share a number of features not found in any othervertebrate or invertebrate integrins. Perhaps the greatestdifference from other b subunits is found in the third andfourth repeats of the cysteine-rich stalk, where the generallyconserved spacings between cysteines are highly variable, butnot similar, in bCn1 and bPo1. Alternatively spliced cDNAs,containing a stop codon about midway through the full-lengthtranslated sequence, were isolated from the sponge library.These cDNAs appear to define a boundary between functionaldomains, as they would encode a protein that includes theglobular ligand-binding head but would be missing the stalk,transmembrane, and cytoplasmic domains. These and othersequence comparisons with vertebrate integrins are discussedwith respect to models of integrin structure and function.

The integrins comprise a large family of cell surface receptorsinvolved in diverse adhesion and signaling events in animals(1). In vertebrates, at least 20 different ab heterodimericintegrins are important for processes such as embryonic mor-phogenesis, leukocyte migration, platelet aggregation, andregulation of cell proliferation and differentiation. Integrins inDrosophila melanogaster and Caenorhabditis elegans also areknown to be required for a host of morphogenetic events (2,3). While some integrins bind directly to other transmembraneproteins, most are receptors for extracellular matrix compo-nents.

Each integrin is composed of nonhomologous a and bsubunits, both of which are transmembrane proteins of ap-proximately 100 kDa (1). The large extracellular domains forma globular head, connected by stalks from each subunit to apair of short cytoplasmic tails. Integrin polypeptides fromnematodes, insects, and vertebrates display a remarkabledegree of sequence conservation, which extends throughoutthe a and b subunits (2, 3). The worm bpat3, f ly bPS, and mostvertebrate b subunits contain 56 conserved cysteine residues,suggesting that the proteins all fold into a stereotypical struc-ture. The high level of structural conservation may be areflection of the fact that in response to extracellular ligandbinding or intracellular signals integrins can undergo confor-mational changes that are propagated along the molecule (4).That is, extensive associations between the a and b subunits,as well as interactions with a host of other proteins inside and

outside the cell, probably have placed significant selectiveconstraints on integrin structure in higher animals.

Unlike the cadherins or immunoglobulin domain receptors,integrins are not part of a superfamily whose members containrepeating structural units arranged in various ways. Recently,however, an argument has been made that seven short repeatsin the globular heads of integrin a subunits are folded into ab-propeller domain similar to that found in the b subunits ofheterotrimeric G proteins (5). This b-propeller associates witha 180 amino acid structure known as the I domain in some asubunits, and the head of the integrin b subunits also appearsto contain an I domain-like structure (6, 7), including acharacteristic metal ion-dependent adhesion site (MIDAS).Natural mutations and experimental studies of b subunits haveidentified a number of specific I domain amino acid residuesrequired for integrin ligand binding (7–16).

While progress is being made in defining integrin structure,and similarities to very distantly related proteins are becomingevident, very little information is available concerning theevolutionary origins of integrins as cell surface receptors.Previously, the most primitive organism from which an inte-grin-encoding gene had been isolated was the nematode C.elegans (3). There have been suggestions of integrin-likeproteins in the cnidarian Hydra, based on immunological dataand inhibition studies using Arg-Gly-Asp (RGD) peptides(17–19). In the most primitive metazoan phylum, the Porifera(sponges), there have been no good experimental suggestionsof integrins, but some extracellular matrix proteins similar tothose from vertebrates have been identified, including colla-gens and a tenascin-like molecule (20–24). With these facts inmind, we set out to identify genes encoding integrin-relatedproteins from primitive metazoans. Here, we report the iso-lation of cDNAs encoding integrin b subunits from two of themost primitive animal phyla, the Cnidaria (represented by thecoral Acropora millepora) and the Porifera (represented by themarine sponge Ophlitaspongia tenuis). These sequences rep-resent divergent members of what we term the ‘‘b1-class’’ ofintegrin subunits.

MATERIALS AND METHODS

cDNA Libraries. The coral cDNA library was made frompresettlement embryos of A. millepora. Fertilized eggs werecollected from the surface of wading pools containing A.millepora colonies at Magnetic Island, Queensland, Australia(19° 089S, 146° 509E), and raised in bulk in mesh-coveredcontainers suspended at sea until the embryos reached pre-settlement stage. Embryos were then placed in cryo tubes andimmediately frozen in liquid nitrogen.

Specimens of the marine sponge O. tenuis were collected offJervis Bay, New South Wales, Australia (35° 039S, 150° 449E).Each sponge was separated from the substratum, shaken off,The publication costs of this article were defrayed in part by page charge

payment. This article must therefore be hereby marked ‘‘advertisement’’ inaccordance with 18 U.S.C. §1734 solely to indicate this fact.

© 1997 by The National Academy of Sciences 0027-8424y97y949182-6$2.00y0PNAS is available online at http:yywww.pnas.org.

Data deposition: The sequences reported in this paper have beendeposited in the GenBank database [accession nos. AF005356 (bCn1)and AF005357 (bPo1)].

9182

Dow

nloa

ded

by g

uest

on

Janu

ary

22, 2

020

Page 2: Molecular evolution of integrins: Genes encoding …pat3,flyb PS, and most vertebrate bsubunits contain 56 conserved cysteine residues, suggesting that the proteins all fold into a

and placed individually in a plastic bag. On returning to thesurface the sponges were buried in dry ice until being storedin a 280°C freezer.

Total RNA was prepared from material ground in liquidnitrogen according to the method of Chomczynski and Sacchi(25). Poly(A)1 RNA was isolated using the poly(A) TractmRNA isolation system (Promega). cDNA libraries wereconstructed using the Lambda ZAP system (Stratagene).

PCR Identification of Integrin cDNAs. The general strategyfor isolation of integrin genes was to amplify fragments fromcDNAs or cDNA libraries by PCR using degenerate primersand a variety of thermal profiles. Candidate bands weregel-purified and sequenced with the primers used for ampli-fication, to ascertain whether they represented bona fideintegrin subunit genes.

Fragments of the coral gene initially were amplified with theforward primer 59-GGITTCyTGGIIIITTCyTAyCTIGACyTAA (corresponding to amino acid sequence GFGSFVDK)and the reverse primers 59-CCICCCyTTCIGGIGCAyGTC(DAPEGG) and 59-GCAyGTCAyGAAICCICCCyTTC(EGGFDA). The buffer contained 1.5 mM Mg21, and Taqpolymerase was used in all primary amplifications. Initially, weused a thermal profile with three cycles of annealing at 47°Cand 30 cycles at 42°C.

The initial amplification of the sponge cDNA used a forwardprimer of 59-GACyTCyTTITACyTTACyTCyTTIATGGA(DLYYLMD) and a reverse primer of 59-AyGTGAyGAAICCIGCAyGTCIGT (TDAGFH). These primers ampli-fied a fragment in 2.5 mM Mg21, but not in 1.25 mM Mg21,perhaps because, as indicated by subsequent sequencing, thereare two mismatches between the primer and cDNA sequences.For this gene, we used a thermal profile of five annealing cyclesat 37°C (and a slow ramp to the elongation temperature)followed by 45 cycles at 42°C.

Cloning and Sequencing. After sequencing indicated anintegrin gene had been amplified, radiolabeled probes weremade from amplified fragments and used to screen the phagelibraries, using standard methods (26). Positive clones wererecovered as inserts in Bluescript SK2 plasmids and grown inDH5a cells. DNA was sequenced using both vector andinternal primers with Applied Biosystems automatic sequenc-ing machines. The coral sequences reported are from plasmidC11A5. Most of the sponge sequence is from SP10C, butbecause this plasmid is not full length, the 59 end of thesequence is derived from a combination of plasmid SP10D(nucleotides encoding residues 3–34) and a PCR product(amplified with Pfu polymerase) derived from phage SP9B(encoding residues 1 and 2). Both strands were sequenced foreach plasmid, but when scanning the SP10D plasmid foralternative splicing only one strand was sequenced.

An alternatively spliced exon found in plasmid SP10Dinserts the sequence cagGTGAGTATACTGAACTATTAT-TATATCTCTCACAAGACTGTATGATTATTGTTGT-GTAGtat (italics indicate f lanking residues included inSP10C) beginning at nucleotide 1355 of the sequence inGenBank. We also found heterogeneity in the size of the 39untranslated region of the sponge clones. Nine nucleotidepolymorphisms were found in SP10D relative to SP10C; thesewould be expected to change five amino acids, none in wellconserved positions. Subsequent experiments showed thateither splice variant could contain different polymorphisms,indicating that the inserted exon is not simply a cloning artifactthat was amplified in the library.

The assignment of the N-terminal methionine for the coralprotein is confirmed by the presence of stop codons in the 59untranslated region. For the sponge, there are no in-framestops (or other AUG codons) in the 44 nucleotides 59 to thepresumed initiating AUG. However we are confident this isindeed the initiating codon, as we have sequenced 2,770nucleotides exclusive of the 39 poly(A), and the mRNA on

Northern blots appears as a broad band of approximately 2.8kb. Moreover, the N-terminal sequence satisfies the criteria fora signal sequence for a protein destined for the plasmamembrane (as determined by PROSORT), and this is not true ifpotential residues encoded by the upstream nucleotides areincluded.

Sequence Analysis. Sequences were aligned first with PILEUPin GCG9, followed by manual adjustments, ensuring that the56 cysteines were aligned. Identity and similarity scores weredetermined using BESTFIT of optimal pairwise alignments,again manually adjusted to align the conserved cysteines.

Linkages for generating phylogenetic trees were constructedusing PAUP 3.1.1. Different algorithms were run with differentcombinations of the data to assess the reliability of variousgroupings. The tree shown in Fig. 2 was generated with aheuristic algorithm, with randomly ordered sample addition.Data for other b subunits are from GenBank (3, 27–38).

RESULTS AND DISCUSSION

As detailed in Materials and Methods, we used degenerate PCRprimers to amplify fragments of integrin b subunit cDNAsfrom a coral and a marine sponge. Adult corals are known toharbor numerous symbionts and other visitors, and to avoidthese contaminants the cDNA library was constructed fromstaged embryos. A probe from the isolated plasmid identifiesa single band of approximately 3.8 kb on A. millepora North-erns. This band is present at all embryonic stages examined andin unfertilized eggs (D. Miller, personal communication).

Sponges also can harbor commensal organisms, and largequantities of pure embryonic tissue were not available to us.However, the species investigated is fan-shaped and 2–3 mmthick and lacks the pores and cavities frequented by commen-sals in more massive sponges. Each individual was shaken todislodge external commensals during collection. Other obser-vations also indicate that the gene isolated does not derivefrom a rare contaminating organism. For example, Northernblots of O. tenuis mRNA confirm that the 2.8-kb transcriptidentified is abundant (not shown), and during the cDNAlibrary screening positive plaques were present at a highfrequency of roughly 1 in 2,000.

Sequence Analysis. Upon conceptual translation, both cD-NAs encode proteins that are obviously integrin b subunits(Fig. 1). The coral protein (bCn1, for the first b from the phylumCnidaria) comprises 792 amino acids, whereas there are 838 inthe sponge subunit (bPo1, signifying the phylum Porifera). Allof the major structural features of integrin b subunits (signalsequence, putative I domain, cysteine-rich stalk, transmem-brane, and cytoplasmic domains) are conserved in both se-quences, including the 56 cysteines characteristic of most bsubunits. While the coral and sponge proteins are differentfrom other b subunits, there is no indication that they arespecialized proteins such as vertebrate b8 and Drosophila bn.That is, the sequences are consistent with the idea that theseproteins represent major tissue integrins of these organisms, inthe same way that vertebrate b1 and Drosophila bPS arecommon in their respective organisms (see below).

Comparisons of specific sequence motifs with other bsubunits reveals varying degrees of conservation. A few spe-cific sequence comparisons are worthy of mention. A numberof residues of the I domain region of b1, b2, b3, or b6 have beenshown by experimental or natural mutations to be importantfor ligand binding (7–16). In addition to the DXSXS motif inthe MIDAS domain (positions 208–212; for easy reference, allamino acid residue numbers are the positions in the alignmentin Fig. 1), residues shown to be critical include Asn-305,Asp-307, Pro-309, Glu-310, Asp-314, Asp-351, and Asp-388,and at least some of these are thought also to be involved incoordination of a divalent cation in a ligand-binding pocket (6,7, 39). All of these residues are conserved in bCn1 and bPo1.

Evolution: Brower et al. Proc. Natl. Acad. Sci. USA 94 (1997) 9183

Dow

nloa

ded

by g

uest

on

Janu

ary

22, 2

020

Page 3: Molecular evolution of integrins: Genes encoding …pat3,flyb PS, and most vertebrate bsubunits contain 56 conserved cysteine residues, suggesting that the proteins all fold into a

FIG. 1. Alignment of amino acid sequences of coral bCn1 and sponge bPo1 with b1-like integrin subunits. Numbers above the sequences indicatethe 56 conserved cysteines, and numbers to the right of each block denote the position in the matrix. Residues mentioned in the text are indicatedby p below the block. The amino acid change and stop codon in the alternatively spliced sponge cDNAs are indicated below the block at position521. The transmembrane domain is the largely green region around position 920. No attempt was made to align the signal sequences before thefirst conserved cysteine. The color scheme used to indicate amino acid similarity is: yellow, proline; red, cysteine; magenta, alanine, glycine, serine,

9184 Evolution: Brower et al. Proc. Natl. Acad. Sci. USA 94 (1997)

Dow

nloa

ded

by g

uest

on

Janu

ary

22, 2

020

Page 4: Molecular evolution of integrins: Genes encoding …pat3,flyb PS, and most vertebrate bsubunits contain 56 conserved cysteine residues, suggesting that the proteins all fold into a

The membrane distal region of the cytoplasmic domains ofmost b subunits contains one or two NPXY motifs (beginningat positions 958 and 970). This region generally has been shownto be important in various assays of integrin function or proteinassociation (reviewed in ref. 40), and studies of missensemutations have directly implicated the NPXY motifs in inte-grin function (16, 41–44). Both of the NPXYs are found in bothbPo1 and bCn1. However, the basic residue that invariablyfollows the first NPXY in other b subunits is a Gln in bCn1.

The membrane-proximal cytoplasmic sequence WKL-LXXXHD (positions 929–937) is conserved in most b subunitsand has been shown to contain residues important for associ-ations with the conserved GFFKR and nearby sequencesfound in a subunits (16, 45). In particular, the terminalAsp-937 residue of the WKLLXXXHD motif of b3 has beenshown to interact with the Arg of the aIIb GFFKR; this Asp isconserved in both bCn1 and bPo1, lending support to thesupposition that these proteins function in association with asubunits. Interestingly, however, the upstream WKyRL se-quence, at the cytoplasmic side of the transmembrane domain,is replaced by IyLKG in bCn1 and bPo1. The similarity of the twonew sequences suggests that this region remains functionallyimportant, and it will be interesting to compare the sequencesof the corresponding a subunits in this region.

Such sequence comparisons of divergent, but functionallysimilar, integrins will take on added interest as more high-resolution structural information becomes available for bsubunits. Recently, a model has been proposed for the bsubunit I domain-like structure (7). Although the resolution ofthis model does not allow many specific predictions to be made,it is noteworthy that the insertion of eight residues at position373 in bPo1, and the shorter insertion of two amino acids atposition 332 in bCn1, are consistent with the model, as bothinserts fall in predicted loops between a b strand and an a helix.

Cysteine Spacings Are Not Conserved. The most noteworthydifference between the bCn1 and bPo1 sequences comparedwith those from higher animals is in the cysteine-rich stalk thatconnects the I domain-like ligand binding region to the plasmamembrane. In most b subunits, this domain contains fourrepeats in which the spacings between cysteines are generallywell conserved. In both bCn1 and bPo1, all of the cysteines arepresent, but the intercysteine spacings are much more variablein repeats 3 and 4. For example, between the 34th and 48thcysteines, 10 of the 14 Cys-Cys spacings are absolutely con-served in all previously reported b subunits (or at least in thosethat contain the relevant cysteines, as some of these are missingin b4, b8 or bn). In six of these 10 sites either bCn1 or bPo1contains extra residues relative to the others, and the stereo-typed CXCXXCXC motifs of repeats 3 and 4 are disrupted. Inaddition, bPo1 contains a unique insert of approximately 37residues between cysteines 51 and 52.

If one assumes that the repeats of the stalk region arose fromduplications of an original unique structural domain, then theb subunits of higher animals, where the repeat structure ismore strongly conserved, probably reflect the ancestral statemore closely than do bCn1 or bPo1. Independent divergencefrom the original repeat structures in the latter proteins alsois suggested by the fact that the coral and sponge intercysteinespacings are not similar to one another. Of course, this raisesthe question of why the integrins of higher animals have beenconstrained to maintain the spacings, while the others havebeen free to vary. Conformation-sensitive antibodies thatrecognize membrane proximal regions of the extracellulardomain have been isolated for other b subunits (46–50), andit is likely that a-b contacts will be important throughout thestalk. Because it is generally more difficult for selection to

change many components simultaneously, one hypothesis isthat the structure of the stalk became more constrained whenb subunits began to associate with multiple a subunits.

A b Subunit Fragment From the Sponge. In the course ofanalyzing sponge cDNAs, we found that some contained analternatively spliced exon of 59 base pairs. When translated inframe, one amino acid is changed relative to the intact bPo1(Fig. 1, position 521), and the second and fifth new codons arestops, generating a protein missing the C-terminal 400 aminoacids of bPo1. This protein would contain the globular ligand-binding domain, but would be missing the stalk, transmem-brane, and cytoplasmic domains of a typical integrin b subunit(Fig. 2). While we have no direct evidence that such a fragmentis actually present in sponges, the abundance of the cDNA (3of 10 isolates from the library) indicates that this protein is verylikely to be made.

There is one previous report of a similarly truncated integrinsubunit, a 60-kDa protein that arises from an alternativelyspliced b3 mRNA (51). Translation of the b3C protein stopswithin four amino acids of the position of the C-terminus of thetruncated bPo1 protein, although the sequences indicate thatthese alternative splicing events are not homologous. Bothtruncations lie just upstream of the cysteine-rich stalk, and thecoincidence in the positions of the stops in b3C and bPo1suggests that this short stretch of residues following the 14thcysteine defines the limit of the ligand-binding head of the bsubunit.

Phylogenetic Analyses. Fig. 3 shows a phylogenetic tree ofthe sequenced integrin b subunits, constructed using the PAUPheuristic parsimony algorithm. The tree is unrooted, becausethere are no b sequences known that are expected to havediverged before the sponge gene. Except for the unexpectedposition of the Drosophila bn protein, discussed below, themost parsimonious b subunit tree is consistent with theaccepted divergence of sponges and cnidarians early in meta-zoan evolution.

The integrin sequences do not permit the reconstruction ofa rigorous phylogeny; for example, while a bootstrappingalgorithm yields a tree similar to that shown, many of thebootstrap values are less than 50. However, we have analyzedthe sequences using a variety of tree-constructing algorithmsand various pair-wise comparisons, and we find no goodindication for an early, conserved divergence in b proteinsequences. The data available suggest that the radiation of bsubunits seen in vertebrates is a relatively late evolutionaryevent, probably occurring in the deuterostome lineage andpossibly within the chordates. To make this conclusion firmlymore invertebrate b subunits need to be examined. Partialsequences of two additional sea urchin b subunits, bL and bC,have been reported (28), and their complete sequences mightbe especially informative in this regard, considering the closeproximity of echinoderms and vertebrates in the phylogenetictree.

Functional studies suggest that b1 is the best candidate forthe ‘‘primal’’ vertebrate b subunit; it associates with an un-usually large array of a subunits and is the subunit mostcommonly involved in fundamental morphogenetic events (1,52). The Drosophila bPS and C. elegans bpat3 proteins show ahigh degree of sequence similarity to b1, associate with mul-tiple a subunits, and also are characterized by widespreadinvolvement in developmental processes (2, 3). For thesereasons, we propose to consider these proteins to be repre-sentatives of a ‘‘b1-class’’ subunit. This is not meant to implythat these proteins are strict orthologues; rather, we use theterm to distinguish these proteins from those, such as theDrosophila bn, that appear to have diverged as they have

and threonine; blue, asparagine, histidine, phenylalanine, tryptophan, and tyrosine; cyan, arginine and lysine; green, isoleucine, leucine, methionine,and valine, and black, aspartic acid, glutamic acid, and glutamine.

Evolution: Brower et al. Proc. Natl. Acad. Sci. USA 94 (1997) 9185

Dow

nloa

ded

by g

uest

on

Janu

ary

22, 2

020

Page 5: Molecular evolution of integrins: Genes encoding …pat3,flyb PS, and most vertebrate bsubunits contain 56 conserved cysteine residues, suggesting that the proteins all fold into a

acquired specialized functions. While we know little about theexpression patterns of bCn1 and bPo1, their sequences areconsistent with their inclusion in the b1-class. Perusal of thesimilarity scores in Table 1, and comparisons of specificsequence motifs, indicate that the sponge and coral proteinsare at least as much like b1 as they are similar to otherinvertebrate proteins. Indeed, the remarkable similarity of b1

to all of the invertebrate b subunits suggests that b1 may havechanged relatively little over the past 500 million years. Be-cause b1 associates with a large number of a subunits (1), itsstructure may have been constrained to a greater degree thanthat of most integrin subunits.

The two sequences presented here represent the mostdivergent b1-class proteins; this is especially true for thesponge. Based on 18S rRNA data, there has been disagree-ment as to whether cnidarians such as coral are monophyleticwith higher metazoans (53, 54). However, as more sequencedata have become available, cnidarians appear to fall verymuch in the metazoan mainstream (e.g., ref. 55), and the highdegree of sequence similarity between bCn1 and human b1

(Table 1) is in agreement with this assignment. The relation-ship of the Porifera to other metazoans is more uncertain, but

while the sponge sequence is significantly more divergent, evenhere there is no evidence for a polyphyletic origin.

While it must be remembered that integrin sequences arenot optimal for phylogenetic reconstructions, there are twosurprises in the tree. The first is the anomalous position ofDrosophila bn between the sponge and coral sequences. Thismost likely results from the fact that parsimony analyses canbreak down when attempting to place a highly divergentsequence. Most probably bn is not an orthologue of the otherinvertebrate sequences, but a functionally specialized andhighly derived protein (27). In fact, bn is not especially like anyof the other b subunits, and in pairwise comparisons bn is moresimilar to the crayfish b (46%) and human b1 (45%), than itis to the sponge and coral proteins (both 43%).

The other unexpected finding is the placement of the C.elegans bpat3 between two arthropod sequences. There appearsto be an unusually high degree of similarity between the C.elegans and Drosophila sequences, as opposed to a particularlyhigh degree of divergence in the crayfish. For example, inpairwise comparisons the crayfish b is most similar to fly bPS,as expected. In explanation, we can only note that both thederivation of arthropod lineages, and nematodes generally,have proven to be enigmatic in other molecular phylogeneticstudies (53, 56).

Integrin-like proteins now have been shown to occur in themost primitive metazoan phyla. How deep in the evolutionarytree did integrins arise? The yeast genome is now fullysequenced, and no protein with obvious homology to anintegrin b subunit has been uncovered. On the other hand,there have been numerous functional and immunologicalsuggestions that integrin-like proteins are present in higherplants and fungi (57–64).

Note Added in Proof. Recently, the sequence of an integrin a subunitalso has been reported from a sponge (65).

We thank David Miller for much advice, and members of his lab andPeter Harrison for assistance in obtaining the coral embryos. JeremyWeinman and Chris Parish provided help in obtaining and adviceconcerning the sponge material. Roy Parker, Richard Maleszka, ScottSelleck, and Adrian Gibbs kindly allowed us the use of their PCRmachines, and we are especially indebted to Richard Maleszka forinvaluable help in our library fishing. Cameron McCrae and SkipVaught’s gang made sequencing virtually painless. Pat O’Grady, LisaNagy, and Sam Ward educated D.B. on phylogenies, and David Miller,Robert Burke, and Torbjorn Holmblad kindly provided information

FIG. 3. Tree showing relationships of b subunit sequences, gen-erated using PAUP heuristic parsimony algorithm. The tree is unrooted;see text for details.

FIG. 2. Expected bPo1 fragment generated by alternative RNAsplicing. The protein would be expected to consist of the b ligand-binding domain (solid shading) without the stalk and cytoplasmicregions (light shading).

Table 1. Pairwise comparisons of b subunit amino acid sequences

CoralbCn1

Wormbpat3

Crayfishb

FlybPS

UrchinbG

Humanb1

Sponge 48 45 45 43 44 49bPol 38 35 37 35 35 38

Coral 51 48 48 51 54bCn1 41 42 41 43 45

Worm 50 55 51 53bpat3 41 47 42 43

Crayfish 54 50 52b 44 41 43

Fly 54 54bPS 46 47

Urchin 54bG 45

Scores for % similarity (top number) and % identity (bottomnumber) for human b1 and invertebrate integrin b subunit sequences.For each comparison, alignments from PILEUP (GCG9) were manuallyadjusted to ensure that the 56 conserved cysteines were aligned, andscores then were calculated by BESTFIT. Signal sequences upstream ofthe initial conserved cysteine were not included in the analysis.

9186 Evolution: Brower et al. Proc. Natl. Acad. Sci. USA 94 (1997)

Dow

nloa

ded

by g

uest

on

Janu

ary

22, 2

020

Page 6: Molecular evolution of integrins: Genes encoding …pat3,flyb PS, and most vertebrate bsubunits contain 56 conserved cysteine residues, suggesting that the proteins all fold into a

before publication. Mike Graner, Tom Bunch, and Mark Ginsbergcriticized the manuscript, and Scott Selleck helped with figures. Thiswork was supported by National Institutes of Health Grant GM42474.

1. Hynes, R. O. (1992) Cell 69, 11–25.2. Brown, N. H. (1994) BioEssays 15, 383–390.3. Gettner, S. N., Kenyon, C. & Reichardt, L. F. (1995) J. Cell Biol.

129, 1127–1141.4. Humphries, M. J. (1996) Cur. Opin. Cell Biol. 8, 632–640.5. Springer, T. A. (1997) Proc. Natl. Acad. Sci. USA 94, 65–72.6. Lee, J., Rieu, P., Arnaout, M. & Liddington, R. (1995) Cell 80,

631–638.7. Tozer, E. C., Liddington, R. C., Sutcliffe, M. J., Smeeton, A. H.

& Loftus, J. L. (1996) J. Biol. Chem. 271, 21978–21984.8. Loftus, J. C., O’Toole, T. E., Plow, E. F., Glass, A., Frelinger,

A. L. F. & Ginsberg, M. H. (1990) Science 249, 915–918.9. Takada, Y., Ylanne, J., Mandelman, D., Puzon, W. & Ginsberg,

M. H. (1992) J. Cell Biol. 119, 913–921.10. Bajt, M. L., Ginsberg, M. H., Frelinger, A. L., III, Berndt, M. C.

& Loftus, J. C. (1992) J. Biol. Chem. 267, 3789–3794.11. Bajt, M. L. & Loftus, J. C. (1994) J. Biol. Chem. 269, 20913–

20919.12. Bajt, M., Goodman, T. & McGuire, S. (1995) J. Biol. Chem. 270,

94–98.13. Huang, X., Chen, A., Agrez, M. & Sheppard, D. (1995) Am. J.

Respir. Cell Mol. Biol. 13, 245–251.14. Kamata, T., Puzon, W. & Takada, Y. (1995) Biochem. J. 305,

945–951.15. Puzon-McLaughlin, W. & Takada, Y. (1996) J. Biol. Chem. 271,

20438–20443.16. Baker, E. K., Tozer, E. C., Pfaff, M., Shattil, S. J., Loftus, J. C. &

Ginsberg, M. H. (1997) Proc. Natl. Acad. Sci. USA 94, 1973–1978.17. Ziegler, U. & Stidwell, R. P. (1992) Exp. Cell Res. 202, 281–286.18. Zhang, X. & Sarras, M. P., Jr. (1994) Development (Cambridge,

U.K.) 120, 425–432.19. Agbas, A. & Sarras, M. P., Jr. (1994) Cell Adhes. Commun. 2,

59–73.20. Labat-Robert, J., Robert, L., Auger, C., Lethias, C. & Garrone,

R. (1981) Proc. Natl. Acad. Sci. USA 78, 6261–6265.21. Diehl-Seifert, B., Kurelec, B., Zahn, R. K., Dorn, A., Jericevic,

B., Uhlenbruck, G. & Muller, W. E. (1985) J. Cell Sci. 79,271–285.

22. Exposito, J. Y., Ouazana, R. & Garrone, R. (1990) Eur. J. Bio-chem. 190, 401–406.

23. Exposito, J. Y., Le Guellec, D., Lu, Q. & Garrone, R. (1991)J. Biol. Chem. 266, 21923–21928.

24. Humbert-David, N. & Garrone, R. (1993) Eur. J. Biochem. 216,255–260.

25. Chomczynski, P. & Sacchi, N. (1987) Anal. Biochem. 162, 156–159.

26. Sambrook, J., Fritsch, E. F. & Maniatis, T. (1992) MolecularCloning: A Laboratory Manual (Cold Spring Harbor Lab. Press,Plainview, NY), 2nd Ed.

27. Yee, G. H. & Hynes, R. O. (1993) Development (Cambridge,U.K.) 118, 845–858.

28. Marsden, M. & Burke, R. D. (1997) Dev. Biol. 181, 234–245.29. Argraves, W. S., Suzuki, S., Arai, H., Thompson, K., Piersch-

bacher, M. D. & Ruoslahti, E. (1987) J. Cell Biol. 105, 1183–1190.30. Kishimoto, T. K., O’Connor, K., Lee, A., Roberts, T. M. &

Springer, T. A. (1987) Cell 48, 681–690.31. Fitzgerald, L. A., Steiner, B., Rall, S. C., Lo, S. S. & Phillips,

D. R. (1987) J. Biol. Chem. 262, 3936–3939.32. Suzuki, S. & Naitoh, Y. (1990) EMBO J. 9, 757–763.33. Ramaswamy, H. & Hemler, M. E. (1990) EMBO J. 9, 1561–1568.

34. Sheppard, D., Rozzo, C., Starr, L., Quaranta, V., Erle D. J. &Pytela, R. (1990) J. Biol. Chem. 265, 11502–11507.

35. Erle, D. J., Ruegg, C., Sheppard, D. & Pytela, R. (1991) J. Biol.Chem. 266, 11009–11016.

36. Moyle, M., Napier, M. A. & McLean, J. W. (1991) J. Biol. Chem.266, 19650–19658.

37. MacKrell, A., Blumberg, B., Haynes, S. R. & Fessler, J. H. (1988)Proc. Natl. Acad. Sci. USA 85, 2633–2637.

38. Holmblad, T., Thornqvist, P., Soderhall, K. & Johansson, M. W.(1997) J. Exp. Zool. 277, 255–261.

39. Qu, A. & Leahy, D. J. (1995) Proc. Natl. Acad. Sci. USA 92,10277–10281.

40. Dedhar, S. & Hannigan, G. (1996) Cur. Opin. Cell Biol. 8,657–669.

41. Reszka, A. A., Hayashi, Y. & Horwitz, A. F. (1992) J. Cell Biol.117, 1321–1330.

42. O’Toole, T. E., Ylanne, J. & Culley, B. M. (1995) J. Biol. Chem.270, 8553–8558.

43. Ylanne, J., Huuskonen, J., O’Toole, T. E., Ginsberg, M. H.,Virtanen, I. & Gahmberg, C. G. (1995) J. Biol. Chem. 270,9550–9557.

44. Tahiliani, P. D., Singh, L., Auer, K. L. & LaFlamme, S. E. (1997)J. Biol. Chem. 272, 7892–7898.

45. Hughes, P. E., Diaz-Gonzales, F., Leong, L., Wu, C., McDonald,J. A., Shattil, S. J. & Ginsberg, M. H. (1996) J. Biol. Chem. 271,6571–6574.

46. Neugebauer, K. M. & Reichardt, L. F. (1991) Nature (London)350, 68–71.

47. Shih, D.-T., Edelman, J. M., Horwitz, A. F., Grunwald, G. B. &Buck, C. A. (1993) J. Cell Biol. 122, 1361–1371.

48. Bazzoni, G., Shih, D. T., Buck, C. A. & Hemler, M. E. (1995)J. Biol. Chem. 270, 25570–25577.

49. Wilkins, J. A., Li, A., Ni, H., Stupack, D. G. & Shen, C. (1996)J. Biol. Chem. 271, 3046–3051.

50. Faull, R. J., Wang, J., Leavesley, D. I., Puzon, W., Russ, G. R.,Vestweber, D. & Takada, Y. (1996) J. Biol. Chem. 271, 25099–25106.

51. Djaffar, I., Chen, Y.-P., Creminon, C., Maclouf, J., Cieutat,A.-M., Gayet, O. & Rosa, J.-P. (1994) Biochem. J. 300, 69–74.

52. Hynes, R. O. (1996) Dev. Biol. 402–412.53. Lake, J. A. (1990) Proc. Natl. Acad. Sci. USA 87, 763–766.54. Field, K. G., Olsen, G. J., Lane, D. J., Giovannoni, S. J., Ghiselin,

M. T., Raff, E. C., Pace, N. R. & Raff, R. A. (1988) Science 239,748–753.

55. Berghammer, H., Hayward, D., Harrison, P. & Miller, D. J.(1996) Gene 178, 195–197.

56. Sidow, A. & Thomas, W. K. (1994) Curr. Biol. 4, 596–603.57. Marcantonio, E. E. & Hynes, R. O. (1988) J. Cell Biol. 106,

1765–1772.58. Schindler, M., Meiners, S. & Cheresh, D. A. (1989) J. Cell Biol.

108, 1955–1965.59. Quatrano, R. S., Brian, L., Aldridge, J. & Schultz, T. (1991)

Development Suppl. (Cambridge, U.K.) 1, 11–16.60. Wayne, R., Staves, M. P. & Leopold, A. C. (1992) J. Cell Sci. 101,

611–623.61. Zhu, J.-K., Shi, J., Singh, U., Wyatt, S. E., Bressan, R. A.,

Hasegawa, P. M. & Carpita, N. C. (1993) Plant J. 3, 637–646.62. Kaminskyj, S. G. & Heath, I. B. (1995) J. Cell Sci. 108, 849–856.63. Correa, A., Jr., Staples, R. C. & Hoch, H. C. (1997) Protoplasma

194, 91–102.64. Gens, J. S., Reuzeau, C., Doolittle, K. W., McNally, J. G. &

Pickard, B. G. (1997) Protoplasma 194, 215–230.65. Pancer, Z., Kruse, M., Muller, I. & Muller, W. E. G. (1997) Mol.

Biol. Evol. 14, 391–398.

Evolution: Brower et al. Proc. Natl. Acad. Sci. USA 94 (1997) 9187

Dow

nloa

ded

by g

uest

on

Janu

ary

22, 2

020