

Proc. Natl. Acad. Sci. USA
Vol. 88, pp. 1849-1853, March 1991
Medical Sciences

Primary structure of the 170-kDa surface lectin of pathogenic Entamoeba histolytica

(receptor/cellular adherence/carbohydrate binding/cysteine-rich pseudorepeats/amoebiasis)

EGBERT TANNICH, FRANK EBERT, AND ROLF D. HORSTMANN

Bernard Nocht Institute for Tropical Medicine, 2000 Hamburg 36, Federal Republic of Germany

Communicated by Hans J. Müller-Eberhard, December 10, 1990 (received for review October 29, 1990)

ABSTRACT The adherence of Entamoeba histolytica to colonic mucins and to host cells appears to be predominantly mediated by a 170-kDa surface lectin of the amoebae. By using an antiserum to the purified lectin, the corresponding cDNA was isolated from an expression library of the pathogenic E. histolytica isolate HM-1:IMSS. Northern blot analysis indicated a transcript of ~4 kilobases, and Southern blot analyses suggested that multiple genes may encode the lectin or closely related proteins in HM-1:IMSS trophozoites. The cDNA-deduced amino acid sequence revealed an N-terminal signal peptide and a mature protein of 1270 amino acids corresponding to a molecular mass of 143 kDa, which comprises a short C-terminal cytoplasmic domain with potential phosphorylation sites, a transmembrane region, and a large extracellular portion with nine potential asparagine-linked glycosylation sites. The extracellular portion may be separated into a cysteine-poor domain and a cysteine-rich domain, the latter of which shows in part repetitive structural elements with a low degree of sequence homology to wheat germ agglutinin and to pDd63, a developmentally expressed protein of Dictyostelium discoideum.

The enteric protozoon Entamoeba histolytica infects nearly 10% of the world's population. It causes 50 million cases of colitis or extraintestinal abscesses annually, resulting in ~50,000 fatalities (1).

Adherence of E. histolytica to colonic mucins and to host cells is considered essential for the colonization of the large intestine and for pathogenic tissue invasion, respectively (2). Both forms of adherence appear to be primarily mediated by a surface lectin of the amoebae, which is inhibitable by galactose and N-acetylgalactosamine (3, 4).

The structure, function, and antigenicity of the lectin have been studied in considerable detail (for review, see ref. 2). Its carbohydrate binding specificity was reported to be similar to that of the lectin of the coral tree Erythrina cristagalli in that terminal N-acetyllactosamine units provide the major binding determinants (5).

The lectin is a membrane-associated glycoprotein with a molecular mass of 170 kDa (6). It is disulfide-linked to one or more units of a 35-kDa protein that was reported to function as a fibronectin receptor (7, 8). The N-terminal amino acid sequence of the lectin has been determined (7). A corresponding oligonucleotide probe was found to hybridize to a 4-kilobase RNA of the amoebae.*

Five functionally relevant epitopes of the lectin have been defined by studying the binding of monoclonal antibodies and their influence on adherence: antibody binding to two epitopes decreased adherence to mucins and to target cells, binding to another epitope inhibited adherence to target cells but not to mucins, and binding to two further epitopes enhanced adherence to mucins and target cells (9).

Similar effects were observed when antisera of patients with invasive amoebiasis were studied. The antisera either reduced or enhanced the adherence reactions (9). Interestingly, gerbil vaccination studies using the isolated lectin as antigen also revealed two contrary groups of responders; some animals were protected whereas the others showed an increase in morbidity.† It was concluded that selected epitopes of the lectin molecule may be candidates for a vaccine against invasive amoebiasis.

To define the location of these epitopes, a detailed structural analysis of the lectin is required. Herein we describe the amino acid sequence and sequence-derived structural domains of the protein.‡

MATERIALS AND METHODS

E. histolytica Isolate, Culture Condition, and Harvest. The pathogenic isolate HM-1:IMSS was cultured in axenic medium TYI-S-33 in plastic tissue culture flasks (10). Trophozoites in late logarithmic growth phase were detached by chilling on ice for 10 min, decanted, pelleted by centrifugation at 500 × g at 4°C for 10 min, washed twice in ice-cold phosphate-buffered saline, and frozen at -70°C.

Isolation of the 170-kDa Surface Lectin. The lectin was isolated from a detergent-soluble membrane extract of a total of 1.5 × 10¹⁰ E. histolytica trophozoites by using the following modification of the method described by Petri et al. (6). All buffers contained as proteinase inhibitors 10 µM trans-epoxysuccinyl-L-leucylamido(4-guanidino)butane (E-64), 6.4 mM benzamidine, and 0.5 mM phenylmethylsulfonyl fluoride (all from Sigma). Membrane extraction and affinity purification of the lectin were performed with 10 samples of 1.5 × 10⁹ trophozoites. Membrane proteins were extracted by 2% (wt/vol) n-octyl β-D-glucopyranoside (OG; Sigma). Affinity chromatography was performed using a 10-ml column of p-aminophenyl β-D-thiogalactopyranoside coupled to agarose (Sigma), and bound material was eluted with 0.5 M N-acetylgalactosamine (Sigma). The eluate was concentrated ~250-fold on a Minicon concentrator (Amicon) with a molecular mass cut-off at 15 kDa.

The material pooled from 10 preparations contained 5.3 mg of protein in 2.8 ml. A sample of 1.2 ml was diluted 1:10 in 5 mM Tris-HCl buffer (pH 7.5) containing 0.8% OG and protease inhibitors and subjected to anion-exchange chromatography on a Mono Q HR 5/5 column using the FPLC system

Abbreviations: OG, n-octyl β-D-glucopyranoside; WGA, wheat germ agglutinin.

*Mann, B. J., Snodgrass, T., Kittler, E. L. W., Ravdin, J. I. & Petri, W. A., Jr., 37th Annual Meeting of the American Society of Tropical Medicine and Hygiene, Dec. 4-8, 1988, abstr. 390.

†Petri, W. A. & Ravdin, J. I., 11th Seminar on Amebiasis, Nov. 15-17, 1989, Mexico City, abstr. 34.

‡The sequence reported in this paper has been deposited in the GenBank data base (accession no. M60498).

The publication costs of this article were defrayed in part by page charge payment. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. §1734 solely to indicate this fact.


(Pharmacia LKB). The column was washed with 20 mM Tris-HCl (pH 7.5) containing OG and protease inhibitors, and material was eluted with a 20-ml gradient to 1 M NaCl in starting buffer. Fractions of 0.5 ml were collected and analyzed by NaDodSO4/PAGE and silver staining.

Antiserum to the isolated lectin was raised in a rabbit by repeated intramuscular injections of 100 µg of the antigen in 1 ml of isotonic saline emulsified with an equal volume of complete or incomplete (boosts) Freund's adjuvant (GIBCO/BRL).

Amino Acid Sequencing. The purified material was subjected to NaDodSO4/PAGE under reducing conditions, transferred to a poly(vinylidene difluoride) membrane (Immobilon; Millipore) using a semidry electroblotting system (Multiphor II; Pharmacia LKB), and stained with Ponceau S (Sigma). Bands were cut out and applied to a modular fluid-phase protein sequencer (model 810; Knauer, Berlin). The method did not allow the identification of cysteine.
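The quantities given for the lectin isolation can be cross-checked with a few lines of arithmetic. The sketch below recomputes the total cell number, the protein concentration of the pooled eluate, and the amount of protein applied to the Mono Q column. The variable names are illustrative only, and reading "1:10" as a tenfold final volume is an assumption rather than something stated in the text.

```python
# Figures quoted in the lectin isolation protocol.
samples = 10                 # independent preparations
cells_per_sample = 1.5e9     # trophozoites per preparation
pooled_protein_mg = 5.3      # protein in the pooled, concentrated eluate
pooled_volume_ml = 2.8       # volume of the pooled eluate
loaded_volume_ml = 1.2       # sample taken for anion-exchange chromatography

# Total number of trophozoites processed.
total_cells = samples * cells_per_sample

# Protein concentration of the pooled eluate (mg/ml).
concentration_mg_per_ml = pooled_protein_mg / pooled_volume_ml

# Protein contained in the 1.2-ml sample applied to the column.
loaded_protein_mg = concentration_mg_per_ml * loaded_volume_ml

# Assumption: "diluted 1:10" means a tenfold final volume.
diluted_volume_ml = loaded_volume_ml * 10

# Affinity-purified protein recovered per 10^9 trophozoites.
yield_mg_per_1e9_cells = pooled_protein_mg / (total_cells / 1e9)

print(f"total cells:         {total_cells:.2e}")
print(f"eluate conc.:        {concentration_mg_per_ml:.2f} mg/ml")
print(f"protein loaded:      {loaded_protein_mg:.2f} mg")
print(f"volume after 1:10:   {diluted_volume_ml:.1f} ml")
print(f"yield per 1e9 cells: {yield_mg_per_1e9_cells:.2f} mg")
```

The total of 1.5 × 10¹⁰ cells agrees with the figure stated in the protocol. The 5.3 mg refers to total protein in the affinity eluate, so the yield is an upper bound for the lectin itself.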

Construction and Screening of the HM-1:IMSS cDNA Libraries. λgt11 and λZAP libraries of the pathogenic E. histolytica isolate HM-1:IMSS were used. The construction of the λgt11 library has been described (11). The λZAP library was constructed with the ZAP cDNA synthesis kit (Stratagene) by following the instructions of the manufacturer. The library contained 10⁷ independent recombinant phages.

A 1:1000 dilution of the rabbit antiserum against the isolated lectin was used to screen 2 × 10⁵ recombinant phages of the λgt11 library by a published method (12). Selected cDNA clones isolated from the λgt11 library were used to screen the λZAP library. Hybridizing phages were isolated, and the plasmids were released according to the instructions of the manufacturer (Stratagene).

Standard DNA and RNA Technologies. Nucleotide sequencing, Southern blot, Northern blot, and primer-extension experiments were performed according to published procedures (13). For primer-extension studies, an oligonucleotide (5'-GTAGTCCATACAAAATGTCTCC) complementary to nucleotides 188-209 of the cDNA sequence (see Fig. 3) was used.

Computer Analysis. cDNA and protein sequences were analyzed using the DNASIS program (Pharmacia LKB). Comparisons to published protein sequences (National Biomedical Research Foundation data base, release 6/90) were performed using the FASTA searching program (14).
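The primer-extension oligonucleotide quoted above can be checked against its stated target coordinates. The following minimal sketch confirms that the primer length matches the span of nucleotides 188-209 and derives the sense-strand sequence that the primer would anneal to. The helper name `reverse_complement` is illustrative, and the derived sense strand follows only from the printed primer; it has not been compared with the deposited cDNA sequence.

```python
# Primer as printed in the Methods, written 5' -> 3'.
PRIMER = "GTAGTCCATACAAAATGTCTCC"

# cDNA positions the primer is stated to be complementary to (inclusive).
TARGET_START, TARGET_END = 188, 209

# Translation table for Watson-Crick base pairing.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (5' -> 3')."""
    return seq.translate(COMPLEMENT)[::-1]


# Number of nucleotides covered by the stated coordinates.
target_length = TARGET_END - TARGET_START + 1

# Sense-strand sequence expected at positions 188-209 if the primer
# is a perfect antisense match.
sense_strand = reverse_complement(PRIMER)

print(len(PRIMER), target_length)
print(sense_strand)
```

Both lengths come out as 22, so the printed primer and the stated coordinates are mutually consistent.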

RESULTS

Isolation of the 170-kDa Surface Lectin, Amino Acid Sequencing, and Production of an Antiserum. The lectin was isolated from cell membranes of 1.5 × 10¹⁰ trophozoites of the pathogenic E. histolytica isolate HM-1:IMSS. Membrane components were extracted by 2% OG and subjected to affinity chromatography on immobilized p-aminophenyl β-D-thiogalactopyranoside. Bound material was eluted with 0.5 M N-acetylgalactosamine and further purified by anion-exchange chromatography. Upon NaDodSO4/PAGE of the isolated material under reducing conditions, a major protein of 170 kDa was found. Sequencing of the N-terminal amino acids revealed a sequence similar to the one described by Petri et al. (7) (Fig. 1). A rabbit antiserum to the isolated material reacted on immunoblots of E. histolytica membrane extracts with a 170-kDa antigen when the gel was electrophoresed under reducing conditions.

Previous data (ref. 7):
  Protein sequencing      NH2-G K L N E F S A D I D Y Y D L
Present data:
  Protein sequencing      NH2-D Q V N E F S A D I D Y Y D L ? ? M
  N K
  cDNA-deduced sequence       D K L N E F S A D I D Y Y D L G I M

FIG. 1. N-terminal amino acid sequences obtained for the isolated 170-kDa surface lectin of E. histolytica compared to the corresponding sequence deduced from cDNA analysis.

Isolation and Characterization of cDNA Clones Coding for the 170-kDa Surface Lectin. The antiserum to the isolated lectin was used to screen 200,000 recombinant phages of a λgt11 cDNA library derived from the pathogenic E. histolytica isolate HM-1:IMSS. After tertiary screening, 11 reactive clones were isolated. The cDNA inserts were released and subjected to Northern blot analyses. One cDNA insert, designated gt-Eh170/5, hybridized to an E. histolytica RNA of ~4 kilobases (kb). A 35-mer antisense oligonucleotide probe derived from the N-terminal amino acid sequence of the isolated lectin hybridized to the RNA at the same position of 4 kb. Nucleotide sequence analysis of gt-Eh170/5 revealed an open reading frame spanning the entire insert of ~2.4 kb. The lack of a 5'-initiating ATG, the absence of a poly(A) tail, and the finding that the hybridizing RNA was substantially larger than the gt-Eh170/5 insert indicated that the 5' and 3' ends of the mRNA were not represented in the cDNA clone. To obtain longer cDNA sequences, gt-Eh170/5 was used to screen a λZAP library also derived from E. histolytica isolate HM-1:IMSS. Sixty out of 100,000 recombinant phages hybridized to the probe. Four of them were purified, and the cDNA inserts, designated ZAP-Eh170/1 to 4, were sequenced. The sequences of all four inserts contained identical 3' ends plus poly(A) tails but differed in their 5' ends, suggesting that they were independent cDNA clones derived from the same mRNA species. The longest cDNA clones, ZAP-Eh170/3 and 4, contained a nucleotide sequence that corresponded to the N-terminal amino acid sequence of the isolated lectin, as obtained by protein sequencing, together with an additional stretch of hydrophobic amino acids. However, the inserts lacked an in-frame initiation codon at the appropriate position. Primer-extension experiments revealed an extension product of 20 nucleotides 5' to ZAP-Eh170/4, suggesting that the full-length coding sequence contains a few additional amino acids.
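The N-terminal comparison of Fig. 1 can be restated as a position-by-position check. The sketch below uses only the first 15 residues, which are assigned in all three sequences, and lists where each protein-derived sequence departs from the cDNA-deduced one. It then tests whether every cDNA-deduced residue is found in at least one of the two protein sequences. The function and variable names are illustrative. The additional residues "N K" and the unassigned positions of the present protein sequence are left out because their alignment cannot be recovered from the figure as reproduced here.

```python
# N-terminal sequences from Fig. 1 (one-letter code).
PREVIOUS = "GKLNEFSADIDYYDL"     # protein sequencing, ref. 7
PRESENT = "DQVNEFSADIDYYDL"      # protein sequencing, this work
CDNA = "DKLNEFSADIDYYDLGIM"      # deduced from the cDNA


def mismatches(a: str, b: str) -> list[int]:
    """Return 1-based positions at which two sequences differ.

    Only the overlapping prefix is compared.
    """
    return [i + 1 for i, (x, y) in enumerate(zip(a, b)) if x != y]


previous_vs_cdna = mismatches(PREVIOUS, CDNA)
present_vs_cdna = mismatches(PRESENT, CDNA)
previous_vs_present = mismatches(PREVIOUS, PRESENT)

# A cDNA residue counts as "explained" when it occurs at the same
# position in either protein-derived sequence.
explained_by_combination = all(
    c in (p, q) for p, q, c in zip(PREVIOUS, PRESENT, CDNA)
)

print("previous vs cDNA:   ", previous_vs_cdna)
print("present vs cDNA:    ", present_vs_cdna)
print("previous vs present:", previous_vs_present)
print("cDNA explained by combining both:", explained_by_combination)
```

All differences fall within the first three residues, and each cDNA-deduced residue at those positions occurs in one of the two protein sequences.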

FIG. 2. Southern and Northern blot analyses. (A) Southern blot. Approximately 10 µg of HM-1:IMSS genomic DNA was digested to completion with restriction enzymes as indicated, submitted to electrophoresis, blotted, and hybridized to ZAP-Eh170/4 under high stringency. Sizes in kb are indicated on the left. (B) Northern blot. Five micrograms of HM-1:IMSS total RNA was submitted to electrophoresis, blotted, and hybridized to ZAP-Eh170/4 under low stringency.


Southern and Northern Blot Analyses. Genomic DNA from trophozoites of the pathogenic E. histolytica HM-1:IMSS was digested with various restriction enzymes and hybridized to ZAP-Eh170/4 under high-stringency conditions. In each digest, several hybridizing fragments were found (Fig. 2A).

Northern blot analysis of total RNA isolated from amoebae of the same isolate showed an RNA species of ~4 kb hybridizing to the cDNA probe (Fig. 2B). This result was obtained under high- as well as low-stringency washing conditions.

Primary Structure of the 170-kDa Surface Lectin. The amino acid sequence deduced from the nucleotide sequence of cDNA ZAP-Eh170/4 (Fig. 3) suggests that the protein consists of several structural domains as shown in Fig. 4. A hydrophobic segment of 29 residues near the C terminus was found that is characteristic of membrane-spanning domains. Approximately 95% of the protein is located on the N-terminal side of this segment and appears to reside extracellularly.

The most N-terminal amino acids are exclusively hydrophobic, indicating that they form a signal peptide. The remaining sequence constitutes a mature protein of 1270 amino acids with a molecular mass of 143 kDa. The sequence contains nine potential asparagine-linked glycosylation sites, all of which are located within the extracellular part of the molecule.

The first domain of the mature protein comprises residues 1-372. This region was designated "cysteine-poor" because of its low cysteine content of 1.6%. The cDNA-deduced N-terminal amino acids of the mature protein corresponded to those found by sequencing the N terminus of the isolated lectin. Identity to the cDNA-deduced sequence may be obtained if the two sequences obtained by protein chemistry are combined (Fig. 1).

The second domain spans residues 373-1203. It was termed "cysteine-rich" because of its high cysteine content of 10.9%. Certain repetitive elements were identified in the sequence between residues 373 and 648. They may be aligned to form nine contiguous pseudorepeats of ~30 amino acids, each of which contains four cysteines and several other conserved amino acids (Fig. 5).

The third domain, which may be located between residues 1204 and 1232, constitutes a putative transmembrane region. It is followed by the presumed cytoplasmic domain of 38 residues, which extends to the C terminus of the protein. It contains 11 candidate phosphorylation sites at serine, threonine, and tyrosine residues.

Computer searches revealed low-degree sequence homologies to other proteins, which were found to lie within the cysteine-rich region of the molecule. Sequence identities exceeded 20% of the residues for wheat germ agglutinin (WGA) isolectin 1 and pDd63, a developmentally expressed protein of Dictyostelium discoideum (15, 16) (Fig. 6).

FIG. 3. cDNA sequence and cDNA-derived amino acid sequence of the 170-kDa surface lectin of E. histolytica. The sequence is incomplete by 20 nucleotides at the 5' end. The beginning and the end of each of the cDNA clones ZAP-Eh170/1 to -4 are indicated by arrows. Positive numbering of the amino acids starts at the N terminus of the mature protein. Thin underline shows the position of the amino acid sequence determined by protein sequencing. The putative transmembrane region is underlined by a heavy bar. The stop codon and cysteine residues are marked by asterisks, and potential asparagine-linked glycosylation sites are indicated by solid squares.

FIG. 4. Structural domains of the 170-kDa surface lectin of E. histolytica. Domain structure was based upon sequence properties. Boxes within the shaded area represent pseudorepeats. Potential asparagine-linked glycosylation sites are indicated by dots. Domains labeled in the figure, from the N terminus to the C terminus: signal, cysteine-poor, cysteine-rich, transmembrane, cytoplasmic.
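The domain boundaries and masses quoted in this section can be checked for internal consistency with simple arithmetic. The sketch below is an illustration only. The cytoplasmic span 1233-1270 is not stated explicitly but is inferred from "38 residues, which extends to the C terminus" of a 1270-residue mature protein. The per-site mass figure assumes, purely for illustration, that all nine potential sites are glycosylated and that glycosylation accounts for the entire difference between the calculated and apparent masses.

```python
# Domain boundaries of the mature protein (1-based, inclusive).
DOMAINS = {
    "cysteine-poor": (1, 372),
    "cysteine-rich": (373, 1203),
    "transmembrane": (1204, 1232),
    "cytoplasmic": (1233, 1270),   # inferred: 38 residues up to the C terminus
}
MATURE_LENGTH = 1270


def span_length(span: tuple[int, int]) -> int:
    """Number of residues in an inclusive residue range."""
    start, end = span
    return end - start + 1


lengths = {name: span_length(span) for name, span in DOMAINS.items()}

# Fraction of the mature protein on the N-terminal side of the membrane anchor.
extracellular_length = lengths["cysteine-poor"] + lengths["cysteine-rich"]
extracellular_fraction = extracellular_length / MATURE_LENGTH

# Pseudorepeat region: nine contiguous repeats between residues 373 and 648.
repeat_region_length = span_length((373, 648))
mean_repeat_length = repeat_region_length / 9
repeat_cysteines = 9 * 4           # four cysteines per repeat

# Calculated versus apparent molecular mass (kDa).
calculated_mass_kda = 143
apparent_mass_kda = 170
mass_gap_kda = apparent_mass_kda - calculated_mass_kda
# Illustrative assumption: the gap is spread evenly over all nine sites.
mass_per_site_kda = mass_gap_kda / 9

print(lengths)
print(f"extracellular fraction: {extracellular_fraction:.1%}")
print(f"mean pseudorepeat length: {mean_repeat_length:.1f} residues")
print(f"cysteines in pseudorepeats: {repeat_cysteines}")
print(f"mass gap: {mass_gap_kda} kDa, i.e. {mass_per_site_kda:.1f} kDa per site")
```

The four spans add up to 1270 residues, the transmembrane span has the stated 29 residues, and the extracellular share rounds to the "approximately 95%" given in the text.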

DISCUSSION

The structure, function, and immunogenicity of the 170-kDa surface lectin of E. histolytica have been studied in considerable detail by Ravdin (2), who termed the molecule galactose/N-acetylgalactosamine-inhibitable adherence lectin. By using an affinity purification proposed by these investigators (6), we isolated a protein from amoeba membranes that was considered to be this molecule because (i) the same method was applied for the isolation, (ii) the protein migrated with an apparent molecular mass of 170 kDa in NaDodSO4/PAGE under reducing conditions, and (iii) the N-terminal amino acid sequence of the protein reported by Petri et al. (7) is identical to the sequence presented here except for the first three residues; the differences may be due to sequencing errors or to polymorphisms of the polypeptide.

An antiserum raised against the isolated lectin was used to identify a corresponding cDNA clone in an expression library of pathogenic E. histolytica, which subsequently was used to isolate a nearly full-length cDNA clone. This clone was considered to represent the lectin because (i) an antiserum to the isolated lectin was used for the initial screening, (ii) the N-terminal amino acid sequence determined for the isolated lectin was found within the cDNA-deduced amino acid sequence at an appropriate position, and (iii) the molecular mass of 143 kDa calculated for the mature protein from the cDNA-deduced amino acids is in good agreement with the molecular mass of 170 kDa experimentally determined for the isolated lectin. The difference between the two values may be due to multiple glycosylations of the native protein. Previous findings indicated that the lectin is a glycosylated protein (2), and the cDNA-deduced amino acid sequence revealed a total of nine potential asparagine-linked glycosylation sites. Glycosylation at several or all of these sites may increase the mass of the molecule to 170 kDa.

The N-terminal amino acid sequence of the mature protein as it was deduced from the cDNA sequence shows a combination of the two sequences obtained by protein chemistry. Although sequencing errors cannot be excluded, the data may be interpreted as an indication of polymorphisms in the polypeptide chain. These may relate to the finding that the corresponding cDNA hybridized to multiple fragments in genomic Southern blots (Fig. 2A). The complexity of the hybridization pattern could be due to intervening genomic sequences or to gene multiplicity. Since no intervening sequences have so far been found in the genome of E. histolytica, the results of the Southern blots more likely suggest that the HM-1:IMSS amoebae contain multiple genes coding for the 170-kDa surface lectin or closely related proteins. Polymorphisms of the protein may, therefore, result from polymorphisms among concomitantly expressed genes. If they exist, then the RNA blot indicates that the sizes of the transcripts do not vary substantially (Fig. 2B).

The architecture of the protein is characteristic of a cell surface receptor. The N-terminal sequence codes for a typical signal peptide (18). The putative transmembrane region of ~30 amino acids resembles other known membrane anchors (19). As in most other membrane proteins, the cytoplasmic domain is located toward the C terminus. In this molecule it is rather small but contains 11 amino acids with hydroxyl groups, suggesting that signal transduction may occur by way of phosphorylation.

Approximately 95% of the molecule appears to be located extracellularly. According to the cysteine content, the extracellular part of the molecule may be separated into two domains. The N-terminal domain is cysteine-poor. Secondary structure predictions (20) suggest that the protein folds predominantly into α-helices and β-sheets. No repetitive elements were identified. Computer searches did not reveal significant homologies to any other protein sequences listed.

The extracellular domain adjacent to the transmembrane region has a cysteine content of >10%. In the N-terminal part of this domain, evidence was found for repetitive structural elements that may be aligned to form nine contiguous pseudorepeats (Fig. 5). In addition, this part of the sequence has a low degree of homology to two proteins with cysteine-rich repetitive structures (Fig. 6). The similarity to WGA was of particular interest because WGA is also a lectin and its tertiary structure has been determined by X-ray crystallography (15, 21). However, no reasonable assumptions could be made concerning the tertiary structure of the amoeba lectin since it lacks several cysteine residues that appear to form disulfide bonds critical for the stability of the WGA molecule (21). Furthermore, no conclusions could be drawn with regard to the carbohydrate binding site of the amoeba lectin since only one of the six amino acids that are believed to form the carbohydrate binding site in WGA was found to be conserved (22). It should be noted, however, that the carbohydrate ligands of WGA, which are N-acetylglucosamine and N-acetylneuraminic acid, differ from those of the amoeba lectin (17).

… those regions of … protective immu…

… H. Buss for techni…

FIG. 5. Alignment of nine pseudorepeats found within the cysteine-rich domain of the 170-kDa surface lectin of E. histolytica. Residues that are conserved in at least four out of nine repeats are boxed. Conserved cysteine residues are surrounded by a double line. For optimal alignments, gaps were introduced into the sequences. The number of the first residue of each repeat is indicated on the left. First residues of the nine repeats: 373, 403, 434, 460, 493, 521, 550, 580, 616.

FIG. 6. [Alignment of Eh170 with WGA1 and pDd63. First alignment block: Eh170 residues 434-468, WGA1 residues 1-33, pDd63 residues 376-409.]

1. Walsh, J. A. (1…
2. Ravdin, J. I. (1…
3. Chadee, K., Pe… Clin. Invest. 84…
4. Ravdin, J. I., … Hewlett, E. L. …
5. Li, E., Becker, … 1725-1730.
6. Petri, W. A., Jr… & Ravdin, J. I. …
7. Petri, W. A., C… man, J. & Ravd…
9. Petri, W. A., Jr… V., Simjee, A. … nol. 144, 4803-4…
10. Diamond, L. S., … Soc. Trop. Med. …
11. Tannich, E., H… (1989) Proc. Na…
12. Young, R. A. & … 80, 1194-1198.
13. Maniatis, T., F… Cloning: A Lab… Spring Harbor, …

In conclusion, the location of the carbohydrate binding site 14. Pearson, W. R. kremains unclear. No attribution to a distinct class of lectins 15. Wright, C. S. & (could be made on the basis of consensus sequences (23). It 16. Williams, J. G.,cannot even be predicted that the carbohydrate binding site Kay, R. R., Earl

is located within the cysteine-rich domain because there are 17. Goldstein, I. J.examples of lectins that contain a cysteine-rich domain but Functions, and Abind to the carbohydrate ligand in a cysteine-poor region (17). I. E., Sharon, NThe availability of cDNA clones coding for the 170-kDa 18. von Hejne, G. (1

surface lectin of E. histolytica will allow the recombinant 19. von Hejne, G. (1expression of selected parts of the molecule. Thereby, it may 20. Chou, P. Y. & Fbe possible to identify functionally relevant regions such as 21. Wright, C. S., Gathe carbohydrate binding site. Furthermore, recombinant 22. Wrght, C. 5. (19technology will possibly allow the selective expression of 23. Drickamer, K. (1

469 -505

34- 71

408-44 1

506-54 3

72-107

442-478

544 -576

108-139

479-5 16

577-6 10

140- 17 1

517-548

FIG. 6. Sequence homology of thecysteine-rich domain of the 170-kDasurface lectin of E. histolytica (Ehl70)with WGA isolectin 1 (WGA1) and theDIF-induced prestalk protein of D.discoideum (pDd63). On the right, theresidue numbers are indicated for therespective protein. Identical residuesare boxed; conserved cysteine resi-dues are doubly boxed. For optimalalignments, gaps were introduced intothe sequences. In the overlap of '170amino acids, 22% (Ehl70/WGA1) and24% (Ehl70/pDd63) of the residuesare identical. The amino acids ofWGAthat are considered to be involved inligand binding are Asp-29, Ser-62, Tyr-73, Glu-115, Ser-148, and Tyr-159 (17);only Tyr-159 is conserved in the se-quence of the amoeba lectin.

f the molecule that were found to elicit aLne response.

Dbloch for performing the protein sequencing andical assistance.

1986) Rev. Infect. Dis. 8, 228-238.1989) J. Infect. Dis. 159, 420-429.tri, W. A., Jr., Innes, D. J. & Ravdin, J. I. (1987) J.D, 1245-1254.Murphy, C. F., Salata, R. A., Guerrant, R. L. &(1985) J. Infect. Dis. 151, 804-816.A. & Stanley, S. L., Jr. (1988) J. Exp. Med. 167,

r., Smith, R. D., Schlesinger, P. A., Murphy, C. F.(1987) J. Clin. Invest. 80, 1238-1244.'hapman, M. D., Snodgrass, T., Mann, B. J., Bro-din, J. 1. (1989) J. Biol. Chem. 264, 3007-3012.r. (1990) Parasitol. Today 6, 62-63.r., Snodgrass, T., Jackson, T. F. H. G., Gathiram,E., Chadee, K. & Chapman, M. D. (1990) J. Immu-4809., Harlow, D. R. & Cunnick, C. C. (1978) Trans. R.

1. Hyg. 72, 431-432.lorstmann, R. D., Knobloch, J. & Arnold, H. H.Wt. Acad. Sci. USA 86, 5118-5122.S Davies, R. W. (1983) Proc. Natl. Acad. Sci. USA

'ritsch, E. F. & Sambrook, J. (1989) Molecularoratory Manual (Cold Spring Harbor Lab., ColdNY).& Lipman, D. J. (1988) Proc. Natl. Acad. Sci. USA

Olafsdottir, S. (1986) J. Biol. Chem. 261, 7191-7195.Ceccarelli, A., McRobbie, S., Mahbubani, H.,

iy, A., Berks, M. & Jermyn, K. A. (1987) Cell 49,

& Poretz, R. D. (1986) in Lectins: Properties,Applications in Biology and Medicine, eds. Liener,r. & Goldstein, I. J. (Academic, New York), pp.

1985) J. Mol. Biol. 184, 99-105.1985) Curr. Top Membr. Transp. 24, 151-179.asman, G. D. (1978) Adv. Enzymol. 47, 251-276.avilanes, F. & Peterson, D. L. (1984) Biochemistry

)80) J. Mol. Biol. 141, 267-291.1988) J. Biol. Chem. 263, 9557-9560.
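Two simple sequence statistics carry much of the argument in the discussion: the cysteine content of a domain (>10% in the cysteine-rich domain) and the percentage of identical residues within an alignment overlap (22% and 24% in Fig. 6). The following Python sketch shows one way such values could be computed. It is not the procedure used in this study. The function names, the toy sequences, and the convention of counting only alignment columns without gaps are assumptions made for illustration.

```python
def cysteine_fraction(seq: str) -> float:
    """Fraction of residues in seq that are cysteine (one-letter code 'C').

    Non-letter characters such as alignment gaps are ignored.
    """
    residues = [r for r in seq.upper() if r.isalpha()]
    if not residues:
        raise ValueError("sequence contains no residues")
    return residues.count("C") / len(residues)


def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identical residues over columns where neither sequence has a gap.

    Assumption: the denominator is the number of gap-free columns; other
    conventions (e.g. full alignment length) give different values.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if a != gap and b != gap
    ]
    if not pairs:
        raise ValueError("no overlapping residues")
    matches = sum(a == b for a, b in pairs)
    return 100.0 * matches / len(pairs)


# Toy, made-up sequences (NOT taken from Ehl70, WGA1, or pDd63).
toy_domain = "CKDLECNTGKCAEDQCKPSS"
toy_a = "CKD-LECNTG"
toy_b = "CRDGLE-NSG"

print(round(cysteine_fraction(toy_domain), 2))   # 4 of 20 residues -> 0.2
print(round(percent_identity(toy_a, toy_b), 1))  # 6 of 8 gap-free columns -> 75.0
```

With real data, the aligned strings would come from an alignment program, and the reported identity depends on how gaps and the overlap region are defined.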

