homolog of escherichia coli recaproteinin plastids of ... · (a) lane 1, protoplast fraction; lane...

5
Proc. Natl. Acad. Sci. USA Vol. 89, pp. 8068-8072, September 1992 Plant Biology A homolog of Escherichia coli RecA protein in plastids of higher plants (chloroplast/recombination/recA gene/DNA damage/DNA repair) H. CERUTTIt, M. OSMAN, P. GRANDONI, AND A. T. JAGENDORF* Section of Plant Biology, Cornell University, Ithaca, NY 14853 Contributed by A. T. Jagendorf, April 24, 1992 ABSTRACT Studies of chloroplast DNA variations, and several direct experimental observations, indicate the existence of recombination ability in algal and higher plant plastids. However, no studies have been done of the biochemical path- ways involved. Using a part of a cyanobacterial recA gene as a probe in Southern blots, we have found homologous sequences in total DNA from Pisum sanvwn and Arabidopsis thaEaa and in a cDNA library from Arabidopsis. A cDNA was cloned and sequenced, and its predicted amino acid sequence Is 60.7% identical to that of the cyanobacterial RecA protein. This finding is consistent with our other results showing both DNA strand transfer activity and the existence of a protein of the predicted molecular mass crossreactive with antibodies to Escherichia coli RecA in the stroma of pea chloroplasts. The observation of chloroplast DNA recombinants in so- matic hybrids of higher plants (1), genetic studies of the inheritance of chloroplast markers in several crosses of Chlamydomonas (2-4), the integration of donor DNA by homologous recombination in chloroplasts of transformed Chlamydomonas (5), and extensive comparative analyses of chloroplast genome structure (6-9) indicate that DNA re- combination occurs in chloroplasts of both higher plants and green algae. The biochemistry of any recombinational mech- anism in chloroplasts is completely unknown, however. It is generally accepted that plastids originated from cy- anobacterial progenitors, acquired by an ancestral eukaryotic cell through an endosymbiotic event (10, 11). Therefore it seemed probable that any chloroplast recombination system should be related to a eubacterial counterpart. In Escherichia coli and many other prokaryotes, the RecA protein is essen- tial for homologous recombination and for a variety of SOS responses to DNA damage (12-20). In searching for a pos- sible higher-plant recA gene we used a cyanobacterial recA as a probe and found homologous sequences in nuclear DNA from pea and Arabidopsis thaliana. With the same probe we have cloned an Arabidopsis thaliana cDNA that encodes a protein highly homologous to eubacterial RecA, except for a predicted chloroplast transit peptide at its amino terminus.§ The likely expressed protein was detected in chloroplast stromal extracts by crossreaction with polyclonal antibodies to E. coli RecA protein. MATERIALS AND METHODS Materials. Standard laboratory chemicals were purchased, unless otherwise noted, from Sigma. Purified RecA protein was obtained from United States Biochemical. Cellulysin and Macerase were from Calbiochem. Horseradish peroxidase- conjugated goat antibodies to rabbit IgG, and a chemilumi- nescent substrate, were from Amersham. Isolation of Stroma from Intact Chloroplasts and Immuno- detection of a RecA Protein. Protoplasts were prepared from leaves of Arabidopsis or pea by digestion with 3% (wt/vol) Cellulysin and 0.5% (wt/vol) Macerase. Intact chloroplasts were isolated by passage of washed protoplasts through a nylon mesh (10-lum pore size) for disruption, followed by centrifugation of the homogenate on discontinuous Percoll gradients (21). Chloroplasts were broken by osmotic lysis and thylakoid membranes were removed by centrifugation. The supernatant was concentrated by acetone precipitation to give the chloroplast stromal fraction. Potential bacterial contamination was tested by plating aliquots of isolated protoplasts or chloroplasts on LB medium (22). Proteins from the different fractions were separated by SDS/polyacrylam- ide gel electrophoresis, transferred to nitrocellulose, and probed with two polyclonal rabbit antisera to RecA protein from E. coli strain K-12. Detection was by means of goat anti-rabbit IgG antibodies conjugated to horseradish perox- idase, and a chemiluminescent substrate. In some cases, the enzyme was inactivated by incubation in 15% (wt/vol) H202 and the blot was reprobed with an antibody against the y subunit of CF1. DNA Isolation and Probing with a Cyanobacterial recA Gene Fragment. Total DNA was isolated from Arabidopsis leaves by a miniprep procedure (R. L. Last, personal communica- tion). Isopycnic CsCl centrifugation (22) was used for the purification of total DNA from pea leaves and of chloroplast DNA from intact pea chloroplasts. Standard procedures were used for digestion, electrophoretic separation, and transfer of the DNA to nylon membranes (22). The filters were probed with a 32P-labeled BstEII fragment comprising the 5' half of the coding sequence (23). Prehybridization was at 700C for 6 hr and hybridization was at 600C overnight (24). Filters were washed three times for 1.5 hr with 2x SSC/0.1% SDS (22) at room temperature, then three times in 0.2x SSC/0.1% SDS at 500C for another 1.5 hr. The membranes were exposed to Kodak XAR-5 film with an intensifying screen. cDNA Cloning and Sequencing. A cDNA was isolated by screening '-300,000 members of an Arabidopsis library in the AYES vector (25), using as probe the Synechococcus recA gene (23). The bacteriophages were plated on an E. coli strain deleted for recA (C 10289, ref. 26). Standard procedures were used for library screening (22). After subcloning into pBluescript (Stratagene), nested deletions were generated and the DNA was sequenced on both strands by the dideoxy chain-termination method using T7 DNA polymerase (22). The sequences were analyzed with the Genetics Computer Group software package (27). Protoplast Treatments. Protoplasts (10 ml, 106 protoplasts per ml) in liquid LP* medium (28) were placed in 9-cm Petri tPresent address: Department of Botany, Duke University, Durham, NC 27706. tTo whom reprint requests should be addressed. §The sequence reported in this paper has been deposited in the GenBank data base (accession no. M98039). 8068 The publication costs of this article were defrayed in part by page charge payment. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. §1734 solely to indicate this fact. Downloaded by guest on December 6, 2020

Upload: others

Post on 22-Aug-2020

0 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: homolog of Escherichia coli RecAproteinin plastids of ... · (a) Lane 1, protoplast fraction; lane 2, chloroplast fraction; lane 3, chloroplast stromalfraction (half the amountofprotein

Proc. Natl. Acad. Sci. USAVol. 89, pp. 8068-8072, September 1992Plant Biology

A homolog of Escherichia coli RecA protein in plastids ofhigher plants

(chloroplast/recombination/recA gene/DNA damage/DNA repair)

H. CERUTTIt, M. OSMAN, P. GRANDONI, AND A. T. JAGENDORF*Section of Plant Biology, Cornell University, Ithaca, NY 14853

Contributed by A. T. Jagendorf, April 24, 1992

ABSTRACT Studies of chloroplast DNA variations, andseveral direct experimental observations, indicate the existenceof recombination ability in algal and higher plant plastids.However, no studies have been done of the biochemical path-ways involved. Using a part of a cyanobacterial recA gene as aprobe in Southern blots, we have found homologous sequencesin total DNA from Pisum sanvwn and Arabidopsis thaEaa andin a cDNA library from Arabidopsis. A cDNA was cloned andsequenced, and its predicted amino acid sequence Is 60.7%identical to that of the cyanobacterial RecA protein. Thisfinding is consistent with our other results showing both DNAstrand transfer activity and the existence of a protein of thepredicted molecular mass crossreactive with antibodies toEscherichia coli RecA in the stroma of pea chloroplasts.

The observation of chloroplast DNA recombinants in so-matic hybrids of higher plants (1), genetic studies of theinheritance of chloroplast markers in several crosses ofChlamydomonas (2-4), the integration of donor DNA byhomologous recombination in chloroplasts of transformedChlamydomonas (5), and extensive comparative analyses ofchloroplast genome structure (6-9) indicate that DNA re-combination occurs in chloroplasts of both higher plants andgreen algae. The biochemistry of any recombinational mech-anism in chloroplasts is completely unknown, however.

It is generally accepted that plastids originated from cy-anobacterial progenitors, acquired by an ancestral eukaryoticcell through an endosymbiotic event (10, 11). Therefore itseemed probable that any chloroplast recombination systemshould be related to a eubacterial counterpart. In Escherichiacoli and many other prokaryotes, the RecA protein is essen-tial for homologous recombination and for a variety of SOSresponses to DNA damage (12-20). In searching for a pos-sible higher-plant recA gene we used a cyanobacterial recA asa probe and found homologous sequences in nuclear DNAfrom pea and Arabidopsis thaliana. With the same probe wehave cloned an Arabidopsis thaliana cDNA that encodes aprotein highly homologous to eubacterial RecA, except for apredicted chloroplast transit peptide at its amino terminus.§The likely expressed protein was detected in chloroplaststromal extracts by crossreaction with polyclonal antibodiesto E. coli RecA protein.

MATERIALS AND METHODSMaterials. Standard laboratory chemicals were purchased,

unless otherwise noted, from Sigma. Purified RecA proteinwas obtained from United States Biochemical. Cellulysin andMacerase were from Calbiochem. Horseradish peroxidase-conjugated goat antibodies to rabbit IgG, and a chemilumi-nescent substrate, were from Amersham.

Isolation of Stroma from Intact Chloroplasts and Immuno-detection of a RecA Protein. Protoplasts were prepared fromleaves of Arabidopsis or pea by digestion with 3% (wt/vol)Cellulysin and 0.5% (wt/vol) Macerase. Intact chloroplastswere isolated by passage of washed protoplasts through anylon mesh (10-lum pore size) for disruption, followed bycentrifugation of the homogenate on discontinuous Percollgradients (21). Chloroplasts were broken by osmotic lysis andthylakoid membranes were removed by centrifugation. Thesupernatant was concentrated by acetone precipitation togive the chloroplast stromal fraction. Potential bacterialcontamination was tested by plating aliquots of isolatedprotoplasts or chloroplasts on LB medium (22). Proteins fromthe different fractions were separated by SDS/polyacrylam-ide gel electrophoresis, transferred to nitrocellulose, andprobed with two polyclonal rabbit antisera to RecA proteinfrom E. coli strain K-12. Detection was by means of goatanti-rabbit IgG antibodies conjugated to horseradish perox-idase, and a chemiluminescent substrate. In some cases, theenzyme was inactivated by incubation in 15% (wt/vol) H202and the blot was reprobed with an antibody against the ysubunit of CF1.DNA Isolation and Probing with a Cyanobacterial recA Gene

Fragment. Total DNA was isolated from Arabidopsis leavesby a miniprep procedure (R. L. Last, personal communica-tion). Isopycnic CsCl centrifugation (22) was used for thepurification of total DNA from pea leaves and of chloroplastDNA from intact pea chloroplasts. Standard procedures wereused for digestion, electrophoretic separation, and transfer ofthe DNA to nylon membranes (22). The filters were probedwith a 32P-labeled BstEII fragment comprising the 5' half ofthe coding sequence (23). Prehybridization was at 700C for 6hr and hybridization was at 600C overnight (24). Filters werewashed three times for 1.5 hr with 2x SSC/0.1% SDS (22) atroom temperature, then three times in 0.2x SSC/0.1% SDSat 500C for another 1.5 hr. The membranes were exposed toKodak XAR-5 film with an intensifying screen.cDNA Cloning and Sequencing. A cDNA was isolated by

screening '-300,000 members of an Arabidopsis library in theAYES vector (25), using as probe the Synechococcus recAgene (23). The bacteriophages were plated on an E. coli straindeleted for recA (C 10289, ref. 26). Standard procedureswere used for library screening (22). After subcloning intopBluescript (Stratagene), nested deletions were generatedand the DNA was sequenced on both strands by the dideoxychain-termination method using T7 DNA polymerase (22).The sequences were analyzed with the Genetics ComputerGroup software package (27).

Protoplast Treatments. Protoplasts (10 ml, 106 protoplastsper ml) in liquid LP* medium (28) were placed in 9-cm Petri

tPresent address: Department of Botany, Duke University, Durham,NC 27706.tTo whom reprint requests should be addressed.§The sequence reported in this paper has been deposited in theGenBank data base (accession no. M98039).

8068

The publication costs of this article were defrayed in part by page chargepayment. This article must therefore be hereby marked "advertisement"in accordance with 18 U.S.C. §1734 solely to indicate this fact.

Dow

nloa

ded

by g

uest

on

Dec

embe

r 6,

202

0

Page 2: homolog of Escherichia coli RecAproteinin plastids of ... · (a) Lane 1, protoplast fraction; lane 2, chloroplast fraction; lane 3, chloroplast stromalfraction (half the amountofprotein

Proc. NatL. Acad. Sci. USA 89 (1992) 8069

dishes. Mitomycin C was added to the desired concentration,and protoplasts were incubated for 12 hr in the dark. Intactprotoplasts were reisolated by flotation, and proteins wereextracted and analyzed by Western blotting.

RESULTSDetection of a Protein in Chloroplast Stroma That Is Im-

munologically Related to E. coli RecA. An immunoblottingtechnique was used to detect Arabidopsis proteins related toRecA (Fig. la). Nonimmune serum did not reveal any bands(data not shown). Two polyclonal antibodies raised against E.coli RecA crossreacted with three protoplast proteins. Twoof them are soluble chloroplast proteins detected in thestromal fraction (Fig. la). The apparent molecular mass ofthe faster moving stromal protein is 40.5 kDa, almost iden-tical with that ofE. coli RecA (Fig. la). Similar proteins werealso identified in pea chloroplasts (Fig. lb). Consistent withthe presence of RecA in chloroplasts, we detected DNAstrand exchange, an essential activity ofE. coliRecA (12, 14),in crude stromal extracts from pea (H.C. and A.T.J., unpub-lished work).

Induction by DNA Damage. It has been argued that theprimary biological role of recombination is the repair ofDNAdamage (29). Exposure of E. coli to DNA-damaging agentsinduces the SOS response, resulting in derepression of =20genes (including recA) (12-20). The RecA protein is involvedin multiple aspects of this response: regulation of geneinduction by promoting cleavage of the LexA repressor (13,15, 19, 20), recombinational repair (13, 14, 19, 20), SOSmutagenesis (13, 16, 19), DNA replication (13, 17), andduplication mutagenesis (13). In other organisms, genesinvolved in DNA repair/recombination are also induced inresponse to DNA damage (19, 30).To see whether DNA-damaging agents would affect ex-

pression of the proteins immunologically related to RecA,pea protoplasts were incubated with mitomycin C. Mitomy-cin C is a bifunctional alkylating agent that forms interstrandcrosslinks (15, 31), presumably requiring a recombinationalpathway for repair (29). The treatments increased the steady-state level of the chloroplast crossreacting protein similar in

a 1 2 3 4

kDa

-46.5-40.5

b

c

molecular mass to RecA, suggesting its involvement in DNArepair/recombination (Fig. lb). The same blot was probedwith an antibody against the y subunit ofthe chloroplast ATPsynthetase as a control for the proper loading of the lanes(Fig. ic). Further details of the induction will be describedelsewhere (H.C., H.-Z. Ibrahim, and A.T.J., unpublishedwork).Homologous DNA in Higher Plants. Blot hybridization of

restriction enzyme-digested genomic DNA revealed se-quences homologous to the Synechococcus recA gene (23) inpea and Arabidopsis (Fig. 2). However, we were unable todetect any hybridization to purified chloroplast DNA (Fig. 2)or mitochondrial DNA (data not shown). Homology to recAhas not been found in the completely sequenced chloroplastgenomes of tobacco (32), Marchantia (33), or rice (7). More-over, induction of the stromal protein was prevented byprotein synthesis inhibitors acting on cytosolic (80S) ribo-somes, also suggesting a nuclear localization for this gene(data not shown).A recA Gene in Arabidopsis thalana. Using the Synecho-

coccus recA gene (23) as a probe, we screened an ArabidopsiscDNA library and cloned a gene showing extensive homologyto eubacterial recA. The cloned cDNA was sequenced bystandard techniques. Polymerase chain reaction analysisshowed that this was the longest cDNA in the library encod-ing the RecA protein (data not shown). Although the cDNAis truncated at its 5' end (Fig. 3a), it is long enough to revealthe features of the encoded protein. The amino terminusshows no similarity to bacterial RecA sequences and isprobably a chloroplast transit peptide (Fig. 3b). Chloroplasttransit peptides are not highly homologous, except for aloosely conserved motif at the cleavage site for the stromalprocessing protease (35, 36). The deduced amino acid se-quence contains a perfect match to this consensus cleavagesite (Fig. 3a). When such a match is found, it is predicted tospecify the correct cleavage site with 90%o probability (35).The sequence upstream of the putative cleavage site isenriched for serine and threonine (29.5%) and is almostdevoid of acidic residues (2.0%). It also lacks predictedsecondary structures except for two relatively small regions,one of them an amphiphilic (3-strand next to the cleavage site(data not shown). These are typical features of chloroplast

1 2 3 4H P P P kb

wxF - 23.1

-9.4-6.6

-4.4-i--38.0

FIG. 1. Detection of proteins immunologically related to E. coliRecA in Arabidopsis thaliana. (a) Lane 1, protoplast fraction; lane2, chloroplast fraction; lane 3, chloroplast stromal fraction (half theamount of protein loaded in lanes 1 and 2); lane 4, purified E. coliRecA protein. (b) Mitomycin C induction in the steady-state level ofthe chloroplast protein similar to E. coli RecA. This protein is slightlysmaller in pea (39 kDa) than in Arabidopsis. Pea protoplasts wereincubated for 12 hr in the presence of 0, 6, 15, or 30 ,uM mitomycinC (lanes 1-4, respectively), before protein isolation. (c) The sameblot shown in b was reprobed with antiserum to the y subunit of thechloroplast ATP synthetase, a nuclear-encoded chloroplast proteinsimilar in size to RecA. This protein was not induced by DNA-damaging agents and served as a control for the proper loading of thelanes.

-2.3-2.0

-0.6

FIG. 2. Southern blot showing sequences related to Synechococ-cus recA in genomic DNA from Arabidopsis and pea. Lanes 1 and 2,total Arabidopsis DNA (1 Mg); lane 3, total pea DNA (8 pg); lane 4,pea chloroplastDNA (1 Mg). Restriction enzymes: H, HindIll; P, PstI. Size markers are in kilobases (kb).

Plant Biology: Cerutti et al.

Dow

nloa

ded

by g

uest

on

Dec

embe

r 6,

202

0

Page 3: homolog of Escherichia coli RecAproteinin plastids of ... · (a) Lane 1, protoplast fraction; lane 2, chloroplast fraction; lane 3, chloroplast stromalfraction (half the amountofprotein

8070 Plant Biology: Cerutti et a!. Proc. Nat!. Acad. Sci. USA 89 (1992)

a

106 TCT TCT TGC TAC TCC CGCCGC CTC TAT TCT CCG GTT ACC GTC TAC GCC#GCGMSA CTAACTCTCCCACAAAAATC AOTTCTGAA TTCGATGACAM ATC AACGGCSOCT-16 UEJ 2M Iy r AM& o y e rM kl ZA AelaLys Lys Lou Set His Lys IL* Ser Sar Stu Ph. Asp Asp Arg Ito Aan Gly Ala

211 CCTTCCGCOTGTTCCCTCCCGCCCCAAGOTTAGGGAOTAGATGCATACATTATTGTAAW G T C G T20 Lou Ser Pro Asp Ala Asp Sor Arg Ph. Lou Asp Arg Gin Lys Ala Lou Glu Ala Ala Not Asn Asp Ito Asn Sot Seor Pho Sly Lys Gly Sot Vat Thr Arg Lou

55 GLy Sot Ala Sly Gly Ala Lau Vat GLu Thr Ph. Set Sat Sly IL* Lau Thr Lau Asp Lau ALa Lau Gly Sly Gly Lou Pro Lys Sly Arg Vat Vat GLu II. Tyr

90 Sly Pro GLu Sor Sar Gly Lys Thr Thr Lou Ata Lau His Ala IL. ALa Glu Vat Gin Lys Lou Gly Sty Asn ALa Net Lou Val Asp Ala GLu His Ale Phe Asp

526 C C A C A C T O T A T GAATTGAAGGTCCGCAGTATGCGG T C T A C C A G T O O C125 Pro Ala Tyr Ser Lys Ala Lou Gly Val Asp Vat GLu Asn Lou Ito Val Cys Gin Pro Asp Asn Sly GLu Not Ala Lou Gtu Thr Ala Asp Arg Not Cys Atg $or

160 Sly Ala Val Asp Lou II. Cys Val Asp Sat Vat Sat Ala Lou Thr Pro Arg Ala GLu Ito GLu Gly Gtu II. Sly Nat Stn Stn Net Sly Lou S~n Ala Atg Lou

736 ATG AGT CAA OCT CTT CGT AMA ATG TCA GSA AC GCC TCT MAA OCT GGG TGT ACT CTT ATT TTC CTA MAC CAA ATA AGA TAC MSG ATT GOT STS TAC TAT OGG MAT195 Net Sat Gtn Ala Lou Atg Lys Net Sat Sly Asn Ala Sat Lys Ala Gly Cys Thr Lou ieo Ph. Lou Asn Stn II. Atg Tyr Lys IIe* Sly Val Tyr Tyt Sly Asn

841 CCA GAGSGTG ACT AGCGGAGWA ATT GCGSTTAMSAGTTC TTC GCSTCG GTC CGTCTA GAA ATT CGT TCTGOCTOGG AGMATC AMATCT ASCAAASGGSGATGAASAT ATT230 Pro Glu Vat Thr Sat Gly Gly Re. Ala Lau Lys Ph. Ph. Ala Sat Vat Arg Lou GLu II. Arg Sat Ala Sly Lys lIe.-Lys Sat Sat Lys Sly Asp Stu Asp II.

946 GOT CTTCOGGOCT CGT GTA AGAGTG CAGMAG AGCMSAGTT TCA AGACCG TATMAAGCAGCAGAG TTTSGAGATT ATG TTTSGGGSAAGOA OTCAGTMAAACTGSAMTSC265 Sly Lou Arg Ala Atg Vat Atg Vat G~n Lys Sat Lys Vat Sat Atg Pro Tyr Lys SIn Ala GLu Ph. Gtu Ieo Net Pha Sly Stu Sly Val Sat Lys Lou Sly Cys

1051 GTT CTTSGATTOTGOCT GAAATT ATG GMAGTT GTGGTCMSAAMGOT TCC TGGTAC AGC:TAC GAASGAC CMAASGCTC GSGCMSAASAAGASMAGASCA CTG CAGCAC300 Vat Lau Asp Cys Ala Glu Ie Naet Gtu Vat Val Val Lys Lys Sly Sat Ttp Tyr Sat Tyr Gtu Asp Gtn Arg Lou Sly Sin Sly Atg Gtu Lys Ala Lou Stn His

1156 TTA AGOSGMMC CCTOCT CTT CMS AC GMATT GAGAGAMASTGASAGATTG TATG GTTA GAT GGAGM GTT CATCSATAATCACTCCTTTGSATSAGC ASC AGCTCT335 Lou Atg GLu Asn Pro Ala Lou Stn Asp GLu Ito Gtu Lys LWs Val Atg Lou Lou Nat Lou Asp Sly Gtu Vat His Arg Sat Thr Pro Lau Nat Sot Sat Sot Sat

1261 TCC TCG OCT TCA CAT CGC GAA SAA GMA SAA SAA SAC TCG CTT SAC SAT TTC CMA TSA CATMACACTTMAGCTTGAGACTTTCS.GATCTATGGATGTACAACACTT!i!!iiT'iZCATCA370 Sat Set Ala Sat His Arg Glu Glu Glu Glu Glu Asp Sat Lou Asp Asp Ph. S~n*

1381 TGTTCACCTCTTGTTTCMSGTTTC

b -. . -5.. 25. -1 . 5..-.5 .... ...15..._25 .... I...3---.5.... .. .....5.. _._5..... ...Os..5....con FGKG G L A GGGA E GPES GKTTAt: DSQLVLSLKLNPSFTPLSPLFPFTPCSSFSPSLRFSSCYSRRLYSPVTVYAAKKLSHKISSEFDDRINGALSPDADSRFLDRDKALEAAMNDINSS ::~::::.'::SVTRL- SAGG-ALVETFSSG1'L1 .DL. L..i- "LPKGRVW IYV SL::::Sop: MSAISNNP.KE...NLVL.A. ERN:.::::.-:- AIM.- 'D.AQ-MK.A.IP..A .0 -.PR . .Av: MAINTDTSGK. .... TMVL.0.ER-. AIM.. - D.TR-MR.. .1.T. . R..f- DK.EK. :.::.IMKM- .EDVV-EQ. .VIPT.SIALN . R I ...

Bs: mS...A..DM.LKA.EKQ:O:::: .IMK. --EKTD-TR1S.VP..S.AL.T.0: -::Y.F0R.IIV..Am: H......SQERA .IMK. -GKDQVVET. V. TR .. G. . V.:. I R.R iV0g: MSD.KS...A. LAQ.EK. :::::.AINKMD. .QIE-ENL.OvI T.S.5G:.. -:V. RRR. .Mf: MDEN.S...A..LSQ. EKQr' '.Im.N DTDVA.DIQAV.T.S.G .1 -.1 -.R.I.BP: MDDKTSKAAAAEKA...A..LSQ.EKA ".:.IN.Y- DNEVEHDIQOV.T.S.G J.1 V ..AA-.,T V.PC: MTAEKS.---A..LAQ.EKA :':.1`.M-M DGEAAEDIQVV.T.S.G .1 V- ..R...EC: M4AIDENK....A..LGA.EKQO::'- .IM..- EDRS-MD ..-. !.T.S.S .1 A: ..M..:

F- Protein-Protein Interactions ---- IATP ~4yd.4!.105....115.....125...!.1'35 ...1145....j.155 ...'.165... 1.175...j. 185..1~.195.... !.205.... 215.. -125....235.....245..

Con A E ASDY LG L P D DSV ALP EEGAG G ARMESAR FNODREK GPE T GGALKF SVDAt: ALHAI EVOKLGGNAMLVDA :H :F:PA.:SKA 000:00ENIVCtA DNGEMALETADRMCRSGAVO.LICVO ::S V:N:::A .:I ::El: MASH LO: :L A:L KMSGNASKAGCTLI'L...1Y0 159 0NV: S,: I: F LEAsp . ......A.V.OAFI. L: .0T:.A. I L..A....S...I..ALO..A .0:..1 -'::'K''O.0 .-I.MGRS...V.-LG 5.--00. D..Av: .... ..E ..IAPF. .. GO' L::.0T:AS. .1I5. L.AS . T. S.. IV.I .LOVP.A I1001 A::.:V . :::,1-O AHVA:S .IT. 1G0.. .V. '..:L A -.0.T. .:S:' T 0N Y0 D

Bf:. .AA..I.AFI.. FAK..FS Q.IEL.SI10 .A::. .::K. .SHA::NKV . . LTSAV. TRT.C. 1. I.:E .MF:: T: T N:.:: 0. 0Bs:. ... ORTS.-Fl.. -LL .V0 AQK. .OI1.E :LLS. .T..A...I.EALV .....I Av:::.V: K. 'DMD: DASMV . . AIN.SKTrA. I: -':. E .VH. .K T-PI R: 05 .::: 1Am: .1.. .A..K..TCAF ... -L: .S'- ARK-CALDE LISE::-A..A. . 1. .TLV. P. L. - .L.L.NDNA .-.Y.SV..SKTIOIV.: M.-P.-ITNg: C.E.VOAC..N..VCAF.0.V ARK: .K..E:OLDS.1.T..Q... ICOL.. .GI: MO. ':- K.:. ::':D DM.bS..VLT HI K. TLOITL VV N M..MFS T: 0 N 0H0:AO. M-tAt.. L: V:AAK N.0.SD LIDS.T..Q...I1..MLV...S 00 -.K -.DR . T.INOL. N..M .,

Bp: 1.00 -MO...CAFP.... L OSVQ: ASKC` OLTDL IS T..Q... IT.ALV... S. .::::::A: V::K. .HDSLPI . LTATIKRTN.MV. :-.*4A. MF':: .':.:T T:N ~S: DPC: 0.00 L.. T.AFI.. -L:.OVS :AAK: .N.PE: LIDS.'T..Q...IT.ALV...S ~ K. .M:DSLP . LTIROA I.H.MF.0.N 0 Ac:T9 AARE.KTCAFI..- L '. I:ARK::. ..15. LCD. T..Q....ICALA_. 0.0. ADSHK...D AN .A H4- NMLA..LKQSNTL.. .4.H F: TT -N0 Y0

I O~~~~~~~~ucleotide binding -

:.255 .265_...:.275.... 1..285.... !..295_.1_. 305. ...315 .... 1.325 .....:.335 .... 1--345.. ,355,. .-.365 .... 309 .. 3A. DENTCor R K G VOK P IDGGA SWY 55At: SAGKI SSKGDEDI LRARVR`Q: SKVSR :YKQAEFE .MFE I SKLGCVLDCAEIMEVVVKK 5 0Y':EDQRLGA REKALAHLRENPALADEIEKKVRLLMLDGEVHRSTPLMSSSSSSASHREEEEE2SLO.! - :1!Asp: RISOL KG-ESEPEFI..K.K-A N..AP-:R1... D': K.: I.Rv. .M. .L. TG.iTS.R. A: GDN IA::: .DN.OKY-E.D VAA I VO...ENLDMSSMGFGDEH1YTEEE SC.AV: RISYL --..YDEF N.VK.K::A RNAP .FRI ...D I. -K ..TY... LV.L..ETGILLR.: A O.':GDNIS .DN.IKY.E.K.DFAEA.KAQ..EKLDK.A.VSANSVAK.ANEEDEEOVDLDO. AIf: G550. --5.E.0. KOTK.K0(' N. AP, FRK...D .. I.S.EII.LGADLGIIK.S 0.0..K. ..DA.K.CIAD..E.AE.L.GLIFEKLREHKIs: R.EAL --ARDOV1 NKTKIK-: N..AP: PFRTYOD .0Y- E.EII.LGTELDI.S.S .:.E..... N.K.F.K..KDIMLM.QEQl.EHYGLDONNVVQQQAEETSEELEFE. ..5Am: RV.A. DR-OS 000NQ..K V ONLAP P.0V0.D .0- ,..M.ELI.LGVKAN..K.S :A F: NST.I ..N.K.F..D..A...MAA1..GAI-NAALISEALAAVPDLDGTPV.E 956.5.Mg: ROA. --AO. L NEYT ..K :N. AP" FR. ..D: LY I.WE.EL I .IjGKNAI IN.DS. A: .- .NGAK I KDNVRV.K. . .EI S...DA.I.A.NGVEMH.ITEGTQDETDGED.PED 53 0Hf: ROT. -00.... SET..K SO'NNAR` P.L... DLY: ITRE.EIIELAONLKLIE.A: A SK D.E...H.EIAN..DA.I.EHSNLANAAMT-APDEE.DE 55 9

BP: R.A ..0...0 NEDT.. IC .V .N. .AP: F.... D-.0SAI.RE.EII.LAOQAN.-D.S .A 'SGN.l :KDNVREY.K.HKEMAI...N...ENSGrOSRAATF.ASEAEDGE soPt: RDiSl --.0.0.V -NOOKS(V. M... P-: FRE.-I .D. LY -IRA.E I I.L GAAKI .D.A A::::.: NGAK I.KDN.REPF....EIAR... RI.ESLGVVAMPDGAGNEAEAMDEEE 55.

Ec:. RI-AV --E.EN000 DET..K -V :N.1AA: P.0..U:LY .- NFY.ELV.LGVKEKLIE.A :A .KGEKI. .:KAN.TAW.KD. .ETAK....E.L-.DSNPNSYPDFSODD.EGVAETN.DIF 92. %

DNA BindingF- NucLeotide binding 4

FIG. 3. Nucleotide and deduced amino acid sequences of a cDNA encoding the A. thaliana RecA protein. (a) The truncated cDNA containsa continuous open reading fr-ame starting at its 5' end. The stop codon (star) is followed by sequences with 75-80% homology to elementsimplicated in efficient polyadenylylation of plant mRNAs (34) (dotted lines). The amino acid sequence contains a putative chioroplast transitpeptide (underlined residues) with a perfect consensus cleavage site (35) (box). Amino acids are numbered fr-om the predicted start of the matureprotein (arrow). (b) Comparison of the amino acid sequences of Arabidopsis RecA and several eubacterial homologs. Numbering of residuesstarts at the predicted cleavage site (arrow) for the chloroplast transit peptide in the Arabidopsis sequence. Invariant amino acids (row labeledCon) were determined by the alignment of 22 sequences (14, 23) (GenBank release 69 and this work) and are indicated by shaded areas. Dashesrepresent gaps introduced to optimize alignments. Dots indicate residues conserved with respect to the Arabidopsis sequence. Postulatedfunctional domains in E. coli RecA (12) are shown below the sequence. Potential caveats on these assignments have been discussed (12, 14).At, A. thaliana; Ssp, Synechococcus sp.; Av, Anabaena variabilis; Bf, Bacteroides fragilis; Bs, Bacillus subtilis; Am, Aquaspirillummagnetotacticum; Ng, Neisseria gonorrhoeae; Mf, Methylobacillusflagellatum; Bp, Bordetella pertussis; Pc, Pseudomonas cepacia; Ec, E.coli.

Dow

nloa

ded

by g

uest

on

Dec

embe

r 6,

202

0

Page 4: homolog of Escherichia coli RecAproteinin plastids of ... · (a) Lane 1, protoplast fraction; lane 2, chloroplast fraction; lane 3, chloroplast stromalfraction (half the amountofprotein

Proc. NatL. Acad. Sci. USA 89 (1992) 8071

transit peptides (35-37) and strongly support that identity forthe first 51 amino acids of the truncated sequence.The mature protein is predicted to be 387 amino acids long

(Fig. 3a), with a calculated molecular mass of 41.8 kDa. Thisis close to the apparent molecular mass of the faster movingprotein identified in the chloroplast stromal fraction (Fig. la).The amino acid sequence shows 60.7% overall identity withthe Synechococcus protein and 52-57% identity with 20 otherprokaryotic RecA proteins, 10 of which are shown in Fig. 3b.The amino and carboxyl termini are poorly conserved (Fig.3b), although they may have functional significance (12, 14).Interestingly, the carboxyl end is enriched for acidic residuesin almost all species analyzed (Fig. 3b). Although this se-quence is more divergent than any of the eubacterial RecAproteins found to date, predicted functional domains ofthe E.coli protein (12, 14, 16, 18) are largely conserved (Fig. 3b).Amino acids known to cause recombination deficiency whenaltered in E. coli RecA (12, 14, 16, 18) are invariant in theArabidopsis sequence. However, residues affecting prefer-entially co-protease activity (12, 14, 16, 18) and/or causinghyper-recombinogenic phenotypes (12, 14) are not so wellconserved in the Arabidopsis gene. Since this gene is nowlocated in the nucleus of a eukaryote, it is tempting tospeculate that it has acquired a different regulatory systemand it is able to evolve independently of LexA.

Table 1. Codon usage in A. thaliana nuclear genes encodingchloroplast proteins and several eubacterial recA genes

Genes* CGN/AGR ratiot Codon usage distance*EPSP synthase§ 0.62 5.07Tryptophan synthase 0.47 6.28Acetolactate synthase 1.36 6.48Cs gene 0.33 6.42Rubisco activase 0.91 6.88

Averagel 0.74 ± 0.41 6.22 ± 0.68Arabidopsis recA 1.56 6.87Anabaena recA 5.00 10.78Synechococcus recA 8.50 11.39Escherichia recA 0011 15.82

By Spearman's rank correlation coefficient, the CGN/AGR ratioand codon usage distance are correlated with each other (r = 0.88;P < 0.005).*In plant genes (first six listed), sequences corresponding to themature proteins were used for the analysis (i.e., introns andchloroplast transit peptides were excluded). References for theArabidopsis genes are listed in ref. 41. The Cs gene codes for achloroplast protein of unknown function. The eubacterial recAgenes were obtained from GenBank (release 69).tArginine coding ratio, where N = A, C, G, or T, and R = A or G(40).*Determined by a modification ofa published procedure (38). Briefly,synonymous codons differing only in their third nucleotides weregrouped. Termination codons and single codon groups (methionineand tryptophan) were excluded, leaving 21 codon groups with a totalof 59 codons. The relative frequency of different codons in eachgroup was calculated. The overall difference in codon usage be-tween any gene and the Arabidopsis average was computed by adistance algorithm (38):

i-59D(A, X) = j,x(i, A) -x(i, X)|,

where x(i, A) and x(i, I) are the frequencies of the ith codon in geneA and in the average Arabidopsis nuclear gene encoding a chloro-plast protein (determined by pooling the coding sequences of thefive genes shown).§5-Enolpyruvylshikimate-3-phosphate synthase.IThe Arabidopsis average was calculated from the five individualgenes shown (41) and is expressed as the mean + SD. To avoidgiving excessive weight to fluctuations in codon usage for rareamino acids, a problem with short proteins, only genes of similarlength to recA were analyzed.

IIE. coli RecA lacks arginine residues encoded by AGR.

The plastid genome encodes only a small proportion of theproteins needed for functional chloroplasts (7, 32, 33), and itis thought that most genes have been transferred to thenucleus during evolution (11, 38). The base composition andcodon usage of these transferred genes have adjusted toreflect their nuclear localization (38). It has been hypothe-sized that codon usage is genome-specific and provides abasis for species classification comparable to classical sys-tematics (39). We applied several methods used to comparenonhomologous sequences (38-40), to determine the degreeof similarity between the sequenced cDNA and several otherArabidopsis nuclear genes encoding chloroplast proteins.The dinucleotide frequency and base composition (TA/ATratio, and%G + %C) were not significantly different betweenthe genes examined (analyses not shown). However in codonusage distance and the arginine coding ratio, the clonedsequence was indistinguishable from the Arabidopsis genesand clearly different from several eubacterial recA genes(Table 1). These data suggest that the Arabidopsis recA genehas adjusted to reflect its localization in the nuclear genome,clearly diverging from even the more closely related cyano-bacterial homologs.

DISCUSSIONChloroplast DNA recombination has been studied exten-sively, particularly in Chlamydomonas, by genetic analysis(3, 4). There has also been much work suggesting the in-volvement of certain sequence elements in plastid DNArecombination (4, 6, 7). However, very little is known at theenzymatic level. Our finding of a plastid-localized RecAhomolog provides biochemical evidence for a chloroplastrecombination system and strongly supports its relationshipto the eubacterial counterpart. To our knowledge, this is alsothe first observation of a recA homolog in a eukaryote. Whilerecombination/repair enzymes have been identified in vari-ous other eukaryotes (42-46), they have structures andenzymatic characteristics that differ from those of the bac-terial RecA protein.

In view of the known roles for RecA in E. coli, andinduction of the pea enzyme by DNA damage, it is likely thatthe chloroplast enzyme is also concerned with DNA repair.We have not yet determined whether the Arabidopsis genecan complement a recA-deficient strain of E. coli. However,by using a complementation assay, Pang et al. (ref. 47, nextpaper in this issue) isolated a gene from the same ArabidopsiscDNA library, and found that it could increase the survivalof a mutant (phr-, uvrB-, recA-) E. coli strain exposed toUV light. Its DNA sequence is quite different from the oneisolated by hybridization with a recA probe, however. Inview of the complex pathways forDNA repair in bacteria (18,19, 20, 48), it is likely that a number of proteins will be neededto interact with RecA in chloroplasts as well. Further workis needed to define fully the enzymology ofDNA repair andrecombination in chloroplasts.

We thank J. W. Roberts for antibodies againstE. coli RecA, D. A.Bryant for the Synechococcus recA clone, J. T. Mulligan for theArabidopsis cDNA library, and R. L. Last for helpful advice. H.C.was supported by a predoctoral fellowship and P.G. by a postdoc-toral fellowship from the Cornell National Science Foundation PlantScience Center. M.O. was supported by a fellowship from the AfgradFoundation. This work was supported in part by a Hatch grant toA.T.J.

1. Medgyesy, P., Fejes, E. & Maliga, P. (1985) Proc. Nati. Acad.Sci. USA 82, 6960-6964.

2. Gillham, N. W. (1978) Organelle Heredity (Raven, New York),pp. 397-422.

3. Lemieux, B., Turmel, M. & Lemieux, C. (1990) Theor. Appl.Genet. 79, 17-27.

Plant Biology: Cerutti et al.

Dow

nloa

ded

by g

uest

on

Dec

embe

r 6,

202

0

Page 5: homolog of Escherichia coli RecAproteinin plastids of ... · (a) Lane 1, protoplast fraction; lane 2, chloroplast fraction; lane 3, chloroplast stromalfraction (half the amountofprotein

8072 Plant Biology: Cerutti et al.

4. Boynton, J. E., Giliham, N. W., Newman, S. M., & Harris,E. H. (1991) in Cell Organelles: Advances in Plant GeneResearch, ed. Herrmann, R. (Springer, Vienna), Vol 6, pp.1-48.

5. Kindle, K. L., Richards, K. L. & Stem, D. B. (1991) Proc.Natl. Acad. Sci. USA 88, 1721-1725.

6. Howe, C. J., Barker, R. F., Bowman, C. M. & Dyer, T. A.(1988) Curr. Genet. 13, 343-349.

7. Hiratsuka, J., Shimada, H., Whittier, R., Ishibashi, T., Saka-moto, M., Mori, M., Kondo, C., Honji, Y., Sun, C.-R., Meng,B.-Y., Li, Y.-Q., Kanno, A., Nishizawa, Y., Hirai, A., Shi-nozaki, K. & Sugiura, M. (1989) Mol. Gen. Genet. 217, 185-194.

8. Palmer, J. D. (1985) Annu. Rev. Genet. 19, 325-354.9. Palmer, J. D. (1990) Trends Genet. 6, 115-120.

10. Cedergren, R., Gray, M. W., Abel, Y. & Sankoff, D. (1988) J.Mol. Evol. 28, 98-112.

11. Gantt, J. S., Bakdiuf, S. L., Calie, P. J., Weeden, N. F. &Palmer, J. D. (1991) EMBO J. 10, 3073-3078.

12. Kowalczykowski, S. C. (1991) Biochimie 73, 289-304.13. Echols, H. & Goodman, M. F. (1990) Mutat. Res. 236, 301-311.14. Roca, A. I. & Cox, M. M. (1990) CRC Crit. Rev. Biochem.

Mol. Biol. 25, 415-456.15. Sassanfar, M. & Roberts, J. W. (1990) J. Mol. Biol. 212, 79-96.16. Dutreix, M., Moreau, P. L., Bailone, A., Galibert, F., Battista,

J. W., Walker, G. C. & Devoret, R. (1989) J. Bacteriol. 171,2415-2423.

17. Witkin, E. M., Roegner-Manisalco, V., Sweasy, J. B. & Mc-Call, J. 0. (1987) Proc. Nat!. Acad. Sci. USA 84, 6805-6809.

18. Wang, W. B. & Tessman, E. S. (1986) J. Bacteriol. 168,901-910.

19. Walker, G. C. (1985) Annu. Rev. Biochem. 54, 425-457.20. Little, J. W. & Mount, D. W. (1982) Cell 29, 11-22.21. Nivison, H., Fish, L. E. & Jagendorf, A. T. (1986) Methods

Enzymol. 118, 282-295.22. Sambrook, J., Fritsch, E. F. & Maniatis, T. (1989) Molecular

Cloning:A Laboratory Manual (Cold Spring Harbor Lab., ColdSpring Harbor, NY).

23. Murphy, R. C., Gasparich, G. E., Bryant, D. R. & Porter,R. D. (1990) J. Bacteriol. 172, 967-976.

24. Church, G. M. & Gilbert, W. (1984) Proc. Natl. Acad. Sci.USA 81, 1991-1995.

25. Elledge, S. J., Mulligan, J. T., Ramer, S. W., Spottswood, M.& Davis, R. W. (1991) Proc. Natl. Acad. Sci. USA 88, 1731-1735.

26. Csonka, L. N. & Clark, A. J. (1979) Genetics 93, 321-343.27. Devereux, J., Haeberli, P. & Smithies, 0. (1984) Nucleic Acids

Res. 12, 387-395.28. Lehminger-Mertens, R. & Jacobsen, J. H. (1989) In Vitro Cell

Dev. Biol. 25, 571-574.29. Bernstein, H., Byerly, H. C., Hopf, F. A. & Michod, R. E.

(1985) Science 229, 1277-1281.30. Friedberg, E. C. (1991) BioEssays 13, 295-302.31. Tomasz, M., Lipman, R., Chowdary, D., Pawlak, L., Verdine,

G. L. & Nakanishi, K. (1987) Science 235, 1204-1208.32. Shinozaki, K., Ohme, M., Tanaka, M., Wakasugi, T., Haya-

shida, N., Matsubayashi, T., Zaita, N., Chunwongse, J.,Obokata, J., Yamaguchi-Shinozaki, K., Ohto, C., Torazawa,K., Meng, B. Y., Sugita, M., Deno, H., K hira, T.,Yamada, K., Kusuda, J., Takaiwa, F., Kato, A., Tohdoh, N.,Shimada, H. & Sugiura, M. (1986) EMBO J. 5, 2043-2049.

33. Ohyama, K., Fukuzawa, K., Kohchi, T., Shirai, H., Sano, T.,Umesono, K., Shiki, Y., Takeuchi, M., Chang, Z., Aota, S. I.,Inokuchi, H. & Ozeki, H. (1986) Nature (London) 322,572-574.

34. Mogen, B. D., MacDonald, M. H., Graybosch, R. & Hurt,A. G. (1990) Plant Cell 2, 1261-1272.

35. Gavel, Y. & Von Hejne, G. (1990) FEBS Lett. 261, 455-458.36. Von Heijne, G., Steppuhn, J. & Herrmann, R. G. (1989) Eur.

J. Biochem. 180, 535-545.37. Von Heijne, G. & Nishikawa, K. (1991) FEBS Lett. 278, 1-3.38. Oliver, J. L., Manrn, A. & Martinez-Zapater, J. M. (1990)

Nucleic Acids Res. 18, 65-73.39. Grantham, R., Gautier, C. & Gouy, M. (1980) Nucleic Acids

Res. 8, 1893-1912.40. Grantham, R. (1978) FEBS Lett. 95, 1-11.41. Von Heijne, G., Hirai, T., Klosgen, R. B., Steppuhn, J., Bruce,

B., Keegstra, K. & Herrmann, R. (1991) Plant Mol. Biol.Reporter 9, 104-126.

42. Moore, S. P. & Fishel, R. (1990) J. Biol. Chem. 265, 11108-11117.

43. Kmiec, E. B. & Holloman, W. K. (1982) Cell 29, 367-374.44. Kolodner, R., Evans, D. H. & Morrison, P. T. (1987) Proc.

Natl. Acad. Sci. USA 84, 5560-5564.45. Sugino, A., Nitiss, J. & Resnick, M. A. (1988) Proc. Natl.

Acad. Sci. USA 85, 3683-3687.46. Lowenhaupt, K., Sander, M., Hauser, C. & Rich, A. (1989) J.

Biol. Chem. 264, 20568-20575.47. Pang, Q., Hays, J. B. & Rajagopal, I. (1992) Proc. Natl. Acad.

Sci. USA 89, 8073-8077.48. Smith, G. R. (1987) Annu. Rev. Genet. 21, 179-201.

Proc. NatL Acad. Sci. USA 89 (1992)

Dow

nloa

ded

by g

uest

on

Dec

embe

r 6,

202

0