

Journal of General Virology (1994), 75, 3413-3421. Printed in Great Britain

Molecular biology of rotaviruses. IX. Conservation and divergence in genome segment 5

L. Xu,1 Y. Tian,1† O. Tarlow,1 D. Harbour2 and M. A. McCrae1*

1 Department of Biological Sciences, University of Warwick, Coventry and 2 School of Veterinary Science, University of Bristol, Langford, Bristol, U.K.

Nucleotide sequencing of RNA segment 5 from seven strains of group A rotavirus has been carried out to investigate the extent of diversity and conservation, as well as possible selective pressures involved in driving the fixation of sequence changes in this gene. Analyses of the derived sequences revealed that sequence conservation could not be correlated either with rotavirus serotype or the species of origin of the virus strain. These sequences together with other published and unpublished sequences of this gene have raised the total number available for comparison to 17. Alignment of all the available sequences revealed that only 88 amino acid positions (17.6%) in the protein encoded by gene 5 (VP5) are absolutely conserved but that the metal-binding motif reported by others is conserved in all sequences. Despite the high degree of sequence divergence, alignment of secondary structure predictions for VP5 showed a high level of conservation, suggesting that constraints on sequence divergence may operate at the level of overall higher-order structure of the encoded protein.

Introduction

Rotaviruses constitute a genus within the virus family Reoviridae and they therefore have a genome composed of discrete segments of dsRNA (Estes & Cohen, 1989). Studies on these viruses have been stimulated by the fact that they are the predominant cause of acute viral gastroenteritis in the young of a wide range of birds and mammals including humans and all the major species of domestic livestock (Flewett & Woode, 1978; Holmes, 1983). Consequently they are major medical and veterinary pathogens for which completely effective vaccines are urgently required. The fastidious requirements that these viruses have for growth in tissue culture have prompted many groups to pursue a recombinant DNA-based approach to the development of such vaccines (McCrae & McCorquodale, 1987; Andrew et al., 1992). This in turn has led to efforts to maximize our understanding at the molecular level of the virus replication cycle and the role that various viral gene products play in it. Part of this effort has involved the isolation and sequencing of cDNA clones of individual viral RNA species (McCrae & McCorquodale, 1982b; Both et al., 1982; Xu et al., 1990). For genes such as those encoding the virion structural proteins VP4 and VP7, which elicit neutralizing antibody, comparative sequencing of the genes from virus strains of different serotypes has been used to identify regions of variable sequence that are likely to be of epidemiological significance (Estes & Cohen, 1989). In other cases, DNA sequencing of viral genes has been employed in conjunction with comparative database searching to attempt to ascribe a biological function to a particular gene product.

† Deceased.

The new rotavirus gene 5 sequences appearing in this paper have been deposited with the EMBL sequence database and given the following accession numbers: Z12105 (bovine strain B223), Z12106 (human strain Hochi), Z12107 (porcine strain OSU), Z12108 (bovine strain UKtc), Z32534 (human strain St 3), Z32535 (simian strain RRV), Z32552 (human strain 69M).

RNA segment 5 of the rotavirus genome has been shown to encode a protein known variously as VP5, NCVP2, NS53 and NSP1 (McCrae & Faulkner-Valle, 1981; McCrae & McCorquodale, 1982a; Estes & Cohen, 1989). Initially there were conflicting reports as to whether VP5 was a structural or non-structural protein, although more recent results indicate that it is not a component of the virus particle (Brottier et al., 1992). The precise function(s) of VP5 have not been defined, but it has been shown to be synthesized in low amounts from early times in the replication cycle (Johnson & McCrae, 1989) and also detected in pre-core

0001-2769 © 1994 SGM


[Fig. 1. Alignment panel: VP5 amino acid sequences of strains RF, UKtc, B223, 69M, DS1, 4F, 4S, Hochi, Wa, IGV803, OSU, St 3, EHP, EW, SA11, SA11P and RRV, shown in blocks covering alignment positions 1 to 499, with asterisks beneath positions conserved in all sequences.]

Fig. 1. Multiple sequence alignment of the 17 full length VP5 amino acid sequences currently available. Sequence alignment was carried out using the Pileup program from the GCG sequence analysis package. Conserved cysteine residues are boxed and shaded and conserved prolines are boxed. Amino acids conserved in all 17 sequences are denoted by an asterisk under the appropriate position in the alignment. The sequences used to generate this and subsequent figures have all been submitted to the EMBL database by ourselves


complexes (Patton, 1986; Patton & Gallegos, 1988; Gallegos & Patton, 1989), suggesting that it is probably involved in early events in the replicative process. It can be speculated that any such role of VP5 in the early events of virus replication might involve its interaction with species-specific host cell proteins and that evolution to maximize the quality of any such interactions could provide the selective pressure driving sequence divergence in gene 5. VP5 may be involved in the selection of the correct complement of RNA segments for incorporation into progeny virions. Consistent with this possibility, when the first gene 5 sequence was determined it was found to contain two potential zinc finger regions which in other genes are known to be involved in nucleic acid-binding (Bremont et al., 1987). Also, when the gene was subsequently expressed using a recombinant baculovirus, VP5 was found to possess both zinc- and RNA-binding activity (Brottier et al., 1992). The first gene 5 sequence determined came from the bovine RF strain of rotavirus and, consistent with this gene encoding a non-structural protein, the corresponding gene sequence from the UKtc bovine strain showed a very high level of sequence conservation (Y. Tian & M. McCrae, unpublished observation). It was therefore a considerable surprise when the gene 5 sequence from the simian rotavirus SA11 strain revealed a very high level of sequence divergence (Mitchell & Both, 1990). Despite the conservation of the generalized metal-binding domain found between amino acids 37 and 81, the overall level of homology was only in the region of 50% at the nucleotide level and less than 40% at the amino acid level (Mitchell & Both, 1990). This surprisingly high level of sequence divergence greatly exceeded that found for all other rotavirus genes, including those encoding the neutralization antigens of the virion, which are generally found to exhibit the highest levels of sequence diversity due to the selective pressure of the host's immune response. In an attempt to define more clearly the extent of sequence conservation and divergence in gene 5 we have extended our analysis to include seven additional virus isolates.

Methods

Virus strains and propagation. The following strains of virus were used in this study: the UKtc (G serotype 6) and B223 (G10) strains of bovine rotavirus; the St 3 (St Thomas 3; G4), Hochi (G4) and 69M (G8) strains of human rotavirus; the rhesus rotavirus (RRV) strain (G3) of simian virus and the OSU strain (G5) of porcine rotavirus. In all cases virus stocks were propagated at low m.o.i. in the BSC-1 line of African green monkey kidney cells as previously described (McCrae & Faulkner-Valle, 1981).

cDNA cloning. The sequence of the UKtc gene was determined using cDNA clones. These were generated using the original cloning strategy developed for use on dsRNAs (McCrae & McCorquodale, 1982a).

PCR amplification and DNA sequencing. Sequence determination for gene 5 of the B223, St 3, Hochi, 69M, RRV and OSU strains was done by direct sequencing of PCR-amplified DNA. Full length cDNA was generated from viral RNA extracted from infected cells in a combined reverse transcription PCR amplification carried out using terminal oligonucleotide primers as previously described (Xu et al., 1990). The sequences of the primers used were taken from the sequence of the UKtc virus strain and were as follows: 5'-terminal primer, CCCGGGATCCATGGCCGGCTTTTTTTATGA; 3'-terminal primer, CGATCGCGAATTCTGCAGGTCACATTTTAT. Sequencing of the amplified DNA was carried out as described previously (Xu et al., 1991) using the two terminal primers to establish the sequence of the near terminal region. Then, using the determined sequence as a basis, further appropriately spaced sequencing primers were synthesized and used to extend and complete the sequence. Both DNA strands were completely sequenced and all primer positions were sequenced through. It should be noted that one disadvantage of the strategy employed is that it dictates that the 5'-terminal 18 nucleotides and 3'-terminal 13 nucleotides of each virus strain will have the sequence of the primers used for PCR (underlined above) and so possible sequence divergence in these small regions will not be revealed. However, previous studies (McCrae & McCorquodale, 1983; Clarke & McCrae, 1983) have shown that these regions show little or no sequence variation within group A rotaviruses.
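The consequence of primer-imposed termini can be illustrated with a short sketch. It assumes, since the underlining has not survived in this copy, that the virus-specific part of each primer is simply its 3'-most 18 (5' primer) or 13 (3' primer) nucleotides; the function names are hypothetical and chosen only for this illustration.

```python
# Primer sequences as printed in the Methods, written 5' -> 3'.
PRIMER_5P = "CCCGGGATCCATGGCCGGCTTTTTTTATGA"
PRIMER_3P = "CGATCGCGAATTCTGCAGGTCACATTTTAT"

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def primer_derived_termini(primer_5p: str, primer_3p: str, n5: int = 18, n3: int = 13):
    """Return the plus-sense terminal sequences that the primers impose.

    ASSUMPTION: the virus-specific portion of each primer is its 3'-most
    n5 or n3 nucleotides; the remainder is treated as added linker sequence.
    """
    five_prime_end = primer_5p[-n5:]
    # The 3' primer anneals to the plus strand, so its reverse complement
    # gives the plus-sense sequence at the 3' end of the gene.
    three_prime_end = reverse_complement(primer_3p[-n3:])
    return five_prime_end, three_prime_end


if __name__ == "__main__":
    five, three = primer_derived_termini(PRIMER_5P, PRIMER_3P)
    print("5'-terminal nucleotides fixed by primer:", five)
    print("3'-terminal nucleotides fixed by primer:", three)
```

Any strain-specific variation inside these two windows would be overwritten by the primer sequence in the amplified product, which is the limitation noted in the text.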

Computing. Computer analysis of the sequence information was carried out using the University of Wisconsin Genetics Computer Group (GCG) analysis package (Devereux et al., 1984) and on the suite of programs available on the SEQNET facility at Daresbury, U.K.

Results

Sequence of gene 5 from the UKtc bovine rotavirus strain

Sequencing of this gene was carried out on two independent cDNA clones isolated as described in Methods and was completed shortly before the publication of the gene 5 sequence from the RF strain of bovine rotavirus (Bremont et al., 1987). The sequence (data not shown) was 1579 nucleotides in length and contained a single long open reading frame extending from nucleotide 32 to 1505, giving a protein (VP5) of 491 amino acids (Fig. 1). Comparison of the UKtc sequence with that of the RF strain revealed a very high level of sequence conservation of 88.7% at the nucleotide level and 95.5% at the amino acid level (Fig. 1). This high level of sequence conservation is entirely consistent with the bulk of the experimental evidence, which indicates

or others and have the following accession numbers: M22308 (rotavirus RF strain), Z12108 (UKtc), Z12105 (B223), Z32552 (69M), L18945 (DS1), L29183 (4F), L29185 (4S), Z12106 (Hochi), L18943 (Wa), X59297 (IGV803), Z12107 (OSU), Z32534 (St 3), U08423 (EHP), U08428 (EW), X14914 (SA11), L18944 (SA11P), Z32535 (RRV).


Table 1. Levels of nucleotide and amino acid conservation across various gene 5 and VP5 sequences*

Values above the diagonal give nucleotide conservation (%); values below the diagonal give amino acid conservation (%).

            RF   UKtc   B223    69M    DS1     4F     4S  Hochi     Wa IGV803    OSU   St 3    EHP     EW   SA11  SA11P    RRV
RF         100   88.7     74   65.8   65.9   66.8   66.9   65.7   64.8   64.6   65.6     66   52.3   52.3   51.7   52.4   52.8
UKtc      95.5    100   75.1   66.5   66.3   67.3   67.2   66.7   66.2   66.5     66   65.6   52.6   53.9   53.8   53.7   53.6
B223      71.5   71.3    100   66.9   65.8   67.1     67   67.8   66.8     67   65.6   67.4   50.9   50.9   54.7   53.9   54.4
69M       57.9   57.9   56.5    100     93   75.5   75.4   75.3   75.3   75.8   74.4   75.5   51.6     52   55.9   56.2   53.6
DS1       58.1   58.1   58.3   91.8    100   75.8   75.7   75.3   74.6   76.1   74.3   75.9   51.1   51.1   56.4   55.7   54.2
4F        59.3   59.1   58.5   68.8   69.4    100   99.9   81.3   81.1     82   81.7   82.3   51.9   52.7   54.8   54.9   53.4
4S        59.3   59.1   58.5   68.8   69.4    100    100   81.4   81.1   82.1   81.8   82.4   51.9   52.8   54.9   54.9   53.4
Hochi     55.9   56.1   56.5   66.5   64.9   78.2   78.2    100     98   91.9   84.5   84.9     53   52.8   54.4   54.4   53.2
Wa        57.7   57.7   58.3   68.4   66.7   80.9   80.9   95.9    100   91.9   84.1   84.7   52.6   52.3   55.4   55.2   53.4
IGV803    57.3   56.9   58.7   67.4   68.8   79.7   79.7   88.1   90.6    100   84.7   84.9   52.3   51.1   55.7   54.4   53.6
OSU       59.3   59.1   59.8   66.5   67.4   82.1   82.1   82.8     85   84.8    100   84.9   51.3     52   54.7   54.4     53
St 3      58.9   58.7   58.9     68   68.8   82.3   82.3   81.3     84   84.2   85.4    100   52.5   51.4   54.6   54.4   52.7
EHP       38.2     39   38.1   37.3   37.4   39.8   39.8   40.1     41   40.2   40.5   38.5    100   91.3   55.2   54.8   58.3
EW        39.1   40.1   31.1   37.4   39.5   40.7   40.7   40.7   42.6   40.7     41   37.7   91.9    100   55.6   55.5     59
SA11      38.9   38.7   37.9   36.2     38   36.4   36.4   34.9   36.2   36.8   36.9   38.5   44.4   44.8    100   98.9   64.5
SA11P     38.1   37.1   37.3     37   37.6   36.2   36.2   34.3   35.3     36   36.1   37.8   43.4   43.8   97.8    100   64.2
RRV         38   37.8   37.1   37.3   37.7   38.6   38.6   38.7   39.8   37.9     38   40.1   49.6   50.6   57.4   56.8    100

* The values given were obtained using the GAP program of the GCG software package.
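Because Table 1 packs two measures into one square matrix, with nucleotide conservation in the upper-right triangle and amino acid conservation in the lower-left, a value has to be read from the correct side of the diagonal. The sketch below shows one way to do this for the three bovine strains only; the `conservation` helper and its argument names are hypothetical, and the numbers are the RF, UKtc and B223 entries of Table 1.

```python
# Excerpt of Table 1 for the three bovine strains, in table order.
STRAINS = ["RF", "UKtc", "B223"]

# Square matrix exactly as laid out in the table:
# entries above the diagonal are nucleotide conservation (%),
# entries below the diagonal are amino acid conservation (%).
MATRIX = [
    [100.0, 88.7, 74.0],
    [95.5, 100.0, 75.1],
    [71.5, 71.3, 100.0],
]


def conservation(strain_a: str, strain_b: str, level: str) -> float:
    """Look up percent conservation between two strains.

    level is "nucleotide" (upper triangle) or "amino_acid" (lower triangle).
    The order of the two strain names does not matter.
    """
    if level not in ("nucleotide", "amino_acid"):
        raise ValueError("level must be 'nucleotide' or 'amino_acid'")
    i, j = STRAINS.index(strain_a), STRAINS.index(strain_b)
    low, high = min(i, j), max(i, j)
    if level == "nucleotide":
        return MATRIX[low][high]   # row index < column index
    return MATRIX[high][low]       # row index > column index


if __name__ == "__main__":
    print(conservation("RF", "UKtc", "nucleotide"))
    print(conservation("RF", "UKtc", "amino_acid"))
```

The same indexing rule extends unchanged to the full 17-strain matrix.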

that the protein product of gene 5 is a non-structural protein.

Conservation of gene 5 sequence according to species of origin?

The publication of the gene 5 sequence from the simian virus strain SA11 (Mitchell & Both, 1990) revealed a very high level of sequence divergence from the bovine sequences (approximately 50% conservation at the nucleotide level and 39% at the amino acid level; Table 1). If the protein product of gene 5 is indeed a non-structural protein then it is difficult to see how selective pressure from the host's immune system could have provided the selective force to drive such high levels of sequence divergence, particularly when the major neutralization antigen (VP7) of these viruses diverges only to the extent of 16% at the amino acid level (i.e. 84% conservation). Owing to the involvement of VP5 in the replication cycle (see above) it would be expected that viruses originating from the same host species would have very similar gene 5 sequences, as is the case for the bovine UKtc and RF strains. To test this hypothesis the gene 5 sequences from a number of virus strains isolated from different species were determined. These were those of a second simian strain, RRV, the human strains 69M, Hochi and St 3, the porcine strain OSU and finally a third bovine strain, B223. A variety of pairwise comparisons could be made from these data but it was clear that viruses originating in different species, such as the porcine rotavirus strain OSU and human strain St 3, could show a level of conservation similar to that exhibited by the two bovine strains RF and UKtc (Fig. 1 and Table 1). By contrast, two virus isolates from the same species, SA11 and RRV, exhibited amongst the highest levels of sequence divergence (Fig. 1 and Table 1). These two comparisons also demonstrated that the sequence of gene 5 and hence its protein product could not be correlated with virus G serotype since the two simian rotaviruses are both members of G3 whereas the porcine OSU strain and the human St 3 strain belong to G5 and G4, respectively.

Overall sequence diversity in gene 5 and VP5 of group A rotaviruses

During the course of this study, Hua and coworkers published three new gene 5 sequences, those from the human Wa and DS1 strains and that from their laboratory variant of the simian rotavirus SA11 (Hua et al., 1993). These sequences together with the two published earlier, two porcine rotavirus sequences available to us from another study (Burke et al., 1994) and three unpublished sequences that have been submitted to the DNA databases, when added to the seven new sequences presented here, raised the number of full length group A rotavirus gene 5 sequences available for comparison to 17. Simple pairwise comparison of the sequences at the nucleotide and amino acid level revealed a wide range of diversity (Table 1). Thus, at the nucleotide level conservation ranged from as high as 99.9% (porcine isolates 4F and 4S) to as low as 51.7% (bovine strain RF and simian strain SA11) whereas at the amino acid level the range was 100% (porcine isolates 4F and 4S) down to 34.3% (human strain Wa and simian strain SA11P).

The Pileup algorithm of the GCG sequence analysis package was used to construct a multiple sequence


[Fig. 2 plot. x-axis: amino acid position (0 to 500); y-axis scale: 0.0 to 1.0.]

Fig. 2. Graphical representation of amino acid conservation across the sequence of the VP5 protein. The distribution of conservation was calculated using the Plotsimilarity program from the GCG sequence analysis package using a window size of 10.

[Fig. 3 tree. Isolate labels legible in the figure: SA11, SA11P, RRV, EW, EHP, RF, B223, DS1, 69M, St 3, OSU, IGV803, Wa and Hochi, with PAM distances marked on the branches.]

Fig. 3. Phylogenetic tree of rotavirus isolates constructed using VP5 sequences. A phylogenetic tree was constructed using the Phylotree program which can be accessed by e-mail ([email protected]) at the Computational Biochemistry Research Group (Zurich, Switzerland). The values given next to the lines indicate PAM distances between branch points on the tree. This is a measure of phylogenetic distance between the virus isolates.

alignment of the protein products of the different gene 5 sequences to allow analysis of the nature and distribution of their conserved features. This alignment (Fig. 1) revealed that 88 amino acid positions (17.6%) were absolutely conserved, including cysteines at residues 8, 44, 47, 48, 56, 59, 65, 68, 74, 327 and 330, and prolines at positions 80, 109, 142, 174, 201, 310, 403, 436 and 465 in the alignment. The conserved metal-binding motif noted by others (Bremont et al., 1987; Mitchell & Both, 1990; Hua et al., 1993) is also evident. In this larger data set it extended from position 39 in the alignment to position 83 and showed 59% absolute sequence conservation, including six cysteines, a histidine (63) and a proline (80). To view the distribution of the conservation, the Plotsimilarity algorithm of the GCG analysis package was used to give a graphical display of the extent of conservation along the protein chain (Fig. 2). This revealed that overall the amino-terminal half of the protein shows a much higher level of similarity than the carboxy-terminal half (Fig. 2), reflecting the fact that 71 of the 88 absolutely conserved positions fall in the amino-terminal half of the protein. Although the carboxy-terminal portion of the protein shows much lower levels of similarity, three peaks covering regions of greater than 50% similarity are evident, indicating regions that may be of functional importance (Fig. 2). The data shown in Fig. 1 were also used to construct an unrooted phylogenetic tree to give an indication of the phylogenetic relationships between the various VP5 sequences. The result revealed several layers of phylogenetic division, the first of which produced two groups: one containing the simian and murine rotavirus isolates and the second containing everything else (Fig. 3). Within the division containing the human, bovine and porcine isolates further subdivisions were evident. Thus, for example, the human Wa and Hochi isolates, which fall into serotypes G1 and G4, respectively, are phylogenetically closer than Hochi and St 3, which are both of serotype G4; also, the human St 3 isolate is phylogenetically closer to the porcine OSU isolate than it is to other human virus strains (Fig. 3).
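The two measurements described above, the count of absolutely conserved alignment positions and the profile of conservation along the chain, can be sketched as follows. This is a minimal illustration on a made-up toy alignment, not the GCG Plotsimilarity algorithm: the window score here is simply the fraction of absolutely conserved columns, whereas Plotsimilarity averages similarity scores. The function names and the toy sequences are hypothetical.

```python
from typing import List

GAP = "-"  # assumed gap character in the aligned sequences


def conserved_positions(alignment: List[str]) -> List[int]:
    """Return 1-based columns where every sequence carries the same residue."""
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    conserved = []
    for i in range(length):
        column = {seq[i] for seq in alignment}
        # A column counts only if it holds a single residue type and no gap.
        if len(column) == 1 and GAP not in column:
            conserved.append(i + 1)
    return conserved


def window_conservation(alignment: List[str], window: int) -> List[float]:
    """Fraction of absolutely conserved columns in each sliding window."""
    length = len(alignment[0])
    if not 1 <= window <= length:
        raise ValueError("window must be between 1 and the alignment length")
    hits = set(conserved_positions(alignment))
    flags = [1 if (i + 1) in hits else 0 for i in range(length)]
    return [sum(flags[s:s + window]) / window for s in range(length - window + 1)]


if __name__ == "__main__":
    toy = ["MCKHCLLPAT", "MCRHCLIPST", "MCKHCMLP-T"]  # hypothetical data
    positions = conserved_positions(toy)
    print(positions, f"{100 * len(positions) / len(toy[0]):.1f}% conserved")
    print(window_conservation(toy, window=5))
```

Applied to a real alignment, peaks in the window profile would correspond to the regions of higher similarity that a plot such as Fig. 2 displays.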

The high level of proline conservation seen in Fig. 1 suggested that despite extensive sequence divergence at the primary sequence level, particularly in the carboxy-terminal half of the protein, there might be more extensive conservation at the structural level. This possibility was investigated using the algorithm developed by Rost and coworkers (Rost & Sander, 1993; Rost et al., 1994) to obtain a predicted secondary structure for each of the 17 VP5 sequences available. These were then aligned using the Pileup algorithm from the GCG package. The results (Fig. 4) confirmed that despite the extensive changes at the primary sequence level there was a remarkably high level of conservation in


Fig. 4. [Alignment of the predicted secondary structure strings (symbols H, E, L, '-' and '.') for the 17 VP5 sequences RF, UKtc, B223, 69M, DS1, 4F, 4S, Hochi, Wa, IGV803, OSU, St 3, EHP, EW, SA11, SA11P and RRV, covering alignment positions 1 to 499.] For legend see the Fig. 4 caption further below.


the distribution of predicted α-helix, loop and β-sheet regions.
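One simple way to put a number on this kind of structural conservation is to ask, for each column of the aligned prediction strings, what fraction of the confidently predicted states agree. The sketch below assumes the symbol conventions of the Fig. 4 legend (H, E and L for helix, sheet and loop; '-' for no confident prediction; '.' for an alignment gap). The scoring scheme, function names and toy strings are illustrative assumptions, not part of the analysis reported here.

```python
from collections import Counter
from typing import List, Optional

UNSCORED = {"-", "."}  # '-' = no confident prediction, '.' = alignment gap


def column_agreement(predictions: List[str]) -> List[Optional[float]]:
    """Per column, fraction of scored sequences sharing the commonest state.

    Columns in which no sequence has a confident prediction yield None.
    """
    length = len(predictions[0])
    if any(len(p) != length for p in predictions):
        raise ValueError("all aligned predictions must have the same length")
    scores: List[Optional[float]] = []
    for i in range(length):
        states = [p[i] for p in predictions if p[i] not in UNSCORED]
        if not states:
            scores.append(None)
            continue
        top_count = Counter(states).most_common(1)[0][1]
        scores.append(top_count / len(states))
    return scores


def mean_agreement(predictions: List[str]) -> float:
    """Average agreement over the columns that could be scored."""
    scored = [s for s in column_agreement(predictions) if s is not None]
    return sum(scored) / len(scored) if scored else 0.0


if __name__ == "__main__":
    toy = ["L--HHHHE.", "L-HHHHLE.", "LLHHHH-E-"]  # hypothetical strings
    print(column_agreement(toy))
    print(mean_agreement(toy))
```

Comparing such a score with the fraction of absolutely conserved residues in the same alignment gives a direct, if crude, contrast between conservation at the structural and primary sequence levels.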

Discussion

This study has confirmed and greatly expanded earlier reports on sequence analysis of gene 5 from the group A rotaviruses (Pedley et al., 1983; Bremont et al., 1987; Mitchell & Both, 1990; Hua et al., 1993). The picture that has emerged is one of a protein which overall shows a higher level of sequence divergence than that encoded by any other rotavirus gene, including those encoding the neutralization antigens, which generally would be expected to exhibit the highest level of variation as a result of the selective pressure exerted by the host's immune system. Thus, for example, in comparing the VP5 sequences of the bovine UKtc (G6) and B223 (G10) strains the two proteins diverge by 29.5% (Table 1), whereas the corresponding VP7 sequences only show a difference of 17.7% at the amino acid level (Xu et al., 1991). Whilst the apparent absence of error repair systems in the activities of RNA-dependent RNA polymerases would provide a mechanism for generating high levels of mutational change within rotavirus genes, some selective pressure is needed to cause the fixation of such changes in virus populations. From the limited dataset available to them, Hua et al. (1993) concluded that the only variable that appeared to correlate with VP5 sequence was the G serotype of the virus, leading them to speculate that variation within the two proteins (VP5 and VP7) may be linked as the result of some undefined interaction between them. This speculative idea was an attractive one as it provided a selective force, the humoral arm of the immune system, that could account for the fixation of changes in a virus gene encoding a non-structural protein. However, the expanded dataset reported here does not support this idea. Thus, for example, the human rotavirus strain Hochi (G4) shows a considerably higher level of conservation (95.9%) with the human Wa (G1) isolate, which has a different G serotype, than it does with the human St 3 strain (81.3%), which has the same serotype.
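The pairwise comparisons used in this argument can be sketched as follows. The first function computes percent identity between two aligned sequences; how gapped positions were treated in the figures quoted above is not stated, so skipping them is an assumption. The second part simply restates the serotype argument using the two conservation values and the serotype assignments given in the text. All function and variable names are hypothetical.

```python
from typing import Dict, Tuple


def percent_identity(a: str, b: str, gap: str = "-") -> float:
    """Percent identical residues over columns where neither sequence is gapped."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have the same length")
    compared = identical = 0
    for x, y in zip(a, b):
        if x == gap or y == gap:
            continue  # assumption: gapped columns are ignored
        compared += 1
        identical += x == y
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * identical / compared


def percent_divergence(a: str, b: str) -> float:
    """Complement of percent identity."""
    return 100.0 - percent_identity(a, b)


# Values quoted in the text: VP5 conservation (%) and G serotypes.
reported: Dict[Tuple[str, str], float] = {
    ("Hochi", "Wa"): 95.9,
    ("Hochi", "St 3"): 81.3,
}
serotype = {"Hochi": "G4", "Wa": "G1", "St 3": "G4"}

if __name__ == "__main__":
    print(percent_identity("MCKHCLLPAT", "MCRHCLIPST"))  # toy sequences
    for (s1, s2), value in reported.items():
        same = serotype[s1] == serotype[s2]
        print(f"{s1} vs {s2}: {value}% conserved, same G serotype: {same}")
```

If VP5 sequence tracked G serotype, the same-serotype pair would be expected to show the higher value; the quoted figures show the opposite.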
It is possible that the cell-mediated arm of the immune response is acting as the selective force for fixing mutational changes, as has been shown to be the case in other virus systems (Phillips et al., 1991). However, studies carried out on the antigen specificity of the cytotoxic T cell (CTL) response against rotaviruses have shown that VP5 does not appear to be an important target for CTL activity (Offit et al., 1994). One hypothesis that was tenable from comparison of the first three gene 5 sequences to be determined, the bovine RF and UKtc strains (Bremont et al., 1987; Y. Tian & M. A. McCrae, unpublished observations) and the simian SA11 isolate (Mitchell & Both, 1990), was that the selective force might be provided through the need for VP5 to interact with species-specific cellular proteins to achieve efficient virus replication. In part, the choice of isolates from which to sequence gene 5 in the current study was made with the objective of testing this hypothesis. The conclusion that can be reached from the enlarged dataset is that whilst such interactions may well be important to the function of VP5, they do not appear to provide the major selective force for divergence, since variation within isolates from the same species (for example SA11 and RRV) can be greater than in those from different species (OSU versus St 3). Recent studies investigating the viral gene(s) responsible for controlling virulence of rotaviruses in a mouse model using virus reassortants have found that the only viral gene that could be correlated with pathogenic phenotype was gene 5 (Broome et al., 1993). Unfortunately, the relative virulence characteristics of the virus isolates used in the current study have not been assessed in this model of viral pathogenesis, although selection of virulence phenotype could possibly provide the selective force necessary to account for the high level of sequence divergence observed. This is an area where further work with viruses of defined pathogenic phenotype is certainly merited.

Fig. 4. Conservation of predicted secondary structure across VP5. A secondary structure prediction for each of the VP5 sequences was obtained using the PredictProtein mail server operated by EMBL. This server uses the algorithms of Rost and coworkers (Rost & Sander, 1993; Rost et al., 1994). The prediction files were then multiply aligned using the Pileup program of the GCG sequence analysis package. E indicates positions predicted to have β-sheet structure, H denotes positions predicted as being in an α-helix and L denotes positions predicted to be in loops. Positions at which no prediction with an acceptable level of confidence could be made are indicated by '-' and gaps introduced to optimize the alignment by '.'.

Despite the high levels of divergence observed, some features were conserved. Thus the first postulated metal-binding motif noted by Bremont et al. (1987) is conserved across all of the isolates sequenced. Indeed the region in which it lies, namely amino acids 39 to 83, showed 59% sequence identity and contained no fewer than 26 (29.5%) of the 88 absolutely conserved residues, implying a functional importance for this region. Furthermore, in a non-defective rotavirus variant with a deletion in gene 5 that also introduced a frameshift into the gene, the one-third length amino-terminal fragment of VP5 that would be produced included this conserved motif (Tian et al., 1993). This observation indicated that only the amino-terminal third of the protein is required for virus replication in vitro and strengthened the suggestion that the cysteine-rich conserved motif may be functionally important (Tian et al., 1993). The confirmation that VP5 exhibits both zinc- and RNA-binding capacity (Brottier et al., 1992) has provided further evidence for a functional role for this region of the protein, but the exact molecular mechanism of its role in virus replication still remains to be determined. Recent work using deletion mutants of gene 5 has shown that the RNA-binding domain of VP5 is encompassed within the first 81 amino acids of the protein and that the cysteine-rich motif is essential for RNA-binding activity (Hua et al., 1994). This latter study also mapped the region determining the subcellular localization of the protein to the cytoskeletal matrix of the cytosol as lying between amino acids 84 and 176 (Hua & Patton, 1994; Hua et al., 1994).

An interesting observation that has emerged from the analysis carried out in this study concerns conservation of predicted secondary structure. This showed a much higher level of conservation than was evident at the primary structure level. The implication of this is that although the primary sequence is free to diverge to a considerable degree, underlying the variation is a selective pressure to maintain the higher-order structure of the protein, consistent with it having an important role in the replicative cycle of the virus.

The large dataset of gene 5 sequences used in the present study has defined the regions of conservation and divergence within this rotavirus gene. It will require more extensive mutagenesis studies to confirm and dissect the functional importance and role of the various regions identified. Studies of this type are already underway in this and other laboratories.

This work was supported by grants from the AFRC and MRC. Y.T. was in receipt of a fellowship from the Henry Lester Trust during part of this work.

References

ANDREW, M. E., BOYLE, D. B., COUPAR, B. E. H., REDDY, D., BELLAMY, A. R. & BOTH, G. W. (1992). Vaccinia rotavirus VP7 recombinants protect mice against rotavirus induced diarrhea. Vaccine 10, 185-191.

BOTH, G. W., BELLAMY, A. R., STREET, J. E. & SEIGMAN, L. J. (1982). A general strategy for cloning double stranded RNA: nucleotide sequence of the simian 11 rotavirus gene 8. Nucleic Acids Research 10, 7075-7088.

BREMONT, M., CHARPILIENNE, A., CHABANNE, D. & COHEN, J. (1987). Nucleotide sequence and expression in Escherichia coli of the gene encoding the nonstructural protein NCVP2 of bovine rotavirus. Virology 161, 138-144.

BROOME, R. L., Vo, P. T., WARD, R. L., CLARK, H. F. & GREENBERG, H. B. (1993). Murine rotavirus genes encoding outer capsid proteins VP4 and VP7 are not the major determinants of host range restriction and virulence. Journal of Virology 67, 2448-2455.

BROTTIER, P., NANDI, P., BREMONT, M. & COHEN, J. (1992). Bovine rotavirus segment 5 protein expressed in the baculovirus system interacts with zinc and RNA. Journal of General Virology 73, 1931-1938.

BURKE, B., MCCRAE, M. A. & DESSELBERGER, U. (1994). Sequence analysis of two porcine rotaviruses differing in growth in vitro and in pathogenicity: distinct VP4 sequences and conservation of NS53, VP6 and VP7 genes. Journal of General Virology 75, 2205-2212.

CLARKE, I. N. & MCCRAE, M.A. (1983). The molecular biology of rotaviruses. VI. RNA species-specific terminal conservation in rotaviruses. Journal of General Virology 64, 1877-1884.

DEVEREUX, J., HAEBERLI, P. & SMITHIES, O. (1984). A comprehensive set of sequence analysis programs for the VAX. Nucleic Acids Research 12, 387-395.

ESTES, M.K. & COHEN, J. (1989). Rotavirus gene structure and function. Microbiology Reviews 53, 410-449.

FLEWETT, T. H. & WOODE, G. N. (1978). Rotaviruses. Brief review. Archives of Virology 57, 1-25.

GALLEGOS, C. O. & PATTON, J. T. (1989). Characterization of rotavirus replication intermediates. A model for the assembly of single shelled particles. Virology 172, 616-627.

HOLMES, I. H. (1983). Rotaviruses. In The Reoviridae, pp. 359-423. (Edited by W. K. Joklik). New York: Plenum Press.

HUA, J. & PATTON, J. T. (1994). The carboxyl-half of the rotavirus nonstructural protein NS53 (NSP1) is not required for virus replication. Virology 198, 567-576.

HUA, J., MANSELL, E. A. & PATTON, J. T. (1993). Comparative analysis of the rotavirus NS53 gene: conservation of basic and cysteine rich regions in the protein and possible stem loop structures in the RNA. Virology 196, 372-378.

HUA, J., CHEN, X. & PATTON, J. T. (1994). Deletion mapping of the rotavirus metalloprotein NS53 (NSP1): the conserved cysteine rich region is essential for virus specific RNA binding. Journal of Virology 68, 3990-4000.

JOHNSON, M. & MCCRAE, M. A. (1989). Molecular biology of rotaviruses. VIII. Quantitative analysis of the regulation of gene expression during virus replication. Journal of Virology 63, 2048-2055.

MCCRAE, M. A. & FAULKNER-VALLE, G. P. (1981). Molecular biology of rotaviruses. I. Characterization of the basic growth parameters and pattern of macromolecular synthesis. Journal of Virology 39, 490-496.

MCCRAE, M. A. & MCCORQUODALE, J. G. (1982a). Molecular biology of rotaviruses. II. Identification of the protein coding assignments of calf rotavirus genome RNA species. Virology 117, 435-443.

MCCRAE, M. A. & MCCORQUODALE, J. G. (1982b). Molecular biology of rotaviruses. IV. Molecular cloning of the bovine rotavirus genome. Journal of Virology 44, 1076-1079.

MCCRAE, M. A. & MCCORQUODALE, J. G. (1983). Molecular biology of rotaviruses. V. Terminal structure of viral RNA species. Virology 126, 204-212.

MCCRAE, M. A. & MCCORQUODALE, J. G. (1987). Expression of rotavirus proteins in E. coli. Gene 55, 9-18.

MITCHELL, D. B. & BOTH, G. W. (1990). Conservation of a potential metal binding motif despite extensive sequence diversity in the rotavirus nonstructural protein NS53. Virology 174, 618-621.

OFFIT, P. A., COUPAR, B. E. H., SVOBODA, Y. M., JENKINS, R. J., MCCRAE, M. A., ABRAHAM, A., HILL, N. L., BOYLE, D. B., ANDREW, M. E. & BOTH, G. W. (1994). Induction of rotavirus specific cytotoxic T lymphocytes by vaccinia virus recombinants expressing individual rotavirus genes. Virology 198, 10-16.

PATTON, J. T. (1986). Synthesis of simian rotavirus SA11 double stranded RNA in a cell free system. Virus Research 6, 217-233.

PATTON, J. T. & GALLEGOS, C. O. (1988). Structure and protein composition of the rotavirus replicase particle. Virology 166, 358-365.

PEDLEY, S., BRIDGER, J. C., BROWN, J. F. & MCCRAE, M. A. (1983). Molecular characterization of rotaviruses with distinct group antigens. Journal of General Virology 64, 2093-2101.

PHILLIPS, R. E., ROWLAND-JONES, S., NIXON, D. F., GOTCH, F. M., EDWARDS, J. P., OLUNLESI, A. O., ELVIN, J. G., ROTHBARD, J. A., BANGHAM, C. R. M., RIZZA, C. R. & MCMICHAEL, A. J. (1991). Human immunodeficiency virus genetic variation that can escape cytotoxic T-cell recognition. Nature, London 354, 453-457.

ROST, B. & SANDER, C. (1993). Prediction of protein structure at better than 70% accuracy. Journal of Molecular Biology 232, 584-599.

ROST, B., SANDER, C. & SCHNEIDER, R. (1994). PHD - an automatic mail server for protein secondary structure prediction. Computer Applications in the Biosciences 10, 53-60.

TIAN, Y., TARLOW, O., BALLARD, A., DESSELBERGER, U. & MCCRAE, M. A. (1993). Genomic concatemerization/deletion in rotaviruses: a new mechanism for generating rapid genetic change of potential epidemiologic importance. Journal of Virology 67, 6625-6632.

Xu, L., HARBOUR, D. & MCCRAE, M. A. (1990). The application of the polymerase chain reaction to the detection of rotavirus in faeces. Journal of Virological Methods 27, 29-38.

Xu, L., HARBOUR, D. & MCCRAE, M. A. (1991). Sequence of the gene encoding the major neutralization antigen (VP7) of serotype 10 rotavirus. Journal of General Virology 72, 177-180.

(Received 11 July 1994; Accepted 23 August 1994)