extension of the dna binding consensus of the chicken c-myb and v
TRANSCRIPT
© 1992 Oxford University Press Nucleic Acids Research, Vol. 20, No. 12 3043-3049
Extension of the DNA binding consensus of the chickenc-Myb and v-Myb proteins
Kathleen WestonInstitute of Cancer Research, Chester Beatty Laboratories, 237 Fulham Road, London SW3 6JB, UK
Received March 17, 1992; Revised and Accepted May 26, 1992
ABSTRACT
The chicken c-myb gene and the v-myb oncogenetransduced by avian myeloblastosis virus (AMV)encode DNA binding transcription activators. The DNAbinding domain of AMV v-Myb displays a number ofamino acid changes relative to c-Myb; v-Myb proteinsin which one or more of three crucial residues in theDNA binding domain are mutated to resemble thec-Myb sequence display altered transformationphenotypes. In order to establish whether the spectrumof DNA binding sites which AMV v-Myb can recogniseis different from that seen by chicken c-Myb, a siteselection protocol was used to derive consensusbinding sequences for three variant Myb proteins madeIn vitro, and also using nuclear extract from the v-mybtransformed cell line BM2. The results show that theoriginal consensus binding site defined for v-Myb, YAA-CKG, can be extended to YAACKGHH, and that thisnew consensus holds for both v-Myb and chickenc-Myb.
INTRODUCTION
The v-myb oncogene is the transforming gene of two separateavian retroviruses, avian myeloblastosis virus (AMV) anderythroblastosis virus 26 (E26); in the latter, it is fused to a secondoncogene, \-ets. Like its cellular progenitor c-myb, \-mybencodes a DNA-binding transcription activator (1). The c-MybDNA binding domain is at the amino terminus of the protein,and is contained within a tripartite 52 amino acid repeatdistinguished by the occurrence of tryptophan residues spacedeither 18 or 19 residues apart (2, 3). Computer modelling of thedomain suggests that it has some similarity to the helix turn helixstructure proposed for the homeobox DNA binding domain (4).Both AMV and E26 v-Myb are amino-terminally truncatedrelative to c-Myb, and have lost almost all of the first repeat unit,which has been demonstrated to be unnecessary for specific DNAbinding (5,6). Whilst the DNA binding domain of E26 v-Mybis identical to c-Myb, that of AMV v-Myb contains 4 pointmutations. The differentiation state of \-myb transformed cellsappears to be tightly linked to three of these point mutations.In vitro, v-myb will only transform cells of the myelomonocyticlineage. When the c-myb DNA binding domain is used,transformed cells resemble myeloblasts. When the AMV v-mybsequence is used, cells have a monoblast phenotype. Individualmutation of the three residues causes phenotypes intermediate
between the two extremes (7). One explanation for thisobservation might be that one or more of the three residues isin a critical position in the DNA binding domain involved in site-specific recognition of DNA; mutation might subtly alter thebinding site specificity of the AMV v-Myb protein and henceallow regulation of a different subset of genes, leading todifferences in cell morphology after transformation. In order totest this model, a site selection protocol was used to generatebinding site consensus sequences for chicken c-Myb, AMVv-Myb and a truncated mutant of v-Myb. The results show thatthe binding site consensus for all three proteins is slightly largerthan that originally defined for v-Myb, but remains essentiallythe same in all cases.
MATERIALS AND METHODSPiasmidspT7(5V: The entire AMV v-myb coding sequence was removedfrom plasmid vmybgagSP65 (1) by BamHI digestion and gelpurification, and recloned into the BamHI site of a pUC12derivative whose EcoRI and Sail sites had been destroyed bydigestion, end repair and reclosure. This plasmid, pvgag, wasdigested with EcoRI and BamHI, and the 800bp 3' v-mybsequence was gel purified. A 36Obp fragment carrying the 5'end of v-myb was generated by EcoRI digestion followed bypartial Ncol digestion. The 5' and 3' ends of v-myb were ligatedinto pT7/3Sal (8) cut with Ncol and BamHI, to give pT7/3V.pT7/3V therefore contains the v-myb coding region linked to thehuman /3 globin initiating methionine and Kozak sequence.
p77(}XT: The v-myb coding region was subjected to site directedmutagenesis to delete nucleotides 830-880 (numbering as in 9),which encode amino acids 215-232 in the v-myb activatordomain. Mutated v-myb was cloned into the pUC12 derivativedescribed above to give pvgagSDl 1. pvgagSDl 1 was cut withSail, end-repaired, and ligated with an Xbal termination linkerCTAGTCTAGACTAG to give pvXT. pvXT was digested as forpvgag and cloned into pT7/3Sal cut with Ncol and BamHI.
p77/3C: A fragment carrying the 5' end of the chicken c-mybcoding region (nucleotides 30-687 encoding amino acids 2-186)was removed from pneoAMV5'SAcmyb (1) by PolymeraseChain Reaction (PCR) using primers CM2 (TCCGGCGCATG-GTGGAATTC) and CM4 (GGATCCTGCGCACGGAGACC-CCGGCA). The fragment was digested with Avill and EcoRI,
Downloaded from https://academic.oup.com/nar/article-abstract/20/12/3043/2376568by gueston 30 March 2018
3044 Nucleic Acids Research, Vol. 20, No. 12
and cloned into pT7/3Sal first cut with Ncol and end-repaired,and then cut with EcoRI. The resulting plasmid, pT7/SC5', wasthen digested with EcoRI and BamHI, and ligated with a 1520bpfragment carrying the 3' end of the c-myb coding region cut frompneoAMV5'SAcmyb with EcoRI and BamHI.
ProteinsIn vitro transcription and translation. Plasmids pT7/3XT, pT70Vand pT7/3C were linearised with BamHI, and transcribed in vitroby standard methods (10). Capped cRNA was translated aspreviously described (11). The integrity of the resulting XTmyb,Vmyb and Cmyb proteins was checked by Western blotting usingmonoclonal antibody myb2.74 (12), or by labelling a smallportion of the translation reaction using 35S-methionine,followed by SDS-PAGE. All proteins were assayed for functionalDNA binding activity in gel mobility shift assays usingoligonucleotide 9M1/9M2 (see below) as probe.
BM2 nuclear extracts. Nuclear extract from the transformedchicken myeloid line BM2 was made essentially as described in(13), except that buffer A contained 0.23M sucrose and 0.1 %NP40, buffer C contained 0.32M NaCl, and extracts were flashfrozen in buffer C without dialysis. The protein concentrationof the extract was 6.5mg/ml.
Oligonucleotides and probesSite selection oligonucleotide. Oligonucleotide R76 (CAGGTC-AGATCAGCGGATCCTGTCG(N)26GAGGCGAATTCAGT-GCATGTGCAGC) was double stranded using the comple-mentary primer F (GCTGCACATGCACTGAATTCGCCTC) aspreviously described (14). Primers F and R (CAGGTCAGAT-CAGCGGATCCTGTCG) were used for PCR amplificationduring site selection.
Gel mobility shift probes. 9M1: CTAGGACATTATAACGGT-TTTTTAGT; 9M2: CTAGACTAAAAAACCGTTATAATGT-C; 9MU1: CTAGGACATTATCACGGTTTTTTAGT; 9MU2:CTAGACTAAAAAACCGTGATAATGTC; IK: CTAGGAC-ATTACAAC; 2K: CTAGACTAAAAAACAGTTGTAATG;3K: CTAGACTACCACCCCGTTATAATG; 4K: CTAGACT-AAAACCCCGTTATAATG; 5K: CTAGACTAAAAACCCG-TTATAATG; 6K: CTAGACTAAAACACCGTTATAATG;7K: CTAGACTAAAAATCCGTTATAATG; 8K: CTAGACT-AAATGACCGTTATAATG; 9K: CTAGGACATTATAAC.Oligonucleotides were annealed in 50mM NaCl and end-labelledunder standard conditions using AMV reverse transcriptase,40/iCi a-32P dCTP, and 0.2mM dATP, dGTP and dTTP.Annealings: 9M1/9M2, 9MU1/9MU2, 1K/2K, 9K with 3K, 4K,5K, 6K, 7K, 8K, and 9M2.
Site selectionSite selection was performed as described in (14), except thatimmune precipitation was for 1 hour rather than overnight, with3/tl of monoclonal antibody myb 2.74 (12) prebound to 30̂ 1 ofProtein A Sepharose. Myb2.74 recognises an epitope in the regionof Myb immediately after the amino-terminal DNA bindingdomain. 2/tl of programmed reticulocyte lysate or BM2 nuclearextract was used per round of site selection. Selectedoligonucleotides were cloned as described (14), by performinga gel mobility shift assay in the presence of \y\ myb 2.74antibody, excising the band corresponding to the supershiftedDNA:protein:antibody complex and amplifying eluted DNA by
PCR. Following digestion with BamHI and EcoRI, fragmentswere cloned into M13mpl8 or pUC18, and sequenced bystandard methods using T7 DNA Polymerase (Pharmacia).During sequence analysis, a central random sequence of less than26 bp was occasionally found; these were only scored ifsequenced on both strands to eliminate the possibility ofcompression artefacts.
Gel mobility shift assays
Binding reactions (lOOmM KC1) contained 12/xl buffer E (14),2/ig poly(dIdC).(dIdC), 2/ig calf thymus DNA, 2^1 programmedreticulocyte lysate or 4/*l BM2 nuclear extract, and 0.25/ig probein a 20/tl final volume. For supershifts, 1/tl of myb 2.74 antibodywas also added. Reactions were incubated for 30 minutes on ice,then loaded onto a 4% 30:1 crosslinked polyacrylamide gelcontaining 0.5 XTBE and 4% glycerol. Gels were run in the coldfor 3 hrs at 15V/cm, dried onto 3MM paper and autoradio-graphed. Several my/3-specific complexes were always observedusing the V and C myb lysates, as both contained a number ofprematurely terminated myb species. Extra bands evident in BM2gel shifts were presumably due either to some degradation ofthe endogenous v-Myb protein, or to the presence in the extractof other proteins capable of binding the 9M1 site. As only thethree slowest-migrating complexes were recognised by Myb2.74,a combination of these possibilities seems likely.
RESULTSMyb proteins used for site selectionFour sources of myb protein were used in the binding siteselection assay. XT, V and C myb proteins, diagrammed inFigure la, were made by translation in rabbit reticulocyte lysate.The sequences of the V and C myb proteins correspond exactlyto the AMV v-Myb oncoprotein and the chicken c-Myb proto-oncoprotein respectively. XT myb is a C-terminally-truncatedderivative of AMV v-Myb, which lacks the myb activatordomain, and additionally binds DNA with greater affinity thaneither v- or c-Myb (15). As an alternative source of AMV v-Myb,nuclear extract was prepared from the AMV-transformedchicken myeloid line BM2; this line expresses large quantitiesof v-Myb, but no detectable c-Myb (16). Three different versionsof AMV v-Myb were used for site selection in order to determinewhether the source or binding affinity of the test protein had anyeffect on binding site recognition.
Before use for site selection, all four protein sources were testedin gel mobility assays using as probe an oligonucleotide, 9M1,which contained the specific Myb binding site defined as SiteA in the chicken mim-\ gene (17). A typical assay is shown inFigure lb. Complexes which could be specifically competed withan excess of cold 9M1 (lanes 2, 6, 10 and 14), but not with mutantoligonucleotide 9MU1 (lanes 3, 7, 11, and 15), and which weresupershifted with the Myb-specific monoclonal antibody Myb2.74(lanes 4, 8, 12 and 16), formed in all four cases.
Binding site selectionSite selections were performed using as starting material thedouble-stranded oligonucleotide R76, in which 26 bp of randomsequence are flanked by two different 25 bp fixed sequences.After each round of binding, specific Myj3-DNA complexes wereimmunoprecipitated using antibody Myb2.74, and the DNA waseluted and amplified by PCR using primers complementary tothe fixed ends of the oligonucleotide. Four rounds of site selection
Downloaded from https://academic.oup.com/nar/article-abstract/20/12/3043/2376568by gueston 30 March 2018
Nucleic Acids Research, Vol. 20, No. 12 3045
were performed with each of the Myb protein sources. Arepresentative gel mobility assay monitoring specific complexformation during the selection process is shown in Figure 2. Afaint complex (lane 2), visible after the second round of siteselection, is further amplified in rounds three and four (lanes 3and 4). The round 4 complex can be specifically competed withexcess cold 9M1 (lanes 5 and 6), and is supershifted by antibody(lane 7). Selection with each of the myb protein sources wassimilarly monitored, and DNA eluted from all four round 4supershifted complexes was amplified, cloned and sequenced.
Compilation of selected sequencesThe sequences of the round four oligonucleotides are shown inFigure 3. Examination of the sequences revealed that 209 of thetotal of 213 analysed contained either a 5/6 or perfect match tothe originally defined v-Myb binding consensus, T/CAACG/TG(18). It therefore appears that this consensus forms a commonbinding motif for all the four Myb proteins assayed. Additionally,17/43 XT sequences, 29/67 V sequences, 11/61 C sequences and16/42 BM2 sequences contained two or more putative bindingsites. Although the frequency of multiple sites was greater thanmight be expected by chance, and thus might reflect a genuinebinding preference of the Myb proteins, there was no obviouspattern in the orientation or spacing of the sites.
Derivation of an extended Myb binding consensusIn order to determine whether Myb binding sites comprise morethan the simple core motif, oligonucleotides containing a singleMyb site were aligned as shown in Figure 3. Initial analysis
.ST
_1 I I I I
HIM
Figure 1. A. Structure of reticulocyte lysate-translated myb proteins. Arrowsindicate repeat units of the DNA-binding domain. The minimal DNA bindingdomain (wide-striped), activator domain (dark-striped) and repressor domain(stippled) are also shown. The three dots in the DNA binding domain of c-Mybindicate amino acid changes relative to v-Myb. B. Gel mobility shift assays usingoligonucleotide 9M1 as probe. Lanes 1 - 4 : XT Myb. Lanes 5 - 8 : v-Myb. Lanes9 - 1 2 : BM2 nuclear extract. Lanes 1 3 - 1 6 : c-Myb. Lanes I, 5, 9 and 13 aresimple gel shifts. Lanes 2,6,10 and 14 additionally contain a 100-fold excess ofcold 9MI oligonucleotide, lanes 3,7,11 and 15 a 100-fold excess of 9MU1 (seetext), and lanes 4, 8, 12 and 16 monoclonal antibody Myb2.74.
concentrated on the single site oligonucleotides, as it was notpossible to determine unequivocally which sites were genuinein oligonucleotides with two or three Myb motifs. However,comparison of the base scores for single sites only and for bothsingle and multiple sites showed that base preferences wereessentially the same for either; accordingly, consensus sequenceswere generated using all sites. For the analysis, 5/6 and 6/6matches to the Myb core consensus which contained the sequenceAAC at positions 2, 3 and 4 were scored as real sites. 6/6 matchesto the Myb core consensus in which the outermost base wasderived from the fixed flanking sequences were scored, but nofixed sequences were included in subsequent scoring. Table 1details the base preferences of the four different Myb proteinsources. Within the hexamer core, as might be predicted, therewas very little variation between the different proteins, exceptat positions 1 and 5; at position 1, the T:C ratio of the XT, Vand BM2 proteins was ~ 70% :24% whereas that of C-Myb was51%:39%, and at position 5, the XT, V and BM2 proteinspreferred a G residue (>82% of sequences), whereas the Cprotein selected G (48%), T (32%) or C (18%). Table 2summarises the consensus sequences proposed for each of thefour proteins, together with a general consensus for all fourtogether. No base preferences on the 5' side of the hexamer corewere observed for any of the sequences, but a weak consensuscould be defined for the 3' 5bp. Although this was slightlydifferent for each protein, very similar binding preferences wereevident, the most marked being an aversion for G residues atpositions +1 , +2, +4 and +5.
Verification of an extended myb consensus sequenceA series of double stranded oligonucleotide probes (Figure 4a)was constructed to test the validity of the base preferences inferredfrom the site selection assays. Gel mobility shift assays wereperformed using XT, V and C proteins, and equal amounts ofthe nine probes. Results are shown in Figure 4b. All threeproteins behaved virtually identically with respect to theirpreferences for specific probes. As shown in Figure lb and lane2 of Figure 4, probe 9M1 contains a very efficient binding site;this site can be destroyed (probe 9MU1, lane 3) by mutation ofthe AAC core of the myb hexamer to CAC (18). Interestingly,mutation of the Myb core from TAACGG to CAACTG appears
bound + Ab
free
Figure 2. Gel mobility shift assay of labelled oligonucleotide pools selected usingXTMyb protein. Lanes 1 - 4 : rounds 1 - 4 of site selection. Lanes 5 and 6: round4 pool with 100-fold excess of cold 9M1 oligonucleotide alone (lane 5) or togetherwith antibody Myb2.74 (lane 6). Lane 7: round 4 pool supershifted with antibodyMyb2.74.
Downloaded from https://academic.oup.com/nar/article-abstract/20/12/3043/2376568by gueston 30 March 2018
3046 Nucleic Acids Research, Vol. 20, No. 12
Ggag
X. XT MmM-j alt«aXI- tcgTTATCTnUCCCAACCTCCCTGCCTAcagX2- tcgTTBI»OQOTCrrTCTACACCATXCACgsgX3c tcgCTTIliyMCTTTAAACCCCACArTITgao.X5- - tcrgCTCCCCCTTBUCCIACAAATACCCCgagX7- tcf lrnujiTT*rnaragrcATTTCAcccg«gI B - tcgCACCCSATACCAAABU»JJUlTlATCq»i;X9 ctcAACAATCCtMCCCTAAACTTCTACCcgaxioc ctcAccccnimrticccrccATTACCAAcqaXI1 ctxACAAATtXCCTTCPttCOBCATTTAAcgaX12- tcgTTCTCTAATMOQCTCTATCTTGGTX12b cX12c- jXI3 ctc*ACTAAAATCACB*COBTAAATAAcgaXI 3b ctcrACTCDUUXMCSOCAGAACCTAACcgaXI 3c ctdUanOTTTTTCGCAAACCCCAAATTCcgaXI5 ctcTCACCACCSGTAIUCQeTAGArrAAcgaxi ( ctccci;iixT»r»»mnrAu I i n LAAAcgaX20 ctcCTACAlU.l l l lABUUa»:H.l l . lCCcgaX21- ctcCCCAnaCOCATACTCCCCCTACAkcqaX22 ctcAATCATACGCCCCMOaOAAAAATAAcgaX25 ctcCTACGTTrCTCCMCUUlTlTltJCAAciJO12 5b ctcaUMCOCAAAAATCAACATWCCCTcgaX2Sc* ctcCCACAAACCCATOUOaCCAAAATAAcgaX27 ctdUCIBCCAT-X2I- tcgTTCACCO
"TMCAAcoa
X30 ctcTTAACCCCCCCOnarHH UlAAAcga
X3- tcgCTTnmiWIlATOUMJJUlLl ICATCgag
xio ctcauLooarcACAAcccacanuccBTcgaXI Ob ctcCAAAACTlWtTlrATCrCTaaCTCTcgaX14- Tri]TTrrMW"W I ' I— »'w*"f-*»»'-'--j-3X19 ~t~ ' I M^y7fl/*ftftRfl"tT'7'*TTHTT-71X22a- tcgCTTCATATTCTnTMCamaCTeTgagX22bX23- tdjAAOkaCSSCATACOUtCaOTTAAASTgai]X2S CATTACCSTOUTlmaCieACSAAAAX2<X2(aX2(bX27aX27b gX 2 t - tcgCTTBUCaCTaUhCTOTAGTACCAACgag
C. CClClbC l c -C3C4*C5CSa-C5b-C6CtaCib*C6c-C(dCl'C8*C»-C»b*CJc-C9d-Cllb-C12a-C12bC13a-C13b--C13c*cl 5b*Clib*Cite - -Cll*C19C20C21-C21a->C21bC22C22bC23C21e-C23b*C24C24aC24b*C24C-*C251
C2CC 2 ( a -C2«b-C33*C35a-C l a -CldciociiaC13C15cC1M-Cl(dC22cC2icC3Sb
alt««TCTCCCCCCTaUCOSCATCACCClUcga
ctcAACAACAACAATCTTCAAABUtCigCcgactcACCCCSCTACTSTTCAAAABlCaaTcga
ctoTTCAAAIMCaaCSAAAATTATAACCcgactcCACTMJICTfrCACATACCTACCAACcga
ctcCACAAIUCICCCAAACAGATCCCAAcgactcSAGGCGCACMCIOrAACAATCAAACcgatcgATAATCTCTMU3Ml l u l u l l U A C c c a p
ctoMCOBTACAACAATACCCCCTCCCATcgotcgACCATTAATTTCTTMLOaeTAATCCTgag
ctcAGCCTCCTACaUCOBCCACTtrrATCcaac t c A A T l i n M U l UlLCATlClLlAGAcqa
ctcCSCCCCCAAOUhCCOTCAAAATAAAAcgactcAACninmrCTTACCACTCTgCAACcga
ctcaUOOaeCAAACCCTCACCCCTCTTcgaCACACnUCaLaaTTTDlA
gCCACATTCTrTCCITCAAcga
ctcnUCCOAAATAATCCCTCCCTTAACcgatcgCTTATTACACaUaCaSCTAAAAGACCgaiJ
ctcACCTTCACACCEAAACTACaMCTSrcgoctcACIlkCTOACAAACSTCACACTCCTAcga
ctcATCCATCCCCCTCCTTOUkCIOSCATegatcgTTTCTCCCAIUCOSCCSCTCATCCTgag
tCCTCCATOUkOaaCACTTTGACTCTC
ctcCCCCTCACTCCTXXaUkOCOUJUJWCcgactcATCCTATAOUbOOUUXAACCCACAAcija
ctcCCCTACAAABUkCOGCCATCCCATAAcgactcCGAArACACCCCCCATIMOaCrCACcga
ctcCCTEATAACOUbCTBCCAAACTCAACcgatcqCTTCTTCCmACTOTCAATCTCTAAgag
CAATCTCCACCACaUCCaC
ctcTTATGTTACCACTCTACACaUCTOTcgactcACCCACACCCTraUbCTOCTATCACCcga
ctcACTGGACAnTOiaCIBCCACTAraAcgaC T T T A T I U C a kg g
ctcACTCAAAAaUbCOSACAAAATTTTcgactcCAACGAACAATTCnkOaarAAATAAcga
nUCOtcgTCATnUCIOTCACCTACTCCDWTctcACAACAATCCIUM.I.Ul 11 LACAACSOcga
ctcAAACCACATCAAOUULOOeCACTAACcgoncCTOULOCaCATTAAACCCCAATCAAAcga
ctcTCCTCCAACAACOOOOCATCTCTAAcgact<MCTSTCATTCTCCAACAAAAATftACcga
tcgATTTMUaWI IL11 IgCATATATAACqagctcTOGUCOtAAASCAAACCCCCCCTTCcia
ctcCCCTCCAACTOUCIOACAATTCTACcgataUCTSCCCTajbCCSCATCCAACCTciaiptaUCCSTCUmecCCCACATTACCTCcgatTCTcniroflcACCornmTCTGCAAcga
in* 111* gigTATCUCaCCAAACCTTAAcqaOTTTCCCCTCaCTICg g g
cgTTOaCaOTCCCACTASCAOUUkCaagagctcAAATTUAXUTlAl 1 lirjOCCCCCCcqatClOTTJkCACaUULCOffTAATTCCCTT
ggctcACM*CTCCTCCCmirrarATTTAAcga
1 .V3V3aV3bV4VS-V5bV 7 - .V7b-V10V14-V14a-V15VISV2tV24c-V2Sa-V46-V4(aV49aV50-VMbV51-VS2-V53-V54-V57-V5«-VtO*VCOb-'VJ1-V t 4 -V(4a-<V(4b
V t-<~«.j altaactcAACCACCCTACCCOUCOCACTATAAcga
XMCTBctcTGACCCCCTAtATCACOUtCaCACAAcga
ctri>CCCCCACTCC»»COeCATCTCAAC<:»atcgTTACAABttCagTCTrATTCCCAAACgag
ctcCCCCTGATTTmJbCaSCCCTTCTAAcgatcqCTTACCCTCTHI »JI I 1111 lAGATTgag
ctcaUCaOACACTACATCCCTAAATAACcgactcAT7VCCCTrATACT»K 1 Ul. i l l lA-CcgactcCCACCSCACCATCUCSOTCACTAAcga
CCcCTAAACACCCSTUCaacCCSTCAACcgactcACCCCACTCGACTUCaOAAATCCAAcga
ctcCATASniOaBTCCTATCCCACCTTAcgactcAACACACCTAAOUhCaOTCACATCACcga
ctoUO00CT«rrCATCTACCGCTTTTAAcgatcgTTATTQUkQOOAATTAtTTCCCCCGCgag
ctcAAAAACCCTCCAAIMCIBACTCA-Ccqa
ctcACCTBUUSaACTATAACCAGACTAAcgatcgCTTAAAnaCgBTACASCCAATCTCCgag
CTMCgBrrAAATaSTACTtcgCTCCCCOl TTATrcCgagg
tcgTTAACCATCT<3MCOCrCCCAATCTga^tcgCCTTTDUCOOCTTTCTCCrTCCATCgAg
tcgCTTAAACAACnj>O»»TTrTATCCAgagctcAAACTUCmCAATTCCCACCCTtGCcga
tcgTTCTAABUUCOOTCATTCCGGAATTgagtcgCTTAATTTataCaSTAAAACCCTCCCgagct c rniAl lHMtlm-llllAin.nig ag
CTcTfl»fn»ACTrTCAAGCAAACACTAAcga
tcg»JCOOACCTCTTTTCTTirrCOGCCAoagV 6 t « - ctcTAAAAATCCCCCTTlMOOOACACATcgaV i l - tcgCTTGCCCTTlaCOBrCATTTCACTCTgagV2 ctcAGCATCCBaCOCTAAAAACTCTBCcqaV3c- ctcCCUOOTACUCaaACTTTTCTCCTcgaV5a tcgCTMCSCCCCCATCACGCTTCCCOTTgagVSc tcg-TTCCCTTTAAIUCaeaUtCTCCATgag
V I - tcgCACnUCSTTAAAATIMCStCTTgagVI t b - tcgACTCACCnaCaSTCAATTOUCeTTgagV25V2<V2«bV32aV34V43-V44V44aV44bV4SV 4 I -V4»V50.V55VSSaV5»V(2a
ctcOUCOecCACTCTACIMCCeTCACCcgatcgTCTACBmjJUl 111 HMCOgTCCCgagctaUCTOTCCACCCSTRACCCCTCCAAcga
nlCMBUtCae
ctcCCATUCTCCUCCaACCTTAATTAAcgatcgCCCCCOUtCaSACAATTCUCaCTTTgagctcTAAAAETCCTBCCCCACACCMkCaOcgatcgTCTTTCCOTTCGACTDUkCOOCTCTAgagctcCOUhCOOCACCCUCaaCATSACCTcgactcACTTUCeCTCUCITCCATCCCCAAcga
gctcCCCCUCaCTABUCCCTATACTGAcga
CTCTMCHnACgACPULU»J»/llHTAv»5V((bV47V70 ctcACTAACataCTCCCACOUkCCSCAAAcga
D. H D hi—«<-g a l t a a,B1 ctcATCATAATTCCCCBJCaarATTCAAcgaB2 ctcAASACATTCCSACDtaCOaCTAAAAcgaB3- tcgTTAAACCrajeaSTATATTTCCTCAgaqB3b . .AAAACCCmUCaOACTAACCcgaB4 ctcAAACTCCCrrTCTOkkCCCTCAAAAAcgaBS- tcgTTCCTATTTAGIMCaaACTACTCSAqw)B(- tcgTTATTACCmMOaOTAAAACTATCgagB(c ctcACCTTACIUCaSTAAAAAACCCCAAcgaB7 ctcOUCOSTACACC—ACCCTAATCCTCcgoBIO- ctcTTACTCCCCCCCCAAAIUCaXAACci)aBU ctcCCACCCCSCCCCUCEBATACTCCCTcgaB20 ctcCCAnTCTCCCCIUCaaTAAACTAAcgaB21 ctcTTTUCaSCAAASCAACAAnTTAAAcgaB22b tcgrTCAAArTCnUOeeCAAAlTGATTgagB25b ctaUCICTAASACTCCATCATCCCCTAAcgaB2t- tcgCTTCTGATCSUbQOOrTGTtAATTTgagB27 c t c c i . i T a j c i A A r m A A n a c a c r n . ' i c g aB27bB27c-B2I-B29-B30b
gctcTrATtHftinTATTATCCACCTCIAAcga
tcgTTATTAOUbCOCTCCTTTTTCTCTCTgagUOGaCAGACCCATATCACGCAC
ggctaUOGaCAGACCCATATCACGCACTGAcga
B S B K ctcACAGACAACAmUbCOOTAAACCACcgaBillli' ctcACCTCAXU*CaOCACAATATCCCCAcgaB2b a j C S S I M C a e T A T C TB3c gBib ctcCCCCSTDtCACTCCATCTMbCOeTAAcgaBll ctcC-ACACCTTDUkCSBTCACOaCeSCcgaB13 ctcTCTCT00mAATCCBA£TOCCAC-cgaBIS ctcCCACAATCtUCCSCACCUCCCCTAcgaBllb c t c A C C C T » » r m l » m U 1 1B20a g gB21b ctcTTACUillUILTAOACOgCAAATAAcga
B23 ctcTACACCTTCIUCCCCCATCUCTOTcgaB24a- ctcACTUCCCTACOUCCeCATATCTAAcgaB27a- ctcACCTDUCaCCTAnaCaOCCACCAAcgaB24 ctcCUyCCSCCATACCTUCacTATTTCcgaB24b ctcTTAOAOUIAOUkCSTCATAATCCAAcgaB25 ctcCCUCenrnUUCeCCTCCOUICCOTcga
ACcga
Figure 3. Sequences of cloned selected round 4 oligonucleotides. Minus signs indicate reverse orientation to original sequence reading. Starred sequences containa 5/6 match to the Myb core consensus (shown in bold type); the remainder contain a 6/6 match. Single site sequences are shown aligned with respect to the Mybcore; double or triple site sequences are presented unaligned. Nucleotides in lower case are the ends of the fixed flanking regions. Comp: ambiguous sequence causedby compression artefact on gel.
Downloaded from https://academic.oup.com/nar/article-abstract/20/12/3043/2376568by gueston 30 March 2018
Nucleic Acids Research, Vol. 20, No. 12 3047
Table 1. Analysis of selected sequences. Base preferences for single sites, double sites, and all sites together are shown for each of the four protein sources. Thebases immediately 5' of the myb core hexamer are numbered +1 to +5; the core is unnumbered. Base preferences of all sites are shown as simple scores, andalso as percentages.
A. XT binding thtasiogl* sltastccATot*lFoalU.on
Doubla alt
rc0&Tot4llPosition
All SltASTCaj .TotalPosition
All mitmtTC
0APoa I d an
177
_
«
24C
434
4113
4
<*>70.722.4
0
R V binding kitesSlngla altasTCGATotalPosition
277
135
Doubla altos
c
cATotalPosition
All altaaTccATDUlPosition
All altaaTcGAPosition
3C17
457
4324
392
ea.52C.105.4
24A K
34A K
41A K
A K
34A K
39A K
95AAC
A K
2
14
24
1
_34
101SO_
a
14.4l . C
0
2232
34
71
359
3• 0395
9 . 33 . 2• 4.23 . 2
1123124
11
23*
2254341
1 . 11 .1
4 . 9
_1
33254
74
354
7377
594
7 .55 . 1•1 .95 . 3
12424244 1
18
7344 1
1014413414 1
49.223.0
21.34 1
1211
13344 l
2222
10374 l
3435123934 1
34. C37.41.124.74 1
49-
11244 2
10
12324 2
14192235*4 2
24 .32.
3 9 .4 2
423
734•2
1219
175242
1 14 2424M-2
20.4 7 .4 . 42 7 .- 2
1
•
7
57
3
1223(244 3
1
11314 3
2431020574 3
425.
354 3
14-101234• 3
159
2132• 3
2991733
•3
331019
.13
.1
.0
. 2
.337 .5• 3
•23132444
4
111044
2145243444
37.510.7
42.944
19•
343444
1312
174 944
322010
23•S44
37.723.511. •27.144
1411•244 5
3
112»4 5
24
c319344 5
4 9 .1 1 .
3 3 .4 3
203
39354 5
1413
15444 3
34145
24• 14 5
4 4 .1 9 .4 .22 9 .4 3
21
2
41
4
C C binding altcaSlngla allTC0ATotalPoaltlan
c*a23! •1147
Doubla alua
cATotalPoaltloa
All altaaTCCATotalPoaltlan
All altaaTC
A
Poaltloo
•
}
20
142*1(47
n>30.1!» . •
9^0
49A K
22AAC
71A K
A K
D. BM2 binding •&*>
Singla altT
GA
TotalPoaltlan
17
-222
Ooubl* altaaTcGATotalPoaltlon
All altaaTC
ATotalPoaltlon
All altaaTC
GAPoaltlan
219
232
3*12
454
<%l70.422.207.4
24AK
33A K
57A K
A K
1CIS21149
22
U1114171
12.4K . I
1.4
1
21-24
2424133
54
157
• ••7 .0•2.51.1
—-44549
-
322
--
a•71
00
-
23-24
2124433
22
457
3 . 53 . 5• 4.07 . 0
201449494 1
9
1194 1
2423710444 1
34.234.•10.314.74 1
13
-52441
11
-3334 1
2425
•574 1
42.143.9014.04 l
723211O4 2
•
4114 2
9
n417
a4 2
14. •50. •
27.94 2
4
-
1424
9
112304 2
12l i
2*544 2
22.227.e1 .94>.24 2
7542743
1
411• 1
10•1011414 1
14.413.11C.454.14 3
4
41224
55415294 3
117
27534 3
20 . •13.215.130.94 3
U91U4144
7
41 744
211C417CO44
ja.J24.7
24.344
5
1132444
U5212444
14•
Z3
so44
32.014.04 . 044.044
124120414 3
7
217• 3
1713C2254• 3
29.322.410.417.9
C
1922•5
115
•24• 5
17111174C4 3
37.023.92 . 231.04 3
completely to abolish binding under bandshift conditions (probe2K, lane 4), indicating that differences must exist in the bindingaffinities of the four possible myb core sequences (seeDiscussion). Oligonucleotides 3K and 4K, which contain Gresidues at positions + 1 , +2, +4 and +5 (3K) or positions +1and +2 (4K) relative to the Myb hexamer core, failed to bindto any of the three proteins (lanes 5 and 6); oligonucleotides 5Kand 6K, which contain a G residue at either +1 or +2, appearto bind either very weakly (lane 7) or not at all (lane 8); probe7K, with an A residue at position +1 , is functional, but lessefficient than 9M1 (lane 9), and probe 8K, designed as an optimalsite for c-Myb, binds approximately as well as probe 7K (lane10). It therefore appears that the extended consensus derived usingthe site selection assay is correct, with respect to the exclusionof G residues from the 3' side of the myb core site.
DISCUSSION
A site selection assay has been employed to show that the chickenc-Myb protein and the AMV v-Myb oncoprotein share a preferredcommon binding site, with consensus YAACKGHH. In the caseof c-Myb, the consensus can reasonably be extended to includea C residue at position 5, as sites containing this residue weredetected at greater frequency (18%) than when v-Myb proteins
Table 2. Consensus Myb binding sites derived from analysis of selected sequences.For each of the four protein sources, the consensus sequence for positions +1to +5 to the 5' side of the myb core hexamer was derived as follows: >N =base occurred in >50% of cases; N/N - either base occurred in >75% of eases;notG = G occurred in <6.5% of cases, and lowG « G occurred in ~ 10%of cases. The general consensus was derived similarly.
XT
V
c
• a
Taaatnar
Poaltlon
T/C
»/C
i / e
T/C
T
1
AAC
AAC
AAC
AAC
AAC
1J4
C/T
S
C/T/C
s
e/TCc)
5
C
e
a
a
0
<
>TnotG
nota
T/C
T/Cnote
aoto
• 1
notG
>coota
>cnotG
Mnota
nota
•2
T/A
T/A
>A
>A
-
• 1
T/A
lovG
note
T/Anota
loaO
44
T/»notG
>Tnote
lo»C
notG
nato
were used (4%). Additionally, slightly different bindingpreferences outside the originally defined Myb core sequencewere detected for each of the four proteins. It should be stressedthat none of the preferences were mutually exclusive, and maybe due to the conditions of the assay. During initial rounds ofsite selection, the selecting protein will be in excess over itscognate binding sites, and will therefore recognise a completespectrum of both low and high affinity sites; however, as cognate
Downloaded from https://academic.oup.com/nar/article-abstract/20/12/3043/2376568by gueston 30 March 2018
3048 Nucleic Acids Research, Vol. 20, No. 12
9 M I TAACGGmrrr9MU1 TcACGGTTTTT2K cAAcTGTTTTT3K TAACGGogTog4K TAACGGogTTT
5K TAACGGgTTTT6K T A A C G O T B T T T7K TAACGGlTTTTBK TAACGGTdTT
specific
specific
specific
Figure 4. Verification of an extended Myb consensus sequence. A.Oligonucleotides used for gel mobility shift assays. Mutations with respect tothe 9M1 parent oligonucleotide are shown in bold type. B. Gel mobility shiftassays with XTMyb, v-Myb and c-Myb. Lane 1 contains a 100-fold excess ofcold 9MI oligonucleotide. Reactions loaded in all other lanes did not containoligonucleotide competitors. Note that a slow-migrating non-specific complex isvisible in some lanes; this is not a Myb-specific band as it is detected in conditionswhere Myb cannot bind (lane 3, probe 9MU1).
sites are amplified during succeeding rounds of selection, thenumber of binding sites will eventually exceed the amount ofprotein, and so there will be preferential binding to higher ratherthan low affinity sites, meaning that only a subset of all possiblebinding sites will be represented. The initial spectrum of sitesselected and the point at which the change to high versus lowaffinity selection occurs is dependent on the binding affinity andconcentration of the selecting protein. As both these parametersdiffered for the four proteins used here, differences in basepreference might be expected. Consensus sequences derived usingv-Myb protein from BM2 nuclear extract and the in vitrotranslated proteins may also have differed due to dissimilaritiesin protein modification, such as phosphorylation, affectingbinding affinity.
Enumeration of the frequency of occurrence of all four possible'perfect' core hexamers in the selected sequences showed thatTAACGG was selected at least 50% of the time by the v-Mybderivatives (XT: 27/49; V: 46/67; BM2: 27/40), and 38% ofthe time by c-Myb. TAACTG and CAACTG were the leastpopular permutations for the v-Myb derivatives (XT: 6/49 and4/49 respectively; V: 2/67 and 2/67 respectively; BM2 : 2/40and 2/40 respectively) but occurred slightly more often in thec-Myb selection (7/42 and 12/42 respectively), as would beexpected from the less stringent base preferences of c-Myb atpositions 1 and 5 of the Myb core site (see Results). Thisdifference between the v-Myb derivatives and c-Myb may againbe a consequence of the site selection assay; as c-Myb binds DNA
least well of the four test proteins (15), lower affinity sites maystill be present in the round 4 population of oligonucleotides, forthe reason discussed above. However, the lower frequency ofoccurrence of these two binding sites in the XT, V and BM2site selection assays may be a reflection of their weaker affinityfor v-Myb protein; c-Myb may not differentiate quite so muchbetween the four possible Myb cores. The in vivo consequencesof this difference with respect to transformation by v-Myb arehard to predict. As v-Myb has a higher affinity for DNA thandoes c-Myb (20), the elevated amounts of v-Myb necessary fortransformation (21) might be expected to bind most or all of thosesequences recognised by c-Myb. The increased binding specificityof v-Myb over c-Myb observed here does not support the theorythat AM V v-Myb binds an additional spectrum of transformation-specific sites.
A noticeable feature of the selected oligonucleotides was thatmany contained two putative Myb binding sites. A second Mybcore hexamer would be expected to occur by chance in a 26-merabout every 36 molecules, assuming the maximum possiblespacing between the two motifs; in practice, spacing was usuallyless than maximum in the double site oligonucleotides observed,so the probability is actually less than this. As between 1 in 6(for c-Myb) and 1 in 3 (for BM2) of oligonucleotides carrieddouble sites, it must be assumed that this is of significance, andpresumably means that Myb proteins have a greater bindingaffinity for double than single sites. This predilection for doublesites has also been noted in a similar study using the murinec-Myb protein (19), but the basis for the phenomenon is unclear;possibly, some cooperativity of binding is occurring. It is ofinterest to note that in the promyelocyte-specific mim-l gene,which is the only clearly defined gene shown to be directlyregulated in vivo by Myb proteins (17), trans-activation occursvia three closely spaced binding sites in the promoter region.
The base preferences indicated by site selection for the Mybproteins were broadly confirmed by gel mobility shift assays usingspecific oligonucleotide probes. Whilst any mutation of thebinding site 9M1 appeared to decrease binding affinity, thepresence of G residues at positions +1 and +2 relative to theMyb core site caused the most dramatic reduction of binding forall the Myb proteins in gel mobility assays, as predicted fromthe site selection data. Inspection of selected oligonucleotidesshowed that only 2/284 putative myb core hexamers werefollowed by the sequence GG, whose presence completelyabolished binding in the gel mobility assays, and only 19/284had a G at either +1 or +2, which reduced binding to almostundetectable levels. Interestingly, the dinucleotide GG occurs inonly 1/284 sites at positions +4 and +5, and indeed only 8/284times between positions +1 to +5, making it seem likely thatit has deleterious effects on binding in positions other than +1and +2. Further experiments are required to test this possibility.
An oligonucleotide in which the high affinity Myb core TAA-CGG was mutated to CAACTG failed to complex significantlywith any of the chicken Myb proteins in gel mobility shift assays.Bona fide selected binding sites unable to form complexes in gelmobility shift assays have been described previously (19).Presumably, the sensitivity of the gel mobility shift assay is lowerthan that of site selection with regard to the detection of allpossible complexes; the PCR amplification step in site selectionensures that even complexes present in small quantities can bescored. Bearing this in mind, the gel mobility shift results provideconfirmation that the CAACTG core is a lower affinity bindingsite for the Myb proteins, as predicted by site selection.
Downloaded from https://academic.oup.com/nar/article-abstract/20/12/3043/2376568by gueston 30 March 2018
Nucleic Acids Research, Vol. 20, No. 12 3049
In a similar study using a C-terminally truncated murine c-Mybprotein (19), an extended Myb consensus sequence of YAA-CT/C/GGYCR was derived from comparison of 51 selectedoligonucleotides. This sequence is sufficiently similar to thechicken c-Myb consensus derived here of YAACT/(C)/GGYCAto make it seem likely that the murine and chicken proteins havealmost identical binding properties. As there is only one aminoacid difference between the DNA binding domains of the murineand chicken proto-oncogenes, that they share a consensussequence is not unexpected.
In conclusion, the three amino acid changes in the DNA bindingdomain of the AMV v-Myb oncoprotein relative to that of E26and c-Myb do not cause any major changes in binding specificity,as assayed by site selection. Comparison of the selected bindingsites of AMV v-Myb and c-Myb indicates that the preferredv-Myb core consensus is YAACT /GG, whereas that of c-Mybis YAACT/(C)/GG. The apparent relaxation in the specificity ofc-Myb binding is probably due to its reduced affinity for DNArelative to v-Myb; the sequence YAACCG was observed in siteselections with the three v-Myb derivatives, but at much lowerfrequency. The observed differences in the transformationphenotype caused by mutation of any of the three residues inthe DNA binding domain are therefore unlikely to be due tochanges in binding specificity. Other potential mechanisms, suchas differential recognition of accessory factors, or changes inbinding affinity, must be invoked instead.
ACKNOWLEDGEMENTS
I thank Pratipa Badiani for skilled technical assistance, and MaryCollins for helpful comments on the manuscript. This work wassupported by the Cancer Research Campaign.
REFERENCES1. Weston, K. and Bishop, J. M. (1989) Cell, 58, 85-93.2. Klcmpnauer, K.-H. and Sippel, A. E. (1987) EMBOJ., 6, 2719-2725.3. Peters, C. W. B., Sippel, A. E., Vingron, M. and Kkmpnauer, K.-H. (1987)
EMBOJ, 6, 3085-3090.4. Frampton, J., Leutz, A., Gibson, T. and Graf, T. (1989) Nature, 342, 134.5. Howe, K. M., Reakes, C. F. L. and Watson, R. J. (1990) EMBOJ., 9,
161-169.6. Oehter, T., Arnold, H., Biedenkapp, H. and Klempnauer, K.-H. (1990)Nucl.
Acids Res., 18, 1703-1710.7. Introna, M., Golay, J., Frampton, J., Nakano, T., Ness, S.A. and Graf,
T. (1990) Cell, 63, 1287-1297.8. Norman, C , Runswick, M., Pollock, R. and Treisman, R. (1988) Cell, 55,
989-1003.9. Klempnauer, K.-H., Gonda, T. J. and Bishop, J. M. (1982) Cell, 31,
453-463.10. Melton, D. A., Krieg, P.A., Rebagliati, M., Maniatis, T., Zinn, K. and
Green, M.R. (1984) Nucl. Acids Res., 12, 7035-7056.11. Jackson, R. J. and Hunt, R. T. (1983) Meth. in Enzymol., 96, 50-74.12. Evan, G. I., Lewis, G. K. and Bishop, J. M. (1984) Mol. Cell. Bid., 4,
2843-2850.13. Digram, J. D.. Lebovitz, R. M. and Roeder, R. G. (1983) Nucl. Acids Res.,
11, 1475-1489.14. Pollock, R. and Treisman, R. (1990) Nucl. Acids. Res., 18, 6197-6204.15. Ramsay, R. G., Uhii, S. and Gonda, T. J. (1991) Oncogene, 6, 1875-1879.16. KJempnauer, K.-H., Symonds, G., Evan, G. I. and Bishop, J. M. (1984)
Cell, 37, 537-547.17. Ness, S. A., Marknell, A. and Graf, T. (1989) Cell, 59, 1115-1125.18. Biedenkapp, H., Borgmeyer, U., Sippel, A. E. and Klempnauer, K.-H. (1988)
Nature, 335, 835-837.19. Howe, K. M. and Watson, R. J. (1991) Nucl. Acids Res., 19, 3913-3919.20. Luscher, B., Christenson, E., Lilchfield, D. W., Krebs, E. G. and Eisenman,
R. N. (1990) Nature, 344. 517-522.21. Schirm, S.,Moscovici, G. and Bishop, J. M. (1990)7. Virol., 64, 767-773.
Downloaded from https://academic.oup.com/nar/article-abstract/20/12/3043/2376568by gueston 30 March 2018