and distribution of palindromes y saccharomyces cerevisiae g · 2012. 1. 29. · genetic...

31
University of Zagreb Faculty of Science Department of Biology RECOMBINOGENICITY AND DISTRIBUTION OF P ALINDROMES IN THE YEAST Saccharomyces cerevisiae GENOME DOCTORAL THESIS submitted to the Department of Biology Faculty of Science University of Zagreb for the academic degree of Doctor of Natural Sciences (Biology) Zagreb, 2011. Berislav Lisnić

Upload: others

Post on 26-Feb-2021

2 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: and distRibution of PalindRomes y Saccharomyces cerevisiae g · 2012. 1. 29. · genetic alterations which can have harmful impact on human health. The ability of palindromes to undermine

University of ZagrebFaculty of Science

Department of Biology

Recombinogenicity and distRibution of PalindRomes

in the yeast Saccharomyces cerevisiae genome

Doctoral thesis

submitted to the Department of Biology

Faculty of Science University of Zagreb

for the academic degree ofDoctor of Natural Sciences (Biology)

Zagreb, 2011.

Berislav Lisnić

Page 2: and distRibution of PalindRomes y Saccharomyces cerevisiae g · 2012. 1. 29. · genetic alterations which can have harmful impact on human health. The ability of palindromes to undermine

The research described in this thesis was performed by the author in the Laboratory of Biology and Microbial Genetics at the Faculty of Food Technology and Biotechnology (Zagreb, Croatia), under supervision of dr. sc. Ivan Krešimir Svetec and dr. sc. Zoran Zgaga, as part of the postgraduate program in biology at the Department of Biology, Faculty of Science.

ii | Declaration of autorship and superivion

declaration of autorship and supervision

Page 3: and distRibution of PalindRomes y Saccharomyces cerevisiae g · 2012. 1. 29. · genetic alterations which can have harmful impact on human health. The ability of palindromes to undermine

Like all scientific endeavours, the work presented here was a collaborative effort that included many people. Preparing, organizing and writing this thesis has been a difficult, long, frustrating - but more than anything else - an exciting and intellectually rewarding adventure. Those who have, directly or indirectly, joined and helped me on this journey have earned my deepest respect and appreciation.

First of all, I would like to extend my gratitude to my mentors, Professors Zoran Zgaga and Ivan Krešimir Svetec for their counsel, guidance, patience and encouragement during my undergraduate and graduate years. This thesis would not have seen the light of day without their support. I also record special thanks to my committee members, Professors Višnja Besendorfer and Kristian Vlahoviček for their valuable support and suggestions. Furthermore, I gratefully acknowledge the financial support from the Croatian Ministry of Science, Education and Sports, which made the research described in this thesis possible.

It has been a privilege to spend several years at the Laboratory of Biology and Microbial Genetics at the Faculty of Food Technology and Biotechnology, where I have met many people with whom I shared some of the best and some of the worst moments of my PhD adventure. I will always hold our group and lab members, whether former or current, in my memory - particularly Krešimir Gjuračić, Sandra Gregorić, Petar Mitrikeski and Marina Miklenić, who were always ready to lend a helping hand and ear whenever it was needed.

The staff of the Ligament polyclinic, Diana, Dora, Hrvoje, Mario, Marinko, Marko, Slavica, Zdravko, and most notably "Il Dottore Medico" - dr. Stjepko Bućan, deserve my sincerest thanks. The knowledge, assistance and friendship Stjepko extended to me over the last couple of years mean to me more than I could ever express. I owe him my heartfelt appreciation.

My friends, especially Zvonka Zrnčević, Anita Marjanović, Lucija Barbarić and Igor Obleščuk were never-failing sources of joy, happiness and support when it was most needed. I am very happy to have you around me, and I hope that our friendship will last for many more years to come.

Finally, I dedicate this work to the closest members of my family - my mother Jasna, who left us too soon, my wife Vanda and my mother-in-law, Jasna. It is they who, quite often, suffered more from my PhD than myself. Their unconditional love, encouragement, understanding and endless patience were always my driving force and inspiration that allowed me to successfully finish this journey. I owe you everything and I hope that this work will make you proud.

Acknowledgements | iii

acknowledgements

Page 4: and distRibution of PalindRomes y Saccharomyces cerevisiae g · 2012. 1. 29. · genetic alterations which can have harmful impact on human health. The ability of palindromes to undermine

contents

iv | Contents

Title page iDeclaration of autorship and supervision iiAcknowledgements iiiContents ivBasic documentation card viTemeljna dokumentacijska kartica viiList of papers viiiAbbreviations ixSummary x

INTRODUCTION

1 | intRoduction to PalindRomic dna 1.1. Nomenclature and definitions

1.2. Palindromes are important, but dangerous DNA motifs

2 | mechanisms of PalindRome-dRiven genetic instability 2.1. Hairpin-mediated genetic instability

2.1.1. Replication slippage

2.1.2. Invasion of the 3' end into ectopic homology

2.1.3. Nucleolytic processing of the hairpin structure

2.1.4. Stalled replication fork reversal

2.2. Cruciform-mediated genetic instability

2.2.1. Mechanisms of cruciform extrusion

2.2.2. Cruciform resolution

2.2.3. Center-break palindrome revision3 | mechanisms of PalindRome foRmation in vivo 3.1. Palindromes are formed in vivo as a consequence of DNA metabolism

3.1.1. Replication slippage

3.1.2. Template switching

3.1.3. Fusion of sister chromatids

3.1.4. Replication of hairpin-capped DNA molecules4 | PalindRomes and human health

4.1. Palindromes can cause genetic diseases

4.1.1.Palindrome-mediatedβ-thalassemia 4.1.2. Palindrome-mediated translocations and associated syndromes

4.1.3.Palindromicamplificationsandcancer

1

2

2

4

7 7

8

11

13

15

17

17

20

23

25

25

26

27

29

31

34

34

34

36

38

Page 5: and distRibution of PalindRomes y Saccharomyces cerevisiae g · 2012. 1. 29. · genetic alterations which can have harmful impact on human health. The ability of palindromes to undermine

contents

DISCUSSION

5 | factoRs affecting PalindRome-stimulated Recombination

5.1. A 110-bp palindromic insert stimulates

recombination regardless of its genomic position

5.2. The role of CRP1, MUS81, SAE2 and SPO11

genes in palindrome-induced pop-out recombination

5.3. A threshold size for palindrome-induced recombination

6 | PalindRomic sequences in the yeast RefeRence genome sequence 6.1. A model for the evolutionary dynamics

of palindromic sequences in the yeast genome

CONCLUSIONS

REFERENCES

CURRICULUM VITAE

INDIVIDUAL PAPERS

Contents | v

41

42

42

46

47

49

51

53

54

64

65

Page 6: and distRibution of PalindRomes y Saccharomyces cerevisiae g · 2012. 1. 29. · genetic alterations which can have harmful impact on human health. The ability of palindromes to undermine

basic documentation card

University of ZagrebFaculty of ScienceDepartment of Biology

Recombinogenicity and distRibution of PalindRomes

in the yeast Saccharomyces cerevisiae genome

Berislav lisnićFaculty of Food Technology and Biotechnology

Department of biochemical engineeringLaboratory of biology and microbial genetics

Pierottijeva 6, 10 000 Zagreb, Croatia

Genomes of many organisms, from bacteria to mammals, are known to contain significant proportion of palindromic sequences. Although frequently located in cis-acting genetic elements, where they play an important role in the regulation of various cellular processes, palindromes have been recognized as recombinogenic motifs that can provoke various types of genome rearrangements and cause genetic diseases in humans. Employing yeast Saccharomyces cerevisiae as a model organism, this study provides systematic analysis of factors affecting palindrome-stimulated recombination, together with computational tools for distribution analysis of palindromes in DNA sequences. This work establishes for the first time that palindromic sequences become recombinogenic only after they attain a critical length of approximately 70 bp. In addition, the obtained results not only indicate that palindromes conform to specific rules of evolutionary dynamics, but also suggest the existence of a mechanism that suppresses the recombinogenicity of numerous palindromes longer than 70 bp present in the human genome.

Doctoral Thesis

(64 pages, 18 figures, 1 table, 192 references, original in English)

Thesis deposited in National and University Library, Hrvatske bratske zajednice 4, 10 000 Zagreb, Croatia.

Keywords: cruciform/genome stability/palindrome/recombination/yeast

Supervisor: dr. sc. Ivan Krešimir Svetec

Reviewers: dr. sc. Višnja Besendorfer dr. sc. Ivan Krešimir Svetec dr. sc. Kristian Vlahoviček

Thesis accepted: July 6th, 2011.

vi | Basic documentation card

Page 7: and distRibution of PalindRomes y Saccharomyces cerevisiae g · 2012. 1. 29. · genetic alterations which can have harmful impact on human health. The ability of palindromes to undermine

temeljna dokumentacijska kartica

Berislav lisnić

Palindromske sekvencije u značajnom su udjelu zastupljene u genomima velikog broja organizama, od bakterija do sisavaca. Iako se često nalaze u raznim cis-djelujućim genetičkim elementima, gdje igraju važnu ulogu u regulaciji raznih staničnih procesa, palindromi su prepoznati kao rekombinogeni motivi koji mogu uzrokovati kako raznovrsne rearanžmane genetičkog materijala, tako i genetičke bolesti kod ljudi. Koristeći kvasac Saccharomyces cerevisiae, provedena je sistematska karakterizacija čimbenika koji utječu na rekombinaciju potaknutu palindromom, te su razvijeni računalni alati za analizu distribucije palindroma u sekvencijama DNA. K tome, u ovom je radu po prvi puta utvrđeno da palindromske sekvencije postaju rekombinogene tek kada dosegnu kritičnu duljinu od približno 70 bp, a ukupni rezultati ne samo da pokazuju kako se palindromske sekvencije pokoravaju specifičnim pravilima evolucijske dinamike, već upućuju i na postojanje mehanizma koji suprimira rekombinogenost brojnih palindroma duljih od 70 bp, koji su prisutni u humanom genomu.

Prehrambeno-Biotehnološki FakultetZavod za biokemijsko inženjerstvo

Laboratorij za biologiju i genetiku mikroorganizamaPierottijeva 6, 10 000 Zagreb, Hrvatska

Rekombinogenost i distRibucija PalindRoma

u genomu kvasca Saccharomyces cerevisiae

Sveučilište u ZagrebuPrirodoslovno-matematički fakultetBiološki odsjek

Doktorska disertacija

(64 stranice, 18 slika, 1 tablica, 192 literaturna navoda, jezik izvornika: engleski)

Rad je pohranjen u Nacionalnoj i sveučilišnoj knjižnici, Hrvatske bratske zajednice 4, 10 000 Zagreb, Hrvatska.

Ključne riječi: kruciform/stabilnost genoma/palindrom/rekombinacija/kvasac

Mentor: dr. sc. Ivan Krešimir Svetec

Ocjenjivači: dr. sc. Višnja Besendorfer dr. sc. Ivan Krešimir Svetec dr. sc. Kristian Vlahoviček

Rad prihvaćen: 6. srpnja 2011.

Temeljna dokumentacijska kartica | vii

Page 8: and distRibution of PalindRomes y Saccharomyces cerevisiae g · 2012. 1. 29. · genetic alterations which can have harmful impact on human health. The ability of palindromes to undermine

list of papers

This thesis consists of the following papers:

o Lisnić B, Svetec IK, Štafa A, Zgaga Z (2009) Size-dependent palindrome-induced intrachromosomal recombination in yeast. DNA Repair 8: 383-389.

o Lisnić B, Svetec IK, Šarić H, Nikolić I, Zgaga Z (2005) Palindrome content of the yeast Saccharomyces cerevisiae genome. Current Genetics 47: 289-297.

o Svetec IK, Lisnić B, Zgaga Z (2002) A 110 bp palindrome stimulates plasmid integration in yeast. Periodicum Biologorum 104: 421-424.

viii | List of papers

Page 9: and distRibution of PalindRomes y Saccharomyces cerevisiae g · 2012. 1. 29. · genetic alterations which can have harmful impact on human health. The ability of palindromes to undermine

abbreviations

Abbreviations | ix

descRiPtion of yeast gene names

standaRd name name descRiPtion

CRP1 Cruciform DNA-recognizing protein 1 CYC1 Cytochrome c 1 DNA2 DNA synthesis defective 2 EXO1 Exonuclease 1 LYS2 Lysine requiring 2 MRE11 Meiotic recombination 11 MSH1 MutS homolog 1 MUS81 Ethyl-methane sulfonate, UV-sensitive clone 81PSO2 Psoralen derivative sensitive 2 RAD1 - RAD52 Radiation sensitive 1-52 SAE2 Sporulation in the absence of SPO eleven 2 SGS1 Slow growth suppresor 1 SLX4 Synthetic lethal of unknown (X) function 4 SPO11 Sporulation deficient mutant 1-1 URA3 Uracil requiring 3 XRS2 X-ray sensitive 2 YEN1 Yeast endonuclease 1

abbReviations foR commonly used teRmsA, C, G and T Adenine, cytosine, guanine and thymine Alu-IR Inverted repeat composed of two Alu elements bp Base pair DNA Deoxyribonucleic acid DSB Double-strand breakdsDNA, ssDNA Double-stranded DNA, single-stranded DNA FS2 Fragile site 2HSV-1 Herpes simplex virus 1IR Inverted repeatkb KilobaseMMV Mouse minute virusMRX Mre11/Rad50/Xrs2 protein complexNHEJ Non-homologous end-joiningPATRR Palindromic AT-rich repeatRNA Ribonucleic acidRMS Restriction-modification systemmRNA Messenger RNArRNA Ribosomal RNA SSA Single-strand annealingSSB Single-strand binding protein(s)Ty1 Transposon yeast 1UV Ultraviolet

Page 10: and distRibution of PalindRomes y Saccharomyces cerevisiae g · 2012. 1. 29. · genetic alterations which can have harmful impact on human health. The ability of palindromes to undermine

summary

x | Summary

Numerous DNA sequencing projects have demonstrated that genomes of many

organisms contain significant proportion of palindromes - repetitive DNA motifs

characterized by an internal two-fold axis of rotational symmetry. Palindromes are

frequently found in different regulatory regions of the genome, such as promoters,

operators and origins of replication, where they are involved in the regulation of

fundamental cellular processes. However, longer palindromes have also been recognized

as highly recombinogenic motifs, provoking different types of potentially dangerous

genetic alterations which can have harmful impact on human health.

The ability of palindromes to undermine the stability of genetic material and promote

DNA rearrangements has been linked to their potential to form secondary structures

in DNA known as hairpins and cruciforms. Once formed, these secondary structures

can be processed to recombinogenic double-strand breaks that can lead to genome

rearrangements and development of serious genetic diseases, such as β-thalassemia,

Emanuel syndrome, type-1 neurofibromatosis and cancer. However, in spite of their

negative influence on genetic stability and human health, factors affecting palindrome-

mediated DNA breakage and recombination still remain poorly characterized.

In this study, using yeast Saccharomyces cerevisiae as a model organism, the

effect of several physical and genetic parameters, primarily length and position, but

also the deletion of MUS81, SPO11, CRP1 and SAE2 genes on palindrome-stimulated

recombination was systematically investigated. In addition, we developed a computer

program which can identify, locate and count palindromes in a given sequence in a

strictly defined way, and we used this program to analyze the palindrome content of the

entire yeast genome. The obtained results demonstrate that, regardless of their position

in the genome, palindromic sequences become recombinogenic only after they attain a

critical length of approximately 70 bp, whereas the deletion of the SAE2 gene completely

abolishes the high recombination rate in palindrome-containing constructs. Moreover,

computational analyses enabled us to propose a general model for the evolution of

palindromic sequences in yeast, while the combined results suggest that suppression of

palindrome-induced recombination may be crucial for the genetic stability of genomes

containing large numbers of very long palindromes, such as the human genome.

Page 11: and distRibution of PalindRomes y Saccharomyces cerevisiae g · 2012. 1. 29. · genetic alterations which can have harmful impact on human health. The ability of palindromes to undermine

individual papers

Page 12: and distRibution of PalindRomes y Saccharomyces cerevisiae g · 2012. 1. 29. · genetic alterations which can have harmful impact on human health. The ability of palindromes to undermine

DNA Repair 8 (2009) 383–389

Contents lists available at ScienceDirect

DNA Repair

journa l homepage: www.e lsev ier .com/ locate /dnarepai r

Size-dependent palindrome-induced intrachromosomal recombination in yeast

Berislav Lisnic, Ivan-Kresimir Svetec, Anamarija Stafa, Zoran Zgaga ∗

Laboratory of Biology and Microbial Genetics, Department of Biochemical Engineering, Faculty of Food Technology and Biotechnology, Pierottijeva 6, 10 000 Zagreb, Croatia

a r t i c l e i n f o

Article history:Received 3 September 2008Received in revised form 15 October 2008Accepted 25 November 2008Available online 4 January 2009

Keywords:PalindromeSAE2Pop-outGenome stabilityRecombination

a b s t r a c t

Palindromic and quasi-palindromic sequences are important DNA motifs found in various cis-actinggenetic elements, but are also known to provoke different types of genetic alterations. The instability ofsuch motifs is clearly size-related and depends on their potential to adopt secondary structures known ashairpins and cruciforms. Here we studied the influence of palindrome size on recombination between twodirectly repeated copies of the yeast CYC1 gene leading to the loss of the intervening sequence (“pop-out”recombination). We show that palindromes inserted either within one copy or between the two copiesof the CYC1 gene become recombinogenic only when they attain a certain critical size and we estimatethis critical size to be about 70 bp. With the longest palindrome used in this study (150 bp) we observeda more than 20-fold increase in the pop-out recombination. In the sae2/com1 mutant the palindrome-stimulated recombination was completely abolished. Suppression of palindrome recombinogenicity maybe crucial for the maintenance of genetic stability in organisms containing a significant number of largepalindromes in their genomes, like humans.

© 2008 Elsevier B.V. All rights reserved.

1. Introduction

Palindromes, defined as inverted repeats without spacer DNA,are present in the genetic material of all living organisms, in viralgenomes and plasmids. Their significance was first appreciated inthe context of the bacterial type II restriction–modification sys-tems and a number of regulatory proteins were found later to bindpalindromic sequences as homo- or hetero-dimers. Another impor-tant property of palindromes and other types of closely spacedinverted repeats is their intrinsic potential to form intra-strandhydrogen bonds giving rise to secondary structures known as hair-pins and cruciforms. Both protein binding and secondary structureformation make palindromic sequences important constituentsof different cis-acting genetic elements like promoters, operators,attenuators, viral and plasmid origins of replication [1–5].

Another reason to study palindromes is based on observa-tions obtained in different experimental systems indicating thatthey induce genetic instability. In Escherichia coli, it is very diffi-cult to propagate plasmids with palindromes longer than 160–200nucleotides and such instability can only partially be suppressedin sbcC/sbcD mutants [6,7]. A 160 bp palindrome induced geneticrecombination in vegetative cells of the yeast Schizosaccharomycespombe [8], while a 140 bp palindrome presented a hot spot formeiosis-specific double-strand breaks in S. cerevisiae [9]. Strong,palindrome-induced genetic instability was also observed in the

∗ Corresponding author. Tel.: +385 1 4836013; fax: +385 1 4836016.E-mail address: [email protected] (Z. Zgaga).

yeast vegetative cells where the inverted dimer of the URA3gene (1.02 kb) introduced in the genome stimulated recombinationin adjacent region up to 17,000-fold [10]. However, palindromicsequences present on plasmid molecules that could not be prop-agated in E. coli without rearrangements were maintained in theyeast genome and it was suggested that the yeast S. cerevisiae shouldbe used for cloning and propagation of long palindromes [11]. Thepotential of palindromic and near-palindromic sequences to inducegenetic rearrangements was also observed in mammalian cells.For example, Cunningham et al. [12] established the murine cellline bearing an artificially introduced 15.4 kb palindrome that wasrearranged and subsequently stabilized by “center-break mecha-nism” at the rate of about 0.5% per population doubling. In humans,long AT-rich near-palindromic sequences were identified as breakpoints for meiotic translocations [13,14], while the formation oflarge palindromes observed in human cancers was associated withgenomic amplifications [15]. Moreover, it has been proposed thatthe nonrandom formation of large palindromes in human can-cers may be triggered by the presence of shorter palindromicor near-palindromic sequences that can adopt hairpin and cruci-form structures [16]. To account for these and other examples ofpalindrome-induced genetic instability, two non-exclusive typesof models have been proposed: first, which depend on templatemisalignment during replication and second, which require anenzymatic activity that can transform secondary structures formedin vivo into double-strand breaks (reviews: [17–20]). The membersof the polyfunctional MRX complex (Mre11-Rad50-Xrs2), togetherwith the product of the SAE2 gene, were identified in yeast as pro-teins involved in processing of the hairpin-capped break needed

1568-7864/$ – see front matter © 2008 Elsevier B.V. All rights reserved.doi:10.1016/j.dnarep.2008.11.017

paper 1

66 | Size-dependent palindrome-induced intrachromosomal recombination in yeast

Page 13: and distRibution of PalindRomes y Saccharomyces cerevisiae g · 2012. 1. 29. · genetic alterations which can have harmful impact on human health. The ability of palindromes to undermine

384 B. Lisnic et al. / DNA Repair 8 (2009) 383–389

for generation of recombinogenic DNA ends [21,22] and recentlyit was shown that the plasmid–borne palindromic sequences aretransformed to DSBs in a Mus81-dependent manner [23].

Important genetic elements, but also potentially destabilizingmotifs, palindromic sequences are attractive DNA patterns for sys-tematic study at genome level. Short palindrome avoidance wasfirst observed in the genomes of some bacteriophages and bac-teria and this bias was explained in the light of the activity ofbacterial restriction–modification systems [24]. However, shortpalindromes were also under-represented in the yeast S. cerevisiae,but at the same time palindromes longer than 12 bp were shownto be highly over-represented [25]. Long, AT-rich palindromeswere also over-represented in the chromosomes III and X of thenematode Caenorhabditis elegans [26]. In the human genome palin-dromic sequences are over-represented in the introns and upstreamregions. More than 30 palindromes longer than 100 bp were scored,including several palindromes attending up to 300–400 bp [27].Moreover, it has been proposed that longer palindromes may beeven more frequent in eukaryotic genomes but remain undetecteddue to their instability during cloning in the E. coli cells [28].

An excess of large palindromes could clearly present a threat togenome integrity and we decided here to investigate in more detailthe influence of palindrome size on palindrome-induced geneticinstability. For this purpose palindromic sequences of increasinglengths were introduced in the yeast genome in order to monitorwhether they would stimulate intrachromosomal recombinationbetween flanking homologous sequences. Our results indicated thatpalindrome-stimulated pop-out recombination is size-dependent,but only after a critical size of approximately 70 bp is reached.

2. Materials and methods

2.1. Media and growth conditions

Yeast cells were grown at 28 ◦C either in the standard YPD or insynthetic complete (SC) medium, Ura+ transformants were selectedon synthetic SC medium lacking uracil (SC-ura), and geneticinresistant transformants were selected on YPD plates containing200 �g/mL G418 [29]. Pop-out recombinants were selected on SCplates supplemented with 1 g/L 5-fluoroorotic acid (5-FOA) and33.6 �g/mL uracil [30]. E. coli DH5� was grown at 37 ◦C in LBmedium supplemented with 50 �g/mL ampicillin, when necessary[31].

2.2. Plasmids

Plasmids used in this study for introduction of palindromicsequences into the yeast genome were based on the integrative vec-tor pAB218-2 (6504 bp) containing the yeast CYC1 region (1691 bp)and the URA3 gene [32]. Synthetic palindromic inserts were intro-duced either in the plasmid backbone (PvuII site) or within the yeastCYC1 region (EcoRI site). The longest, 136 bp EcoRI–EcoRI palin-dromic insert presented in Fig. 2c was constructed by self-ligationand subsequent digestion with EcoRI of the 68 bp EcoRI–BlnI frag-ment from the multiple cloning site (MCS) present in the plasmidpAB218-8 [33]. This palindrome was then either ligated directly intothe EcoRI site of pAB218-2 to give rise to the plasmid pAB7-146,or end-filled using the Klenow fragment and then ligated into thePvuII site, giving rise to the plasmid pAB9-150. Although the samepalindromic insert was used for construction of these two plas-mids, the overall size of palindromic sequence in pAB7-146 was146 bp since the EcoRI site in pAB218-2 is flanked by T:A and A:Tbase pairs. Similarly, the overall size of the palindromic sequencein pAB9-150 is 150 bp, since the 136 bp palindromic fragment wasend-filled prior to insertion into the PvuII site in pAB218-2, which

is flanked by G:C and C:G base pairs. Plasmids with shortenedpalindromic sequences present either in the CYC1 region (pAB7-110, pAB7-98, pAB7-74, pAB7-50 and pAB7-20) or in the plasmidbackbone (pAB9-114, pAB9-102, pAB9-78, pAB9-54 and pAB9-24)were constructed by digestion of corresponding parental plasmids(pAB7-146 and pAB-150, respectively) with the appropriate restric-tion enzyme (HindIII, SphI, SalI, BamHI or SacI; see Fig. 2c for exactlocation of these sites) and subsequent religation. The overall sizeof a palindromic sequence formed in each plasmid is indicated inits name. However, it should be noted that the restriction sitesEcoRI and PvuII present in the plasmid pAB218-2 (no palindromicinsert), together with their flanking bases, are also 8 bp palindromicsequences. Standard molecular biology procedures were used for allDNA manipulations [31].

2.3. Yeast transformation

We used standard lithium acetate transformation procedure[34] for construction of the sae2::kanMX strains, but without carrierDNA added. For the construction of all other yeast strains a modifiedspheroplast procedure, described previously, was used [35].

2.4. Yeast strains

All yeast strains used in this study were isogenic to the parentalstrain FF18-52 (MATa, ade5, leu2-3,119, trp1-289, ura3-52, canR).The strain FF-Ura+, in which the ura3-52 allele was replaced bythe functional URA3 gene, was isolated after the transformation ofthe FF18-52 strain with circular plasmid pAB218-2 and Southernblot analysis of the Ura+ transformants obtained. The yeast strainsused to study the influence of palindromic sequences on pop-outrecombination were constructed by transformation of FF18-52 withappropriate plasmids linearized with XhoI in order to target plas-mid integration into the yeast CYC1 region present on chromosomeX [32]. The control strain F-OOO, with no inserted palindromicsequences, was constructed by integration of the plasmid pAB218-2 in the CYC1 region. As explained before, this construct containedthe endogenous 8 bp palindromes in positions a–c as presentedin Fig. 1. The yeast strains F-OPO-150, F-OPO-114, F-OPO-102, F-OPO-78, F-OPO-54 and F-OPO-24 were constructed by targeted

Fig. 1. Pop-out assay for palindrome-induced intrachromosomal recombination.Recombination between directly repeated CYC1 regions, associated with loss ofthe intervening sequence, generated uracil auxotrophs that were selected on theplates containing 5-FOA. Palindromic sequences (head-to-head arrows, not drawnto scale) were inserted in one of the following locations: EcoRI site present in thecentromere-proximal (a) or centromere-distal (c) copy of the CYC1 gene (strains F-POO and F-OOP, respectively) or in the PvuII site found in the plasmid backbone(strains F-OPO, position b). A–F indicate locations of primers used for amplifica-tion of palindrome-containing regions in the yeast genome. The centromere onchromosome X is indicated by a circle.

paper 1

Size-dependent palindrome-induced intrachromosomal recombination in yeast | 67

Page 14: and distRibution of PalindRomes y Saccharomyces cerevisiae g · 2012. 1. 29. · genetic alterations which can have harmful impact on human health. The ability of palindromes to undermine

B. Lisnic et al. / DNA Repair 8 (2009) 383–389 385

Fig. 2. Analysis of palindromes introduced in the yeast genome. (a) Amplification of the target loci with primers E and F. The expected sizes of the amplified E–F fragments areas follows: 553 bp, 565 bp, 595 bp, 619 bp, 643 bp, 655 bp and 691 bp in the strains F-OOO, F-OPO-24, F-OPO-54, F-OPO-78, F-OPO-102, F-OPO-114 and F-OPO-150, respectively.DNA fragments were separated by electrophoresis in the 1% agarose gel. Size standard was from Fermentas. (b) The E and F fragment amplified from the strain F-OPO-150(691 bp) was further digested with all restriction enzymes that have their recognition sites present in the palindrome. Indicated are the sizes of the fragments generated aftercleavage with BlnI deduced from the known DNA sequence. The third DNA fragment produced by restriction enzyme cleavage, containing the central part of the palindrome,is not visible on the figure. Digestion products were separated by electrophoresis in 8% polyacrylamide gel. Analogous analysis was performed with all strains containingpalindromic insertions. (c) Nucleotide sequence of the 150 bp palindrome present in the strain F-OPO-150. The restriction enzymes used in this analysis, together with theircleavage sites, are indicated. The sequence of the shorter palindromic inserts can be deduced from this figure by joining symmetrically placed restriction enzyme sites writtenin bold. For example, joining of the two HindIII sites creates the 114 bp palindrome found in the strain F-OPO-114.

integration of the plasmids pAB9-150, pAB9-114, pAB9-102, pAB9-78, pAB9-54 and pAB9-24 into the CYC1 region and they containedpalindromic sequences of decreasing sizes (150, 114, 102, 78, 54and 24 bp, respectively) between the two copies of the CYC1 region(Fig. 1, position b). On the other side, targeted integration of theplasmids pAB7-146, pAB7-110, pAB7-98, pAB7-74, pAB7-50 andpAB7-20 could give rise to two types of transformants: those con-taining palindromic insertion in the left or in the right copy of theCYC1 region (positions a and c in Fig. 1). Southern blot was usedto distinguish between these two possibilities and to isolate theyeast strains F-POO-146, F-POO-110, F-POO-98, F-POO-74, F-POO-50, and F-POO-20 containing palindromic sequences of decreasingsizes (146, 110, 98, 74, 50 and 20 bp, respectively) in the centromere-proximal (left) copy of the CYC1 region (Fig. 1, position a). Thesame strategy was used to construct the strain F-OOP-110 con-taining the 110 bp palindrome in the centromere-distal copy ofthe CYC1 region and two strains with non-palindromic inserts inthe CYC1 region, F-OON-110 and F-NOO-110, where the plasmidpAB218-8 cut with XhoI was used for transformation. The pres-ence of palindromic (or non-palindromic) inserts in our constructswas confirmed by Southern blotting and restriction analysis of theinserts amplified by PCR, as described in Fig. 2. We did not encounterany difficulties in amplifying palindromic sequences under stan-dard conditions [30] using GoTaq® Polymerase (Promega, Cat. No.M3178). Left copy of the CYC1 region was amplified with primersC (5′-GAG-CGG-ATA-CAT-ATT-TGA-AT-3′) and D (5′-CTT-CCA-ACT-TCT-TAC-CAG-TG-3′), inserts placed in the plasmid backbone withprimers E (5′-GGA-ATT-CCC-CCT-TAC-ACG-GAG-3′) and F (5′-GGA-ATT-CGC-CTT-TGA-GTG-AGC-3′), and right copy of the CYC1 regionwith primers A (5′-AAC-GGA-AGC-GAT-GAT-GAG-AG-3′) and B (5′-GGC-ACC-TGT-CCT-ACG-AGT-TG-3′). Positions of all primers usedare indicated in Fig. 1. Finally, deletion of the SAE2 gene in thestrains F-OOO and F-OPO-114 was carried out by standard one-step gene replacement procedure [36]. The disruption cassette usedfor inactivation of the SAE2 gene was amplified from the strain

BY4742:�sae2 [37] using primers 302 (5′-GCT-GAA-TGA-AGC-AAC-TCC-TG-3′ and 303 (5′-CGT-TCC-CGT-GGT-AGA-AAT-GC-3′). Theaccuracy of deletion of the SAE2 gene in the strains F-OOO andF-OPO-114 was verified by Southern blotting using DIG-labelledkanMX as a probe and additionally by PCR with primers 302,303, K1 (5′-TAC-AAT-CGA-TAG-ATT-GTC-GCA-C-3′) and K2 (5′-AAT-CAC-CAT-GAG-TGA-CGA-CT-3′). The list of relevant yeast strainsused here is presented in Table 1. Note that the position, typeand size of the insert are indicated in the strain name (forexample, F-OON-110 indicates the strain with non-palindromic

Table 1Yeast strains used to study palindrome-induced recombination.

Straina Size and location of a palindromic ornon-palindromic insertionb

F-OOOc –F-POO-20 20 bp palindrome in aF-POO-50 50 bp palindrome in aF-POO-74 74 bp palindrome in aF-POO-98 98 bp palindrome in aF-POO-110 110 bp palindrome in aF-POO-146 146 bp palindrome in aF-OPO-24 24 bp palindrome in bF-OPO-54 54 bp palindrome in bF-OPO-78 78 bp palindrome in bF-OPO-102 102 bp palindrome in bF-OPO-114 114 bp palindrome in bF-OPO-150 150 bp palindrome in bF-OOP-110 110 bp palindrome in cF-OON-110 110 bp non-palindrome in cF-NOO-110 110 bp non-palindrome in aF-OOO/sae2 –F-OPO-114/sae2 114 bp palindrome in b

a All strains are derived from the strain FF-18-52 MATa, ade5, leu2-3,119, trp1-289,ura3-52, canR .

b Locations a–c are indicated in Fig. 1.c The strain F-OOO contained endogenous 8 bp palindromes in locations a–c.

68 | Size-dependent palindrome-induced intrachromosomal recombination in yeast

paper 1

Page 15: and distRibution of PalindRomes y Saccharomyces cerevisiae g · 2012. 1. 29. · genetic alterations which can have harmful impact on human health. The ability of palindromes to undermine

386 B. Lisnic et al. / DNA Repair 8 (2009) 383–389

110 bp insertion present in the centromere-distal copy of the CYC1region)

2.5. Estimation of recombination rates

Pop-out recombination rate was determined by fluctuationanalysis using 15 independent cultures per strain. For eachstrain, a preculture grown to stationary phase in SC-ura medium(∼108 cells/mL) was diluted 10,000 times with SC medium and25 �L aliquots were further distributed into each of the 20 culturetubes. The tubes were then incubated on orbital shaker at 175 rpmuntil stationary phase was reached. At the end of growth, the num-ber of Ura− recombinants was determined by plating the contentof each of the 15 cultures on plates containing 5-FOA. The totalnumber of cells was determined by plating the appropriate dilu-tion of the remaining five cultures on YPD plates. Colonies werecounted after 48 h and the most probable number of recombinationevents, together with respective 99% confidence intervals for eachstrain, was determined in WR Mathematica v5.2 by the maximum-likelihood method described by Lea and Coulson [38] using thestatistical fluctuation analysis package SALVADOR [39]. Recombina-tion rates were expressed as the ratio of the most probable numberof recombination events and the number of cell divisions, and wereconsidered significantly different if there was no overlap between99% confidence intervals.

2.6. Gel electrophoresis

Agarose (1.0%) and polyacrylamide (8.0%) gels were run in 1×TBE [31]. Gel images were taken through a red/orange filter withSony DSC-W90 digital camera, converted to grayscale and croppedto show only relevant portions of the gels. Figures were drawn inAdobe Illustrator CS3.

3. Results

3.1. Experimental system

In order to study the influence of palindrome size on its poten-tial to induce recombination in yeast, we constructed two seriesof plasmids containing palindromic sequences of increasing sizesinserted either in the yeast CYC1 gene or in the plasmid backbone.Targeted integrations of these plasmids into the yeast CYC1 locusresulted in genetic structures depicted in Fig. 1. Palindromes werepresent either within one of the CYC1 sequences (strains F-POOand F-OOP) or between the two CYC1 sequences (strains F-OPO).We also constructed yeast strains that contained a non-palindromic110 bp insertion in the CYC1 gene (strains F-NOO-110 and F-OON-110, Table 1). Recombination between the two CYC1 sequencesassociated with deletion of the intervening sequence (pop-out)gives rise to uracil auxotrophs that can be selected on the syntheticmedium containing 5-FOA and we wondered whether the presenceof palindromic sequences would stimulate this type of homologousintrachromosomal recombination (Fig. 1).

Having in mind the instability of palindromes observed bothin the E. coli and in yeast cells, we first checked for the pres-ence and integrity of palindromes in our constructs (Fig. 2). PCRamplifications of the target sequences from the genomic DNA gavefragments of expected sizes (Fig. 2a). The amplified fragments werefurther digested with different restriction enzymes that recognizesequences contained in the palindromes (Fig. 2b and c). This anal-ysis demonstrated that all restriction enzyme sites were conservedin our constructs indicating that the palindromic sequences weresuccessfully introduced and maintained in the yeast genome.

Fig. 3. Palindrome-stimulated pop-out recombination is SAE2-dependent. Errorbars indicate 99% confidence intervals.

3.2. Stimulation of pop-out recombination by the presence of a110 bp palindromic insert is abolished in SAE2 mutant

In the control strain F-OOO, that contained no palindromicsequences inserted, the uracil auxotrophs occurred at the rate of2.82 × 10−5. This was more than two orders of magnitude higherthan the mutation rate to uracil auxotrophy determined in the strainFF-Ura+ (1.60 × 10−8) indicating that the pop-out recombinationbetween the two copies of the CYC1 gene is dominant mecha-nism for generation of Ura− colonies in our strains, as presentedin Fig. 1. Insertion of a 110 bp palindrome either within or betweenthe two copies of the CYC1 gene further stimulated the pop-outrate, while the presence of a non-palindromic insert of the samesize in either copy of the CYC1 resulted in a small, but significantdecrease in the pop-out rate. Finally, in the �sae2 background, nodifference in the pop-out rate was observed between the straincontaining palindromic insertion (F-OPO-114/sae2) and the controlstrain (F-OOO/sae2) so we conclude that the palindrome-inducedstimulation of pop-out recombination was completely dependenton the functional SAE2 gene (Fig. 3). We also note that even in thestrain without palindromic insertion the sae2 mutation exhibiteda small, but significant influence on the pop-out rate which wasdecreased from 2.82 × 10−5 in the strain F-OOO to 1.31 × 10−5 inthe strain F-OOO/sae2.

3.3. Palindrome-stimulated pop-out recombination depends onpalindrome size and location

Pop-out rates were further determined in the strains with palin-dromic sequences of different sizes inserted either within theleft copy of the CYC1 gene (strains F-POO) or between the twoCYC1 genes (Strains F-OPO). Similar, but not identical results wereobserved with these two series of strains (Fig. 4). For the F-OPOstrains the presence of 24 and 54 bp palindromes did not changethe pop-out rate, while both 20 and 50 bp palindromes inserted inone copy of the CYC1 gene resulted in a small decrease, rather thanincrease in the pop-out rate. On the other side, although longerpalindromes were recombinogenic in both series of strains, the

Size-dependent palindrome-induced intrachromosomal recombination in yeast | 69

paper 1

Page 16: and distRibution of PalindRomes y Saccharomyces cerevisiae g · 2012. 1. 29. · genetic alterations which can have harmful impact on human health. The ability of palindromes to undermine

B. Lisnic et al. / DNA Repair 8 (2009) 383–389 387

Fig. 4. Palindrome-stimulated recombination is size-dependent. Pop-out recombi-nation rates in the strains with palindromic insertions in two different locations arepresented (strains F-OPO and F-POO). First data point relates to the control strainF-OOO, that has no palindromic insert. Error bars indicate 99% confidence inter-vals (please note that in some cases the confidence intervals are very narrow andobscured by the data points).

more pronounced increase was observed with the F-OPO strainswhere the presence of a 150 bp palindrome resulted in 22-foldstimulation of the pop-out rate.

4. Discussion

While short palindromes are important DNA motifs frequentlyfound in different cis-acting genetic elements longer palindromicsequences are highly unstable in various organisms. Since recom-binogenicity of a palindrome is clearly size-related we decided toinvestigate how the step-by-step increase in the size of a palin-drome introduced in the yeast genome will affect its geneticstability. The pop-out assay developed here to measure geneticrecombination was convenient for introduction of different palin-dromic sequences present on bacterial plasmids directly into theyeast genome, but also demonstrated a high sensitivity required fordetection of slight differences in genetic stability. In our previouswork we analyzed by Southern blotting 80 5-FOA resistant coloniesderived from the strain F-POO-110 and 60 colonies derived fromthe strain F-NOO-110. They have all lost the intervening sequenceindicating that they were indeed produced by pop-out recombi-nation [33]. Two different mechanisms could be responsible forthe generation of uracil auxotrophs in our pop-out assay: single-strand annealing (SSA) involving repeated copies of the CYC1 gene,or intrachromosomal reciprocal recombination between these twohomologies [40]. As stated before, the spontaneous mutation rateto uracil auxotrophy was more than two orders of magnitude infe-rior to the pop-out rate reported here and could not significantlyinfluence our results.

4.1. The role of SAE2 gene in palindrome-stimulated pop-outrecombination

The results presented in Fig. 3 clearly indicate that the insertionof a 110 palindromic sequence either within or between directlyrepeated copies of the CYC1 gene stimulated pop-out recombi-nation. A non-palindromic insertion of the same size and placedin the same location in either copy of the CYC1 gene (strainsF-NOO-110 and F-OON-110) resulted in decreased, rather than

increased pop-out rate. Suppression of the pop-out recombinationby the presence of point mutations in recombining homologies wasalready reported [41] and our results indicate that the presence ofdeletions/insertions could have the same effect (see also below).However, the palindrome-induced recombination was completelyabolished in the �sae2 background where the pop-out rates weresimilar in the F-OPO-114 and F-OOO strains (Fig. 3). First isolatedas a gene required for meiotic DSB repair [42,43] the SAE2 genewas also shown to be involved in DNA repair in yeast vegetativecells [44] and recombination between inverted [22,45] and directrepeats [46,47]. Formation of chromosome inverted duplicationsin the �sae2 cells has been attributed to their inability to pro-cess hairpin structures formed between inverted repeats [22] andrecently it was shown that the Sae2 protein operates as an endonu-clease, together with the Mre11-Rad50-Xrs2 (MRX) complex, tocleave hairpin structures in vitro [21]. The 20-fold reduction in thepop-out rate in the F-OPO-114/sae2 strain strongly suggests thatthe palindrome-induced recombination observed in our assay alsorequires the formation and processing of hairpin-capped breaks[22,23]. On the other side, a small but significant reduction in thepop-out rate in the strain without palindromic insertion (F-OOO)observed in the �sae2 background is consistent with results ofClerici et al. [46] who demonstrated that the DSB repair by the SSApathway is delayed and less efficient in the absence of the Sae2protein.

4.2. The influence of palindrome location

The same 110 bp palindromic insertion stimulated pop-outrecombination in all three locations tested, but not to the sameextent: while the pop-out rates in the strains F-POO-110 and F-OOP-110 were comparable (1.06 × 10−4 and 0.98 × 10−4 respectively),twice as much uracil auxotrophs per cell division (2.37 × 10−4) weregenerated in the strain F-OPO-114 (Fig. 3). Apart from the fact thatthis palindrome is 4 bp longer, an increased pop-out rate may alsoreflect different possibilities and requirements for repair of a DSBintroduced in the palindromic sequence present either betweenthe two copies of the CYC1 gene or within one copy. The resultspresented in Fig. 4 also indicate higher stimulation of the pop-outrate in the F-OPO strains. While in both cases the SSA pathwaycould give rise to uracil auxotrophs, the conversion pathway, whichis rarely associated with reciprocal recombination in yeast, maycompete for repair of the breaks induced within the CYC1 gene.Moreover, the break introduced within the palindromic insertion inthe CYC1 gene would produce heterologous termini that can impairefficient repair by the conversion-type recombination [48]. In otherwords, although the formation of recombinogenic structures couldbe equally frequent in all locations, the breaks introduced betweenthe two copies of the CYC1 could be more productive in generatinguracil auxotrophs. However, it is also possible that some other fac-tors, like local chromosome structure/supercoiling or proximity tosome cis-acting elements like origins of replication [10], could influ-ence extrusion of the hairpin/cruciform and its transformation tothe recombinogenic break.

4.3. A critical size for palindrome-induced recombination in yeast

In order to get a better insight in the size-dependence ofpalindrome-induced recombination we decided to insert palin-dromes of different sizes in two different locations, within onecopy of the CYC1 gene (strains F-POO), and between the two copies(strains F-OPO). In spite of some differences that were observedbetween these two series of yeast strains and that were alreadydiscussed, they revealed the same basic response (Fig. 4). Namely,the increase in the pop-out rate observed was not gradual andno stimulation of intrachromosomal recombination was observed

70 | Size-dependent palindrome-induced intrachromosomal recombination in yeast

paper 1

Page 17: and distRibution of PalindRomes y Saccharomyces cerevisiae g · 2012. 1. 29. · genetic alterations which can have harmful impact on human health. The ability of palindromes to undermine

388 B. Lisnic et al. / DNA Repair 8 (2009) 383–389

with palindromes up to 50/54 bp. A significant increase with the F-OPO strains was observed when the palindrome size reached 78 bp.In the F-POO strains, the insertion of 20 and 50 bp palindromesresulted in decreased, rather than increased pop-out rate. As dis-cussed before, these short palindromes may act as heterologousinsertions within one copy of the CYC1 gene reducing the pop-outrate, but when 50 and 74 bp palindromes were compared, a sig-nificant increase in the pop-out rate was observed. The increasein palindrome length for the next 14 bp resulted in more thantwofold increase in the pop-out rate and was followed by verystrong further stimulation of genetic instability. Based on this anal-ysis we conclude that palindromes become recombinogenic onlywhen they attain a certain critical size and in our case we can esti-mate this critical size to be about 70 bp. Since recombinogenicityof palindromic sequences is related to their potential to form sec-ondary structures it may be that smaller palindromes either do notform such structures in vivo or that they are very unstable andshort-lived. Conversely, with increasing size of a palindrome thecruciform/hairpin structure may become more stable and readilytransformed to a DSB. Local melting of DNA duplex and adoptionof cruciform structure are also temperature-dependent [49] whichmay lead to higher palindrome instability at elevated tempera-tures (data not shown). However, we cannot exclude the possibilitythat even below the estimated critical size palindromes may formhairpins/cruciforms in vivo, but that they are not subjected to thecleavage by secondary structure-specific nucleases. This possibil-ity is supported by the observation that only hairpins longer than70 bp are recognized and repaired during yeast meiosis in a pathwayindependent from the Msh1 and Rad1 proteins [9].

4.4. Suppression of palindrome-induced recombination may becrucial for the genetic stability of genomes containing longpalindromes

To summarize, we show that even though palindromes longerthan approximately 70 bp may be propagated and maintained inyeast, they present a source of genetic rearrangements and willeventually be eliminated from the genome in the absence of pos-itive selection. However, can our results be exploited to anticipatethe stability of palindromic sequences present in the genomesof other organisms, notably, in humans? The available informa-tion about the presence of longer palindromes in the genomes israther sporadic, and only a few systematic studies have been per-formed. While the longest palindrome in the yeast genome has44 bp [25] and only a few palindromes longer than 60 bp werefound on chromosomes III and X in C. elegans [26], palindromesexceeding 100 bp were found in some bacterial genomes (Lisnic etal., unpublished) and more than 30 palindromes longer than 100 bpwere scored in the haploid human genome [27]. These results sug-gest that their recombinogenicity may be controlled at the cellularlevel. Kurahashi et al. [50] have demonstrated a very high incidenceof near-palindrome mediated t(11;22) chromosomal translocationsin male meiosis and they proposed that the factor(s) required fortransformation of palindromic sequences to DSBs are present onlyin meiotic cells. Alternatively, it is also possible that the recombino-genicity of palindromes is in fact actively suppressed in somaticcells in order to maintain genome stability. The inactivation ofa function involved in suppression of palindrome instability maytrigger recombination processes giving rise to different geneticrearrangements described in detail in yeast [51], but also observedin human cancer cells [15].

Conflicts of interest

No conflicts of interest.

Acknowledgements

We gratefully acknowledge the assistance of our students Mar-ija Kovac, Vanda Juranic-Lisnic and Lucija Barbaric in differentstages of this project. We also thank Natasa Tomasevic for technicalassistance. This work was funded by Croatian Ministry of Science,Education and Sports [grant number 2258].

References

[1] J.T. Andersen, P. Poulsen, K.F. Jensen, Attenuation in the rph-pyrE operon ofEscherichia coli and processing of the dicistronic mRNA, Eur. J. Biochem. 206(1992) 381–390.

[2] K. Hiratsu, S. Mochizuki, H. Kinashi, Cloning and analysis of the replication originand the telomeres of the large linear plasmid pSLA2-L in Streptomyces rochei,Mol. Gen. Genet. 263 (2000) 1015–1021.

[3] R.R. Sinden, S.S. Broyles, D.E. Pettijohn, Perfect palindromic lac operator DNAsequence exists as a stable cruciform structure in supercoiled DNA in vitro butnot in vivo, Proc. Natl. Acad. Sci. U.S.A. 80 (1983) 1797–1801.

[4] S.K. Thukral, A. Eisen, E.T. Young, Two monomers of yeast transcription factorADR1 bind a palindromic sequence symmetrically to activate ADH2 expression,Mol. Cell Biol. 11 (1991) 1566–1577.

[5] S.K. Weller, A. Spadaro, J.E. Schaffer, A.W. Murray, A.M. Maxam, P.A. Schaffer,Cloning, sequencing, and functional analysis of oriL, a herpes simplex virus type1 origin of DNA synthesis, Mol. Cell Biol. 5 (1985) 930–942.

[6] A.F. Chalker, D.R. Leach, R.G. Lloyd, Escherichia coli sbcC mutants permit stablepropagation of DNA replicons containing a long palindrome, Gene 71 (1988)201–205.

[7] J.C. Connelly, D.R. Leach, The sbcC and sbcD genes of Escherichia coli encode anuclease involved in palindrome inviability and genetic recombination, GenesCells 1 (1996) 285–291.

[8] J.A. Farah, E. Hartsuiker, K. Mizuno, K. Ohta, G.R. Smith, A 160-bp palindromeis a Rad50.Rad32-dependent mitotic recombination hotspot in Schizosaccha-romyces pombe, Genetics 161 (2002) 461–468.

[9] F. Nasar, C. Jankowski, D.K. Nag, Long palindromic sequences induce double-strand breaks during meiosis in yeast, Mol. Cell Biol. 20 (2000) 3449–3458.

[10] K.S. Lobachev, B.M. Shor, H.T. Tran, W. Taylor, J.D. Keen, M.A. Resnick, D.A.Gordenin, Factors affecting inverted repeat stimulation of recombination anddeletion in Saccharomyces cerevisiae, Genetics 148 (1998) 1507–1524.

[11] A.J. Rattray, A method for cloning and sequencing long palindromic DNA junc-tions, Nucleic Acids Res. 32 (2004) e155.

[12] L.A. Cunningham, A.G. Cote, C. Cam-Ozdemir, S.M. Lewis, Rapid, stabilizingpalindrome rearrangements in somatic cells by the center-break mechanism,Mol. Cell Biol. 23 (2003) 8740–8750.

[13] H. Kurahashi, B.S. Emanuel, Long AT-rich palindromes and the constitutionalt(11;22) breakpoint, Hum. Mol. Genet. 10 (2001) 2605–2617.

[14] H. Kurahashi, T. Shaikh, M. Takata, T. Toda, B.S. Emanuel, The constitutionalt(17;22): another translocation mediated by palindromic AT-rich repeats, Am.J. Hum. Genet. 72 (2003) 733–738.

[15] H. Tanaka, D.A. Bergstrom, M.C. Yao, S.J. Tapscott, Widespread and nonrandomdistribution of DNA palindromes in cancer cells provides a structural platformfor subsequent gene amplification, Nat. Genet. 37 (2005) 320–327.

[16] V. Narayanan, K.S. Lobachev, Intrachromosomal gene amplification triggered byhairpin-capped breaks requires homologous recombination and is independentof nonhomologous end-joining, Cell Cycle 6 (2007) 1814–1818.

[17] D.A. Gordenin, M.A. Resnick, Yeast ARMs (DNA at-risk motifs) can reveal sourcesof genome instability, Mutat. Res. 400 (1998) 45–58.

[18] D.R. Leach, Long DNA palindromes, cruciform structures, genetic instability andsecondary structure repair, Bioessays 16 (1994) 893–900.

[19] S.M. Lewis, A.G. Cote, Palindromes and genomic stress fractures: bracing andrepairing the damage, DNA Repair (Amst) 5 (2006) 1146–1160.

[20] K.S. Lobachev, A. Rattray, V. Narayanan, Hairpin- and cruciform-mediated chro-mosome breakage: causes and consequences in eukaryotic cells, Front Biosci.12 (2007) 4208–4220.

[21] B.M. Lengsfeld, A.J. Rattray, V. Bhaskara, R. Ghirlando, T.T. Paull, Sae2is an endonuclease that processes hairpin DNA cooperatively with theMre11/Rad50/Xrs2 complex, Mol. Cell 28 (2007) 638–651.

[22] K.S. Lobachev, D.A. Gordenin, M.A. Resnick, The Mre11 complex is required forrepair of hairpin-capped double-strand breaks and prevention of chromosomerearrangements, Cell 108 (2002) 183–193.

[23] A.G. Coté, S.M. Lewis, Mus81-dependent double-strand DNA breaks at in vivo-generated cruciform structures in S. cerevisiae, Mol. Cell 32 (2008) 800–812.

[24] M.S. Gelfand, E.V. Koonin, Avoidance of palindromic words in bacterial andarchaeal genomes: a close connection with restriction enzymes, Nucleic AcidsRes. 25 (1997) 2430–2439.

[25] B. Lisnic, I.K. Svetec, H. Saric, I. Nikolic, Z. Zgaga, Palindrome content of the yeastSaccharomyces cerevisiae genome, Curr. Genet. 47 (2005) 289–297.

[26] M.D. LeBlanc, G. Aspeslagh, N.P. Buggia, B.D. Dyer, An annotated catalog ofinverted repeats of Caenorhabditis elegans chromosomes III and X, with observa-tions concerning odd/even biases and conserved motifs, Genome Res. 10 (2000)1381–1392.

[27] L. Lu, H. Jia, P. Droge, J. Li, The human genome-wide distribution of DNA palin-dromes, Funct. Integr. Genomics 7 (2007) 221–227.

Size-dependent palindrome-induced intrachromosomal recombination in yeast | 71

paper 1

Page 18: and distRibution of PalindRomes y Saccharomyces cerevisiae g · 2012. 1. 29. · genetic alterations which can have harmful impact on human health. The ability of palindromes to undermine

B. Lisnic et al. / DNA Repair 8 (2009) 383–389 389

[28] S.M. Lewis, S. Chen, J.N. Strathern, A.J. Rattray, New approaches to the analysisof palindromic sequences from the human genome: evolution and polymor-phism of an intronic site at the NF1 locus, Nucleic Acids Res. 33 (2005)e186.

[29] F. Sherman, Getting started with yeast, Methods Enzymol. 350 (2002) 3–41.[30] J.D. Boeke, F. LaCroute, G.R. Fink, A positive selection for mutants lacking

orotidine-5′-phosphate decarboxylase activity in yeast: 5-fluoro-orotic acidresistance, Mol. Gen. Genet. 197 (1984) 345–346.

[31] J. Sambrook, D.W. Russell, Molecular Cloning: A Laboratory Manual, Cold SpringHarbor Laboratory Press, Cold Spring Harbor, N.Y., 2001.

[32] I.K. Svetec, D. Stjepandic, V. Boric, P.T. Mitrikeski, Z. Zgaga, The influence of apalindromic insertion on plasmid integration in yeast, Food Technol. Biotechnol.39 (2001) 169–173.

[33] I.K. Svetec, B. Lisnic, Z. Zgaga, A 110 bp palindrome stimulates plasmid integra-tion in yeast, Period. Biol. 104 (2002) 421–424.

[34] R.D. Gietz, R.H. Schiestl, High-efficiency yeast transformation using the LiAc/SScarrier DNA/PEG method, Nat. Protoc. 2 (2007) 31–34.

[35] K. Gjuracic, Z. Zgaga, Illegitimate integration of single-stranded DNA in Saccha-romyces cerevisiae, Mol. Gen. Genet. 253 (1996) 173–181.

[36] A. Wach, A. Brachat, R. Pohlmann, P. Philippsen, New heterologous modules forclassical or PCR-based gene disruptions in Saccharomyces cerevisiae, Yeast 10(1994) 1793–1808.

[37] E.A. Winzeler, D.D. Shoemaker, A. Astromoff, H. Liang, K. Anderson, B. Andre,R. Bangham, R. Benito, J.D. Boeke, H. Bussey, A.M. Chu, C. Connelly, K. Davis, F.Dietrich, S.W. Dow, M. El Bakkoury, F. Foury, S.H. Friend, E. Gentalen, G. Giaever,J.H. Hegemann, T. Jones, M. Laub, H. Liao, N. Liebundguth, D.J. Lockhart, A. Lucau-Danila, M. Lussier, N. M’Rabet, P. Menard, M. Mittmann, C. Pai, C. Rebischung,J.L. Revuelta, L. Riles, C.J. Roberts, P. Ross-MacDonald, B. Scherens, M. Snyder, S.Sookhai-Mahadeo, R.K. Storms, S. Veronneau, M. Voet, G. Volckaert, T.R. Ward,R. Wysocki, G.S. Yen, K. Yu, K. Zimmermann, P. Philippsen, M. Johnston, R.W.Davis, Functional characterization of the S. cerevisiae genome by gene deletionand parallel analysis, Science 285 (1999) 901–906.

[38] D. Lea, C.A. Coulson, The distribution of the number of mutants in bacterialpopulations, J. Genet. 49 (1949) 264–285.

[39] Q. Zheng, New algorithms for Luria-Delbrück fluctuation analysis, Math. Biosci.196 (2005) 198–214.

[40] F. Pâques, J.E. Haber, Multiple pathways of recombination induced by double-strand breaks in Saccharomyces cerevisiae, Microbiol. Mol. Biol. Rev. 63 (1999)349–404.

[41] A. Datta, M. Hendrix, M. Lipsitch, S. Jinks-Robertson, Dual roles for DNAsequence identity and the mismatch repair system in the regulation of mitoticcrossing-over in yeast, Proc. Natl. Acad. Sci. U.S.A. 94 (1997) 9757–9762.

[42] A.H. McKee, N. Kleckner, A general method for identifying recessive diploid-specific mutations in Saccharomyces cerevisiae, its application to the isolationof mutants blocked at intermediate stages of meiotic prophase and characteri-zation of a new gene SAE2, Genetics 146 (1997) 797–816.

[43] S. Prinz, A. Amon, F. Klein, Isolation of COM1, a new gene required to com-plete meiotic double-strand break-induced recombination in Saccharomycescerevisiae, Genetics 146 (1997) 781–795.

[44] E. Baroni, V. Viscardi, H. Cartagena-Lirola, G. Lucchini, M.P. Longhese, The func-tions of budding yeast Sae2 in the DNA damage response require Mec1- andTel1-dependent phosphorylation, Mol. Cell. Biol. 24 (2004) 4151–4165.

[45] A.J. Rattray, C.B. McGill, B.K. Shafer, J.N. Strathern, Fidelity of mitotic double-strand-break repair in Saccharomyces cerevisiae: a role for SAE2/COM1, Genetics158 (2001) 109–122.

[46] M. Clerici, D. Mantiero, G. Lucchini, M.P. Longhese, The Saccharomyces cerevisiaeSae2 protein promotes resection and bridging of double strand break ends, J.Biol. Chem. 280 (2005) 38631–38638.

[47] K. Lee, S.E. Lee, Saccharomyces cerevisiae Sae2- and Tel1-dependent single-strand DNA formation at DNA break promotes microhomology-mediated endjoining, Genetics 176 (2007) 2003–2014.

[48] I.K. Svetec, A. Stafa, Z. Zgaga, Genetic side effects accompanying gene targetingin yeast: the influence of short heterologous termini, Yeast 24 (2007) 637–652.

[49] V. Viglasky, P. Danko, J. Adamcik, F. Valle, G. Dietler, Detection of cruciformextrusion in DNA by temperature-gradient gel electrophoresis, Anal. Biochem.343 (2005) 308–312.

[50] H. Kurahashi, H. Inagaki, T. Ohye, H. Kogo, T. Kato, B.S. Emanuel, Palindrome-mediated chromosomal translocations in humans, DNA Repair (Amst) 5 (2006)1136–1145.

[51] V. Narayanan, P.A. Mieczkowski, H.M. Kim, T.D. Petes, K.S. Lobachev, The patternof gene amplification is determined by the chromosomal location of hairpin-capped breaks, Cell 125 (2006) 1283–1296.

72 | Size-dependent palindrome-induced intrachromosomal recombination in yeast

paper 1

Page 19: and distRibution of PalindRomes y Saccharomyces cerevisiae g · 2012. 1. 29. · genetic alterations which can have harmful impact on human health. The ability of palindromes to undermine

RESEARCH ARTICLE

Berislav Lisnic Æ Ivan-Kresimir SvetecHrvoje Saric Æ Ivan Nikolic Æ Zoran Zgaga

Palindrome content of the yeast Saccharomyces cerevisiae genome

Received: 10 February 2005 / Accepted: 20 February 2005 / Published online: 18 March 2005� Springer-Verlag 2005

Abstract Palindromic sequences are important DNAmotifs involved in the regulation of different cellularprocesses, but are also a potential source of geneticinstability. In order to initiate a systematic study ofpalindromes at the whole genome level, we developed acomputer program that can identify, locate and countpalindromes in a given sequence in a strictly definedway. All palindromes, defined as identical inverted re-peats without spacer DNA, can be analyzed and sortedaccording to their size, frequency, GC content oralphabetically. This program was then used to prepare acatalog of all palindromes present in the chromosomalDNA of the yeast Saccharomyces cerevisiae. For eachpalindrome size, the observed palindrome counts weresignificantly different from those in the randomly gen-erated equivalents of the yeast genome. However, whilethe short palindromes (2–12 bp) were under-repre-sented, the palindromes longer than 12 bp were over-represented, AT-rich and preferentially located in theintergenic regions. The 44-bp palindrome found betweenthe genes CDC53 and LYS21 on chromosome IV wasthe longest palindrome identified and contained onlytwo C-G base pairs. Avoidance of coding regions wasalso observed for palindromes of 4–12 bp, but was lesspronounced. Dinucleotide analysis indicated a strongbias against palindromic dinucleotides that could ex-plain the observed short palindrome avoidance. Wediscuss some possible mechanisms that may influencethe evolutionary dynamics of palindromic sequences inthe yeast genome.

Keywords Palindrome Æ Inverted repeat ÆDinucleotide Æ Saccharomyces cerevisiae Æ Sequenceanalysis

Introduction

Closely spaced inverted repeats (IRs), palindromes andquasipalindromes can be found in the DNA of naturalplasmids, viral and bacterial genomes and eukaryoticchromosomes and organelles. In prokaryotes, they mayserve as binding sites for regulatory proteins, while shortperfect palindromes are known as recognition sites fortype II restriction-modification systems (RMSs) that playa significant role in bacterial ecology and evolution(Gelfand and Koonin 1997; Rocha et al. 2001). Anotherimportant property of such motifs is their potential toform intra-strand hydrogen bonds within DNA mole-cules or in corresponding RNA transcripts. Therefore,they are contained in genes encoding functional RNAmolecules, the structure of which depends on the for-mation of proper intra-strand bonding, and in differentcis-acting genetic elements, like terminators, attenuators,plasmid and viral origins of replication. Protein bindingand secondary structure formation are also modes ofaction for IRs and related motifs in eukaryotic cells. Forexample, palindromes with a spacer of one nucleotidewere identified in yeast sequences regulating cellular re-sponse to the accumulation of unfolded proteins in theendoplasmic reticulum (Mori et al. 1998) and a hetero-dimeric complex was isolated that binds two palindromicsequences in the promoter region of the human erbB-2gene (Chen and Gill 1996). In mouse B lymphoma cells,palindromic and potential stem-loop motifs were iden-tified as break-points during class switch recombination(Tashiro et al. 2001); and the formation of intra-strandsecondary structures is essential in the process of im-munoglobuline gene rearrangement known as V(D)J-joining (Cuomo et al. 1996).

However, in spite of their importance and functionalversatility, longer palindromes and IRs were shown to be

Communicated by S. Hohmann

B. Lisnic Æ I.-K. Svetec Æ Z. Zgaga (&)Faculty of Food Technology and Biotechnology,University of Zagreb, Pierottijeva 6,10000 Zagreb, CroatiaE-mail: [email protected].: +385-1-4836013Fax: +385-1-4836016

H. Saric Æ I. NikolicSail Company Croatia Ltd., Ilica 412, 10000 Zagreb, Croatia

Curr Genet (2005) 47: 289–297DOI 10.1007/s00294-005-0573-5

Palindrome content of the yeast Saccharomyces cerevisiae genome | 71

paper 2

Page 20: and distRibution of PalindRomes y Saccharomyces cerevisiae g · 2012. 1. 29. · genetic alterations which can have harmful impact on human health. The ability of palindromes to undermine

very unstable in different organisms, from bacteria tomammalian cells. The recombinogenicity of such motifsis attributed to their potential to form secondary struc-tures known as hairpins and cruciforms and the molec-ular models proposed to explain palindrome-inducedgenomic instability can be divided into two classes: first,based on template switching or slippage of the DNApolymerase during replication of DNA adopting sec-ondary structures and, second, requiring an enzymaticactivity that transforms cruciforms and hairpins to re-combinogenic lesions, like double-strand breaks (DSBs;Leach 1994). Both types of models are supported byexperimental data; and several human genetic disorderscan be explained either by errors occurring during thereplication of palindromic and quasipalindromic se-quences (Bissler 1998; Gordenin and Resnick 1998) or byIR-induced illegitimate end-joining (Repping et al. 2002).

Given the importance of palindromic sequences in theregulation of different cellular processes on the one sideand their influence on genetic stability on the other side,an analysis of the incidence of palindromes at the geno-mic level seems particularly interesting. LeBlanc et al.(2000) prepared a catalog of palindromes of 4–60 bppresent on chromosomes III and X of the nematodeCaenorhabditis elegans. Palindromes were classified asAT-rich, non-AT-rich or GC-rich and this analysisindicated that the long, AT-rich palindromes are muchmore frequent in the actual chromosomes than in therandomly generated sequences. The IRs separated by asingle base pair were also included in their study and such‘‘odd palindromes’’ were more frequent than perfect IRswithout a spacer. Avoidance of short palindromic se-quences was observed in the genomes of some bacterio-phages (Sharp 1986), but also in the genomes of theirbacterial hosts (Karlin et al. 1997; Gelfand and Koonin1997; Rocha et al. 2001). The highest bias was frequentlyobserved for the recognition sites of the type II restrictionenzymes, supporting the view that RMSs influence gen-ome evolution in prokaryotes. Consistent with thishypothesis, Fuglsang (2004) demonstrated that the suc-cession of codons disfavoring palindrome formation ismore pronounced for the bacterial species that haveRMSs. In order to initiate a systematic study of palin-dromes present in different genomes, we decided first todevelop a computer program that can score and count allpalindromes present in a given sequence in a strictly de-fined way. This program was then used to analyze thepalindrome content of the entire genome of the yeastSaccharomyces cerevisiae and the results of this analysisclearly indicate that the palindromic sequences conformto the specific rules of evolutionary dynamics.

Materials and methods

DNA sequences

The DNA sequence files for all yeast chromosomes(NC_001133.4, NC_001134.5, NC_001135.6, NC_

001136.5, NC_001137.2, NC_001138.4, NC_001139.4,NC_001140.3, NC_001141.1, NC_001142.5, NC_001143.4, NC_001144.3, NC_001145.2, NC_001146.2,NC_001147.4, NC_001148.2) were downloaded fromthe NCBI (ftp://ftp.ncbi.nih.gov/genomes/Saccharomy-ces_cerevisiae/) on 3 April, 2004. The total length ofdownloaded yeast genomic DNA sequences (excludingmitochondrial DNA) was 12,070,766 bp.

The file containing the coding regions of the entireyeast genome (that is, ORF coding sequences only,without 5¢UTR, 3¢UTR, intron sequences, or basesnot translated due to translational frameshifting)was downloaded on 12 July, 2004 from the SGD (ftp://genome-ftp.stanford. edu/pub/yeast/data_download/se-quence/genomic_sequence/orf_dna/orf_coding.fasta.gz).This file includes all ORFs except dubious ORFs, andprior to examination of the palindrome and dinucleotidecontent in the coding regions, the sequences of mito-chondrial DNA were removed from the file. The totallength of the coding DNA sequences in this shortenedfile was 8,755,368 bp.

Ten random genomes (12,070,766 bp each) weregenerated with respect to the frequency of the four basespresent in the yeast genome (A=30.90%, C=19.17%,G=19.13%, T=30.81%), so that the average propor-tions of the bases in the random genomes were:A=30.90±0.01%, C=19.18±0.01%, G=19.13±0.01% and T=30.81±0.01%.

Palindrome count

The Spinnaker program described in this workis available upon request.

Statistics

The numbers of dinucleotides and palindromes deter-mined in the chromosomal DNA were called the ob-served numbers, while the numbers of dinucleotides andpalindromes determined in the random genomes werecalled the expected numbers. To test whether the medi-ans of the expected numbers of palindromes and dinu-cleotides differed significantly from the observednumbers, we employed the Wilcoxon signed rank testwith a confidence interval of 99%, using MINITABver. 14.1.

Results

Palindrome scoring

Systematic study of palindromic sequences requiresconvenient software that will recognize and map allpalindromes in a predetermined size-range at the geno-mic level. This is not a trivial problem, since individual

290

74 | Palindrome content of the yeast Saccharomyces cerevisiae genome

paper 2

Page 21: and distRibution of PalindRomes y Saccharomyces cerevisiae g · 2012. 1. 29. · genetic alterations which can have harmful impact on human health. The ability of palindromes to undermine

palindromes found within a given sequence can bescored in different ways. First, we may decide to score asindividual palindromes only those palindromes that donot share common base pairs, but this approach canresult in an underestimation of the actual number ofpalindromes, since some palindromes may remainundetected and different results can be obtained for thesame sequence (compare Fig. 1a,b). Therefore, wedecided to count also the palindromes that do sharecommon base pairs and in this case we have the possi-bility either to score only the longest palindromes thatmay overlap, or to include also the shorter palindromesembedded within longer palindromes. There are twoclasses of such palindromes, those sharing the samecenter of symmetry (nested) and those having a differentcenter of symmetry. This is shown in Fig. 1c, where allpalindromes present in the test sequence are indicated,including both nested and non-nested palindromes en-tirely contained within longer palindromes. Embeddedpalindromes are omitted in Fig. 1d, while Fig. 1e pre-

sents a combination of these two modes of palindromescoring where only non-nested embedded palindromesare presented.

Our computer program, named Spinnaker, was de-signed to score palindromes as indicated in Fig. 1c–e.Let us define first the frame as an array of nucleotides ofeven length. The location of the frame within a givenDNA sequence is identified by position of its leftboundary (L) and by its length. The maximum length ofthe frame (p) is propounded as a search parameter thatcorresponds to the length of the potentially largest pal-indrome. The position of the right boundary of theframe is given by R=L + p � 1. Additionally, we defineP1 as the right boundary of the last previously foundpalindrome. In the beginning, we set L=0 and P1=0.The algorithm for the search for overlapping palin-dromes can be described recursively as follows. Step 1:we set the frame size to p and move it one position to theright (L:=L+ 1). Step 2: within the frame, we check forcomplementarity in the Lth and Rth base pair, then the(L + 1)th and the (R � 1)th base pair and so on. If it isfound that all base pairs are complementary, the frame isdefined as a palindrome and its properties, including P1,are recorded. However, if there is at least one non-complemantary base pair and if the current size of theframe is >2 and R >P1, we reduce the frame size bydecreasing the position of its right boundary by two andthen repeat step 2. Otherwise, we repeat step 1. Thesearch for embedded palindromes (with or without un-ique symmetry axis) follows the same general procedure,ignoring the P1 parameter and introducing a few addi-tional internal structures needed for the affiliation of apalindrome to a given category.

The maximal size of a palindrome can be set to 500 ntand the results of sequence analysis can be sorted by sizeor frequency, alphabetically, or by GC content (per-centage or absolute number of GC base pairs). Addi-tionally, Spinnaker can graphically display thedistribution of palindromes, where the analyzed se-quence is presented as a horizontal line and palindromesof a given size as vertical lines. The results of a searchcan be saved as two separate files, one that includes thenumber and exact chromosomal locations of all palin-dromes and the other that includes the summarized re-sults of a search. The program is user-friendly and worksunder Windows.

We compared Spinnaker with two programs designedfor the identification of IRs: Reputer (Kurtz andSchleiermacher 1999) and Palindrome (Rice et al. 2000).Both programs recognize palindromic sequences, asshown in Fig. 1e, so that some of the palindromescontained within a longer palindrome remain unde-tected. For example, they detected only three 6-nt pal-indromes in the test sequence presented in Fig. 1, whileSpinnaker recorded six such palindromes when embed-ded palindromes were counted. Reputer also recordedsome non-palindromic sequences, so that 29 instead of16 palindromic dinucleotides were found in our test se-

Fig. 1 Palindrome scoring. Different numbers of palindromes canbe counted in the same sequence depending on the scoringcriterion. a, b Non-overlapping palindromes do not share commonbase pairs. c Embedded palindromes include both non-overlappingand palindromes that share common base pairs as well asall shorter palindromes contained within a longer palindrome.d Overlapping palindromes include both non-overlapping andpalindromes that share common base pairs but not shorterpalindromes contained within a longer palindrome. e Embeddedpalindromes with nested palindromes excluded

291

Palindrome content of the yeast Saccharomyces cerevisiae genome | 75

paper 2

Page 22: and distRibution of PalindRomes y Saccharomyces cerevisiae g · 2012. 1. 29. · genetic alterations which can have harmful impact on human health. The ability of palindromes to undermine

quence; and the minimal palindrome size for Palindromeis 4 nt.

Palindromic sequences in the yeast genome

We initiated our study of yeast palindromic sequencesby identifying all palindromes present in individualchromosomes (data not shown). We did not observe anyparticular pattern in the distribution of palindromic se-quences along different chromosomes, although thisfeature was beyond the scope of this study and was notsystematically analyzed. The complete catalog of palin-dromes present in all yeast chromosomes, determinedboth as overlapped and embedded palindrome counts, ispresented in Table 1. As expected, a sharp size-depen-dent decrease in the numbers of palindromes was ob-served, but only for shorter palindromes. For example,the number of palindromes containing 22 bp was 65,while the number of 20-bp palindromes was 59. Simi-larly, there were more 26-bp than 24-bp palindromesidentified. The palindrome content determined in the tenrandomly generated genomes is also included in Table 1and can be compared with the palindrome content of theactual genome. For each palindrome size, the valuesobserved in the yeast genome were significantly differentfrom those obtained in artificial genomes (P=0.006).However, palindromes up to 12 bp were under-repre-sented, while longer palindromes were over-represented.The longest palindromes found in ten randomly gener-ated sequences contained 24 bp, while 85 palindromeslonger than 24 bp were found in the yeast genome.Similar results were obtained when embedded palin-dromes were included in the palindrome count, but here

even the 10-bp palindromes were over-represented. Thisis not unexpected, since in this case, all the 10-bp pal-indromes present in longer palindromes were also addedto the sum. It is important to note that, even whenembedded palindromes were counted, palindromesshorter than 10 bp were under-represented in compari-son with randomly generated genomes.

Long palindromes are AT-rich and are placedin intergenic regions

Each palindrome identified by our program can be lo-cated on the physical map of the yeast genome (http://www.yeastgenome.org), as shown for the six longestpalindromes found in the yeast chromosomal DNA(Fig. 2). They were all placed in intergenic regions andwere AT-rich, so we decided to analyze these featuresmore systematically. We calculated the A+T content forall palindromes found in the yeast genome and for thepalindromes detected in ten randomly generated se-quences. The A+T content of palindromes found inartificial sequences was around 70%, while slightly lowervalues were observed in genomic palindromes up to12 nt (Fig. 3). However, a high increase in A+T contentwas observed with longer palindromes that were almostexclusively composed of A-T base pairs (Fig. 2). Theremarkable exception was the presence of a GC-rich(52.4%) 42-nt palindrome found between the RFX1 geneand YLR177W on chromosome XII (Fig. 2).

In order to find out the position of palindromic se-quences with respect to annotated coding regions, wedetermined the proportion of palindromes that wereplaced in intergenic (non-coding) regions. The sequence

Table 1 Comparison of thenumbers of palindromes in theyeast genome and in randomgenomes

aThe average number from tenrandom genomesbUnder-represented palin-dromes (P=0.006), also repre-sented in italicscOver-represented palindromes(P=0.006)

Palindrome size Overlapping palindromes Embedded palindromes

Yeastgenome

Randomgenomesa

Yeastgenome

Randomgenomesa

2 1,732,917b 1,904,950.4 2,770,675b 3,183,592.34 479,972b 580,521.2 705,275b 839,462.86 133,411b 159,564.1 194,874b 221,347.38 38,234b 42,738.6 57,086b 58,424.410 10,246b 11,256.8 17,514c 15,382.812 2,809b 2,995.9 6,495c 4,095.114 939c 807 3,159c 1,096.316 282c 212 1,813c 289.118 96c 55.2 1,207c 77.120 59c 16.9 857c 21.922 65c 3.8 606c 5.024 24c 1.2 413c 1.226 29c 0 289c 028 17c 0 194c 030 9c 0 125c 032 9c 0 80c 034 8c 0 49c 036 4c 0 25c 038 3c 0 13c 040 3c 0 6c 042 2c 0 3c 044 1c 0 1c 0

292

76 | Palindrome content of the yeast Saccharomyces cerevisiae genome

paper 2

Page 23: and distRibution of PalindRomes y Saccharomyces cerevisiae g · 2012. 1. 29. · genetic alterations which can have harmful impact on human health. The ability of palindromes to undermine

containing only the yeast coding regions represented72.73% of the total yeast genome (see Materials andmethods), but even the short palindromes were prefer-entially (30–35%) placed in intergenic regions. Only theproportion of palindromic dinucleotides found in theintergenic regions (27.65%) was very close to that

expected from random distribution between the codingand non-coding part of the genome. A preferentiallocation in intergenic regions was particularly pro-nounced for palindromes longer than 10 bp, so thatmore than 97% of palindromes longer than 18 bp wereidentified outside of the coding regions (Fig. 3).

Palindromic dinucleotides in the yeast genome

Four dinucleotides, AT, TA, GC and CG, may beconsidered as the shortest palindromic sequences, butthey are also found in the center of any longer palin-drome. Our analysis indicated that the short palin-dromes, including palindromic dinucleotides, were lessfrequent in the yeast genome than in the random ge-nomes (Table 1) and we decided to determine relativefrequencies for all 16 dinucleotides (Fig. 4). Indeed, thetwo most under-represented dinucleotides were palin-dromic dinucleotides containing purine on the 3¢ end,TA (0.77) and CG (0.80), while AT (0.94) was moder-ately under-represented and GC (1.02) was only slightlyover-represented. For non-palindromic dinucleotides,very close values were observed for complementary di-nucleotides and only the AC and GT couple was under-represented (0.89). The ratio between non-palindromicand palindromic dinucleotides composed either of A-Tor G-C base pairs illustrates well the avoidance of pal-indromic dinucleotides in the yeast genome. TT+AAdinucleotides are 1.33 times more frequent thanTA+AT dinucleotides, while CC+GG dinucleotidesare 1.17 times more frequent than CG+GC dinucleo-tides (Table 2). The same analysis was also done fordinucleotides present in all palindromes longer than

Fig. 2 Sequences and chromosomal locations of the six longestpalindromes found in the yeast genome. Numbers indicate the startand end locations for ORFs and palindromes

Fig. 3 Analysis of the A+Tcontent and location ofpalindromes with respect to thenon-coding regions in the yeastgenome. Palindromes werescored as presented in Fig. 1d(overlapping palindromes).Error bars indicate onestandard deviation

293

Palindrome content of the yeast Saccharomyces cerevisiae genome | 77

paper 2

Page 24: and distRibution of PalindRomes y Saccharomyces cerevisiae g · 2012. 1. 29. · genetic alterations which can have harmful impact on human health. The ability of palindromes to undermine

14 bp and the results obtained were quite different fromthose observed with the entire yeast genome, indicatingthat the long palindromes are specifically enriched inpalindromic dinucleotides. However, the bias was muchmore pronounced for the AT and TA dinucleotides,which were 5.6 times more frequent than the TT+AAdinucleotides (Table 2).

The data presented in Fig. 4 also indicated a strongbias in the frequency of palindromic dinucleotidescomposed of identical base pairs, so that the AT/TAratio was 1.22 and the GC/CG ratio was 1.28. Thisanalysis was also done for the coding complement of theyeast genome and for palindromes longer than 14 bp(Table 2). The AT/TA ratio rose to 1.31 in coding DNA,while in long palindromes it was 1.036, indicating thatthe two dinucleotides are almost equally frequent. Thiscan be explained by the presence of long runs of alter-nating A and T nucleotides frequently found in the longpalindromes (Fig. 2; data not shown). In contrast, theGC/CG ratio in the coding complement was very closeto that observed for the entire genome and in the longpalindromes it even increased to 1.33.

Discussion

Palindromes, quasipalindromes and closely spaced IRscan be found in the genomes of all organisms and are

involved in various cellular processes. In the presentwork, we focused on palindromes, defined as perfect IRswithout spacer DNA. In order to make possible a sys-tematic study of these important DNA motifs in differ-ent genomes, we first developed a computer programthat can identify, locate and count palindromes in a gi-ven nucleotide sequence in a strictly defined way. Thisprogram was then used to prepare a catalog of all pal-indromic sequences present in the S. cerevisiae genomeand in randomly generated equivalents of the yeastgenome. Comparison between actual and random ge-nomes revealed several important differences that arediscussed below.

Identification of palindromic sequences

As discussed before, palindrome scoring can be done indifferent ways and we decided to consider both sepa-rated (non-overlapping) and overlapping palindromes,together with all smaller palindromes contained withinlonger palindromes, as individual palindromes. Such anapproach ensures that no palindromic motif that may behidden within a longer palindrome can be omitted; andthis is very important for the detection of specific pal-indromic sites, like restriction enzyme sites or regulatoryunits. In addition to the embedded palindrome count,we also used a simplified way of palindrome scoringwhere partial overlapping of individual palindromes isallowed, but all smaller palindromes completely con-tained within longer palindromes are ignored. This wayof palindrome scoring (overlapped palindromes) indi-cates individual palindromic loci within a given sequenceand each palindrome is counted only once. In this way, amore realistic picture of the numbers and distribution ofpalindromic loci is obtained, especially when we con-sider the long runs of alternating complementary dinu-cleotides frequently present in a genome. For example,ten AT dinucleotides will be scored as a single 20 nt

Fig. 4 Occurrence ofdinucleotides in the yeastgenome. For each dinucleotide,the ratio between the numberdetermined in the yeast genome(observed) and the averagenumber obtained in ten randomgenomes (expected) is presented

Table 2 Relative frequencies of palindromic dinucleotides in thewhole genome, coding DNA and palindromes longer than 14 bp.ND Not determined

Ratio Wholegenome

CodingDNA

Palindromes>14 bp

(AA+TT)/(AT+TA) 1.328 ND 0.178(GG+CC)/(GC+CG) 1.166 ND 0.800AT/TA 1.220 1.308 1.036GC/CG 1.277 1.291 1.330

294

78 | Palindrome content of the yeast Saccharomyces cerevisiae genome

paper 2

Page 25: and distRibution of PalindRomes y Saccharomyces cerevisiae g · 2012. 1. 29. · genetic alterations which can have harmful impact on human health. The ability of palindromes to undermine

palindrome, while in the same sequence 100 embeddedpalindromes can be found. For these reasons, we deci-ded to use both embedded and overlapping palindromecounts in the analysis of palindrome content (Fig. 1c,d).As illustrated by our analysis of the yeast genome, thisdistinction is important. For example, both 10-nt and12-nt palindromes are under-represented when countedas overlapped palindromes, but are over-representedwhen scored as embedded palindromes (Table 1). Pal-indromic dinucleotides were also analyzed for two rea-sons. First, there is no clear reason to includetetranucleotides and to exclude dinucleotides from thepalindrome count and, second, each palindrome con-tains one of the four palindromic dinucleotides in thecenter and the analysis of their incidence could be usefulfor the general study of palindromic sequences. This wasdemonstrated by our analysis of the yeast genome,where the palindromic dinucleotides are particularlyunder-represented compared with non-palindromic di-nucleotides with the same base pair composition. In theC. elegans genome, the IRs separated by one nucleotideare more frequent than the repeats without a spacer(LeBlanc et al. 2000). One possible explanation for thisinteresting observation could be that the palindromicdinucleotides are also avoided in the C. elegans genome,but this type of analysis has not been performed.

Palindromes in the yeast genome

The palindrome content of the yeast genome was sortedby size and compared with the palindrome content of therandomly generated equivalents of the genome. Furtheranalysis indicated several interesting numerical trendsthat allowed us to make a clear distinction between‘‘short’’ and ‘‘long’’ palindromes, with the breakingpoint at 10–12 bp. While positive selection could ac-count for the relative abundance of long palindromes,under-representation of short palindromes seems moreintriguing. The avoidance of short palindromes was al-ready observed in viral and bacterial genomes and wasattributed either to the activity of the restriction enzymespresent in the cell, or introduced by horizontal genetransfer (Sharp 1986; Gelfand and Koonin 1997; Rochaet al. 2001). Obviously, such an interpretation could notexplain the paucity of short palindromes in the yeastgenome, where no restriction/modification system hasbeen detected. Since palindromic dinucleotides were alsounder-represented in the yeast genome, we decided toperform a systematic analysis of the dinucleotide contentof the yeast genome. This analysis indicated a differentbias for each palindromic dinucleotide in the followingorder: TA > CG > AT > GC; and only the GCdinucleotide was slightly over-represented. The strongbias observed for palindromic dinucleotides can alsoexplain the under-representation of other short palin-dromes, since each palindrome contains a palindromicdinucleotide as a center of symmetry. If, for some rea-son, such centers of symmetry are less frequently

formed, the occurrence of all palindromic sequences mayalso be expected to be less frequent. Palindromic dinu-cleotides composed of A-T base pairs were particularlydisfavored, consistent with our finding that the A+Tcontent of short palindromes present in the yeast gen-ome is decreased in comparison with the random se-quence.

The observed differences in the occurrence of dinu-cleotides are not specific for the yeast genome. Earlysequence analyses indicated strong biases in dinucleotidefrequencies in prokaryotes, eukaryotes, mitochondrialand viral genomes (Nussinov 1984). Karlin et al. (1997)also found that TA and CG are among the most under-represented dinucleotides in different microbial ge-nomes; and they extensively discussed the mechanismsthat could create a genomic signature. For example, onepossible explanation for the observed biases in thedinucleotide frequencies could reflect the yeast codonbias (Sharp and Cowe 1991), since a high proportion ofthe yeast genome consists of protein-coding sequences.However, our data indicated that 72.4% of all palin-dromic dinucleotides were located within coding regionsthat represent 72.7% of the entire genome. This resultsuggests that the high avoidance of palindromic dinu-cleotides in the yeast genome (Table 1) is not limitedonly to the coding DNA. We also observed that thedinucleotide GC-CG ratio was 1.28 for the entire gen-ome and 1.29 for the protein-coding part of the genome,while in palindromes longer than 14 bp it was even in-creased to 1.33. These results indicate that the observeddinucleotide bias is not a consequence of the bias incodon usage but could rather reflect the intrinsic bias inspontaneous mutagenesis or a bias in replication/repair,as proposed by Karlin et al. (1997). For example, thefact that the TT dinucleotide is 1.47 times more frequentthan the TA dinucleotide in the yeast genome could bedue to the biased incorporation of non-complementarynucleotides during DNA replication and/or repair syn-thesis, to the bias in the mismatch repair process, orbecause the probability for TA dinucleotides to mutateis slightly higher than that for TT dinucleotides. Suchbiases could be beyond the level of detection in anybiochemical assay, but could leave an imprint in thegenomic DNA sequences during evolutionary time, likethe genome-wide under-representation of short palin-dromes described here. Moreover, it is possible that asimilar mutagenic bias, rather than the activity ofrestriction enzymes, resulted in the short palindrome/restriction site avoidance observed in bacterial genomes.A systematic study of their palindrome contents, like theone described here, may contribute to a better under-standing of the role of these two processes in the evo-lution of bacterial genomes.

A high preference for non-coding regions was ob-served for long palindromes, but even the short (4–10 bp) palindromes showed size-dependent avoidance ofthe protein-coding regions. Therefore, although biaseddinucleotide composition could be the main reason forthe decreased frequency of short palindromes at the level

295

Palindrome content of the yeast Saccharomyces cerevisiae genome | 79

paper 2

Page 26: and distRibution of PalindRomes y Saccharomyces cerevisiae g · 2012. 1. 29. · genetic alterations which can have harmful impact on human health. The ability of palindromes to undermine

of the whole genome, there could be some other mech-anism(s) regulating their incidence in the coding regions.Avoidance of the 6-bp palindromes was also observed inthe genes of several bacterial species, including Buchnerasp. and Wigglesworthia sp., which apparently have noRMSs (Fuglsang 2004). It is tempting to speculate thatthe presence of palindromes in the genes is avoided be-cause they could affect in some way the metabolism ofmRNAs. It should be noted that, in addition to theformation of intramolecular hairpins, palindromic se-quences can also promote associations between twoidentical mRNA molecules in opposite orientations. TheRNA/RNA duplex created between two palindromicsequences may interfere with subsequent step(s) in pro-tein synthesis. For example, longer palindromes presentin mRNAs may produce an effect similar to that of theinterfering RNA molecules (Novina and Sharp 2004),stimulating mRNA degradation.

Our data clearly indicate that the number of largepalindromes present in the yeast genome is much higherthan expected from the analysis of random sequences.Starting from 10–12 bp, palindromes tend to be not onlymore frequent than expected, but also more AT-rich andpreferentially located in the non-coding regions. Palin-dromes longer than 18 bp are usually built mainly ofA-T base pairs and are almost exclusively located in theintergenic regions, comprising only 27.7% of the entiregenome. The longest palindrome detected in the yeastgenome (12.1 Mb) has 44 bp and contains only two C-Gbase pairs. An even more pronounced bias for long, AT-rich palindromes was observed in C. elegans chromo-somes III and X (LeBlanc et al. 2000). Four 60-bppalindromes were found on chromosome III (11 Mb)and 17 on chromosome X (16 Mb) and they were builtexclusively of A-T base pairs. Over-representation oflarge palindromes is consistent with their presumed rolein different cellular processes, like the regulation of geneexpression or initiation of chromosomal replication; andthe high A+T content may facilitate local DNA meltingand adoption of secondary structures. Systematic dele-tions of the longest palindromes detected in the yeastgenome, followed by phenotype analysis of the corre-sponding strains, could shed more light on their possiblebiological functions.

Different mechanisms may regulate palindromeincidence

The data presented here, together with the results ofother studies, may suggest an integrated view on theevolutionary dynamics of palindromic sequences inyeast (Fig. 5). The occurrence of short palindromes inthe entire genome could be disfavored due to a slightbias in mutagenic DNA replication and/or the increasedprobability for palindromic dinucleotides TA, CG andAT to mutate, while additional mechanism(s) couldcontribute to palindrome avoidance within coding re-gions. AT-rich palindromes that acquire the critical sizeof 10–12 nt may become unstable and increase their size,mainly due to the insertion of AT or TA dinucleotidesby slippage during DNA replication (Kruglyak et al.1988; Toth et al. 2000). Since insertions in the codingregions lead to gene inactivation, they are tolerated onlyin the intergenic regions. Long, AT-rich palindromescould acquire novel functions, but could also present astarting point for the generation of other palindromicsequences, imperfect palindromes and IRs, enlarging therepertoire of possible cis-acting genetic elements. Theupper size of a palindrome may be regulated by thepotential of palindromic sequences to form cruciformstructures in vivo. As mentioned before, such structurescan be processed to DSBs (Lobachev et al. 2002) and areknown to induce different types of recombination events(Gordenin et al. 1993; Leach 1994). The longest palin-drome found in the yeast genome contains 44 bp, butthe critical size of a palindrome that could act as aninitiator of recombination still needs to be experimen-tally determined.

Concluding remarks

The new research tool for palindrome analysis describedhere may contribute to a deeper insight into the evolu-tion and functions of these important DNA motifs. Ouranalysis of the first complete catalog of palindromicsequences present in the genome of a cellular organismrevealed several interesting findings. For example, weshow that the under-representation of short palindromesis not limited to the bacterial genomes but can also beobserved in yeast, while size-dependent palindromeavoidance in the coding regions seems particularlyintriguing. We also point out the relevance of dinucle-otide bias analysis for the study of palindromes and webelieve our computer program will also be useful formore specific tasks, like the identification of potentialrestriction sites or regulatory elements.

Fig. 5 Evolutionary dynamics of palindromic sequences in yeast.The solid vertical line indicates the size limit between short and longpalindromes as determined in this work. The critical size leading topalindrome loss/destruction (dashed vertical line) needs to beexperimentally determined

296

80 | Palindrome content of the yeast Saccharomyces cerevisiae genome

paper 2

Page 27: and distRibution of PalindRomes y Saccharomyces cerevisiae g · 2012. 1. 29. · genetic alterations which can have harmful impact on human health. The ability of palindromes to undermine

Acknowledgements We are grateful to Ana Vukelic for help instatistical analysis. This work was supported by grant 0058014from the Croatian Ministry of Science, Education and Sports.

References

Bissler JJ (1998) DNA inverted repeats and human disease. FrontBiosci 3:408–418

Chen Y, Gill GN (1996) A heteromeric nuclear protein complexbinds two palindromic sequences in the proximal enhancer ofthe human erbB-2 gene. J Biol Chem 271:5183–5188

Cuomo AC, Mundy CL, Oettinger MA (1996) DNA sequence andstructure requirements for cleavage of V(D)J recombinationsignal sequences. Mol Cell Biol 16:5683–5690

Fuglsang A (2004) The relationship between palindrome avoidanceand intergenic codon usage variations: a Monte Carlo study.Biochem Biophys Res Commun 316:755–762

Gelfand MS, Koonin EV (1997) Avoidance of palindromic wordsin bacterial and archaeal genomes: a close connection withrestriction enzymes. Nucleic Acids Res 25:2430–2439

Gordenin DA, Resnick MA (1998) Yeast ARMs (DNA at-riskmotifs) can reveal sources of genome instability. Mutat Res400:45–58

Gordenin DA, Lobachev KS, Degtyareva NP, Malkova AL, Per-kins E, Resnick MA (1993) Inverted DNA repeats: a source ofeukaryotic genomic instability. Mol Cell Biol 13:5315–5322

Karlin S, Mrazek J, Campbell AM (1997) Compositional biases ofbacterial genomes and evolutionary implications. J Bacteriol179:1363–1370

Kruglyak S, Durret RT, Schug MD, Aquadro CE (1998) Equilib-rium distribution of microsatellite repeat length resulting from abalance between slippage events and point mutations. Proc NatlAcad Sci USA 95:10774–10778

Kurtz S, Schleiermacher C (1999) REPuter: fast computation ofmaximal repeats in complete genomes. Bioinformatics 15:426–427

Leach DRF (1994) Long DNA palindromes, cruciform structures,genetic instability and secundary structure repair. Bioessays16:893–898

LeBlanc MD, Aspeslagh G, Buggia NP, Dyer BD (2000) Anannotated catalog of inverted repeats of Caenorhabditis eleganschromosomes III and X, with observations concerning odd/even biases and conserved motifs. Genome Res 10:1381–1392

Lobachev KS, Gordenin DA, Resnick MA (2002) The Mre11complex is required for repair of hairpin-capped double-strandbreaks and prevention of chromosome rearrangements. Cell103:83–193

Mori K, Ogawa N, Kawahara T, Yanagi H, Yura T (1998) Pal-indrome with spacer of one nucleotide is characteristic of thecis-acting unfolded protein response element in Saccharomycescerevisiae. J Biol Chem 273:9912–9929

Novina CD, Sharp PA (2004) The RNAi revolution. Nature430:161–164

Nussinov R (1984) Doublet frequencies in evolutionary distinctgroups. Nucleic Acids Res 12:1749–1763

Repping S, Skaletsky H, Lange J, Silber S, Veen F van der, OatesRD, Page DC, Rozen S (2002) Recombination between palin-dromes P5 and P1 on the human Y chromosome causes massivedeletions and spermatogenic failure. Am J Hum Genet 71:906–922

Rice P, Longden I, Bleasby A (2000) EMBOSS—the europeanmolecular biology open software suite. Trends Genet 15:276–278

Rocha EPC, Danchin A, Viari A (2001) Evolutionary role ofrestriction/modification systems as revealed by comparativegenome analysis. Genome Res 11:946–958

Sharp PM (1986) Molecular evolution of bacteriophages: evidenceof selection against the recognition sites of host restriction en-zymes. Mol Biol Evol 3:75–83

Sharp PM, Cowe E (1991) Synonymous codon usage in Saccha-romyces cerevisiae. Yeast 7:657–678

Tashiro J, Kinoshita K, Honjo T (2001) Palindromic but not G-rich sequences are targets of class switch recombination. IntImmunol 13:495–505

Toth G, Gaspari Z, Jurka J (2000) Microsatellites in differenteukaryotic genomes: survey and analysis. Genome Res 10:967–981

297

Palindrome content of the yeast Saccharomyces cerevisiae genome | 81

paper 2

Page 28: and distRibution of PalindRomes y Saccharomyces cerevisiae g · 2012. 1. 29. · genetic alterations which can have harmful impact on human health. The ability of palindromes to undermine

Abstract

Background and Purpose: Perfect inverted repeats, or palin-dromes, are known as potent inducers of genetic recombination. In our previous work we investigated the influence of a 110 bp palindro-mic insertion present on a non-replicative plasmid on integration into the yeast genome (19). The palindrome influenced the final outcome of recombination process, but no stimulation or targeting of plasmid in-tegration was observed. In this study we investigated whether plasmid integration would be stimulated when the palindrome was present in both the chromosomal and the plasmid copy of the CYC1 gene.

Materials and Methods: Yeast strains and plasmids containing palindromic sequence in the CYC1 gene were constructed using con-ventional methods. Strain construction was followed by Southern blot analysis.

Results: When the palindromic sequence was present in both the chromosomal and the plasmid copy of the CYC1 gene, the relative ef-ficiency of transformation was increased more than four-fold. During intrachromosomal (»pop-out«) recombination, the palindrome present in only one copy of the CYC1 gene was lost with very high frequency.

Conclusions: Our study may encourage the use of inverted repeats present in the genomes of different organisms for development of pal-indrome-based integrative vectors.

INTRODUCTION

Genomic DNA may contain significant proportion of repeated sequences that may be dispersed throughout the genome or or-

ganized as direct repeats. Inverted repeats present a special class of repeated sequences due to their ability to form secondary structures known as hairpins or cruciforms. Palindromes are inverted repeats without spacer DNA and are frequently found in different cis-acting genetic elements like promoters, operators or origins of replication. However, in spite of their obvious biological importance, such motifs are unstable in different organisms, from bacteria to mammalian cells. Two types of models have been proposed to explain the instability of palindromes. The first were based on the existence of specific endo-nucleases that transform cruciforms into double-strand breaks (DSBs), while other propose that the recombinogenic structure is formed only when the progression of DNA replication is blocked by the formation of a secondary structure (1, 6, 8).

IVAN-KREŠIMIR SVETECBERISLAV LISNIĆZORAN ZGAGA

Laboratory of Biology and Microbial GeneticsFaculty of Food Technology and BiotechnologyUniversity of ZagrebPierottijeva 6P.O.B. 625HR-10 001 ZagrebCroatia

Corrrespondence:Ivan-Krešimir SvetecLaboratory of Biology and Microbial GeneticsFaculty of Food Technology and BiotechnologyUniversity of ZagrebPierottijeva 6P.O.B. 625HR-10 001 ZagrebCroatia

Key Words: palindrome, recombination,plasmid integration, yeast

Abbreviation:DSB - double strand break

Received September 26, 2002.

421

Original scientific paper

PERIODICUM BIOLOGORUMVOL. 104, No 4, 421-424, 2002

UDC 57:61CODEN PDBIAD

ISSN 0031-5362

A 110 bp palindrome stimulates plasmid integration in yeast

82 | A 110 bp palindrome stimulates plasmid integration in yeast

paper 3

Page 29: and distRibution of PalindRomes y Saccharomyces cerevisiae g · 2012. 1. 29. · genetic alterations which can have harmful impact on human health. The ability of palindromes to undermine

In the yeast S. cerevisiae, recombinogenic potential of inverted repeats and palindromes has been demon-strated for both mitochondrial and chromosomal DNA. In mitochondria, inverted repeats are responsible for the formation of amphimeric genomes in petite mutants (16), and mitochondrial endonucleases that cut the cruciform structures in vitro have been isolated in S. cerevisiae (3) and Schizosaccharomyces pombe (13). Palindromes in chromosomal DNA are recognized by the meiosis-spe-cific endonuclease Spo11 (12), but shorter palindromes were shown to be refractory to the mismatch repair during meiotic recombination (77). In mitotic cells, stimulation of recombination induced by inverted repeats was directly related to the size of repeats and inversely related to the size of intervening spacers, so that an inverted dimer of the URA3 gene (1,0 kb) stimulated recombination in the adjacent region up to 17,000-fold (70). Recombination was also stimulated by the pol3-t mutation in the gene encod-ing the DNA polymerase d, suggesting the involvement of DNA replication in the formation of a recombinogenic structure. Recently, Lobachev et al. (9) demonstrated the formation of DSBs at hairpin structures formed in vivo be-tween inverted repeats, but the responsible nuclease has not been identified.

In our previous work we studied the influence of a 110 bp palindromic insertion on plasmid integration in the homologous region present in the yeast genome (19). We found that the palindromic insertion did not stimulate plas-mid integration, but influenced some subsequent step(s) of the recombination process. In this study we decided to introduce the palindromic insertion in the chromosomal copy of the CYC1 gene to see whether the efficiency of transformation would be increased when the palindromic sequence was present both on the plasmid molecule and in the homologous region in the yeast genome.

MATERIALS AND METHODS

Plasmid and strain construction

Construction of the plasmids containing 102 bp palin-dromic (pAB218-5) or non-palindromic (pAB216) inser-tions in the EcoR I site of the 1685 bp yeast CYC1 region has been described (79). These two insertions were partially homologous, since the monomer used to create inverted repeats was also present in non-palindromic insertion, and the overall size of the palindromic sequence (insert together with flanking regions) was 110 bp (79). These two plasmids also contained two additional mutations in that re-gion created by inactivation of the restriction enzyme sites Kpn I and Sph I. In this work we used the plasmids pAB218-7 (palindromic insertion) and pAB218-8 (non-palindromic insertion) that were the same as the plasmids pAB218-5 and pAB218-6 but did not contain these two mutations. The CYC1 region, also contained two sites for the restriction enzyme Xho I, so that the digestion of the plasmid DNA with this enzyme created a 434 bp gap 256 bp upstream

from the EcoR I site. Standard procedures were used for all DNA manipulations (77) and the yeast strain used was FF18-52 (MATa, leu2-3,112, ura3-52, ade5, trpl-289, can1). The strains containing desired insertions in their CYC1 re-gions were constructed as follows. First, the spheroplasts were transformed with the plasmid DNA that was cut with Xho I in order to target integrations to the CYC1 region and the structure of the CYC1 locus was analyzed by South-ern blotting (see below). Two Ura+ transformants (one for each plasmid) were further grown to the stationary phase in 4 ml of the minimal supplemented medium without ura-cil. Aliquots were plated on the agar plates containing 0.75 mg/ml of 5-fluoroorotic acid for selection of the cells that had lost the URA3 gene due to »pop-out« recombination event between directly repeated CYC1 regions (2). These were further analyzed by Southern blotting in order to de-tect the colonies that contained a single copy of the CYC1 region carrying the insertion formerly present on the plas-mid.

Yeast transformation and analysis of transformants

Yeast transformation by a modified spheroplast proce-dure and Southern-blot analysis of transformants were de-scribed previously (4, 7). For strain construction, genomic DNA of Ura- colonies was digested with the restriction en-zyme Sac I that cuts the plasmids pAB218-7 and 218-8 only within insertion in the CYC1 region. Relative efficiency of transformation was determined by comparing the efficien-cies of transformation obtained with circular plasmid and the plasmid cut with Xho I in the CYC1 region. In each ex-periment the two samples were transformed with gapped plasmid (0.4 µg) and ten samples with circular DNA (2 µg) and the efficiency of transformation was expressed as the number of transformants per µg, of DNA.

RESULTS AND DISCUSSION

Strain construction

The lack of stimulation of plasmid integration in the pres-ence of a palindromic insertion, observed in our previous study, could be due to different reasons. For example, if the creation of a recombinogenic structure depends on DNA replication (5, 8), a palindrome present on a non-rep-licative plasmid would not stimulate plasmid integration. Alternatively, if the palindromic insertion was recombino-genic even in the absence of DNA replication, recombina-tion could be blocked by the lack of homology to the pal-indrome in the genomic copy of the CYC1 region. In order to overcome possible barriers for palindrome-stimulated transformation, we decided to transfer the palindromic and non-palindromic insertions (used as a control) from the plasmids pAB218-7 and pAB218-8 in the CYC1 region present in the yeast genome. We first transformed the yeast strain FF18-52 with the plasmid DNA cut with Xho I and a sample of transformants was further analyzed by Southern blotting. For both plasmids we observed that the

I.K. Svetec et al. Palindrome-induced plasmid integration

422 Period biol, Vol 104, No 4, 2002.

A 110 bp palindrome stimulates plasmid integration in yeast | 83

paper 3

Page 30: and distRibution of PalindRomes y Saccharomyces cerevisiae g · 2012. 1. 29. · genetic alterations which can have harmful impact on human health. The ability of palindromes to undermine

heterologous insertion present on the plasmid molecule was lost in the majority of transformants (Figure 1). This is in accordance with current models for homologous recom-bination predicting that the unbroken DNA molecule acts as a donor of genetic information (15). A single colony from each strain containing the insertion in the left copy of the CYC1 region (as presented in Figure 1) was further used to isolate spontaneous Ura+ colonies that were analyzed by Southern blotting. This analysis revealed a striking dif-ference between the strains containing palindromic and non-palindromic insertions in the duplicated copy of the CYC1 region. In the strain containing the non-palindromic insertion 18/60 auxothrophs retained the insertion, while only 1/80 was detected in the strain that contained the palindromic insertion in the CYC1 region (Figure 2). Pref-erential loss of the palindromic sequence was previously detected during integration of the circular plasmid (19) and could be explained in two ways. One possibility is that the hairpin structure formed in the DNA heteroduplex was preferentially excised and another is that the recom-bination between two copies of the CYC1 gene was in fact initiated by a DSB created in, or close to the palindrome. In this case, heterologous termini containing palindromic sequence should be degraded prior to the completion of the recombination process (18). We prefer the latter

interpretation since palindromic sequences have been shown to be refractory to the mismatch repair, at least in meiotic cells (11).

Yeast transformation

In order to find out whether the palindromic sequence would promote integrative transformation we compared experimental systems in which both the chromosomal and the plasmid copy of the CYC1 region contained either palindromic or non-palindromic insertions. The insertions were of the same size and placed in the same location with-in the CYC1 gene. Transformation was performed with cir-cular plasmids and also with the plasmids linearized with restriction endonuclease Xho I. For each experiment, the relative efficiency of transformation was determined as the ratio between transformation efficiencies obtained with circular and linearized molecules, in order to compare the results obtained with different strains and different plasmid preparations. In all experiments performed the relative efficiency of transformation was higher when the palindromic sequences were present in the CYC1 region (Figure 3). Higher fluctuations were observed with palin-dromic sequences (2.26 ± 1.13 x 102) than with non-palin-dromic sequences (0.47 ± 0.22 x 102). This could be due, for example, to different degrees of super-coiling present in different plasmid preparations used for transformation, since negative supercoiling was shown to stimulate cru-ciform extrusion in vivo (8). The average increase in the relative efficiency of transformation due to the presence of the palindrome was 4.8 fold. This effect is rather modest in relation to the DSB-stimulated plasmid integration (14) and is comparable to the single base-pair mismatch-stimulat-ed plasmid integration (20). However, based on the work of Lobachev et al. (10), we could expect highly increased stimulation of recombination with increasing size of a palindrome. The palindromes longer than approximately 150 bp, that cannot be propagated in wild-type Escheri-chia coli, can be stably maintained in sbcCD mutants (8) and used for the study of integrative transformation

I.K. Svetec et al. Palindrome-induced plasmid integration

423 Period biol, Vol 104, No 4, 2002.

FIGURE 1. Targeted plasmid integrations in the CYC1 region. Plasmid DNA was treated with the restriction endonuclease Xho I before transformation. Different types of integration events were revealed by Southern blot analysis of transformants.

FIGURE 2. Recombination-associated loss of the insert. The presence of the insert in the CYC1 region among uracil auxo-trophs was revealed by Southern blot analysis. Symbols are as in Figure 1.

FIGURE 3. The influence of the palindrome on the transforma-tion efficiency. Both plasmid and chromosomal copy of the CYC1 gene contained either palindromic (A) or non-palindromic (B) insertions. For each strain, the relative transformation efficien-cies obtained in five independent experiments and the average values are indicated.

A

rela

tive

effic

ienc

y of

tran

sfor

mat

ion

(x 1

02 )

B

84 | A 110 bp palindrome stimulates plasmid integration in yeast

paper 3

Page 31: and distRibution of PalindRomes y Saccharomyces cerevisiae g · 2012. 1. 29. · genetic alterations which can have harmful impact on human health. The ability of palindromes to undermine

in organisms other than yeast, where DSBs fail to stimu-late and target plasmid integration.

Acknowledgements: This work was supported by the grants from the Ministry of Science and Technology of the Republic of Croatia Nos. 0058405 and 0058014.

REFERENCES

1. BISSLER J J 1998 DNA inverted repeats and human dis- ease. Frontiers in Bioscience 3:408-418

2. BOEKE J, LACRUTE F, FINK G R 1984 A positive selec- tion for mutants lacking orotidine-5’-phosphate decar- boxylase activity in yeast: 5-fluoro-orotic acid resistance. Mol Gen Genet 197: 345-346

3. EZEKIEL U R, ZASSENHAUS H P 1993 Localization of a cruciform cutting endonuclease to yeast mitochon- dria. Mol Gen Genet 240: 414-418

4. GJURAČIĆ K, ZGAGA Z 1996 Illegitimate integration of single-stranded DNA in Saccharomyces cerevisiae. Mol Gen Genet 253: 173-181

5. GORDENIN D A, LOBACHEV K S, DEGTYAREVA N P, MALKOVA A L, PERKINS E 1993 Inverted DNA re- peats: a source of eukaryotic genomic instability. Mol Cell Biol 13: 5315-5322

6. GORDENIN D A, RESNICK M A 1998 Yeast ARMs (DNA at-risk motifs) can reveal sources of genome in- stability. Mutat Res 400: 45-58

7. KOREN P, SVETEC I K, MITRIKESKI P T, ZGAGA Z 2000 Influence of homology size and polymorphism on plas- mid integration in the yeast CYC1 DNA region. Curr Genet 37: 292-2978. LEACH D R F 1994 Long DNA palindromes, cruciform structures, genetic instability and secondary structure repair. BioEssays 16: 893-8989. LOBACHEV K S, GORDENIN D A, RESNICK M A 2002 The Mre11 complex is required for repair of hairpin-capped double-strand breaks and preven- tion of chromosome rearrangements. Cell 103: 183-193

10. LOBACHEV K S, SHOR B M, TRAN H T, TAYLOR W, KEEN J D, RESNICK M A, GORDENIN D A 1998 Fac- tors affecting inverted repeat stimulation of recombi- nation and deletion in Saccharomyces cerevisiae. Ge- netics 745:1507-1524

11. NAG D K, WHITE M A, PETES T D 1989 Palindromic sequences in heteroduplex DNA inhibit mismatch re- pair in yeast. Nature 340: 318-320

12. NASAR F, JANOWSKI C, NAG D K 2000 Long palin- dromic sequences induce double-strand breaks dur- ing meiosis in yeast. Mol Cell Biol 20: 3449-3458

13. ORAM M, KEELEY M, TSANEVA A 1998 Holliday junction resolvase in Schizosaccharomyces pombe has identical endonuclease activity to the CCE1 homologue YDC2. Nucl Acids Res 26: 591-601

14. ORR-WEAVER T L, SZOSTAK J W, ROTHSTEIN R J 1981 Yeast transformation: a model system for the study of recombination. Proc Natl Acad Sci USA 78: 6354-6358

15. PAQUES F, HABER J E 1999 Multiple pathways of re- combination induced by double-strand breaks in Sac- charomyces cerevisiae. Microbiol Mol Biol Rev 63: 349-404

16. RAYKO E, GOURSOT R 1999 Amphimeric mitochond- rial genomes of petite mutants of yeast. III. Genera- tion by linking two secondary-structure-dependent il- legitimate recombination events. Curr Genet 35: 14- 22

17. SAMBROOK J, FRITCH E F, MANIATIS T 1989 Mo- lecular cloning: a laboratory manual, 2nd ed., Cold Spring Harbor Laboratory Press, Cold Spring Harbor, New York

18. SVETEC I K 2000 The influence of short heterologous sequences on plasmid integration in the yeast Sac- charomyces cerevisiae genome. Master’s Thesis, University of Zagreb

19. SVETEC I, STJEPANDIĆ D, BORIĆ V, MITRIKESKI P T, ZGAGA Z 2001 The influence of a palindromic in- sertion on plasmid integration in yeast. Food Technol Biotechnol 39: 169-173

20. ZGAGA Z, CHANET R, RADMAN M, FABRE F 1991 Mismatch-stimulated plasmid integration in yeast. Curr Genet 19: 329-332

I.K. Svetec et al. Palindrome-induced plasmid integration

424 Period biol, Vol 104, No 4, 2002.

A 110 bp palindrome stimulates plasmid integration in yeast | 85

paper 3