

In search of new genomic regions involved in maize domestication.

Alvarez-Mejia, Cesar, Martinez de la Vega Octavio, Herrera-Estrella Luis, Herrera-Estrella Alfredo, and Vielle-Calzada Jean-Philippe.

Laboratorio Nacional de Genómica para la Biodiversidad; Langebio Cinvestav Irapuato. Km. 9.6, Libramiento Norte Carretera Irapuato-León, CP 36821, Irapuato, Guanajuato, México.

Methodology and Results

Summary.

a) Genomic sequence comparisons between a maize landrace and an inbred line are useful to find new genomic regions involved in domestication.

b) Regions containing a high number of NIdSRs can often represent organellar genome insertions.

c) A pilot analysis of nucleotide variability is underway for new widely distributed regions showing low polymorphism in B73, Palomero, and Mo17.

C. Selection of widely diverse regions for analysis of nucleotide variability.

A. Genomic comparison of Zea mays ssp. mays B73 inbred line [6,7] (ZmB73) and Zea mays ssp. mays Palomero landrace [8] (ZmPal) and identification of Nearly Identical Sequence Regions (NIdSRs)

Maize was domesticated from teosinte (Zea mays ssp. parviglumis) in Mexico ~9000 years ago, presumably in a region of the Balsas river drainage at the intersection of the states of Michoacán, Guerrero, and México [1,3]. Although some genes, such as TB1 and TGA1, have been shown to be affected by artificial selection related to domestication [2,4], their function is not sufficient to explain the drastic morphological differences that distinguish teosinte from maize, suggesting that an important portion of the genomic regions that contributed to maize domestication remains to be explored. We have initiated a comparative genomic procedure to find new genomic regions containing candidate genes that were influenced by maize domestication.

Example of genomic structure and annotation of 100 kb around a target gene (A). Blue: ZmB73 genome linear trend. Red: location of Palomero sequences with at least 95% shared identity. Regions for nucleotide polymorphic analysis are marked in green.

B. Initial studies of nucleotide variability in gene target regions.

Red line: distribution of NIdSRs in segments of 150 Kb each.

Blue line: frequency of recombination (in cM/Mb) [5].

Grey lines: syntenic analysis of the 150 Kb regions containing the highest number of NIdSRs.

References

1. Moeller, D. A., Tenaillon, M. I., Tiffin, P. Genetics 176, 1799-809 (2007).
2. Clark, R. M., Wagler, T. N., Quijada, P., Doebley, J. Nature Genetics 38, 594-7 (2006).
3. Matsuoka, Y. Breeding Science 55, 383-390 (2005).
4. Doebley, J., Gaut, B., Smith, B. Cell 127, 1309-1321 (2006).
5. Liu, S., Yeh, C.-T., Ji, T., Ying, K., Wu, H., Tang, H. M., Fu, Y., Nettleton, D., Schnable, P. S. PLoS Genetics 5, e1000733 (2009).
6. http://www.maizegdb.org
7. http://www.maizesequence.org
8. Vielle-Calzada, J.-P., Martinez de la Vega, O., Hernandez-Guzman, G., et al. Science 326, 1078 (2009).

Comparative Genomic Analysis ZmB73-Palomero

- ≥ 95% identity.
- ≥ 200 bp 100% identical.
- Non-redundant at the genome level.
- Selection of Nearly Identical Sequence Regions (NIdSRs).

Gene Target

- Length ≥ 1 Kb.
- Gene annotation.
- High concentration of NIdSRs per 150 Kb.
- Frequency of recombination ≥ 1 cM/Mb.

Polymorphic analysis determination

- Selection of 10 candidate genes.
- Selection and annotation of 100 Kb around the selected NIdSR.
- Polymorphic analysis of three regions contained within the 100 Kb.
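The NIdSR criteria listed above can be sketched as a simple filter over pairwise alignments. This is only an illustration, not the pipeline used for the poster: the `Alignment` record, its field names, and the toy values are hypothetical, "≥ 200 bp 100% identical" is read here as a longest exact-match stretch of at least 200 bp, and "non-redundant at the genome level" is read as a contig that aligns to a single ZmB73 location.

```python
from dataclasses import dataclass


@dataclass
class Alignment:
    """One ZmPal contig aligned to ZmB73 (hypothetical record layout)."""
    contig: str
    chrom: str
    start: int              # 0-based start on the ZmB73 chromosome
    length: int             # aligned length in bp
    identity: float         # percent identity over the alignment
    longest_exact_run: int  # longest 100%-identical stretch, in bp
    genome_hits: int        # number of ZmB73 locations the contig aligns to


def is_nidsr(aln, min_identity=95.0, min_exact_run=200):
    """Apply the three NIdSR criteria as interpreted in the lead-in."""
    return (
        aln.identity >= min_identity
        and aln.longest_exact_run >= min_exact_run
        and aln.genome_hits == 1  # assumed meaning of "non-redundant"
    )


def passes_length_cutoff(aln, min_length=1000):
    """Length criterion (>= 1 Kb) applied later to gene targets."""
    return aln.length >= min_length


# Invented toy records, not data from the poster.
alignments = [
    Alignment("contig_a", "chr5", 1_200_000, 2405, 100.0, 2405, 1),
    Alignment("contig_b", "chr1", 300_000, 800, 96.5, 250, 1),
    Alignment("contig_c", "chr2", 50_000, 1500, 93.0, 400, 1),  # identity too low
    Alignment("contig_d", "chr3", 90_000, 1200, 99.0, 600, 4),  # repeated in genome
]

nidsrs = [a for a in alignments if is_nidsr(a)]
targets = [a for a in nidsrs if passes_length_cutoff(a)]
print([a.contig for a in nidsrs])
print([a.contig for a in targets])
```

The remaining gene-target criteria (gene annotation, NIdSR density per 150 Kb, and recombination frequency) depend on external data and are left out of this sketch.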

The NIdSRs in the ZmB73 genome were grouped into arbitrary classes according to the number of NIdSRs found per 150 Kb segment. Whereas the classes containing the highest number of NIdSRs (classes 17 to 26) are related to chloroplast and mitochondrial genomic insertions, the classes with the lowest number of NIdSRs (classes 1 to 5) are related to neutral genes.
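A minimal sketch of this grouping, assuming each NIdSR is assigned to the contiguous 150 Kb segment that contains its start coordinate. Numbering the classes by ranking the distinct counts observed is also an assumption, although it matches the class table (class 1 has 0 NIdSRs, class 17 has 18). The chromosome names, lengths, and positions are invented.

```python
from collections import Counter

SEGMENT_BP = 150_000  # segment size used in the poster


def count_per_segment(positions, chrom_lengths, segment_bp=SEGMENT_BP):
    """Return {(chrom, segment_index): number of NIdSRs}, empty segments included."""
    counts = {}
    for chrom, length in chrom_lengths.items():
        n_segments = -(-length // segment_bp)  # ceiling division
        for index in range(n_segments):
            counts[(chrom, index)] = 0
    for chrom, start in positions:
        counts[(chrom, start // segment_bp)] += 1
    return counts


def classify(counts):
    """Return {class: (NIdSRs per segment, number of segments)}.

    Classes are numbered from 1 by increasing NIdSR count (assumed scheme).
    """
    histogram = Counter(counts.values())
    return {
        rank: (n_nidsrs, histogram[n_nidsrs])
        for rank, n_nidsrs in enumerate(sorted(histogram), start=1)
    }


# Invented toy genome: two chromosomes and five NIdSR start positions.
chrom_lengths = {"chrA": 450_000, "chrB": 200_000}
positions = [
    ("chrA", 10_000), ("chrA", 20_000), ("chrA", 140_000),
    ("chrA", 160_000), ("chrB", 155_000),
]

classes = classify(count_per_segment(positions, chrom_lengths))
for rank, (n_nidsrs, n_segments) in classes.items():
    print(rank, n_nidsrs, n_segments)
```

Segments without any NIdSR are counted explicitly, since they make up the largest class in the poster's table.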

Classification and distribution of NIdSRs.

Neutral gene   chr    Class   NIdSRs/150 Kb   cM/Mb
bz2            chr3   2       1               1.12
adh1           chr1   1       0               1.2
umc128         chr1   1       0               1.46
an1            chr1   1       0               1.49
gbl1           chr9   4       3               3.99
fus6           chr1   1       0               4.36

Low-variability gene   chr    Class   NIdSRs/150 Kb   cM/Mb
tb1                    chr1   7       6               1.03
tga1                   chr4   9       8               0.3
Cd transporter         chr5   4       3               1.12
Cu transporter         chr5   8       7               1.08
Multicopper oxidase    chr5   6       5               1.22

Class      NIdSRs per 150 Kb segment   Number of 150 Kb segments
1          0                           4637
2          1                           3561
3          2                           2275
4          3                           1371
5          4                           852
6          5                           455
7 (tb1)    6                           272
8          7                           132
9 (tga1)   8                           90
10         9                           48
11         10                          22
12         11                          12
13         12                          2
14         13                          8
15         14                          1
16         15                          1
17         18                          2
18         19                          2
19         22                          1
20         23                          1
21         24                          1
22         26                          2
23         47                          1
24         49                          1
25         64                          1
26         96                          1
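As a consistency check, the class table can be summed directly. The sketch below reads the second column as the number of NIdSRs found in each 150 Kb segment of the class; this is an assumption, but it is the reading that agrees with the per-gene tables (tb1: class 7, 6 NIdSRs per 150 Kb). The variable names are illustrative.

```python
# (NIdSRs per 150 Kb segment, number of segments), classes 1 to 26, from the table.
CLASS_TABLE = [
    (0, 4637), (1, 3561), (2, 2275), (3, 1371), (4, 852), (5, 455),
    (6, 272), (7, 132), (8, 90), (9, 48), (10, 22), (11, 12), (12, 2),
    (13, 8), (14, 1), (15, 1), (18, 2), (19, 2), (22, 1), (23, 1),
    (24, 1), (26, 2), (47, 1), (49, 1), (64, 1), (96, 1),
]
SEGMENT_BP = 150_000

total_segments = sum(n_seg for _, n_seg in CLASS_TABLE)
total_nidsrs = sum(per_seg * n_seg for per_seg, n_seg in CLASS_TABLE)
genome_span_gb = total_segments * SEGMENT_BP / 1e9

# Classes 1 to 5 (0 to 4 NIdSRs) and classes 17 to 26 (18 or more NIdSRs).
low_class_segments = sum(n_seg for per_seg, n_seg in CLASS_TABLE if per_seg <= 4)
high_class_segments = sum(n_seg for per_seg, n_seg in CLASS_TABLE if per_seg >= 18)

print(total_segments, total_nidsrs, round(genome_span_gb, 2))
print(low_class_segments, high_class_segments)
```

With the values above, the table covers 13,752 segments, or about 2.06 Gb of sequence. Classes 1 to 5 account for 12,696 of those segments, while the organelle-related classes 17 to 26 account for only 13.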

The distribution of NIdSRs in the ZmB73 genome shows some particularities: (1) zones with the highest content of NIdSRs correspond to regions with chloroplast and mitochondrial identity; (2) zones with the lowest content of NIdSRs correspond to regions with high nucleotide variability, often related to redundant or repeated sequences, or to neutral genes. For further investigation, we selected the classes into which the genomic segments containing tb1 and tga1 fall. The selection of these classes also takes into consideration a frequency of recombination of at least 1 cM/Mb, a pairwise alignment length of at least 1 Kb, and ZmB73 gene annotation. Close to 200 genes were analyzed and 7 were selected in a pilot screen to study nucleotide variability and test for neutrality.

From a list of close to 200 genes, we selected 7 genes to analyse nucleotide variability in 16 native landraces and 16 local Balsas teosinte populations. We also plan to pursue our analysis of previously identified regions containing heavy-metal response genes affected by domestication [8]. Identification and comparison of these genes with their genomic sequence in the Mo17 inbred line confirmed a drastic reduction in nucleotide variability.

Taking the position of the target gene as a reference, we will analyse nucleotide variability in a region encompassing 100 kb around the coding sequence. This procedure could offer hints of possible selective sweep events.
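The poster does not state which variability statistics will be used. As an illustration only, the sketch below computes two standard summaries from a set of aligned, equal-length sequences: nucleotide diversity (π, the mean number of pairwise differences per site) and Watterson's θ per site, which is based on the number of segregating sites. The function names and the toy alignment are hypothetical, and gaps or missing data are not handled.

```python
from itertools import combinations


def segregating_sites(seqs):
    """Number of alignment columns with more than one distinct base."""
    return sum(1 for column in zip(*seqs) if len(set(column)) > 1)


def nucleotide_diversity(seqs):
    """Pi: mean number of pairwise differences per site."""
    n_sites = len(seqs[0])
    pairs = list(combinations(seqs, 2))
    differences = sum(
        sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs
    )
    return differences / (len(pairs) * n_sites)


def watterson_theta(seqs):
    """Watterson's theta per site: S / a1 / L, with a1 = sum of 1/i for i < n."""
    a1 = sum(1 / i for i in range(1, len(seqs)))
    return segregating_sites(seqs) / a1 / len(seqs[0])


# Invented toy alignment: three sequences, four sites, one polymorphic site.
toy = ["ACGT", "ACGA", "ACGT"]
print(segregating_sites(toy))
print(nucleotide_diversity(toy))
print(watterson_theta(toy))
```

Computed separately for the landrace and teosinte samples in windows across the 100 kb region, a local drop in maize diversity relative to teosinte is the kind of signal a selective sweep is expected to leave. A formal neutrality test would need more than these two summaries.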

Introduction

Acknowledgements. We thank Patrick Schnable for the recombination frequency data, and MaizeGDB and maizesequence.org for the B73 genome data. This work was supported by grant ZEA-2006 from SAGARPA, CONACyT, and the Howard Hughes Medical Institute International Scholars Program.

Distribution of NIdSRs in continuous 150 Kb genomic segments

NIdSR               chr    Length (bp)   Identity (%)   Annotation                                                    Mo17 comparison
E09Contig186729.1   chr8   1538          100            Hypothetical protein SORBIDRAFT_03g013535                     100% (509 bp)
E09Contig86112.1    chr7   1206          100            NIN-like protein 1 [Zea mays]                                 99.60% (502 bp)
E09Contig157162.1   chr1   1311          100            Leucine-rich repeat transmembrane protein kinase 1            99.21% (505 bp)
E09Contig156774.1   chr8   1046          100            ER degradation-enhancing alpha-mannosidase-like 1             99.59% (492 bp)
E09Contig17638.1    chr2   1306          99.92          Antiporter/drug/transporter/transporter [Zea mays]            99.61% (508 bp)
E09Contig210809.1   chr7   1090          99.91          ARF gap like zinc finger protein ZIGA3                        99.59% (482 bp)
E09Contig189736.1   chr9   2534          100            Leucine-rich repeat transmembrane protein kinase 2 [Zea mays]  100% (525 bp)
E09Contig10846.1    chr5   2405          100            ATPase cadmium transporter [Zea mays]                         99.61% (518 bp)
E09Contig188819.1   chr5   2276          100            Copper transporter [Zea mays]                                 99.81% (513 bp)
E09Contig74849.1    chr5   1589          100            Multicopper oxidase protein [Zea mays]                        100% (516 bp)


Distribution of teosinte in Mexico. Adapted from Matsuoka et al. 2005.