

Association genetics of chemical wood properties in black poplar (Populus nigra)

Fernando P. Guerra 1,2, Jill L. Wegrzyn 1, Robert Sykes 3, Mark F. Davis 3, Brian J. Stanton 4 and David B. Neale 1,5

1 Department of Plant Sciences, University of California at Davis, Davis, CA, 95616, USA

2 Instituto de Biología Vegetal y Biotecnología, Universidad de Talca, Talca, PO Box 747, Chile

3 National Renewable Energy Laboratory, Golden, CO, 80401, USA

4 Genetic Resources Conservation Program, Greenwood Resources, Portland, OR, 97201, USA

5 Bioenergy Research Center (BERC), University of California at Davis, Davis, CA, 95616, USA

Author for correspondence: David B. Neale

Tel: +1 530 754 8431

Email: [email protected]

Received: 16 July 2012

Accepted: 12 September 2012

New Phytologist (2013) 197: 162–176

doi: 10.1111/nph.12003

Key words: association genetics, bioethanol, cellulose, half-sib families, lignin, Populus nigra, single nucleotide polymorphism (SNP).

Summary

• Black poplar (Populus nigra) is a potential feedstock for cellulosic ethanol production, although breeding for this specific end use is required. Our goal was to identify associations between single nucleotide polymorphism (SNP) markers within candidate genes encoding cellulose and lignin biosynthetic enzymes, with chemical wood property phenotypic traits, toward the aim of developing genomics-based breeding technologies for bioethanol production.

• Pyrolysis molecular beam mass spectrometry was used to determine contents of five- and six-carbon sugars, lignin, and syringyl : guaiacyl ratio. The association population included 599 clones from 17 half-sib families, which were successfully genotyped using 433 SNPs from 39 candidate genes. Statistical analyses were performed to estimate genetic parameters, linkage disequilibrium (LD), and single marker and haplotype-based associations.

• A moderate to high heritability was observed for all traits. The LD, across all candidate genes, showed a rapid decay with physical distance. Analysis of single marker–phenotype associations identified six significant marker–trait pairs, whereas nearly 280 haplotypes were associated with phenotypic traits, in both an individual and multiple trait-specific manner.

• The rapid decay of LD within candidate genes in this population and the genetic associations identified suggest a close relationship between the associated SNPs and the causative polymorphisms underlying the genetic variation of lignocellulosic traits in black poplar.

Introduction

The need for alternative sources of energy has led to recent efforts to develop renewable sources for biofuels. Ethanol is the liquid transportation fuel that is expected to be widely used around the globe (Sticklen, 2008). Feedstock crops recommended for producing cellulosic ethanol have a high amount of lignocellulosic biomass. Candidate species include perennial grasses (e.g. switchgrass and miscanthus) and trees (e.g. poplars and eucalyptus) (Abramson et al., 2010). Poplars have several advantages relative to herbaceous crops, including amount of biomass, flexibility of harvest time, lower requirements for cultivation and the possibility of combining biomass production with parallel purposes (e.g. phytoremediation) (Davis, 2008). Additionally, poplar wood contains much lower amounts of fermentation-inhibiting extractives than other tree species, with a higher conversion efficiency of the biomass. However, the development of poplars as a dedicated bioenergy crop will require matching of genetically improved material with appropriate silviculture so as to generate sustainable biomass yield in a cost-effective manner (Davis, 2008). Hybridization of selected poplars is an important alternative to generate improved fast-growing genotypes for bioenergy. Black poplar (Populus nigra) is a primary candidate species. It is naturally distributed across Europe, Northern Africa, and Asia (Chu et al., 2009; Slavov & Zhelev, 2010), and it is widely used as a parental species to produce commercial hybrids (P. × canadensis) by crossing with P. deltoides. However, a better understanding of genetic mechanisms underlying its chemical wood properties is needed to guide hybrid breeding programs for cellulosic ethanol production.

Traditional breeding of forest trees for cellulosic bioethanol production can be enhanced by approaches based on marker-assisted selection (MAS). Advantages of MAS include a reduced breeding cycle time, reduced cost of field testing, and an increased intensity and efficiency of selection (Neale & Kremer, 2011). One possible approach to MAS depends on the dissection of complex traits to their individual genes and the identification of marker–trait associations, accounting for a sizable portion of the variation in the breeding population. A second approach, genomic selection, does not require specific marker–trait associations to be identified, but it requires enough genetic marker information to predict accurately phenotypic breeding values on the basis of marker information alone. Discovery of marker–trait relationships using association genetics is a widely used approach and is particularly powerful in forest trees because of the general lack of significant population structure in many tree populations and the rapid decay of LD (Neale & Savolainen, 2004). This decay of LD is important because once a marker–trait association has been discovered and validated, it is likely that such a marker is at a close physical distance to the functional variant or even the functional variant itself (Neale & Kremer, 2011). In recent years, association studies utilizing SNP markers have greatly benefited from the development of high-throughput technologies for sequencing and genotyping (Syvänen, 2005; Varshney et al., 2009). A diverse group of complex traits have been studied in forest tree species by association genetics (González-Martínez et al., 2007; Eckert et al., 2009, 2012; Dillon et al., 2010; Quesada et al., 2010; Beaulieu et al., 2011; Cumbie et al., 2011). Specifically in poplar, there is one published association study where candidate gene SNP associations with lignin and cellulose traits of Populus trichocarpa were reported (Wegrzyn et al., 2010).

New Phytologist (2013) 197: 162–176. © 2012 The Authors. New Phytologist © 2012 New Phytologist Trust. www.newphytologist.com

The plant secondary cell wall is the source of lignocellulosic biomass, which is made of complex structures that mainly comprise cellulose, hemicellulose and lignin (Wang & Dixon, 2011). The structure, configuration and composition of cell walls vary depending on plant species, tissue, age and cell type, and cell wall layers (Sticklen, 2008). Aspects related to the development of lignocellulosic biomass (xylem) in poplars have been reviewed elsewhere by Groover et al. (2010). Ethanol production from biomass involves a pretreatment stage followed by saccharification of cellulose and hemicellulose to simple sugars via hydrolysis and then fermentation of the free sugars to ethanol. This conversion process is limited by the effective breakdown of the polysaccharide wall into fermentable sugars. The degree of lignification as well as cellulose crystallinity and extent of its polymerization are determining factors (Abramson et al., 2010). The effects of lignin content and ratio of syringyl (more easily extractable) to guaiacyl (less easily extractable) units (S : G ratio) on sugar release have been studied in poplars (Davison et al., 2006; Studer et al., 2011). However, a determination of the genes underlying these complex traits has just begun (Wegrzyn et al., 2010). In that study, genetic associations between wood chemistry traits and cellulose and lignin biosynthesis pathway candidate gene SNP markers were first reported. Nearly 880 SNPs were genotyped in a population comprising 448 clones. Significant single-marker and haplotype-based associations were identified across 40 candidate genes in traits representing lignin content, S : G ratio and six-carbon cellulose sugar.

This association study represents both a de novo discovery in P. nigra and a validation of results previously found by Wegrzyn et al. (2010) in P. trichocarpa. Pyrolysis molecular beam mass spectrometry (pyMBMS) was utilized to analyze the chemical composition of wood samples extracted from 2-yr-old trees. This method allowed us to determine five-carbon hemicellulose sugar content, six-carbon cellulose sugar content, S : G ratio and lignin content in an association population composed of open-pollinated families. Single-marker and haplotype-based models were used to assess the SNP–trait combinations. These approaches allowed us to identify several significant associations explaining part of the genetic variation of chemical wood properties in black poplar.

Materials and Methods

Association population

The population included 599 Populus nigra L. clones, representing 17 different open-pollinated families (34–36 clones per family), from an equal number of provenances collected in Italy, Belgium and Serbia. Clones were produced from seedlings grown from the female parent trees. The clonal population was established in a provenance test plot, located at Boardman, OR, USA (45°50′N, 119°35′W, elevation 135 m) in 2008, by GreenWood Resources (Portland, OR, USA). The test plot used a randomized block design with two blocks (one ramet each). Two-year-old trees were sampled for DNA isolation and wood chemical analysis as described in the following. The soil in the trial site is part of the Quincy Soil Series and is described as very deep, excessively drained mixed sands, ranging from 0 to 12% slope. Annual precipitation averages 200 mm, occurring mostly during the winter and early spring. For the 2008 and 2009 crop years, the maximum temperature during the April–September growing season averaged 24.4°C. Monthly average maximum temperatures ranged from 16.1 to 34.4°C. The test plot was irrigated at the rate of 30 cm per growing season and 56 kg ha⁻¹ of elemental nitrogen was applied each year. Complete weed control was accomplished using both mechanical and chemical means.

Wood chemistry phenotyping

Sampling and chemical analysis

Disks of wood (25 mm thick and full diameter) were collected at 1.4 m height. Sampling included two ramets per clone (a total of 1000 sampled trees). Disks were kept in a freezer until they were processed at the GreenWood Resources Wood Laboratory (Westport, OR, USA). A three-dimensional piece of wood (15 mm wide, 25 mm high and variable length, depending on the tree diameter), excluding pith and bark, was extracted from the center of each disk. It was dried (50°C for at least 48 h), ground (using a Thomas Wiley Mini-Mill, Swedesboro, NJ, USA) and screened (40-mesh) to obtain a fine wood powder. Finally, ground wood samples were analyzed using pyMBMS (Sykes et al., 2009), at the National Renewable Energy Laboratory (Golden, CO, USA), as previously described by Wegrzyn et al. (2010). Wood chemistry determinations included four traits: five-carbon hemicellulose sugar content (C5), six-carbon cellulose sugar content (C6), S : G ratio and lignin content.

Statistical and genetic analysis

A mixed linear model approach was applied to assess the genetic variation underlying the wood chemistry traits. In a first stage, the normal distribution of the four chemical phenotypes was evaluated. The transformations C5T = log((C5 − 20)/(100 − C5)), C6T = log((C6 − 25)/(1000 − C6)) and LigninT = ((lignin)^9)/10 were used to normalize C5, C6 and lignin content, respectively. The S : G ratio was approximately normally distributed and was not transformed. An ANOVA was carried out for each trait using the following model:

\[ y_{ijk} = \mu + f_i + c_{ij} + b_k + (f \times b)_{ik} + e_{ijk} \qquad \text{(Eqn 1)} \]

where $y_{ijk}$ is the response measured on the kth ramet of the jth clone within the ith family, $\mu$ is the overall mean, $f_i$ is the random effect of family i, $c_{ij}$ is the random effect of clone j in family i, $b_k$ is the fixed effect of block k, $(f \times b)_{ik}$ is the random interaction effect of family i with block k, and $e_{ijk}$ is the random error in block k on clone j within family i. The ANOVA was performed using the GLM procedure in the software SAS 9.2 (SAS Institute, Cary, NC, USA). Additionally, variance components were estimated by restricted maximum likelihood analysis (REML) using the MIXED procedure. Individual broad-sense heritability ($H^2$) was estimated by Eqn 2, according to the methodology developed by Zamudio et al. (2008) for the genetic analysis of spatially repeated measures from clonal tests. Phenotypic and genetic correlations between chemical traits were estimated using Eqns 3 and 4, respectively:

\[ H^2 = \frac{\sigma_f^2 + \sigma_c^2 + \sigma_{ee'}}{\sigma_f^2 + \sigma_c^2 + \sigma_{f \times b}^2 + \sigma_e^2} \qquad \text{(Eqn 2)} \]

where $\sigma_f^2$, $\sigma_c^2$, $\sigma_{f \times b}^2$ and $\sigma_e^2$ represent the variance as a result of family, clone, family × block and residual, respectively, and $\sigma_{ee'}$ is the covariance between residuals of the same clone in two blocks.

\[ r_{p(x,y)} = \frac{\sigma_{(x,y)}}{\sqrt{\sigma_x^2 \cdot \sigma_y^2}} \qquad \text{(Eqn 3)} \]

where $\sigma_{(x,y)}$, $\sigma_x^2$ and $\sigma_y^2$ indicate the covariance between traits x and y, and the variances of traits x and y, respectively.

\[ r_{g(x,y)} = \frac{\sigma_{g(x,y)}}{\sqrt{\sigma_{gx}^2 \cdot \sigma_{gy}^2}} \qquad \text{(Eqn 4)} \]

where $\sigma_{g(x,y)}$, $\sigma_{gx}^2$ and $\sigma_{gy}^2$ indicate the genetic covariance for traits x and y, and the genetic variance of traits x and y, respectively. Genetic covariance was calculated by REML analysis using the VARCOMP procedure.
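Eqns 3 and 4 share one form: a covariance divided by the square root of the product of two variances. A minimal sketch of that calculation follows; the function name is hypothetical and the input numbers are invented purely for illustration, not taken from the study.

```python
import math


def trait_correlation(cov_xy, var_x, var_y):
    """Correlation between traits x and y from a covariance and two variances.

    Passing phenotypic components gives Eqn 3; passing genetic components
    gives Eqn 4.
    """
    if var_x <= 0 or var_y <= 0:
        raise ValueError("variances must be positive")
    return cov_xy / math.sqrt(var_x * var_y)


# Hypothetical covariance and variances, for illustration only.
r_example = trait_correlation(1.2, 1.69, 2.56)
print(round(r_example, 3))
```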

Resequencing and SNP discovery

A set of 39 candidate genes encoding enzymes involved in lignin and cellulose biosynthesis was considered for genotyping (Table 2). Methods for gene selection, DNA extraction, resequencing and SNP discovery were described previously by Wegrzyn et al. (2010). The v2.0 release of the P. trichocarpa genome (available at http://www.phytozome.net/) was used as the reference to design sequencing primers utilized for Sanger resequencing of P. nigra gene amplicons. Alignments for each amplicon comprising sequences > 200 bp are available in GenBank (accessions JX549461–JX555881). All sequences analyzed, including those < 200 bp, are available in the TreeGenes database (http://dendrome.ucdavis.edu/DiversiTree). A total of 768 SNPs were identified and included in the genotyping assay. That number comprised 698 focus SNPs (from genes related to lignin and cellulose biosynthesis) and 70 eSNPs used previously by Wegrzyn et al. (2010) for population structure control. SNP annotations were determined by aligning P. nigra sequences against corresponding P. trichocarpa gene models using the BioEdit software (Hall, 1999). The analysis of promoter regions and the search for splicing sites were done with PlantCARE (Lescot et al., 2002) and GENSCAN software (Burge & Karlin, 1997), respectively.

SNP genotyping

The SNP genotyping for all clones was carried out using the Illumina GoldenGate SNP genotyping platform (Fan et al., 2003), at the University of California, Davis Genome Center (Davis, CA, USA). The GoldenGate assay included a total of 768 SNPs. Signal intensities were determined using the iScan Reader platform (Illumina, USA) and matched to specific alleles by the GenomeStudio Genotyping Module v1.0 software (Illumina, USA). Manual corrections were performed on the automatically assigned genotypic clusters when required. Quality metrics were applied to the raw data to select the SNPs utilized in the genetic analysis. Thresholds of 0.20 and 0.60 for the GenCall50 (GC50) and call rate (CR) indices, respectively, were applied (Wegrzyn et al., 2010).
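The quality filter described above can be sketched as a simple threshold check. This is an illustrative example only: the function name, the SNP names and their scores are hypothetical, and treating the thresholds as inclusive lower bounds is an assumption, since the text does not say whether values equal to the threshold pass.

```python
def passes_quality(gc50, call_rate, min_gc50=0.20, min_call_rate=0.60):
    """Return True when a SNP meets both the GC50 and call rate thresholds.

    Inclusive comparison (>=) is an assumption, not stated in the text.
    """
    return gc50 >= min_gc50 and call_rate >= min_call_rate


# Hypothetical (GC50, call rate) scores for three made-up SNPs.
snp_scores = {
    "snp_a": (0.45, 0.98),  # passes both thresholds
    "snp_b": (0.15, 0.99),  # fails GC50
    "snp_c": (0.50, 0.40),  # fails call rate
}
kept = [name for name, (gc50, cr) in snp_scores.items() if passes_quality(gc50, cr)]
print(kept)
```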

Genetic diversity and linkage disequilibrium (LD)

Hierarchical fixation indices were estimated for SNP markers using the GENETICS and HIERFSTAT packages in R (Goudet, 2005; R Development Core Team, 2010; Warnes et al., 2011). SNPs with |FIS| > 0.25 were excluded from further analyses. LD was estimated among pairwise combinations of the 433 selected SNPs. It was expressed in terms of the squared correlation of allele frequencies, r². The r² value between pairs of SNP markers within candidate genes was estimated using the GENETICS package in R (Warnes et al., 2011). To assess the extent of LD in the genomic regions, the decay of LD with physical distance (base pairs) between SNPs within each candidate locus and over all candidate genes was evaluated by nonlinear regression analysis of r² values (Remington et al., 2001). Analysis was performed applying the NLIN procedure in SAS.
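To make the r² statistic concrete, the sketch below computes the squared Pearson correlation between allele dosages (0, 1 or 2 copies of one allele) at two SNPs. This is a simplified stand-in, not the estimator used in the study: the GENETICS package in R works from estimated haplotype frequencies, whereas this genotype-based version is a common approximation. The function name and the dosage vectors are hypothetical.

```python
def r_squared(dosage_a, dosage_b):
    """Squared Pearson correlation between allele dosages at two SNPs."""
    n = len(dosage_a)
    if n != len(dosage_b) or n < 2:
        raise ValueError("need two equal-length vectors with at least two trees")
    mean_a = sum(dosage_a) / n
    mean_b = sum(dosage_b) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(dosage_a, dosage_b))
    ss_a = sum((a - mean_a) ** 2 for a in dosage_a)
    ss_b = sum((b - mean_b) ** 2 for b in dosage_b)
    if ss_a == 0 or ss_b == 0:
        raise ValueError("r2 is undefined for a monomorphic SNP")
    return (cov * cov) / (ss_a * ss_b)


# Hypothetical dosages for four trees at three SNPs.
snp1 = [0, 1, 2, 1]
snp2 = [2, 1, 0, 1]  # perfectly (negatively) correlated with snp1
snp3 = [1, 2, 1, 0]  # uncorrelated with snp1 in this toy sample
print(r_squared(snp1, snp2), r_squared(snp1, snp3))
```

Pairs of r² values and the physical distance between the two SNPs are the inputs to the nonlinear regression of LD decay.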

Association analysis

Single SNP-based associations

The genetic association of candidate gene SNPs with wood chemistry traits was assessed by applying a transmission disequilibrium test, implemented in the software Quantitative Transmission Disequilibrium Test (QTDT) (Abecasis et al., 2000). QTDT applies a variance-component approach to test for allelic association, decomposing a genotype score into its orthogonal between- and within-family components (Fulker et al., 1999). This method is robust for controlling the spurious effects of population stratification and admixture in association designs. Linear models of association were tested using likelihood ratio tests. Given a marker M, with alleles A and B, a genotype score $g_{ij}$ for the jth offspring in the ith family is defined as the number of A alleles at locus M minus one. The genotype score can be decomposed into the between-family component $b_i$ and the within-family component $w_{ij} = g_{ij} - b_i$. In this equation, $b_i$ represents the average within-family genotype, obtained from sib information as:

\[ b_i = \sum_{k}^{\text{sibship}} \left( g_{ik} / n_{\text{sibs}} \right) \qquad \text{(Eqn 5)} \]

Thus, the means model under this specification is:

\[ y_{ij} = \mu + \beta_b b_i + \beta_w w_{ij} \qquad \text{(Eqn 6)} \]

where $\mu$ is the overall mean, $b_i$ and $w_{ij}$ are the orthogonal components of $g_{ij}$, and $\beta_b$ and $\beta_w$ are regression coefficients (González-Martínez et al., 2008). The presence of population stratification for each SNP was assessed by comparing the between- and within-family components of variance. If stratification was detected (χ², P < 0.05), the within-family component was assessed, comparing models including only the within-family component or both the between- and within-family components. In the absence of stratification ($\beta_b = \beta_w$), both components were modeled together in the total association model. The matrix of identity by descent probabilities, utilized in the association models, was estimated using the SimWalk2 package (Sobel et al., 2001) in QTDT. Correction for multiple testing was performed using the false discovery rate (FDR) through the program QVALUE (Storey & Tibshirani, 2003) implemented in R. An arbitrary q-value of 0.1 was considered as a significance threshold. For significant genetic associations, the phenotypic variation explained by the SNP marker was estimated as $2p(1 - p)a^2/V_p$, where $V_p$ is the total phenotypic variance, p is the marker allele frequency and a is the additive effect (González-Martínez et al., 2008). The additive effect was calculated as $a = p_B(G_{BB}) + p_b(G_{Bb}) - G$, where G is the overall trait mean, $G_{ij}$ is the trait mean in the ijth genotypic class and $p_i$ is the frequency of the ith marker allele (Wegrzyn et al., 2010). These values were estimated with respect to the minor allele. Means refer to least-square means. Significant differences among least-square means of genotypic classes were determined by ANOVA and the Tukey–Kramer test using the GLM procedure in SAS, from a model including the family, clone and genotype effects.
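The decomposition in Eqn 5 and the variance-explained formula can be sketched in a few lines. This is only an illustration of the arithmetic, not of QTDT itself (which fits variance-component models by likelihood); the function names, family labels and genotype counts are hypothetical.

```python
from collections import defaultdict


def decompose_scores(records):
    """Split genotype scores into between- and within-family components.

    records: list of (family_id, number_of_A_alleles) tuples.
    Returns a list of (family_id, b_i, w_ij) tuples in the input order,
    where g_ij = A-allele count minus one, b_i is the sibship mean of g
    (Eqn 5) and w_ij = g_ij - b_i.
    """
    scores_by_family = defaultdict(list)
    for family, n_a in records:
        scores_by_family[family].append(n_a - 1)
    between = {fam: sum(g) / len(g) for fam, g in scores_by_family.items()}
    return [(fam, between[fam], (n_a - 1) - between[fam]) for fam, n_a in records]


def variance_explained(p, a, v_p):
    """Share of phenotypic variance explained by a SNP: 2p(1 - p)a^2 / Vp."""
    if v_p <= 0:
        raise ValueError("phenotypic variance must be positive")
    return 2 * p * (1 - p) * a ** 2 / v_p


# Hypothetical genotypes: three sibs in family F1, two sibs in family F2.
toy = [("F1", 2), ("F1", 1), ("F1", 0), ("F2", 2), ("F2", 2)]
components = decompose_scores(toy)
for family, b_i, w_ij in components:
    print(family, b_i, w_ij)
```

Within each family the within-family components sum to zero, which is what makes the two components orthogonal.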

Haplotype-based associations

Tests of associations between candidate gene haplotypes and wood chemistry traits were performed using the Haplo.stats package in R (Sinnwell & Schaid, 2009). Haplotypes were estimated for each gene and their frequencies were determined using the modified expectation maximization method of haplotype inference (Schaid et al., 2002). Association was estimated utilizing as input matrices containing the genotype, phenotype and principal component analysis (PCA) coefficients for each tree. These coefficients were included to correct for population structure (Patterson et al., 2006). Eigenvalues and eigenvectors considered in PCA were obtained from the singular value decomposition of the matrix containing the genotypes of each tree and SNP markers over quality thresholds. The first four significant principal components (Tracy–Widom test, P < 0.0001), explaining 27.68% of the overall variance, were considered in the association analysis. Singleton alleles were discarded as well as the haplotypes with a frequency < 0.05. A global score statistic and haplotype-specific scores were derived from generalized linear models. A correction for multiple testing was also performed with an FDR, in the same way as was done for single marker associations (Wegrzyn et al., 2010). Classification of significant haplotypes was made with Venn diagrams using the software VENNY (Oliveros, 2007).
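To illustrate what an FDR correction does to a list of P-values, the sketch below implements the Benjamini–Hochberg adjustment. Note that this is a simpler, swapped-in technique: the study used Storey's q-value method through the QVALUE program, which additionally estimates the proportion of true null hypotheses. Benjamini–Hochberg corresponds to fixing that proportion at one, so it is somewhat more conservative. The function name and the P-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical P-values from four association tests.
p_vals = [0.01, 0.04, 0.03, 0.5]
adj = benjamini_hochberg(p_vals)
significant = [i for i, q in enumerate(adj) if q <= 0.1]
print(adj, significant)
```

Here the 0.1 cut-off mirrors the significance threshold quoted in the text, applied to the adjusted values.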

Mode of gene action

The mode of gene action was estimated using the ratio of dominance (d) to additive (a) effects calculated from least-square means for each genotypic class (Wegrzyn et al., 2010). Partial or complete dominance was defined as values in the range 0.50 < |d/a| < 1.25. An additive effect was defined for values in the range −0.50 ≤ d/a ≤ 0.50. Values of |d/a| > 1.25 were equated with over- or underdominance. The dominance effect was calculated as the difference between the phenotypic mean observed within the heterozygous class and the average phenotypic mean across both homozygous classes ($d = G_{Bb} - 0.5(G_{BB} + G_{bb})$, where $G_{ij}$ is the trait mean in the ijth genotypic class). The additive effect (a) was calculated as described earlier.
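The classification rule above translates directly into a small function. The sketch below is illustrative: the function names and the genotypic means are hypothetical, and assigning a ratio of exactly 1.25 to the over- or underdominance class is an assumption, because the text leaves that boundary value undefined.

```python
def dominance_effect(g_BB, g_Bb, g_bb):
    """d = heterozygote mean minus the average of the two homozygote means."""
    return g_Bb - 0.5 * (g_BB + g_bb)


def gene_action(d, a):
    """Classify the mode of gene action from the dominance/additive ratio."""
    if a == 0:
        raise ValueError("additive effect must be non-zero")
    ratio = abs(d / a)
    if ratio <= 0.50:
        return "additive"
    if ratio < 1.25:
        return "partial or complete dominance"
    # |d/a| == 1.25 is not covered by the text; placing it here is an assumption.
    return "over- or underdominance"


# Hypothetical genotypic class means for an S : G ratio SNP.
d_toy = dominance_effect(1.8, 1.7, 1.6)
print(gene_action(d_toy, 0.1))
```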

Results

Quantitative genetic analysis of wood chemistry traits

The variation and inheritance of the contents of C5 and C6 sugars, lignin and the S : G ratio were evaluated from determinations by pyMBMS (Table 1). C5 and C6 sugars registered mean percentage values of 30.2 and 33.4%, respectively. The mean percentage lignin content was 24.6% and the mean S : G ratio was 1.7 (Table 1). Significant phenotypic correlations (P < 0.001) were observed between almost all the measured traits. A high and positive correlation (0.94) was present between the contents of C5 and C6 sugars. The correlation of both variables with percentage lignin content was high and negative, with values of −0.74 and −0.81 for C5 and C6, respectively. The S : G ratio was weakly correlated with both C5 and C6 sugar and lignin content. A significant effect (F-test, P ≤ 0.001) of family and clone within family was detected for all traits. Family means ranged from 29.2 to 31.0% and from 32.2 to 34.5% for C5 and C6 sugars, respectively. In the case of lignin and S : G ratio, means ranged from 23.6 to 25.2% and 1.6 to 1.8, respectively. A full list of family means is included in Supporting Information, Table S1. The variance components explaining most of the phenotypic variation were family, clone and residual (Table 1). The estimation of the inheritance associated with wood chemistry traits, expressed in terms of individual broad-sense heritability (H²), established values of 0.48, 0.46, 0.58 and 0.70 for C5, C6 sugars, lignin and S : G, respectively. The genetic relationship among traits, assessed through the genetic correlations (rg), indicated a strong positive correlation between C5 and C6 sugar contents (rg, 0.98; Table 1). Similarly to the phenotypic correlations, both traits were negatively correlated with lignin content (rg, −0.81 and −0.86, respectively). The contents of these saccharides were not correlated with the S : G ratio (rg, 0.05 and 0.03, respectively). The genetic correlation between lignin content and S : G ratio was 0.22.

Genotyping

A set of 433 SNPs from 39 candidate genes met the quality control thresholds and were selected for association analyses (Table 2). This represents a conversion rate of 62%. The percentage of successfully genotyped SNPs in noncoding regions of genes was 70.4%, distributed within introns (40.4%), 5′ UTR (17.1%) and 3′ UTR (12.9%). Of those found in coding regions, 21.7 and 7.9% were synonymous and nonsynonymous, respectively.

Linkage disequilibrium

The decay of LD with physical distance was assessed for all pairwise combinations of SNPs, and estimates of r² values were plotted to observe genome-wide patterns (Fig. 1). LD decayed from 0.45 to 0.10 at a distance of c. 400 bp. A similar analysis was performed at the individual gene level. For two candidate genes containing SNPs significantly associated with the S : G ratio and lignin, CoAOMT1 and TUB15 (Table 3), the LD decline to 0.1 was found at a much greater distance (1000 and 2500 bp, respectively; Fig. 1).

Single marker associations

Tests of association between single SNP markers and wood chemistry traits were assessed by QTDT. Six highly significant associations (χ² test, P < 0.05, q < 0.1) were identified from a total of 1732 tests (433 SNPs × four traits; Table 3). For reference, Table 3 also lists the other markers with the most significant associations that were discarded after correction for multiple testing. SNPs CoAOMT1-06-434 and CoAOMT1-06-297, belonging to a gene encoding a caffeoyl-CoA O-methyltransferase, were significantly associated with the S : G ratio. Both SNPs were in the fourth intron of the gene, 137 bp apart, close to GU-AG splice sites (Fig. 2). Each SNP accounted for nearly 5% of the phenotypic variation and alleles were additive (Table 4). Additionally, another four highly significant SNP markers (TUB15-03-340, TUB15-03-346, TUB15-03-301, TUB15-05-458 and TUB15-09-32), from gene TUB15, encoding a β-tubulin, were associated with lignin content (as determined by the within-family association model of QTDT) (Table 3). Significant SNPs are part of the promoter and exons of the β-tubulin gene (Fig. 3). The phenotypic effect and mode of gene action of TUB15 markers depended on specific families (Table S2). On average, TUB15 SNPs also explained c. 5% of phenotypic variation, ranging from 0.1 to 14.9%. Significant differences (P < 0.05) in mean S : G ratio and lignin percentage, detected among CoAOMT1 and TUB15 SNPs, respectively, are depicted in Fig. 4. No SNPs were associated significantly with C5 or C6 sugar content.

Haplotype-based associations

Haplotype-based association tests were performed to utilize the information from several markers simultaneously to assess their relationship with the xylem chemistry traits. A total of 115 amplicons from the 39 candidate genes were tested with each of the four traits. Forty-three amplicons, including 130 haplotypes, from 25 different genes, were significantly associated (P ≤ 0.01, q < 0.1). The number of amplicons significantly associated with each trait is shown in Fig. 5. A selected list of associations is shown in Table 5 (a full list is provided in Table S3). The number of significant associations, in terms of number of amplicons and haplotypes (in parentheses), was 26 (46), 17 (35), 22 (36) and six (13) for C5 and C6 sugars, S : G ratio and lignin content, respectively. Most of the significantly associated genes/amplicons were shared among the traits, but it was also possible to identify some trait-specific ones. Haplotypes belonging to the CoAOMT1 and LAC90a genes were simultaneously associated with the four chemical traits. On the other hand, haplotypes from amplicons

Table 1 Phenotypic and genetic parameters for Populus nigra wood chemistry traits

Parameter | C5 sugars | C6 sugars | Lignin | S : G ratio
Overall values
Mean ± SD | 30.2 ± 1.3 | 33.4 ± 1.6 | 24.6 ± 1.0 | 1.7 ± 0.1
Minimum–maximum | 25.2–34.8 | 27.7–39.7 | 19.5–26.5 | 1.3–2.1
ANOVA
Family (F) | *** | *** | *** | ***
Clone/F | *** | *** | *** | ***
Block (B) | ns | ns | ns | ns
F × B | ** | *** | *** | *
Variance components (%)
σ²f/σ²p | 17.3 | 14.6 | 14.4 | 12.2
σ²c/σ²p | 30.9 | 30.6 | 42.4 | 58.0
σ²fb/σ²p | 2.0 | 2.9 | 2.0 | 0.9
σ²e/σ²p | 49.8 | 51.9 | 41.2 | 28.9
Heritability
H² | 0.4843 | 0.4588 | 0.5798 | 0.6998
Correlations (rp/rg)
C5 sugars | – | 0.9803 | −0.8087 | 0.0530
C6 sugars | 0.9408 | – | −0.8628 | 0.0304
Lignin | −0.7401 | −0.8098 | – | 0.2230
S : G ratio | 0.1360 | 0.1281 | 0.0548 | –

Overall values for sugar content (C5, C6) and lignin are expressed as percentages. The S : G ratio represents the proportion of syringyl to guaiacyl lignin precursors. In the ANOVA for transformed variables, *, **, ***, and ns correspond to P ≤ 0.05, P ≤ 0.01, P ≤ 0.001, and nonsignificant, respectively. Variance components are expressed as percentages of total phenotypic variance (σ²p). Genetic and phenotypic correlations among traits are shown above and below the diagonal, respectively.

New Phytologist (2013) 197: 162–176. © 2012 The Authors. New Phytologist © 2012 New Phytologist Trust. www.newphytologist.com

belonging to C4H1, CesA1A/B, F5H4, gdcH1, HCT1, SAM1, and TUA1 were exclusively associated with C5 sugars. In the case of C6 sugars, there were no specifically associated genes; all genes were shared with the other traits. Haplotypes from 4CL1/3, CesA1A, CesA2B, CesA3A, F5H3, gdcH1, HCT1, KOR1, LAC2 and SHMT3 were specific to S : G ratio, whereas the haplotypes belonging to the TUA5 gene were only associated with lignin content. As an example, Fig. 6 depicts the specificity of amplicon F5H3-10. It is part of the gene encoding a ferulate 5-hydroxylase, which regulates the transformation of coniferyl alcohol to sinapyl alcohol, determining the synthesis of S or G lignin units. That amplicon was specifically associated with S : G ratio, and no significant differences among its haplotypes were observed for the other chemical properties.
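The haplotype counts reported above can be cross-checked with simple arithmetic. The sketch below only re-uses numbers stated in the text; the variable names are illustrative.

```python
# (significant amplicons, significant haplotypes) per trait, as stated in the text.
per_trait = {
    "C5 sugars": (26, 46),
    "C6 sugars": (17, 35),
    "S : G ratio": (22, 36),
    "Lignin": (6, 13),
}
AMPLICONS_TESTED = 115
DISTINCT_SIGNIFICANT_AMPLICONS = 43

# Each amplicon was tested against each of the four traits.
total_tests = AMPLICONS_TESTED * len(per_trait)

# Summing over traits counts an amplicon once per trait it is associated with.
total_haplotypes = sum(h for _, h in per_trait.values())
amplicon_trait_pairs = sum(a for a, _ in per_trait.values())
extra_pairs_from_sharing = amplicon_trait_pairs - DISTINCT_SIGNIFICANT_AMPLICONS
```

The per-trait haplotype counts add up to the 130 haplotypes quoted in the text, while the per-trait amplicon counts add up to more than the 43 distinct amplicons, which reflects the sharing of amplicons among traits.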

Discussion

Most commercial traits in forest trees, including those that influence the efficiency of conversion of biomass to cellulosic ethanol, are complex quantitative traits. Breeding programs to improve such traits could increase efficiency using genomics-based breeding following information obtained from association studies. In

Table 2 Summary of single nucleotide polymorphism (SNP) genotyping

Gene product | Gene | P. trichocarpa locus | Amplicons | Targeted SNPs | Selected SNPs | S, NS, 3′UTR, 5′UTR, Intron (%; nonempty cells in column order)
4-Coumarate:CoA ligase | 4CL1 | POPTR_0006s18510 | 2 | 5 | 5 | 20.0, 40.0, 40.0
 | 4CL2 | POPTR_0003s18720 | 3 | 20 | 7 | 57.1, 28.6, 14.3
 | 4CL3 | POPTR_0001s07400 | 12 | 61 | 39 | 20.5, 7.7, 20.5, 10.3, 41.0
Coumarate 3-hydroxylase | C3H1 | POPTR_0006s03180 | 2 | 5 | 3 | 33.3, 33.3, 33.3
Cinnamate 4-hydroxylase | C4H1 | POPTR_0019s15110 | 4 | 18 | 11 | 27.3, 18.2, 54.5
 | C4H2 | POPTR_0013s15380 | 1 | 1 | 0 |
Cinnamyl alcohol dehydrogenase | CAD | POPTR_0009s09870 | 7 | 19 | 14 | 14.3, 7.1, 28.6, 50.0
Cinnamoyl-CoA reductase | CCR | POPTR_0003s17980 | 3 | 15 | 9 | 33.3, 22.2, 44.4
Cellulose synthase | CesA1A | POPTR_0011s07040 | 9 | 39 | 25 | 16.0, 4.0, 8.0, 28.0, 44.0
 | CesA1B | POPTR_0004s05830 | 10 | 58 | 39 | 15.4, 5.1, 7.7, 33.3, 38.5
 | CesA2A | POPTR_0018s11290 | 6 | 22 | 20 | 40.0, 15.0, 10.0, 35.0
 | CesA2B | POPTR_0006s19580 | 5 | 16 | 11 | 9.1, 18.2, 18.2, 54.5
 | CesA3A | POPTR_0002s25970 | 7 | 21 | 15 | 26.7, 13.3, 6.7, 53.3
Caffeoyl CoA O-methyltransferase | CoAOMT1 | POPTR_0009s10270 | 4 | 25 | 15 | 13.3, 6.7, 53.3, 26.7
 | CoAOMT2 | POPTR_0001s31220 | 2 | 8 | 3 | 100.0
Caffeate O-methyltransferase | COMT1 | POPTR_0015s00550 | 3 | 7 | 4 | 25.0, 25.0, 25.0, 25.0
 | COMT2 | POPTR_0012s00670 | 2 | 11 | 6 | 50.0, 50.0
Ferulate 5-hydroxylase | F5H3 | POPTR_0005s11950 | 3 | 4 | 2 | 100.0
 | F5H4 | POPTR_0007s13720 | 3 | 9 | 7 | 42.9, 42.9, 14.3
Glycine decarboxylase complex, H | gdcH1 | POPTR_0012s14960 | 6 | 60 | 31 | 12.9, 16.1, 71.0
Glycine decarboxylase complex, T | gdcT2 | POPTR_0004s01030 | 3 | 15 | 11 | 9.1, 18.2, 72.7
Hydroxycinnamoyl-CoA transferase | HCT1 | POPTR_0003s18210 | 5 | 29 | 19 | 15.8, 5.3, 10.5, 68.4
 | HCT6 | POPTR_0001s03440 | 5 | 14 | 12 | 25.0, 8.3, 16.7, 50.0
Cellulase | KOR1 | POPTR_0001s11870 | 4 | 23 | 14 | 28.6, 7.1, 7.1, 14.3, 42.9
Laccase | LAC1A | POPTR_0016s11950 | 3 | 12 | 2 | 100.0
 | LAC2 | POPTR_0008s06430 | 1 | 5 | 4 | 50.0, 25.0, 25.0
 | LAC90A | POPTR_0008s07370 | 4 | 21 | 13 | 16.7, 50.0, 8.3, 25.0
Phenylalanine ammonia-lyase | PAL2 | POPTR_0008s03810 | 3 | 10 | 4 | 50.0, 50.0
 | PAL4 | POPTR_0010s23100 | 3 | 9 | 4 | 50.0, 50.0
 | PAL5 | POPTR_0010s23100 | 2 | 16 | 4 | 50.0, 25.0, 25.0
S-Adenosylmethionine synthetase | SAM1 | POPTR_0008s09870 | 3 | 14 | 8 | 42.9, 28.6, 28.6
Serine hydroxymethyl transferase | SHMT1 | POPTR_0001s32770 | 2 | 6 | 5 | 60.0, 20.0, 20.0
 | SHMT3 | POPTR_0002s10990 | 5 | 21 | 15 | 13.3, 13.3, 73.3
 | SHMT6 | POPTR_0017s08600 | 3 | 15 | 9 | 33.3, 11.1, 55.6
Sucrose synthase | SUSY1 | POPTR_0018s07380 | 6 | 13 | 9 | 11.1, 33.3, 55.6
α-Tubulin | TUA1 | POPTR_0002s11250 | 3 | 6 | 4 | 25.0, 25.0, 50.0
 | TUA5 | POPTR_0009s08850 | 2 | 8 | 7 | 28.6, 14.3, 57.1
β-Tubulin | TUB15 | POPTR_0001s27960 | 4 | 20 | 14 | 57.1, 7.1, 21.4, 14.3
 | TUB16 | POPTR_1455s00200 | 2 | 14 | 9 | 55.6, 22.2, 22.2
Average | | | 4 | 17.8 | 11.1 | 21.7, 7.9, 12.9, 17.1, 40.4

Columns ‘Amplicons’, ‘Targeted SNPs’ and ‘Selected SNPs’ give the number of sequenced amplicons and the number of originally targeted and finally selected SNPs, respectively. ‘S’ and ‘NS’ are the percentages (of the total number of selected SNPs) corresponding to synonymous and nonsynonymous coding SNPs, respectively. ‘3′UTR’, ‘5′UTR’ and ‘Intron’ are the percentages of SNPs located in the 3′UTR, 5′UTR or intron (noncoding) gene regions, respectively.

Fig. 1 Decay of linkage disequilibrium (LD) with distance within genes in Populus nigra. (a) Decay of LD considering all single nucleotide polymorphism (SNP) sites pooled across all analyzed genes. (b) Detail of (a) for a 0–500 bp distance. (c, d) Decay in LD for CoAOMT1 (c) and TUB15 genes (d). The red line indicates the estimated trend of decay in LD. SSM and SSC are the sum of squares from the model (for trend) and the corrected total, respectively. Axes: within-gene distance (bp) on the x-axis and r² on the y-axis. Values of 1 − SSM/SSC shown on the panels: 0.3954, 0.5185 and 0.4527.

Table 3 List of single marker associations

Trait | Gene product | Marker | SNP | Polymorphism | Position | χ² | P-value | q-value
Total association model
C5 sugars | Cinnamoyl-CoA reductase | CCR-12-384 | [T:C] | NC | 3′UTR | 9.77 | 0.00180 | 0.47691
 | Cellulose synthase | CesA1A-20-344 | [T:C] | NC | 3′UTR | 7.78 | 0.00530 | 0.47691
 | | CesA2B-08-103 | [A:G] | NC | Intron | 7.64 | 0.00570 | 0.47691
C6 sugars | β-Tubulin | TUB15-07-238 | [T:G] | NC | Intron | 6.32 | 0.01190 | 0.47691
 | α-Tubulin | TUA5-04-307 | [C:G] | NC | Intron | 8.57 | 0.00340 | 0.37818
 | Serine hydroxymethyltransferase | SHMT3-05-123 | [A:G] | NS | Exon | 7.37 | 0.00660 | 0.37818
 | | SHMT3-13-334 | [T:C] | NC | Intron | 6.79 | 0.00920 | 0.37818
S : G ratio | CaffeoylCoA O-methyltransferase | CoAOMT1-06-434 | [A:G] | NC | Intron | 18.01 | 0.00002 | 0.00520
 | | CoAOMT1-06-297 | [A:G] | NC | Intron | 17.06 | 0.00004 | 0.00520
 | Cellulose synthase | CesA2A-08-157 | [T:C] | S | Exon | 9.02 | 0.00270 | 0.21320
 | Hydroxycinnamoyl-CoA transferase | HCT1-08-410 | [T:G] | NC | Intron | 8.47 | 0.00360 | 0.21320
Lignin | Laccase | LAC2-06-463 | [T:C] | S | Exon | 8.25 | 0.00410 | 0.99510
 | Caffeate O-methyl transferase | COMT1-10-388 | [A:C] | NC | Intron | 7.08 | 0.00780 | 0.99510
 | Laccase | LAC2-06-270 | [T:C] | NC | Intron | 6.67 | 0.00980 | 0.99510
 | CaffeoylCoA O-methyl transferase | CoAOMT1-08-94 | [A:T] | NC | 3′UTR | 6.45 | 0.01110 | 0.99510
Within-family association model
Lignin | β-Tubulin | TUB15-03-340 | [A:C] | NC | 5′UTR | 10.99 | 0.0009 | 0.06175
 | | TUB15-03-346 | [T:C] | NC | 5′UTR | 10.97 | 0.0009 | 0.06175
 | | TUB15-05-301 | [A:T] | S | Exon | 10.08 | 0.0015 | 0.06861
 | | TUB15-05-451 | [T:C] | S | Exon | 9.49 | 0.0021 | 0.07204
 | | TUB15-09-32 | [T:C] | S | Exon | 8.97 | 0.0027 | 0.0741

Significant associations after correction for multiple testing (false discovery rate (FDR)) are shown in bold. NC, noncoding polymorphism; NS, nonsynonymous polymorphism; S, synonymous polymorphism. The χ² statistic was estimated from the likelihood ratio test performed by the Quantitative Transmission Disequilibrium Test (QTDT) software.
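The relationship between the χ² statistics and the P-values in Table 3 can be illustrated with a short calculation. The sketch assumes that each likelihood ratio test has one degree of freedom; this is not stated in the table, but the tabulated pairs are consistent with it. The function name is illustrative.

```python
import math


def chi2_p_value_1df(chi2):
    """Upper-tail P-value of a chi-square statistic with 1 degree of freedom.

    For 1 df, P(X > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)), Z standard normal.
    """
    return math.erfc(math.sqrt(chi2 / 2.0))


# Chi-square values taken from Table 3.
p_ccr = chi2_p_value_1df(9.77)      # CCR-12-384, tabulated P = 0.00180
p_coaomt = chi2_p_value_1df(18.01)  # CoAOMT1-06-434, tabulated P = 0.00002
p_tub15 = chi2_p_value_1df(10.99)   # TUB15-03-340, tabulated P = 0.0009
```

Under the one-degree-of-freedom assumption the computed values round to the tabulated P-values, which is a quick sanity check on any re-typed row of the table.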


this study, LD decayed rapidly across 39 candidate genes (Fig. 1), supporting the possibility of using association genetics to identify causal genes and even possibly causal SNPs. We tested the genetic association of 433 SNP markers, located in 39 cellulose and lignin synthesis genes, with four wood chemistry traits. Single marker and haplotype-based association tests allowed us to perform de novo discovery in this species, and simultaneously compare the results with our previous work with P. trichocarpa (Wegrzyn et al., 2010).

The contents of C5 and C6 sugars, lignin, and the S : G ratio were determined in wood samples of 2-yr-old trees (Table 1). The observed mean values of traits are within the ranges reported for other Populus spp. and hybrids, measured by pyMBMS: P. trichocarpa (Wegrzyn et al., 2010; Studer et al., 2011) and (P. trichocarpa × P. deltoides) × P. deltoides (Novaes et al., 2009). Additionally, a significant effect of the genotype (combined effect of family and clone within family) on the phenotypic variation was observed for all traits. This effect accounted for an important part of total variation, ranging from 45.2% for C6 sugars to 70.2% for the S : G ratio. Genetic differences can strongly influence the hydrolyzability of the biomass and the yield of fermentable sugars in poplars (Davison et al., 2006; Studer et al., 2011).
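The 45.2–70.2% range quoted above follows directly from the variance components in Table 1, by adding the family and clone-within-family shares. A minimal sketch, with the dictionary layout and function name chosen only for illustration:

```python
# Variance components (% of total phenotypic variance) from Table 1.
variance_pct = {
    "C5 sugars":   {"family": 17.3, "clone": 30.9, "family_x_block": 2.0, "error": 49.8},
    "C6 sugars":   {"family": 14.6, "clone": 30.6, "family_x_block": 2.9, "error": 51.9},
    "Lignin":      {"family": 14.4, "clone": 42.4, "family_x_block": 2.0, "error": 41.2},
    "S : G ratio": {"family": 12.2, "clone": 58.0, "family_x_block": 0.9, "error": 28.9},
}


def genotypic_share(components):
    """Combined share of family and clone-within-family effects, in percent."""
    return round(components["family"] + components["clone"], 1)


shares = {trait: genotypic_share(c) for trait, c in variance_pct.items()}
lowest = min(shares, key=shares.get)
highest = max(shares, key=shares.get)
```

The lowest and highest shares correspond to C6 sugars and the S : G ratio, matching the two endpoints given in the text.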

Quantitative genetic analysis of lignocellulosic traits indicated that their inheritance, as expressed by H², ranged from moderate for C6 sugars (H² = 0.46) to high for the S : G ratio (H² = 0.70) (Table 1). This tendency is expected for these traits (Davis, 2008; Groover et al., 2010), and it confirms their strong genetic influence. On the other hand, from the estimated genetic correlations, a high dependency was detected between traits, particularly the contents of C5 and C6 sugars and lignin (Table 1). Additionally, the null or low genetic correlation of these traits with the S : G ratio suggests an independence between the lignin composition and the total content of hemicellulose/cellulose sugars and lignin. The wide genetic variation, moderate to high heritability and extreme genetic correlations present in the assessed traits suggest a positive impact of future breeding applications. Significant genetic gain in these lignocellulose traits could be reached as a result of direct or indirect (based on correlated traits) selection in breeding populations (Zobel & Jett, 1995).

Table 4 Genetic effect of CoAOMT1 markers on the ratio of syringyl to guaiacyl units (S : G ratio)

Marker | 2aᵃ | dᵇ | d/a | 2a/Sp | MAF | Allele | PV (%)
CoAOMT1-06-297 | 0.1250 | 0.0034 | 0.0550 | 1.0159 | 0.1050 | A | 4.89
CoAOMT1-06-434 | 0.1046 | −0.0058 | −0.1116 | 0.8504 | 0.1570 | G | 4.83

d, dominance effect; MAF, minor allele frequency; Sp, general standard deviation for S : G ratio; PV, phenotypic variance explained by marker.
ᵃEstimated as the difference between the phenotypic least-squares (LS) means within each homozygous class (2a = |G_BB − G_bb|, where G_ij represents the LS mean in the ij genotypic class).
ᵇEstimated as the difference between the phenotypic mean observed within the heterozygous class and the average phenotypic mean across both homozygous classes (d = G_Bb − 0.5(G_BB + G_bb)).
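The two footnote formulas of Table 4 can be written out as a small function. The least-squares means used in the example call are hypothetical values chosen for illustration (the genotype-class means themselves are not tabulated here), and the function name is an assumption.

```python
def marker_effects(g_BB, g_Bb, g_bb):
    """Return (2a, d, d/a) from the LS means of the three genotype classes.

    2a = |G_BB - G_bb|                 (additive effect, footnote a)
    d  = G_Bb - 0.5 * (G_BB + G_bb)    (dominance effect, footnote b)
    """
    two_a = abs(g_BB - g_bb)
    d = g_Bb - 0.5 * (g_BB + g_bb)
    a = two_a / 2.0
    ratio = d / a if a else float("nan")
    return two_a, d, ratio


# Hypothetical LS means of the S : G ratio for genotypes BB, Bb and bb.
two_a, d, d_over_a = marker_effects(1.80, 1.75, 1.68)

# d/a recomputed from the rounded Table 4 entries for CoAOMT1-06-297.
table_ratio = 0.0034 / (0.1250 / 2.0)
```

A d/a ratio close to zero indicates additive gene action, as reported for both CoAOMT1 markers. Recomputing d/a from the rounded 2a and d of CoAOMT1-06-297 gives about 0.054, close to the tabulated 0.0550; the small gap is what rounding of the inputs would produce.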

Fig. 2 Linkage disequilibrium (LD) and physical position of single nucleotide polymorphisms (SNPs) in the CoAOMT1 gene in Populus nigra. (a) LD among SNP markers measured by r². *, significant associations with S : G ratio. (b) Gene structure showing the SNP positions and alternative splicing cutting sites (in bold and underlined). Gray zones represent 5′ and 3′ untranslated regions.

Analyses based on single markers and haplotypes were successful in identifying significant associations with wood chemistry traits. The first approach revealed only seven SNPs from two genes associated with S : G ratio and lignin content (Table 3). The haplotype-based approach, however, discovered 130 haplotypes from 25 different genes, associated with the four traits (Table 5, Table S3). The large difference in the number of associations detected using different approaches could be explained by both the greater power of haplotype analyses and some possible limitations of the QTDT. The use of haplotypes can improve the power and robustness of single marker association tests by incorporating LD information contained in flanking markers (Akey et al., 2001). On the other hand, it has been reported that QTDT, applied to test single SNP–trait associations, could have limited power to detect SNPs with a small effect on phenotype, under some experimental designs (Gonzalez-Martinez et al., 2008; Ewens et al., 2011). Thus, it is possible that the family-based population, sample size and number of SNPs used in this study influenced the statistical power required for detecting more significant associations. Each SNP marker within genes associated with S : G ratio (CoAOMT1) and lignin content (TUB15) accounted for c. 5% of phenotypic variation of each trait. This amount is similar to that observed in quantitative trait locus mapping and marker–trait association studies, where individual markers account for only a small amount of the genetic variation (Neale & Kremer, 2011). Although genome-wide searches will be needed to discover all the genes controlling complex traits (which reflect an underlying variation at several to many loci, each with a modest influence on phenotype), such as those related to cellulosic bioethanol, our results and similar studies are validating the use of association mapping as a suitable approach for assessing the relative effect of SNP polymorphisms underlying the variation of quantitative traits.

The contents of C5 and C6 sugars were both significantly associated with haplotypes from a similar set of genes (Table 5). Haplotypes were from genes encoding enzymes from pathways for the synthesis of cellulose (CesA1A, CesA1B, CesA2B) and sucrose (SUSY1) from UDP-glucose, and the S and G lignin units from p-coumaroyl-CoA (CCR, CoAOMT1, HCT1/6 and LAC90a). Additionally, C5 and C6 sugars were associated with TUA1 and TUB16. On the other hand, lignin content was associated with single SNP markers from TUB15 and with several haplotypes. These haplotypes belong to genes encoding enzymes that participate in the formation of cellulose (CesA2A, CesA2B), the production of S and G lignin units (CoAOMT1, LAC90a), as well as to genes encoding tubulins (TUA5, TUB16). Finally, the S : G ratio was associated with single SNP markers from CoAOMT1 and haplotypes from genes related to the synthesis of cellulose (CesA1A, CesA2B and CesA3A), 1,4-β-glucan (KOR1) from cellulose, and S and G lignin units (4CL1/3, F5H3, HCT1, LAC2, LAC90a and CoAOMT1). As with other traits, the S : G ratio was also significantly associated with tubulin genes (TUB15).

CaffeoylCoA O-methyltransferase (CoAOMT) catalyzes the production of feruloyl-CoA and sinapoyl-CoA, involved upstream of the synthesis pathway for S and G lignin subunits (Whetten & Sederoff, 1995). According to Zhong et al. (2000), in the hybrid P. tremula × P. alba, CoAOMT is expressed in different xylem cells. The repression of this gene caused a significant decrease in total lignin content, as the result of a decrease in both

Fig. 3 Linkage disequilibrium (LD) and physical position of single nucleotide polymorphisms (SNPs) in the TUB15 gene in Populus nigra. (a) LD among SNP markers measured by r². *, significant SNP–lignin content associations. (b) Gene structure and positions of significant SNPs.


S and G lignins. Our analyses identified two CoAOMT1 SNPs significantly associated with the S : G ratio (Table 3). Both SNP markers were within an intron, near (< 10 bp from) the GU-AG splice sites (Fig. 2). This kind of polymorphism could cause differential splicing, producing alternative forms of the encoded protein, and eventually affecting the S : G ratio. Further studies will be needed to test this hypothesis.

β-Tubulins are essential in the organization of microtubules leading to microfibril deposition during secondary cell wall development in wood fiber cells, and these in turn influence the direction of movement of the cellulose-synthesizing machinery at the plasma membrane (Qiu et al., 2008). One of the members of the gene family, TUB15, represents the predominant tubulin in developing xylem in Populus tremuloides, and its expression is up-regulated by tension stress (Oakley et al., 2007). In our study, single SNPs and haplotypes from TUB15 were associated with the lignin content (Table 3, Fig. 4). They were located in the 5′UTR (TUB15-03-340, TUB15-03-346; 37 and 31 bp upstream of the TATA-box, respectively), exons (TUB15-03-301, TUB15-03-451 and TUB15-09-32), and an intron (TUB15-03-451; 10 bp from a GU-AG splice site) (Fig. 3). Accumulating evidence shows a role of plant tubulin introns in the control of gene expression (Breviario, 2008). Thus, our results suggest that some of the TUB15 SNPs could be causative polymorphisms involved in the regulation of the expression and production of alternative forms of the TUB15 protein. Future studies will be required to assess this possibility as well as the role of other members of the gene family.

Cellulose synthase (CesA) is a key enzyme participating in the synthesis of cellulose in higher plants (Somerville, 2006). The Populus genome harbors 17 CesA genes (Kumar et al., 2009) and some are related to wood development (Suzuki et al., 2006; Dharmawardhana et al., 2010; Song et al., 2010). Members of the CesA gene family were analyzed in our study (Table 2). Haplotypes from all genes were significantly associated with all traits (Table 5, Table S2). Another enzyme involved in cellulose formation during xylem development in poplars is the Korrigan cellulase (KOR), which converts cellulose into 1,4-β-glucan (Bhandari et al., 2006). Two haplotypes representing KOR1 were associated with S : G ratio (Table 5). These results confirm the effect of CesA and KOR in the determination of wood chemical properties, influencing cellulose/hemicellulose content as well as lignin concentration and composition.

Sucrose synthase (SUSY) catalyzes the formation of UDP-glucose from sucrose, which is the precursor of cellulose (Hertzberg et al., 2001). Activity of SUSY has been linked to regulation of sucrose supply (sink strength) and cellulose synthesis and deposition during secondary cell wall formation (Coleman et al., 2009). The SUSY gene family in Populus includes 11 members, with SUSY1 and SUSY2 associated with cells forming secondary cell walls in wood-forming tissues (Geisler-Lee et al., 2006). Our association analysis included SUSY1. Two amplicons were highly significant for C5 and C6 sugar content (Table 5), supporting

Fig. 4 Significant single-marker associations for the CoAOMT1 and TUB15 genes in Populus nigra. (a, b) Least-squares means of S : G ratio for genotypes from CoAOMT1-06-297 (a) and CoAOMT1-06-434 SNP markers (b). (c, d) Least-squares means of lignin contents for genotype classes from two families, for TUB15-03-346 (family 9016) (c) and TUB15-05-451 (family 9013) markers (d). Significant differences (Tukey–Kramer test, P < 0.05) among genotype classes are indicated by different letters.

Fig. 5 Haplotype-based associations. Numbers in the Venn diagram indicate the number of amplicons significantly associated with wood chemistry traits in Populus nigra.

Table 5 Selected list of significant haplotype-based associations for wood chemistry traits

Trait columns in the original layout: C5 sugars, C6 sugars, S : G ratio, Lignin; each gives GSS, Q, the significant haplotypes and their frequency. Entries below follow reading order across those columns, with empty cells omitted.

Amplicon | GSS (Q) | Significant haplotypes (frequency)
4CL1_12 | 20.1 (0.020) | AACCAA 0.28; GGACAA 0.15; GGAAAG 0.09
C4H2_09 | 16.3 (0.014) | AAAA 0.4
CCR_12 | 27.9 (0.001); 25.4 (0.001) | GCGT 0.15; GCGT 0.15
CesA1A-02 | 22.9 (0.002); 22.8 (0.002) | CCGC 0.7; CCGC 0.7; ACGC 0.11
CesA1A-20 | 20.6 (0.024) | ACAA 0.49
CesA1B-12 | 16.8 (0.020) | AGGGA 0.1
CesA2A-08 | 31.8 (0.000); 16.5 (0.021) | GGAA 0.11; CGAA 0.49; CGTA 0.35
CesA2B-08 | 14.1 (0.016); 15.4 (0.011) | AAA 0.65; AAA 0.65; GAA 0.24; GAA 0.24
CesA3A-05 | 23 (0.004) | GAAAC 0.06
CoAOMT1-06 | 28.9 (0.000); 23.4 (0.002); 37.4 (0.000); 27.4 (0.001) | GGAC 0.41; GGAC 0.41; GAGA 0.1; GAGA 0.1; GGGC 0.05; GGGC 0.05; GGAA 0.35; GGAC 0.41; GGAA 0.35; GGAA 0.35; GGAC 0.41; AGAA 0.08; AGAA 0.08; AGAA 0.08; GGAA 0.35
F5H1-10 | 47.6 (0.000) | AG 0.16; CA 0.02; AA 0.77
gdcH1_10 | 18.3 (0.008) | GAAAC 0.08
gdcT2_02 | 39.2 (0.001); 35.4 (0.004); 52.1 (0.000) | AAAATATA 0.08; AAAATATA 0.08; AAAGAATA 0.09; AAAATATG 0.22; AAAATATG 0.22; AAAATAAA 0.08; TGAGTAAA 0.07; TGAGTAAA 0.07; AAAATATA 0.08; AAAGAAAA 0.08; AAAGAAAA 0.08
HCT1_03 | 15 (0.007); 15.4 (0.006) | AT 0.65; AT 0.65; GA 0.06; GA 0.06
HCT1_06 | 21.2 (0.011); 17.7 (0.023); 19.9 (0.014) | AAACAA 0.13; AAACAA 0.13; TGGAAA 0.32; AAACTA 0.05; AAACTA 0.05; AAACTA 0.05; AAGCTA 0.08; AAGCTA 0.08; AAACAA 0.13; AAGCAA 0.22; AAGCAA 0.22
HCT6_11 | 12.4 (0.015); 11.1 (0.021) | ACAA 0.25; ACAA 0.25; AAAA 0.7
KOR1_04 | 17.4 (0.006) | AGG 0.08; AAA 0.2
LAC2_06 | 21 (0.004) | GTGA 0.72; GTGG 0.15
LAC90a_11 | 30.4 (0.002); 25.8 (0.007); 21.9 (0.019); 28.5 (0.004) | GGAAAAA 0.07; GGAAAAA 0.07; AGAACGG 0.37; AGCGCGG 0.07
SAM1_05 | 12.8 (0.013); 12.4 (0.015) | CAAAT 0.39; CGGGA 0.05; CGGGA 0.05; CAAAT 0.39; AAAAT 0.56; AAAAT 0.56
SHMT3_05 | 23.3 (0.000); 18.9 (0.001) | GAG 0.05; GAG 0.05
SUSY_14 | 27.5 (0.000); 21.5 (0.002) | GTA 0.56; GTA 0.56


the proposal by Coleman et al. (2009) regarding the connection between sucrose supply, its breakdown and cellulose formation through the activity of SUSY.

Glycine and serine are two interconvertible amino acids that play an important role in the generation of one-carbon (C1) units in nonphotosynthetic cells in higher plants (Mouillon et al., 1999). These units are required for the biosynthesis of lignin (Hanson & Roje, 2001). The glycine decarboxylase complex (GDC) and serine hydroxymethyltransferase (SHMT) are an important part of the mitochondrial multienzyme system involved in the generation of C1 units. Genes encoding GDC and SHMT protein isoforms have been characterized in poplars (Wang et al., 2004; Rajinikanth et al., 2007) and are related to photorespiration and C1 metabolism during the lignification involved in wood formation. In our study, haplotypes from the gdcH1, gdcT2 and SHMT3 genes were significantly associated with C5 and C6 sugars (and S : G ratio in the case of gdcT2) (Table 5). Although total lignin content was not associated with these haplotypes, an indirect effect, dealing with the carbon

Table 5 (Continued)

Amplicon | GSS (Q) | Significant haplotypes (frequency)
 | | AAG 0.35; AAG 0.35
TUA5_04 | 22.3 (0.024) | ACAAC 0.11; GGTCA 0.06
TUB16_05 | 14.6 (0.014); 14.8 (0.013); 23.2 (0.001) | GGCA 0.11; GGCA 0.11; GGCA 0.11; AGCA 0.54; AGCA 0.54; AGCA 0.54

A full list is included in Table S2. GSS, global score statistic; Q, q-value.

Fig. 6 An example of a trait-specific significant haplotype in Populus nigra. (a–d) Least-squares means for S : G ratio (a) and contents of lignin (b), C5 (c) and C6 sugars (d) in the F5H3-10 amplicon. Significant differences (Tukey–Kramer test, P < 0.05) among haplotype classes were only observed for the S : G ratio. These are indicated by different letters in (a).

distribution toward the synthesis of C5 or C6 sugars, is suggested from the highly negative correlations between sugar and lignin content (Table 1).

The wood of poplars contains syringyl-guaiacyl lignin (Groover et al., 2010). Guaiacyl (G) and syringyl (S) units are produced from phenylpropanoid metabolism (Croteau et al., 2000). Genes encoding different enzymes related to the biosynthesis of G and S monomers have been identified in the P. trichocarpa genome (Tsai et al., 2006; Shi et al., 2010). Associations between haplotype markers from these genes and wood chemistry traits (mainly C5 and C6 content and S : G ratio) were found in our study (Table 5, Table S3). The functions of enzymes encoded by these genes included the formation of p-coumaric acid from trans-cinnamate (C4H), the conversion of 4-coumarate to 4-coumaroyl-CoA (4CL), the production of caffeoyl-CoA from caffeoyl-shikimate/quinate (HCT), the generation of feruloyl-CoA from caffeoyl-CoA (CoAOMT), the production of coniferyl-aldehyde from feruloyl-CoA (CCR) and the synthesis of 5-hydroxyconiferyl-aldehyde from coniferyl-aldehyde (F5H). Additionally, SAM encodes a protein catalyzing the synthesis of S-adenosyl-L-methionine, needed for methylation of lignin precursors, and LAC is a ubiquitous polyphenol oxidase, which oxidizes cinnamyl alcohols. Members of corresponding gene families have been identified and characterized in poplars (Mijnsbrugge et al., 2000; Li et al., 2003; Leple et al., 2007; Shi et al., 2010; Suzuki et al., 2010). Associations detected are consistent with the roles of enzymes regulating different points of biosynthesis of S and G units (and the composition of lignin), as well as a negative relationship between C5 and C6 sugars and lignin content.

The current study is an extension of previous work with P. trichocarpa (Wegrzyn et al., 2010), for which LD and tests of association were performed on the same set of candidate genes and traits. A main difference between the two studies is the type of association population utilized: half-sib families in the case of P. nigra, and unrelated clones from different provenances collected in the natural distribution range of P. trichocarpa. Distinct levels of genetic structuring were present in both populations, with multilocus average Fst values of 0.034 and 0.203 for P. trichocarpa and P. nigra, respectively. Additionally, a more rapid decay of LD was observed in P. trichocarpa (r² decayed from 0.5 to 0.2, a 60% fall, within a distance of 200 bp) than in P. nigra (Fig. 1). The comparison between the results of both studies showed quantitative and qualitative differences in terms of the SNP–trait combinations identified from analyses based on both single markers and haplotypes. The total number of genes, represented by significant SNPs and haplotypes, varied between species according to specific traits (Fig. 7). For C6 sugars, a higher number (23) of single SNPs and a lower number (three) of haplotypes were associated in P. trichocarpa compared with P. nigra. Six genes were common to both species (CesA1A, CesA2B, SUSY1, CoAOMT1, HCT6 and SAM1). Another six genes were present in P. nigra only (CCR, HCT1, LAC90a, SHMT3, gdcT2 and TUB16), and eight were exclusive to P. trichocarpa (CesA1B, CesA2A, C4H2, CAD, COMT2, LAC1a, PAL2 and 4CL1). The content of C5 sugars was not analyzed by Wegrzyn et al. (2010). For the S : G ratio, one single SNP was associated in each species, and significant haplotypes (36) were only identified in P. nigra. These SNPs and haplotypes represented one gene exclusive to P. trichocarpa (CAD), and 13 exclusive to P. nigra (CesA1A, CesA2A, CesA3A, 4CL1, 4CL3, CoAOMT1, F5H3, HCT1, LAC90a, SHMT3, gdcH, gdcT2, and TUB15). No common gene was observed between the species. Concerning lignin content, a higher number (13) of single SNPs and a lower number (eight) of haplotypes were associated in P. trichocarpa relative to P. nigra. Three genes were present in both species (CesA2A, CesA2B and TUA5). Four genes were detected only in P. nigra (CoAOMT1, LAC90a, TUB15 and TUB16) and nine were detected only in P. trichocarpa (CesA1A, CesA1B, SUSY1, C4H1, CCR, HCT1, HCT6, CAD and 4CL1). In general terms, the differences between species in relation to the types of genes significantly associated with traits could be explained by natural variations in the patterns of gene control and inheritance underlying the distinctive chemical composition of each wood type (Zobel & Jett, 1995). Common genes to both species, associated with the content of both C6 sugars and lignin, were closely related to cellulose synthesis (CesA and SUSY genes) and lignin S and G-subunit generation from p-coumaroyl-CoA (CoAOMT, HCT genes) (Hertzberg et al., 2001), indicating a common point of conservation across species for these fundamental biochemical pathways.
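The between-species comparison for C6 sugars amounts to simple set operations on the gene lists given above. The sketch below rebuilds each species' list as the shared genes plus the species-specific ones named in the text; the variable names are illustrative.

```python
# Genes associated with C6 sugars, as listed in the text.
shared_listed = {"CesA1A", "CesA2B", "SUSY1", "CoAOMT1", "HCT6", "SAM1"}
p_nigra_c6 = shared_listed | {"CCR", "HCT1", "LAC90a", "SHMT3", "gdcT2", "TUB16"}
p_trichocarpa_c6 = shared_listed | {
    "CesA1B", "CesA2A", "C4H2", "CAD", "COMT2", "LAC1a", "PAL2", "4CL1",
}

# Venn-diagram regions (compare with the C6 sugars counts described for Fig. 7).
common = p_nigra_c6 & p_trichocarpa_c6
only_p_nigra = p_nigra_c6 - p_trichocarpa_c6
only_p_trichocarpa = p_trichocarpa_c6 - p_nigra_c6
```

The same three operations applied to the S : G ratio and lignin lists would reproduce the remaining counts quoted in the paragraph.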

In this study, we performed the first comprehensive analysis of LD and SNP–trait associations in P. nigra, an important parental species for hybrid breeding programs focused on the production of biomass for cellulosic ethanol. The genetic mechanisms controlling fundamental biological processes, such as xylem development, which determines the chemical and physical properties of wood, are complex and our current understanding is still limited. In this sense, this study is an important step toward the identification of SNP markers likely related to causative polymorphisms in genes from two key biosynthetic pathways determining the wood chemistry composition, and toward the estimation of their effects on phenotypes of interest for bioethanol production. The

Fig. 7 Genes significantly associated with wood chemistry traits in Populus

nigra and Populus trichocarpa. The Venn diagram shows the number ofgenes with significant single or haplotype SNP–trait associations.

New Phytologist (2013) 197: 162–176 � 2012 The Authors

New Phytologist� 2012 New Phytologist Trustwww.newphytologist.com

Research

NewPhytologist174

incorporation of new sets of genes (e.g. those encoding transcrip-tion factors and other potential regulators of xylogenesis) infuture studies based on the candidate gene approach will help toobtain a more complete dissection of the lignocellulosic traits. Inthe same way, functional analysis of the detected polymorphismswill allow the effect on phenotype to be determined. Similarly,evaluation under field conditions will be required to assess theeffect of environment (and genotype 9 environment interaction)on the final performance of the selected trees. Information gener-ated by association genetics represents a promising input to sup-port marker-assisted breeding strategies for improving poplars forbiofuels.

Acknowledgements

This study was funded by a USDA-NIFA-SBIR grant (2009-33610-19883) awarded to GreenWood Resources.

References

Abecasis GR, Cardon LR, Cookson WO. 2000. A general test of association for quantitative traits in nuclear families. American Journal of Human Genetics 66: 279–292.

Abramson M, Shoseyov O, Shani Z. 2010. Plant cell wall reconstruction toward improved lignocellulosic production and processability. Plant Science 178: 61–72.

Akey J, Jin L, Xiong M. 2001. Haplotypes vs single marker linkage disequilibrium tests: what do we gain? European Journal of Human Genetics 9: 291–300.

Beaulieu J, Doerksen T, Boyle B, Clement S, Deslauriers M, Beauseigle S, Blais S, Poulin P-L, Lenz P, Caron S et al. 2011. Association genetics of wood physical traits in the conifer white spruce and relationships with gene expression. Genetics 188: 197–214.

Bhandari S, Fujino T, Thammanagowda S, Zhang DY, Xu FY, Joshi CP. 2006. Xylem-specific and tension stress-responsive coexpression of KORRIGAN endoglucanase and three secondary wall-associated cellulose synthase genes in aspen trees. Planta 224: 828–837.

Breviario D. 2008. Plant tubulin genes: regulatory and evolutionary aspects. In: Nick P, ed. Plant microtubules. Berlin, Heidelberg, Germany: Springer, 207–232.

Burge C, Karlin S. 1997. Prediction of complete gene structures in human genomic DNA. Journal of Molecular Biology 268: 78–94.

Chu Y, Su X, Huang Q, Zhang X. 2009. Patterns of DNA sequence variation at candidate gene loci in black poplar (Populus nigra L.) as revealed by single nucleotide polymorphisms. Genetica 137: 141–150.

Coleman HD, Yan J, Mansfield SD. 2009. Sucrose synthase affects carbon partitioning to increase cellulose production and altered cell wall ultrastructure. Proceedings of the National Academy of Sciences, USA 106: 13118–13123.

Croteau R, Kutchan T, Lewis N. 2000. Natural products (secondary metabolites). In: Buchanan B, Gruissem W, Jones R, eds. Biochemistry & molecular biology of plants. Rockville, MD, USA: American Society of Plant Physiologists, 1250–1318.

Cumbie WP, Eckert A, Wegrzyn J, Whetten R, Neale D, Goldfarb B. 2011. Association genetics of carbon isotope discrimination, height and foliar nitrogen in a natural population of Pinus taeda L. Heredity 107: 105–114.

Davis JM. 2008. Genetic Improvement of Poplar (Populus spp.) as a Bioenergy Crop. In: Vermerris W, ed. Genetic improvement of bioenergy crops. New York, NY, USA: Springer, 397–419.

Davison B, Drescher S, Tuskan G, Davis M, Nghiem N. 2006. Variation of S/G ratio and lignin content in a Populus family influences the release of xylose by dilute acid hydrolysis. Applied Biochemistry and Biotechnology 130: 427–435.

Dharmawardhana P, Brunner A, Strauss S. 2010. Genome-wide transcriptome analysis of the transition from primary to secondary stem development in Populus trichocarpa. BMC Genomics 11: 150.

Dillon S, Nolan M, Wu H, Southerton S. 2010. Association genetics reveal candidate gene SNPs affecting wood properties in Pinus radiata. Australian Forestry 73: 185–190.

Eckert AJ, Bower AD, Wegrzyn JL, Pande B, Jermstad KD, Krutovsky KV, Clair JBS, Neale DB. 2009. Association genetics of coastal Douglas fir (Pseudotsuga menziesii var. menziesii, Pinaceae). I. Cold-hardiness related traits. Genetics 182: 1289–1302.

Eckert AJ, Wegrzyn JL, Cumbie WP, Goldfarb B, Huber DA, Tolstikov V, Fiehn O, Neale DB. 2012. Association genetics of the loblolly pine (Pinus taeda, Pinaceae) metabolome. New Phytologist 193: 890–902.

Ewens K, Jones M, Ankener W, Stewart D, Urbanek M, Dunaif A, Legro R, Chua A, Azziz R, Spielman R et al. 2011. FTO and MC4R Gene variants are associated with obesity in polycystic ovary syndrome. PLoS ONE 6: e16390.

Fan J, Oliphant A, Shen R, Kermani B, Garcia F, Gunderson K, Hansen M, Steemers F, Butler S, Deloukas P et al. 2003. Highly parallel SNP genotyping. Cold Spring Harbor Symposia on Quantitative Biology 68: 69–78.

Fulker D, Cherny S, Sham P, Hewitt J. 1999. Combined linkage and association sib-pair analysis for quantitative traits. The American Journal of Human Genetics 64: 259–267.

Geisler-Lee J, Geisler M, Coutinho PM, Segerman B, Nishikubo N, Takahashi J, Aspeborg H, Djerbi S, Master E, Andersson-Gunneras S et al. 2006. Poplar carbohydrate-active enzymes. Gene identification and expression analyses. Plant Physiology 140: 946–962.

González-Martínez SC, Huber D, Ersoz E, Davis JM, Neale DB. 2008. Association genetics in Pinus taeda L. II. Carbon isotope discrimination. Heredity 101: 19–26.

González-Martínez SC, Wheeler NC, Ersoz E, Nelson CD, Neale DB. 2007. Association Genetics in Pinus taeda L. I. Wood Property Traits. Genetics 175: 399–409.

Goudet J. 2005. Hierfstat, a package for R to compute and test hierarchical F-statistics. Molecular Ecology Notes 5: 184–186.

Groover AT, Nieminen K, Helariutta Y, Mansfield SD. 2010. Wood formation in Populus. In: Jansson S, Bhalerao R, Groover A, eds. Genetics and genomics of Populus. New York, NY, USA: Springer, 201–224.

Hall T. 1999. BioEdit: a user-friendly biological sequence alignment editor and analysis program for Windows 95/98/NT. Nucleic Acids Symposium Series 41: 95–98.

Hanson AD, Roje S. 2001. One-carbon metabolism in higher plants. Annual Review of Plant Physiology and Plant Molecular Biology 52: 119–137.

Hertzberg M, Aspeborg H, Schrader J, Andersson A, Erlandsson R, Blomqvist K, Bhalerao R, Uhlen M, Teeri TT, Lundeberg J et al. 2001. A transcriptional roadmap to wood formation. Proceedings of the National Academy of Sciences, USA 98: 14732–14737.

Kumar M, Thammannagowda S, Bulone V, Chiang V, Han K-H, Joshi CP, Mansfield SD, Mellerowicz E, Sundberg B, Teeri T et al. 2009. An update on the nomenclature for the cellulose synthase genes in Populus. Trends in Plant Science 14: 248–254.

Leple JC, Dauwe R, Morreel K, Storme V, Lapierre C, Pollet B, Naumann A, Kang KY, Kim H, Ruel K et al. 2007. Downregulation of cinnamoyl-coenzyme A reductase in poplar: multiple-level phenotyping reveals effects on cell wall polymer metabolism and structure. Plant Cell 19: 3669–3691.

Lescot M, Dehais P, Thijs G, Marchal K, Moreau Y, Van de Peer Y, Rouze P, Rombauts S. 2002. PlantCARE, a database of plant cis-acting regulatory elements and a portal to tools for in silico analysis of promoter sequences. Nucleic Acids Research 30: 325–327.

Li L, Zhou Y, Cheng X, Sun J, Marita JM, Ralph J, Chiang VL. 2003. Combinatorial modification of multiple lignin traits in trees through multigene cotransformation. Proceedings of the National Academy of Sciences, USA 100: 4939–4944.

Mijnsbrugge KV, Meyermans H, Van Montagu M, Bauw G, Boerjan W. 2000. Wood formation in poplar: identification, characterization, and seasonal variation of xylem proteins. Planta 210: 589–598.

Mouillon J-M, Aubert S, Bourguignon J, Gout E, Douce R, Rebeille F. 1999. Glycine and serine catabolism in non-photosynthetic higher plant cells: their role in C1 metabolism. The Plant Journal 20: 197–205.


Neale DB, Kremer A. 2011. Forest tree genomics: growing resources and applications. Nature Reviews Genetics 12: 111–122.

Neale DB, Savolainen O. 2004. Association genetics of complex traits in conifers. Trends in Plant Science 9: 325–330.

Novaes E, Osorio L, Drost DR, Miles BL, Boaventura-Novaes CRD, Benedict C, Dervinis C, Yu Q, Sykes R, Davis M et al. 2009. Quantitative genetic analysis of biomass and wood chemistry of Populus under different nitrogen levels. New Phytologist 182: 878–890.

Oakley RV, Wang Y-S, Ramakrishna W, Harding SA, Tsai C-J. 2007. Differential expansion and expression of α- and β-tubulin gene families in Populus. Plant Physiology 145: 961–973.

Oliveros J. 2007. VENNY. An interactive tool for comparing lists with Venn Diagrams. URL http://bioinfogp.cnb.csic.es/tools/venny/index.html [accessed on August 2012].

Patterson N, Price A, Reich D. 2006. Population structure and eigenanalysis. PLoS Genetics 2: 2074–2093.

Qiu D, Wilson IW, Gan S, Washusen R, Moran GF, Southerton SG. 2008. Gene expression in Eucalyptus branch wood with marked variation in cellulose microfibril orientation and lacking G-layers. New Phytologist 179: 94–103.

Quesada T, Gopal V, Cumbie WP, Eckert AJ, Wegrzyn JL, Neale DB, Goldfarb B, Huber DA, Casella G, Davis JM. 2010. Association mapping of quantitative disease resistance in a natural population of loblolly pine (Pinus taeda L.). Genetics 186: 677–686.

R Development Core Team. 2010. R: a language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing.

Rajinikanth M, Harding SA, Tsai C-J. 2007. The glycine decarboxylase complex multienzyme family in Populus. Journal of Experimental Botany 58: 1761–1770.

Remington DL, Thornsberry JM, Matsuoka Y, Wilson LM, Whitt SR, Doebley J, Kresovich S, Goodman MM, Buckler ES. 2001. Structure of linkage disequilibrium and phenotypic associations in the maize genome. Proceedings of the National Academy of Sciences, USA 98: 11479–11484.

Schaid DJ, Rowland CM, Tines DE, Jacobson RM, Poland GA. 2002. Score tests for association between traits and haplotypes when linkage phase is ambiguous. American Journal of Human Genetics 70: 425–434.

Shi R, Sun YH, Li QZ, Heber S, Sederoff R, Chiang VL. 2010. Towards a systems approach for lignin biosynthesis in Populus trichocarpa: transcript abundance and specificity of the monolignol biosynthetic genes. Plant and Cell Physiology 51: 144–163.

Sinnwell J, Schaid D. 2009. haplo.stats: Statistical analysis of haplotypes with traits and covariates when linkage phase is ambiguous. R package version 1.4.4. URL http://CRAN.R-project.org/package=haplo.stats [accessed on 01 August 2012].

Slavov GT, Zhelev P. 2010. Salient biological features, systematics, and genetic variation of Populus. In: Jansson S, Bhalerao RP, Groover A, eds. Genetics and genomics of Populus. New York, NY, USA: Springer, 15–38.

Sobel E, Sengul H, Weeks DE. 2001. Multipoint estimation of identity-by-descent probabilities at arbitrary positions among marker loci on general pedigrees. Human Heredity 52: 121–131.

Somerville C. 2006. Cellulose synthesis in higher plants. Annual Review of Cell and Developmental Biology 22: 53–78.

Song D, Shen J, Li L. 2010. Characterization of cellulose synthase complexes in Populus xylem differentiation. New Phytologist 187: 777–790.

Sticklen MB. 2008. Plant genetic engineering for biofuel production: towards affordable cellulosic ethanol. Nature Reviews Genetics 9: 433–443.

Storey JD, Tibshirani R. 2003. Statistical significance for genomewide studies. Proceedings of the National Academy of Sciences, USA 100: 9440–9445.

Studer MH, DeMartini JD, Davis MF, Sykes RW, Davison B, Keller M, Tuskan GA, Wyman CE. 2011. Lignin content in natural Populus variants affects sugar release. Proceedings of the National Academy of Sciences, USA 108: 6300–6305.

Suzuki S, Li L, Sun Y-H, Chiang VL. 2006. The cellulose synthase gene superfamily and biochemical functions of xylem-specific cellulose synthase-like genes in Populus trichocarpa. Plant Physiology 142: 1233–1245.

Suzuki S, Sakakibara N, Li L, Umezawa T, Chiang V. 2010. Profiling of phenylpropanoid monomers in developing xylem tissue of transgenic aspen (Populus tremuloides). Journal of Wood Science 56: 71–76.

Sykes R, Yung M, Novaes E, Kirst M, Peter G, Davis M. 2009. High-throughput screening of plant cell-wall composition using pyrolysis molecular beam mass spectroscopy. In: Mielenz J, ed. Biofuels: methods and protocols. New York, NY, USA: Humana Press, 169–183.

Syvanen A-C. 2005. Toward genome-wide SNP genotyping. Nature Genetics 37: S5–S10.

Tsai C, El Kayal K, Harding S. 2006. Populus, the new model system for investigating phenylpropanoid complexity. International Journal of Applied Science and Engineering 4: 221–233.

Varshney R, Nayak S, May G, Jackson S. 2009. Next-generation sequencing technologies and their implications for crop genetics and breeding. Trends in Biotechnology 27: 522–530.

Wang H-Z, Dixon RA. 2011. On-off switches for secondary cell wall biosynthesis. Molecular Plant 5: 297–303.

Wang Y-S, Harding SA, Tsai C-J. 2004. Expression of a glycine decarboxylase complex H-protein in non-photosynthetic tissues of Populus tremuloides. Biochimica et Biophysica Acta (BBA) - Gene Structure and Expression 1676: 266–272.

Warnes G, Gorjanc G, Leisch F, Man M. 2011. genetics: Population Genetics. R package version 1.3.6. URL http://CRAN.R-project.org/package=genetics [accessed on 01 August 2012].

Wegrzyn JL, Eckert AJ, Choi M, Lee JM, Stanton BJ, Sykes R, Davis MF, Tsai C-J, Neale DB. 2010. Association genetics of traits controlling lignin and cellulose biosynthesis in black cottonwood (Populus trichocarpa, Salicaceae) secondary xylem. New Phytologist 188: 515–532.

Whetten R, Sederoff R. 1995. Lignin biosynthesis. The Plant Cell Online 7: 1001–1013.

Zamudio F, Wolfinger R, Stanton B, Guerra F. 2008. The use of linear mixed model theory for the genetic analysis of repeated measures from clonal tests of forest trees. I. A focus on spatially repeated data. Tree Genetics & Genomes 4: 299–313.

Zhong R, Morrison WH, Himmelsbach DS, Poole FL, Ye Z-H. 2000. Essential role of caffeoyl coenzyme A O-methyltransferase in lignin biosynthesis in woody poplar plants. Plant Physiology 124: 563–578.

Zobel B, Jett JB. 1995. Genetics of wood production. Heidelberg, Germany: Springer-Verlag.

Supporting Information

Additional supporting information may be found in the online version of this article.

Table S1 Family means per trait

Table S2 Mode of gene action per family

Table S3 Significant haplotypes

Please note: Wiley-Blackwell are not responsible for the content or functionality of any supporting information supplied by the authors. Any queries (other than missing material) should be directed to the New Phytologist Central Office.
