Caffeoyl Shikimate Esterase (CSE) Is an Enzyme in the Lignin Biosynthetic Pathway in Arabidopsis



New Guinea, which we do not. Our inclusion of Turkey and the Middle East, Japan, Korea, and more of China, northern South America, and Southern Africa reflects our estimates of their numbers of endemic species.

Our results have five consequences.

1) Broadly, tropical and subtropical islands, moist tropical and subtropical forests (especially those in mountains), and Mediterranean ecosystems hold concentrations of plant endemics. The majority of as-yet undescribed plant species also live in these regions (18). Further discoveries would likely enhance their importance.

2) Our figures suggest the achievement of Aichi goals by concentrating protected areas in regions of highest endemism. Had nations already implemented this strategy, we would see proportionally greater protection rates where species densities are high. Figure 2 shows that, within the regions we select, strict protected areas [International Union for Conservation of Nature (IUCN) classes I and II (26)] occur at only slightly higher rates than in nonselected regions. The most important areas include Costa Rica and Panama, which have >10% of their land in IUCN classes I and II. This is a weak trend, however. When considering all the categories of protection (IUCN I to VI plus indigenous territories), the total protected is much higher, but the trend similar.

3) The ability of protected areas to protect depends upon the nature and location of threats (2, 5). For example, within our 17% set, 9.4% of plant species are endemic to a total of ~1.9 million km² of islands. Island plants suffer greatly from introduced species (27), something the establishment of protected areas does not completely address.

4) Figure 2 (and see table S1b) shows a hitherto poorly appreciated effect of indigenous areas. In tropical South America, these protect large areas of tropical moist forest with high plant richness.

5) Overall, the global land area currently protected, ~13%, is close to Aichi’s 17%. This seems encouraging. However, of the 17% that contains the entire range of 67% of the world’s plant species, only 14% is protected in some way, barely more than the global average.

The total area protected imperfectly measures species’ protection, however. Even with perfect data on species’ distributions, the “Noah’s Ark effect” (28) renders simple optimal allocation of priority areas meaningless. A small total area (a metaphorical “ark”) can capture many species but ignore long-term viability. Numerous protected areas of large aggregate size may house many species, but be individually too small to maintain viable populations. How small is “too small” depends on the species (tigers demand more area than tiger lilies) as well as the distribution of habitat fragments (29) and levels of threat (2, 5). How much area countries should protect, and where, are ecological questions. Political practicalities dominate actions, as the Aichi target of 17% testifies.

The spatial resolution of presently available data is inadequate to address this key concern at the spatial scales at which conservation actions are taken and protected areas established. Nonetheless, we show that how protected areas are allocated within and across regions constrains how efficiently plant diversity can be sustained. Understanding this is necessary to achieve the Convention on Biological Diversity’s conservation goals.

References and Notes

1. J. M. Adeney, N. L. Christensen, S. L. Pimm, PLoS ONE 4, e5014 (2009).
2. K. S. Andam, P. J. Ferraro, A. Pfaff, G. A. Sanchez-Azofeifa, J. A. Robalino, Proc. Natl. Acad. Sci. U.S.A. 105, 16089–16094 (2008).
3. J. Geldmann et al., Biol. Conserv. 161, 230–238 (2013).
4. L. N. Joppa, S. R. Loarie, S. L. Pimm, Proc. Natl. Acad. Sci. U.S.A. 105, 6673–6678 (2008).
5. A. Nelson, K. M. Chomitz, PLoS ONE 6, e22722 (2011).
6. L. N. Joppa, A. Pfaff, Proc. Biol. Sci. 278, 1633–1638 (2011).
7. S. L. Pimm, G. J. Russell, J. L. Gittleman, T. M. Brooks, Science 269, 347–350 (1995).
8. C. N. Jenkins, S. L. Pimm, L. N. Joppa, Proc. Natl. Acad. Sci. U.S.A. 110, E2602–E2610 (2013).
9. BirdLife International, Bird species distribution maps of the world (BirdLife International, Cambridge, UK, 2011), vol. 2013; www.birdlife.org/datazone/info/spcdownload.
10. International Union for the Conservation of Nature, IUCN Red List of Threatened Species. Version 2010.4 (2013); www.iucnredlist.org/technical-documents/spatial-data.
11. L. Cantú-Salazar, C. D. L. Orme, P. C. Rasmussen, T. M. Blackburn, K. J. Gaston, Biodivers. Conserv. 22, 1033–1047 (2013).
12. B. R. Scheffers, L. N. Joppa, S. L. Pimm, W. F. Laurance, Trends Ecol. Evol. 27, 501–510 (2012).
13. L. N. Joppa, D. L. Roberts, S. L. Pimm, Proc. R. Soc. B 278, 554–559 (2011).
14. D. R. Strong, J. H. Lawton, S. R. Southwood, Insects on Plants. Community Patterns and Mechanisms (Blackwell Scientific Publications, Oxford, 1984).
15. C. N. Jenkins, L. Joppa, Biol. Conserv. 142, 2166–2174 (2009).
16. D. M. Olson et al., Bioscience 51, 933 (2001).
17. L. N. Joppa, A. Pfaff, PLoS ONE 4, e8273 (2009).
18. L. N. Joppa, D. L. Roberts, N. Myers, S. L. Pimm, Proc. Natl. Acad. Sci. U.S.A. 108, 13171–13176 (2011).
19. A. Paton, E. Nic Lughadha, Bot. J. Linn. Soc. 166, 250–260 (2011).
20. WCSP 2008 World checklist of selected plant families. The Board of Trustees of the Royal Botanic Gardens, Kew. See www.kew.org/wcsp.
21. C. N. Jenkins et al., Divers. Distrib. 17, 652–662 (2011).
22. N. Brummitt, S. Bachman, J. Moat, Endanger. Species Res. 6, 127–135 (2008).
23. Materials and methods are available as supplementary materials on Science Online.
24. N. Myers, R. A. Mittermeier, C. G. Mittermeier, G. A. da Fonseca, J. Kent, Nature 403, 853–858 (2000).
25. M. L. Rosenzweig, Species Diversity in Space and Time (Cambridge Univ. Press, Cambridge, 1995).
26. International Union for the Conservation of Nature and United Nations Environment Programme, World Conservation Monitoring Centre, The World Database on Protected Areas (WDPA) July Release (Cambridge, UK, 2012); www.protectedplanet.net.
27. D. F. Sax, S. D. Gaines, Proc. Natl. Acad. Sci. U.S.A. 105 (suppl. 1), 11490–11497 (2008).
28. S. L. Pimm, J. H. Lawton, Science 279, 2068–2069 (1998).
29. J. K. Schnell, G. M. Harris, S. L. Pimm, G. J. Russell, Conserv. Biol. 27, 520–530 (2013).

Acknowledgments: The original data for this paper are in public archives from BirdLife International (9), IUCN (10), WCSP (20), and WCMC (26). We thank those responsible for access to them and especially the many professionals and amateurs who collected them.

Supplementary Materials
www.sciencemag.org/cgi/content/full/341/6150/1100/DC1
Materials and Methods
Figs. S1 to S3
Table S1

10 June 2013; accepted 7 August 2013
10.1126/science.1241706

Caffeoyl Shikimate Esterase (CSE) Is an Enzyme in the Lignin Biosynthetic Pathway in Arabidopsis

Ruben Vanholme,1,2‡ Igor Cesarino,1,2‡ Katarzyna Rataj,3§ Yuguo Xiao,3§ Lisa Sundin,1,2 Geert Goeminne,1,2 Hoon Kim,4 Joanna Cross,1,2 Kris Morreel,1,2 Pedro Araujo,1,2 Lydia Welsh,3 Jurgen Haustraete,5 Christopher McClellan,3 Bartel Vanholme,1,2 John Ralph,4 Gordon G. Simpson,3,6 Claire Halpin,3*† Wout Boerjan1,2*†

Lignin is a major component of plant secondary cell walls. Here we describe caffeoyl shikimate esterase (CSE) as an enzyme central to the lignin biosynthetic pathway. Arabidopsis thaliana cse mutants deposit less lignin than do wild-type plants, and the remaining lignin is enriched in p-hydroxyphenyl units. Phenolic metabolite profiling identified accumulation of the lignin pathway intermediate caffeoyl shikimate in cse mutants as compared to caffeoyl shikimate levels in the wild type, suggesting caffeoyl shikimate as a substrate for CSE. Accordingly, recombinant CSE hydrolyzed caffeoyl shikimate into caffeate. Associated with the changes in lignin, the conversion of cellulose to glucose in cse mutants increased up to fourfold as compared to that in the wild type upon saccharification without pretreatment. Collectively, these data necessitate the revision of currently accepted models of the lignin biosynthetic pathway.

www.sciencemag.org SCIENCE VOL 341 6 SEPTEMBER 2013 1103 (REPORTS; downloaded from www.sciencemag.org on September 5, 2013)

The evolutionary emergence of lignin, a phenolic polymer deposited in the secondary cell wall, allowed the development of vascular land plants. The hydrophobic and strengthening nature of lignin enables conducting xylem vessels to transport water and nutrients from the roots to photosynthetic organs while withstanding the negative pressure caused by transpiration (1, 2). The strengthening of fiber cells by lignification allows vascular plants to grow tall and stand upright (1, 3, 4). However, these very same physicochemical properties of lignin are a barrier to the isolation of cellulose fibers by chemical pulping and the enzymatic hydrolysis of cell wall polysaccharides in biorefining. Biomass feedstocks with less lignin or with more-degradable lignin would reduce the high processing costs and carbon footprint of paper, biofuels, and chemicals (5).

The lignin biosynthetic pathway has been extensively studied (6–10). In dicotyledonous plants, lignin is mainly synthesized from two monomers or monolignols, coniferyl alcohol and sinapyl alcohol (6, 11, 12), that upon incorporation into lignin give rise to guaiacyl (G) and syringyl (S) units, respectively. p-Coumaryl alcohol gives rise to the less abundant p-hydroxyphenyl (H) lignin units. 3-Hydroxylation of the aromatic ring, catalyzed by p-coumarate 3-hydroxylase (C3H, Fig. 1), diverts flux away from H lignin and toward G and S lignin. The discovery that C3H accepts p-coumaroyl shikimate and quinate esters as substrates (13–15) led to the identification of hydroxycinnamoyl-CoA:shikimate/quinate hydroxycinnamoyltransferase (HCT) as the enzyme catalyzing the preceding step, the production of p-coumarate esters from p-coumaroyl–coenzyme A (CoA) (16) (Fig. 1). The suggestion that HCT also catalyzes a second reaction in the lignin pathway, converting the resulting caffeate esters into caffeoyl-CoA (16), was attractive because it brought the interpreted pathway back to the next expected intermediate, caffeoyl-CoA. Here we describe caffeoyl shikimate esterase (CSE, encoded by At1g52760) as an enzyme in the lignin biosynthetic pathway that, together with 4-coumarate:CoA ligase (4CL), bypasses the second HCT reaction.

CSE was first identified through a potential function in phospholipid repair upon oxidative stress (17). However, we identified CSE as a candidate for involvement in lignification, based on analyses designed to identify genes coexpressed with known components of the lignin biosynthetic pathway. Of 13 genes identified by each of three different coexpression software tools and publicly available data sets, 9 were established lignin pathway genes, but three, including CSE, had no known role in lignin biosynthesis (fig. S1 and table S1). Of these three, only CSE was also identified as being coexpressed with lignin pathway genes in a set of lignin mutants (7). Consistent with the coexpression analysis, CSE-reporter fusion proteins expressed from transgenes (composed of the native CSE promoter, exons, and introns, fused to a reporter gene) were detected in lignifying vascular tissue of primary transformants (fig. S2).
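The candidate selection described above amounts to intersecting the hit lists of several coexpression tools and then setting aside genes already known to act in lignin biosynthesis. The following is only a minimal sketch of that logic under stated assumptions: the tool names and all gene identifiers except At1g52760 (the CSE locus named in the text) are hypothetical placeholders, and the tiny lists do not reproduce the 13 genes reported in table S1.

```python
# Hypothetical hit lists from three coexpression tools. Only At1g52760 (the
# CSE locus) comes from the text; every other identifier is a placeholder.
hits_by_tool = {
    "tool_a": {"At1g52760", "lignin_gene_1", "lignin_gene_2", "other_gene_1"},
    "tool_b": {"At1g52760", "lignin_gene_1", "lignin_gene_2", "other_gene_2"},
    "tool_c": {"At1g52760", "lignin_gene_1", "lignin_gene_2", "other_gene_3"},
}

# Placeholder set standing in for the established lignin pathway genes.
known_lignin_genes = {"lignin_gene_1", "lignin_gene_2"}

# Keep only the genes reported by every tool.
shared = set.intersection(*hits_by_tool.values())

# Shared genes without an established lignin role are the new candidates.
candidates = shared - known_lignin_genes

print(sorted(shared))
print(sorted(candidates))
```

Requiring agreement between all tools is a conservative filter: a gene picked up by only one tool, such as the `other_gene_*` placeholders, is dropped before any candidate is considered.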

To investigate a role for CSE in lignification, we studied two transfer DNA insertion mutants: cse-1, a knockdown mutant with an insertion in the promoter, and cse-2, a knockout mutant with an insertion in the second exon (fig. S3). Although cse-1 did not show developmental abnormalities, the inflorescence stems of cse-2 mutants were 37% smaller and 42% lighter at senescence than were those of the wild type (Fig. 2 and fig. S4). CSE transcript levels analyzed by quantitative reverse transcription polymerase chain reaction were 6.3% of that in the wild type in cse-1 and undetectable in cse-2, the weak and strong mutant alleles, respectively (fig. S3).

Analysis of transverse sections of cse-1 and cse-2 mutant stems revealed reduced autofluorescence and less intense Wiesner and Mäule staining in vessels and fibers of cse-2, indicative of reduced lignin content and fewer lignin S units (Fig. 2 and fig. S5). In addition, cse-2 mutants had collapsed vessel elements (Fig. 2), a phenotype typical of plants with weakened secondary cell walls (18). The mutant phenotype of cse-2 was complemented in stable transgenic lines in which expression of CSE was driven by the cauliflower mosaic virus 35S promoter (Fig. 2 and figs. S4 and S5), verifying that the mutation of CSE is the cause of the observed phenotypes.

In order to examine the connection between CSE and lignification in more detail, senesced inflorescence stems of cse-1 and cse-2 were subjected to compositional analyses. The fraction of the dry weight made up by cell wall polymers after soluble molecules had been extracted (i.e., the cell wall residue) was significantly reduced from 79.8% in the wild type to 72.9% in the cse-2 mutant (0.05 > P > 0.01) (table S2). The fraction of acetyl bromide–soluble lignin released from this cell wall residue was reduced by 17 and 36% in the cse-1 and cse-2 mutants as compared to their corresponding wild-type controls (Fig. 2 and table S2). The lignin composition of cse-2, determined by thioacidolysis and nuclear magnetic resonance (NMR), revealed that the relative proportion of H units increased over 30-fold (Fig. 2, fig. S6 and table S2). Milder compositional shifts were apparent in the lignin of cse-1 (table S2). The increase in the proportion of H units in both mutants suggests that CSE is active in the general phenylpropanoid pathway after the branch leading to H units but before the pathways for G and S units diverge. At this part of the pathway, HCT, C3H, and caffeoyl-CoA O-methyltransferase (CCoAOMT) are also active (Fig. 1) and, accordingly, plants with reduced HCT and C3H activity also have lignin enriched in H units (14, 19, 20).

Because lignin composition was altered in both cse mutant alleles, we expected to find a shift in phenolic metabolism. We analyzed methanol-soluble phenolics of stem extracts of both cse mutant alleles by liquid chromatography mass spectrometry (LC-MS) (5). The abundance of two compounds, both oligolignols containing G and S units, was reduced in each mutant allele (compounds 28 and 29; figs. S7 and S8 and table S3). These findings are consistent with the reduced biosynthesis of lignin in the cse mutants, because the abundance of oligolignols during Arabidopsis stem development is correlated with the amount of lignin (7). Of the 27 compounds with increased abundance in the cse mutants as compared to the wild type, 21 could be identified. Nineteen were caffeate- and ferulate-derived products (figs. S7 and S8), of which caffeoyl shikimate was most abundant (Fig. 3). In addition, (hexosylated) H unit–containing oligolignols accumulated in the cse mutants, as well as a neolignan containing an H unit and ferulate (figs. S7 and S8). These data also support the hypothesis that CSE functions after the branch in the lignin pathway where G and S unit biosynthesis diverges from that of H units.

Fig. 1. The lignin biosynthetic pathway incorporating the CSE-dependent reaction established in this study.

1 Department of Plant Systems Biology, VIB (Flanders Institute for Biotechnology), Technologiepark 927, B-9052 Ghent, Belgium.
2 Department of Plant Biotechnology and Bioinformatics, Ghent University, Technologiepark 927, B-9052 Ghent, Belgium.
3 Division of Plant Sciences, College of Life Sciences, University of Dundee at the James Hutton Institute, Dundee DD2 5DA, UK.
4 Departments of Biochemistry and Biological Systems Engineering, and the Department of Energy Great Lakes Bioenergy Research Center, the Wisconsin Energy Institute, University of Wisconsin, Madison, 1552 University Avenue, Madison, WI 53726, USA.
5 Protein Service Facility, Department for Molecular Biomedical Research, VIB, Ghent University, Technologiepark 927, B-9052 Ghent, Belgium.
6 Cell and Molecular Science, James Hutton Institute, Invergowrie, Dundee DD2 5DA, UK.

*Corresponding author. E-mail: [email protected] (W.B.); [email protected] (C.H.)
†These authors contributed equally to this work.
‡These authors contributed equally to this work.
§These authors contributed equally to this work.

Some of the compounds that accumulate in cse mutants relative to the wild type might be substrates for CSE. In order to test this possibility, we incubated purified recombinant CSE enzyme with extracts of cse-1 mutants, after the complexity of the extract had been reduced by chromatographic separation into 96 fractions. Three compounds decreased in abundance upon treatment with CSE, and new compounds appeared in the same fractions (fig. S9 and tables S4 and S5). Caffeoyl shikimate, which was the compound that showed greatest abnormal accumulation in cse mutants, was almost completely hydrolyzed (97%) by recombinant CSE into caffeic acid (fig. S9). Modeling the structure of CSE revealed that caffeoyl shikimate could fit into the active site (fig. S10). Correspondingly, crude extracts from cse-2 lignifying tissues were less able to hydrolyze caffeoyl shikimate into caffeate than those of the wild type (fig. S11). We therefore suggest that caffeoyl shikimate is a substrate for CSE in vivo. Because caffeoyl shikimate is an accepted intermediate in lignin biosynthesis (Fig. 1) (13, 15, 21), this places CSE in the lignin biosynthetic pathway. The Michaelis-Menten constant (Km) and maximum reaction velocity (Vmax) values of CSE for caffeoyl shikimate were 96.5 µM and 9.3 picokatals (pkat) per mg of protein, respectively (Fig. 3). We also tested p-coumaroyl shikimate, which is structurally similar to caffeoyl shikimate and an intermediate in the phenylpropanoid pathway, as a potential substrate of CSE (13, 15). The Km and Vmax values of CSE for p-coumaroyl shikimate were 211 µM and 0.66 pkat per mg of protein, respectively. Thus, CSE showed a higher affinity for caffeoyl shikimate, with a Vmax/Km value that is 31 times greater than that for p-coumaroyl shikimate, suggesting that caffeoyl shikimate is the preferred CSE substrate.
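The 31-fold difference in Vmax/Km can be checked directly from the four kinetic constants quoted above. The sketch below is an illustration rather than the authors' analysis: it assumes simple Michaelis-Menten behavior, and the class and variable names are made up for the example. Both Km values are taken to be in the same concentration unit, so the ratio does not depend on which unit that is.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Kinetics:
    """Michaelis-Menten constants for one substrate (illustrative container)."""
    km: float    # Michaelis constant; same concentration unit for all substrates
    vmax: float  # maximum velocity, pkat per mg of protein

    def rate(self, substrate: float) -> float:
        """Michaelis-Menten velocity at a given substrate concentration."""
        return self.vmax * substrate / (self.km + substrate)

    @property
    def efficiency(self) -> float:
        """Vmax/Km, the specificity measure compared in the text."""
        return self.vmax / self.km


# Constants as reported in the text.
caffeoyl_shikimate = Kinetics(km=96.5, vmax=9.3)
p_coumaroyl_shikimate = Kinetics(km=211.0, vmax=0.66)

ratio = caffeoyl_shikimate.efficiency / p_coumaroyl_shikimate.efficiency
print(f"Vmax/Km ratio: {ratio:.1f}")  # about 30.8, i.e. the reported 31-fold

# At a substrate concentration equal to Km, the velocity is half of Vmax.
half_max = caffeoyl_shikimate.rate(96.5)
print(f"velocity at [S] = Km: {half_max:.2f} pkat/mg")
```

Most of the difference comes from Vmax (9.3 versus 0.66, roughly 14-fold); the roughly twofold lower Km accounts for the rest.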

Current lignin pathway models indicate that caffeoyl shikimate is converted to caffeoyl-CoA (Fig. 1) (1, 6, 16). When we tested whether this reaction could also be catalyzed by CSE with caffeoyl shikimate and CoA as substrates, only caffeate, but not caffeoyl-CoA, was produced (fig. S12). Our data suggest that current lignin biosynthetic pathway models should be revised to include the CSE-catalyzed conversion of caffeoyl shikimate into caffeate, although we cannot exclude the possibility that CSE can convert caffeoyl shikimate into other phenolic compounds in vivo.

Because the Arabidopsis 4CLs involved in lignification (4CL1 and 4CL2) have Km and Vmax values of the same order of magnitude for both p-coumarate and caffeate (7, 22, 23), caffeate might be used by 4CL to form caffeoyl-CoA in vivo (Fig. 1). Current pathway models indicate direct conversion by HCT of caffeoyl shikimate and CoA into caffeoyl-CoA (1, 6, 16). We confirmed that purified recombinant Arabidopsis HCT enzyme could indeed catalyze this reaction (fig. S13). These data show that caffeoyl shikimate may be a substrate for both CSE and HCT in vivo. A HCT-dependent route from caffeoyl shikimate to caffeoyl-CoA (Fig. 1) may explain how residual lignin in the cse-2 null mutant is synthesized. In addition, the C3H/C4H (cinnamate 4-hydroxylase) heteromeric complex may contribute to carbon flux toward lignin (Fig. 1), because homologs of these enzymes from poplar have been shown to convert p-coumarate to caffeate (24). However, these alternative routes to lignin biosynthesis do not fully compensate for a loss of CSE activity, because cse mutants are compromised in lignification and development. Likewise, the accumulation of caffeoyl shikimate that occurs in cse mutants suggests that HCT is relatively ineffective at metabolizing this substrate in vivo.

Fig. 2. Phenotype and lignin characteristics of cse mutants. (A) Fully grown plants after cultivation for 8 weeks in short day photoperiods and for 5 weeks in long day photoperiods. Height and weight measurements are in fig. S5. (B) Transverse stem sections of cse-2 mutants and wild-type plants. Wiesner staining and additional images of cse-1, cse-2 CSE 1, and cse-2 CSE 2 sections are in fig. S5. Collapsed vessel elements are indicated by arrowheads. Scale bar, 100 µm. (C) Lignin levels determined by the acetyl bromide method. See table S2 for lignin data of cse-1. ***0.001 > P; unpaired two-sided t test. (D) Relative H:G:S lignin composition as determined by thioacidolysis (left) and from whole–cell wall NMR integrals (right). See table S2 for full thioacidolysis details. (E) Partial short-range 13C-1H [heteronuclear single-quantum coherence (HSQC)] spectra (aromatic region) of the cell walls of cse-2 mutants and wild-type plants. For the side-chain region of the HSQC spectra, see fig. S6. Error bars indicate ± SEM.

Fig. 3. Caffeoyl shikimate accumulates in cse mutants and is hydrolyzed by CSE. (A) Representative LC-MS chromatograms of wild type, cse-1, and cse-2 plants, showing the increased peak of caffeoyl shikimate in the cse mutants. See fig. S7 and table S3 for full details. m/z, mass-to-charge ratio. (B) Enzyme kinetics curve measured with the recombinant CSE, showing that CSE accepts caffeoyl shikimate as a substrate.

Lignin limits the processing of plant biomass to fermentable sugars (25, 26). Processing of cse mutant plants, which have reduced lignin content, might yield more sugars on saccharification. We compared cellulose-to-glucose conversion of senesced stems from both cse mutants and wild-type plants. Cell wall residues of senesced inflorescence stems of cse-1 have normal amounts of cellulose, whereas those of cse-2 have 73% of the normal amount of cellulose (table S2). The cellulose-to-glucose conversion of the unpretreated cell wall residue was monitored over a period of 48 hours (Fig. 4); when the plateau was reached, the conversion had increased from ~18% in the wild type to ~24% in cse-1 (i.e., a relative increase of 32%) and to ~78% (fourfold higher than in the wild type) in cse-2. Therefore, saccharification efficiency increases as lignin content decreases. On a plant basis, cse-2 mutants released 75% more glucose than the wild type. Saccharification efficiency from material derived from cse-2 plants is similar to that of ccr1-3, a mutant in the lignin pathway gene for cinnamoyl-CoA reductase that has the highest saccharification efficiency described so far (26).
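The relative changes in cellulose-to-glucose conversion follow from simple arithmetic on the plateau values. The sketch below uses only the rounded percentages quoted in the text (~18%, ~24%, ~78%), so the cse-1 result comes out near 33% instead of the reported 32%, which was presumably computed from unrounded data; the function and variable names are illustrative.

```python
def relative_increase(mutant: float, wild_type: float) -> float:
    """Percentage increase of a mutant value over the wild-type value."""
    return (mutant - wild_type) / wild_type * 100.0


def fold_change(mutant: float, wild_type: float) -> float:
    """Mutant value expressed as a multiple of the wild-type value."""
    return mutant / wild_type


# Rounded plateau conversions (% of cellulose converted to glucose) from the text.
WILD_TYPE = 18.0
CSE_1 = 24.0
CSE_2 = 78.0

cse1_increase = relative_increase(CSE_1, WILD_TYPE)
cse2_fold = fold_change(CSE_2, WILD_TYPE)

print(f"cse-1: {cse1_increase:.0f}% relative increase")
print(f"cse-2: {cse2_fold:.1f}-fold the wild-type conversion")
```

The per-plant figure of 75% more glucose cannot be reproduced from these conversions alone, because it also depends on stem biomass and cellulose content, which are lower in cse-2.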

We found orthologs of CSE in a wide range of plant species (fig. S14), including biofuel feedstocks such as poplar, eucalyptus, and switchgrass. Consistent with a potential conserved role in lignification, CSE copurifies with lignin biosynthetic enzymes in extracts from poplar xylem (27). The characterization of CSE in other species will reveal how widely the revision of the lignin biosynthetic pathway we propose here applies and whether CSE could be a generally useful target for reducing cell wall recalcitrance to digestion or industrial processing in biomass crops.

References and Notes
1. J. K. Weng, C. Chapple, New Phytol. 187, 273–285 (2010).
2. F. Yang et al., Plant Biotechnol. J. 11, 325–335 (2013).
3. N. Mitsuda et al., Plant Cell 19, 270–280 (2007).
4. R. Zhong, E. A. Richardson, Z. H. Ye, Planta 225, 1603–1611 (2007).
5. R. Vanholme et al., New Phytol. 196, 978–1000 (2012).
6. W. Boerjan, J. Ralph, M. Baucher, Annu. Rev. Plant Biol. 54, 519–546 (2003).
7. R. Vanholme et al., Plant Cell 24, 3506–3529 (2012).
8. F. Chen et al., Plant J. 48, 113–124 (2006).
9. J. M. Humphreys, C. Chapple, Curr. Opin. Plant Biol. 5, 224–229 (2002).
10. N. D. Bonawitz, C. Chapple, Annu. Rev. Genet. 44, 337–363 (2010).
11. K. Freudenberg, A. C. Neish, in Constitution and Biosynthesis of Lignin, vol. 2 of Molecular Biology, Biochemistry and Biophysics, A. Kleinzeller, G. F. Springer, H. G. Wittman, Eds. (Springer-Verlag, New York, 1968), pp. 1–129.
12. J. Ralph et al., Phytochem. Rev. 3, 29–60 (2004).
13. G. Schoch et al., J. Biol. Chem. 276, 36566–36574 (2001).
14. R. Franke et al., Plant J. 30, 47–59 (2002).
15. R. Franke et al., Plant J. 30, 33–45 (2002).
16. L. Hoffmann, S. Maury, F. Martz, P. Geoffroy, M. Legrand, J. Biol. Chem. 278, 95–103 (2003).
17. W. Gao, H. Y. Li, S. Xiao, M. L. Chye, Plant J. 62, 989–1003 (2010).
18. S. R. Turner, C. R. Somerville, Plant Cell 9, 689–701 (1997).
19. X. Li, N. D. Bonawitz, J.-K. Weng, C. Chapple, Plant Cell 22, 1620–1632 (2010).
20. S. Besseau et al., Plant Cell 19, 148–162 (2007).
21. H. D. Coleman, J.-Y. Park, R. Nair, C. Chapple, S. D. Mansfield, Proc. Natl. Acad. Sci. U.S.A. 105, 4501–4506 (2008).
22. J. Ehlting et al., Plant J. 19, 9–20 (1999).
23. J. Raes, A. Rohde, J. H. Christensen, Y. Van de Peer, W. Boerjan, Plant Physiol. 133, 1051–1071 (2003).
24. H.-C. Chen et al., Proc. Natl. Acad. Sci. U.S.A. 108, 21253–21258 (2011).
25. F. Chen, R. A. Dixon, Nat. Biotechnol. 25, 759–761 (2007).
26. R. Van Acker et al., Biotechnology for Biofuels, www.biotechnologyforbiofuels.com/content/6/1/46 (2013).
27. R. Nilsson et al., Mol. Cell. Proteomics 9, 368–387 (2010).

Acknowledgments: The authors thank A. Bleys for help in preparing the manuscript and K. Graham for technical support. We gratefully acknowledge funding through the European Commission’s Directorate-General for Research within the 7th Framework Program (FP7/2007-2013) under grant agreements 211982 (RENEWALL) and 270089 (MULTIBIOPRO); Stanford University’s Global Climate and Energy Project (Towards New Degradable Lignin Types, Novel Mutants Optimized for Lignin, Growth and Biofuel Production via Re-Mutagenesis, and Efficient Biomass Conversion: Delineating the Best Lignin Monomer-Substitutes); the Hercules program of Ghent University for the Synapt Q-Tof (grant AUGE/014); the Bijzonder Onderzoeksfonds-Zware Apparatuur of Ghent University for the Fourier transform ion cyclotron resonance mass spectrometer (174PZA05); and the Multidisciplinary Research Partnership Biotechnology for a Sustainable Economy (01MRB510W) of Ghent University. R.V. is indebted to the Research Foundation-Flanders for a postdoctoral fellowship and L.S., I.C., and P.A. to the Agency for Innovation by Science and Technology (IWT), CAPES-Brazil (grant 201660/2010-5), and the CNPq-Brazil sandwich Ph.D. (grant 201998/2011-4), respectively, for predoctoral fellowships. J.R. and H.K. were funded in part by the U.S. Department of Energy’s Great Lakes Bioenergy Research Center (DOE Office of Science BER DE-FC02-07ER64494). G.G.S. was partially funded by the Scottish Government. W.B. is on the Science Advisory Board of the NSF-funded project Regulation and Modeling of Lignin Biosynthesis, DBI-0922391. A patent application, “Modified plants” PCT/GB2013/051206, on the modification of CSE expression to improve processes that require carbohydrate extraction, has been filed jointly by the University of Dundee and the Flanders Institute for Biotechnology (VIB).

Supplementary Materials
www.sciencemag.org/cgi/content/full/science.1241602/DC1
Materials and Methods
Figs. S1 to S14
Tables S1 to S5
References (28–55)

7 June 2013; accepted 5 August 2013
Published online 15 August 2013;
10.1126/science.1241602


Fig. 4. Cellulose-to-glucose conversion during saccharification of the senesced inflorescence stems of cse mutants. h, hours. Error bars indicate ± SEM. *0.05 > P > 0.01, **0.01 > P > 0.001, ***0.001 > P; unpaired two-sided t test.
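The quantity plotted in Fig. 4 is released glucose expressed relative to the glucose contained in the cellulose. A minimal sketch of that ratio follows, assuming that the theoretical glucose is obtained from the cellulose mass through the anhydroglucose-to-glucose mass ratio (162.141 to 180.156 g/mol). That convention, the function name, and the example masses are assumptions for illustration and are not taken from the source.

```python
GLUCOSE_MW = 180.156         # g/mol, C6H12O6
ANHYDROGLUCOSE_MW = 162.141  # g/mol, C6H10O5 repeat unit of cellulose


def cellulose_to_glucose_conversion(released_glucose_mg: float,
                                    cellulose_mg: float) -> float:
    """Fraction of the cellulose-derived glucose that was actually released."""
    # Hydrolysis adds one water per glucose unit, so complete conversion of
    # cellulose yields slightly more glucose mass than the cellulose mass.
    theoretical_glucose_mg = cellulose_mg * GLUCOSE_MW / ANHYDROGLUCOSE_MW
    return released_glucose_mg / theoretical_glucose_mg


# Hypothetical masses, chosen only to illustrate the calculation.
conversion = cellulose_to_glucose_conversion(released_glucose_mg=0.2,
                                             cellulose_mg=1.0)
print(f"conversion: {100 * conversion:.1f}%")
```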

6 SEPTEMBER 2013 VOL 341 SCIENCE www.sciencemag.org1106


Caffeoyl Shikimate Esterase (CSE) Is an Enzyme in the Lignin Biosynthetic Pathway in Arabidopsis

Ruben Vanholme et al.

Science 341, 1103 (2013); DOI: 10.1126/science.1241602


The following resources related to this article are available online at www.sciencemag.org (this information is current as of September 5, 2013):

Updated information and services, including high-resolution figures, can be found in the online version of this article at: http://www.sciencemag.org/content/341/6150/1103.full.html

Supporting Online Material can be found at: http://www.sciencemag.org/content/suppl/2013/08/14/science.1241602.DC1.html

This article cites 53 articles, 22 of which can be accessed free: http://www.sciencemag.org/content/341/6150/1103.full.html#ref-list-1

Science (print ISSN 0036-8075; online ISSN 1095-9203) is published weekly, except the last week in December, by the American Association for the Advancement of Science, 1200 New York Avenue NW, Washington, DC 20005. Copyright 2013 by the American Association for the Advancement of Science; all rights reserved. The title Science is a registered trademark of AAAS.


www.sciencemag.org/cgi/content/full/science.1241602/DC1

Supplementary Materials for

Caffeoyl Shikimate Esterase (CSE) Is an Enzyme in the Lignin Biosynthetic Pathway

Ruben Vanholme, Igor Cesarino, Katarzyna Rataj, Yuguo Xiao, Lisa Sundin, Geert Goeminne, Hoon Kim, Joanna Cross, Kris Morreel, Pedro Araujo, Lydia Welsh, Jurgen Haustraete, Christopher McClellan, Bartel Vanholme, John Ralph, Gordon G. Simpson,

Claire Halpin,* Wout Boerjan*

*Corresponding author. E-mail: [email protected] (W.B.); [email protected] (C.H.)

Published 15 August 2013 on Science Express DOI: 10.1126/science.1241602

This PDF file includes:

Materials and Methods
Figs. S1 to S14
Table S2
Captions for Tables S1 and S3 to S5
References (28–55)

Other Supplementary Material for this manuscript includes the following: (available at www.sciencemag.org/cgi/content/full/science.1241602/DC1)

Tables S1 and S3 to S5


Materials and Methods

Co-expression analyses

Analyses to identify genes that are co-expressed with known lignin biosynthesis genes were performed using three different tools/datasets: ACT (the Arabidopsis Co-expression Tool) (28), CressExpress (29), and a high-resolution root spatiotemporal (HRRS) expression dataset (30). For each analysis, 10 known monolignol biosynthetic genes were used as query baits (PAL1, At2g37040; C4H, At2g30490; 4CL1, At1g51680; HCT, At5g48930; C3H1, At2g40890; CCoAOMT1, At4g34050; CCR1, At1g15950; F5H1, At4g36220; COMT, At5g54160 and CAD6/CAD-D, At4g34230). For the ACT analyses, the 22K array option was selected (corresponding to experiments performed with the Affymetrix ATH1 array), and for each lignin biosynthesis gene bait, a list was derived of the top 50 co-expressed genes (ranking by r-value). These 10 individual gene lists were then compared to identify 94 genes that were co-expressed with two or more lignin biosynthesis genes. Similarly, the HRRS dataset (30) was analysed using the graphical representation tool featured in Brown et al. (31) to identify candidate genes that are spatially co-expressed with each lignin biosynthesis gene; then the 10 individual lists were compared to identify 99 genes that appeared on more than one list. CressExpress is a co-expression analysis tool (http://cressexpress.org/) that can rank co-expressed genes based on their common connections with two or more query genes (29). The 10 monolignol biosynthetic genes were treated as a set of query baits to retrieve the co-expressed genes using the settings: array version 3.0, 1779 arrays, RMA processing, cut-off value for Kolmogorov-Smirnov quality-control statistic: 0.15. In total, 134 genes co-expressed with two or more lignin query genes were identified using CressExpress. The final lists from all three analyses were compared to reveal common genes identified by two or all three analyses.

Plant Material

Two T-DNA insertion mutants were obtained from the SALK collection (32): SALK_008202C and SALK_023077, which were called cse-1 and cse-2, respectively. These lines were previously described by Gao and coworkers (17). The T-DNA flanking sequence of cse-1 was analysed with the primers 5'-CAGATCTTCTGAAACATCCAG-3' (1) and 5'-GCCCTTTGACGTTGGAGTC-3' (2), whereas the absence of the T-DNA was analysed with the primers 1 and 5'-GGCTAATGGGTTAATCTTTAGG-3' (3). The cse-1 T-DNA insertion was confirmed to be positioned in the promoter region, 50 bp upstream of the start codon. The T-DNA flanking sequence of cse-2 was analysed with the primers 5'-CTCTCCTTGAATCAGCGAGTG-3' (4) and 5'-ATTTTGCCGATTTCGGAAC-3' (5), whereas the absence of the T-DNA was analysed with the primers 4 and 5'-AAAACACATCAAAACGATGCC-3' (6). The cse-2 T-DNA insertion was confirmed to be in the second exon (Fig. S3). The line cse-2 CSE 1 was complemented with p35S:CSE:GFP and the line cse-2 CSE 2 with p35S:CSE:HBH. To this end, the coding sequence of CSE was amplified with primers mutating the stop codon and introducing attB Gateway flanking sites: 5'-GGGGACAAGTTTGTACAAAAAAGCAGGCTTAAAAATGCCGTCGGAAGCGGAGAG-3' and 5'-GGGGACCACTTTGTACAAGAAAGCTGGGTAAGCGGTTTTAGATCCATACTTC-3'. The PCR product was introduced into the pDONR207 vector. The generated entry clone was subsequently introduced into the pB7WG2 vector (33) for the CSE:GFP fusion or pEarleyGate 100 vector (34) with a His-Biotin carboxyl carrier protein domain – His tag (HBH) sequence added for the CSE:HBH fusion. The final constructs were transformed into cse-2 mutants using the floral dip method (…). For the β-glucuronidase (GUS) and GFP reporter lines, a region of 2 kb directly upstream of the CSE open reading frame (proCSE) was amplified by PCR from Arabidopsis Col-0 genomic DNA using primers 5'-AGGGGATCCAAATAAATGAGTGGGAAATATTAAATG-3' and 5'-GGTCTCGAGCGTTTTGATGTGTTTTTTTTGGCAGTG-3' containing the restriction sites for BamHI and XhoI. Subsequently, the PCR product was cloned into the Gateway pEN-R4L1 vector using T4 DNA Ligase (Invitrogen). The introduction of attB Gateway flanking sites into the CSE genomic region (gCSE) was performed by PCR using Arabidopsis Col-0 genomic DNA as template and primers 5'-AAAAAGCAGGCTTCACCATGCCGTCGGAAGCGGAGAGCTCA-3' and 5'-GGGGACCACTTTGTACAAGAAAGCTGGGTGAGCGGTTTTAGATCCATACTT-3'. Subsequently, the PCR product was cloned into the Gateway pDONR221 vector. The sequence identity of pENTR-proCSE and pENTR-gCSE was confirmed by sequencing. Then, the three building blocks pENTR-proCSE, pENTR-gCSE and pEN-R2-S*-L3 (for GUS reporter enzyme) or pENTR-proCSE, pENTR-gCSE and pEN-R2-F-L3 (for GFP fusion tag) (34) were introduced into the destination vector pK8m34GW-FAST via Multisite LR Clonase Plus (Invitrogen), which resulted in proCSE::CSE-GUS and proCSE::CSE-GFP expression clones, respectively. Afterwards, the expression clones were introduced into Agrobacterium tumefaciens strain C58C1 PMP90 by electroporation. After plant transformation using the floral dip method, the identification of transformed seeds was based on seed fluorescence (33).

qRT-PCR

Plants were grown in long-day conditions (16-h light/8-h dark, 20 °C). Inflorescence stems of 15 cm in height were harvested at stage 6.3-6.5 (36). The first two basal internodes were taken from three individual plants of each line. For each of the nine samples, RNA was isolated with the RNeasy Plant Mini kit (Qiagen). RNA was treated with DNaseI (Invitrogen) and reverse transcribed with the iScript™ cDNA Synthesis Kit (Bio-Rad). The qRT-PCR was performed in triplicate for each sample, using SYBR Green I (Qiagen) and primers 5'-CTCTTTGGTTTGGCTGATACG-3' (7) and 5'-CAGTAACTCTCTCATTGTTCCCAC-3' (8) for the CSE gene and primers 5'-CTGCGACTCAGGGAATCTTCTAA-3' (9) and 5'-TTGTGCCATTGAATTGAACCC-3' (10) for the UBIQUITIN-CONJUGATING (UBC) 21 gene (At5g25760) as an internal control (37). The PCR program consisted of an initial activation step of 15 min at 95 °C, followed by 40 cycles of 15 s at 95 °C, 30 s at 54 °C, and 30 s at 72 °C. The fold change in gene expression was calculated relative to Col-0 wild-type plants using the UBC gene as a reference. The mathematical model used to quantify the relative mRNA levels was previously described (28).

Cell Wall Characterisation and Saccharification


The cse-1 mutants were grown together with Col-0 wild type for eight weeks in short-day conditions (9-h light/15-h dark, 22 °C), which allowed for the development of a rosette but suppressed inflorescence stem development. After eight weeks, they were moved to long-day conditions (16-h light/8-h dark, 22 °C). cse-2 mutants were also grown alongside Col-0 controls, but with long-day growth conditions (16-h light/8-h dark, 20 °C). The main stem was harvested just above the rosette when the plant was completely senesced and dry. Once harvested, leaves, axillary inflorescences, and siliques were removed. The lowest 10 cm of the stem was chopped in 2-mm pieces. Preparation of insoluble cell wall residue (n …), acetyl bromide soluble lignin quantification (n …), cellulose quantification (n …), thioacidolysis (n …) and saccharification without cell wall pretreatment (n = …) was performed as previously described (26). Acetyl bromide soluble lignin is expressed relative to insoluble cell wall residue. Saccharification is expressed as the amount of released glucose upon saccharification, relative to the calculated amount of glucose that makes up the cellulose. Doing so, the cellulose-to-glucose conversion is obtained.

Plant Phenotype, Growth and Weight

Wild-type plants (n = …), cse-1 (n = …), cse-2 (n = …), cse-2 CSE 1 (n = …) and cse-2 CSE 2 (n = …) were grown in 9-h light/15-h dark, 22 °C for eight weeks, followed by 16-h light/8-h dark, 22 °C.

Microscopy

Stems from fully grown but still green plants (grown in 9-h light/15-h dark, 22 °C for eight weeks, followed by 16-h light/8-h dark, 22 °C for four weeks) were cut and the bottom 2 cm was embedded in 7% agarose. Slices of … µm thick were cut with a Vibratome (Campden Instruments, UK) and stained as in (38), but with slight modifications: Wiesner staining was performed with a drop of the following solution: 1 g phloroglucinol (Sigma-Aldrich, Steinheim, Germany) in 100 mL 95% EtOH and 16 mL 37% HCl. For Mäule staining, the samples were prepared by incubating them for 5 minutes in 1% w/v KMnO4, then rinsed with water, followed by incubation in 37% HCl and observed after addition of a drop of NH4OH. Images were taken with an Olympus BX51 microscope (Olympus Corporation, Tokyo, Japan). The images showing lignin autofluorescence were taken with a Zeiss Axioskop (Carl Zeiss, Oberkochen, Germany) equipped with a Filter Set 02 (488002-9901-000; excitation: 365 nm; beam splitter: 395 nm; emission: … nm).

Localization of CSE-GFP fusion protein was assessed in roots of 4 independent transformed 1 week-old T1 seedlings germinated under long-day conditions (16 h light/8 h dark). Confocal microscopy was carried out with a Zeiss LSM 710 microscope (Zeiss) and fluorescence signal for GFP (excitation 488 nm, emission peak 509 nm) and propidium iodide (PI) staining (excitation 536 nm, emission peak 617 nm) were detected with a 20x objective.

Histochemical staining for GUS activity was performed in the first pair of true leaves of 3 independent transformed 12-day-old T1 seedlings germinated under long-day conditions (16 h light/8 h dark). The plant material was incubated at 37 °C, in the dark, for two hours in a staining buffer containing 1 mM 5-bromo-4-chloro-3-indolyl β-D-glucopyranoside sodium salt (X-Gluc), 0.5% Triton X-100, 1 mM ethylenediaminetetraacetic acid (EDTA) pH 8.0, 0.5 mM potassium ferricyanide (K3Fe(CN)6), 0.5 mM potassium ferrocyanide (K4Fe(CN)6) and 500 mM sodium phosphate buffer pH 7.0. The reaction was terminated by replacing the staining buffer with 70% ethanol. The material mounted in 50% glycerol was analysed with an Olympus BX51 microscope and 10x objective equipped with a Nikon Digital Sight DS-SM camera.

NMR

The cse-2 mutants were grown with control plants as described in the section on cell wall characterisation. Fully senescent inflorescence stems of 5 individual cse-2 mutants were pooled into one cse-2 sample, and stems of 8 individual wild-type plants were pooled into one control sample for NMR. The whole plant cell wall gel-state samples for NMR experiments were prepared as previously described (31). The plant material was pre-ground for 1 min in a Retsch MM400 mixer mill at 30 Hz, using zirconium dioxide (ZrO2) vessels (10 mL) containing ZrO2 ball bearings (2 × 10 mm). The pre-ground cell walls were extracted with distilled water (ultrasonication, 1 h, three times) and 80% ethanol (ultrasonication, 1 h, three times). Isolated cell walls were dried and ball-milled using a Retsch PM100 planetary ball mill at 600 rpm, using ZrO2 vessels (50 mL) containing ZrO2 ball bearings (10 × 10 mm). Each sample (100 mg) was ground for 25 min (interval: 5 min, break: 5 min, repeated 5x). The cell walls were collected directly into NMR tubes and gels formed using DMSO-d6/pyridine-d5 (4:1). For acetylated NMR whole cell wall analysis of cse-2 and its wild-type control, the ball-milled walls were completely dissolved in DMSO/N-methylimidazole (2/1, v/v) followed by acetic anhydride addition (39). The acetylated cell walls were collected for NMR experiments after precipitation in water. The NMR spectra were acquired on a Bruker Biospin (Billerica, MA) Avance 500 MHz spectrometer fitted with a cryogenically-cooled 5-mm TCI gradient probe with inverse geometry (proton coils closest to the sample). NMR spectra of acetylated cell wall samples were acquired in CDCl3. The central solvent peak was used as an internal reference (δC 77.0, δH 7.24 ppm). The following parameters were used: 16 transient spectral increments were acquired from 10 to 0 ppm in F2 (1H) using 1998 data points for an acquisition time of 200 ms, an interscan delay of 1 s, 200 to 0 ppm in F1 (13C) using 512 increments (F1 acquisition time: 10 ms), with a total acquisition time of 5.5 h. Lignin assignments were via comparison with previously assigned spectra (29). Processing used typical matched Gaussian apodization (GB = …, LB = -0.1) in F2 and squared cosine-bell and one level of linear prediction (32 coefficients) in F1. Volume integration of contours in HSQC plots used Bruker’s TopSpin 3.1 (Mac version) software.

Phenolic Profiling

Wild type and cse-1 and cse-2 mutants (n = …, … and …, respectively) were grown and harvested as described for qRT-PCR, except that the first three internodes of the stem were collected for analysis. Stems were ground in a 2-mL Eppendorf tube with a 4-mm iron bead with a Retsch ball mill. Metabolites were extracted by adding 1 mL methanol and exposing the samples to 70 °C under 1000 rpm shaking in a thermomixer for 15 min. After centrifugation, 800 µL of the liquid phase was lyophilised in a speedvac, 100 µL cyclohexane was added to dissolve the pellet followed by another 100 µL water. The tubes were vortexed and centrifuged at 14,000 rpm (20,000 x g) for 10 min. 15 µL of the lower water phase was injected on an ultra high performance liquid chromatography (UHPLC) system (Waters Acquity UPLC®) equipped with a BEH C18 column (2.1 x 150 mm, 1.7 µm, Waters) and coupled to a time-of-flight mass spectrometer (TOF MS, Synapt Q-Tof (Waters Corporation, Milford, Massachusetts, USA)). A gradient of two buffers was used: buffer A (99/1/0.1 H2O/ACN/formic acid pH3), buffer B (99/1/0.1 ACN/H2O/formic acid pH3); 95% A for 0.1 min decreased to 50% A in 30 min (350 µL/min, column temperature 40 °C). The flow was diverted to the mass spectrometer equipped with an electrospray ionisation source and lockspray interface for accurate mass measurements. The MS source parameters were capillary voltage, 1.5 kV; sampling cone, 40 V; extraction cone, 4 V; source temperature, 120 °C; desolvation temperature, 350 °C; cone gas flow, 50 L/h; and desolvation gas flow, 550 L/h. The collision energy for the trap and transfer cells was 6 V and 4 V, respectively. For data acquisition, the dynamic range enhancement mode was activated. Full-scan data were recorded in negative centroid V-mode; the mass range between m/z 100 and 1,000, with a scan speed of 0.2 s/scan, with Masslynx software (Waters). Leucine-enkephalin (… pg/µL solubilised in water/ACN (… vol/vol), with … formic acid) was used for the lock mass calibration, with scanning every 10 s with a scan time of 0.5 s. From the resulting chromatograms, 967 deisotoped peaks were integrated and aligned via TransOmics™ (Waters Corporation, Milford, Massachusetts, USA), each peak having an m/z and a retention time. Statistics (ANOVA with post-hoc t-test) and PCA (Pareto scaling, all 967 peaks included) were performed in TransOmics™ extension EZinfo. The following stringent filters were used (resulting in the 29 compounds listed in Fig. S7): abundance > 1000, p-value ANOVA < 0.005, and differential p < 0.005 in both cse-1 and cse-2, fold change > 5 or < 0.2. Statistical analyses were performed on arcsinh transformed ion intensities. For structural elucidation, MS/MS was used. For MS/MS, all settings were the same as in full MS, except the collision energy was ramped from 15 to 35 eV in the trap, and the scan time was set at 0.5 s.

CSE and HCT Protein Expression and Purification

Recombinant CSE and HCT were expressed in the Escherichia coli strain BL21codon + pICA2 after transformation with pLH36CSE or pLH36HCT, respectively, in which expression is induced by isopropyl β-D-1-thiogalactopyranoside (IPTG) under control of a pL-promoter developed by the Protein Service Facility of VIB (WO 98/48025, WO 04/074488). The pLH36 plasmid is provided with a His6-tag followed by a murine caspase-3 site. The murine caspase-3 site can be used for the removal of the His6-tag attached at the N-terminus of the protein of interest during purification. Both proteins were produced and purified separately. The transformed bacteria were grown in Luria Bertani medium supplemented with ampicillin (100 µg/mL) and kanamycin (50 µg/mL) overnight at 28 °C before 1/100 inoculation in a 20-L fermenter provided with Luria Bertani medium supplemented with ampicillin (100 µg/mL) and 1% glycerol. The initial stirring and airflow were 200 rpm and 1.5 L/min, respectively. This was further automatically adapted to keep the pO2 at 30%. The temperature was kept at 28 °C. The cells were grown to an optical density of 1.0 as measured by the absorbance at 600 nm, transferred to 20 °C, and expression was induced by addition of 1 mM IPTG overnight. Cells were then harvested and frozen at -20 °C. After thawing, the cells were resuspended at 3 mL/g in 20 mM NaH2PO4 pH 7.4, 500 mM NaCl, 20 mM imidazole, 1 mM PMSF and 0.1% CHAPS. The cytoplasmic fraction was prepared by sonication of the cells and was isolated by centrifugation at 18,000 x g for 30 min. All steps were conducted at 4 °C. The clear supernatant was applied to a 60-mL Ni-Sepharose 6 FF column (GE Healthcare), equilibrated with 20 mM NaH2PO4 pH 7.4, 500 mM NaCl, 20 mM imidazole and 0.1% CHAPS. The column was eluted with 20 mM NaH2PO4 pH 7.4, 20 mM NaCl, 400 mM imidazole, 0.1% CHAPS after an extra wash step with 50 mM of imidazole in the same buffer. The elution fraction was diluted 1/20 with 20 mM Tris pH 8.5, 0.1% CHAPS and loaded on a 20-mL Source 15Q column (GE Healthcare) to remove contaminants. After equilibration, the protein of interest was eluted by a linear gradient over 20 column volumes of NaCl from 0 to 1 M in 20 mM Tris pH 8.5, 0.1% CHAPS. After this ion-exchange chromatography step, HCT was injected on Superdex 75 HR10/30 to PBS, and activated murine caspase-3 (1/100 m/m murine caspase-3/CSE) with 10 mM DTT was added to the CSE containing fractions to remove the His6-tag. After 1 hour incubation at 37 °C, the reaction solution was reloaded on a Ni-Sepharose 6 FF column to capture the His6-tag that was attached to CSE and the murine caspase-3 that was also provided with a His6-tag. CSE without fusion tag stayed in the flow-through of the column. Finally, the recombinant CSE was injected on a HiLoad 26/60 Superdex 75 prep grade with PBS as running solution for formulation and to remove minor contaminants. The obtained purified fractions of both proteins were analysed by SDS-PAGE and the concentration was determined using the Micro-BCA assay (Pierce).

Metabolic Extract Feeding Assay

Stem material of the cse-1 mutant was obtained as for phenolic profiling. Aliquots of 100 mg were extracted with 500 µL methanol and further processed as for phenolic profiling. Fifteen µL of aqueous phase was fractionated on a Waters Acquity UPLC® system with the same settings as for phenolic profiling coupled to a fraction collector (BioLogic BioFrac Fraction Collector, Bio-Rad, Hercules, California, USA). A total of 96 fractions was collected in a 96-well plate, in time intervals of 0.3 min. The fractionation was done twice, resulting in two plates with extract fractions. The fractions were subsequently lyophilised after which 80 µL of reaction mixture (20 mM Tris-HCl, pH 7.0 and 1.82 µg CSE) was added to all fractions of the first plate and the same reaction mixture but with boiled CSE was added to all fractions of the second plate. The second plate served as the control. Plates were incubated at 25 °C for 1 h. For each fraction, 15 µL was injected and analysed as described for phenolic profiling, but on a Waters BEH C18 column (2.1 x 50 mm, 1.7 µm), and a shortened gradient was applied: 95% A for 0.1 min decreased to 50% A in 8 min (600 µL/min, column temperature 40 °C); buffers were the same as described earlier. Subsequently, the compounds accumulating in cse mutants (Fig. S7) were screened as possible substrates for CSE.
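The compounds screened here are those that passed the filters stated under Phenolic Profiling (abundance > 1000, ANOVA p < 0.005, differential p < 0.005 in both cse-1 and cse-2, fold change > 5 or < 0.2, statistics on arcsinh-transformed intensities). A minimal sketch of such a filter is given below. The record layout, field names and example peaks are hypothetical, and reading the fold-change criterion as applying to both mutants is an assumption; the source performed this step in TransOmics and EZinfo, not in code like this.

```python
import math


def arcsinh_transform(intensity: float) -> float:
    """Variance-stabilising transform applied to ion intensities before statistics."""
    return math.asinh(intensity)


def passes_filters(peak: dict, min_abundance: float = 1000.0,
                   alpha: float = 0.005, fold: float = 5.0) -> bool:
    """Apply the abundance, ANOVA and per-mutant criteria to one peak record."""
    if peak["abundance"] <= min_abundance or peak["p_anova"] >= alpha:
        return False
    for mutant in ("cse-1", "cse-2"):
        stats = peak[mutant]
        changed = stats["fold_change"] > fold or stats["fold_change"] < 1.0 / fold
        if stats["p"] >= alpha or not changed:
            return False
    return True


# Hypothetical peak records, for illustration only.
peaks = [
    {"id": "peak_1", "abundance": 5000.0, "p_anova": 0.001,
     "cse-1": {"p": 0.001, "fold_change": 8.0},
     "cse-2": {"p": 0.0005, "fold_change": 20.0}},
    {"id": "peak_2", "abundance": 800.0, "p_anova": 0.001,   # too low in abundance
     "cse-1": {"p": 0.001, "fold_change": 8.0},
     "cse-2": {"p": 0.001, "fold_change": 9.0}},
    {"id": "peak_3", "abundance": 3000.0, "p_anova": 0.001,  # cse-2 change too small
     "cse-1": {"p": 0.001, "fold_change": 0.1},
     "cse-2": {"p": 0.002, "fold_change": 3.0}},
]

selected = [peak["id"] for peak in peaks if passes_filters(peak)]
print(selected)
```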

CSE activity assays in lignifying tissues of Arabidopsis stems

The basal 1 to 6 cm of individual inflorescences of cse-2 mutants and wild-type plants were scraped with a scalpel to remove the outer tissues (containing cortex, epidermis and part of the phloem) to enrich for lignifying tissues. The remaining inner stem material (enriched for xylem, interfascicular fibres but also pith cells) was frozen in liquid nitrogen and ground in 2-mL Eppendorf tubes using a Retsch mill (20 Hz, 4-mm bead). The ground plant material was further extracted on ice for 1 h with 100 mM Tris-HCl buffer, pH 7.0, containing 10 mM DTT, 1% (w/v) polyvinylpolypyrrolidone (PVPP), 15% glycerol and 1x cOmplete Mini Protease Inhibitor Cocktail (Roche). After centrifugation at 4 °C for 5 min at 20,000 x g, the supernatant was washed with the extraction buffer using Amicon Ultra-0.5, Ultracel-10 Membrane, 10 kDa (Millipore) to eliminate endogenous soluble phenolic compounds. Total soluble protein concentration was measured using the Qubit® 2.0 Fluorometer system (Invitrogen) according to the manufacturer’s instructions. The reaction mixture contained 50 mM Tris-HCl buffer, pH 7.0, 1 mM DTT, 100 µM caffeoyl-shikimate and 5 µg soluble proteins in a total volume of 40 µL. Samples were incubated at 25 °C for 1 h and the reactions were terminated by boiling for 5 min. Boiled protein extracts of wild-type plants were used as negative control. Caffeic acid was directly analyzed by UPLC-MS as described in the section Metabolic Extract Feeding Assay.

Enzymatic Kinetics

CSE substrate tests were performed in a volume of 40 µL with 100 mM Tris-HCl pH 7, 1 mM DTT and 1 µg purified CSE protein. The samples were held at … °C for … min and analysed via UHPLC-MS. CSE enzymatic kinetics for caffeoyl shikimate and p-coumaroyl shikimate were performed in a final volume of 40 µL with 100 mM Tris-HCl pH 7, 1 mM DTT, 1 µg purified CSE protein and five different concentrations of substrate (from 10 µM to 1000 µM). Each reaction was performed three times. The samples were held at … °C for … min and … h, for caffeoyl shikimate and p-coumaroyl shikimate, respectively. For HCT substrate tests, reactions were performed in 40 µL with 100 mM Tris-HCl pH 7, 1 mM DTT, 71 ng purified HCT protein. Samples were held at 20 °C for 30 min. For both HCT and CSE, the reactions were terminated by heating the samples for 10 min at 99 °C. All the samples were analysed via UHPLC-photodiode array detector (PDA)-MS. For detection of caffeoyl-CoA, the samples were diluted with water to a final volume of 200 µL to lower the ion suppression of Tris. The UHPLC and MS analytical settings were the same as described in ‘Metabolic Extract Feeding Assay’. Caffeoyl-CoA was integrated as the double negative charged ion with m/z 463.566. The PDA wavelength range was set at 200 to 500 nm with a resolution of 1.2 nm and a sample rate of 20 points per second. For CSE reactions, UV-Vis traces were used for calculating the peak areas for every compound (at their maximum absorbance wavelength: caffeate at 323 nm and p-coumarate at 312 nm). Automatic peak integration was performed with Targetlynx (Waters Corporation, Milford, Massachusetts, USA). Kinetic parameters were obtained via Graphpad Prism 6 (San Diego, California, USA) with nonlinear Michaelis-Menten curve-fitting.

Chemical Standards

p-Coumaroyl shikimate and caffeoyl shikimate were synthesised authentic compounds. Shikimic acid, caffeic acid, p-coumaric acid, ATP and CoA were purchased from Sigma-Aldrich (Steinheim, Germany) and caffeoyl-CoA and p-coumaroyl-CoA from TransMIT – Plant Metabolites and Chemicals (Gießen, Germany).
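For the CSE kinetics measured with these shikimate esters (see Enzymatic Kinetics), the nonlinear Michaelis-Menten fit can be illustrated with a small least-squares sketch. This is not the GraphPad Prism procedure used in the source: it scans Km on a logarithmic grid and, for each trial Km, uses the closed-form least-squares Vmax. The intermediate substrate concentrations, the rate units and the "true" parameters below are hypothetical, noise-free values chosen only to show that the fit recovers them.

```python
def michaelis_menten(s: float, vmax: float, km: float) -> float:
    """Initial rate predicted by v = Vmax * S / (Km + S)."""
    return vmax * s / (km + s)


def fit_michaelis_menten(substrate, rates, km_min=1.0, km_max=1e5, steps=2000):
    """Least-squares estimate of (Vmax, Km) by a grid scan over Km."""
    best = None
    for i in range(steps + 1):
        km = km_min * (km_max / km_min) ** (i / steps)  # log-spaced trial Km
        shape = [s / (km + s) for s in substrate]
        # For a fixed Km the model is linear in Vmax, so Vmax has a closed form.
        vmax = (sum(v * f for v, f in zip(rates, shape))
                / sum(f * f for f in shape))
        sse = sum((v - vmax * f) ** 2 for v, f in zip(rates, shape))
        if best is None or sse < best[0]:
            best = (sse, vmax, km)
    return best[1], best[2]


# Hypothetical data: five concentrations spanning 10 to 1000 (µM).
concentrations = [10.0, 50.0, 100.0, 500.0, 1000.0]
true_vmax, true_km = 5.0, 100.0
rates = [michaelis_menten(s, true_vmax, true_km) for s in concentrations]

vmax_fit, km_fit = fit_michaelis_menten(concentrations, rates)
print(f"Vmax ~ {vmax_fit:.2f}, Km ~ {km_fit:.1f}")
```

With real, noisy triplicate data a dedicated optimiser with confidence intervals would be preferable; the grid scan is used here only to keep the sketch dependency-free.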

Protein Modeling


The three-dimensional protein structure of CSE was modeled using Phyre2 (35). The fold library contained 47615 entries and the Intensive Modeling Mode option was selected. The QMEAN server was used for quality estimation of the obtained structure (36), returning a Q-mean score of 0.45. A structural similarity search was done by PDBeFold (40) and revealed a human epoxide hydrolase as closest structural homologue (3ANS; Q-score 0.57, RMSD 1.01) in the PDB database. The structure of the ligand (5-O-caffeoyl shikimate) was downloaded from PubChem (CID 5281762) as sdf-file and converted to pdb format using iCon of ChemStudio. This structure was further optimized for subsequent docking by adding missing polar hydrogens and optimizing the number of torsions. Molecular docking was performed using Autodock Vina version 1.1.2 with the optimized structure of caffeoyl shikimate as ligand. The center of the search space for docking was set at x = -…, y = … and z = … and its dimension was set at x = …, y = … and z = …. The complexes were further analyzed in PMV 1.5.6 and SPDBV_4.01 and the figure was optimized in POV-Ray and Adobe Photoshop CS2.


Fig. S1.

Co-expression analysis of known lignin biosynthetic genes. Three different co-expression analyses were performed using ACT (the Arabidopsis Co-expression Tool) (28), CressExpress (29) and a high-resolution root spatio-temporal (HRRS) expression dataset (30). In each of these, PAL1 (At2g37040), C4H (At2g30490), 4CL1 (At1g51680), HCT (At5g48930), C3H1 (At2g40890), CCoAOMT1 (At4g34050), CCR1 (At1g15950), F5H1 (At4g36220), COMT (At5g54160) and CAD6/CAD-D (At4g34230) were used as bait. Candidate genes had to be co-expressed with two or more lignin biosynthesis genes in order to be retained on the co-expressed gene list. The number between brackets depicts the total number of co-expressed genes that were retrieved by the analysis. Thirteen genes were found to be co-expressed in all three analyses. Gene identities are given in Table S1.
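The retention rule (co-expression with two or more bait genes) and the overlap between the three analyses can be sketched with plain set operations. The bait identifiers below are taken from the legend, but the candidate names (`geneA`, `geneB`, `geneC`) and the toy results are hypothetical, and the input layout is an assumption.

```python
from collections import Counter

def retained_candidates(coexpressed_by_bait, min_baits=2):
    """Genes co-expressed with at least `min_baits` bait genes in one analysis."""
    counts = Counter(
        gene
        for partners in coexpressed_by_bait.values()
        for gene in set(partners)
    )
    return {gene for gene, n in counts.items() if n >= min_baits}

def shared_by_all(*gene_sets):
    """Genes retained in every analysis."""
    return set.intersection(*gene_sets)

# Toy input: bait AGI code -> co-expressed candidates (hypothetical names).
act = {"At2g37040": ["geneA", "geneB"],
       "At2g30490": ["geneA", "geneC"],
       "At1g51680": ["geneC"]}
cress = {"At2g37040": ["geneA"],
         "At5g48930": ["geneA", "geneB"]}
hrrs = {"At4g34050": ["geneA", "geneC"],
        "At1g15950": ["geneA"]}

lists = [retained_candidates(a) for a in (act, cress, hrrs)]
common = shared_by_all(*lists)
```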


Fig. S2.

Expression pattern of CSE-reporter fusion proteins expressed from the CSE promoter. (A) Root of a 1-week-old proCSE::CSE-GFP transformed Arabidopsis seedling showing GFP signal (green) in the vasculature. Cell walls were stained with propidium iodide (red). (B) Leaf of a 12-day-old proCSE::CSE-GUS transformed Arabidopsis seedling showing GUS activity (blue) in the vasculature.


Fig. S3.

Position of the T-DNA insertions and CSE expression in cse-1 and cse-2. (A) Schematic representation of the CSE gene with indication of the T-DNA insertion sites and location of the primers used for qRT-PCR. (B) qRT-PCR showed 6.3% residual CSE transcript levels in the inflorescence stem of cse-1 as compared to wild type. Because the CSE open reading frame is intact in the cse-1 allele, the transcribed mRNA is expected to code for a functional protein, and therefore some CSE activity might be retained in the cse-1 mutant. CSE transcript levels were below the detection limit in the cse-2 mutant; together with the fact that the T-DNA insertion disrupts the middle of the coding sequence, this indicates that cse-2 can be considered a null mutant. WT: wild type; error bars indicate ± SEM; **0.01>P>0.001; ANOVA followed by LSD post-hoc tests comparing CSE expression in cse-1 and cse-2 with CSE expression in WT.


Fig. S4.

Height and weight of cse mutants and two independent CSE complemented lines, cse-2 CSE 1 and cse-2 CSE 2. (A) Growth curves. Height was monitored every two days. Growth rate between day 7 and 13 (gray area) was significantly slower for cse-2 as compared to wild type and significantly faster for cse-2 CSE 2 as compared to cse-2. At final height, cse-2 was significantly smaller than wild type, and both cse-2 CSE 1 and cse-2 CSE 2 were significantly taller than cse-2. (B) Dry weight of the main inflorescence stem. For statistical tests of the dry weight, all mutants and complemented lines were compared to wild type. **0.01>P>0.001; ***0.001>P; ANOVA followed by LSD post-hoc test.


Fig. S5.

Microscopy of stem sections. The cell walls of fibres and vessel cells of cse-2 mutants have reduced autofluorescence and reduced intensity upon staining with Mäule and Wiesner reagents. Although reduced as compared to wild type, autofluorescence and Wiesner staining were still apparent in the cse-2 mutant. As the colour intensity in cse-2 CSE 1 and cse-2 CSE 2 was not obviously different from that of wild type, the CSE overexpression construct successfully complemented the cse-2 mutation. The scale bar represents 0.1 mm. No obvious differences in colour intensity or cell wall morphology were observed between wild type and the weak cse-1 mutant.


Fig. S6.

Analysis of the lignin side-chain region of the HSQC spectra in the cse-2 mutant and wild type. Integrated values for the α-C/H correlation peaks from the major lignin interunit structures A-C are provided on the figures.


Fig. S7.


UHPLC-MS based phenolic profiling of cse-1, cse-2 and wild-type inflorescence stems (n = 8, 10 and 9, respectively). (A) Principal component analysis (PCA) on the 967 integrated and aligned peaks revealed that the first principal component (PC1, 32% of variation) explained mainly the difference between genotypes; wild type had negative PC1 values and cse-2 had positive PC1 values. The weaker cse-1 allele fell between wild type and cse-2. (B) To pinpoint the metabolites whose abundance was most affected by mutation of CSE, stringent filters were used, resulting in 29 compounds (27 up and 2 down), of which 23 could be identified (Fig. S8). Mean: average peak intensity. Compounds are ordered according to their metabolic classes and peak intensities, starting with the highest accumulating compound, caffeoyl shikimate. For data underlying (A) and (B) see Table S3. Individual metabolites can be identified in Table S3 by their m/z and retention time (RT).


Fig. S8.

MS-based elucidation of the structurally characterised compounds from phenolic profiling. MS/MS spectra and the reasoning for structural elucidation are given for the metabolites for which the MS/MS spectra have not been published before. MS/MS spectra were obtained on a Synapt HDMS Q-Tof (Waters Corporation, Manchester, UK) in the negative mode.

Compounds 1 and 2: Caffeoyl shikimate

Compounds 1 and 2, eluting at 6.01 min and 7.46 min, had a calculated chemical formula of C16H15O8- based on a theoretical m/z value (m/zth) of 335.077 that best approximated the experimental m/z value (m/zexp) of 335.075. The MS/MS spectra of both compounds are nearly identical. In each case there is a neutral loss of 156.042 Da, leading to a product ion at m/z 179.033. The calculated neutral loss represents a molecule with C7H8O4 as chemical formula (with a theoretical mass of 156.042 Da), corresponding to dehydrated shikimic acid. The smaller product ion at m/z 135.043 can be explained as the loss of CO2 from the m/z 179.033 ion (41). Therefore, the peak at m/z 179.033 can be assigned to caffeate (m/zth 179.035, see spectrum of caffeate 13). Assuming that compounds 1 and 2 are formed via esterification of shikimic acid to caffeic acid, the phenolate product ion of the caffeic acid moiety (due to shikimic acid loss) should also be observed in their MS/MS spectrum in addition to the carboxylate product ion with m/z 179.033 (42-44). Indeed, a caffeic acid-derived phenolate product ion was observed at m/z 161.023 (m/z 161.021 for compound 2). Taken together, the accurate mass and MS/MS spectra hint at caffeoyl shikimate as the identity of ions 1 and 2. More specifically, by comparing the mass, retention time and the MS/MS spectrum of ion 1 with those of an authentic standard compound, ion 1 was identified as trans-5-O-caffeoyl shikimate. This leaves the possibility that ion 2 is either the cis isomer of 5-O-caffeoyl shikimate, or the 4-O-caffeoyl shikimate or 3-O-caffeoyl shikimate isomer.

Compounds 3 and 4: Caffeoyl shikimate 3/4-O-hexoside

The two compounds eluting at 5.00 min and 6.96 min have a calculated chemical formula of C22H25O13- (m/zth 497.130 and m/zexp 497.129). MS/MS fragmentation led to a typical dehydrated hexose loss (theoretical neutral loss of 162.053 Da) or dehydrated shikimic acid loss (theoretical neutral loss of 156.042 Da, see compounds 1 and 2, caffeoyl shikimate) that resulted in the product ions with m/z 335.076 and m/z 341.085 (m/z 341.086 for compound 4). The combined loss of dehydrated hexose and dehydrated shikimic acid resulted in the ion at m/z 179.034 (m/z 179.032 for compound 4). Similarly to the spectrum of caffeoyl shikimate (1 and 2), the peaks at m/z 161.022 (phenolate ion) and m/z 135.043 (arising from decarboxylation of the carboxylate ion at m/z 179.034) indicate a caffeoyl moiety in the structure of compounds 3 and 4. The fact that the hexose and shikimate moieties can leave independently of each other as neutral losses suggests that both are linked directly to the caffeic acid moiety and not to each other. Because the hexose moiety dissociates from the precursor ion solely in its dehydrated form (162.053 Da), the hexose is linked to the 4-O or 3-O position of the caffeate moiety rather than being esterified. A detailed description of hexose linkages is given in the section for compound 8, i.e. caffeoyl hexose. Given the similarities with the spectra of 1 and 2, the shikimic acid moiety is most likely linked as an ester to the γ-position (i.e., 9-position) of caffeic acid. This is further supported by the minor peak at m/z 323.072 representing the phenolate ion following shikimic acid loss (42-44). Therefore, compounds 3 and 4 are tentatively assigned as caffeoyl shikimate 3/4-O-hexoside.
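The theoretical m/z values (m/zth) quoted for these compounds follow directly from the elemental composition. As a minimal sketch, the helper below sums monoisotopic masses for C, H and O and adds one electron mass for a singly charged anion. The function names are illustrative, the parser handles only simple formulas without brackets, and the rounded atomic masses are standard values that are not quoted in the text.

```python
import re

# Monoisotopic masses (Da) of the most abundant isotopes, rounded.
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "O": 15.99491462}
ELECTRON = 0.00054858  # electron mass in Da

def neutral_mass(formula):
    """Monoisotopic mass of a simple formula such as 'C7H8O4'."""
    total = 0.0
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total += MONOISOTOPIC[element] * (int(count) if count else 1)
    return total

def anion_mz(formula):
    """m/z of a singly charged anion with the given elemental composition."""
    return neutral_mass(formula) + ELECTRON

caffeoyl_shikimate = anion_mz("C16H15O8")   # formula of compounds 1 and 2
shikimate_loss = neutral_mass("C7H8O4")     # dehydrated shikimic acid
hexoside = anion_mz("C22H25O13")            # formula of compounds 3 and 4
```

Rounded to three decimals, these reproduce the values given above (335.077, 156.042 and 497.130).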


Compound 5: Caffeoyl shikimate + 204 Da

A compound with m/z 539.139 (C24H27O14-, m/zth 539.140) elutes at 6.56 min. In the MS/MS spectrum of this compound, a neutral loss of 156.039 Da leads to the product ion with m/z 383.099. This neutral loss is most likely a dehydrated shikimic acid (C7H8O4, theoretical mass of 156.042 Da). In addition, a second neutral loss of 204.064 Da leads to the peak at m/z 335.075. This neutral loss predicts a structure with a chemical formula of C8H12O6 (theoretical mass of 204.063 Da). Taking the chemical coupling into account, this moiety might originate from a molecule with a chemical formula of C8H12O6 + H2O = C8H14O7 (theoretical mass of 222.074 Da). In theory, this molecule could be acetyl hexose, but this identity is speculative. The product ions with m/z 179.033, m/z 161.018 and m/z 135.045 arise from dissociation of the product ion with m/z 335.075 and indicate that the latter product ion represents caffeoyl shikimate (see 1-4). Because the structure of the neutral loss of 204.064 Da remained unresolved, this compound was named ‘caffeoyl shikimate + 204 Da’.

Compounds 6 and 7: Caffeoyl shikimate rhamnose

Compounds 6 and 7 elute at 5.77 min and 6.58 min, respectively, and have an m/z of 481.134, which corresponds to a chemical formula of C22H25O12- (m/zth 481.135). The MS/MS spectra show the neutral loss of 146.059 Da, leading to the ion with m/z 335.075. This neutral loss typically corresponds to the loss of a dehydrated rhamnose moiety (theoretical mass of 146.058 Da). Because of the occurrence of ions with m/z 179.032 (m/z 179.035 for compound 7), m/z 161.021 (m/z 161.022 for compound 7) and m/z 135.042, the product ion with m/z 335.075 can be assigned as caffeoyl shikimate (see compounds 1-4). Because the shikimate moiety is only expelled as a combined dehydrated shikimate / dehydrated rhamnose loss upon collision-induced dissociation, it can be speculated that rhamnose is attached to the shikimate moiety, which would contrast with the position of the sugar in the caffeoyl shikimate 3/4-O-hexosides 3 and 4. Taken together, these compounds are tentatively assigned as caffeoyl shikimate rhamnose.

Compound 8: Caffeoyl hexose

Compound 8 elutes at 3.33 min and yielded an ion with m/z 341.087 corresponding to a chemical formula of C15H17O9- (m/zth 341.088). The MS/MS spectrum shows a neutral loss of dehydrated hexose (C6H10O5, theoretical neutral loss of 162.053 Da) that leads to the product ion with m/z 179.035 (C9H7O4-, m/zth 179.035). Other fragment ions with m/z 161.022 (C9H5O3-, i.e. C9H7O4- minus H2O, m/zth 161.024) and m/z 135.042 (C8H7O2-, i.e. C9H7O4- minus CO2, m/zth 135.045) hint that this compound is caffeate linked to a hexose moiety (see compound 1). The position of the hexose in conjugates of p-hydroxyphenylpropanoic acid with hexose can be deduced from the MS/MS fragmentation. In cases where hexose is esterified to the γ-position of p-hydroxyphenylpropanoic acids (as in p-coumaroyl hexose, feruloyl hexose, 5-hydroxyferuloyl hexose, sinapoyl glucose), the spectrum is dominated by neutral losses of C6H10O5 (162.053 Da) and C6H12O6 (180.063 Da) leading to the carboxylate and phenolate product ions of the aglycone (7, 45, 46). In contrast, in cases where the hexose is linked to the phenolic function of p-hydroxyphenylpropanoic acids (as in ferulic acid 4-O-hexoside, 5-hydroxyferulic acid 4/5-O-hexoside and sinapic acid 4-O-hexoside), the spectrum is dominated by a neutral loss of C6H10O5 only (45-47). In the MS/MS spectrum of this peak, eluting at 3.33 min, the neutral loss of C6H12O6 is prominent (i.e., leading to the product ion with m/z 161.022); therefore this compound is tentatively assigned as caffeoyl hexose.

Compounds 9, 10 and 11: Caffeate 3/4-O-hexoside

Compounds 9, 10 and 11 elute at 2.89 min, 3.99 min and 5.41 min, respectively, and have an m/z of about 341.086, corresponding to a chemical formula of C15H17O9- (m/zth 341.088). The MS/MS spectra of the compounds are similar to that of caffeate 3-O-hexoside (47). The isomers can be explained by the occurrence of cis and trans stereoisomers, different hexose moieties and differences in the hexose linkage (i.e., between the 3-O- or 4-O-phenolic functions of the caffeate moiety and any of the hydroxyl functions of the hexose moiety). Therefore, these compounds were assigned as caffeate 3/4-O-hexoside.

Compound 12: Caffeoyl hexose 3/4-O-hexoside

Compound 12 elutes at 3.00 min with m/z 503.139. The accurate mass hints at the chemical formula being C21H27O14- (m/zth 503.141). The MS/MS spectrum shows neutral losses of 162.052 Da (dehydrated hexose, C6H10O5, theoretical mass of 162.053 Da) and 180.063 Da (hexose, C6H12O6, theoretical mass of 180.063 Da), resulting in the carboxylate and phenolate product ions with m/z 341.086 and m/z 323.089, respectively. In addition, losses of twice 162.052 Da (i.e., 324.105 Da) and the combined loss of 162.052 Da and 180.063 Da (i.e., 342.115 Da) are observed, leading to the product ions with m/z 179.041 and m/z 161.026, respectively. Because of the occurrence of the ion with m/z 135.049, the product ion with m/z 179.041 (C9H7O4-, m/zth 179.035) can be assigned as a caffeate moiety (see 13). As explained in the sections for caffeoyl hexose (8) and caffeate 3/4-O-hexoside (9-11), the neutral loss of 180.063 Da is likely the loss of hexose linked to the γ-position, whereas the neutral loss of 162.052 Da is explained by the loss of a 3-O- or 4-O-linked hexose moiety. Taken together, this compound is tentatively assigned as caffeoyl hexose 3/4-O-hexoside.

Compound 13: Caffeate

Compound 13 elutes at 4.84 min with m/z 179.033. The calculated chemical formula is C9H7O4- (m/zth 179.035). This ion was identified as caffeate because its m/z, retention time and MS/MS spectrum are identical to those of the standard.

Compound 14: Feruloyl shikimate

Compound 14 has a retention time of 9.95 min and an m/z of 349.091. The calculated chemical formula for the ion is C17H17O8- (m/zth 349.093) and its MS/MS spectrum shows a neutral loss of 156.043 Da, which can be attributed to the loss of dehydrated shikimate (C7H8O4, theoretical mass of 156.042 Da, see caffeoyl shikimate 1 and 2), rendering the carboxylate product ion with m/z 193.048 (C10H9O4-, m/zth 193.051). The phenolate product ion at m/z 175.039 indicated that the shikimic acid moiety was esterified (C10H9O4- minus H2O, m/zth 175.040). Further fragmentation of the carboxylate product ion with m/z 193.048 yielded product ions with m/z 178.025 (C10H9O4- minus •CH3 radical, m/zth 178.027), m/z 160.016 (C10H9O4- minus •CH3 radical and H2O, m/zth 160.017), m/z 149.057 (C10H9O4- minus CO2, m/zth 149.061) and m/z 134.036 (C10H9O4- minus •CH3 radical and CO2, m/zth 134.037). This indicates that the m/z 193.048 product ion is ferulate. Therefore compound 14 is tentatively assigned as feruloyl shikimate.

Compounds 15 and 16: Feruloyl hexose

Compounds 15 and 16 elute at 5.28 min and 5.76 min, respectively, and yielded ions with m/z 355.101, for which the calculated chemical formula is C16H19O9- (m/zth 355.104). Their MS/MS spectra match that of a previously reported feruloyl hexose (45).

Compounds 17 and 18: Feruloyl malate

Compounds 17 and 18 elute at 8.68 min and 9.03 min, respectively, and yielded ions with m/z 309.059, for which the calculated chemical formula is C14H13O8- (m/zth 309.062). Their MS/MS spectra match that of a previously reported feruloyl malate (7, 48).

Compound 19: H(8-8)H dihexoside

Compound 19 elutes at 5.89 min with m/z 621.218, corresponding to a chemical formula of C30H37O14- (m/zth 621.219). The MS/MS spectrum shows two successive neutral losses of 162.054 Da each, leading to the peaks at m/z 459.165 and m/z 297.111. These neutral losses typically correspond to the successive losses of two dehydrated hexose structures (C6H10O5, theoretical mass of 162.053 Da). The aglycone with m/z 297.111 corresponds to an ion with chemical formula C18H17O4- (m/zth 297.113). Smaller fragments can be logically explained by further fragmentation of the aglycone product ion: ions with m/z 267.090 (neutral loss of formaldehyde, 30.011 Da) and m/z 251.107 (neutral loss of formic acid, 46.005 Da) are typical for phenylpropanoid dimers with a resinol (i.e. 8–8) structure. Furthermore, the ion with m/z 121.027 is indicative of the 2,5X- fragment ion resulting from cleavage of the resinol structure in H(8–8)H (C7H5O2-, m/zth 121.030) (49, 50). In conclusion, the compound was tentatively assigned as H(8–8)H dihexoside.

Compound 20: H(8-O-4)ferulate

Compound 20 elutes at 9.82 min with m/z 359.111, corresponding to a calculated chemical formula of C19H19O7- (m/zth 359.114). Its MS/MS spectrum shows a neutral loss of 48.023 Da (leading to a product ion with m/z 311.088), which is typical for a combined water and formaldehyde loss in 8–O-4 linked oligolignols and similar structures (theoretical mass of 48.021 Da). The product ion with m/z 267.086 can be assigned to an additional CO2 loss (theoretical mass of 43.990 Da) from the m/z 311.088 ion (experimental mass difference is 44.002 Da). A CO2 loss is often observed for precursor ions with a carboxylic acid function as long as the resulting product ion is stabilized by the presence of adjacent double bonds and/or benzene rings (Bandu et al., 2004). The ion with m/z 252.075 results from a •CH3 radical loss (theoretical mass of 15.024 Da) upon homolytic cleavage of a methyl allyl ether or methyl aryl ether function present in the structure of the m/z 267.086 product ion (experimental mass difference is 15.011 Da) (51). This indicates that the compound has (at least) one methoxy function. Further fragments can be explained by the cleavage of the (8–O-4) structure. The product ion with m/z 193.047 can be assigned to the B unit (49, 50, 52). Given further fragments of m/z 178.032, m/z 149.060 and m/z 134.033, the ion with m/z 193.048 can be structurally characterized as ferulate (as they correspond to a loss of a •CH3 radical, CO2 and the combination of both, respectively). The ions with m/z 165.033 and m/z 135.042 can be assigned to the A unit and the A unit after formaldehyde loss, respectively (49, 50, 52). Given the calculated chemical formula of the A unit, which is C9H9O3- (m/zth 165.056), this unit must be derived from p-coumaryl alcohol. Taken together, the compound is annotated as H(8–O-4)ferulate. As extra evidence for this putative identity, the spectrum is similar to that of G(8–O-4)ferulate, a known Arabidopsis metabolite (7, 19).

Compound 21: H(8-O-4)S(8-5)G

Compound 21 elutes at 14.25 min with m/z 553.207, corresponding to a chemical formula of C30H33O10- (m/zth 553.208). The molecular structure can be elucidated via MS-based lignin sequencing as described in (49, 50). The MS/MS spectrum shows a neutral loss of 48.021 Da, which is typical for a combined water and formaldehyde loss in 8-O-4 linked oligolignols (theoretical mass of 48.021 Da). The product ion with m/z 135.044 suggests that an H unit is the phenolic 8-O-4 end-unit (135.044 corresponds to C9H9O3- minus CH2O, m/zth 135.045). Further fragments with m/z 369.133, m/z 357.141, m/z 177.053 and m/z 162.032 can be assigned to an S(8-5)G moiety (49, 50). Therefore, this compound is tentatively assigned as H(t8-O-4)S(8-5)G. The MS/MS spectrum has been published before (53).

Compounds 28 and 29: G(8-O-4)S(8-5)G

The compounds eluting at 14.54 min and 15.32 min with m/z 583.217 could be assigned to the trilignols G(t8-O-4)S(8-5)G and G(e8-O-4)S(8-5)G based on the MS/MS data as described in detail in (49, 50). These trilignols are known metabolites of Arabidopsis (7).
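Much of the reasoning above consists of matching an observed mass difference between precursor and product ion to a known neutral loss. The following sketch shows that matching step for a few of the losses named in the text. The lookup table is limited to those losses, and the 0.005 Da tolerance and the function name are assumptions, not settings reported by the authors.

```python
# Theoretical neutral-loss masses (Da) as quoted in the text.
NEUTRAL_LOSSES = {
    "dehydrated hexose (C6H10O5)": 162.053,
    "hexose (C6H12O6)": 180.063,
    "dehydrated shikimic acid (C7H8O4)": 156.042,
    "dehydrated rhamnose": 146.058,
    "CO2": 43.990,
}

def annotate_loss(precursor_mz, product_mz, tolerance=0.005):
    """Return the name of the closest known neutral loss, or None."""
    observed = precursor_mz - product_mz
    hits = [
        (abs(observed - mass), name)
        for name, mass in NEUTRAL_LOSSES.items()
        if abs(observed - mass) <= tolerance
    ]
    return min(hits)[1] if hits else None

# Transitions taken from the compound descriptions above.
loss_1 = annotate_loss(335.075, 179.033)   # compounds 1 and 2
loss_6 = annotate_loss(481.134, 335.075)   # compounds 6 and 7
loss_5 = annotate_loss(539.139, 335.075)   # compound 5, unresolved 204 Da loss
```

A loss that matches no table entry, such as the 204.064 Da loss of compound 5, is returned as `None` and has to be interpreted manually.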


Fig. S9.

Feeding CSE with the cse methanol-soluble phenolics. Compounds that accumulated in cse mutants (Fig. S7) were compared between the CSE-treated plate and the control plate (with boiled CSE). For three compounds (caffeoyl shikimate, caffeoyl hexose and feruloyl hexose) a reduction in abundance was accompanied by the appearance of a compound that was below the detection limit in the control plate (caffeate in the cases of caffeoyl shikimate and caffeoyl hexose, and ferulate in the case of feruloyl hexose). Caffeoyl shikimate was integrated as m/z 335.075 eluting at 1.50 min, caffeate as m/z 179.033 eluting at 1.17 min, caffeoyl hexose as m/z 341.087 eluting at 0.81 min, feruloyl hexose as m/z 355.101 at 1.3 min and ferulate as m/z 193.051 eluting at 1.94 min. All peaks were integrated by use of the MassLynx software. The percentage given in each graph is the residual peak area of the substrate, relative to that in the control. For data underlying this analysis see Table S4 and Table S5.


Fig. S10.

Three-dimensional view of CSE (Tyr31 to Asp206) with caffeoyl shikimate in its binding pocket. (A) Surface view of CSE showing the binding site cavity with caffeoyl shikimate represented in sticks. (B) Surface view of CSE with caffeoyl shikimate in the binding site as in panel A. Part of the loop that covers the binding pocket (Asn207 to Val210) is shown as a tube representing the backbone of the amino acids, allowing a better view of the ligand docked within the binding pocket. (C) Ribbon structure representation of CSE docked with caffeoyl shikimate in the binding pocket. The amino acids of the conserved catalytic triad (Asp76, Ser147 and His298 (37)) within the active site are shown in ball-and-stick representation and are in close proximity with the ligand.


Fig. S11.

CSE activity in crude enzyme extracts from lignifying tissues of wild-type and cse-2 inflorescence stems. Samples were enriched in lignifying tissues by removing the outer tissues, mainly cortex and epidermis. Caffeoyl shikimate was fed to the crude extracts and caffeate was measured after 1 hour. The peak area was normalized to protein amount and then to the average of wild-type extracts. No caffeate was detected in the negative control samples with boiled crude WT enzyme extracts (n = [illegible]). *0.05>P>0.01; unpaired two-sided t test.
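The two-step normalisation described in this legend (peak area per protein amount, then relative to the wild-type average) can be written out as follows. The sample names and all numbers are made up for illustration, and the dictionary layout is an assumption.

```python
def normalise_to_wild_type(peak_areas, protein_amounts, wild_type_keys):
    """Normalise peak areas to protein amount, then to the wild-type mean."""
    per_protein = {k: peak_areas[k] / protein_amounts[k] for k in peak_areas}
    wt_mean = sum(per_protein[k] for k in wild_type_keys) / len(wild_type_keys)
    return {k: v / wt_mean for k, v in per_protein.items()}

# Hypothetical caffeate peak areas and protein amounts (arbitrary units).
areas = {"WT_1": 1000.0, "WT_2": 1200.0, "cse-2_1": 150.0}
protein = {"WT_1": 10.0, "WT_2": 10.0, "cse-2_1": 10.0}
relative = normalise_to_wild_type(areas, protein, ["WT_1", "WT_2"])
```

By construction the wild-type samples average to 1, so mutant values read directly as a fraction of wild-type activity.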


Fig. S12.

Catalytic activity of purified recombinant CSE towards caffeoyl shikimate and CoA as substrates. (A) Caffeoyl-CoA is not detected as a product of CSE when caffeoyl shikimate and CoA are given as substrates (1.82 µg CSE, 30 min, 20 °C, 100 mM Tris, pH 7, volume: 40 µL). To demonstrate that caffeoyl-CoA could have been detected had it been a product of this reaction, 10 µM of caffeoyl-CoA added to the same reaction mixture before the reaction was easily detected. 10 µM caffeoyl-CoA would correspond to a 10% conversion of the 100 µM substrates. (B) Caffeate is detected as a product of CSE when caffeoyl shikimate and CoA are given as substrates (1.82 µg CSE, 30 min, 20 °C, 100 mM Tris, pH 7, volume: 40 µL), but is not detected in the boiled CSE control. Fragments of two caffeoyl shikimate isomers are detected in the chromatograms because 5-O-caffeoyl shikimate undergoes non-enzymatic isomerisation in water. For (A) and (B), representative chromatograms of three repeats of each condition are shown.


Fig. S13.

Catalytic activity of HCT. The HCT-catalysed reaction of caffeoyl shikimate and CoA to caffeoyl-CoA and shikimate was originally suggested based on a small peak of caffeoyl-CoA that appeared when caffeoyl quinate was incubated with recombinant tobacco HCT and CoA, but no testing of caffeoyl shikimate as a substrate has been reported (16). We therefore re-evaluated the HCT activity using purified recombinant Arabidopsis enzyme and showed that HCT indeed catalysed the reaction of caffeoyl shikimate to caffeoyl-CoA. (A) Incubation of 50 µM p-coumaroyl-CoA and 50 µM shikimate with 71 ng HCT in 40 µl (100 mM Tris, pH 7) for 30 min resulted in 9.7 µM (standard error: 0.4 µM) p-coumaroyl shikimate, which is about a 19% conversion. Because 5-O-p-coumaroyl shikimate (isomer 2) undergoes non-enzymatic isomerisation in water, two additional p-coumaroyl shikimate isomers are detected. The sum of the three isomers was used to calculate the conversion. (B) Incubation of 100 µM caffeoyl shikimate and 100 µM CoA with 71 ng HCT in 40 µl (100 mM Tris, pH 7) for 30 min resulted in 10.5 µM (standard error: 0.8 µM) caffeoyl-CoA which is about a 10% conversion. Representative chromatograms of three repeats of each condition are shown for (A) and (B).
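The conversion percentages quoted in this legend are simply the product concentration divided by the starting substrate concentration. A short check of that arithmetic, using the concentrations from the legend (the function name is illustrative):

```python
def percent_conversion(product_um, substrate_um):
    """Product formed as a percentage of the starting substrate (both in µM)."""
    return 100.0 * product_um / substrate_um

# (A) 9.7 µM p-coumaroyl shikimate formed from 50 µM substrates.
forward = percent_conversion(9.7, 50.0)
# (B) 10.5 µM caffeoyl-CoA formed from 100 µM substrates.
reverse = percent_conversion(10.5, 100.0)
```

These evaluate to about 19% and about 10%, in line with the rounded values given in the legend.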


Fig. S14.

Phylogenetic tree. Protein sequences were selected based on a standard protein BLAST (BLASTP) against the non-redundant protein sequences database of NCBI using CSE (At1g52760) as query. Retained sequences of the BLAST output were renamed for practical reasons. The new code starts with two letters referring to the species name, followed by the accession number of the protein sequence. In some cases three letters were essential to distinguish species (e.g., Ppe and Ppa). For Arabidopsis thaliana sequences, the AGI codes of their corresponding genes were used. Sequences from Malus domestica, Eucalyptus grandis and Panicum virgatum were obtained via Phytozome v9.1. E. grandis and P. virgatum sequence data were produced by the US Department of Energy Joint Genome Institute. The following species were used: Ah, Arachis hypogaea / Al, Arabidopsis lyrata / At, Arabidopsis thaliana / Cs, Cucumis sativus / Eg, Eucalyptus grandis / Fv, Fragaria vesca / Gm, Glycine max / Hv, Hordeum vulgare / Ho, Hyacinthus orientalis / Md, Malus domestica / Mt, Medicago truncatula / Os, Oryza sativa / Pv, Panicum virgatum / Ppa, Physcomitrella patens / Ppe, Prunus persica / Ps, Picea sitchensis / Pt, Populus trichocarpa / Sb, Sorghum bicolor / Sl, Solanum lycopersicum / Sm, Selaginella moellendorffii / Vv, Vitis vinifera / Zm, Zea mays. All sequences were aligned in BioEdit (54) using ClustalW, and the manually curated alignment was used to generate a neighbour-joining phylogenetic tree with TREECON (55). Bootstrap values were calculated for 100 pseudosamples and values above 50 are shown as percentages at nodes in the consensus tree. The following sequences were selected as outgroup: At2g47630, At3g62860, Sl_XP_004236766, Sl_XP_004247416, Pt_XP_002302714, Eg_G02767.1, Eg_H01110. Protein sequences of Arabidopsis thaliana are depicted in green. The red branch identifies the CSE-specific clade, and the grey box depicts protein sequences of eudicotyledons in the CSE-specific clade. Scale bar represents 0.1 expected amino acid residue substitutions per site.
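The renaming scheme and the bootstrap display rule in this legend are simple enough to express in a few lines. The underscore separator is inferred from the outgroup names listed above (e.g. Sl_XP_004236766); both function names are hypothetical, and the bootstrap values in the example are invented.

```python
def sequence_label(species_code, accession):
    """Build a tree label from a species code and a protein accession."""
    return f"{species_code}_{accession}"

def shown_bootstrap(values, threshold=50):
    """Keep support values above the threshold; hide the rest (None)."""
    return [v if v > threshold else None for v in values]

label = sequence_label("Sl", "XP_004236766")
# Invented bootstrap percentages for four nodes.
displayed = shown_bootstrap([100, 50, 73, 12])
```

Note that a value of exactly 50 is hidden here, following the wording "values above 50".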


Table S2. Cell wall and lignin amount and composition. Lignin composition was determined by thioacidolysis. Numbers between brackets are the SEM. *0.05>P>0.01; **0.01>P>0.001; ***0.001>P; unpaired two-sided t test.


Table S1. Genes identified in the co-expression analysis of known lignin biosynthetic genes. The AGI codes are listed for genes identified as being co-expressed with two or more lignin biosynthesis genes by ACT (Arabidopsis Co-expression Tool) (28), CressExpress (29) or in the high-resolution root spatio-temporal (HRRS) expression dataset (30). A number 1 means that the gene was identified in the respective co-expression analysis. See legend of Fig. S1.

Table S3. 967 deisotoped peaks detected in wild type (WT), cse-1 and cse-2 mutants. Peaks are integrated and aligned via TransOmics. Column "compound" is the unique identifier of the peak, resulting from a merge between retention time and m/z value. Column "Anova (p)" is the outcome of statistical analyses as performed in TransOmics on arcsinh-transformed peak areas. In "Isotope Distribution". Peak intensities are given for samples WT_1 to WT_9, cse-1_1 to cse-1_8 and cse-2_1 to cse-2_10.

Table S4. Targeted search for compounds that accumulate in cse mutants as potential substrates for CSE. Data from the screening experiment performed with fractions derived from cse-1 mutant extracts are shown. Peaks are integrated with MarkerLynx. Values are peak areas. Peak areas < 1 are not given for clarity. Compounds that are reduced by more than half in the CSE-treated fractions as compared to the control fractions are considered as hits, and are framed with a box. Hits were manually re-integrated from the same chromatograms (see Table S5). Not all cse-accumulating peaks are detected in the fractions. The fractions are given as columns, named as fr1 to fr96. The searched compounds are in rows, for the plate incubated with boiled (inactivated) CSE enzyme (BE) and for the fractions incubated with active CSE enzyme (E).
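The hit criterion used in this screen (peak area reduced by more than half relative to the boiled-enzyme control) and the residual percentages reported in Fig. S9 can be sketched as below. The function names and the example peak areas are hypothetical.

```python
def residual_percent(treated_area, control_area):
    """Residual substrate peak area as a percentage of the control."""
    return 100.0 * treated_area / control_area

def is_hit(treated_area, control_area):
    """Hit: peak area reduced by more than half relative to the control."""
    return treated_area < 0.5 * control_area

# Hypothetical peak areas for one compound in one fraction:
# E = active CSE enzyme, BE = boiled (inactivated) enzyme.
area_e, area_be = 25.0, 100.0
residual = residual_percent(area_e, area_be)
hit = is_hit(area_e, area_be)
```

A reduction of exactly one half does not count as a hit under this reading of "more than half".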

Table S5. Manual integration in MassLynx of the hits found by automatic integration in MarkerLynx (see Table S4). Values are peak areas.

References

1. J. K. Weng, C. Chapple, The origin and evolution of lignin biosynthesis. New Phytol. 187, 273–285 (2010). doi:10.1111/j.1469-8137.2010.03327.x Medline

2. F. Yang, P. Mitra, L. Zhang, L. Prak, Y. Verhertbruggen, J. S. Kim, L. Sun, K. Zheng, K. Tang, M. Auer, H. V. Scheller, D. Loqué, Engineering secondary cell wall deposition in plants. Plant Biotechnol. J. 11, 325–335 (2013). doi:10.1111/pbi.12016 Medline

3. N. Mitsuda, A. Iwase, H. Yamamoto, M. Yoshida, M. Seki, K. Shinozaki, M. Ohme-Takagi, NAC transcription factors, NST1 and NST3, are key regulators of the formation of secondary walls in woody tissues of Arabidopsis. Plant Cell 19, 270–280 (2007). doi:10.1105/tpc.106.047043 Medline

4. R. Zhong, E. A. Richardson, Z. H. Ye, Two NAC domain transcription factors, SND1 and NST1, function redundantly in regulation of secondary wall synthesis in fibers of Arabidopsis. Planta 225, 1603–1611 (2007). doi:10.1007/s00425-007-0498-y Medline

5. R. Vanholme, K. Morreel, C. Darrah, P. Oyarce, J. H. Grabber, J. Ralph, W. Boerjan, Metabolic engineering of novel lignin in biomass crops. New Phytol. 196, 978–1000 (2012). doi:10.1111/j.1469-8137.2012.04337.x Medline

6. W. Boerjan, J. Ralph, M. Baucher, Lignin biosynthesis. Annu. Rev. Plant Biol. 54, 519–546 (2003). doi:10.1146/annurev.arplant.54.031902.134938 Medline

7. R. Vanholme, V. Storme, B. Vanholme, L. Sundin, J. H. Christensen, G. Goeminne, C. Halpin, A. Rohde, K. Morreel, W. Boerjan, A systems biology view of responses to lignin biosynthesis perturbations in Arabidopsis. Plant Cell 24, 3506–3529 (2012). doi:10.1105/tpc.112.102574 Medline

8. F. Chen, M. S. Srinivasa Reddy, S. Temple, L. Jackson, G. Shadle, R. A. Dixon, Multi-site genetic modulation of monolignol biosynthesis suggests new routes for formation of syringyl lignin and wall-bound ferulic acid in alfalfa (Medicago sativa L.). Plant J. 48, 113–124 (2006). doi:10.1111/j.1365-313X.2006.02857.x Medline

9. J. M. Humphreys, C. Chapple, Rewriting the lignin roadmap. Curr. Opin. Plant Biol. 5, 224–229 (2002). doi:10.1016/S1369-5266(02)00257-1 Medline

10. N. D. Bonawitz, C. Chapple, The genetics of lignin biosynthesis: Connecting genotype to phenotype. Annu. Rev. Genet. 44, 337–363 (2010). doi:10.1146/annurev-genet-102209-163508 Medline

11. K. Freudenberg, A. C. Neish, Constitution and Biosynthesis of Lignin, A. Kleinzeller, G. F. Springer, W. H. G., Eds., Molecular Biology, Biochemistry and Biophysics (Springer-Verlag, New York, 1968), vol. 2, pp. 1–129.

12. J. Ralph, K. Lundquist, G. Brunow, F. Lu, H. Kim, P. F. Schatz, J. M. Marita, R. D. Hatfield, S. A. Ralph, J. H. Christensen, W. Boerjan, Lignins: Natural polymers from oxidative coupling of 4-hydroxyphenylpropanoids. Phytochem. Rev. 3, 29–60 (2004). doi:10.1023/B:PHYT.0000047809.65444.a4

13. G. Schoch, S. Goepfert, M. Morant, A. Hehn, D. Meyer, P. Ullmann, D. Werck-Reichhart, CYP98A3 from Arabidopsis thaliana is a 3′-hydroxylase of phenolic esters, a missing link in the phenylpropanoid pathway. J. Biol. Chem. 276, 36566–36574 (2001). doi:10.1074/jbc.M104047200 Medline

14. R. Franke, M. R. Hemm, J. W. Denault, M. O. Ruegger, J. M. Humphreys, C. Chapple, Changes in secondary metabolism and deposition of an unusual lignin in the ref8 mutant of Arabidopsis. Plant J. 30, 47–59 (2002). doi:10.1046/j.1365-313X.2002.01267.x Medline

15. R. Franke, J. M. Humphreys, M. R. Hemm, J. W. Denault, M. O. Ruegger, J. C. Cusumano, C. Chapple, The Arabidopsis REF8 gene encodes the 3-hydroxylase of phenylpropanoid metabolism. Plant J. 30, 33–45 (2002). doi:10.1046/j.1365-313X.2002.01266.x Medline

16. L. Hoffmann, S. Maury, F. Martz, P. Geoffroy, M. Legrand, Purification, cloning, and properties of an acyltransferase controlling shikimate and quinate ester intermediates in phenylpropanoid metabolism. J. Biol. Chem. 278, 95–103 (2003). doi:10.1074/jbc.M209362200 Medline

17. W. Gao, H. Y. Li, S. Xiao, M. L. Chye, Acyl-CoA-binding protein 2 binds lysophospholipase 2 and lysoPC to promote tolerance to cadmium-induced oxidative stress in transgenic Arabidopsis. Plant J. 62, 989–1003 (2010). Medline

18. S. R. Turner, C. R. Somerville, Collapsed xylem phenotype of Arabidopsis identifies mutants deficient in cellulose deposition in the secondary cell wall. Plant Cell 9, 689–701 (1997). Medline

19. X. Li, N. D. Bonawitz, J.-K. Weng, C. Chapple, The growth reduction associated with repressed lignin biosynthesis in Arabidopsis thaliana is independent of flavonoids. Plant Cell 22, 1620–1632 (2010). doi:10.1105/tpc.110.074161 Medline

20. S. Besseau, L. Hoffmann, P. Geoffroy, C. Lapierre, B. Pollet, M. Legrand, Flavonoid accumulation in Arabidopsis repressed in lignin synthesis affects auxin transport and plant growth. Plant Cell 19, 148–162 (2007). doi:10.1105/tpc.106.044495 Medline

21. H. D. Coleman, J.-Y. Park, R. Nair, C. Chapple, S. D. Mansfield, RNAi-mediated suppression of p-coumaroyl-CoA 3′-hydroxylase in hybrid poplar impacts lignin deposition and soluble secondary metabolism. Proc. Natl. Acad. Sci. U.S.A. 105, 4501–4506 (2008). doi:10.1073/pnas.0706537105 Medline

22. J. Ehlting, D. Büttner, Q. Wang, C. J. Douglas, I. E. Somssich, E. Kombrink, Three 4-coumarate:coenzyme A ligases in Arabidopsis thaliana represent two evolutionarily divergent classes in angiosperms. Plant J. 19, 9–20 (1999). doi:10.1046/j.1365-313X.1999.00491.x Medline

23. J. Raes, A. Rohde, J. H. Christensen, Y. Van de Peer, W. Boerjan, Genome-wide characterization of the lignification toolbox in Arabidopsis. Plant Physiol. 133, 1051–1071 (2003). doi:10.1104/pp.103.026484 Medline

24. H.-C. Chen, Q. Li, C. M. Shuford, J. Liu, D. C. Muddiman, R. R. Sederoff, V. L. Chiang, Membrane protein complexes catalyze both 4- and 3-hydroxylation of cinnamic acid derivatives in monolignol biosynthesis. Proc. Natl. Acad. Sci. U.S.A. 108, 21253–21258 (2011). doi:10.1073/pnas.1116416109 Medline

25. F. Chen, R. A. Dixon, Lignin modification improves fermentable sugar yields for biofuel production. Nat. Biotechnol. 25, 759–761 (2007). doi:10.1038/nbt1316 Medline

26. R. Van Acker et al., Lignin biosynthesis perturbations affect secondary cell wall composition and saccharification yield in Arabidopsis thaliana. Biotechnology for Biofuels (2013).

27. R. Nilsson, K. Bernfur, N. Gustavsson, J. Bygdell, G. Wingsle, C. Larsson, Proteomics of plasma membranes from poplar trees reveals tissue distribution of transporters, receptors, and proteins in cell wall formation. Mol. Cell. Proteomics 9, 368–387 (2010). doi:10.1074/mcp.M900289-MCP200 Medline

28. I. W. Manfield, C. H. Jen, J. W. Pinney, I. Michalopoulos, J. R. Bradford, P. M. Gilmartin, D. R. Westhead, Arabidopsis Co-expression Tool (ACT): Web server tools for microarray-based gene expression analysis. Nucleic Acids Res. 34, W504–W509 (2006). doi:10.1093/nar/gkl204 Medline

29. V. Srinivasasainagendra, G. P. Page, T. Mehta, I. Coulibaly, A. E. Loraine, CressExpress: A tool for large-scale mining of expression data from Arabidopsis. Plant Physiol. 147, 1004–1016 (2008). doi:10.1104/pp.107.115535 Medline

30. S. M. Brady, D. A. Orlando, J. Y. Lee, J. Y. Wang, J. Koch, J. R. Dinneny, D. Mace, U. Ohler, P. N. Benfey, A high-resolution root spatiotemporal map reveals dominant expression patterns. Science 318, 801–806 (2007). doi:10.1126/science.1146265 Medline

31. D. M. Brown, L. A. Zeef, J. Ellis, R. Goodacre, S. R. Turner, Identification of novel genes in Arabidopsis involved in secondary cell wall formation using expression profiling and reverse genetics. Plant Cell 17, 2281–2295 (2005). doi:10.1105/tpc.105.031542 Medline

32. J. M. Alonso, A. N. Stepanova, T. J. Leisse, C. J. Kim, H. Chen, P. Shinn, D. K. Stevenson, J. Zimmerman, P. Barajas, R. Cheuk, C. Gadrinab, C. Heller, A. Jeske, E. Koesema, C. C. Meyers, H. Parker, L. Prednis, Y. Ansari, N. Choy, H. Deen, M. Geralt, N. Hazari, E. Hom, M. Karnes, C. Mulholland, R. Ndubaku, I. Schmidt, P. Guzman, L. Aguilar-Henonin, M. Schmid, D. Weigel, D. E. Carter, T. Marchand, E. Risseeuw, D. Brogden, A. Zeko, W. L. Crosby, C. C. Berry, J. R. Ecker, Genome-wide insertional mutagenesis of Arabidopsis thaliana. Science 301, 653–657 (2003). doi:10.1126/science.1086391 Medline

33. T. L. Shimada, T. Shimada, I. Hara-Nishimura, A rapid and non-destructive screenable marker, FAST, for identifying transformed seeds of Arabidopsis thaliana. Plant J. 61, 519–528 (2010). doi:10.1111/j.1365-313X.2009.04060.x Medline

34. M. Karimi, A. Bleys, R. Vanderhaeghen, P. Hilson, Building blocks for plant gene assembly. Plant Physiol. 145, 1183–1191 (2007). doi:10.1104/pp.107.110411 Medline

35. L. A. Kelley, M. J. Sternberg, Protein structure prediction on the Web: A case study using the Phyre server. Nat. Protoc. 4, 363–371 (2009). doi:10.1038/nprot.2009.2 Medline

36. P. Benkert, M. Künzli, T. Schwede, QMEAN server for protein model quality estimation. Nucleic Acids Res. 37, W510–W514 (2009). doi:10.1093/nar/gkp322 Medline

37. L. C. Lee, Y. L. Lee, R. J. Leu, J. F. Shaw, Functional role of catalytic triad and oxyanion hole-forming residues on enzyme activity of Escherichia coli thioesterase I/protease I/phospholipase L1. Biochem. J. 397, 69–76 (2006). doi:10.1042/BJ20051645 Medline

38. A. Rohde, K. Morreel, J. Ralph, G. Goeminne, V. Hostyn, R. De Rycke, S. Kushnir, J. Van Doorsselaere, J. P. Joseleau, M. Vuylsteke, G. Van Driessche, J. Van Beeumen, E. Messens, W. Boerjan, Molecular phenotyping of the pal1 and pal2 mutants of Arabidopsis thaliana reveals far-reaching consequences on phenylpropanoid, amino acid, and carbohydrate metabolism. Plant Cell 16, 2749–2771 (2004). doi:10.1105/tpc.104.023705 Medline

39. C.-H. Jen, I. W. Manfield, I. Michalopoulos, J. W. Pinney, W. G. Willats, P. M. Gilmartin, D. R. Westhead, The Arabidopsis co-expression tool (ACT): A WWW-based tool and database for microarray-based gene expression analysis. Plant J. 46, 336–348 (2006). doi:10.1111/j.1365-313X.2006.02681.x Medline

40. E. Krissinel, K. Henrick, Secondary-structure matching (SSM), a new tool for fast protein structure alignment in three dimensions. Acta Crystallogr. D Biol. Crystallogr. 60, 2256–2268 (2004). doi:10.1107/S0907444904026460 Medline

41. M. L. Bandu, K. R. Watkins, M. L. Bretthauer, C. A. Moore, H. Desaire, Prediction of MS/MS data. 1. A focus on pharmaceuticals containing carboxylic acids. Anal. Chem. 76, 1746–1753 (2004). doi:10.1021/ac0353785 Medline

42. L. Debrauwer, A. Paris, D. Rao, F. Fournier, J.-C. Tabet, Mass spectrometric studies on 17β-estradiol-17-fatty acid esters: Evidence for the formation of anion-dipole intermediates. Org. Mass Spectrom. 27, 709–719 (1992). doi:10.1002/oms.1210270612

43. F. Fournier, M.-C. Perlat, J.-C. Tabet, Control of internal proton transfers on ion-dipole complexes from [M-H]⁻ ions of diphenol esters. Rapid Commun. Mass Spectrom. 9, 13–17 (1995). doi:10.1002/rcm.1290090105

44. F. Fournier, B. Remaud, T. Blasco, J. Tabet, Ion-dipole complex formation from deprotonated phenol fatty acid esters evidenced by using gas-phase labeling combined with tandem mass spectrometry. J. Am. Soc. Mass Spectrom. 4, 343–351 (1993). doi:10.1016/1044-0305(93)85057-5

45. R. Dauwe, K. Morreel, G. Goeminne, B. Gielen, A. Rohde, J. Van Beeumen, J. Ralph, A. M. Boudet, J. Kopka, S. F. Rochange, C. Halpin, E. Messens, W. Boerjan, Molecular phenotyping of lignin-modified tobacco reveals associated changes in cell-wall metabolism, primary metabolism, stress metabolism and photorespiration. Plant J. 52, 263–285 (2007). doi:10.1111/j.1365-313X.2007.03233.x Medline

46. R. Vanholme, J. Ralph, T. Akiyama, F. Lu, J. R. Pazo, H. Kim, J. H. Christensen, B. Van Reusel, V. Storme, R. De Rycke, A. Rohde, K. Morreel, W. Boerjan, Engineering traditional monolignols out of lignin by concomitant up-regulation of F5H1 and down-regulation of COMT in Arabidopsis. Plant J. 64, 885–897 (2010). doi:10.1111/j.1365-313X.2010.04353.x Medline

47. H. Meyermans, K. Morreel, C. Lapierre, B. Pollet, A. De Bruyn, R. Busson, P. Herdewijn, B. Devreese, J. Van Beeumen, J. M. Marita, J. Ralph, C. Chen, B. Burggraeve, M. Van Montagu, E. Messens, W. Boerjan, Modifications in lignin and accumulation of phenolic glucosides in poplar xylem upon down-regulation of caffeoyl-coenzyme A O-methyltransferase, an enzyme involved in lignin biosynthesis. J. Biol. Chem. 275, 36899–36909 (2000). doi:10.1074/jbc.M006915200 Medline

48. M. Mir Derikvand, J. B. Sierra, K. Ruel, B. Pollet, C. T. Do, J. Thévenin, D. Buffard, L. Jouanin, C. Lapierre, Redirection of the phenylpropanoid pathway to feruloyl malate in Arabidopsis mutants deficient for cinnamoyl-CoA reductase 1. Planta 227, 943–956 (2008). doi:10.1007/s00425-007-0669-x Medline

49. K. Morreel, O. Dima, H. Kim, F. Lu, C. Niculaes, R. Vanholme, R. Dauwe, G. Goeminne, D. Inzé, E. Messens, J. Ralph, W. Boerjan, Mass spectrometry-based sequencing of lignin oligomers. Plant Physiol. 153, 1464–1478 (2010). doi:10.1104/pp.110.156489 Medline

50. K. Morreel, H. Kim, F. Lu, O. Dima, T. Akiyama, R. Vanholme, C. Niculaes, G. Goeminne, D. Inzé, E. Messens, J. Ralph, W. Boerjan, Mass spectrometry-based fragmentation as an identification tool in lignomics. Anal. Chem. 82, 8095–8105 (2010). doi:10.1021/ac100968g Medline

51. J. H. Bowie, The fragmentations of even-electron organic negative ions. Mass Spectrom. Rev. 9, 349–379 (1990). doi:10.1002/mas.1280090305

52. K. Morreel, J. Ralph, H. Kim, F. Lu, G. Goeminne, S. Ralph, E. Messens, W. Boerjan, Profiling of oligolignols reveals monolignol coupling conditions in lignifying poplar xylem. Plant Physiol. 136, 3537–3549 (2004). doi:10.1104/pp.104.049304 Medline

53. B. Vanholme, I. Cesarino, G. Goeminne, H. Kim, F. Marroni, R. Van Acker, R. Vanholme, K. Morreel, B. Ivens, S. Pinosio, M. Morgante, J. Ralph, C. Bastien, W. Boerjan, Breeding with rare defective alleles (BRDA): A natural Populus nigra HCT mutant with modified lignin as a case study. New Phytol. 198, 765–776 (2013). doi:10.1111/nph.12179 Medline

54. T. A. Hall, BioEdit: A user-friendly biological sequence alignment editor and analysis program for Windows 95/98/NT. Nucleic Acids Symp. Ser. 41, 95–98 (1999).

55. Y. Van de Peer, R. De Wachter, TREECON for Windows: A software package for the construction and drawing of evolutionary trees for the Microsoft Windows environment. CABIOS 10, 569–570 (1994). Medline