supporting information - pnas · 2/19/2013  · cv. white crystal, a glutinous maize cultivar, were...

11
Supporting Information Liu et al. 10.1073/pnas.1301009110 SI Text Plant Growth Conditions and Sample Collection. Seeds of Zea mays cv. White Crystal, a glutinous maize cultivar, were purchased from a local supplier and stored at room temperature. For ger- mination, seeds were imbibed in distilled water at 6:00 PM by rst shaking for 10 min at 200 rpm and then germinating on wet lter paper on Petri dishes in the greenhouse under natural sunlight conditions (approximately 6:00 AM to approximately 7:00 PM) during the summer. The rst set of seeds was imbibed on 6/2/2011 and leaf tissue for RNA preparation was taken every 6 h for 2 d [up to the 48th hour (T48)]. A second set of seeds was imbibed at 6:00 PM on 7/24/2011 and leaf tissue for RNA prepa- ration was taken at T54, T60, T66, and T72. The epicotyls were collected, and then coleoptiles were removed before the embryonic leaves were isolated. These isolated leaves were then stored in liquid nitrogen. The whole procedure was done in an hour. The maximum photosynthetic photon ux density around noon was 1,600 μmol· m 2 · s 1 , and the day/night temperatures were 3035 °C/ 2528 °C, with a relative humidity of 6070%. Samples of the embryonic leaves and the shoot apical meristem (SAM) were collected from dry seeds (T00) and germinating seeds every 6 h up to 72 h for leaf anatomic examination and for RNA isolation and transcriptomic analysis. For RNA isolation, 200 plumules were dissected from the seeds and after removing the sheath tissue, the entire embryonic leaves and the SAM were stored in liquid N 2 until RNA extraction. Our tissue samples for tran- scriptomic analysis contained all of the four to ve embryonic leaves in the maize seeds. As the rst leaf is the largest, it should have the largest contribution to the transcriptomes. However, as new kranz structures (KSs) develop in these leaves, the tran- scriptomes are useful for hypothesizing the regulatory genes in- volved in KS development. The samples also contained SAM, but because SAM is small, its contribution to the transcriptomes may not be signicant. Anatomical Studies. For anatomical examination, samples of the embryonic leaves and SAM were xed in 2.5% (wt/vol) glutar- aldehyde and postxed in 1% (wt/vol) OsO4, both in 0.1 M so- dium phosphate buffer (1). After dehydration through an ethanol series, the samples were inltrated and embedded in Spurrs resin. Semithin cross-sections (0.9 μm) were cut from the base of embryonic leaves with an Ultracut E Microtome (Reichert-Jung) and stained with 0.1% (wt/vol) toluidine blue and 0.1% (wt/vol) borax for observation with a light microscope (Olympus). RNA Extraction and Sequencing. Total RNA was extracted using TRIzol reagent (Invitrogen), and 1 μL TURBO DNase (Ambion) per 10 μg RNA was added and incubated for 30 min at 37 °C to remove traces of contaminating DNA; and RNA was sub- sequently puried by the phenol:chloroform procedure. The RNAs were quantied using Qubit RNA Pico reagent (In- vitrogen) and assessed for purity and quality using NanoDrop (Thermo Scientic) and BioAnalyzer (Agilent). RNA-seq li- braries were prepared using the Illumina Standard mRNA-seq library preparation kit (Illumina) based on the manufactures protocol with the following modications. Briey, 15 μg of total RNA was subjected to two rounds of oligo-dT bead purication, fragmented for 2.75 min, and primed for rst strand cDNA using 0.8× amount of random primer. After double strand cDNA synthesis and adaptor ligation, each sample was fractionated on 2% (wt/vol) Low Range Ultra Agarose (Bio-Rad) gels, from which the libraries of two different size ranges (300 bp and 400 bp) were excised. The puried libraries were then in- dependently amplied by 15 cycles of PCR and cleaned up using AMPure XP beads (Beckman Agencourt). RNA-seq libraries were subjected to Illumina sequencing by paired-end 2*101 nt (both ends with 101 nucleotides long) on a HiSeq2000 at the NGS Core Facility of Biodiversity Research Center in Academia Sinica, Taiwan. A total of 195292 million pairs of raw reads was obtained from the two libraries (two lanes) for each time point sample; each pair of reads was counted as two reads. For each time point, 390584 million raw reads and 240351 million mappable reads were obtained. Data Processing and Analysis. Low-quality bases and reads were removed by three criteria: (i ) the consecutive bases from the end of a read with a default low-quality score of 2 [phred score of 2 or Q2 (2)], (ii ) the bases from the beginning of a read until all of the scores of the rst 20 remaining bases were at least Q20 (the base call error rate of 1%), and (iii ) the trimmed reads with less than 60 remaining bases. (Phred score is a general metric for the accuracy of a sequencing platform (2). The Q2 indicator does not give a specic error rate, but rather indicates a specic portion of the read that should not be used in further analyses.) Each pair of reads was treated as two single reads. Then, all alignment and quantication processes followed the alternative protocol Bin ref. 3). The processed reads were mapped to the maize genome (ZmB73_RefGen_v2) and the working gene set (ZmB73_5a_WGS, http://ftp.maizesequence.org/), using Tophat version 1.3.3 (http://tophat.cbcb.umd.edu/) and its embedded aligner Bowtie version 0.12.7 (http://bowtie-bio.sourceforge.net). Each read was aligned by the -npolicy, and at most 10 hits were allowed. The expression level of each working gene set (WGS) gene was estimated in the reads per kilobase per million (RPKM) (4), using Cufinks version 2.0.2 (http://cufinks.cbcb.umd.edu/). We selected only genes with RPKM 1 in two or more time points for further analysis. To compare the expression levels of the selected genes across all time points, we applied the upper quartile normalization procedure (5), which reduces the bias in RPKM values due to highly expressed genes. We used the tran- scriptome at T42 as the reference because its RPKMs were most evenly distributed among the 13 transcriptomes. The RPKM values at a time point were scaled by a constant factor to make the 75th percentile of the RPKM values for all expressed genes at that time point equal to the one at T42. Expression Prole Correlation and Clustering. The clustering tool gCLUTO (Graphical Clustering Toolkit) was downloaded from http://glaros.dtc.umn.edu/gkhome/cluto/gcluto/overview. It in- cludes several hierarchical clustering methods and we selected the repeated bisection method, which is a top-down hierarchical clustering algorithm that separates samples with the largest dis- tance. We calculated the Pearson correlation coefcient (PCC) between each pair of expression proles of the genes under study and used the PCC values to cluster the expression proles. A scalar quantity, called the gure of merit (FOM) (6), was used to assess the quality of the clustering algorithm and to determine the optimal number of clusters. To determine the functional enrichment for each cluster, we used the MapMan functional annotation (7). The annotated en- tries with the functional category name of not assigned.un- knownwere excluded in this study. We then used Fishers exact test to examine whether a function was signicantly over- or un- Liu et al. www.pnas.org/cgi/content/short/1301009110 1 of 11

Upload: others

Post on 06-Aug-2020

1 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Supporting Information - PNAS · 2/19/2013  · cv. White Crystal, a glutinous maize cultivar, were purchased from a local supplier and stored at room temperature. For ger-mination,

Supporting InformationLiu et al. 10.1073/pnas.1301009110SI TextPlant Growth Conditions and Sample Collection. Seeds of Zea mayscv. White Crystal, a glutinous maize cultivar, were purchasedfrom a local supplier and stored at room temperature. For ger-mination, seeds were imbibed in distilled water at 6:00 PM byfirst shaking for 10 min at 200 rpm and then germinating on wetfilter paper on Petri dishes in the greenhouse under naturalsunlight conditions (approximately 6:00 AM to approximately7:00 PM) during the summer. The first set of seeds was imbibedon 6/2/2011 and leaf tissue for RNA preparation was taken every6 h for 2 d [up to the 48th hour (T48)]. A second set of seeds wasimbibed at 6:00 PM on 7/24/2011 and leaf tissue for RNA prepa-ration was taken at T54, T60, T66, and T72. The epicotyls werecollected, and then coleoptiles were removed before the embryonicleaves were isolated. These isolated leaves were then stored inliquid nitrogen. The whole procedure was done in an hour. Themaximum photosynthetic photon flux density around noon was∼1,600 μmol·m−2·s−1, and the day/night temperatures were 30–35 °C/25–28 °C, with a relative humidity of 60–70%. Samples of theembryonic leaves and the shoot apical meristem (SAM) werecollected from dry seeds (T00) and germinating seeds every 6 hup to 72 h for leaf anatomic examination and for RNA isolationand transcriptomic analysis. For RNA isolation, ∼200 plumuleswere dissected from the seeds and after removing the sheathtissue, the entire embryonic leaves and the SAM were stored inliquid N2 until RNA extraction. Our tissue samples for tran-scriptomic analysis contained all of the four to five embryonicleaves in the maize seeds. As the first leaf is the largest, it shouldhave the largest contribution to the transcriptomes. However, asnew kranz structures (KSs) develop in these leaves, the tran-scriptomes are useful for hypothesizing the regulatory genes in-volved in KS development. The samples also contained SAM,but because SAM is small, its contribution to the transcriptomesmay not be significant.

Anatomical Studies. For anatomical examination, samples of theembryonic leaves and SAM were fixed in 2.5% (wt/vol) glutar-aldehyde and postfixed in 1% (wt/vol) OsO4, both in 0.1 M so-dium phosphate buffer (1). After dehydration through an ethanolseries, the samples were infiltrated and embedded in Spurr’sresin. Semithin cross-sections (0.9 μm) were cut from the base ofembryonic leaves with an Ultracut E Microtome (Reichert-Jung)and stained with 0.1% (wt/vol) toluidine blue and 0.1% (wt/vol)borax for observation with a light microscope (Olympus).

RNA Extraction and Sequencing. Total RNA was extracted usingTRIzol reagent (Invitrogen), and 1 μL TURBODNase (Ambion)per 10 μg RNA was added and incubated for 30 min at 37 °Cto remove traces of contaminating DNA; and RNA was sub-sequently purified by the phenol:chloroform procedure. TheRNAs were quantified using Qubit RNA Pico reagent (In-vitrogen) and assessed for purity and quality using NanoDrop(Thermo Scientific) and BioAnalyzer (Agilent). RNA-seq li-braries were prepared using the Illumina Standard mRNA-seqlibrary preparation kit (Illumina) based on the manufacture’sprotocol with the following modifications. Briefly, 15 μg of totalRNA was subjected to two rounds of oligo-dT bead purification,fragmented for 2.75 min, and primed for first strand cDNA using0.8× amount of random primer. After double strand cDNAsynthesis and adaptor ligation, each sample was fractionated on2% (wt/vol) Low Range Ultra Agarose (Bio-Rad) gels, fromwhich the libraries of two different size ranges (∼300 bp and

∼400 bp) were excised. The purified libraries were then in-dependently amplified by 15 cycles of PCR and cleaned up usingAMPure XP beads (Beckman Agencourt). RNA-seq librarieswere subjected to Illumina sequencing by paired-end 2*101 nt(both ends with 101 nucleotides long) on a HiSeq2000 at theNGS Core Facility of Biodiversity Research Center in AcademiaSinica, Taiwan. A total of 195–292 million pairs of raw reads wasobtained from the two libraries (two lanes) for each time pointsample; each pair of reads was counted as two reads. For eachtime point, ∼390–584 million raw reads and 240–351 millionmappable reads were obtained.

Data Processing and Analysis. Low-quality bases and reads wereremoved by three criteria: (i) the consecutive bases from the endof a read with a default low-quality score of 2 [phred score of2 or Q2 (2)], (ii) the bases from the beginning of a read until all ofthe scores of the first 20 remaining bases were at least Q20 (thebase call error rate of ∼1%), and (iii) the trimmed reads with lessthan 60 remaining bases. (Phred score is a general metric for theaccuracy of a sequencing platform (2). The Q2 indicator doesnot give a specific error rate, but rather indicates a specificportion of the read that should not be used in further analyses.)Each pair of reads was treated as two single reads. Then, allalignment and quantification processes followed the “alternativeprotocol B” in ref. 3). The processed reads were mapped to themaize genome (ZmB73_RefGen_v2) and the working gene set(ZmB73_5a_WGS, http://ftp.maizesequence.org/), using Tophatversion 1.3.3 (http://tophat.cbcb.umd.edu/) and its embeddedaligner Bowtie version 0.12.7 (http://bowtie-bio.sourceforge.net).Each read was aligned by the “-n” policy, and at most 10 hits wereallowed. The expression level of each working gene set (WGS)gene was estimated in the reads per kilobase per million (RPKM)(4), using Cufflinks version 2.0.2 (http://cufflinks.cbcb.umd.edu/).We selected only genes with RPKM ≥1 in two or more time

points for further analysis. To compare the expression levels ofthe selected genes across all time points, we applied the upperquartile normalization procedure (5), which reduces the bias inRPKM values due to highly expressed genes. We used the tran-scriptome at T42 as the reference because its RPKMs were mostevenly distributed among the 13 transcriptomes. The RPKMvalues at a time point were scaled by a constant factor to make the75th percentile of the RPKM values for all expressed genes at thattime point equal to the one at T42.

Expression Profile Correlation and Clustering. The clustering toolgCLUTO (Graphical Clustering Toolkit) was downloaded fromhttp://glaros.dtc.umn.edu/gkhome/cluto/gcluto/overview. It in-cludes several hierarchical clustering methods and we selectedthe repeated bisection method, which is a top-down hierarchicalclustering algorithm that separates samples with the largest dis-tance. We calculated the Pearson correlation coefficient (PCC)between each pair of expression profiles of the genes under studyand used the PCC values to cluster the expression profiles. Ascalar quantity, called the figure of merit (FOM) (6), was used toassess the quality of the clustering algorithm and to determinethe optimal number of clusters.To determine the functional enrichment for each cluster, we

used the MapMan functional annotation (7). The annotated en-tries with the functional category name of “not assigned.un-known” were excluded in this study. We then used Fisher’s exacttest to examine whether a function was significantly over- or un-

Liu et al. www.pnas.org/cgi/content/short/1301009110 1 of 11

Page 2: Supporting Information - PNAS · 2/19/2013  · cv. White Crystal, a glutinous maize cultivar, were purchased from a local supplier and stored at room temperature. For ger-mination,

derrepresented in a selected cluster of genes against the set of allexpressed genes with MapMan annotations.

Identification of Differentially Expressed Genes. We used the non-parametric method of Tarazona et al. (8) to identify differentiallyexpressed genes (DEGs) between two samples. Here, we set the qvalue (differentially expression probability) in themethod to be 0.7,because in our dataset genes with q > 0.7 have at least a 2.3-foldchange in RPKM between the two samples. The test is stringent.For example, the number of DEGs between T06 and T12 was only39, and because 30,255 genes (tests) were used, the probability ofpassing a test was only ∼39/30,000 = 0.0013, which is much smallerthan 0.01 (the 1% significance level). Thus, the test is conservativeand can be used to identify DEGs between two time points.

The above analysis was between two time points. To find allgenes differentially expressed during leaf development, we havedeveloped the followingmethod tomake all pairwise comparisonsbetween time points, which takes care of the problem of multipletesting. Suppose the probability of “success” in a single test is pand a total of N tests are made. We use the binomial expansionto calculate the probability of at least X successes in N tests. Inthe above, we have seen that p is only ∼0.0013. To be conser-vative, let us take p = 0.005. Among the 13 time points there are78 possible pairwise comparisons, that is n = 78. Then theprobability to have at least X = 4 “successes” is only ∼0.0001.Under this criterion, we find that 13,907 (46%) of the 30,255expressed genes and 590 (47%) of the 1,238 expressed TF genesare differentially expressed during the 72-h time course.

1. Sheue CR, et al. (2007) Bizonoplast, a unique chloroplast in the epidermal cells ofmicrophylls in the shade plant Selaginella erythropus (Selaginellaceae). Am J Bot94(12):1922–1929.

2. Ewing B, Hillier L, Wendl MC, Green P (1998) Base-calling of automated sequencertraces using phred. I. Accuracy assessment. Genome Res 8(3):175–185.

3. Trapnell C, et al. (2012) Differential gene and transcript expression analysis of RNA-seqexperiments with TopHat and Cufflinks. Nat Protoc 7(3):562–578.

4. Mortazavi A, Williams BA, McCue K, Schaeffer L, Wold B (2008) Mapping andquantifying mammalian transcriptomes by RNA-Seq. Nat Methods 5(7):621–628.

5. Bullard JH, Purdom E, Hansen KD, Dudoit S (2010) Evaluation of statistical methods fornormalization and differential expression in mRNA-Seq experiments. BMCBioinformatics 11:94.

6. Yeung KY, Haynor DR, RuzzoWL (2001) Validating clustering for gene expression data.Bioinformatics 17(4):309–318.

7. Thimm O, et al. (2004) MAPMAN: A user-driven tool to display genomics data sets ontodiagrams of metabolic pathways and other biological processes. Plant J 37(6):914–939.

8. Tarazona S, García-Alcalde F, Dopazo J, Ferrer A, Conesa A (2011) Differentialexpression in RNA-seq: A matter of depth. Genome Res 21(12):2213–2223.

Fig. S1. Comparison between mature leaf and embryonic leaf time course transcriptomes. Pearson’s correlation coefficients (PCCs) between the four matureleaf transcriptomes of Li et al. (1), as indicated in the first column, and our time course transcriptomes, as indicated in the first row.

1. Li P, et al. (2010) The developmental dynamics of the maize leaf transcriptome. Nat Genet 42(12):1060–1067.

Liu et al. www.pnas.org/cgi/content/short/1301009110 2 of 11

Page 3: Supporting Information - PNAS · 2/19/2013  · cv. White Crystal, a glutinous maize cultivar, were purchased from a local supplier and stored at room temperature. For ger-mination,

TCA / org. transforma�on.TCAmitochondrial electron transport / ATP synthesis.cytochrome c reductaseamino acid metabolismamino acid metabolism.synthesisCo-factor and vitamine metabolismtetrapyrrole synthesisstress.abio�c.heatredoxredox.thioredoxinnucleo�de metabolismnucleo�de metabolism.synthesisnucleo�de metabolism.synthesis.pyrimidinenucleo�de metabolism.synthesis.purineRNA.processingRNA.processing.RNA helicaseRNA.regula�on of transcrip�on.Trihelix, Triple-Helix transcrip�on factor familyRNA.regula�on of transcrip�on.Chroma�n Remodeling FactorsRNA.regula�on of transcrip�on.General Transcrip�onRNA.regula�on of transcrip�on.puta�ve transcrip�on regulatorRNA.regula�on of transcrip�on.unclassifiedRNA.RNA bindingDNA.repairproteinprotein.aa ac�va�onprotein.synthesisprotein.synthesis.ribosomal proteinprotein.synthesis.ribosomal protein.prokaryo�c.unknown organellarprotein.synthesis.ribosomal protein.eukaryo�cprotein.synthesis.ribosomal protein.eukaryo�c.60S subunitprotein.synthesis.ini�a�onprotein.synthesis.elonga�onprotein.targe�ngprotein.targe�ng.nucleusprotein.targe�ng.mitochondriaprotein.targe�ng.chloroplastprotein.targe�ng.secretory pathwayprotein.postransla�onal modifica�onprotein.degrada�on.ubiqui�nprotein.degrada�on.ubiqui�n.E2protein.degrada�on.ubiqui�n.proteasomprotein.foldingsignalling.G-proteinssignalling.14-3-3 proteinscell.vesicle transportTCA / org. transforma�on.TCA.pyruvate DHlipid metabolism.FA synthesis and FA elonga�onlipid metabolism.FA synthesis and FA elonga�on.long chain fa�y acid CoA ligaseRNA.regula�on of transcrip�on.GeBP likestress.abio�cprotein.degrada�on.ubiqui�n.ubiqui�n proteasecellcell.divisionPS.photorespira�ontransport.p- and v-ATPases.H+-transpor�ng two-sector ATPaseTCA / org. transforma�onC1-metabolismsecondary metabolism.isoprenoids.non-mevalonate pathwayprotein.synthesis.ribosomal protein.prokaryo�c.unknown organellar.30S subunitprotein.degrada�onmitochondrial electron transport / ATP synthesismitochondrial electron transport / ATP synthesis.NADH-DHmitochondrial electron transport / ATP synthesis.NADH-DH.localisa�on not clearmitochondrial electron transport / ATP synthesis.cytochrome c oxidasetransport.metabolite transporters at the mitochondrial membranetransport.calciumamino acid metabolism.synthesis.aspartate familyRNA.regula�on of transcrip�on.Global transcrip�on factor groupOPPprotein.degrada�on.ubiqui�n.E3protein.degrada�on.ubiqui�n.E3.RINGRNA.regula�on of transcrip�on.JUMONJI familyDNA.synthesis/chroma�n structureDNA.synthesis/chroma�n structure.histoneamino acid metabolism.synthesis.serine-glycine-cysteine grouptransport.misccell.organisa�oncell.cycleprotein.targe�ng.secretory pathway.ERPS.lightreac�on.other electron carrier (ox/red)PS.lightreac�on.other electron carrier (ox/red).ferredoxindevelopment.late embryogenesis abundantsecondary metabolism.flavonoids.chalconeshormone metabolism.ethylene.synthesis-degrada�oncell wall.hemicellulose synthesiscell wall.cell wall proteins.proline rich proteinsPS.calvin cyclemisc.beta 1,3 glucan hydrolasestransport.Major Intrinsic Proteins.TIPprotein.degrada�on.ubiqui�n.E3.BTB/POZ Cullin3protein.degrada�on.ubiqui�n.E3.BTB/POZ Cullin3.BTB/POZtransportPS.lightreac�on.photosystem IPS.lightreac�on.photosystem I.PSI polypep�de subunitsRNA.regula�on of transcrip�on.CCAAT box binding factor family, HAP3protein.synthesis.ribosomal protein.prokaryo�ctransport.sugarsDNA.synthesis/chroma�n structure.retrotransposon/transposaseDNA.synthesis/chroma�n structure.retrotransposon/transposase.hat-like transposaselipid metabolism.lipid degrada�on.lipases.triacylglycerol lipasePS.lightreac�on.cytochrome b6/fTCA / org. transforma�on.carbonic anhydraseshormone metabolism.abscisic acid.synthesis-degrada�on.synthesis.9-cis-epoxycarotenoid dioxygenasehormone metabolism.abscisic acid.synthesis-degrada�on.synthesishormone metabolism.auxin.induced-regulated-responsive-ac�vatedhormone metabolism.ethylene.signal transduc�ontransport.ammoniumcell wall.cell wall proteins.AGPscell wall.cell wall proteins.AGPs.AGPRNA.regula�on of transcrip�on.ABI3/VP1-related B3-domain-containing transcrip�on factor familysecondary metabolism.phenylpropanoids.lignin biosynthesis.HCThormone metabolism.auxinPS.lightreac�on.photosystem II.LHC-IIsecondary metabolism.phenylpropanoids.lignin biosynthesisstresscell wall.cell wall proteinscell wall.cell wall proteins.LRRcell wall.degrada�on.cellulases and beta -1,4-glucanasesmisc.oxidases - copper, flavone etc.hormone metabolism.cytokininhormone metabolism.salicylic acidhormone metabolism.salicylic acid.synthesis-degrada�onmisc.protease inhibitor/seed storage/lipid transfer protein (LTP) family proteinRNA.regula�on of transcrip�onRNA.regula�on of transcrip�on.NAC domain transcrip�on factor familyhormone metabolismsignalling.receptor kinases.leucine rich repeat XIPSPS.lightreac�on.photosystem IIPS.lightreac�onPS.lightreac�on.photosystem II.PSII polypep�de subunitsPS.lightreac�on.NADH DHPS.lightreac�on.cyclic electron flow-chlororespira�onPS.calvin cycle.rubisco large subunitglycolysis.unclear/dually targetedglycolysis.unclear/dually targeted.fructose-2,6-bisphosphatase (Fru2,6BisPase)cell wallcell wall.degrada�oncell wall.degrada�on.pectate lyases and polygalacturonasescell wall.modifica�oncell wall.pec�n*esterasescell wall.pec�n*esterases.PMEsecondary metabolismsecondary metabolism.isoprenoids.terpenoidssecondary metabolism.phenylpropanoidssecondary metabolism.N miscsecondary metabolism.N misc.alkaloid-likesecondary metabolism.flavonoidssecondary metabolism.flavonoids.dihydroflavonolssecondary metabolism.simple phenolshormone metabolism.cytokinin.synthesis-degrada�onhormone metabolism.ethylenestress.bio�cstress.abio�c.unspecifiedmiscmisc.UDP glucosyl and glucoronyl transferasesmisc.cytochrome P450misc.peroxidasesmisc.myrosinases-lec�n-jacalinmisc.invertase/pec�n methylesterase inhibitor family proteinmisc.plastocyanin-likemisc.GDSL-mo�f lipaseRNA.regula�on of transcrip�on.AP2/EREBP, APETALA2/Ethylene-responsive element binding protein familyRNA.regula�on of transcrip�on.bHLH,Basic Helix-Loop-Helix familyRNA.regula�on of transcrip�on.C2H2 zinc finger familyRNA.regula�on of transcrip�on.GRAS transcrip�on factor familyRNA.regula�on of transcrip�on.MADS box transcrip�on factor familyRNA.regula�on of transcrip�on.MYB domain transcrip�on factor familyRNA.regula�on of transcrip�on.WRKY domain transcrip�on factor familyRNA.regula�on of transcrip�on.AS2,Lateral Organ Boundaries Gene FamilyDNA.unspecifiedprotein.synthesis.ribosomal protein.prokaryo�c.chloroplastprotein.synthesis.ribosomal protein.prokaryo�c.chloroplast.30S subunitprotein.synthesis.ribosomal protein.prokaryo�c.chloroplast.30S subunit.S15protein.synthesis.ribosomal protein.prokaryo�c.chloroplast.30S subunit.S19protein.synthesis.ribosomal protein.prokaryo�c.chloroplast.50S subunit.L2protein.degrada�on.sub�lasesprotein.degrada�on.cysteine proteasesignalling.receptor kinasessignalling.receptor kinases.DUF 26signalling.receptor kinases.S-locus glycoprotein likesignalling.receptor kinases.wall associated kinasetransport.amino acidstransport.nitratetransport.phosphatetransport.pep�des and oligopep�des

Z-transformation(p)

Over-representation

Under-representation

A B

C

Fig. S2. MapMan functional categories enriched in expressed genes at each time point. Enrichment tests conducted using (A) all expressed genes and (B)genes with top 10% or (C) top 1% RPKM values in each time point. The heatmap is color-coded based on the Z transformation of adjusted P values (1). Blue andred represent under- and overrepresented categories, respectively.

1. Benjamini Y, Hochberg Y (1995) Controlling the false discovery rate: A practical and powerful approach to multiple testing. J R Stat Soc, B 57(1):289–300.

Liu et al. www.pnas.org/cgi/content/short/1301009110 3 of 11

Page 4: Supporting Information - PNAS · 2/19/2013  · cv. White Crystal, a glutinous maize cultivar, were purchased from a local supplier and stored at room temperature. For ger-mination,

A

Z-transformation (p)

Over-representation

Under-representation

Up DownT06 T12 T18 T24 T30 T36 T42 T48 T54 T60 T66 T72

Fig. S3. (Continued)

Liu et al. www.pnas.org/cgi/content/short/1301009110 4 of 11

Page 5: Supporting Information - PNAS · 2/19/2013  · cv. White Crystal, a glutinous maize cultivar, were purchased from a local supplier and stored at room temperature. For ger-mination,

T06 T12 T18 T24 T30 T36 T42 T48 T54 T60 T66 T72B

Fig. S3. (Continued)

Liu et al. www.pnas.org/cgi/content/short/1301009110 5 of 11

Page 6: Supporting Information - PNAS · 2/19/2013  · cv. White Crystal, a glutinous maize cultivar, were purchased from a local supplier and stored at room temperature. For ger-mination,

CT06 T12 T18 T24 T30 T36 T42 T48 T54 T60 T66 T72

Fig. S3. (Continued)

Liu et al. www.pnas.org/cgi/content/short/1301009110 6 of 11

Page 7: Supporting Information - PNAS · 2/19/2013  · cv. White Crystal, a glutinous maize cultivar, were purchased from a local supplier and stored at room temperature. For ger-mination,

E

C1C2

C4

C3

C5

C6

C12

C7

C15

C11

C8

C9

C10

C13

C14

DT00 T72T12 T36 T48T24 T54

0 0.5 0.9

Fig. S3. MapMan functional categories enriched in differentially expressed genes (DEGs). Enrichment tests with (A) type 1 DEGs, (B) type 2 DEGs, and (C) type3 DEGs. (D) Clusters of all three types of DEGs. (E) Enrichment tests on each DEG cluster shown in D.

Liu et al. www.pnas.org/cgi/content/short/1301009110 7 of 11

Page 8: Supporting Information - PNAS · 2/19/2013  · cv. White Crystal, a glutinous maize cultivar, were purchased from a local supplier and stored at room temperature. For ger-mination,

Fig. S4. (Continued)

Liu et al. www.pnas.org/cgi/content/short/1301009110 8 of 11

Page 9: Supporting Information - PNAS · 2/19/2013  · cv. White Crystal, a glutinous maize cultivar, were purchased from a local supplier and stored at room temperature. For ger-mination,

Fig. S4. MapMan functional categories enriched in genes of various coexpression gene clusters. (A) Expression profiles of genes sorted according to clusterorders. (B) Enrichment tests of detailed overrepresented functional categories on each cluster as shown in A. (C) Expression profiles of genes in selectedclusters. (D) Enrichment tests of both over- and underrepresented high-level categories on each cluster as shown in C. (E) Enrichment tests of both over- andunderrepresented hormone-related genes on each cluster as shown in C.

Liu et al. www.pnas.org/cgi/content/short/1301009110 9 of 11

Page 10: Supporting Information - PNAS · 2/19/2013  · cv. White Crystal, a glutinous maize cultivar, were purchased from a local supplier and stored at room temperature. For ger-mination,

Establishment

Specification

Maintenance

Provascular stem cell (PSC)Vascular stem cell (VSC)

AtPIN1

AtMP

AtHB8

AtPXY

PSC

AtCLE44

Auxin

AtBDL

AtWOX4

VSC

Differentiation

Fiber (F)Metaxylem vessel (MV)Protoxylem vessel (PV)

Phloem precursor (PP)Xylem precursor (XP)

Sieve element (SE)Companion cell (CC)

0

2

4

6

8

0 6 12 18 24 30 36 42 48 54 60 66 72

AC187157.4_FG005

0

10

20

30

40

0 6 12 18 24 30 36 42 48 54 60 66 72

GRMZM2G142768

0

1

2

3

4

5

0 6 12 18 24 30 36 42 48 54 60 66 72

GRMZM2G354253

0

5

10

15

20

0 6 12 18 24 30 36 42 48 54 60 66 72

GRMZM2G039934

0

20

40

60

80

0 6 12 18 24 30 36 42 48 54 60 66 72

GRMZM2G074267GRMZM2G098643GRMZM2G149184

0

5

10

15

0 6 12 18 24 30 36 42 48 54 60 66 72

GRMZM2G086949GRMZM2G034840

0

1

2

3

4

0 6 12 18 24 30 36 42 48 54 60 66 72

GRMZM2G081671GRMZM2G052544

0

50

100

150

200

0 6 12 18 24 30 36 42 48 54 60 66 72

GRMZM2G023291GRMZM2G423337GRMZM2G178102GRMZM2G003509AC190829.3_FG002

0

5

10

15

0 6 12 18 24 30 36 42 48 54 60 66 72

GRMZM2G038198GRMZM2G006062

0

10

20

30

40

50

0 6 12 18 24 30 36 42 48 54 60 66 72

GRMZM2G109987GRMZM2G042250GRMZM2G469551GRMZM2G158742GRMZM2G048297GRMZM2G098704

0

0.1

0.2

0.3

0.4

0.5

0.6

0 6 12 18 24 30 36 42 48 54 60 66 72

GRMZM2G091490GRMZM2G171395

0

1

2

3

4

5

0 6 12 18 24 30 36 42 48 54 60 66 72

GRMZM2G440219GRMZM2G178998GRMZM2G354151GRMZM2G041668

0

0.5

1

1.5

2

2.5

3

0 6 12 18 24 30 36 42 48 54 60 66 72

GRMZM2G048826AC212859.3_FG008

0

50

100

150

200

0 6 12 18 24 30 36 42 48 54 60 66 72

GRMZM2G083347AC203535.4_FG002GRMZM2G134073GRMZM2G054252GRMZM2G179049GRMZM2G054277

AtPHBAtPHVAtREV

XP

SE CC

AtCAN

Differentiation

F MV PV

AtVND6AtVND7

AtVND7

AtSND1AtNST1

AtVNI2

AtAPL

PP

AtREV

AtVND6

AtVNI2

AtMP

AtHB8

AtPXY

AtPHBAtPHV

Time (hr)

RPKM

AtBDL

AtPIN1

AtCLE44

AtAPL

AtCAN

AtSND1AtNST1

AtVND7

A B

Fig. S5. Maize transcription factors likely regulating vascular development. (A) The current known components and relationships of vascular developmentgenes in Arabidopsis thaliana. (B) Expression patterns of putative maize orthologs of A. thaliana vascular development genes.

Liu et al. www.pnas.org/cgi/content/short/1301009110 10 of 11

Page 11: Supporting Information - PNAS · 2/19/2013  · cv. White Crystal, a glutinous maize cultivar, were purchased from a local supplier and stored at room temperature. For ger-mination,

Table S1. Read count statistics of the deep sequencing data for the 13 time points studied

Sample Library length, bp

Quality filtering Mappable read

Total reads Rate, % Total Rate, %

T00_ST-DA01 300 218,477,277 75.9 277,488,186 76.6400 143,975,388 86.8

T06_ST-DA02 300 213,618,797 74.8 255,413,163 71.7400 142,589,424 85.7

T12_ST-DA03 300 206,947,451 74.6 275,874,436 75.9400 156,344,245 82.1

T18_ST-DA04 300 210,270,688 76.5 279,889,765 76.4400 155,975,570 84.0

T24_ST-DA05 300 190,425,691 78.2 294,733,426 78.8400 183,789,389 79.9

T30_ST-DA06 300 195,697,940 80.2 271,280,993 77.9400 152,421,099 85.6

T36_ST-DA07 300 215,565,717 79.9 308,395,082 80.0400 169,797,300 84.6

T42_ST-DA08 300 153,005,724 85.2 242,223,816 72.2400 182,552,562 81.7

T48_ST-DA09 300 156,382,336 83.7 240,264,229 73.1400 172,108,685 84.8

T54_ST-DB05 300 222,228,743 76.8 340,039,005 81.2400 196,448,375 82.1

T60_ST-DB06 300 236,437,704 76.3 351,089,370 80.6400 199,181,671 82.3

T66_ST-DB07 300 237,623,498 74.6 339,580,019 75.6400 211,538,295 79.9

T72_ST-DB08 300 208,071,129 80.7 320,303,598 75.3400 217,502,039 76.5

Table S2. Numbers of annotated genes and expressed features

Gene set A/E

No. of expressed features

CDS TF Pseudo TE

WGS A 63,363 2,070 17,344 29,024E 30,255 1,238 3,040 1,986

FGS A 39,441 2068 95 53E 23,900 1,237 95 53

A, all; CDS, protein coding gene; E, expressed (RPKM ≥1 in more than onetime point); FGS, filtered gene set; pseudo, pseudogene; TE, transposableelement; TF, transcription factor; WGS, working gene set.

Other Supporting Information Files

Dataset S1 (XLSX)Dataset S2 (XLSX)

Liu et al. www.pnas.org/cgi/content/short/1301009110 11 of 11