the published online july 16, 2007 ge ome -...

17
S-96 The Plant Genome [A Supplement to Crop Science] July 2007 No. 2 Analysis of Gene Expression during Brassica Seed Germination Using a Cross- Species Microarray Platform Matthew E. Hudson,* Tonko Bruggink, Sherman H. Chang, Wenjin Yu, Bin Han, Xun Wang, Peter van der Toorn, and Tong Zhu Abstract We have developed an approach to facilitate the use of model organism microarrays in related, nonmodel organisms. We demonstrate the method by using Arabidopsis oligonucleotide microarrays to analyze gene expression in Brassica. Probes with low hybridization signals to Brassica genomic DNA were excluded from the transcriptional analysis of an Arabidopsis microarray at the software level, forming a virtual Brassica microarray. We then performed an experiment on transcriptional responses during seed germination in Brassica using 17 886 homologous probesets of the virtual array where Brassica mRNA hybridization was detected. We subjected seed to hydration or priming (a step to improve germination vigor) and subsequent heating and drying treatments (steps to prolong shelf life of dried seed). Exploration of the microarray results indicated two likely expression patterns shared by many genes. One class of transcripts was strongly, globally, and irreversibly downregulated by priming, while other transcripts were induced, but often reversibly. Legacy seed-storage protein messages were in the first class, and many protein synthesis components and some resource mobilization enzymes were in the second. We were able to validate our results by confirming transcriptional responses using reverse transcription polymerase chain reaction (RT–PCR). Brassica is a model system for the molecular processes during hydration of seeds. It has well-established, commercially useful seed treatment procedures, and Brassica species have significant economic value as crops. Its close phylogenetic relationship to Arabi- dopsis thaliana makes it a clear potential beneficiary of Arabidopsis functional genomics (Paterson et al., 2001). Cauliflower (Brassica oleracea L. convar botrytis L. Alef. var. botrytis ) is a vegetable crop for which the seed hydration treatment known as priming is well established and applied on a commercial scale. Prim- ing of seeds is used to increase germination efficiency. Under this treatment, seeds are partially hydrated and begin the processes of germination, but emergence does not occur, and the seeds are subsequently dried. Priming has several effects on the physiology of seeds, including developmental advancement, decreased membrane permeability, dormancy breaking, and the induction of DNA and protein synthesis (Welbaum et al., 1998). It is considered that seed germination vigor and longevity are correlated to molecular mobility, M.E. Hudson, S.H. Chang, B. Han, X. Wang, and T. Zhu, Torrey Mesa Research Institute, Syngenta Research and Technology, 3115 Merryfield Row, San Diego, CA 92121. M.E. Hudson, Dep. of Crop Sciences, Univ. of Illinois, 334 NSRC, 1101 W. Peabody Blvd., Urbana, IL 61801. W. Yu and T. Zhu, Syngenta Biotechnology, 3054 Cornwallis Rd., Research Triangle Park, NC 27709. T. Bruggink and P. van der Toorn, Syngenta Seeds B.V., Westeinde 62, Enkhuizen, the Netherlands. Received 1 Dec. 2006. *Corresponding author ([email protected]). Published in Crop Sci. 47(S2) S96–S112. Published 14 July 2007. doi:10.2135/cropsci2006.12.0758tpg © Crop Science Society of America 677 S. Segoe Rd., Madison, WI 53711 USA Abbreviations: EDTA, ethylenediaminetetraacetic acid; FDR, false discovery rate; MES, 2-N-morpholino-ethane-sulfonic acid; MOID, match-only integral distribution; RH, relative humidity; PCR, polymerase chain reaction; RT–PCR, reverse transcription polymerase chain reaction; SAM, Statistical Analysis of Microarrays; SAPE, Streptavidin Phycoerythrin. Ge ome The Ge ome The Original Research Published online July 16, 2007

Upload: others

Post on 27-May-2020

1 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: The Published online July 16, 2007 Ge ome - Illinoisstan.cropsci.uiuc.edu/publications/Hudson_brassica.pdf · lower transcript abundance could not be validated on a genomic scale

S-96 The Plant Genome [A Supplement to Crop Science] July 2007 No. 2

Analysis of Gene Expression during Brassica Seed Germination Using a Cross-Species Microarray PlatformMatthew E. Hudson,* Tonko Bruggink, Sherman H. Chang, Wenjin Yu, Bin Han, Xun Wang, Peter van der Toorn, and Tong Zhu

AbstractWe have developed an approach to facilitate the use of model organism microarrays in related, nonmodel organisms. We demonstrate the method by using Arabidopsis oligonucleotide microarrays to analyze gene expression in Brassica. Probes with low hybridization signals to Brassica genomic DNA were excluded from the transcriptional analysis of an Arabidopsis microarray at the software level, forming a virtual Brassica microarray. We then performed an experiment on transcriptional responses during seed germination in Brassica using 17 886 homologous probesets of the virtual array where Brassica mRNA hybridization was detected. We subjected seed to hydration or priming (a step to improve germination vigor) and subsequent heating and drying treatments (steps to prolong shelf life of dried seed). Exploration of the microarray results indicated two likely expression patterns shared by many genes. One class of transcripts was strongly, globally, and irreversibly downregulated by priming, while other transcripts were induced, but often reversibly. Legacy seed-storage protein messages were in the fi rst class, and many protein synthesis components and some resource mobilization enzymes were in the second. We were able to validate our results by confi rming transcriptional responses using reverse transcription polymerase chain reaction (RT–PCR).

Brassica is a model system for the molecular processes during hydration of seeds. It has well-established, commercially useful seed treatment procedures, and Brassica species have signifi cant economic value as crops. Its close phylogenetic relationship to Arabi-dopsis thaliana makes it a clear potential benefi ciary of Arabidopsis functional genomics (Paterson et al., 2001). Caulifl ower (Brassica oleracea L. convar botrytis L. Alef. var. botrytis) is a vegetable crop for which the seed hydration treatment known as priming is well established and applied on a commercial scale. Prim-ing of seeds is used to increase germination effi ciency. Under this treatment, seeds are partially hydrated and begin the processes of germination, but emergence does not occur, and the seeds are subsequently dried. Priming has several eff ects on the physiology of seeds, including developmental advancement, decreased membrane permeability, dormancy breaking, and the induction of DNA and protein synthesis (Welbaum et al., 1998). It is considered that seed germination vigor and longevity are correlated to molecular mobility,

M.E. Hudson, S.H. Chang, B. Han, X. Wang, and T. Zhu, Torrey Mesa Research Institute, Syngenta Research and Technology, 3115 Merryfi eld Row, San Diego, CA 92121. M.E. Hudson, Dep. of Crop Sciences, Univ. of Illinois, 334 NSRC, 1101 W. Peabody Blvd., Urbana, IL 61801. W. Yu and T. Zhu, Syngenta Biotechnology, 3054 Cornwallis Rd., Research Triangle Park, NC 27709. T. Bruggink and P. van der Toorn, Syngenta Seeds B.V., Westeinde 62, Enkhuizen, the Netherlands. Received 1 Dec. 2006. *Corresponding author ([email protected]).

Published in Crop Sci. 47(S2) S96–S112. Published 14 July 2007.doi:10.2135/cropsci2006.12.0758tpg© Crop Science Society of America677 S. Segoe Rd., Madison, WI 53711 USA

Abbreviations: EDTA, ethylenediaminetetraacetic acid; FDR, false discovery rate; MES, 2-N-morpholino-ethane-sulfonic acid; MOID, match-only integral distribution; RH, relative humidity; PCR, polymerase chain reaction; RT–PCR, reverse transcription polymerase chain reaction; SAM, Statistical Analysis of Microarrays; SAPE, Streptavidin Phycoerythrin.

Ge ome

The

Ge ome

The

Original Research

Published online July 16, 2007

Page 2: The Published online July 16, 2007 Ge ome - Illinoisstan.cropsci.uiuc.edu/publications/Hudson_brassica.pdf · lower transcript abundance could not be validated on a genomic scale

Hudson et al.: Cross-species microarrays for Brassica germination analysis S-97

which in turn is related to hydration level and mem-brane permeability (Buitink et al., 2000). Carbohy-drate structure and sometimes embryo morphology are also thought to be infl uenced by hydration treat-ment (Welbaum et al., 1998). Th e chief drawback of seed priming is the concomitant reduction in seed longevity. Th is reduction in shelf life occurs in most primed seed and is thought to be caused by a loss of desiccation tolerance and/or oligosaccharide content (Bruggink and van der Toorn, 1995; Welbaum et al., 1998). Th is eff ect has been quantitatively studied in Brassica oleracea (Powell et al., 2000). Th e decreased viability of primed seed in Brassica is suffi ciently severe to restrict the use of osmotic priming in com-mercial varieties. However, the problem can be at least partly overcome by treating seeds with a mild water and temperature stress for a period of several hours or days, similar to treatments that can induce desiccation tolerance in germinated seeds (Bruggink et al., 1999). Th is process of gradual dehydration with heating is referred to as shelf-life induction.

Phylogenetic analyses based on DNA sequences have indicated the close relationship between the genera Brassica and Arabidopsis (Yang et al., 1999, Koch et al., 2001). Recent comparative genomic analyses have clearly demonstrated both synteny and microsynteny between Arabidopsis and Brassica, even though Brassica oleracea may lack homologs of some Arabidopsis genes (Hall et al., 2002; O’Neill and Ban-croft , 2000). It is clear that gene content, gene order, and homology at both the nucleic acid and amino acid sequence level are closely related between Brassica oleracea and Arabidopsis (Quiros et al., 2001).

Several discoveries have been made in Arabi-dopsis and elsewhere using Aff ymetrix (Aff ymetrix, Santa Clara, CA) microarrays as an exploratory tool (Tepperman et al., 2001; Salter et al., 2003; Loguinov et al., 2004; Monte et al., 2004). Used in this context, the microarray guides the design of further, lower-throughput experiments to fi nd genes that rapidly or strongly change in expression (e.g., gel blots or quantitative reverse transcription polymerase chain reaction [RT–PCR]). Th ese genes provide biomark-ers for the response at the cellular level and can provide insight into signal transduction and devel-opmental mechanisms. Cross-species exploratory microarray hybridization is a potentially useful technique for applying model organism genomics in related species for which few functional genomics resources are available (Zhu et al., 2001b; Chismar et al., 2002; Lee et al., 2004). Using such resources, it is not possible to monitor the entire transcriptome simultaneously, and it is necessary to confi rm any result using a second method of transcript analysis, since the underlying genome is unknown and hence

the actual transcript in the target species must be sequenced and verifi ed. For this reason, cross-species hybridizations are necessarily exploratory experi-ments, and researchers confi rm potential biomarkers of the response under investigation using a second technique. For example, Zhu et al. (2001b) used an oligonucleotide microarray designed using the rice (Oryza L. ssp. japonica) genome to assay the barley (Hordeum vulgare L.) transcriptome and confi rmed the results by RNA gel blot analysis.

Several issues remain to be addressed regarding the successful application of this technique in Bras-sica using arrays designed using Arabidopsis genome data, since these two members of the Brassicaceae family are signifi cantly divergent. Sequence distance values of up to 22.9% have been reported for nuclear coding DNA sequence within the Brassicaceae fam-ily (Koch et al., 2001). Divergence between the tribe Brassicae and Arabidopsis has been estimated to have occurred 14.5 to 20.4 million years ago, based on a mitochondrial sequence divergence of 0.71% (Yang et al., 1999); this divergence would be expected to be at least 12-fold higher between nuclear DNA sequences (Wolfe et al., 1987). Th e sequence diff er-ences between the two species may present a barrier in applying Arabidopsis microarrays to Brassica. In a previous study, Girke et al. (2000) tested the feasibil-ity of directly applying Arabidopsis cDNA micro-arrays to profi le developing Brassica seeds. While the correlation coeffi cients were high between the hybridization signals in the two species, the hybrid-ization signal intensity was approximately 50% lower for Brassica compared to Arabidopsis targets. Th e reported transcription pattern of selected genes with lower transcript abundance could not be validated on a genomic scale by RNA blot analysis or RT–PCR using Brassica-specifi c probes (O’Hara et al., 2002). Th us, it is necessary to address these issues before applying Arabidopsis GeneChip microarrays to accu-rately profi le gene expression in Brassica.

We describe a system for using microarrays across species boundaries, using genomic DNA hybridization to oligonucleotide arrays and pur-pose-written analysis soft ware to control for spe-cies-specifi c diff erences in hybridization. We then perform an exploratory experiment to investigate the reprogramming of the transcriptome during germi-nation of Brassica seeds. Seed germination has been the subject of intense biochemical, molecular, and physiological research (Jacobsen and Beach, 1985; Bewley and Marcus, 1990; Bewley 1997; Bewley et al., 2000). Functional genomic approaches have great promise to reveal more mechanistic details of this process (Ogawa et al., 2003; van der Geest, 2002; Bové et al., 2001) and have already proven their potential

Page 3: The Published online July 16, 2007 Ge ome - Illinoisstan.cropsci.uiuc.edu/publications/Hudson_brassica.pdf · lower transcript abundance could not be validated on a genomic scale

S-98 The Plant Genome [A Supplement to Crop Science] July 2007 No. 2

to generate results on an unprecedented scale. For example, microarray analyses have been used to iden-tify seed-specifi c transcripts and promoters (Girke et al., 2000; Zhu et al., 2001a). It was reported that 25% of 2600 genes investigated showed twofold or more increased abundance in seeds than leaves or roots (Girke et al., 2000). Th e control of transcrip-tion during development of embryos and fi lling of seeds has also been investigated using microarray techniques (Ruuska et al., 2002; Lee et al., 2002; Zhu et al., 2003), as has the response of seed to gib-berellin treatment during germination (Ogawa et al., 2003). More recently, the work of Soeda et al. (2005) used mechanically spotted microarrays with probes representing 1437 Brassica cDNAs to investigate the eff ect of priming, drying, and shelf-life induction treatments on transcript abundance. Here, we assay genome-wide patterns of gene expression using 17 886 probesets homologous to Arabidopsis genes, with the aim of both determining the mechanism of action of pregermination hydration treatments on germination effi ciency and validating our genomic DNA controlled approach to cross-species microarray analysis.

MATERIALS AND METHODSSeed Germination Treatments

Seeds of caulifl ower cultivars Lintop and Maverick were primed for 4 d followed by either shelf-life induction or by direct drying according to Bruggink et al. (1999). Priming consisted of elevat-ing the seed moisture content to 54% (dry weight basis) and transferring seeds to a rotating drum (10 rpm), which was kept in the dark at 20°C. At the end of this period, seeds were directly dried by exposing them to moving air at 20°C and 40% rela-tive humidity (RH) for 24 h, aft er which they were stored at 20°C and 40% RH. Alternatively, they were transferred to shelf-life induction conditions, as pre-viously described (Bruggink et al., 1999). Th ese con-ditions consisted of a relative humidity of 100% and a temperature of 32°C for 48 h, followed by drying. Drying in this case consisted of linearly reducing temperature and relative humidity to 20°C and 28%, respectively. At the end of this period, seeds were placed at 40% RH and 20°C for storage.

To test potential longevity, dried seeds were equilibrated at 75% RH and 20°C for 3 d and then divided over aluminum bags, which were subse-quently sealed and placed in a water bath at 46°C. Aft er periods of 0, 2, 3, 4, 5, 6, and 7 d, bags were removed from the water and opened, and the seeds were put to germinate on moistened fi lter paper. Th e percentage of seeds that germinated and developed into a normal seedling was determined aft er 10 d

of subsequent incubation at 20°C. Seeds for RNA extraction were processed in parallel with the ger-mination experiment. Statistical signifi cance of the germination responses was evaluated using the chi-square test.

DNA and RNA ExtractionSeeds treated in parallel with the treatments for

the seed germination experiments were fi xed and stored by transfer to RNAlater buff er (Ambion, Aus-tin, TX). DNA was extracted using CsCl

2 gradient

centrifugation (Sambrook et al., 1989). For each sam-ple, total RNA was extracted from approximately 0.75 mL of seeds with phenol:chloroform:isoamyl alcohol (25:24:1), precipitated with ethylene glycol monobu-tyl ether, and then resuspended in water and further purifi ed through LiCl precipitation (Sambrook et al., 1989). Th e RNA was further examined by gel electro-phoresis for integrity and by spectrometry for purity. To ensure data quality, only samples with A260/A280 ratios of 1.9 to 2.1 were included in the study.

Microarray Labeling and HybridizationFor genomic DNA labeling, 3 μg of genomic DNA

directly from the CsCl preparation was mixed with 20 μL of 2.5X random hexamers and heated at 100°C for 5 min. Th e mixture was cooled on ice immediately and labeled with biotin-dNTPs at 37°C for 2 h in the presence of Klenow DNA fragment using the BioPrime DNA labeling system (Invitrogen, Carlsbad, CA).

Th e RNA labeling and hybridization were per-formed as previously described (Zhu et al., 2001a). Briefl y, total RNA (5 μg) from each sample was reverse transcribed at 42°C for 1 h using 100 pmol of the oligo dT(24) primer containing a 5’ T7 RNA polymerase promoter sequence [5’-GGCCAGTGAATTGTAATAC-GACTCACTATAGGGAGGCGG-(dT)24–3’], 50 mM Tris-HCl (pH 8.3), 75 mM KCl, 3mM MgCl

2, 10 mM

dithiotreitol (DTT), 0.5 mM dNTPs, and 200 units of SuperScript II reverse transcriptase (Invitrogen, Carls-bad, CA). Th e second strand of cDNA was synthesized using 40 units of Escherichia coli DNA polymerase I, 10 units of E. coli DNA ligase, and 2 units of RNase H in a reaction containing 25 mM Tris-HCl (pH 7.5), 100 mM KCl, 5 mM MgCl

2, 10 mM (NH

4)SO

4, 0.15 mM b-

NAD+, 1 mM dNTPs, and 1.2 mM DTT. Th e reaction proceeded at 16°C for 2 h and was terminated using ethylenediaminetetraacetic acid (EDTA). Double-stranded cDNA products were purifi ed by phenol/chloroform extraction and ethanol precipitation.

Biotinylated complementary RNAs (cRNAs) were transcribed in vitro from synthesized cDNA by T7 RNA Polymerase (Enzo BioArray High Yield RNA Transcript Labeling Kit, Enzo, NY). cRNAs were purifi ed using affi nity resin (QIAGEN RNeasy

Page 4: The Published online July 16, 2007 Ge ome - Illinoisstan.cropsci.uiuc.edu/publications/Hudson_brassica.pdf · lower transcript abundance could not be validated on a genomic scale

Hudson et al.: Cross-species microarrays for Brassica germination analysis S-99

Spin Columns; Valencia, CA) and randomly frag-mented by incubating at 94°C for 35 min in a buff er containing 40 mM Tris-acetate (pH 8.1), 100 mM potassium acetate, and 30 mM magnesium acetate to produce molecules of approximately 35 to 200 bases. Th e labeled DNA and RNA samples were mixed with 0.1 mg ml−1 sonicated herring sperm DNA in a hybridization buff er containing 100 mM 2-N-mor-pholino-ethane-sulfonic acid (MES), 1 M NaCl, 20 mM EDTA, 0.01% Tween 20, denatured at 99°C for 5 min, and equilibrated at 45°C for 5 min before hybridization. Th e hybridization mix was then trans-ferred to the GeneChip microarray cartridge and hybridized at 45°C for 16 h on a rotisserie at 60 rpm.

Th e hybridized arrays were then rinsed and stained in a fl uidics station (Aff ymetrix, Santa Clara, CA). Th ey were fi rst rinsed with wash buff er A (6X SSPE [0.9M NaCl, 0.06 M NaH

2PO

4, 0.006 M

EDTA], 0.01% Tween 20, 0.005% Antifoam) at 25°C for 10 min and incubated with wash buff er B (100 mM MES, 0.1 M NaCl, 0.01% Tween 20) at 50°C for 20 min, then stained with Streptavidin Phycoery-thrin (SAPE) (100 mM MES, 1M NaCl, 0.05% Tween 20, 10 mg ml−1 SAPE, 2 mg ml−1 BSA) at 25°C for 10 min, washed with wash buff er A at 25°C for 20 min and stained with biotinylated anti-streptavidin anti-body at 25°C for 10 min. Aft er staining, arrays were stained with SAPE at 25°C for 10 min and washed with wash buff er A at 30°C for 30 min. Arrays were scanned according to manufacturer’s instructions, using an Agilent GeneArray Scanner and GeneChip Suite 4.0 scanning soft ware (Aff ymetrix, Santa Clara, CA). Th e 4.0 soft ware was used only to gen-erate the DAT and CEL fi les and not to determine expression metrics, which was done using custom Perl soft ware described below.

Microarray Data Processing and Generation of Expression Metrics

Th e DAT fi les of raw images generated by the scan of the microarray were used to generate CEL fi les with Aff ymetrix MAS 4.0. Th e CEL fi les gener-ated in this manner, containing the signals of each oligonucleotide probe feature, were then analyzed using custom scripts written in Perl 5.6. Th e CEL fi les generated by genomic DNA hybridization were compared at the individual probe level to gener-ate a database of probes hybridizing to the Brassica genome with 75% or less of the signal generated by hybridization to the Arabidopsis genome. Th e perfect match probe values from the microarrays hybrid-ized to Brassica RNA were then compared to the database; probes hybridizing to the Brassica genome with 75% or less of the signal from the Arabidopsis genome were removed from the analysis. Th e 72nd

percentile value of the remaining probes was then calculated, according to the match-only integral distribution (MOID) algorithm (Zhou and Abagyan, 2002). Th ese raw signal values were then corrected by background correction (subtracting a background estimate obtained from the fi ft h percentile of the sig-nal values, and scaling any resultant negative values to zero), and subsequently, a per-array normalization was used that scaled all expression metrics on each array to a target mean of 100. For cluster analysis, the values were normalized by median centering to a target of 1 on a per-gene basis. For Statistical Analysis of Microarrays (SAM) analysis, data were log

(2) transformed. All the above operations were

performed using scripts written in Perl 5.6. Self-organizing maps were generated using GeneSpring 4.2 (Silicon Genetics, Palo Alto, CA), based on the results of hierarchical, correlation-based clustering also performed with GeneSpring.

ReplicationSince our major concern was the reproducibility

of our probe-level analysis, we performed two tech-nical replicate microarrays for a single RNA sample. We replicated the single RNA sample for each of the fi ve points from the cultivar Lintop with a second RNA sample, also at each of the fi ve points and with two replicate microarrays, for the cultivar Maverick. Hence, our overall experiment contains fi ve treat-ment points, each with two diff erent genotypes to control for biological and allelic variation between Brassica varieties, and each genotype with two tech-nical replicates to control for technical variation in our procedure, giving 20 microarrays in total. We also performed additional replication using the pre-vious generation, Novartis-designed “8k” Gene Chip (Zhu and Wang, 2000), which gave broadly similar results (data fi les available from the authors).

Statistical AnalysisSince this was an exploratory experiment, the

mean of the two technical replicate hybridizations for each genotype, for every transcript, was used for initial analysis and clustering, without fi ltering genes for statistical signifi cance. However, at this stage genes were fi ltered based on the statistical likelihood that true expression was detected, hence excluding the noisy probes with low signal that can confound clustering studies. Th e coeffi cient of variation of all the negative control probes was used to calculate the 95% confi dence interval for undetected transcripts. Th is 95% confi dence threshold, 25, gave the thresh-old of detectability (the arrays are each normalized to a target mean of 100). Any gene where no two rep-licate data points of the 10 microarrays used reached

Page 5: The Published online July 16, 2007 Ge ome - Illinoisstan.cropsci.uiuc.edu/publications/Hudson_brassica.pdf · lower transcript abundance could not be validated on a genomic scale

S-100 The Plant Genome [A Supplement to Crop Science] July 2007 No. 2

a signal of 25 was excluded from the analysis (all genes where all the probes were eliminated were also removed from the analysis at this step). In the case of this analysis, genes with one or more probes remain-ing and called “good” were allowed to proceed past the fi lter; however, this can readily be changed for other experimental designs using our scripts. Among 26 367 mRNAs with Arabidopsis probesets, 17 886 putative orthologs were considered “present” in Brassica using these cutoff s.

To determine which individual genes were statis-tically signifi cant, the d statistic test from Statistical Analysis of Microarrays (SAM; Tusher et al., 2001) was used, as implemented in the Bioconductor pack-age for the R programming language. Eight hundred thirty-three genes were classed as signifi cant at a 5% false discovery rate (FDR) (delta = 2.3, Q = 0.05).

Real-Time Quantitative RT–PCRTh e total RNA was treated with DNase I from

DNA-free Kit (Ambion, Austin, TX) before RT–PCR to ensure no gDNA contamination. Th e treated RNA (100 ng) was reverse transcribed at 37°C for 1 h using M-MLV reverse transcriptase kit (Sigma Aldrich, St. Louis, MO) with antisense primers (0.5mM each) for the gene of interest and actin 1. Th e latter was used as an internal control to eliminate any diff erences in reverse transcription effi ciency or RNA concentration.

Brassica-specifi c PCR primers and probes were designed from Brassica gene sequences homologous to Arabidopsis probeset sequences using Primer Express soft ware (Applied Biosystems, Foster City, CA). Probes were labeled at their 5’ end with a reporter fl uorophore (tetrachloro-6-carboxyfl uores-cein [TET] for the internal control [actin 1] and fl uo-rescein [FAM] for the gene of interest) and at the 3’ end with the quencher fl uorophore tetramethylrho-damine (TAMRA). Sequences of primers and probes used in this work are shown in Table 1.

Real-time quantitative PCR (TaqMan) was sub-sequently run in an ABI Prism 7900HT sequence detection system (Applied Biosystems, Foster City, CA). Reactions were multiplexed to simultaneously measure the expression of both the gene of interest and the internal control gene actin 1 from Brassica oleracea. Each 10 μL reaction contained 2 μL cDNA obtained from the RT reaction and 8 μL PCR Mas-ter Mix, which contained: 1x fi nal concentration of 2x PCR Mastermix for Quantitative PCR (Sigma Aldrich, St. Louis, MO), 7.5 mM MgCl

2, 150 nM

Sulforhodamine 101, primers to a fi nal concentra-tion of 900 nM each, probes to a fi nal concentration of 100 nM each. Th e reaction parameters were 5 min at 95°C followed by 40 cycles of 15 sec at 95°C and 1 min at 60°C. Five replicates were run for all reactions of each sample to eliminate the assay variation.

Table 1. Primers and probes used in real-time quantitative reverse transcription polymerase chain reaction.

Gene name GenBank ID Oligo name Sequence 5’- 3’

Brassica oleracea actin 1 AF044573 BoActin1-167F† CTGTGACAATGGTACCGGAATG

BoActin1-229R‡ ACAGCCCTGGGAGCATCA

BoActin1-190TET§ TCAAGGCTGGTTTCGCCGGTG

B. oleracea desaturase AF056571 o3desatur540F GATCGTGTTGGCCACTCTTGT

o3desatur614R GGAACACCATAGACTTTTAGAACTGTGA

o3desatur562Fam¶ TATCTATCATTCCTCGTTGGTCCA

B. napus 18.5 kD oleosin X61937 Oleosin18K596F CCGTGGCAATGCTCATCA

Oleosin18K663R GAAGACGGTTATAGCTGCAATGC

Oleosin18K615Fam CGGCTTTCTCTCCTCCGGTG

B. oleracea 40S ribosomal protein AF144752 40S rib386F TGCGTCCCCCTAAGAAAGG

40S rib458R ATGGCTTCATGGACGGAAGT

40S rib406Fam TCTGCTGTTCAGAGACCACGCAAC

B. napus 12S cruciferin AF319771 12S stor4888F ACATGTTGAGCGGGTTCGA

12S stor4988R GTTTCCTCTGCTGTCTTGTTGGT

12S stor4924Fam AGGCATTGAAAATCGACGTTAGGTTGGCT

B. napus beta-amylase AF319168 bamylase245F AAACGTATCTCCGCCGATGA

bamylase310R TTACACGCCACGGACAGATC

bamylase266Fam TCCGGTCTTGGGATCGAGGCG†Forward primer.‡Reverse primer.§Probe labeled with tetrachloro-6-carboxyfl uorescein.¶Probe labeled with fl uorescein.

Page 6: The Published online July 16, 2007 Ge ome - Illinoisstan.cropsci.uiuc.edu/publications/Hudson_brassica.pdf · lower transcript abundance could not be validated on a genomic scale

Hudson et al.: Cross-species microarrays for Brassica germination analysis S-101

Raw data of the assay results were obtained by using the soft ware SDS2.0 (Applied Biosystems, Foster City, CA), according to manufacturer’s instructions). Relative expression levels to actin were calculated using the cycle threshold (C

t) values and the formula

2ΔΔCt as described by the manufacturer (Applied Bio-systems, 1997). To compare the expression pattern of measured genes between each sample and reduce any eff ect of polymorphism between the genotypes, the expression level of the fi ve replicates of each sample was normalized to the mean expression value.

Results and DiscussionStrategy for Reassembling a Virtual Oligonucleotide Microarray from the Arabidopsis GeneChip Microarray for Brassica Transcript Detection

In this study, we have used oligonucleotide microarrays because they allow precise control over the sequence, GC content, probe length, and other characteristics of the probe DNA, which can other-wise be signifi cant spoiling factors in cross-species experiments using “spotted” cDNA arrays. In addi-tion, the 25-bp sequence length gives optimal dis-crimination between divergent sequences (Lipshultz et al., 1999), thus minimizing the problems expected from spurious cross-hybridization.

We expect, based on the data of Koch et al. (2001), that the sequence divergence between Brassica RNA samples and the Arabidopsis sequences on the chip will be suffi ciently high to cause mismatches between most of the oligonucleotide probes and the target RNA. Th is problem invalidates the assumption, inherent in most Aff ymetrix-type oligonucleotide microarray experi-ments, that a “mismatch” probe hybridization value may be safely subtracted from a “perfect match” probe hybridization value. Using standard analysis tools, this will cause the transcripts to be called “absent” (as described by Chismar et al., 2002) since the mismatch hybridization levels will be comparable to those of the “match.” However, substantial information about tran-script levels will remain in the perfect match signal, as demonstrated by Zhu et al. (2001a), who used a custom chip design without a mismatch probe. Th ese authors validated cross-species microarray analysis without mismatch probes and the technique of genomic DNA hybridization, using the relatively large genomes of rice and barley. In the work described here, a similar cus-tom array design is used, although the techniques are also valid for use in a conventional match-mismatch design if the mismatch data is discarded and the match probe signal is considered alone.

An additional concern is that some Arabidopsis probes will not hybridize measurably to Brassica

DNA, or will give much lower signal intensity. Th e generally lower signal intensity is dealt with here at the data processing level by normalizing all the data from each microarray (Brassica or Arabidopsis) to a target mean of 100. Th e potential for nonhybridizing probes to cause skewed or noisy data is addressed here by detecting such probes by hybridization of genomic DNA to the microarray, and excluding these probes from the data analysis step.

Finally, a distinct problem exists in diff erentiat-ing between closely related members of the same gene family in a cross-species hybridization. Th is is a problem that we are unable to address without the sequence of all the Brassica orthologs of the Arabi-dopsis genes in every paralogous gene family. Th is is an equivalently large problem for the use of a Brassica cDNA spotted microarray (Soeda et al., 2005). For this reason we present our results as valid only at the gene-family level, unless confi rmed by analysis of a Brassica ortholog. We validate the expression patterns of examples from the gene-family members of selected transcripts with strongly diff erential expression, using real-time PCR with specifi c probes designed to cloned Brassica orthologs of Arabidopsis genes.

Removal of Hybridization Data for Nonhybridizing Oligonucleotide Probes Increases Data Quality and Simplifi es Interpretation

Th e main experiments described in this paper were conducted using a custom-designed oligo-nucleotide microarray (Zhu, 2003). Th is microarray contains no mismatch probes and 382 166 perfect-match oligonucleotide probes of 25 base pairs each, designed to the TIGR2 version of the Arabidopsis genome annotation (January, 2002). Th ese sequences were designed to match 25 996 unique Arabidopsis genes (representing over 99% of the predicted exons in the Arabidopsis genome at the time of design). Consequently, each gene is represented by a probe-set of, on average, 15 oligonucleotide probes. Th e probeset hybridization data, of 15 intensity values on average for each gene ID, is used to calculate the expression value for that gene, which is taken as the 72nd percentile of the normalized probe cell inten-sity values for each probeset (the MOID algorithm; Zhou and Abagyan, 2002), a comparable metric to the standard Aff ymetrix expression intensity value, but one that does not require subtraction of a mismatch. Th e same expression metric was used to calculate expression values from the match probes of the standard match–mismatch Arabidopsis genome array and compared with the results generated by Aff ymetrix MAS 4.0 soft ware. Th e correlation

Page 7: The Published online July 16, 2007 Ge ome - Illinoisstan.cropsci.uiuc.edu/publications/Hudson_brassica.pdf · lower transcript abundance could not be validated on a genomic scale

S-102 The Plant Genome [A Supplement to Crop Science] July 2007 No. 2

coeffi cients between the Aff ymetrix match–mis-match algorithm and the match probe only (72nd percentile) algorithm used here were between 0.92 and 0.96 for all the arrays used in this experiment.

To assess the utility of the Arabidopsis array for Brassica gene expression analysis, we estimated the rate of nucleotide substitution between Brassica oleracea and Arabidopsis thaliana. Th ree full-length Brassica coding sequences belonging to diff erent functional groups and with clear Arabidopsis ortho-logs were compared using pair-wise Martinez/Needleman-Wunsch alignment (Chlorphyllase I, Arabidopsis = At1 g19670, Brassica = GB: AF337544; Ethylene receptor ETR2, A. = At3 g23150, B. = EMBL: AB078598; 40s ribosomal protein, A. = At3 g02560, B. = EMBL: AF144752). Th e nucleotide substitution rates between these sequences were 20.5% 19.7 and 20%, respectively, for the coding regions to which the probe oligonucleotides were designed.

Assuming the median of these measurements is close to that for all genes (20% substitution rate or 80% sequence identity), the probability of any 25-bp probe designed to match an Arabidopsis coding region being 100% identical to Brassica can be esti-mated at 0.825, so 99.62% of probes will be expected to contain a mismatch. Despite this, 75% of the probes on an array hybridized to Brassica DNA were found to be within 50% of the signal intensity of the identical probes hybridized to Arabidopsis DNA. We found that probes with extensive mismatches, but 12 bp or more of contiguous, identical sequence, were oft en capable of producing hybridization signals from Brassica genomic DNA that were within 75% of the Arabidopsis values aft er hybridization (data not shown). Since the 72nd percentile is used to esti-mate the expression value, it is likely that the expres-sion patterns of closely homologous genes are being monitored in most, if not all, cases.

Labeling effi ciency diff erences due to the genome size were normalized by subtracting a background measurement (the fi ft h percentile of the cell values) and then scaling the corrected cell values for each microar-ray to a target mean of 100. Th e cell intensity values for each oligonucleotide probe in two replicated arrays hybridized to Arabidopsis ecotype Columbia genomic DNA were then compared. In the fi rst replicate Ara-bidopsis array, 91.5% of probes had at least 50% of the signal intensity of the probes on the second array. Th e comparison of Brassica and Arabidopsis genomic DNA hybridized arrays was then used to defi ne a list of useful probes. Th e intention here is to remove probes from the analysis for which the Brassica target gene is not clearly orthologous to a gene from Arabidopsis, while retaining probes for genes that may have two or more Brassica orthologs for the Arabidopsis gene.

Since the arrays contain several redundant probes for each gene, it is possible to remove a substantial number of probes from the analysis and still obtain a robust result. Consequently, all the probes where Brassica genomic hybridization was less than 75% of the signal from Arabidopsis (143 361 probes, or 37.5%) were des-ignated ‘unusable’; the remainder were processed using a purpose-written expression analysis algorithm (see “Materials and Methods” section). Th e fi gure of 62.5% of probes that hybridized to Brassica genomic DNA with 75% or more of the Arabidopsis signal is close to the 68.7% of probes we estimate to have 12 bp or more of identical contiguous sequence. It is likely therefore that these probes are recognizing related sequences in Brassica with substantial homology, but signifi cant mismatches, to the Arabidopsis gene to which the probes were designed. For this reason we cannot attri-bute the behavior of individual genes in Brassica to the data from probes designed to their direct Arabidopsis orthologs—in most cases the sequence of the Brassica genes themselves is not known. Since the nucleotide level similarity between members of a gene family in Arabidopsis may be greater than the similarity between the Arabidopsis and Brassica orthologs, we can only describe the Brassica gene expression data in terms of the behavior of gene families. However, the ability to do this on a whole-genome scale can still provide insight into the physiology of transcriptional regulation, in the case of gene families that are coregulated.

Figure 1 shows the eff ect of using only the probes designated “usable” on the correlation of expression values from control genomic DNA hybridization, using a fourth microarray also hybridized to Brassica genomic DNA. Although the average probe number per probeset was reduced from 15 to 9 aft er removal of the unusable probes, the correlation coeffi cient between Brassica and Arabidopsis genomic hybridization improved from 0.69 from the unmodifi ed analysis to 0.75 by the use of only the “homologous” probes. Th ese data support the assumption that the expression values we infer from our analysis of microarrays hybridized to Brassica mRNA are likely to represent measurements of the expression level of genes with high nucleotide-level identity to those in Arabidopsis. We compared the reproducibility of RNA detection by comparing the same Brassica RNA sample hybridized to two micro-arrays, analyzing both arrays with both methods and comparing the correlation between replicates for each method. Th e correlation in both cases was very high (0.99) with no loss of reproducibility by exclusion of probes. Consequently, we do not expect (and have not observed) the use of a restricted probeset to strongly aff ect the relative expression values between Brassica RNA samples.

Page 8: The Published online July 16, 2007 Ge ome - Illinoisstan.cropsci.uiuc.edu/publications/Hudson_brassica.pdf · lower transcript abundance could not be validated on a genomic scale

Hudson et al.: Cross-species microarrays for Brassica germination analysis S-103

Pregermination Treatments Affect Seed Germination Rate and Longevity

To ensure that our seed treatments produced eff ects consistent with those described previously (Bruggink et al., 1999), we then investigated the germi-nation physiology of Brassica seed in parallel with RNA extraction for microarray profi ling (Fig. 2 and Table 2).

Th e seed treatments delivered the expected eff ects on germination effi ciency and longevity in both cul-tivars (Table 2, Fig. 2). Table 2 shows the eff ects of the partial hydration, or priming treatment, and the priming treatment followed by a partial dehydration (shelf-life induction) treatment with heating on time taken to germination. In all cases treated seed germi-nated faster than untreated. Th e induction treatment aft er priming led to a slightly larger increase in rapid-ity than priming alone (Table 2).

Figure 2 shows that viability of untreated seeds is reduced to half its original value aft er 4 to 6 d of incu-bation at 46°C in both cultivars. For primed seeds this

reduction is reached aft er approximately 2 d. Seeds partially dried with heating (shelf-life induction) aft er priming showed similar longevity to untreated seeds. In summary, primed seed germinated more rapidly but deteriorated more quickly, and this was reversed by the shelf-life treatment of gradual dehydration with heating. Th ese data are fully consistent with previous studies on seed germination responses and indicate that our treatments are equivalent to those in com-parable studies ( Bruggink et al., 1999; Powell et al.,

Table 2. Effect of priming and priming + induction treatments on speed of germination of seeds of two Brassica cultivars. T50 is the average time to germination of seeds.

TreatmentT50 (days)

Lintop Maverick

Untreated 3.15 2.55

Primed 1.50 2.03

Primed + shelf-life induction 1.36 1.85

Figure 1. Correlation of Brassica genomic and mRNA microarray hybridizations with Arabidopsis genomic DNA and with Brassica replicates. Scatterplots of expression signal values derived from Arabidopsis and Brassica genomic DNA hybridization to the Arabidopsis microarray are shown. Expression metrics for probesets were calculated from the 72nd percentile value of all perfect match probe tiles for each transcript. This was done either including all probes in each probeset (top left) or after excluding “bad” probes that show less than 75% of the hybridization signal to Brassica genomic DNA compared to Arabidopsis genomic DNA (top right). The bottom scatterplots show the reproducibility of replicate hybridizations to Brassica mRNA samples using the standard analysis (bottom left) and excluding “bad” probes (bottom right).

Page 9: The Published online July 16, 2007 Ge ome - Illinoisstan.cropsci.uiuc.edu/publications/Hudson_brassica.pdf · lower transcript abundance could not be validated on a genomic scale

S-104 The Plant Genome [A Supplement to Crop Science] July 2007 No. 2

2000). Th e responses of germination viability to prim-ing and to shelf-life treatment are both signifi cant at the P < 0.001 level.

Exploratory Analysis: Correlation of Transcriptional Profi le with Functional Category

An exploratory analysis was then performed on the results of the expression microarrays from

the seed germination experiments. Th e RNA used for microarray analysis was extracted from Bras-sica seeds of two genotypes (Lintop and Maverick), which were either untreated or subjected to four successive treatments corresponding to established procedures designed to optimize germination rates. In the data shown in Fig. 3, 4, and 5, point 1 repre-sents untreated seed. Point 2 represents seed treated by partial hydration to 54% moisture or “priming.” Aft er this treatment, seeds for point 3 were dried to 6% moisture (priming followed by drying). Seeds for point 4 were treated as for point 2, but instead of drying, they were exposed to the partial dehy-dration with heating referred to as shelf-life induc-tion (Bruggink et al., 1999). Subsequently, RNA for point 5 (seeds partially hydrated, partially dried with heating [shelf-life induced] and then dried fully) was harvested. Th e continuous lines con-necting the points in Fig. 2 are used only to dem-onstrate the pattern used to map the gene clusters. Th erefore, points 1, 2, and 3 are continuous, as are the points 1, 2, 4, and 5, but there is no continuity between points 3 and 4.

Expression values were computed using the algo-rithm described earlier, and data were then fi ltered to remove low-level random values by exclusion of transcripts below the detection threshold (described in “Materials and Methods”). Two replicate arrays were performed on each treatment point for both genotypes. Th e Pearson correlation coeffi cients between replicate arrays from the same treatment and the same genotype were high (between 0.936 and 0.997, mean 0.981). Th e correlations between arrays from the same treatment across the two diff erent genotypes were lower but still high enough to indicate strong correlation (between 0.858 and 0.940, mean 0.902).

In total, 17 886 probesets met our preliminary fi lter criteria; we concluded that the remaining genes are likely not expressed in Brassica seed at high enough levels to detect with confi dence, or are substantially polymorphic between Arabidopsis and Brassica. Th is cutoff eliminated most of the probe-sets with unacceptable noise levels. Th e expression profi les of all 17 886 remaining transcripts, with values normalized to the median on a per-gene basis, are shown in Fig. 3A for the fi ve treatment points described, to demonstrate the global patterns within the total data set. Values represent the mean of the signal values from two independent replicate micro-array experiments. It is apparent from Fig. 3A that there are three clearly visible generic patterns within these lines, an observation that was confi rmed using hierarchical clustering analysis, which was consis-tent with three global clusters of expression patterns (analysis not shown).

Figure 2. Seed germination after accelerated aging. The germination of Brassica oleracea seed from cultivars Lintop and Maverick was assayed after seeds were stored for 2 to 7 d, under conditions designed to accel-erate deterioration in viability (46°C and 75% relative humidity). Unbroken line: untreated seeds; dotted line: partially hydrated (primed) seeds; broken line: primed seeds in which shelf life was induced using a slow drying and heating treatment.

Page 10: The Published online July 16, 2007 Ge ome - Illinoisstan.cropsci.uiuc.edu/publications/Hudson_brassica.pdf · lower transcript abundance could not be validated on a genomic scale

Hudson et al.: Cross-species microarrays for Brassica germination analysis S-105

Figure 3. Exploratory analysis of array data from pre-germination treatment. (A) Superimposed expression pattern of all transcripts detected using one or more probes with hybridization to Brassica genomic DNA. (B) Functional classifi -cation of the detected genes, using Arabidopsis annotation. The pie chart shows the breakdown by functional category (the highest-level MIPS category) of all the detected transcripts. The percentages in parentheses indicate the percent-age of the categories in the whole genome annotation. (C) Clustering of normalized expression data. The graph at top left shows the median-centered expression profi les of all the detected transcripts at the fi ve experimental points: (1) untreated seed, (2) primed seed, (3) primed and dried seed, (4) primed seed after shelf-life induction, and (5) primed seed after shelf-life induction followed by drying. The lower graphs show how these profi les are separated into three distinct patterns, which can be resolved using self-organizing map clustering. The charts to the right of each cluster graph show the relative percentage enrichment or depletion of genes in the functional categories shown in B, in each of the three clusters.

Page 11: The Published online July 16, 2007 Ge ome - Illinoisstan.cropsci.uiuc.edu/publications/Hudson_brassica.pdf · lower transcript abundance could not be validated on a genomic scale

S-106 The Plant Genome [A Supplement to Crop Science] July 2007 No. 2

Figure 4. Expression patterns of probesets from key functional categories showing statistically signifi cant changes. The expression signals for particular probesets corresponding to known Arabidopsis genes are plotted against the treatment of the seed before RNA extraction. The fi ve points on each curve correspond to (1) untreated seed, (2) primed seed, (3) primed and dried seed, (4) primed seed after shelf-life induction, and (5) primed seed after shelf-life induction followed by drying. All probesets show statistically signifi cant differences in expression. Further details and gene lists are available in the supplementary material. Error bars represent the deviation from the mean of two replicate hybridizations.

Page 12: The Published online July 16, 2007 Ge ome - Illinoisstan.cropsci.uiuc.edu/publications/Hudson_brassica.pdf · lower transcript abundance could not be validated on a genomic scale

Hudson et al.: Cross-species microarrays for Brassica germination analysis S-107

Having found three distinct expression patterns that were clearly visible among these genes with detectable mRNA levels in seed, we used clustering of the median-centered data using self-organizing maps to separate the three clusters, which are shown in Fig. 3C. Th e fi rst, and largest, group of 7731 transcripts (cluster 1) is induced by the hydration step, relatively unchanged by the drying step, repressed by the grad-ual dehydration (shelf-life induction) step and again relatively unchanged by drying. Th is behavior is con-sistent with genes that are involved in the mediation of seed responses to hydration, and whose response to priming is reversed by the gradual dehydra-tion– induction treatment. Since this group represents 43% of all the genes detected in seed, the response to seed hydration probably occurs on a genomic scale. Th e second cluster shows no dramatic alteration in gene expression and serves to provide an indication

that the large changes seen in the other two clusters are unlikely to be due to systematic error. Th e third cluster has almost the inverse response pattern of the fi rst, in that most of the genes are strongly repressed by the hydration treatment. However, any of the genes repressed by hydration in this group are not reversed by a subsequent induction in response to gradual dehydration. Many genes in this cluster change only slightly, but several transcripts are strongly hydra-tion-repressed, by more than an order of magnitude in some cases. Compared with other studies that have, for example, shown that 10% of investigated transcripts respond strongly to light in etiolated seed-lings (Tepperman et al., 2001), this result may indicate global changes in the seed or seedling genome during early developmental transitions such as germination and de-etiolation. However, without thorough statisti-cal analysis of a Brassica-specifi c Aff ymetrix array,

Figure 5. Relative expression levels of four validated Brassica orthologs of Arabidopsis genes in functional categories found to be strongly responsive to priming treatment, assayed by real-time quantitative reverse transcription polymerase chain reaction (RT–PCR). The data represent the mean of fi ve replicates, with error bars representing the standard error of the mean. The fi ve points on each curve correspond to (1) untreated seed, (2) primed seed, (3) primed and dried seed, (4) primed seed after shelf-life induction, and (5) primed seed after shelf-life induction followed by drying. The expression levels were normalized fi rst to actin measured from the same RNA sample and then to the mean of all points for each genotype.

Page 13: The Published online July 16, 2007 Ge ome - Illinoisstan.cropsci.uiuc.edu/publications/Hudson_brassica.pdf · lower transcript abundance could not be validated on a genomic scale

S-108 The Plant Genome [A Supplement to Crop Science] July 2007 No. 2

it is not possible to prove this conclusion on a whole-genome scale.

We used the MIPS classifi cation system (http://mips.gsf.de/proj/thal/; verifi ed 11 June 2007) to deter-mine which functional classes of proteins are encoded by the transcripts in each of the three expression pat-tern clusters. Th is system has the advantage of provid-ing a putative functional category for most genes in the Arabidopsis genome. Figure 3B shows the breakdown by functional category of all of the detected transcripts. Th e detected transcripts closely mirror the overall pro-portion of the genome, with the exception of a reduc-tion in the percentage of unknowns (probably many of these genes are misannotated or expressed at low levels) and an increase in the proportion of metabolic genes (a well-characterized and highly expressed group). Th e relative abundance of each functional category in each of the coregulated groups is shown as a bar graph (Fig. 3C, right-hand panel). Th e clusters of genes that are upregulated by the hydration treatment (cluster 1) are enriched in proteins involved in translation and protein synthesis (361 of the detected transcripts fell into this category, 253 of them in cluster 1; signifi -cant by chi-squared at P = 2.5 × 10−25). Th e result that genes related to protein synthesis are transcription-ally activated during hydration was anticipated, since respiration and amino acid metabolism are processes activated very early in seed germination (Bové et al., 2001). Th ey include ribosomal proteins and elongation factors, genes involved in ribosomal RNA biosynthesis, amino acid biosynthesis, protein complex formation and tRNA biosynthesis and assembly. A similar conclu-sion was reached recently by Soeda et al. (2005) using a Brassica cDNA microarray.

Proteins involved in energy production and protein degradation and folding were also overrep-resented in this cluster (372 of 752 degradation and folding, P = 5.4 × 10−4; 118 of 246 energy, P < 0.05). Th ese observations are consistent with previous results at the single-protein or activity level that pro-tease activities (Muntz, 1996) and protein synthesis (Bray, 1995; Bewley, 1997) are activated when seeds are hydrated, but demonstrate that a global eff ect is seen on genes in these classes.

Th e second and third clusters show no strong enrichment in any of the general MIPS functional cat-egories, although the general MIPS categories do not separate storage proteins (discussed below). Th e third cluster contains fewer protein synthesis components and more plant-specifi c proteins (a group that includes storage proteins) than the previous two.

Soeda et al. (2005) divide their transcripts into three groups, which correspond to transcripts repressed during priming–germination, transcripts induced during priming, and transcripts induced

during the later stages of germination (not inves-tigated here). We also observed transcripts whose levels appear to be induced (cluster 1) and repressed (cluster 3) during the process of seed hydration. Th e classes of genes in these groups match well those described by Soeda et al. (2005), including global changes in expression of ribosomal proteins, elongation factors, histone genes and a number of transcription factors and metabolic enzymes (see Supplemental Table 1 for details).

Gene by Gene Statistical Analysis Provides Support for Changes in Gene Regulation

Th e SAM package (Tusher et al., 2001) was used to analyze the signifi cance of the expression values for each gene on a gene-by-gene basis. Using an FDR of 5%, 833 genes were classed as signifi cant. Th ese statistically signifi cant probesets were used for further analysis. In contrast, 16 941 genes were called signifi -cant at a FDR of 50%—while the genes at this high FDR were not further used, this gives evidence that a high number of genes (>8000) change in expression, indicating likely changes in a very large number of transcripts during the process of seed hydration.

Th e 833 statistically signifi cant Arabidopsis homolog genes included large numbers in the same functional categories as those correlated with par-ticular gene clusters shown in Fig. 3. Fift y-nine of the 833 genes are involved in protein synthesis, folding, modifi cation, destination, translation, or ribosome biogenesis, and strikingly, fi ft y-two are ribosomal proteins. Six are involved in energy generation, 4 in amino acid biosynthesis, and 50 in general metabolism. Th ese genes and their expression pat-terns are summarized in Supplemental Table 1.

Storage Protein mRNA Levels Decline in an Irreversible Manner after Hydration Treatment

We broke down the data shown in Fig. 3 into the genes contained within the set of 833 statistically signif-icant genes representing gene families found to corre-late with particular clusters (Fig. 4). Th e mRNAs of the super-abundant seed storage proteins, induced strongly during seed development, are known to persist in dry seeds, forming 50 to 60% of the total mRNA by mass in desiccated soybeans (Goldberg et al., 1981). Th ese mRNA levels are known to decline rapidly during germination; this is thought to be necessary to curtail the synthesis of storage proteins, which are synthesized from these messages in the immediate period aft er seed imbibition (Kermode and Bewley, 1985; Kermode et al., 1985; Bewley et al., 1989; Misra and Bewley, 1985). Th e levels of the messages for oleosins, cruciferins, and

Page 14: The Published online July 16, 2007 Ge ome - Illinoisstan.cropsci.uiuc.edu/publications/Hudson_brassica.pdf · lower transcript abundance could not be validated on a genomic scale

Hudson et al.: Cross-species microarrays for Brassica germination analysis S-109

legumins, and their homologs and isoforms, are dis-played in Fig. 4 and Supplemental Table 1. Two oleosin probesets (At4 g25140 and At5 g51210), one legumin (At2 g28680), and two cruciferins (At1 g03880 and At4 g28520) were found to be statistically signifi cant at the FDR 0.05 level. Th ese messages all decline sharply fol-lowing hydration treatment, with a more than fi vefold reduction in the case of At4 g25140, and a more than 10-fold reduction in the case of At4 g28520. Th e sharp decline in oleosin message levels mirrors the 10- to 15-fold induction of these messages observed in Ara-bidopsis seeds 8 to 10 d aft er fertilization (Ruuska et al., 2002). Interestingly, the message level of the storage proteins and oleosins do not return to the value in the dried seed aft er the gradual dehydration but remain low. Th e removal of “legacy” transcripts of embryo-genesis and seed fi lling is therefore not reversible by gradual dehydration. Th ese results correlate well with those of Soeda et al. (2005), which also demonstrated that storage proteins such as cruciferin, oleosin, napin and the late embryogenesis abundant (LEA) transcripts are repressed by seed priming.

Levels of an Amylase Transcript Are Reduced in Response to Hydration

Th e mRNAs for amylases are strongly induced in aleurone tissue during cereal seed germination, dem-onstrating that at least part of the induction of degrada-tive enzymes is controlled at the level of transcription (Rogers, 1985; Livesley and Bray, 1991). Contrary to the behavior of amylase genes in cereal aleurone cells, we did not observe a strong increase in the expression of any messages for alpha- or beta-amylase during seed hydration (Fig. 4 and 5). In fact, we observed a substan-tial reduction in the level of the mRNA for one highly statistically signifi cant beta-amlyase probeset (Arabi-dopsis gene ID At3 g23920) in response to the hydra-tion–priming treatment. Th is unexpected result was not detected by the limited microarray used by Soeda et al. (2005), and since our microarray experiment used a cross-species hybridization approach, there remained a possibility of error. We therefore validated this observa-tion using RT–PCR on the orthologous Brassica tran-script (Fig. 5).

Levels of Transcripts for Proteolytic and Lipolytic Enzymes Are Differentially Regulated during Priming

We found that one protease gene (At5 g67360, a cucumisin-like serine protease) was signifi cantly induced during the initial hydration treatment, and that two other protease messages (At5 g58870, an FtsH protease, and At4 g39910, a ubiquitin-specifi c protease) were signifi cantly repressed. It is expected

that not all proteases will be induced, as many pro-teases have cellular roles other than the mobiliza-tion of seed reserves. However, the lack of a globally strong induction of genes in this category could indi-cate that the general increases in proteolytic activ-ity observed during germination (Muntz, 1996) are mediated partly at the post-transcriptional level.

We observed a strong and statistically signifi -cant increase in expression of messages for omega-3 and omega-6 fatty acid desaturases in the partially hydrated seed (At2 g29980 and At3 g12120). Inter-estingly, a delta-9 desaturase, At3 g15850, was signifi cantly repressed in seeds subject to the same treatments. Th e stronger eff ect on lipid metabolism than starch metabolism may refl ect the balance of lipid to polysaccharide reserves in cruciferous seed, or a tendency to utilize lipid resources preferentially in this phase of germination (lipid fi lling gene acti-vation occurs in a diff erent phase of seed develop-ment than the induction of starch fi lling transcripts [Ruuska et al., 2002]). While the changes in message levels for the storage proteins themselves (note par-ticularly the behavior of oleosin messages) are not reversed at point 4 (shelf life induction treatment), the changes in the omega-3 fatty acid desaturase, the beta amylase, and the serine protease in response to the initial priming hydration treatment are reversed by the shelf-life treatment. Th erefore, the removal of “legacy” storage transcripts is not tied to the mobili-zation of the reserves stored in the products of these genes, and this mobilization may be partly reversed by the shelf-life treatment.

Analysis of Transcriptional Responses of Putative Biomarkers from the Two Cultivars by RT–PCR

We selected fi ve Arabidopsis probesets from the microarray experiment as potential biomarkers for the quality of primed and shelf-life treated seed: At4 g28520 (Cruciferin), At3 g02560 (40s ribosomal protein), At4 g25140 (Oleosin), At2 g29980 (Omega-3 desaturase), and At3 g23920 (Beta amylase). All showed statistically signifi cant changes in expres-sion except for the 40s ribosomal protein, which was chosen for its close sequence similarity to the only available Brassica ribosomal gene, since many other ribosomal proteins were statistically signifi cantly altered in expression. We assayed the expression of the fi ve homologous Brassica genes by quantitative RT–PCR. All fi ve genes were present in GenBank and are clearly orthologous to the Arabidopsis probes that gave a strong signal response to seed prim-ing treatments. To address the biological variations between the Brassica cultivars, we again replicated

Page 15: The Published online July 16, 2007 Ge ome - Illinoisstan.cropsci.uiuc.edu/publications/Hudson_brassica.pdf · lower transcript abundance could not be validated on a genomic scale

S-110 The Plant Genome [A Supplement to Crop Science] July 2007 No. 2

these experiments in another cultivar, B. oleracea L. convar botrytis L. Alef. var. botrytis cv. Lintop, as used for the priming experiment shown in Fig. 1 and for the microarray experiments. We were able to clearly validate the expression patterns of the repre-sentative storage protein messages we examined, an oleosin (accession X61937) and a cruciferin (accession AAK07609). Both were substantially reduced by the priming treatment in both cultivars. We were also able to reproduce the unexpected behavior of the beta-amylase gene, using the B. napus ortholog AF319168 of the Arabidopsis gene ID At3 g23120 to design the primers. We also investigated the behavior of a fatty acid desaturase, AF056571. We replicated the behav-ior of this gene in the cultivar Maverick; however, the response pattern in cultivar Lintop was diff erent and, similar to the microarray result (Fig. 4), shows noisy data, possibly a result of allelic variation between the cultivars. We were unable to reproduce the general-ized behavior of the components of the translation machinery using primers designed to a 40s ribosomal

protein from B. oleracea (Acc. AF337544) (data not shown). However, the microarray data indicated that ribosomal proteins are induced on a large scale aft er the priming treatment, and we believe this result is likely to be attributable to aberrant behavior of the particular Brassica 40s gene selected (the Arabidopsis homolog of which was not one of the 52 that were statistically signifi cant).

ConclusionsTh e expression of all Brassica genes showing a

detectable hybridization signal to an Arabidopsis array was characterized. Our results correlate well with those obtained using Brassica cDNA arrays (Soeda et al., 2005). We believe, therefore, that the expres-sion values for Arabidopsis gene families as annotated on the microarray correspond closely to the mRNA expression levels of orthologous gene families in Brassica in our experiment. Although our study uses inferred homology to an Arabidopsis array, rather than an array of Brassica genes, it has the advantage

that the entire Arabidopsis gene set is available, rather than the relatively small subset of Brassica genes rep-resented by the available cDNAs and ESTs. Of 25 996 Arabidopsis genes on our microarray, we were able to detect homology by Brassica genomic DNA hybridiza-tion and detectable Brassica mRNA hybridization to probesets representing 17 886 genes. Our study there-fore has more than 10-fold greater depth than that of Soeda et al. (2005), who used a spotted array of 1437 Brassica cDNAs (only 979 of which were sequenced; the number of unigenes is not stated), 33 Arabidopsis probes, and 12 from tomato. In our study, 833 probe-sets showed statistically signifi cant changes in expres-sion. Our results show very close agreement between the over-represented functional classes of genes in particular expression groups detected by correlation analysis, the individual genes called signifi cant in per-gene statistical analysis, and RT–PCR validation (the 40s ribosomal probeset was the only one selected for RT–PCR validation that was not called signifi cant by SAM analysis, and also the only one that was not vali-dated). As with any microarray, validation using a sec-ond technique adds strength to the conclusions; with a cross-species hybridization approach, the nature of the orthologous Brassica genes hybridizing to the Ara-bidopsis array remains unknown, and thus a second technique such as RT–PCR is required to draw fi rm conclusions about particular genes. However, because our microarray results, our RT–PCR results, and the data from the limited Brassica genome microarray agree so closely, we are inclined to speculate on the global patterns of gene expression during seed germi-nation, on the basis of our exploratory dataset.

Th e extent of alteration in mRNA (and presum-ably, protein) synthesis activated by seed priming, while the process of germination can still be reversed by drying, indicates that priming treatment alone is suffi cient to cause genomescale transcriptional remodeling. We suggest that the eff ect of the prim-ing–heating–gradual dehydration process on the induction of protein synthesis could be a contribut-ing factor to seed longevity. One eff ect of hydration treatment that is not reversed by gradual dehydra-tion is the reduction in the abundance of the mes-sages of seed storage proteins, and certain other seed-specifi c messages. It is well known that the mRNAs for storage proteins persist in large quanti-ties in the viable dried seed and that these messages must be removed before the synthesis of the proteins required for germination can proceed. Since the removal of seed storage protein messages seems to be the strongest eff ect of partial hydration, and is not fully reversed in either cultivar by slow drying (Fig. 4), the marked diff erence between the transcript profi le of treated and untreated seeds may be the key

We believe that the expression

values for Arabidopsis gene families

as annotated on the microarray

correspond closely to the mRNA

expression levels of orthologous

gene families in Brassica.

Page 16: The Published online July 16, 2007 Ge ome - Illinoisstan.cropsci.uiuc.edu/publications/Hudson_brassica.pdf · lower transcript abundance could not be validated on a genomic scale

Hudson et al.: Cross-species microarrays for Brassica germination analysis S-111

to the eff ectiveness of the pregermination treatments described here. Primed seed, which lack seed-spe-cifi c transcripts such as storage proteins, may, on germination, immediately activate a system of pro-tein synthesis unencumbered by the legacy mRNAs generated during seed development.

Th e data set we generated using cross-species hybridization controlled by genomic DNA hybridiza-tion gave us useful biological insight into the processes potentially controlling the seed hydration viability phenomenon. We were able to show that four of fi ve orthologous Brassica transcripts were in broad agree-ment with the results of our exploratory microarray, as were the results of small-scale hybridizations with Bras-sica-specifi c arrays. We believe therefore that we have described a useful and robust method for exploratory, cross-species use of model-organism microarrays.

AcknowledgmentsWe thank Melvin Wei and Tyson Meyer for helping with sample

preparation and performing GeneChip microarray experiments,

Cor Coenen for performing the seed treatments, and Wenqiong

Chen for helpful discussions. Funding for this study was provided

by Syngenta Research and Technology, and by a startup grant to

MEH from the University of Illinois.

ReferencesApplied Biosystems. 1997. ABI PRISM 7700 sequence detection

System: Relative quantitation of gene expression. User bulletin

no. 2. Applied Biosystems, Foster City, CA.

Bewley, J.D. 1997. Seed germination and dormancy. Plant Cell

9:1055–1066.

Bewley, J.D., A.R. Kermode, and S. Misra. 1989. Dessication and

minimal drying treatments of seeds of castor bean and Phaseo-

lus vulgaris which terminate development and promote germi-

nation cause changes in protein and messenger RNA synthesis.

Ann. Bot. (Lond.) 63:3–17.

Bewley, J.D., F.D. Hempel, S. McCormick, and P. Zambryski. 2000.

Reproductive development. p. 988–1043. In B.B. Buchanan, W.

Gruissem, and R.L. Jones (ed.) Biochemistry and molecular

biology of plants. ASPP, Rockville, MD.

Bewley, J.D., and A. Marcus. 1990. Gene expression in seed devel-

opment and germination. Prog. Nucleic Acid Res. Mol. Biol.

38:165–193.

Bové, J., M. Jullien, and P. Grappin. 2001. Functional genomics in

the study of seed germination. Genome Biol. 3:1002.1–1002.5.

Bray, C.M. 1995. Biochemical processes during the osmopriming of

seeds. p. 767–789. In J. Kiegel and G. Galili (ed.) Seed develop-

ment and germination. Marcel Dekker, New York.

Bruggink, G.T., J.J.J. Ooms, and P. van der Toorn. 1999. Induction of

longevity in primed seeds. Seed Sci. Res. 9:49–53.

Bruggink GT, and P. Van der Toorn. 1995. Induction of desiccation

tolerance in germinated seeds. Seed Sci. Resear. 5:1–4.

Buitink, J., O. Leprince, M.A. Hemminga, and F.A. Hoekstra. 2000.

Molecular mobility in the cytoplasm: An approach to describe

and predict lifespan of dry germplasm. Proc. Natl. Acad. Sci.

USA 97:2385–2390.

Chismar, J.D., T. Mondala, H.S. Fox, E. Roberts, D. Langford, E.

Masliah, D.R. Salomon, and S.R. Head. 2002. Analysis of result

variability from high-density oligonucleotide arrays compar-

ing same-species and cross-species hybridizations. Biotech-

niques 33:516–524.

Girke, T., J. Todd, S. Ruuska, J. White, C. Benning, and J. Ohlrogge.

2000. Microarray analysis of developing Arabidopsis seeds.

Plant Physiol. 124:1570–1581.

Goldberg, R.B., G. Hoschek, G.S. Ditta, and R.W. Breidenbach. 1981.

Developmental regulation of cloned superabundant embryo

mRNAs in soybean. Dev. Biol. 83:218–231.

Hall, A.E., A. Fiebig, and D. Preuss. 2002. Beyond the Arabidop-

sis genome: Opportunities for comparative genomics. Plant

Physiol. 129:1439–1447.

Jacobsen, J.V., and L.R. Beach. 1985. Control of transcription of

α-amylase and rRNA genes in barley aleurone protoplasts by

gibberellin and abscisic acid. Nature 316:275–277.

Kermode, A.R., and J.D. Bewley. 1985. Th e role of maturation dry-

ing in the transition from seed development to germination:

II. Post-germinative enzyme production and soluble protein

synthetic pattern changes within the endosperm of Ricinus

communis L. seeds. J. Exp. Bot. 36:1916–1927.

Kermode, A.R., D.J. Giff ord, and J.D. Bewley. 1985. Th e role of

maturation drying in the transition from seed development to

germination: III. Insoluble protein synthetic pattern changes

within the endosperm of Ricinus communis L. seeds. J. Exp.

Bot. 36:1928–1936.

Koch, M., B. Haubold, and T. Mitchell-Olds. 2001. Molecular sys-

tematics of the Brassicaceae: Evidence from coding plastidic

matK and nuclear Chs sequences. Am. J. Bot. 88:534–544.

Lee, J.-L., M.E. Williams, S.V. Tingey, and J.A. Rafalski. 2002. DNA

array profi ling of gene expression changes during maize

embryo development. Funct. Integr. Genomics 2:13–27.

Lee, H.S., J. Wang, L. Tian, H. Jiang, I.J. Black, A. Madlung, B. Watson,

L. Lukens, J.C. Pires, J.J. Wang, L. Comai, T.C. Osborn, R.W.

Doerge, and Z.J. Chen. 2004. Sensitivity of 70-mer oligonucle-

otides and cDNAs for microarray analysis of gene expression in

Arabidopsis and its related species. Plant Biotechnol. J. 2:45–57.

Lipshultz, R.J., S.P.A. Fodor, T.R. Gingeras, and D.J. Lockhart.

1999. High density synthetic oligonucleotide arrays. Nat.

Genet. 21:20–24.

Livesley, M.A., and C.M. Bray. 1991. Th e eff ects of ageing upon α-

amylase production and protein synthesis by wheat aleurone

layers. Ann. Bot. (Lond.) 68:69–73.

Loguinov, A.V., I.S. Mian, and C.D. Vulpe. 2004. Exploratory dif-

ferential gene expression analysis in microarray experiments

with no or limited replication. Genome Biol. 5:R18.

Misra, S., and J.D. Bewley. 1985. Reprogramming of protein synthesis

from a developmental to a germinative mode induced by dessica-

tion of the axes of Phaseolus vulgaris. Plant Physiol. 78:876–882.

Monte, E., J.M. Tepperman, B. Al-Sady, K. Kaczorowski, J.M.

Alonso, J.R. Ecker, X. Li, Y. Zhang, and P.H. Quail. 2004. Th e

phytochrome-interacting transcription factor, PIF3, acts early,

selectively and positively in light-induced chloroplast develop-

ment. Proc. Natl. Acad. Sci. USA 101:16091–16098.

Muntz, K. 1996. Proteases and proteolytic cleavage of storage pro-

teins in developing and germinating dicotyledonous seeds. J.

Exp. Bot. 47:605–622.

Ogawa, M., A. Hanada, Y. Yamauchi, A. Kuwahara, Y. Kamiya, and

S. Yamaguchi. 2003. Gibberellin biosynthesis and response

during Arabidopsis seed germination. Plant Cell 15:1591–1604.

O’Hara, P., A.R. Slabas, and T. Fawcett. 2002. Fatty acid and lipid

biosynthetic genes are expressed at constant molar ratios but

diff erent absolute levels during embryogenesis. Plant Physiol.

129:310–320.

O’Neill, C.M., and I. Bancroft. 2000. Comparative physical

mapping of segments of the genome of Brassica oleracea

var. alboglabra that are homeologous to sequenced regions

of chromosomes 4 and 5 of Arabidopsis thaliana. Plant J.

23:233–243.

Page 17: The Published online July 16, 2007 Ge ome - Illinoisstan.cropsci.uiuc.edu/publications/Hudson_brassica.pdf · lower transcript abundance could not be validated on a genomic scale

S-112 The Plant Genome [A Supplement to Crop Science] July 2007 No. 2

Paterson, A.H., T.-H. Lan, R. Amasino, T.C. Osborn, and C. Quiros.

2001. Brassica genomics, a complement to, and early benefi ciary

of, the Arabidopsis sequence. Genome Biol. 3:1011.1–1011.4.

Powell, A.A., L.J. Yule, H.C. Jing, S.P.C. Groot, R.J. Bino, and H.W.

Pritchard. 2000. Th e infl uence of aerated hydration seed

treatment as assessed by the viability equations. J. Exp. Bot.

51:2031–2043.

Quiros, C.F., F. Grellet, J. Sadowski, T. Suzuki, G. Li, and T.B. Wro-

blewski. 2001. Arabidopsis and Brassica comparative genomics:

Sequence, structure, and gene content in the ABI1-Rps2-Ck1 chro-

mosomal segment and related regions. Genetics 157:1321–1330.

Rogers, J.C. 1985. Two barley alpha-amylase gene families are regu-

lated diff erently in aleurone cells. J. Biol. Chem. 260:3731–3738.

Ruuska, S.A., T. Girke, C. Benning, and J.B. Ohlrogge. 2002. Con-

trapuntal networks of gene expression during Arabidopsis seed

fi lling. Plant Cell 14:1191–1206.

Salter, M.G., K.A. Franklin, and G.C. Whitelam. 2003. Gating of

the rapid shade-avoidance response by the circadian clock in

plants. Nature 426:680–683.

Sambrook, J., E.F. Fritsch, and T. Maniatis. 1989. Molecular clon-

ing: A laboratory manual. 2nd ed. Cold Spring Harbor Press,

Plainview, NY.

Soeda, Y., M.C. Konings, O. Vorst, A.M. van Houwelingen, G.M.

Stoopen, C.A. Maliepaard, J. Kodde, R.J. Bino, S.P. Groot, and

A.H. van der Geest. 2005. Gene expression programs during

Brassica oleracea seed maturation, osmopriming, and germi-

nation are indicators of progression of the germination process

and the stress tolerance level. Plant Physiol. 137:354–368.

Tepperman, J.M., T. Zhu, H.S. Chang, X. Wang, and P.H. Quail. 2001.

Multiple transcription-factor genes are early targets of phyto-

chrome A signaling. Proc. Natl. Acad. Sci. USA 98:9437–9442.

Tusher, V.G., R. Tibshirani, and G. Chu. 2001. Signifi cance analysis

of microarrays applied to the ionizing radiation response.

Proc. Natl. Acad. Sci. USA 98:5116–5121.

van der Geest, A.H.M. 2002. Seed genomics: Germinating opportu-

nities. Seed Sci. Res. 12:145–153.

Welbaum, G.E., Z. Shen, M.O. Oluoch, and L.W. Jett. 1998. Th e

evolution and eff ects of priming vegetable seeds. Seed Technol.

20:209–235.

Wolfe, K.H., W.-H. Li, and P.M. Sharp. 1987. Rates of nucleo-

tide substitution vary greatly among plant mitochondrial,

chloroplast and nuclear DNAs. Proc. Natl. Acad. Sci. USA

86:6201–6205.

Yang, Y.-W., K.N. Lai, P.-Y. Tai, and W.-H. Li. 1999. Rates of nucleo-

tide substitution in angiosperm mitochondrial DNA sequences

and dates of divergence between Brassica and other angio-

sperm lineages. J. Mol. Evol. 48:597–604.

Zhou, Y., and R. Abagyan. 2002. Match-only integral distribution

(MOID) algorithm for high-density oligonucleotide array

analysis. BMC Bioinformatics 3:3.

Zhu, T. 2003. Global analysis of gene expression using GeneChip

microarrays. Curr. Opin. Plant Biol. 6:418–425.

Zhu, T., P. Budworth, W. Chen, N. Provart, H.-S. Chang, S. Guimil,

B. Estes, G. Zou, and X. Wang. 2003. Transcriptional control of

nutrient partitioning during rice grain fi lling. Plant Biotech-

nol. J. 1:59–70.

Zhu, T., P. Budworth, B. Han, D. Brown, H.-S. Chang, G. Zou, and

X. Wang. 2001a. Toward elucidating the global gene expression

patterns of developing Arabidopsis: Parallel analysis of 8300

genes by a high-density oligonucleotide probe array. Plant

Physiol. Biochem. 39:221–242.

Zhu, T., H.S. Chang, J. Schmeits, P. Gil, L. Shi, P. Budworth, G. Zou,

X. Chen, and X. Wang. 2001b. Gene expression microarrays,

improvements, and applications towards agricultural gene

discovery. J. Assoc. Lab. Automation 6:95–98.

Zhu, T., and X. Wang. 2000. Large-scale profi ling of the Arabidopsis

transcriptome. Plant Physiol. 124:1472–1476.