Profiling the Developing Jatropha curcas L. Seed Transcriptome by Pyrosequencing

Andrew J King & Yi Li & Ian A Graham

Published online: 28 January 2011 © Springer Science+Business Media, LLC. 2011

Abstract Jatropha curcas L. has received much attention recently as a potential oilseed crop for the production of renewable oil. Despite the interest in this crop, relatively little is known about the molecular biology of this species compared with more established oilseed crops. To gain a more detailed understanding of the processes involved in deposition of oil and protein within Jatropha seeds, we conducted high-throughput sequencing analysis of the transcriptome of developing J. curcas seeds using 454 sequencing. A single sequencing run yielded 195,692 sequences (46 Mbp) of raw sequence data. Assembly of this sequence data produced 12,419 contigs and 17,333 singletons. BLASTX searches of the contigs revealed that storage proteins were the most abundant transcripts. Oleosins, ribosomal proteins, metallothioneins and late embryogenesis abundant proteins were also highly represented. Curcin, a type-I ribosome-inactivating protein, accounted for 0.7% of the transcriptome. No transcripts for type-II ribosome-inactivating proteins were found, suggesting that these are not present in the seeds of J. curcas. To test the power of 454 sequencing compared to conventional gene sequencing as a tool for gene discovery, a search for homologues of genes involved in the conversion of sucrose to triacylglycerol was conducted. Hits for all the known genes in this process were obtained. Pyrosequencing of the J. curcas developing seed transcriptome has provided a valuable increase in the amount of sequence data currently available for this species. The sequence data will be of great use to those engaged in J. curcas research and crop improvement.

Keywords Biodiesel . Expressed sequence tags . Jatropha curcas . Developing seeds

Introduction

In recent years, much interest has been generated in the potential of Jatropha curcas L. as a perennial oilseed crop for cultivation in tropical and sub-tropical regions. Many plantations are now being established in Asia, Africa and Latin America for the production of biodiesel, and J. curcas is therefore likely to become an increasingly important oilseed crop over the next decade [1, 2]. J. curcas also has a history of cultivation under a number of less intensive agricultural systems. It can be used as a fencing crop, as a shade crop for plants such as vanilla or for controlling soil erosion [3].

Despite the interest in this species, relatively little is presently known about the molecular biology of the plant compared with other crop species [2]. As use of J. curcas as an industrial scale crop is only a recent development, there is considerable scope for the agronomic improvement of the species through plant breeding and biotechnology.

Expressed sequence tag (EST) databases are a valuable tool for sampling the transcriptome of a particular organism or tissue and providing insight into the biological processes. The cDNA data generated by such studies can be used in further gene expression studies such as microarrays or qPCR.

Electronic supplementary material The online version of this article (doi:10.1007/s12155-011-9114-x) contains supplementary material, which is available to authorized users.

A. J. King · Y. Li · I. A. Graham (*)
Centre for Novel Agricultural Products, Department of Biology, University of York, Heslington, York YO10 5DD, UK
e-mail: [email protected]

A. J. King
e-mail: [email protected]

Y. Li
e-mail: [email protected]

Bioenerg. Res. (2011) 4:211–221 DOI 10.1007/s12155-011-9114-x

http://dx.doi.org/10.1007/s12155-011-9114-x

Additionally, cDNA sequences are useful resources for the identification of coding regions in genomic DNA. A number of EST databases have been established for developing seeds, including the model species Arabidopsis thaliana [4], crops such as castor [5, 6] and sesame [7], and species producing unusual fatty acids such as Momordica charantia and Impatiens balsamina [8]. These EST databases have been obtained using conventional dye-terminator sequencing and contain between 743 [6] and 10,522 [4] single pass reads. Dye-terminator sequencing has also recently been used to produce 7,320 and 5,929 ESTs from developing and germinating seeds of J. curcas, respectively [9]. However, obtaining further depth of sequence data using dye-terminator sequencing is prohibitively expensive.

In recent years, both the throughput and the speed of transcriptome sequencing projects have improved vastly through the use of new sequencing technologies such as 454 pyrosequencing [10–12]. To gain further insight into the seed biology of J. curcas, we constructed an EST database using 454 sequencing. We compare the effectiveness of the two sequencing approaches for identifying genes involved in key metabolic processes such as lipid biosynthesis.

Methods

Lipid Analysis

Seeds were collected from manually pollinated J. curcas plants. The mass of the seeds was recorded, and the seeds were then lyophilized. The lyophilized material was ground to a fine powder and FAMES analysis was performed as described previously [13].

RNA Extraction, cDNA Synthesis and 454 Sequencing

RNA was extracted from developing seeds of J. curcas using a CTAB/lithium acetate procedure [14]. cDNA was then synthesised using the Dualsystems Biotech EasyClone cDNA library construction kit (Schlieren, Switzerland). cDNA amplification was performed using 16 cycles of long-distance PCR. Primers were removed using Invitrogen cDNA size fractionation columns (Invitrogen, Carlsbad, CA, USA). Five micrograms of cDNA from three different developmental stages were then pooled and sent to Cogenics (Meylan Cedex, France) for 454 sequencing using the GLS-FLX platform.

Sequence Assembly, Analysis and BLAST Searching

The raw sequence reads were stripped of primer sequences with custom Perl scripts. High-quality sequences that were longer than 40 nucleotides and contained less than 3% unknown (N) residues were then selected and assembled into contiguous sequences using the CAP3 DNA sequence assembly program with default parameters [15]. The assembled contigs were annotated locally using a BLAST 2.0 search [16] of the NCBI non-redundant peptide database with the BLASTX algorithm. To identify specific sequences relating to genes involved in lipid biosynthesis, we also conducted TBLASTN searches of the J. curcas transcriptome dataset using peptide sequences corresponding to genes listed on the Arabidopsis Lipid Gene Database [17]. To estimate the number of sequences containing full-length clones, start codons and stop codons, 100 sequences were selected at random and the sequence alignments obtained using BLASTX were compared. This analysis was performed in triplicate, and values are reported as the mean ± SEM.
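The read selection criteria and the mean ± SEM summary described above can be illustrated with a minimal Python sketch. This is not the authors' Perl code: the function names, the example reads and the triplicate values are hypothetical, and the SEM is assumed to be the sample standard deviation divided by the square root of the number of replicates.

```python
from math import sqrt
from statistics import mean, stdev

MIN_LENGTH = 40         # reads must be longer than 40 nucleotides
MAX_N_FRACTION = 0.03   # and contain less than 3% unknown (N) residues


def passes_quality_filter(read: str) -> bool:
    """Return True if a primer-trimmed read meets both selection criteria."""
    if len(read) <= MIN_LENGTH:
        return False
    n_fraction = read.upper().count("N") / len(read)
    return n_fraction < MAX_N_FRACTION


def mean_and_sem(values):
    """Mean and standard error of the mean (assumed: sample SD / sqrt(n))."""
    return mean(values), stdev(values) / sqrt(len(values))


# Hypothetical reads, for illustration only.
reads = [
    "ACGT" * 15,           # 60 nt, no N residues: kept
    "ACGT" * 5,            # 20 nt: too short
    "ACGT" * 12 + "NNNN",  # 52 nt, about 7.7% N: too many unknown residues
]
kept = [read for read in reads if passes_quality_filter(read)]

# Hypothetical triplicate percentages, not the values measured in the study.
avg, sem = mean_and_sem([25.0, 26.0, 27.0])
print(len(kept), avg, sem)
```

Both thresholds are strict inequalities here, matching the wording "longer than 40 nucleotides" and "less than 3%".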

Results and Discussion

454 Sequencing of Developing J. curcas Seeds

The seeds of J. curcas are endospermic, i.e. the bulk of the storage reserves are deposited within the endosperm rather than the embryo. In castor, another endospermic seed of the Euphorbiaceae, endosperm development follows a pattern in which an initial free-nuclear stage progresses to cellularization and maturation. After an initial phase in which seeds grow rapidly to full size, lipid deposition begins during the cellularization stage, when the endosperm becomes distinct [18]. In order to select developing stages in which oil deposition is occurring, we analysed the lipid content of J. curcas seeds at various intervals after pollination. Total lipid content was determined and compared to that of mature seeds (Supplementary Figure 1). The analysis indicated that seed development in J. curcas follows a similar pattern to seed development in Ricinus communis, with oil deposition occurring after the seeds become full-sized. We selected stages at 58, 63 and 70 days after pollination for the 454 sequencing library. cDNA was prepared from mRNA extracted from developing seeds of these three developmental stages and pooled in equal amounts. A single half-run on the GLS-FLX sequencer yielded 195,692 sequences with an average length of 234 bp (46 Mbp). After trimming and removal of low quality reads, 187,314 sequences with an average length of 220 bp (41 Mbp) were assembled using the CAP3 program to produce 12,419 contigs and 17,333 singletons (29,752 unique sequences). Most of the contigs were composed of relatively few sequences (median=3, mean=13.7). The mean and median contig lengths (excluding singletons) were found to be 401 and 322 bp, respectively (Fig. 1a and b). The short contig length, the relatively small number of ESTs per contig and the large number of singleton sequences indicate that the transcriptome sampling had not reached saturation. Therefore, many genes are likely to be represented by more than one contig or singleton. A BLASTX search was performed on the consensus sequence of the assembled contigs against the NCBI non-redundant peptide database. A BLASTX hit with an E value less than 10^-10 was obtained for 6,942 (56%) of the 12,419 contigs (Supplementary Table 1). A diagrammatic representation of the most abundantly represented classes is shown in Fig. 2, with a summary of the 50 most abundantly represented genes shown in Table 1. As anticipated from other developing seed EST databases [4–7], storage proteins are the most abundantly expressed genes in the developing seeds of J. curcas, accounting for 24% of the transcriptome, and the six most abundantly represented sequences were all storage proteins. Ribosomal proteins were the next most abundantly expressed transcripts, accounting for 4.3% of the transcriptome. Oleosins account for 2.8% of the transcriptome. Transcripts were detected for five oleosin genes, three of which have previously been deposited within GenBank (Supplementary Table 2). Other abundantly represented sequences include metallothioneins (1.7%) and late embryogenesis/seed maturation related proteins (0.7%). Analysis of a random subset of the contigs revealed that 26.0±1.0% contained a start codon and 34.7±1.9% contained a stop codon. Full-length coding regions were detected in 15.0±2.3% of the contigs.
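The assembly figures quoted in this section can be cross-checked with a few lines of arithmetic. The Python sketch below is illustrative only. It assumes that the abundance percentages are ESTs per contig divided by the 187,314 reads that entered assembly, which the text does not state explicitly, and it takes the Fig. 1b bin counts from the bar labels of that figure; the variable and function names are hypothetical.

```python
ASSEMBLED_READS = 187_314   # reads retained after trimming and filtering
CONTIGS = 12_419
SINGLETONS = 17_333
ANNOTATED_CONTIGS = 6_942   # contigs with a BLASTX hit below the E-value cut-off
CURCIN_ESTS = 1_336         # ESTs in the curcin contig (Table 1)

# Contigs per "sequences per contig" bin, as read from the bar labels of Fig. 1b.
FIG_1B = [
    ("2", 4736), ("3", 2258), ("4", 1290), ("5", 826), ("6", 554),
    ("7", 388), ("8", 273), ("9", 223), ("10", 158), ("11-20", 782),
    ("21-50", 524), ("51-100", 225), ("101-200", 123),
    ("201-1000", 48), (">1000", 11),
]


def percent(part: int, whole: int) -> float:
    """Express part as a percentage of whole."""
    return 100.0 * part / whole


def median_bin(bins):
    """Return the label of the bin that contains the median observation."""
    total = sum(count for _, count in bins)
    midpoint = (total + 1) / 2
    running = 0
    for label, count in bins:
        running += count
        if running >= midpoint:
            return label
    raise ValueError("empty histogram")


unique_sequences = CONTIGS + SINGLETONS
annotated_pct = percent(ANNOTATED_CONTIGS, CONTIGS)
# Assumption: abundance (%) = ESTs in a contig / reads that entered assembly.
curcin_pct = percent(CURCIN_ESTS, ASSEMBLED_READS)

print(unique_sequences, round(annotated_pct), round(curcin_pct, 1))
print(sum(count for _, count in FIG_1B), median_bin(FIG_1B))
```

Under these assumptions the bin counts add up to the reported number of contigs, and the median falls in the three-sequence bin, in line with the median reported in the text.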

Ribosome-Inactivating Proteins of J. curcas

A single isoform of curcin [19], a type-I ribosome-inactivating protein (RIP), was present in the EST database as the tenth most abundantly represented sequence (GenBank TSA accession EZ417711), accounting for 0.7% of the total transcriptome. Transcripts for curcin 2 [20] were not present. Unlike ricin from castor, the type-I RIPs of J. curcas lack a lectin cell binding domain and are thus thought to be only mildly cytotoxic upon ingestion [21, 22]. Although curcins have historically been compared to ricin, it is not known whether the seeds of J. curcas do in fact contain type-II RIPs [2]. As there are currently plans to cultivate millions of hectares of land with J. curcas, the presence of a type-II RIP within the seeds could have serious consequences in instances of accidental ingestion or deliberate poisoning [23]. As few as two castor beans can cause death in humans if ingested orally, and around 8% of clinically reported cases of accidental castor bean ingestion are fatal [23, 24], especially where access to medical help is limited. To determine whether the seeds of J. curcas may contain a type-II RIP, we performed a TBLASTN search using the ricin precursor from R. communis. No type-II RIPs were found. Although this does not confirm the absence of type-II RIPs in J. curcas, the lack of any detectable transcripts in the seed EST database strongly suggests that seeds do not contain such proteins. The absence of type-II RIPs is therefore a positive factor for the development of this species as a crop.

Storage Lipid Biosynthesis

In oilseeds, sucrose is converted into triacylglycerol via a series of compartmentalised reactions. Sucrose is first converted into pyruvate through glycolysis. This occurs in both the cytosol and plastids, as both glucose-6-phosphate and phosphoenolpyruvate can be imported into the plastid. Starch granules in the plastid may also be converted into pyruvate via the plastidial glycolytic pathway. Pyruvate is then converted into acetyl-CoA. De novo fatty acid biosynthesis then occurs in the plastid via the elongation system which involves three different ketoacyl synthases. Saturated and monounsaturated fatty acids are then

Fig. 1 Distribution of contig lengths (bp) and sequences per contig. a Size ranges of the 12,419 contigs in bp (x-axis: contig length (bp); y-axis: no. of contigs). b The number of individual sequence reads per contig (x-axis: sequences per contig; y-axis: no. of contigs)


exported from the plastid (i.e. palmitate (16:0), palmitoleate (16:1), stearate (18:0) and oleate (18:1)) after being hydrolysed by acyl–acyl carrier protein (ACP) thioesterases [25] and subsequently converted to coenzyme-A (CoA) esters by plastidial long-chain CoA synthetases [26]. The acyl-CoAs then serve as donors for glycerolipid biosynthesis via a series of reactions in the endoplasmic reticulum involving the Kennedy pathway [27]. Although the processes involving the conversion of sucrose to triacylglycerol are not fully understood, gene sequences for many of the enzymes required have been identified [17]. We searched the J. curcas EST database for all the nuclear encoded genes involved in the conversion of sucrose to TAG (Table 2 and Fig. 3). Homologues corresponding to all A. thaliana genes known to be involved in these processes were present. Transcripts corresponding to the cytosolic glycolysis pathway were particularly abundant, with glyceraldehyde-3-phosphate dehydrogenase, aldolase and phosphoglycerate kinase all accounting for ≥0.1% of the transcriptome. Transcripts for the plastidial glycolytic pathway were less abundant, and representatives for all enzymes in the plastidial oxidative pentose phosphate pathway were also present, suggesting that both are operational in developing J. curcas seeds. ESTs corresponding to all steps involved in de novo fatty acid biosynthesis and export from the plastid were present. Transcripts for enzymes of the Kennedy pathway were much less abundant than those of the glycolysis and de novo fatty acid biosynthesis pathways. Only three ESTs were present for the previously cloned diacylglycerol acyltransferase I (GenBank accession DQ278448). One EST was also identified for a putative type-2 diacylglycerol acyltransferase [28]. The relatively low level of ESTs for the final steps of triglyceride biosynthesis has previously been observed in other seed EST databases. Only three transcripts for DGATs were found in the developing seed database of A. thaliana [4] whilst no DGAT1 or DGAT2 transcripts have been reported in the EST databases of R. communis [5, 6]. Although the relative lack of DGAT transcripts in developing seeds of J. curcas is quite surprising, it should be noted that DGAT1 protein levels are known to be post-transcriptionally regulated in castor [29]. Triacylglycerol can also be formed by the transacylation of diacylglycerol by phospholipids [30]. Six ESTs were detected for a putative PDAT.

In addition to the previously identified oleate desaturase of J. curcas (GenBank accession ABA41034/EZ409947), we identified transcripts for a second oleate desaturase (EZ414061). In total, 161 ESTs were detected for oleate desaturase, but only seven corresponded to linoleate desaturase. This is consistent with the low concentrations of linolenic acid (typically <0.5%) that have been reported in J. curcas oil [2].

Phorbol Ester Biosynthesis

Although there is considerable interest in the use of J. curcas seeds as a source of renewable oil, the seed meal from this species contains phorbol esters, which limits its use as an animal feed [2]. Phorbol esters are tigliane diterpenoids, and the first committed step in the biosynthesis of the tigliane is likely to be the conversion of a 20-carbon isoprenoid diphosphate, geranylgeranyl diphosphate (GGPP), into a macrocyclic diterpenoid by a terpene synthase. The isoprenoid precursors may be provided by either the cytosolic mevalonate pathway or the plastidial methylerythritol phosphate (MEP) pathway [31]. Analysis of the J. curcas EST database revealed that transcripts were present for all steps of both these pathways (Supplementary Table 3). However, relatively few transcripts were detected, with between 1 and 10 ESTs for each step in the MEP pathway. Although it is not clear whether phorbol esters are products of the mevalonate or MEP pathway, it is likely that the MEP pathway is involved. So far, all known plant diterpene synthases involved in the biosynthesis of secondary metabolites are located in the plastid [32–34]. In addition to being a precursor for a diverse range of secondary metabolites, GGPP is a precursor for a range of other compounds in plants, including chlorophylls, prenylated proteins and gibberellins. The GGPP precursors used for these compounds are primarily derived from the MEP pathway [35, 36]. It should be noted, however, that A. thaliana contains GGPP synthases located in various compartments in the cell, including the plastid, mitochondria and ER [37]. The J. curcas seed EST database contained only one EST corresponding to geranylgeranyl pyrophosphate synthase (Supplementary Table 3), which appeared to

Fig. 2 Diagrammatic representation of the most abundantly expressed gene sequences in the J. curcas EST database. Functional classification of EST sequences in the J. curcas seed transcriptome based on BLASTX annotations


Table 1 Fifty most abundantly represented transcripts in developing seeds of J. curcas

| Rank | Annotation | Organism | E value | GenBank | ESTs | % |
|---|---|---|---|---|---|---|
| 1 | Legumin-like protein | Ricinus communis | 1e-179 | AAF73007 | 9,341 | 5.0 |
| 2 | 11S globulin seed storage protein 2 precursor | Sesamum indicum | 1e-112 | Q9XHP0 | 8,542 | 4.6 |
| 3 | 11S globulin precursor isoform 2B | Ficus pumila | 1e-135 | ABK80753 | 8,269 | 4.4 |
| 4 | 2S albumin precursor | Ricinus communis | 2e-25 | P01089 | 7,756 | 4.1 |
| 7 | Oleosin | Corylus avellana | 2e-41 | AAO65960 | 3,247 | 1.7 |
| 8 | Metallothionein-like protein | Gossypium hirsutum | 2e-18 | AAV74186 | 2,668 | 1.4 |
| 9 | BURP | Medicago truncatula | 9e-64 | ABE82234 | 2,032 | 1.1 |
| 10 | Curcin precursor | Jatropha curcas | 1e-162 | AAL58089 | 1,336 | 0.7 |
| 11 | Protease inhibitor/seed storage/lipid transfer protein | Arabidopsis thaliana | 1e-12 | NP_194817 | 1,250 | 0.7 |
| 12 | Oleosin 1 | Camellia oleifera | 4e-43 | ABF57559 | 988 | 0.5 |
| 13 | Late embryogenesis abundant protein D-7 | Gossypium hirsutum | 1e-8 | P13939 | 982 | 0.5 |
| 14 | Oleosin | Ricinus communis | 4e-38 | AAR15171 | 904 | 0.5 |
| 15 | Protease inhibitor/seed storage/lipid transfer protein | Arabidopsis thaliana | 3e-8 | NP_194817 | 827 | 0.4 |
| 16 | Vicilin-like protein | Anacardium occidentale | 1e-138 | AAM73730 | 639 | 0.3 |
| 17 | 2S albumin precursor | Ricinus communis | 2e-15 | P01089 | 575 | 0.3 |
| 18 | Elongation factor 1α | Litchi chinensis | 0.0 | ABF00115 | 571 | 0.3 |
| 19 | 2S albumin precursor | Ricinus communis | 2e-15 | P01089 | 532 | 0.3 |
| 20 | Protease inhibitor/seed storage/lipid transfer protein | Arabidopsis thaliana | 3e-22 | NP_188456 | 514 | 0.3 |
| 21 | Glutaredoxin | Vernicia fordii | 4e-40 | O81187 | 489 | 0.3 |
| 22 | Hypothetical protein | Homo sapiens | 1e-5 | XP_001714526 | 479 | 0.3 |
| 23 | Translationally controlled tumour protein homolog | Hevea brasiliensis | 6e-73 | Q9ZSW9 | 461 | 0.2 |
| 24 | Hypothetical protein | Medicago truncatula | 2e-60 | ABE77875 | 441 | 0.2 |
| 25 | No significant homology | N/A | N/A | N/A | 439 | 0.2 |
| 26 | Metallothionein-like protein type 3 | Carica papaya | 2e-26 | Q96386 | 420 | 0.2 |
| 27 | Thiazole biosynthetic enzyme | Citrus sinensis | 1e-146 | O23787 | 397 | 0.2 |
| 28 | Fructose-bisphosphate aldolase-like protein | Solanum tuberosum | 1e-179 | ABC01905 | 393 | 0.2 |
| 29 | Cationic peroxidase | Nelumbo nucifera | 1e-162 | ABN46984 | 374 | 0.2 |
| 30 | Metallothionein-1 like protein | Oenanthe javanica | 6e-6 | AAB70560 | 371 | 0.2 |
| 31 | No significant homology | N/A | N/A | N/A | 357 | 0.2 |
| 32 | Alpha tubulin 1 | Pseudotsuga menziesii | 0.0 | AAV92352 | 352 | 0.2 |
| 33= | Cystatin-like protein | Arabidopsis thaliana | 2e-27 | AAM64661 | 348 | 0.2 |
| 33= | LEA protein in group 3 | Arabidopsis thaliana | 1e-30 | BAA11017 | 348 | 0.2 |
| 35 | Foot protein 1 variant 3 | Perna canaliculus | 2e-11 | AAY29135 | 326 | 0.2 |
| 36 | β-tonoplast intrinsic protein | Arabidopsis thaliana | 2e-99 | NP_173223 | 314 | 0.2 |
| 37 | Aquaporin | Ricinus communis | 1e-125 | CAE53881 | 307 | 0.2 |
| 38 | Calmodulin 4 | Daucus carota | 3e-79 | AAQ63461 | 305 | 0.2 |
| 39 | Gamma-thionin | Phaseolus vulgaris | 5e-21 | CAL68581 | 279 | 0.1 |
| 40 | Annexin-like protein RJ4 | Fragaria x ananassa | 1e-118 | P51074 | 274 | 0.1 |
| 41 | Glyceraldehyde 3-phosphate dehydrogenase | Daucus carota | 1e-165 | AAR84410 | 267 | 0.1 |
| 42 | S-adenosylmethionine synthetase 1 | Catharanthus roseus | 0.0 | Q96551 | 261 | 0.1 |
| 43 | Thioredoxin H-type (TRX-H) | Ricinus communis | 1e-47 | Q43636 | 257 | 0.1 |
| 44 | Foot protein-4 variant-1 | Mytilus californianus | 4e-19 | ABC84184 | 245 | 0.1 |
| 45 | BURP domain-containing protein | Bruguiera gymnorrhiza | 8e-61 | BAB60849 | 239 | 0.1 |
| 46 | Protein disulfide-isomerase precursor | Ricinus communis | 0.0 | Q43116 | 235 | 0.1 |
| 47 | Unknown protein | Arabidopsis thaliana | 2e-75 | NP_199381 | 234 | 0.1 |
| 48= | Unknown protein | Arabidopsis thaliana | 3e-75 | NP_199381 | 231 | 0.1 |
| 48= | NAD-dependent malate dehydrogenase | Prunus persica | 1e-176 | AAL11502 | 231 | 0.1 |
| 48= | Cold acclimation-induced protein | Morus mongolica | 3e-7 | AAZ82815 | 231 | 0.1 |

GenBank annotation obtained from a BLASTX search of the non-redundant GenBank database. In most instances, the GenBank entry with the lowest E value is shown. In instances where GenBank entries lack an informative annotation, a more informative entry with a similar E value is shown

Table 2 ESTs corresponding to enzymes involved in the conversion of sucrose into triacylglycerol in developing seeds of J. curcas

Conversion of sugars into triacylglycerol

| Step | Protein | Arabidopsis thaliana | Jatropha curcas contigs | Hits | % |
|---|---|---|---|---|---|
| | Plasma membrane: sucrose transport | | | | |
| 1 | Sucrose transporter, plasma membrane | At1g22710, At1g09960 | EZ408931 (3), 1 singleton | 4 | 0.002 |
| | Cytosol: glycolysis | | | | |
| 2a | Sucrose synthase | At4g02280, At3g43190, At5g20830 | EZ412730 (132), EZ409118 (79), EZ418493 (14), EZ408538 (7), EZ412560 (3), EZ417361 (3), EZ411575 (3), 3 singletons | 244 | 0.130 |
| 2b | Neutral invertase | At4g34860, At1g56560, At4g09510, At5g22510 | EZ411576 (22), EZ414627 (3), EZ409268 (3), EZ411325 (2), 6 singletons | 36 | 0.019 |
| 3 | Fructokinase | At3g59480, At5g51830 | EZ409442 (45), EZ419234 (4), EZ418398 (4), EZ408118 (3), EZ417644 (2) | 58 | 0.031 |
| 4 | UDP-glucose pyrophosphatase | At5g17310 | EZ409644 (46), EZ416162 (9), 2 singletons | 57 | 0.030 |
| 5 | Phosphoglucomutase | At1g70730 | EZ415046 (59), 1 singleton | 60 | 0.032 |
| 6 | Phosphoglucose isomerase | At5g42740 | EZ417899 (12), EZ417840 (3), 2 singletons | 17 | 0.009 |
| 7a | Phosphofructokinase, PPi dependent, α-subunit | At1g20950 | EZ414861 (20), EZ416216 (12), 1 singleton | 33 | 0.018 |
| 7b | Phosphofructokinase, PPi dependent, β-subunit | At1g12000 | EZ417466 (22), EZ408944 (9), 1 singleton | 32 | 0.017 |
| 8 | Fructose-1,6-bisphosphatase | At1g43670 | EZ409829 (7), EZ409223 (3), 1 singleton | 11 | 0.006 |
| 9 | Phosphofructokinase, ATP dependent | At4g26270, At5g47810, At4g29220 | EZ413282 (10), EZ419322 (2), EZ413809 (2), 2 singletons | 16 | 0.008 |
| 10 | Aldolase | At2g36460 | EZ417077 (393), 3 singletons | 396 | 0.211 |
| 11 | Triosephosphate isomerase | At3g55440 | EZ411669 (84), EZ419272 (8) | 92 | 0.049 |
| 12 | Glyceraldehyde-3-phosphate dehydrogenase | At3g04120, At1g13440 | EZ418803 (267), EZ414174 (166), EZ408884 (5), EZ415041 (4), 6 singletons | 448 | 0.260 |
| 13 | Phosphoglycerate kinase | At1g79550 | EZ415003 (200) | 200 | 0.107 |
| 14 | Phosphoglyceromutase | At1g09780, At3g08590 | EZ415634 (77), EZ417060 (7) | 84 | 0.045 |
| 15 | Enolase | At2g36530, At2g29560 | EZ408879 (154), EZ414691 (16), EZ413751 (6), EZ407590 (5), 3 singletons | 184 | 0.098 |
| 16 | Pyruvate kinase | At3g52990, At5g63680, At5g56350 | EZ416804 (69), EZ415965 (66), EZ417571 (25), EZ414131 (10), EZ408315 (4), EZ412706 (3), EZ417199 (2), EZ410136 (2), 1 singleton | 182 | 0.097 |
| | Plastidial membrane: hexose and triose translocation | | | | |
| 17 | Glucose-6-phosphate translocator | At5g54800, At1g61800 | EZ414407 (20), EZ407316 (8), EZ412437 (5), 2 singletons | 35 | 0.018 |
| 18 | Triosephosphate translocator | At5g46110 | EZ409314 (4), EZ413060 (2), 2 singletons | 8 | 0.004 |
| 19 | Phosphoenolpyruvate translocator | At5g33320, At3g01550 | EZ418777 (16), EZ416154 (4) | 20 | 0.010 |
| 20 | Pyruvate translocator | ? | N/A | ? | |
| | Plastid: starch metabolism | | | | |
| 21 | Phosphoglucomutase | At5g51820 | EZ411416 (11), 1 singleton | 12 | 0.006 |
| 22a | ADP-glucose pyrophosphorylase, large subunit | At4g39210, At1g27680 | 2 singletons | 2 | 0.001 |
| 22b | ADP-glucose pyrophosphorylase, small subunit | At5g48300 | EZ409731 (4) | 4 | 0.002 |
| 23a | Starch synthase, soluble | At4g18240, At5g24300 | EZ418005 (8), EZ417219 (3), 2 singletons | 13 | 0.007 |
| 23b | Starch synthase, granule bound | At1g32900 | 1 singleton | 1 | <0.001 |
| 24 | Starch branching enzyme | At2g36390, At5g03650 | EZ412460 (11), EZ409986 (5), EZ408038 (3), EZ417876 (2), 3 singletons | 24 | 0.012 |
| 25 | Alpha-glucan phosphorylase | At3g46970, At3g29320 | EZ417617 (4), EZ408368 (3), EZ413639 (2), 7 singletons | 16 | 0.009 |
| 26 | Isoamylase-type debranching enzyme | At2g39930, At4g09020, At1g03310 | EZ414064 (4), EZ410452 (2), EZ415438 (2), EZ418838 (2), EZ414406 (2), 2 singletons | 14 | 0.007 |
| 27 | Hexokinase | At4g29130, At1g47840 | EZ409840 (8), EZ411568 (3), EZ417173 (2), 2 singletons | 15 | 0.008 |
| | Plastid: glycolysis | | | | |
| 28 | Phosphoglucose isomerase | At4g24620 | EZ418419 (14), EZ411967 (5), EZ416342 (4), 1 singleton | 24 | 0.013 |
| 29 | Phosphofructokinase, ATP dependent | At5g61580 | EZ409668 (3) | 3 | 0.002 |
| 30 | Fructose-1,6-bisphosphatase | At3g54050 | EZ407309 (3) | 3 | 0.002 |
| 31 | Aldolase | At2g01140, At4g38970 | EZ412757 (41), EZ410272 (17), EZ408925 (4), 1 singleton | 63 | 0.034 |
| 32 | Triosephosphate isomerase | At2g21170 | EZ414807 (9), EZ414069 (2), 1 singleton | 12 | 0.006 |
| 33 | Glyceraldehyde-3-phosphate dehydrogenase | At1g16300 | EZ411341 (88) | 88 | 0.047 |
| 34 | Phosphoglycerate kinase | At3g12780 | EZ412071 (4), EZ412862 (2) | 6 | 0.003 |
| 35 | Phosphoglyceromutase | ? | N/A | ? | |
| 36 | Enolase | At1g74030 | EZ410277 (5), EZ408005 (2), EZ417705 (2) | 9 | 0.005 |
| 37 | Pyruvate kinase | At3g22960, At5g52920, At1g32440 | EZ416570 (42), EZ411975 (10), EZ412722 (3), EZ417791 (2), EZ414623 (2), 2 singletons | 61 | 0.033 |
| | Plastid: oxidative pentose phosphate pathway | | | | |
| 38 | Glucose-6-phosphate 1-dehydrogenase | At5g40760, At1g09420, At3g27300 | EZ413626 (3), EZ415974 (3), EZ409337 (2), EZ415084 (2), EZ418636 (2), 3 singletons | 15 | 0.008 |
| 39 | 6-phosphogluconolactonase | At5g24400 | EZ414757 (49) | 49 | 0.026 |
| 40 | Gluconate-6-phosphate dehydrogenase | At1g64190, At3g02360 | EZ418999 (48), EZ418049 (19), EZ418763 (17), EZ417122 (4), EZ409709 (3), EZ407558 (2), 2 singletons | 95 | 0.051 |
| 41 | Ribose-5-phosphate isomerase | At3g04790 | EZ408297 (8), EZ409515 (7), EZ409093 (2) | 17 | 0.009 |
| 42 | Ribulose-5-phosphate-3-epimerase | At5g61410 | EZ417906 (5), EZ409987 (2) | 7 | 0.004 |
| 43 | Transketolase | At2g45290, At3g60750 | EZ414696 (7), EZ419056 (2), EZ411257 (2), EZ417305 (2), 1 singleton | 14 | 0.007 |
| 44 | Transaldolase | At1g12230 | EZ417076 (25), EZ407732 (7) | 32 | 0.017 |
| | Plastid: fatty acid biosynthesis | | | | |
| 45a | Pyruvate dehydrogenase E1α | At1g01090 | EZ416172 (22), EZ418184 (14) | 36 | 0.019 |
| 45b | Pyruvate dehydrogenase E1β | At2g34590 | EZ407513 (46) | 46 | 0.025 |
| 45c | Pyruvate dehydrogenase E2 | At3g25860, At1g34430 | EZ411666 (9), EZ419364 (8), EZ409742 (6), EZ409692 (6), EZ411924 (5), EZ411067 (3), 3 singletons | 40 | 0.021 |
| 46a | Acetyl-CoA carboxylase, biotin carboxyl carrier protein subunit (CAC1a) | At5g16390 | EZ407402 (42), EZ407836 (8), 1 singleton | 51 | 0.027 |
| 46b | Acetyl-CoA carboxylase, biotin carboxylase subunit (CAC2) | At5g35360 | EZ409708 (9), EZ414205 (3), 1 singleton | 13 | 0.007 |
| 46c | Acetyl-CoA carboxylase, carboxyltransferase subunit (CAC3) | At2g38040 | EZ415913 (6), EZ411529 (4), EZ418040 (2), 1 singleton | 13 | 0.007 |
| 47 | Ketoacyl-ACP synthase | At1g62640 | EZ408551 (5), EZ414997 (2), 3 singletons | 10 | 0.005 |
| 48 | Ketoacyl-ACP reductase | At1g24360 | EZ414660 (34) | 34 | 0.018 |
| 49 | 3-Hydroxyacyl-ACP dehydratase | At5g10160, At2g22230 | EZ415979 (31) | 31 | 0.017 |
| 50 | Enoyl-ACP reductase | At2g05990 | EZ416240 (30), EZ416721 (3) | 33 | 0.018 |
| 51 | Ketoacyl-ACP synthase I | At5g46290 | EZ414040 (30), EZ409421 (9), EZ410862 (7), EZ418649 (4), EZ410815 (4) | 54 | 0.029 |
| 52 | Ketoacyl-ACP synthase II | At1g74960 | EZ419430 (15), EZ412129 (7), EZ418491 (5), 1 singleton | 28 | 0.015 |
| 53 | Stearoyl-ACP desaturase | At3g02630, At2g43710 | EZ418273 (56), EZ412266 (44), EZ415569 (22), EZ416762 (21), EZ418900 (6), 3 singletons | 152 | 0.081 |
| 54 | Acyl-ACP thioesterase | At4g13050, At1g08510 | EZ419462 (73), EZ418948 (4), EZ414385 (2), 1 singleton | 80 | 0.043 |
| 55 | Acyl-CoA synthetase (plastid outer envelope) | At1g77590, At2g04350 | EZ417598 (4), EZ408632 (3), 3 singletons | 10 | 0.005 |
| | Endoplasmic reticulum: Kennedy pathway | | | | |
| 56 | Glycerol-3-phosphate acyltransferase | At3g11430, At4g00400, At5g06090 | EZ417873 (2), 2 singletons | 4 | 0.002 |
| 57 | Lysophosphatidic acid acyltransferase | At3g57650, At1g75020, At3g18850 | EZ417701 (3), EZ408857 (2), EZ413603 (2), 1 singleton | 8 | 0.004 |
| 58 | Phosphatidic acid phosphatase | At1g15080 | 1 singleton | 1 | 0.001 |
| 59 | Diacylglycerol cholinephosphotransferase | At3g25585 | EZ415027 (4) | 4 | 0.002 |
| 60 | Oleate desaturase | At3g12120 | EZ409947 (88), EZ414061 (77), 2 singletons | 167 | 0.089 |
| 61 | Linoleate desaturase | At2g29980 | EZ407668 (2), EZ419293 (2), 3 singletons | 7 | 0.004 |
| 62 | Phospholipid diacylglycerol acyltransferase | At5g13640, At3g44830 | EZ407935 (2), 3 singletons | 5 | 0.003 |
| 63 | Diacylglycerol acyltransferase I | At2g19450 | EZ412185 (3) | 3 | 0.002 |
| 63 | Diacylglycerol acyltransferase II | At3g51520 | 1 singleton | 1 | <0.001 |
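The Hits column of Table 2 appears to be the sum of the per-contig EST counts (given in parentheses) and the listed singletons. The Python sketch below checks this for the sucrose synthase row. The parsing function and its name are hypothetical, and using the 187,314 assembled reads as the denominator for the % column is an assumption rather than something the table states.

```python
import re

ASSEMBLED_READS = 187_314  # assumed denominator for the % column

# "EZ412730 (132)" -> accession and the number of ESTs in that contig
CONTIG_PATTERN = re.compile(r"(EZ\d+)\s*\((\d+)\)")
# "3 singletons" or "1 singleton" -> number of unassembled reads
SINGLETON_PATTERN = re.compile(r"(\d+)\s+singletons?")


def count_hits(cell: str) -> int:
    """Sum the per-contig EST counts and any singletons listed in a table cell."""
    contig_hits = sum(int(n) for _, n in CONTIG_PATTERN.findall(cell))
    singleton_hits = sum(int(n) for n in SINGLETON_PATTERN.findall(cell))
    return contig_hits + singleton_hits


# Contig cell for step 2a (sucrose synthase), copied from Table 2.
sucrose_synthase = (
    "EZ412730 (132), EZ409118 (79), EZ418493 (14), EZ408538 (7), "
    "EZ412560 (3), EZ417361 (3), EZ411575 (3), 3 singletons"
)

hits = count_hits(sucrose_synthase)
share = 100.0 * hits / ASSEMBLED_READS
print(hits, round(share, 3))
```

The same function can be applied to any other row to compare the summed counts with the value printed in the Hits column.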


be similar to a plastidial isoform (At4g36810, [37]). Transcripts were also detected for the small subunit of geranyl(geranyl)diphosphate synthase. These proteins are catalytically inactive themselves, but moderate the activity of GGPP synthases to confer geranyl diphosphate (GPP) synthase activity [38].

The conversion of GGPP into the tigliane diterpene requires a diterpene synthase. In angiosperms, other than the diterpene synthases involved in the biosynthesis of gibberellins and other phytohormones, only a few diterpene synthases have been characterised to date, including casbene and neocembrene synthases from R. communis and other Euphorbiaceae [32, 39]. Phylogenetic analysis of plant terpene cyclases reveals that they can be divided into six subfamilies based on the chain length of the substrates used, involvement in primary or secondary metabolism and taxonomy [40]. Angiosperm (Magnoliophyta) sesquiterpene cyclases (using FPP as substrate) and casbene synthase all belong to family A (TpsA). The casbene synthase sequences contain a putative N-terminal plastid transit peptide which is absent from the sesquiterpene cyclases. Analysis of the J. curcas EST database revealed a number of ESTs from the TpsA gene family (Supplementary Table 3). However, they appear to be most similar to sesquiterpene cyclases rather than plastidial diterpene synthases.

Comparison in Efficiency of Gene Discovery Using Dye-Terminator and Pyrosequencing

Direct comparisons of the efficiency of gene discovery between our dataset and that provided by Costa et al. [9] are problematic, as different developmental stages were selected. At present, there are no standard descriptors of seed stages of J. curcas, but based on the relative abundance of sequences for storage proteins, oleosins and “late embryogenesis” related sequences, the data set provided by Costa et al. [9] is likely to include earlier developmental stages than those studied in this report. The 41 Mbp of 454 sequence data produced in this study provided 12,419 contigs and 17,333 singletons (29,752 unique sequences). The assembled sequence data obtained from both developing and germinating seed libraries produced by Costa et al. contained a total of 7 Mbp of data and yielded 1,606 contigs and 5,677 singletons (7,283 unique sequences) [9]. The increased depth of coverage obtained from 454 sequencing permitted the discovery of a larger number of genes involved in key biological processes such as lipid biosynthesis. For example, we were able to obtain sequences corresponding to all stages of plastidial fatty acid biosynthesis. Sequences for ketoacyl-ACP synthase III, acyl-ACP thioesterase and 3-hydroxyacyl-ACP dehydratase were not detected in the

[Fig. 3 pathway diagram: sucrose to triacylglycerol across the cytosol, plastid and ER, ending at the nascent oil body, with enzyme steps numbered as in Table 2.]

Fig. 3 Schematic representation of the steps involved in the conversion of sucrose into triacylglycerol in J. curcas seeds. The enzymes involved in the various steps are represented by numbers and are detailed in Table 2. Genes corresponding to the enzymes involved in steps indicated with dashed arrows (8 and 20) are presently unknown. UDP-Glc uridine diphosphate-glucose, Glc-1-P glucose-1-phosphate, Glc-6-P glucose-6-phosphate, Fruc-6-P fructose-6-phosphate, Fruc-1,6-P fructose-1,6-bisphosphate, DHAP dihydroxyacetone phosphate, GA3P glyceraldehyde-3-phosphate, 1,3-BPG 1,3-bisphosphoglycerate, 3-PG 3-phosphoglycerate, 2-PG 2-phosphoglycerate, PEP phosphoenolpyruvate, α-1,4-Glc α-1,4-glucan, Glc glucose, ADP-Glc adenosine diphosphate-glucose, 6-PGL 6-phosphogluconolactone, 6-PG 6-phosphogluconate, Ru-5-P ribulose-5-phosphate, R-5-P ribose-5-phosphate, Xu-5-P xylulose-5-phosphate, E-4-P erythrose-4-phosphate, S-7-P sedoheptulose-7-phosphate, CoA coenzyme A, ACP acyl carrier protein, G3P glycerol-3-phosphate, LPA lysophosphatidic acid, PA phosphatidic acid, DAG diacylglycerol, PC phosphatidylcholine, TAG triacylglycerol


developing seed EST library obtained from 7,320 dye-terminator sequencing reads. For stearoyl-ACP desaturase, we obtained sequence corresponding to the full coding region for orthologues of two Arabidopsis genes (At2g43710 and At3g02630). No orthologues of plastidial stearoyl-ACP desaturases were found in the study of Costa et al. [9]. Similarly, we were able to obtain sequence data for the full coding region of two oleate desaturase genes in our J. curcas library, whereas the Costa et al. study contains only partial sequence data for one of these genes [9]. In summary, the pyrosequencing data presented in this study provide a greater depth of sequence coverage than was obtained in previous studies, and sequence data for a larger number of genes present in the developing seed transcriptome.
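The assembly figures quoted in this comparison can be checked with simple arithmetic: the number of unique sequences in each dataset is the sum of its contigs and singletons. The sketch below uses only the counts given in the text; the `Assembly` class and the fold-increase figure are illustrative additions and were not part of the original analysis.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Assembly:
    """Summary of one EST assembly (hypothetical container for illustration)."""
    label: str
    mbp: float       # total sequence data in Mbp, as quoted in the text
    contigs: int
    singletons: int

    @property
    def unique_sequences(self) -> int:
        # Unique sequences = assembled contigs + unassembled singletons.
        return self.contigs + self.singletons


this_study = Assembly("454 pyrosequencing (this study)", 41, 12_419, 17_333)
costa = Assembly("dye-terminator (Costa et al. [9])", 7, 1_606, 5_677)

for a in (this_study, costa):
    print(f"{a.label}: {a.unique_sequences:,} unique sequences from {a.mbp:g} Mbp")

# Ratio of unique sequences between the two datasets.
fold = this_study.unique_sequences / costa.unique_sequences
print(f"fold increase in unique sequences: {fold:.1f}")
```

The totals come to 29,752 and 7,283 unique sequences, roughly a fourfold difference. As noted above, the two libraries sample different developmental stages, so this ratio is only a rough indicator of gene-discovery efficiency.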

Interestingly, Costa et al. report an exceptionally high level of transposable elements (TEs) in their developing seed transcriptome, with around 11% of the ESTs showing significant homology to TEs. Further analysis of the EST dataset deposited by Costa et al. revealed that these were Ty3-gypsy type LTR retrotransposons. Analysis of our dataset revealed only 31 ESTs with significant homology to LTR retrotransposons. These differences in the level of TEs present in the two transcriptome datasets could be due to the selection of material at different developmental stages in the two studies, or to possible stress induction caused by the removal of the testa in the study of Costa et al.

Conclusions

Transcriptome analysis of developing J. curcas seeds using a single run of the GS-FLX produced 41 Mbp of sequence data after removal of low-quality sequences and primers. Assembly of these sequences resulted in 12,419 contigs. Despite the greater depth of sequence coverage achieved with pyrosequencing, most contigs were relatively short and contained few sequences, indicating that sampling of the developing seed transcriptome had not reached saturation. Nonetheless, a search of the ESTs in this database revealed homologues of all known sequences involved in pathways for the conversion of sucrose into storage lipid (TAG). The sequence data therefore provide a useful resource for further transcriptome studies (qPCR, etc.) or as a dataset for sequence analysis in proteomic studies. To achieve more saturated coverage of the developing seed transcriptome, further cDNA sequencing reactions could be performed after normalisation of the cDNA population. Alternatively, the relatively small size of the J. curcas genome [41], coupled with recent increases in the throughput of 454 sequencing, makes genome sequencing of J. curcas a viable proposition. The lack of type-II RIPs within the sequence database suggests that J. curcas seeds do not contain RIPs with a lectin domain. The availability of the sequence data presented in this manuscript will provide a useful resource for those engaged in J. curcas research.

Acknowledgements This work was supported by funding from the Garfield Weston Foundation. The individual sequence reads have been deposited in the GenBank Short Read Archive (SRA) as accession SRR027577. The 12,419 contigs have been deposited in the GenBank Transcriptome Shotgun Assembly (TSA) archive as accessions EZ407282-EZ419700. The annotations obtained from a BLASTX search with these contigs are presented in Supplementary Table 1.
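The TSA accession range quoted above can be expanded into individual accessions, for example to build a list for batch retrieval or to confirm that the range covers all 12,419 contigs. This is a minimal sketch that assumes the accessions are numbered consecutively with a fixed-width numeric suffix; the helper `expand_accessions` is hypothetical and not part of any GenBank tool.

```python
def expand_accessions(first: str, last: str) -> list[str]:
    """Expand an inclusive accession range such as EZ407282-EZ419700."""
    # Split the alphabetic prefix from the fixed-width numeric suffix.
    prefix = first.rstrip("0123456789")
    width = len(first) - len(prefix)
    if last.rstrip("0123456789") != prefix or len(last) != len(first):
        raise ValueError("accessions must share a prefix and a numeric width")
    start, end = int(first[len(prefix):]), int(last[len(prefix):])
    if end < start:
        raise ValueError("last accession precedes first accession")
    return [f"{prefix}{n:0{width}d}" for n in range(start, end + 1)]


accessions = expand_accessions("EZ407282", "EZ419700")
print(len(accessions))                 # number of accessions in the range
print(accessions[0], accessions[-1])   # first and last accession
```

Under the consecutive-numbering assumption, the range contains 12,419 accessions, which matches the number of contigs reported.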

References

1. Fairless D (2007) Biofuel: the little shrub that could—maybe. Nature 449:652–655

2. King AJ et al (2009) Potential of Jatropha curcas as a source of renewable oil and animal feed. J Exp Bot 60(10):2897–2905

3. Heller J (1996) Physic nut. Jatropha curcas L. Promoting the conservation and use of underutilized and neglected crops. Institute of Plant Genetics and Crop Research, Gatersleben, Germany and International Plant Genetic Resource Institute, Rome, p 66

4. White JA et al (2000) A new set of Arabidopsis expressed sequence tags from developing seeds. The metabolic pathway from carbohydrates to seed oil. Plant Physiol 124(4):1582–1594

5. Lu C, Wallis J, Browse J (2007) An analysis of expressed sequence tags of developing castor endosperm using a full-length cDNA library. BMC Plant Biol 7(1):42

6. van de Loo FJ, Turner S, Somerville C (1995) Expressed sequence tags from developing castor seeds. Plant Physiol 108(3):1141–1150

7. Chung Suh M et al (2003) Comparative analysis of expressed sequence tags from Sesamum indicum and Arabidopsis thaliana developing seeds. Plant Mol Biol 52(6):1107–1123

8. Cahoon EB et al (1999) Biosynthetic origin of conjugated double bonds: production of fatty acid components of high-value drying oils in transgenic soybean embryos. Proc Nat Acad Sci USA 96(22):12935–12940

9. Costa G et al (2010) Transcriptome analysis of the oil-rich seed of the bioenergy crop Jatropha curcas L. BMC Genomics 11(1):462

10. Emrich SJ et al (2007) Gene discovery and annotation using LCM-454 transcriptome sequencing. Genome Res 17(1):69–73

11. Cheung F et al (2006) Sequencing Medicago truncatula expressed sequenced tags using 454 Life Sciences technology. BMC Genomics 7(1):272

12. Weber APM et al (2007) Sampling the Arabidopsis transcriptome with massively parallel pyrosequencing. Plant Physiol 144(1):32–42

13. Larson TR, Graham IA (2001) A novel technique for the sensitive quantification of acyl CoA esters from plant tissues. Plant J 25(1):115–125

14. Gasic K, Hernandez A, Korban SS (2004) RNA extraction from different apple tissues rich in polyphenols and polysaccharides for cDNA library construction. Plant Mol Biol Rep 22(4):437a–437g

15. Huang X, Madan A (1999) CAP3: a DNA sequence assembly program. Genome Res 9(9):868–877

16. Altschul SF et al (1997) Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucl Acids Res 25(17):3389–3402


17. Beisson F et al (2003) Arabidopsis genes involved in acyl lipid metabolism. A 2003 census of the candidates, a study of the distribution of expressed sequence tags in organs, and a web-based database. Plant Physiol 132:681–697

18. Chen GQ et al (2007) Expression profiles of genes involved in fatty acid and triacylglycerol synthesis in castor bean (Ricinus communis L.). Lipids 42:263–274

19. Juan L et al (2003) Cloning and expression of curcin, a ribosome-inactivating protein from the seeds of Jatropha curcas. Acta Bot Sin 45(7):858–863

20. Qin W et al (2005) Expression of a ribosome inactivating protein (curcin 2) in Jatropha curcas is induced by stress. J Biosci 30(3):351–357

21. Barbieri L, Battelli MG, Stirpe F (1993) Ribosome-inactivating proteins from plants. Biochim Biophys Acta 1154:237–282

22. Hartley MR, Lord JM (2004) Genetics of ribosome inactivating proteins. Mini-Rev Med Chem 4:487–492

23. Audi J et al (2005) Ricin poisoning: a comprehensive review. J Am Med Assoc 294(18):2342–2351

24. Challoner KR, McCarron MM (1990) Castor bean intoxication. Ann Emerg Med 19(10):1177–1183

25. Salas JJ, Ohlrogge JB (2002) Characterization of substrate specificity of plant FatA and FatB acyl-ACP thioesterases. Arch Biochem Biophys 403(1):25–34

26. Schnurr JA et al (2002) Fatty acid export from the chloroplast. Molecular characterization of a major plastidial acyl-coenzyme A synthetase from Arabidopsis. Plant Physiol 129(4):1700–1709

27. Baud S, Dubreucq B, Miquel M, Rochat C, Lepiniec L (2008) Storage reserve accumulation in Arabidopsis: metabolic and developmental control of seed filling. In: The Arabidopsis Book. Rockville MD: American Society of Plant Biologists

28. Lardizabal KD et al (2001) DGAT2 is a new diacylglycerol acyltransferase gene family: purification, cloning, and expression in insect cells of two polypeptides from Mortierella ramanniana with diacylglycerol acyltransferase activity. J Biol Chem 276(42):38862–38869

29. He X et al (2004) Regulation of diacylglycerol acyltransferase in developing seeds of castor. Lipids 39(9):865–871

30. Ståhl U et al (2004) Cloning and functional characterization of a phospholipid:diacylglycerol acyltransferase from Arabidopsis. Plant Physiol 135(3):1324–1335

31. Lichtenthaler HK (1999) The 1-deoxy-D-xylulose-5-phosphate pathway of isoprenoid biosynthesis in plants. Annu Rev Plant Physiol Plant Mol Biol 50(1):47–65

32. Mau CJ, West CA (1994) Cloning of casbene synthase cDNA: evidence for conserved structural features among terpenoid cyclases in plants. Proc Nat Acad Sci USA 91(18):8497–8501

33. Wildung MR, Croteau R (1996) A cDNA clone for taxadiene synthase, the diterpene cyclase that catalyzes the committed step of taxol biosynthesis. J Biol Chem 271(16):9201–9204

34. Martin DM, Faldt J, Bohlmann J (2004) Functional characterization of nine Norway spruce TPS genes and evolution of gymnosperm terpene synthases of the TPS-d subfamily. Plant Physiol 135(4):1908–1927

35. Gerber E et al (2009) The plastidial 2-C-methyl-D-erythritol 4-phosphate pathway provides the isoprenyl moiety for protein geranylgeranylation in tobacco BY-2 cells. Plant Cell 21(1):285–300

36. Kasahara H et al (2002) Contribution of the mevalonate and methylerythritol phosphate pathways to the biosynthesis of gibberellins in Arabidopsis. J Biol Chem 277(47):45188–45194

37. Okada K et al (2000) Five geranylgeranyl diphosphate synthases expressed in different organs are localized into three subcellular compartments in Arabidopsis. Plant Physiol 122(4):1045–1056

38. Wang G, Dixon RA (2009) Heterodimeric geranyl(geranyl)diphosphate synthase from hop (Humulus lupulus) and the evolution of monoterpene biosynthesis. Proc Nat Acad Sci USA 106(24):9914–9919

39. Kirby J et al (2010) Cloning of casbene and neocembrene synthases from Euphorbiaceae plants and expression in Saccharomyces cerevisiae. Phytochemistry 71(13):1466–1473

40. Bohlmann J, Meyer-Gauen G, Croteau R (1998) Plant terpenoid synthases: molecular biology and phylogenetic analysis. Proc Nat Acad Sci USA 95:4126–4133

41. Carvalho CR et al (2008) Genome size, base composition and karyotype of Jatropha curcas L., an important biofuel plant. Plant Sci 174(6):613–617
