Post on 28-Dec-2015

A metagenetic approach for revealing community structure of marine planktonic copepods

J. HIRAI,* M. KURIYAMA,* T. ICHIKAWA,* K. HIDAKA* and A. TSUDA†
*National Research Institute of Fisheries Science, Fisheries Research Agency, 2-12-4 Fukuura, Kanazawa, Yokohama, Kanagawa 236-8648, Japan
†Atmosphere and Ocean Research Institute, The University of Tokyo, 5-1-5 Kashiwanoha, Kashiwa, Chiba 277-8564, Japan

Molecular Ecology Resources (2015) 15, 68–80

Keywords: 28S rDNA: D2, 454 genome sequencer, biodiversity, Copepoda, metagenetics

Introduction

Marine ecosystem
Planktonic copepods: most abundant, widely distributed
Arthropoda; 388-522 Mya; 11,500 morphological species (plus new and cryptic species)

Important: marine food webs and biogeochemical cycles
Indicator: sensitive to environmental changes (natural and anthropogenic stressors)

=> Study their community structure
=> Understanding and monitoring changes in marine ecosystems

Horizontal distribution: studied locally, not on global scales
Reasons: morphological classification is time-consuming and requires experts; cryptic species; immature stages are difficult to identify, etc.

Solution: DNA-based genetic analysis
1) Sanger sequencing
-> time-consuming and cost-intensive, because individuals must be sorted and analysed one by one
2) Metagenetic method
-> useful for surveying species richness in metazoans
-> MOTUs: molecular operational taxonomic units based on sequence similarity
-> independent of morphological classification
-> effective tool for rapidly and comprehensively revealing community structure

DNA barcoding

Purpose: facilitate species identification based on similarity to known sequences in databases (DB)
Common molecular marker: COI (highly variable 5' end of the COI gene)

Copepods: only a few studies have used COI as a molecular marker
COI has a high evolutionary rate, which makes it difficult to design primers for a broad group

rRNA: common genetic marker for metagenetic analysis, with both variable and conserved regions
Large subunit rDNA (LSU): more variable, used for species identification
D2 region of LSU (about 350 bp): hypervariable region between conserved regions
-> suitable for designing universal primers
LSU region: more sequences available in DB than for COI
Used for metagenetic analysis of Haptophyta

In this study:
Develop a metagenetic method for revealing the community structure of copepods using 454 pyrosequencing
=> Rapid and comprehensive analysis of copepod community structure becomes possible
Propose: a new, efficient technique for assessing copepod community structure (diversity and distribution of planktonic copepods)

Materials and methods

4 plankton community samples (1 artificial (Art.), 3 field-collected (FC))
1. Determine the similarity threshold for MOTU clustering using Art.
2. Apply the new method to the 3 FC samples (high species diversity; significant hydrographic variation observed)

* Compare: metagenetic vs. morphological analysis
* Evaluate the accuracy of the new method

Artificial community samples
33 species (subtropical regions off): Table 1
3 orders, 17 families, 27 genera
-> Sanger sequencing / pyrosequencing

Sanger sequencing
33 species: first antenna
Primers: LSU Cop-D1F (5'-GCGGAGGAAAAGAAAACAAC-3'), Cop-D3R (5'-CGATTAGTCTTTCGCCCCT-3'); 1000 bp

Field-collected samples

O-line transect along 138°E
Subtropical: (31°00.0'N, 137°59.9'E)
Kuroshio: (32°54.7'N, 138°00.6'E)
Slope: (33°52.0'N, 137°44.0'E)

* Vertical sampling
- Depth: 0 to 200 m, daytime
- VMPS: vertical multiple plankton sampler
- 0.25 m2 mouth-opening area
- 100 µm mesh

* Sample preparation
- Preserved in 99% ethanol, kept at 4 °C
- After 24 hours, ethanol replaced with fresh ethanol
- Aliquots: morphological classification and metagenetic analysis

* Temperature and salinity
- CTD profiler

Fig. S1. Vertical profiles of water temperature and salinity at 138°E (0–250 m depth).

Highest Temperature, Salinity

Pyrosequencing

4 ethanol-preserved samples (one artificial; ¼ aliquots of the field-collected samples)
* Remove all large noncopepod organisms using a 2 mm mesh
* Filter onto a 100 µm mesh

Primers: amplify about 400 bp including the D2 region; designed in highly conserved regions; successful amplification of the LSU-D2 region in > 100 copepod species

LSU Cop-D2F (5'-AGACCGATAGCAAACAAGTAC-3')
LSU Cop-D2R (5'-GTCCGTGTTTCAAGACGG-3')

Chimeras: major cause of overestimation of diversity
Solution:
- Low number of PCR cycles
- Long extension time
- Low concentration of template DNA

Quality filtering

To minimize overestimation of MOTUs

Criteria
1. Contained no ambiguous bases (Ns)
2. Comprised 300-420 bp without primer sites
3. Contained ≤ 5 homopolymers
4. No MID adaptor mismatch
5. No more than three mismatches per primer
6. Average quality score > 27
7. Contained primer sites
Based on LSU sequences of copepods in GenBank
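The read-level criteria above can be sketched as a small filter. This is only an illustration, not the pipeline used in the study: the function names are hypothetical, primers and MID tags are assumed to be checked and trimmed beforehand, and criterion 3 is read here as "no single-base run longer than 5", which is an interpretation of the slide.

```python
import re


def max_homopolymer_run(seq: str) -> int:
    """Return the length of the longest run of one repeated base."""
    return max((len(m.group(0)) for m in re.finditer(r"(.)\1*", seq)), default=0)


def passes_quality_filter(seq, quals, min_len=300, max_len=420,
                          max_homopolymer=5, min_mean_quality=27):
    """Apply criteria 1, 2, 3 and 6 to one primer-trimmed read.

    seq   -- read sequence without primer sites (assumption: already trimmed)
    quals -- per-base quality scores for the same read
    """
    seq = seq.upper()
    if "N" in seq:                                   # 1. no ambiguous bases
        return False
    if not (min_len <= len(seq) <= max_len):         # 2. 300-420 bp
        return False
    if max_homopolymer_run(seq) > max_homopolymer:   # 3. assumed meaning: run length <= 5
        return False
    if not quals or sum(quals) / len(quals) <= min_mean_quality:  # 6. mean quality > 27
        return False
    return True
```

Criteria 4, 5 and 7 (MID adaptor and primer matching) would sit in a separate step before this function.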

Merge: forward and reverse sequences in mothur
Classify: against the reference data set using the naïve Bayesian classifier (> 70%)
Align: sequences classified as subclass Copepoda, with MAFFT
Filtering: single-linkage preclustering
Chimeras: removed using UCHIME
Reference data: copepod sequences with the LSU-D2 region in GenBank + 100 LSU-D2 sequences from the artificial and field-collected samples
Taxonomic information: Boxshall & Halsey (2004), SILVA

MOTU analysis of the artificial community sample

Clustered at 95-99% similarity thresholds
In-del regions: removed from the distance calculation to minimize overestimation of MOTUs, because homopolymer errors are the most common error of 454 pyrosequencing
Only MOTUs with ≥ 3 sequence reads were kept
MOTU numbers: calculated for each similarity threshold (95-99%)
33 reference sequences vs. MOTUs: NJ tree built using MEGA5
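To make the threshold idea concrete, here is a toy clustering sketch. The slides do not state which clustering algorithm was used, so the greedy scheme below is a simplified stand-in and not the study's method; the function names are hypothetical. It assumes reads are already aligned to equal length, skips gap columns (mirroring the removal of in-del regions) and drops clusters with fewer than 3 reads.

```python
def similarity(a: str, b: str) -> float:
    """Fraction of identical sites between two aligned reads, ignoring gap columns."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    return sum(x == y for x, y in pairs) / len(pairs)


def greedy_cluster(reads, threshold=0.97, min_reads=3):
    """Assign each read to the first cluster whose representative is similar enough.

    Returns a list of (representative, members) for clusters with at least
    min_reads members.
    """
    clusters = []  # each entry: (representative read, list of member reads)
    for read in reads:
        for rep, members in clusters:
            if similarity(read, rep) >= threshold:
                members.append(read)
                break
        else:
            # No existing representative is close enough: start a new cluster.
            clusters.append((read, [read]))
    return [(rep, members) for rep, members in clusters if len(members) >= min_reads]
```

Raising the threshold splits reads into more, smaller clusters, which is the behaviour behind the over- and underestimation seen at the different thresholds.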

No. of sequence reads per MOTU vs. DW of the identified reference species in Art.
log DW = 2.891 (log PL) - 7.467
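The regression can be evaluated directly. The sketch below assumes a base-10 logarithm and that PL is a body-length measurement such as prosome length; the slides state neither this nor the units, so the function name and both readings are assumptions.

```python
import math


def dry_weight_from_length(pl: float) -> float:
    """Estimate DW from PL using log DW = 2.891 (log PL) - 7.467.

    Assumes log means log10; units of PL and DW are not given in the slides.
    """
    if pl <= 0:
        raise ValueError("PL must be positive")
    log_dw = 2.891 * math.log10(pl) - 7.467
    return 10 ** log_dw
```

Because the slope is close to 3, estimated dry weight grows roughly with the cube of body length, so a few large individuals can dominate the biomass proportions.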

Pearson's product-moment correlation coefficient (r): 1 = positive, 0 = no correlation, -1 = negative
Between proportion of DW and proportion of sequence reads, using SPSS
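For readers who want to see what r measures, a minimal pure-Python version follows. The study computed it in SPSS; this function is only an illustration, and its name is hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson's product-moment correlation coefficient between two equal-length series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two series of equal length with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("r is undefined when one series is constant")
    return cov / math.sqrt(ss_x * ss_y)
```

Here xs would be the percentage of dry weight per species and ys the percentage of sequence reads for the matching MOTU.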

Fig. S3. Relationship between percentage of dry weight and sequence reads at 97% similarity in the artificial community analysis (r = 0.638, p < 0.01).

MOTU analysis of field-collected samples

Quality-filtered sequences: Slope 10611, Kuroshio 6221, Subtropical 12500 reads
Read counts were low because of many short fragments
6221 reads per site were used for MOTU clustering so that MOTU numbers could be compared among the 3 sites
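Comparing MOTU numbers among sites needs equal read depth, here 6221 reads, the smallest sample. The slides do not say how those reads were chosen; the sketch below assumes random subsampling without replacement, and the function names are hypothetical.

```python
import random


def subsample_reads(reads, depth, seed=0):
    """Draw `depth` reads without replacement (fixed seed for reproducibility)."""
    if depth > len(reads):
        raise ValueError("depth exceeds the number of available reads")
    return random.Random(seed).sample(reads, depth)


def rarefy_to_common_depth(samples, seed=0):
    """Subsample every site to the size of the smallest one.

    samples -- dict mapping site name to a list of reads
    """
    depth = min(len(reads) for reads in samples.values())
    return {site: subsample_reads(reads, depth, seed) for site, reads in samples.items()}
```

With the three sites above, every site would be reduced to 6221 reads before clustering.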

Clustered into MOTUs (97% similarity threshold)
Classified into taxonomic orders using a naïve Bayesian classifier
Calanoida: classified at family level
< 70% threshold: unclassified MOTU

Taxonomic composition of MOTUs
(1) Nonbiomass-based approach: uses only the number of MOTUs
(2) Biomass-based approach: weights MOTUs by their number of sequence reads
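The two approaches differ only in what is counted per taxon. A minimal sketch, with a hypothetical function name and an input format assumed for illustration (one taxon label and one read count per MOTU):

```python
from collections import Counter


def composition(motus):
    """Return (% of MOTUs per taxon, % of sequence reads per taxon).

    motus -- list of (taxon, n_reads) tuples, one per MOTU
    """
    if not motus:
        raise ValueError("no MOTUs given")
    motu_counts = Counter()
    read_counts = Counter()
    for taxon, n_reads in motus:
        motu_counts[taxon] += 1        # (1) nonbiomass-based: each MOTU counts once
        read_counts[taxon] += n_reads  # (2) biomass-based: weighted by reads
    total_motus = sum(motu_counts.values())
    total_reads = sum(read_counts.values())
    by_motu = {t: 100 * c / total_motus for t, c in motu_counts.items()}
    by_reads = {t: 100 * c / total_reads for t, c in read_counts.items()}
    return by_motu, by_reads
```

A taxon with many rare MOTUs scores high under (1) and low under (2), which is why the two compositions can differ for the same sample.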

Detection of biomass-dominant species: representative sequences of the top 6 MOTUs (those with the highest numbers of sequence reads) were searched with BLAST

Morphological analysis of field-collected samples

Morphological classification: Quantitative sample aliquots for calanoid copepods: depends on the size

Biomass (DW) estimation: log DW = 2.891 (log PL) - 7.467

Total number of species, biomass of each species
Numbers of species and total biomass per family
Biomass-dominant species
=> Compared with the values obtained from metagenetic analysis

MOTUs from metagenetic analysis vs. LSU-D2 sequences of morphologically identified species

Pearson's product-moment correlation coefficient (r): between sequence reads of MOTUs and biomass

Results

Artificial community sample analysis
33 reference sequences: D2 region, 405-408 bp

Asymptote: suggesting sufficient sampling coverage

99%, 98%: overestimation
96%, 95%: underestimation
97%: the closest match to the true number of MOTUs

Nonselected MOTUs: probably present as gut contents of predatory copepods

MOTU no.   Reads   Best hit                   Identity   Accession no.
MOTU 1     1415    Pareucalanus attenuatus    99%        AB796416
MOTU 2     1028    Subeucalanus subtenuis     100%       AB796417
MOTU 3     811     Calanus sinicus            100%       AB796406
MOTU 4     591     Euchirella messinensis     99%        AB796401
MOTU 5     523     Eucalanus californicus     100%       AB796414
MOTU 6     262     Undeuchaeta major          100%       AB796403
MOTU 7     187     Centropages sp.            99%        AB796413
MOTU 8     93      Paraeuchaeta media         99%        AB796418
MOTU 9     88      Euchirella curticauda      99%        AB796400
MOTU 10    87      Neocalanus gracilis        99%        AB796410
MOTU 11    61      Temora discaudata          99%        AB796428
MOTU 13    53      Mesocalanus tenuicornis    99%        AB796407
MOTU 14    52      Pleuromamma abdominalis    99%        AB796423
MOTU 15    39      Pontellina plumata         99%        AB796426
MOTU 16    31      Aetideus acutus            99%        AB796399
MOTU 17    24      Gaetanus minor             99%        AB796402
MOTU 18    18      Candacia curta             100%       AB796412
MOTU 19    17      Lucicutia flavicornis      100%       AB796419
MOTU 20    11      Corycaeus sp.              99%        AB796430
MOTU 21    10      Oithona sp.                100%       AB796429
MOTU 23    10      Mecynocera clausi          100%       AB796420
MOTU 26    7       Haloptilus sp.             99%        AB796405
MOTU 27    7       Paracalanus sp.            99%        AB796425
MOTU 30    4       Scolecithrix danae         99%        AB796427
MOTU 31    4       Cosmocalanus darwinii      99%        AB796408
MOTU 33    3       Calocalanus sp.            99%        AB796411
MOTU 34    3       Pareucalanus sp.           99%        AB796415
MOTU 35    3       Metridia brevicauda        100%       AB796421

Table S1. BLAST results of selected MOTUs in the artificial community analysis.

Fig. S2. Unrooted NJ tree of the artificial community for comparison between reference sequences of 33 species and MOTUs at the 97% similarity threshold. Scale bar indicates p-distance. Reference sequences obtained by Sanger sequencing are indicated by red circles and representative sequences of MOTUs are indicated by blue squares. Values in parentheses represent numbers of sequence reads in each MOTU.

Number of sequence reads vs. DW of each species
=> Correlated (not strongly)

Species with high biomass (DW) tend to have large numbers of sequence reads

Field-collected sample analysis
110 copepod MOTUs at 97% similarity
70 of these were classified as calanoid copepods
73 calanoid copepod species: morphologically identified

MOTU number ≒ species richness
No. of calanoid MOTUs > no. of observed morphological species

Calanoida: 59.4, 65.6 and 63.3% (59.4-65.6%; dominant at all 3 sites)
Other groups: 11.5-15.6%, 3.1-3.3%, 16.7-20.3%

Calanoida: 49.9, 85.1 and 62.9%
Cyclopoida: 49.3, 11.0 and 35.2%

Fig. S4. Unrooted neighbor-joining tree of the field-collected samples for comparison between morphological species and MOTUs at 97% similarity. Scale bar indicates p-distance. Sequences of morphologically identified species are indicated by red circles and representative sequences of MOTUs are indicated by blue squares.

64 morphological species -> 47 MOTUs at 97% similarity
9 morphological species -> not detected in MOTUs
23 MOTUs -> do not correspond to morphological species

Proportion of sequence reads vs. proportion of DW: positive correlation

Family-level species richness
Significantly correlated with morphological analyses of calanoid copepods at all 3 sites (Slope r = 0.691, Kuroshio r = 0.878, Subtropical r = 0.808)

Fig. 6 Comparison between metagenetic and morphological analysis of calanoid copepods

Family-level % of sequence reads and DW
Significant correlation in the FC samples (Slope r = 0.843, Kuroshio r = 0.802, Subtropical r = 0.921)
Large proportion of reads: high biomass

underestimation

In the Kuroshio Current: high species richness

Correlation between no. of sequence reads and DW
Usefulness of the number of reads as a proxy for biomass

99% similarity threshold: high species-level resolution for detection of dominant species

Discussion

MOTUs (LSU-D2) in metagenetic analysis:
- Reflect species composition
- Proxy for species richness
97% similarity: suitable for surveying the community structure of pelagic copepods using LSU regions

LSU: simple to design primers
Slow evolutionary rate: risk of underestimation from insufficient taxonomic resolution
97%: high species resolution
97% similarity MOTU clustering: avoids artificial inflation of diversity (as in the Haptophyta study)

99%: ideal for species identification, not suitable for evaluating species richness
Inflation in the no. of MOTUs => the extra MOTUs have small numbers of sequence reads => no significant effect on dominant MOTUs

Art.: Metridia venusta and Oncaea sp.: not detected in the MOTUs; Nematoda
Why: insufficient quantity of template DNA, PCR bias

Gut contents of carnivorous copepods

Discrepancy between biomass and no. of sequence reads: primer mismatches, length of amplicons and copy numbers of rRNA

Art. sample:
- Primer mismatch -> affects PCR efficiency; the 3' region of each primer is important for minimizing mismatches
- Sequence length (OK)
- Sequence reads: suggested to be a proxy for biomass, but not a strong one (bias)
Correlation between biomass and no. of sequence reads in NGS: SSU region study

FC samples:
High species richness in the warm, western-boundary Kuroshio Current
Copepod diversity: strongly correlated with temperature
Higher diversity: warm oligotrophic oceans
Highest species richness: affected by the Kuroshio Current (high temperature and salinity)
Kuroshio Current:
- transports plankton from lower latitudes
- increases species diversity in the western North Pacific

No. of MOTUs > no. of morphological species
Morphological identification used only adult copepods

Metagenetic analysis: includes immature stages, possible cryptic species, gut contents
=> Higher estimates of species richness
Morphological species with large biomass: successfully detected as MOTUs in the FC samples

MOTU detected (O) but no morphological species (X): few sequence reads, rare species
Other possibility: rare MOTUs = artefacts, pseudogenes, remnants of extracellular DNA in the water

Rarefaction curve: did not fully stabilize
More sequence reads => larger numbers of MOTUs
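A rarefaction curve plots the expected number of MOTUs against the number of reads sampled. The sketch below uses the classical analytic formula for sampling without replacement; the study's software may instead have used repeated random resampling, so treat this as an illustration only, with a hypothetical function name.

```python
from math import comb


def expected_motus(read_counts, n):
    """Expected number of MOTUs seen in a random subsample of n reads.

    read_counts -- number of reads in each MOTU
    n           -- subsample size, between 0 and the total number of reads
    """
    total = sum(read_counts)
    if not 0 <= n <= total:
        raise ValueError("n must lie between 0 and the total number of reads")
    all_draws = comb(total, n)
    # For each MOTU: 1 minus the probability that none of its reads is drawn.
    return sum(1 - comb(total - c, n) / all_draws for c in read_counts)
```

Evaluating this for increasing n gives the curve; if it is still rising at the full read depth, deeper sequencing would be expected to reveal more MOTUs.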

MMGH sample

Proportion of sequence reads vs. DW: correlated

Discrepancies between sequence reads and DW:
- methodological biases of metagenetic and morphological analysis

MOTU classification at family level (Fig. 6)
Species richness of taxa that are difficult to identify morphologically (e.g. Paracalanidae and Scolecitrichidae): small size, subtle morphological differences

Hydrographic area: Kuroshio & Subtropical station: Paracalanidae (genus Calocalanus)

Underestimation:
- Acartiidae (primer mismatches, short sequence lengths, phylogeny)
- Clausocalanidae (small genetic distance between Clausocalanidae and Calanidae)
=> 97% similarity threshold: not suitable for these families

This metagenetic analysis: optimized for a wide range of copepod taxa
Solution: different data-analysis methods and different molecular markers should be selected for Acartiidae, Calanidae and Clausocalanidae

Sequence reads ∝ biomass composition at the family level (Fig. 6, 7)

Morphological analysis: time-consuming sorting, dissection

* Metagenetic analysis: covers all individuals, including immature stages, and rapidly detects biomass-dominant taxa
Dominant taxa: valuable insight into the composition of the copepod community; important for understanding copepod community structure and environmental conditions

C. sinicus, P. parvus: dominant at the Slope station
Known as key species and important prey for planktivorous fish in this region

Detects species richness and biomass of small copepods (Oithona, Paracalanus, Clausocalanus: underestimated)
99% similarity: suitable for detecting dominant species

Rapid means of obtaining valuable information on copepod community structure
Must be improved: accumulation of an LSU reference DB
Calanoid copepods (specific ecological characters): classification to the genus level
-> easily adapted to field-collected samples on the global scale
