11/4/15
Mapping SNPs in Fungi
Omon Isi
SNPs Map of Fungi
• The aim was to look for SNPs that could be used as a unique ID for fungal strains
• Genome sequence for Pleurotus ostreatus was downloaded
• FASTQ files were downloaded for Tremella fusiformis
• FASTQ sequences were mapped against the reference genome. Why?
Results
1. There were only 1017 SNPs found
2. This is highly unlikely, so why?
3. The reference genome was not adequate; it is a different species
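One way to spot an unsuitable reference before counting SNPs is to check how many reads actually aligned. The sketch below is an assumption layered on the slides, not the Galaxy workflow used here: `mapped_percent` is a hypothetical helper that reads SAM-format text on standard input, skips header lines, and reports the percentage of records whose unmapped flag (bit 0x4 of the FLAG column) is not set.

```shell
# Hypothetical helper (not from the slides): percentage of SAM records that are mapped.
# Reads SAM text on stdin; FLAG is column 2 and bit 0x4 marks an unmapped read.
mapped_percent() {
  awk -F'\t' '
    /^@/ { next }                            # skip SAM header lines
    {
      total++
      if (int($2 / 4) % 2 == 0) mapped++     # bit 0x4 clear means the read is mapped
    }
    END {
      if (total == 0) { print "0.0"; exit }
      printf "%.1f\n", 100 * mapped / total
    }'
}
```

With samtools installed, something like `samtools view -h sample.bam | mapped_percent` would print the figure for one alignment (`sample.bam` is a placeholder name). A very low percentage would be consistent with reads from one species being mapped against the genome of another.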
Conclusions/Lessons
I have learned how to do SNP calling using Galaxy
It is obvious that only related organisms should be used as the reference genome.
Identification of somatic mutations and copy number variations in CLL
Jian Yan, Thomas De Raedt, Omar Ahmad
- Normal and Leukemia whole genome sequencing
- add stats
- analyzed with GATK
- identified a number of driver mutations
- Illumina HiSeq 2000
- Paired End Reads (100 bp)
Pipeline
[Flowchart: Paired-end sequencing (fastq) -> FastQC -> FastQ Groomer -> BWA versus hg19 -> BAM Normal / BAM Tumor -> GATK Re-align -> Realigned BAM -> SAMtools mpileup (version hg19) -> VarScan -> SNP-indel Validation and Copy Number Variation => gene list. The initial steps are labeled Galaxy.]
[Flowchart, Galaxy portion: Paired-end sequencing (fastq) -> FastQ Groomer -> BWA versus hg19 -> BAM Normal / BAM Tumor -> SAMtools mpileup (version hg19) -> VarScan]
SAMTOOLS: Mpileup
- Select right output file
Issues with running VarScan for Tumors on Galaxy
- No option to subtract normal
- No option to set p-value for Normal-Tumor comparison
#!/usr/bin/bash
#----------------------------script------------------------------------------
#------------1---------------
#----------------calling SNPs and LOH-------------
# bash script to call somatic mutations from tumor and normal pair
# author: Jian Yan, UCSD
bam=/mnt/silencer2/home/j4yan/CSHL/bam                   # directory location for bam file
script=/mnt/silencer2/home/j4yan/CSHL/script             # directory location for VarScan
ref=/mnt/silencer2/home/j4yan/bowtie_index/hg19/hg19.fa  # directory for reference genome
output=/mnt/silencer2/home/j4yan/CSHL/output/SNP         # location for output
for i in 1 5
do
  echo "$i starts"
  # build pileups for the normal and tumor BAM files
  samtools mpileup -B -q 1 -f $ref $bam/CLL00${i}_normal.bam > $output/CLL00${i}.nor.mpileup
  samtools mpileup -B -q 1 -f $ref $bam/CLL00${i}_tumor.bam > $output/CLL00${i}.tum.mpileup
  # call somatic SNPs and indels from the normal/tumor pileup pair
  java -jar $script/VarScan.v2.3.9.jar somatic $output/CLL00${i}.nor.mpileup $output/CLL00${i}.tum.mpileup $output/out.CLL00${i}.basename --min-coverage 10 --min-var-freq 0.08 --somatic-p-value 0.05
  # split the calls by status and confidence
  java -jar $script/VarScan.v2.3.9.jar processSomatic $output/out.CLL00${i}.basename.snp
  java -jar $script/VarScan.v2.3.9.jar processSomatic $output/out.CLL00${i}.basename.indel
  # filter high-confidence somatic SNPs, using the indel calls
  java -jar $script/VarScan.v2.3.9.jar somaticFilter $output/out.CLL00${i}.basename.snp.Somatic.hc --indel-file $output/out.CLL00${i}.basename.indel --output-file $output/out.CLL00${i}.basename.snp.Somatic.hc.filter
  echo "$i finished"
done
VarScan: SNP-indel Validation
VarScan SNP output
chrom  position   ref  var  nr1  nr2  n_freq  n_gt  tr1  tr2  t_freq  t_gt  Status   VarP  SomaticP                tr1+  tr1-  tr2+  tr2-  nr1+  nr1-  nr2+  nr2-
chr2   81426719   C    T    104  2    1.89%   C     60   52   46.43%  Y     Somatic  1.0   2.677960782616631E-16   34    26    28    24    57    47    1     1
chr2   92323206   A    C    28   0    0%      A     15   4    21.05%  M     Somatic  1.0   0.0217307207131436      11    4     2     2     14    14    0     0
chr2   92323213   G    T    29   0    0%      G     15   4    21.05%  K     Somatic  1.0   0.019919827320381504    10    5     2     2     14    15    0     0
chr2   92325170   T    C    24   1    4%      T     18   6    25%     Y     Somatic  1.0   0.04320114983152931     10    8     3     3     16    8     1     0
chr2   97366152   C    A    122  3    2.4%    C     77   33   30%     M     Somatic  1.0   1.1652835525027084E-9   23    54    6     27    33    89    1     2
chr2   133020331  A    T    16   0    0%      A     11   4    26.67%  W     Somatic  1.0   0.04338153503893158     2     9     2     2     8     8     0     0
chr2   155555169  A    C    18   0    0%      A     15   6    28.57%  M     Somatic  1.0   0.016632016632015884    11    4     6     0     13    5     0     0
chr2   190352891  G    T    11   0    0%      G     5    7    58.33%  K     Somatic  1.0   0.0032305828509893624   3     2     6     1     10    1     0     0
chr2   198266834  T    C    308  10   3.14%   T     168  153  47.66%  Y     Somatic  1.0   2.506589526816947E-43   106   62    93    60    192   116   4     6
chr2   216234847  T    G    21   0    0%      T     27   7    20.59%  K     Somatic  1.0   0.026510009906234835    27    0     7     0     20    1     0     0
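The SNP table above lends itself to a quick command-line filter. The following is a minimal sketch, assuming a tab-delimited VarScan `somatic` output file with one header line and the column order shown on the slide (tumor variant frequency in column 11, status in column 13, somatic p-value in column 15). The function name `filter_somatic` and its default cutoffs are hypothetical and not part of the original pipeline.

```shell
# Hypothetical helper: keep the header plus rows called Somatic whose somatic
# p-value is at most max_p and whose tumor variant frequency is at least min_freq (%).
# Usage: filter_somatic FILE [max_p] [min_freq_percent]
filter_somatic() {
  awk -F'\t' -v max_p="${2:-0.05}" -v min_freq="${3:-20}" '
    NR == 1 { print; next }            # pass the header through unchanged
    {
      freq = $11
      sub(/%/, "", freq)               # "46.43%" becomes 46.43
      if ($13 == "Somatic" && $15 + 0 <= max_p + 0 && freq + 0 >= min_freq + 0) print
    }' "$1"
}
```

For example, `filter_somatic out.CLL005.basename.snp 0.001 30` would keep only strongly supported calls; tightening the p-value cutoff this way drops borderline rows such as those with p-values near 0.04 in the table.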
[Figure: Normal and Tumor BAM files in IGV (Integrative Genomics Viewer), showing a mutation in the coding sequence of SF3B1]
[Figure: Variant validation]
VarScan to call CNV
# Part 1: call CNVs
#!/usr/bin/bash
bam=/mnt/silencer2/home/j4yan/CSHL/bam
script=/mnt/silencer2/home/j4yan/CSHL/script
ref=/mnt/silencer2/home/j4yan/bowtie_index/hg19/hg19.fa
output=/mnt/silencer2/home/j4yan/CSHL/output/CNV
pileup=/mnt/silencer2/home/j4yan/CSHL/output
for i in 1 5
do
  echo "$i starts"
  samtools mpileup -B -q 1 -f $ref $bam/CLL00${i}_normal.bam > $pileup/CLL00${i}.nor.mpileup
  samtools mpileup -B -q 1 -f $ref $bam/CLL00${i}_tumor.bam > $pileup/CLL00${i}.tum.mpileup
  # calculate the copy number coverage of tumor and normal cells
  java -jar $script/VarScan.v2.3.9.jar copynumber $pileup/CLL00${i}.nor.mpileup $pileup/CLL00${i}.tum.mpileup $output/out.CLL00${i}.basename
  # call copy number variants
  java -jar $script/VarScan.v2.3.9.jar copyCaller $output/out.CLL00${i}.basename.copynumber --output-file $output/out.CLL00${i}.basename.copynumber.called --homdel-file $output/out.CLL00${i}.basename.copynumber.hmodel
  echo "$i finished"
done
VarScan: Copy Number Variation => gene list
output
[j4yan@silencer CNV]$ head out.CLL005.basename.copynumber
chrom  chr_start  chr_stop  num  nd    td    log2    gc
chr1   10028      10112     85   11.5  16.7  0.541   51.8
chr1   131349     131448    100  25.1  24.8  -0.015  64.0
chr1   131449     131548    100  23.1  20.7  -0.155  58.0
chr1   131549     131617    69   14.8  15.7  0.093   59.4
chr1   133364     133463    100  17.5  16.1  -0.121  66.0
chr1   133464     133491    28   12.5  11.0  -0.189  71.4
chr1   133544     133643    100  14.5  6.3   -1.208  62.0
chr1   567545     567607    63   11.3  8.0   -0.495  46.0
chr1   657741     657756    16   10.3  23.6  1.192   62.5
[j4yan@silencer CNV]$ head out.CLL005.basename.copynumber.called
chrom  chr_start  chr_stop  num  nd     td     adj.log  gc    region   raw_ratio
chr1   131349     131448    100  25.1   24.8   0.019    64.0  neutral  -0.015
chr1   131449     131548    100  23.1   20.7   -0.136   58.0  neutral  -0.155
chr1   657862     657961    100  56.6   53.7   -0.076   52.0  neutral  -0.075
chr1   657962     658061    100  51.4   44.3   -0.194   57.0  neutral  -0.213
chr1   658378     658477    100  21.7   17.3   -0.303   60.0  del      -0.327
chr1   761960     762059    100  163.8  136.4  -0.284   47.0  del      -0.264
chr1   762060     762159    100  756.5  764.6  -0.011   46.0  neutral  0.015
chr1   762160     762259    100  861.3  847.2  -0.005   57.0  neutral  -0.024
chr1   762260     762359    100  343.9  336.3  -0.003   62.0  neutral  -0.032
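The `log2` and `region` columns above can be sanity-checked by hand. As a rough sketch, assuming the log ratio is approximately log2(tumor depth / normal depth), ignoring any normalization VarScan applies, and assuming regions are called with a symmetric cutoff on the adjusted log ratio (0.25 here, which is consistent with the rows shown but should be treated as an assumption), two hypothetical helpers are:

```shell
# Hypothetical helpers, not part of the original scripts.

# log2_ratio NORMAL_DEPTH TUMOR_DEPTH: print log2(tumor/normal) to 3 decimals.
log2_ratio() {
  awk -v nd="$1" -v td="$2" 'BEGIN { printf "%.3f\n", log(td / nd) / log(2) }'
}

# call_region LOG_RATIO [THRESHOLD]: print amp, del, or neutral.
call_region() {
  awk -v r="$1" -v t="${2:-0.25}" 'BEGIN {
    if (r + 0 >= t + 0)    print "amp"
    else if (r + 0 <= -t)  print "del"
    else                   print "neutral"
  }'
}
```

For the window with normal depth 14.5 and tumor depth 6.3, `log2_ratio 14.5 6.3` gives a value close to the reported -1.208; small differences are expected because the depths in the table are rounded. Likewise, `call_region -0.303` and `call_region -0.194` reproduce the `del` and `neutral` calls shown for those adjusted log ratios.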
# Part 2, using DNAcopy to perform statistics
# R script
source("http://bioconductor.org/biocLite.R")
biocLite("DNAcopy")
library(DNAcopy)
# read table
cn <- read.table("Desktop/CSHL/out.CLL005.basename.copynumber.called", header=T)
# Creates a 'copy number array' data object used for DNA copy number analyses
# by programs such as circular binary segmentation (CBS).
CNA.object <- CNA(genomdat = cn$adjusted_log_ratio, chrom = cn$chrom, maploc = cn$chr_start, data.type = 'logratio')
# Detect outliers and smooth the data prior to analysis by programs such as
# circular binary segmentation (CBS).
CNA.smoothed <- smooth.CNA(CNA.object)
# This function implements the circular binary segmentation (CBS) algorithm
# of Olshen and Venkatraman (2004).
segment <- segment(CNA.smoothed, verbose=0, min.width=2, undo.SD=3)
# This program computes pseudo p-values and confidence intervals for the
# change-points found by the circular binary segmentation (CBS) algorithm.
p.segment <- segments.p(segment)
pdf("Desktop/CSHL/CLL005.CNV.pdf")
plot(segment, type="w")
dev.off()
write.table(p.segment, file="Desktop/CSHL/CLL005.copynumber.called.segments.p_value", sep="\t")
[Figure: image view output, Copy Number Variation plot from R]
Copy number gain
# output of the table
chr_stop num_positions normal_depth tumor_depth adjusted_log_ratio gc_content region_call raw_ratio
"99"   "Sample.1"  "chr14"  22670491  22749509  21  2.0536  7.80717368159807  1.43726069686465e-13  22749309  22749509
"100"  "Sample.1"  "chr14"  22749609  22788857  9   2.6009  6.28011088585941  4.68660642046018e-09  22788857  22891686
"101"  "Sample.1"  "chr14"  22891586  22934605  9   3.1491  10.8815481513653  2.80606427200614e-26  22934605  22934605
"235"  "Sample.1"  "chr7"   38313022  38339628  7   2.5386  8.23033979314098  2.08277896231573e-15  38331348  38339628
Genes involved:
TCRA: T cell receptor alpha (chr14)
TRGC2: T cell receptor gamma 2, chain C region (chr7)
Copy number loss
chr_stop num_positions normal_depth tumor_depth adjusted_log_ratio gc_content region_call raw_ratio
"28"   "Sample.1"  "chr10"  37823494  37823658  3  -1.3527  8.97662044342639  1.0579677959203e-16   37823658  37823658
"52"   "Sample.1"  "chr12"  2783504   2783604   2  -1.3865  7.55817926989938  2.74030322197209e-11  2783604   2783604
"79"   "Sample.1"  "chr13"  50035024  50038158  6  -1.3845  4.82793481236089  6.40097713084299e-05  50035424  50042087
"151"  "Sample.1"  "chr19"  45322781  45322881  2  -1.306   7.1938280311902   2.12377619682129e-10  45322881  45324000
"180"  "Sample.1"  "chr20"  55108449  55108535  5  -2.1962  18.400935829044   6.80717991820467e-73  55108535  55108535
Genes involved:
chr10: gene deserts:
MTRNR2L7: MT-RNR2-Like 7, plays a role as a neuroprotective and antiapoptotic factor.
LINC00993: long intergenic non-protein coding RNA 993
CACNA1C: Calcium Channel, Voltage-Dependent, L Type, Alpha 1C Subunit (chr12)
SETDB2: Histone-Lysine N-Methyltransferase (chr13)
BCAM: Basal Cell Adhesion Molecule (Lutheran Blood Group) (chr19)
FAM209B: Family With Sequence Similarity 209, Member B (chr20)
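The rows in the gain and loss tables above look like DNAcopy `segments.p` output as written by `write.table`: row name, ID, chrom, loc.start, loc.end, num.mark, seg.mean, bstat, pval, lcl, ucl. That column order is an assumption, since the header printed on the slides lists different names. Under that assumption, a hypothetical helper `select_segments` can pull out gains or losses by segment mean and p-value; the default cutoffs are illustrative only and are not the ones used for the slides.

```shell
# Hypothetical helper: select segments from a tab-delimited table with one header
# line, assuming data fields: rowname, ID, chrom, loc.start, loc.end, num.mark,
# seg.mean, bstat, pval, lcl, ucl.
# Usage: select_segments FILE gain|loss [min_abs_mean] [max_p]
select_segments() {
  awk -F'\t' -v mode="$2" -v m="${3:-1}" -v p="${4:-0.001}" '
    NR == 1 || $9 == "NA" { next }      # skip the header and segments without a p-value
    $9 + 0 > p + 0 { next }             # not significant at the chosen cutoff
    mode == "gain" && $7 + 0 >=  m + 0 { print $3, $4, $5, $7 }
    mode == "loss" && $7 + 0 <= -m     { print $3, $4, $5, $7 }
  ' "$1" | tr -d '"'
}
```

Called as `select_segments CLL005.copynumber.called.segments.p_value loss`, it prints chromosome, start, end, and segment mean for each selected segment, which can then be looked up in a genome browser to find the genes involved.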
Thank You Danny!!
And all other Teachers, Staff and Classmates