Tools and Algorithms in Bioinformatics
CLC Genomics Workbench
September 22, 2017
Dr. Matthew Cserhati
GCBA Guda lab
Logging on
• Open a Remote Desktop session, working in pairs
• IP, password, and ID will be handed out in class
• Right-click CLC GWB and run as Administrator
• Dr. Guda and I will help each group with the password
Outline
• Introduction to CLC Genomics Workbench
• Guided genome assembly
• Workflows
• Plugins
• IPA analysis
CLC Genomics Workbench - Introduction
• Manual: http://resources.qiagenbioinformatics.com/manuals/clcgenomicsworkbench/current/User_Manual.pdf
• Software download (if your lab is interested): https://www.qiagenbioinformatics.com/products/clc-genomics-workbench/
• A paid license is required
Just For Your Information:
CLC Genomics Workbench - Introduction
• Widely used, cutting-edge, multifunctional platform for NGS analysis and visualization (used on Windows in this class; also available for macOS and Linux)
• Supports guided genome assembly, RNA-seq analysis, epigenomic analysis, de novo sequencing (see Genome Assembly class, week 15, Dec. 1), and microarray analysis
• Lets you import your own NGS data or download data from the Internet
• Workflow configuration (task automation)
• Multiple plugins that add extra functionality
• E.g. IPA, ChIP-seq, MetaGeneMark
Guided genome assembly
• Unlike de novo assembly, guided assembly uses the genome of a related species as a reference to guide the assembly
• Task: assemble the genome of an unknown nucleocytoplasmic large DNA virus (NCLDV) with ID “GD12”
• Guide genome: Paramecium bursaria chlorella virus 1 (PBCV-1), NCBI ID: JF411744.1
• Find the paired-end reads in the folder “guided_assembly”
PBCV-1
Guided genome assembly
• NGS Core Tools
• Trim reads (5 minutes)
• NGS Core Tools, Trim Sequences
• Map reads to reference (10 minutes)
• Creates summary
• Mapped reads
• Unmapped reads
• plus log
• Extract Consensus Sequence from mapped reads (5 minutes)
• This way we get the assembled genome sequence
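To make the last step more concrete, here is a minimal Python sketch of how a consensus can be read off mapped reads by majority vote at each reference position. It is not the algorithm CLC GWB uses (the real tool also handles gaps, quality scores, and ambiguity codes); the function name, the `(start, sequence)` read format, and the toy data are assumptions made for illustration.

```python
from collections import Counter


def consensus_from_mapped_reads(reference, mapped_reads, min_coverage=1):
    """Majority-vote consensus over ungapped read alignments.

    mapped_reads is a list of (start, sequence) pairs, where start is the
    0-based reference position of the first base of the read (hypothetical
    format, chosen only for this sketch).
    """
    # One counter of observed bases per reference position.
    columns = [Counter() for _ in reference]
    for start, seq in mapped_reads:
        for offset, base in enumerate(seq):
            pos = start + offset
            if 0 <= pos < len(reference):
                columns[pos][base] += 1

    consensus = []
    for counts in columns:
        if sum(counts.values()) < min_coverage:
            consensus.append("N")  # too few reads to call this position
        else:
            consensus.append(counts.most_common(1)[0][0])
    return "".join(consensus)


# Toy data: the reads agree on a T where the guide genome has an A (position 4).
reference = "ACGTACGTAC"
reads = [(0, "ACGTTC"), (2, "GTTCGT"), (4, "TCGTAC")]
print(consensus_from_mapped_reads(reference, reads))
```

The consensus follows the reads rather than the guide genome, which is why the result can differ from PBCV-1 wherever GD12 does.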
Guided genome assembly – ORF detection
• Classical Sequence Analysis
• Nucleotide Analysis
• Find Open Reading Frames to predict genes in the newly assembled genome
• Choose genetic code #1 (the standard code)
• Export results to .txt or .xls
• (Post-processing: sequence extraction, BLAST against known proteins, annotation)
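As a rough illustration of what an ORF search does, the following Python sketch scans all six reading frames for stretches that start with ATG and end at a stop codon of the standard genetic code. It is a simplified stand-in, not the CLC GWB implementation; the function names, the `min_length` parameter, and the returned tuple layout are assumptions.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # stop codons of genetic code 1 (standard)
START_CODON = "ATG"


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_orfs(seq, min_length=9):
    """Find ATG...stop ORFs in all six reading frames.

    Returns (strand, frame, start, end) tuples. Coordinates are 0-based on
    the scanned strand, end is exclusive and includes the stop codon.
    """
    seq = seq.upper()
    orfs = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == START_CODON:
                    start = i  # first ATG opens the ORF
                elif start is not None and codon in STOP_CODONS:
                    if i + 3 - start >= min_length:
                        orfs.append((strand, frame, start, i + 3))
                    start = None  # look for the next ORF in this frame
    return orfs


print(find_orfs("CCATGAAATTTTAGGG"))
```

Each reported ORF can then be extracted and translated for the post-processing steps listed above.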
CLC GWB workflows
• Workflows in CLC GWB are user-designed analysis pipelines that automate data input and output creation for NGS data
• In the CLC GWB GUI, the user can
• Drag and drop elements
• Inputs
• Tasks
• Connect them together into a pipeline
• Workflows can be made available to other users/researchers for common tasks
• E.g. RNA-seq analysis
Resources
• Workflows: http://resources.qiagenbioinformatics.com/tutorials/Workflow-intro.pdf
• Data files: http://resources.qiagenbioinformatics.com/testdata/chrM-tutorial-data.zip [Download this!]
• Two sets of reads
• Normal tissue
• Cancer tissue
• Human mitochondrial genome and annotation
• Mitochondrial genome sequence
• (37 mitochondrial) Genes and CDS
• SNV table
What will the workflow do?
• Alignment of reads to the reference genome
• Re-alignment of reads for better quality
• Detect variants and filter them against known variants
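The last step, comparing detected variants with known ones, can be sketched in a few lines of Python. The slide does not say whether the workflow keeps or discards the known variants, so this sketch simply returns both groups. The `Variant` record, the function name, and the positions are made up for illustration and do not come from the tutorial data.

```python
from typing import NamedTuple


class Variant(NamedTuple):
    position: int  # 1-based position on the reference
    ref: str       # reference allele
    alt: str       # observed allele


def split_by_known(detected, known):
    """Partition detected variants into (already known, novel)."""
    known_set = set(known)
    already_known = [v for v in detected if v in known_set]
    novel = [v for v in detected if v not in known_set]
    return already_known, novel


# Made-up records, for illustration only.
detected = [Variant(100, "A", "G"), Variant(250, "C", "T"), Variant(300, "G", "A")]
known = [Variant(250, "C", "T")]

already_known, novel = split_by_known(detected, known)
print("known:", already_known)
print("novel:", novel)
```

Matching on position plus both alleles is the strictest choice; matching on position alone would treat a different substitution at a known site as known.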
Selection of workflow elements and finished workflow
Click the ‘Add Element’ button at the bottom of the work panel to get the list of elements
Elements of the workflow
• Local Realignment Tool: When reads are mapped to a genome, they sometimes misalign
• Indels
• Uses information from other reads near an indel to realign reads across the indel
• Variant Detection:
• Basic Variant Detection: runs quickly, no error-model estimation
• Low Frequency Variant Detection: calls a subset of the Basic Variant Detection calls; slowest, uses error-model estimation
• Fixed Ploidy Variant Detection: calls a subset of the variants from Low Frequency Variant Detection; the difference is likely due to mapping or sequencing errors
Types of tracks (see page 536 of CLC GWB manual)
• Sequence track: Displays the chromosomes of the reference genome.
• Reads track: Displays how all of the (mapped) reads map to the reference genome. Zoomable.
• Variant track: Displays information on allele variants at the base pair level. Variants can be SNV, MNV, replacement, insertion, or deletion. [double-click to visualize in table format]
• Annotation track: Displays the location and length of different genetic elements. [double-click to visualize in table format]
• Gene, CDS, peaks (ChIP-Seq)
• These first four are present in the example
• Other types of tracks: coverage graph, expression tracks
Tracks: practical examples
• Add all outputs from the workflow to a track list (plus add Genome)
• ‘Track tools’, ‘Create track list’
• Examine the created tracks
• Double-click on the annotation tracks (genomeTracks, Gene)
• Find the ATP genes in the mitochondrial genome
• Search for Name contains ATP
• Filter for variants where the coverage >= 100 (normalData)
• Create a GC content graph for the NC_001807 genome
• Track tools, Graphs, Create GC content graph
• This graph displays the GC% all along the genome sequence
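For intuition about what a GC content graph plots, here is a minimal Python sketch that computes GC% in windows along a sequence. The window and step sizes, the function name, and the toy sequence are assumptions; the window size CLC GWB uses is not stated on the slide.

```python
def gc_content_windows(seq, window=4, step=4):
    """Return (window_start, GC percent) pairs along the sequence.

    Only full-length windows are reported, so a trailing partial window
    is skipped.
    """
    seq = seq.upper()
    result = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        gc = sum(1 for base in chunk if base in "GC")
        result.append((start, 100.0 * gc / window))
    return result


# Toy sequence: GC-rich, AT-rich, then GC-rich again.
for start, gc_percent in gc_content_windows("GGCCATATGCGC"):
    print(start, gc_percent)
```

Plotting the second value against the first gives the kind of curve the track shows; a smaller step than the window size produces a smoother, overlapping-window graph.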
IPA (Ingenuity Pathway Analysis) Plugin
• Ingenuity Pathway Analysis: https://www.qiagenbioinformatics.com/products/ingenuity-pathway-analysis/
• A GUI tool for analyzing complex ‘omics’ data
• Gene network analysis and visualization
• Upstream regulators
• Disease analysis
• Pathway analysis
• The results of RNA-seq analysis (week 8) can be uploaded to the IPA server (week 11)
• For this we use an integrated IPA-plugin
• Manual: http://resources.qiagenbioinformatics.com/manuals/ingenuitypathwayintegration/current/Ingenuity_Pathway_Analysis.pdf
IPA
Working with plugins
• Manage plugins
• Manage existing plugins
• Update existing plugins
• Download plugins
• Download and Install
• Install from file
• .cpa file
• Search for Ingenuity Pathway Analysis plugin
IPA Plug-in
• Used mainly for statistical comparison data generated with the RNA-seq tools
• Differential Expression for RNA-Seq tool
• For this you need a user ID and password in IPA
• You can either
• Upload data only
• Upload and analyze
• Bonferroni
• Select log2 fold change
Data sets uploaded in IPA
Data sets in CLC GWB
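The two quantities named in the plug-in options, Bonferroni correction and log2 fold change, can be illustrated with a short Python sketch. How the plug-in itself computes or uses them is not described on the slide, so the function names, the pseudocount, and the example values here are assumptions.

```python
import math


def log2_fold_change(expr_control, expr_treated, pseudocount=1.0):
    """log2 ratio of treated over control expression.

    The pseudocount avoids division by zero for unexpressed genes
    (an assumption of this sketch, not a documented plug-in setting).
    """
    return math.log2((expr_treated + pseudocount) / (expr_control + pseudocount))


def bonferroni(p_values):
    """Bonferroni correction: multiply each p-value by the number of tests, cap at 1."""
    n = len(p_values)
    return [min(1.0, p * n) for p in p_values]


# Made-up values, for illustration only.
print(log2_fold_change(3, 15))
print(bonferroni([0.125, 0.25, 0.5, 0.0625]))
```

A positive log2 fold change means higher expression in the treated sample, and a gene is usually kept only if its corrected p-value stays under the chosen threshold.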
Thanks for your attention!