Co-op Presentation
OVERVIEW
• Pipelines
• Projects
• Validation(s)
• ChimeraScan
• Trinity
• Manta
• Development
• Additional Work
• What I learned
• What I can improve
• Moving forward
• Acknowledgments
Pipelines
• ABySS: Assemble short reads with a de novo, parallel, paired-end sequence assembler.
• Trans-ABySS: Analyze assemblies for structural variants and splice variants using a reference genome and annotations.
• Genome-Validator: Validate fusion and indel events from Trans-ABySS against given BAM files, and attempt to assign a ‘tumourigenicity’ of ‘somatic’ or ‘germline’ to events when both a normal and a tumour genome are given.
• Delly: Discover and genotype structural variants using split-read and paired-end evidence from parallel sequencing data.
• Microbial Detection Pipeline: Detect bacterial and/or viral sequences to determine potential contamination or integration into the genome.
• Integration Site Pipeline: Detect putative integration sites of viral sequences in human sequences.
• Probing Pipeline: Detect fusion and SNP mutations in genome and transcriptome libraries.
• Compression and Transfer: Compress files and transfer them off scratch space for archiving, reducing total scratch-space usage.
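The ‘somatic’ versus ‘germline’ labelling described for Genome-Validator can be illustrated with a minimal sketch. It assumes the usual convention that an event with read support in the tumour but not in the matched normal is somatic, while an event supported in both is germline. The function name, the `min_support` threshold, and the read counts are hypothetical and are not taken from Genome-Validator itself.

```python
def classify_event(tumour_support: int, normal_support: int, min_support: int = 2) -> str:
    """Label an event from the number of supporting reads in each library.

    Hypothetical rule for illustration only; not Genome-Validator's own logic.
    """
    in_tumour = tumour_support >= min_support
    in_normal = normal_support >= min_support
    if in_tumour and in_normal:
        return "germline"      # seen in both tumour and normal
    if in_tumour:
        return "somatic"       # seen in the tumour only
    return "unsupported"       # not supported in the tumour library


if __name__ == "__main__":
    # Made-up read counts (tumour, normal) for three events.
    for name, t, n in [("fusion_A", 12, 0), ("indel_B", 8, 7), ("fusion_C", 1, 0)]:
        print(name, classify_event(t, n))
```

A real validator would also have to account for coverage differences between the two libraries and for tumour contamination of the normal sample, which this sketch ignores.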
Projects
• TCGA LIHC
• TCGA MESO
• NCI HER2 BRCA
• GPH Lymphoma
• TCGA BLCA
• TCGA SARC
• WES CHOL
• TCGA UVM
• COLO-829
• Kaplan
• HCI HIV Cervical
• MCF7
• TCGA THYM
ChimeraScan-0.4.5
A software package that detects gene fusions in paired-end RNA sequencing (RNA-Seq) datasets. It differs from other fusion finders such as deFUSE in that it adds a fragmentation step to the paired-end approach that deFUSE also uses.
Script(s):
• setup:
– /projects/trans_scratch/software/chimerascan/scripts/chimerascan_setup_final.sh
• checker:
– /projects/trans_scratch/software/chimerascan/scripts/chimerascan_checker.sh
• cleaner:
– /projects/trans_scratch/software/chimerascan/scripts/chimerascan_cleaner.sh
• binner:
– /projects/trans_scratch/software/chimerascan/scripts/binning_beta.py
• summarizer:
– /projects/trans_scratch/software/chimerascan/scripts/chimerascan.sum.sh
• report generator:
– /projects/trans_scratch/software/chimerascan/scripts/ChimeraScan.report.sh
Manta
Rapid detection of structural variants and indels for clinical sequencing applications.
Script(s):
• manta_sum.sh:
– /home/ewillie/tools/scripts/manta_sum.sh
• manta_delly_overlay.py:
– /home/ewillie/tools/scripts/manta_delly_overlay.py
• Manta_gv2_overlay.py:
– /home/ewillie/tools/scripts/manta_gv2_overlay.py
• vcfToBedpe:
– /projects/trans_scratch/software/svtools-Manta2Bedpe/vcfToBedpe
Development
Overlay/Setup Script(s):
• manta_delly_overlay.py:
– /home/ewillie/tools/scripts/manta_delly_overlay.py
• Manta_gv2_overlay.py:
– /home/ewillie/tools/scripts/manta_gv2_overlay.py
• transabyss_defuse_overlay.py:
– /home/ewillie/tools/scripts/transabyss_defuse_overlay.py
• trinity_setup.sh:
– /projects/trans_scratch/trinityrnaseq-2.1.1/trinity_setup.sh
Additional Work
• Assemblies: Run ABySS to assemble sample(s) for further downstream analysis.
• Analyses: Run various analysis tools on data and compare their results by means of overlays and/or visualization.
• Overlays: Compare results between different tools or different settings to find similarities and differences. The overlays are done using appropriate scripts, and Venn diagrams are generated to help illustrate similarities and/or differences.
• Testing Scripts: New scripts such as integration_pipeline.sh were tested for potential bugs and ease of use. Testing was done iteratively, with each iteration providing more confidence.
• ChimeraScan Wiki: Create a comprehensive wiki with information regarding validation and a detailed procedure for running the tool. It also covers the installation procedure, resource requirements, interpretation of the outputs, and debugging information.
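The overlay idea described in this list can be sketched as follows. This is a minimal illustration, not the logic of manta_delly_overlay.py or the other overlay scripts: it assumes each call is reduced to a single (chromosome, position) breakpoint and that two calls match when they lie on the same chromosome within a hypothetical `window` of 100 bp. The call sets shown are made up.

```python
from typing import List, Tuple

Breakpoint = Tuple[str, int]  # (chromosome, position)


def overlay(calls_a: List[Breakpoint], calls_b: List[Breakpoint], window: int = 100):
    """Split two call sets into shared, A-only, and B-only events.

    Each call in B can be matched by at most one call in A.
    """
    matched_b = set()  # indices of B calls already paired with an A call
    shared, only_a = [], []
    for chrom_a, pos_a in calls_a:
        hit = None
        for i, (chrom_b, pos_b) in enumerate(calls_b):
            if i not in matched_b and chrom_a == chrom_b and abs(pos_a - pos_b) <= window:
                hit = i
                break
        if hit is None:
            only_a.append((chrom_a, pos_a))
        else:
            matched_b.add(hit)
            shared.append((chrom_a, pos_a))
    only_b = [bp for i, bp in enumerate(calls_b) if i not in matched_b]
    return shared, only_a, only_b


if __name__ == "__main__":
    # Made-up breakpoints standing in for two tools' call sets.
    tool_a = [("chr1", 1000), ("chr2", 5000), ("chr3", 750)]
    tool_b = [("chr1", 1040), ("chr3", 9000)]
    shared, only_a, only_b = overlay(tool_a, tool_b)
    print(f"shared={len(shared)} only_a={len(only_a)} only_b={len(only_b)}")
```

The three counts correspond to the three regions of a two-set Venn diagram, so they can be passed directly to whatever plotting tool is used to draw it.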
What I Learned
• Real-world applications of bioinformatics.
• Problem solving, including troubleshooting, debugging, and querying the literature.
• The Bash scripting language, including significant knowledge of terminal commands.
• Writing scripts to improve the time and efficiency of jobs. (Do a job manually for > 2 hrs or write a script to do it in a fraction of that time.)
• A greater attention to detail to help reduce the rate of errors.
• Time management, task prioritization, and meeting deadlines.
• Visualizing and analyzing structural variants using IGV.
What I could work on
• Problem solving and troubleshooting skills.
• Deeper understanding of the SVIA pipeline tools.
• Clear and concise presentation of my results.
• Minimizing my rate of error when performing tasks.
• Verbal presentation skills.
• Developing an appetite for personal projects.
Any suggestions?
Moving Forward
• My interest in the algorithmic aspect of genomics has grown tremendously, enticing me to take more applied algorithms courses.
• Obtaining a genomics certificate as part of my degree to further develop my interest in genomic sciences.
• Since I am now aware of the qualities and skills that are needed to be successful in this rapidly changing industry, I will be dedicating time to further develop these qualities and sharpen these skills.
• Improving my scripting abilities in both Python and Bash to build on the experience I have gained here during the last eight months.
• Applying the knowledge and skills I have acquired here in order to be successful in a different work environment.