bioruby -- bioinformatics library

BioRuby: Bioinformatics Library. Naohisa Goto (後藤直久), Genome Information Research Center, Research Institute for Microbial Diseases, Osaka Univ. Email: [email protected] twitter: @ngotogenome

Post on 06-May-2015


DESCRIPTION

A short presentation about BioRuby, an open-source bioinformatics library for the Ruby programming language. RubyKaigi2011 Lightning Talk, Jul/18/2011.

TRANSCRIPT

Page 1: BioRuby -- Bioinformatics Library

BioRuby

BioRuby: Bioinformatics Library

Naohisa Goto (後藤直久)

Genome Information Research Center, Research Institute for Microbial Diseases, Osaka Univ.

Email: [email protected]

twitter: @ngotogenome

Page 2: BioRuby -- Bioinformatics Library

Who am I?

Name: Naohisa Goto (後藤 直久)

Affiliation: Genome Information Research Center, Research Institute for Microbial Diseases, Osaka University

Twitter: @ngotogenome

Email: [email protected]

First Ruby experience: Ruby 1.2.6 (compiled on 22/Jun/1999)

Page 3: BioRuby -- Bioinformatics Library

BioRuby

Bioinformatics software library and tools written in the Ruby language

Free software (Ruby License)

http://bioruby.org/

https://github.com/bioruby/bioruby

% gem install bio

Page 4: BioRuby -- Bioinformatics Library

DNA

DNA is a chain made of four kinds of molecules:

A (Adenine)

C (Cytosine)

G (Guanine)

T (Thymine)

DNA can be treated as a String.

Figure: Wikipedia, 染色体 (Chromosome)

H: 3GB in total (49-247MB per chromosome)

(By the way, what should we do about Encoding...?)
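Because a DNA sequence can be held in an ordinary String, common operations need nothing beyond Ruby's built-in string methods. The following is a minimal sketch in plain Ruby, deliberately not using the BioRuby API. The helper names `base_composition`, `gc_percent` and `reverse_complement` and the short sequence are assumptions made for illustration only.

```ruby
# Plain-Ruby sketch (no BioRuby required): a DNA sequence as an ordinary String.
# All method names here are hypothetical helpers, not part of BioRuby.

# Count how often each of the four bases occurs.
def base_composition(dna)
  %w[A C G T].map { |base| [base, dna.count(base)] }.to_h
end

# GC content as a percentage of the sequence length.
def gc_percent(dna)
  100.0 * (dna.count("G") + dna.count("C")) / dna.length
end

# Reverse complement: reverse the string, then swap A<->T and C<->G.
def reverse_complement(dna)
  dna.reverse.tr("ACGT", "TGCA")
end

dna = "ATGGCCTAA" # made-up example sequence

p base_composition(dna)
puts gc_percent(dna).round(1)
puts reverse_complement(dna)

# With ASCII letters, one base occupies one byte, which is why about
# 3 billion bases of human DNA correspond to roughly 3GB of text.
puts dna.bytesize == dna.length
```

The last line also touches on the Encoding question: as long as the sequence contains only ASCII letters, byte length and character length are the same.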

Page 5: BioRuby -- Bioinformatics Library

Example DNA data (with metadata)

>gi|60459557|gb|AY948115.1| Homo sapiens alcohol dehydrogenase 1A (class I), alpha polypeptide (ADH1A) gene, complete cds

GAGGGCGACAAAAGGGAACAGACCCAAAACCACAGGAGAGATGCTAGCATGACAGGGATGCAGAGACATA

AAGCACAACAGTGAGATGGAGTTAATATACCTCCACGAGGGTGACCTTGTCCTGCATCTCAAATTTTGGG

TAGGATTTGAATGGGCCAGAGGGACAGAAAAGAAGAGAAAGAGCATGATGAGCAAGGGCTTGAATGTTAA

ATAGATTCCTCTTTGGGGGACCAGGGAGATACAAGCTTCTAAAGCACATACGCCCTGTATTGGAGAATGG

GGAGGAGTAGATAGATGAGAAGGTTGAAGCCATATTACGAAGCCTTGAATGCTGAACATCAGATCTGGGG

CTATATTCTTACCTTGATACATTTCAGAAGCAACTGAAATCGTAGGACCTTCCTTGCTTCTCTATTGGGT

GAATGTTTCTCAGTCTTGGTGTGAGTCTCAGTGCCTACGTAGTTAAAGCTTACTGAAATGTTCCCTTTAC

AATTCTAGAGAGATATGTCCTTTATGTTGACATGTTCATGCTGACAGACTGCATCTGATTAAACAGCTGC

CTGTGCAATGCCTCCAAGTGTGGATAAAAGAAAAATTAAACTCATAATCTTGGACAGCCATGTGTAGACT

AGTTACATTGATCAAAGGGCAATAGAAATGATCCAGTGAGGATTTGTCTGAATTTCCCACAATTATTTAA

AATCTACCTCAAATACCTGTTCATCTATAATGCCTCCCCTGAGGCCTTCATTCTGAATAGTACCTCTGTC

TCTGTCCCCAAAGCACTAACTGATCCCTGTGATAGCGCACTTCCCAGCCAGGCTGATATGTAGACTTGGC

TGCCTGTGTATCTTTTCCCCATAGACTGTGAGCTTCCTTTTATGAATAATAATTGTAGCTAGCATTTAGT

AGGGTGCTCCTACCTGTTAAACTCTATGATGAGTGCTTTACATAGATTATATCATTTATTCACTAAACAG

TCCTTTAAAATGGTGCTATATTCACTAAACAGTCCTTTAAAATGGTGCTATATTCACTAAACAGTCATTT

AAAATGGTATTATTCTTCTTCATCTTACAGGTAAACAAACTAAGGCAAAAAAAAAAGTGAAATAATAAGT

GCCAGTACACAGAGCTAGTAAGGAATAGGGTCTGCCAGGTCCCAAAAAGCATGCCATCACCTTTGCCCCA

TACTGCCTCTGGTACAGATAGAGGTAATGTCTTATTTATCACTGCCATCCACTGGACCCAGCTTAGTGCC

TGACACACAGAGGGGCTCAGTCAATGCTGATTGGTTTGAGGTGGAGCAAAAATGCTTAGCAGGGTGAGCA

CCTTTGCTGTGATTGAGTATCTGATTCTCTATGAAGAGAAGGGGAGTCCTGAGCCAAACACATTCCTCTG

GCTCCTGGCTGTCATCTTTATTTGCCCGGCTTCTTTGCTCTTCCTCCTTCCTAACTGCACCGTTTGGATT

(snip)

http://togows.dbcls.jp/entry/genbank/AY948115.1.fasta
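The data above is in FASTA format: a definition line starting with `>` carries the metadata, and the following lines carry the sequence. As a rough illustration of that layout, here is a minimal plain-Ruby sketch. It is not BioRuby's parser, and `parse_fasta` and `parse_ncbi_definition` are hypothetical helper names. It assumes the NCBI-style `gi|...|gb|...|` identifier layout seen in this example and uses only the first two sequence lines.

```ruby
# Minimal FASTA reader in plain Ruby (an illustrative sketch, not BioRuby's parser).

# Header and first two sequence lines of the example entry.
EXAMPLE_FASTA = <<~FASTA
  >gi|60459557|gb|AY948115.1| Homo sapiens alcohol dehydrogenase 1A (class I), alpha polypeptide (ADH1A) gene, complete cds
  GAGGGCGACAAAAGGGAACAGACCCAAAACCACAGGAGAGATGCTAGCATGACAGGGATGCAGAGACATA
  AAGCACAACAGTGAGATGGAGTTAATATACCTCCACGAGGGTGACCTTGTCCTGCATCTCAAATTTTGGG
FASTA

# Split the text at each ">" that starts a line; the first line of every
# chunk is the definition, the remaining lines are joined into the sequence.
def parse_fasta(text)
  text.split(/^>/).reject { |chunk| chunk.strip.empty? }.map do |chunk|
    definition, *seq_lines = chunk.lines.map(&:strip)
    { definition: definition, sequence: seq_lines.join }
  end
end

# Split an NCBI-style definition ("gi|<number>|<db>|<accession>| <description>")
# into its fields. Assumes exactly this layout.
def parse_ncbi_definition(definition)
  ids, description = definition.split(" ", 2)
  _tag, gi, database, accession = ids.split("|")
  { gi: gi, database: database, accession: accession, description: description }
end

entry = parse_fasta(EXAMPLE_FASTA).first
meta = parse_ncbi_definition(entry[:definition])

puts meta[:accession]
puts meta[:description]
puts entry[:sequence].length
```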

Page 6: BioRuby -- Bioinformatics Library

What can BioRuby do?

Biological data analysis

DNA, RNA

Protein

Relation of genes

Phylogenetic tree

Bibliography

I/O with other software

Utilize web services

Page 7: BioRuby -- Bioinformatics Library

Code example

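require "rubygems"
require "bio"

# Read sequence entries from the files named on the command line (or from stdin)
f = Bio::FlatFile.open(ARGF)
f.each do |entry|
  dna = entry.naseq            # nucleotide sequence of the entry
  aa = dna.translate           # translate DNA into an amino acid sequence
  seq = Bio::Sequence.new(aa)  # wrap the protein sequence for output
  print seq.output_fasta(entry.definition)
end

DNA → protein translation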
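To make the translation step less of a black box, the sketch below shows the general idea in plain Ruby: read the DNA three bases (one codon) at a time and look each codon up in the standard genetic code. This is an illustration under stated assumptions, independent of how BioRuby implements `translate`. The names `BASES`, `AMINO_ACIDS`, `CODON_TABLE` and the standalone `translate` method are hypothetical. Only reading frame 1 is handled, and unknown codons are written as "X".

```ruby
# Plain-Ruby sketch of DNA -> protein translation with the standard genetic code.
# Not BioRuby's implementation; names below are made up for this example.

BASES = %w[T C A G].freeze

# Amino acids for the 64 codons in TCAG order (TTT, TTC, TTA, TTG, TCT, ...).
# "*" marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG".freeze

# Build { "TTT" => "F", "TTC" => "F", ... } by pairing every codon with its amino acid.
CODON_TABLE = BASES.product(BASES, BASES).map(&:join).zip(AMINO_ACIDS.chars).to_h.freeze

# Translate in frame 1; trailing bases that do not fill a codon are ignored,
# and codons containing anything other than A, C, G, T become "X".
def translate(dna)
  dna.upcase.scan(/.{3}/).map { |codon| CODON_TABLE.fetch(codon, "X") }.join
end

puts translate("ATGGCCTAA") # prints "MA*" (Met, Ala, stop)
```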

Page 8: BioRuby -- Bioinformatics Library

Status

Latest version: BioRuby 1.4.1 (22/Oct/2010)

Supported Ruby version: 1.8.x

Will soon be migrated to 1.9

Files

Library: 230 files / 35,000 lines (excluding comments and blank lines)

Tests: 120 files / 22,000 lines

Sample code: 70 files

Functionality

580 classes/modules

2,800 methods

Plugin system introduced (using gem)

Page 9: BioRuby -- Bioinformatics Library

BioRuby developers' community

Core developers (6 people)

Toshiaki Katayama (leader) (Univ. of Tokyo, Japan)

Naohisa Goto (release manager) (Osaka Univ., Japan)

Mitsuteru Nakao (Japan)

Pjotr Prins (Wageningen University, Netherlands)

Raoul Bonnal (INGM, Italy)

Jan Aerts (Belgium)

More than 30 contributors in total over 10 years

Active developers and users around the world

Page 10: BioRuby -- Bioinformatics Library

Page 11: BioRuby -- Bioinformatics Library

Brief history

11/2000 BioRuby project started

06/2001 The first version (BioRuby 0.1)

2005-2006 IPA Exploratory (未踏) Software Project

02/2006 BioRuby 1.0.0 released

09/2008 moved from CVS to Git

08/2010 BioRuby research paper published

10/2010 BioRuby 1.4.1 released

Page 12: BioRuby -- Bioinformatics Library

Preceding projects

Driven by the demands of genome projects in the late 1990s

BioPerl – since 1996 (Perl 1987)

Biopython – since 1999 (Python 1991)

BioJava – since 1999 (Java 1995)

BioRuby – since 2000 (Ruby 1995)

Together with the Open Bioinformatics Foundation

http://open-bio.org/

Google Summer of Code 2009, 2010, 2011

Page 13: BioRuby -- Bioinformatics Library

BioHackathon

Open Bio* Hackathon (2002, 2003)

Phyloinformatics Hackathon (2006)

DBCLS BioHackathon (2008-2010)

Page 14: BioRuby -- Bioinformatics Library

Academic community

• Bioinformatics Open Source Conference

• GIW / JSBi (Japanese Society for Bioinformatics)

• MBSJ (The Molecular Biology Society of Japan)

• Open Bio Japan

Page 15: BioRuby -- Bioinformatics Library

Open Source Community

• Ruby Kansai (関西Ruby勉強会) (2005-)

• IPA Exploratory (未踏) Software Project (2005-2006)

• RubyKaigi (2006-)

• Google Summer of Code (2009-2011) (Open Bioinformatics Foundation)

Page 16: BioRuby -- Bioinformatics Library

Recent topics

Release of new version (BioRuby 1.4.2)

Ruby 1.9.3 migration

Revolution in DNA sequencing technology

Page 17: BioRuby -- Bioinformatics Library

Next-Generation Sequencer (NGS)

Example: Illumina HiSeq 2000

>600GB of DNA sequence in 10 days
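To put that figure in perspective, here is a small back-of-the-envelope calculation in Ruby. It assumes that "600GB" means 600 billion bases (one base per byte, as with plain-text DNA) and takes 600GB as a lower bound, since the slide says "more than". The constant and method names are made up for this sketch.

```ruby
# Rough throughput arithmetic for the figures on this slide (a sketch under stated assumptions).

BASES_PER_RUN      = 600_000_000_000 # ">600GB", treated as 600 billion bases (lower bound)
RUN_DAYS           = 10
HUMAN_GENOME_BASES = 3_000_000_000   # "3GB in total" from the DNA slide

# Average number of bases produced per second over the whole run (integer division).
def bases_per_second(bases, days)
  bases / (days * 24 * 60 * 60)
end

# How many human-genome-sized data sets fit into one run.
def genome_equivalents(bases, genome_size)
  bases / genome_size
end

puts bases_per_second(BASES_PER_RUN, RUN_DAYS)              # prints 694444
puts genome_equivalents(BASES_PER_RUN, HUMAN_GENOME_BASES)  # prints 200
```

Under these assumptions, a single run produces on the order of 700,000 bases every second, or about 200 times the size of the human genome, which gives a sense of why storage, CPU and memory appear on the list of missing resources.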

Lack of resources

• HDD

• CPU

• Memory

• Software

• Human

• Money

• ...

Page 18: BioRuby -- Bioinformatics Library

Join us

BioRuby

Web http://bioruby.org

ML [email protected]

GitHub https://github.com/bioruby/bioruby

A book written by a BioRuby user:

多田雅人, 「Rubyではじめるバイオインフォマティクス」 ("Bioinformatics Starting with Ruby")

Now on sale!!