2014 10-15-nextbug edinburgh

@yannick__ http://yannick.poulet.org

Social insect evolution: genomics opportunities & approaches

2014-10-15-NextBUG

mailto:[email protected]


© National Geographic

Atta leaf-cutter ants

Oecophylla Weaver ants

© ameisenforum.de

© ameisenforum.de

Weaver ants

© ameisenforum.de

Oecophylla Weaver ants

Tofilski et al 2008

Forelius pusillus

Tofilski et al 2008

Forelius pusillus hides the nest entrance at night

Before

Workers staying outside die: “preventive self-sacrifice”

Tofilski et al 2008

Forelius pusillus hides the nest entrance at night

Dorylus driver ants: ants with no home

© BBC

© Dirk Mezger

Ritualized fighting

© Carsten Brühl

Camponotus gigas

Pfeiffer & Linsenmair 2001

Army ant milling - “spiral of death”

Animal biomass (Brazilian rainforest), from Fittkau & Klinge 1973

• Other insects: 49.6
• Amphibians: 2.8
• Reptiles: 3.7
• Birds: 5.3
• Mammals: 14.5
• Earthworms: 17.3
• Spiders: 4.7
• Soil fauna excluding earthworms, ants & termites: 148
• Ants & termites: 114
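To put the biomass figures above in proportion, here is a minimal Python sketch that sums the values exactly as they appear on the slide and computes the share taken by ants & termites. The grouping of amphibians, reptiles, birds and mammals as "vertebrates" is an assumption made here for the comparison, not something stated on the slide.

```python
# Biomass values as read from the slide (Fittkau & Klinge 1973).
biomass = {
    "Other insects": 49.6,
    "Amphibians": 2.8,
    "Reptiles": 3.7,
    "Birds": 5.3,
    "Mammals": 14.5,
    "Earthworms": 17.3,
    "Spiders": 4.7,
    "Soil fauna excluding earthworms, ants & termites": 148,
    "Ants & termites": 114,
}

total = sum(biomass.values())
ant_termite_share = biomass["Ants & termites"] / total

# Assumption for illustration: these four groups are lumped as "vertebrates".
vertebrates = sum(biomass[g] for g in ("Amphibians", "Reptiles", "Birds", "Mammals"))
ratio_to_vertebrates = biomass["Ants & termites"] / vertebrates

print(f"Ants & termites: {ant_termite_share:.1%} of the listed animal biomass")
print(f"Ants & termites vs. vertebrates: {ratio_to_vertebrates:.1f}x")
```

On these numbers, ants & termites account for roughly a third of the listed biomass and several times the combined vertebrate biomass.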

Well-studied:

• behavior

• morphology

• evolutionary context

• ecology

This changes everything.

454, Illumina, SOLiD...

Any lab can sequence anything!

Major research areas

Genes/mechanisms for evolution of social behavior?

www.sciencemag.org SCIENCE VOL 331 25 FEBRUARY 2011 1067


Solenopsis invicta fire ants are a big problem, and very well studied!

Ascunce et al 2011

Solenopsis invicta fire ant: two social forms

Single-queen form:
• 1 large queen
• Independent founding
• Highly territorial
• Many sizes of workers

Multiple-queen form:
• 2-100 smaller queens
• Dependent founding
• No inter-colony aggression
• All workers similar size

Fire ants + “starch gel”

Population genetics: allozyme screen

Ken Ross, L. Keller

=> “Gp-9” locus associated with social form

Single queen form / Multiple queen form

Ken Ross and colleagues; Laurent Keller and colleagues

Social form completely associated with Gp-9 locus

[Figure, built up over several slides: Gp-9 genotypes (BB, Bb, bb) in the single queen form and the multiple queen form, with frequency labels “(>15%)” and “(<5%)” and “x” marks added step by step. Gp-9 bb females are rare.]


Sex chromosomes

X Y

Gp-9 B

Gp-9 b

SB Sb

“Social chromosomes”

?

Wang et al Nature 2013

Major research areas

Genes/mechanisms for differences (e.g., lifespan?)?

Genes/mechanisms for evolution of social behavior?

genome evolution social evolution

This changes everything.

454, Illumina, SOLiD...

Any lab can sequence anything!

Genomics is hard.

• Biology/life is complex
• Field is young.
• Biologists lack computational training.
• Generally, analysis tools suck.
  • badly written
  • badly tested
  • hard to install
  • output quality… often questionable.
• Understanding/visualizing/massaging data is hard.
• Datasets continue to grow!

Inspiration?

arXiv:1210.0530v3 [cs.MS] 29 Nov 2012

Best Practices for Scientific Computing

Greg Wilson ∗, D.A. Aruliah †, C. Titus Brown ‡, Neil P. Chue Hong §, Matt Davis ¶, Richard T. Guy ∥, Steven H.D. Haddock ∗∗, Katy Huff ††, Ian M. Mitchell ‡‡, Mark D. Plumbley §§, Ben Waugh ¶¶, Ethan P. White ∗∗∗, Paul Wilson †††

∗ Software Carpentry ([email protected])
† University of Ontario Institute of Technology ([email protected])
‡ Michigan State University ([email protected])
§ Software Sustainability Institute ([email protected])
¶ Space Telescope Science Institute ([email protected])
∥ University of Toronto ([email protected])
∗∗ Monterey Bay Aquarium Research Institute ([email protected])
†† University of Wisconsin ([email protected])
‡‡ University of British Columbia ([email protected])
§§ Queen Mary University of London ([email protected])
¶¶ University College London ([email protected])
∗∗∗ Utah State University ([email protected])
††† University of Wisconsin ([email protected])

Scientists spend an increasing amount of time building and using software. However, most scientists are never taught how to do this efficiently. As a result, many are unaware of tools and practices that would allow them to write more reliable and maintainable code with less effort. We describe a set of best practices for scientific software development that have solid foundations in research and experience, and that improve scientists’ productivity and the reliability of their software.

Software is as important to modern scientific research as telescopes and test tubes. From groups that work exclusively on computational problems, to traditional laboratory and field scientists, more and more of the daily operation of science revolves around computers. This includes the development of new algorithms, managing and analyzing the large amounts of data that are generated in single research projects, and combining disparate datasets to assess synthetic problems.

Scientists typically develop their own software for these purposes because doing so requires substantial domain-specific knowledge. As a result, recent studies have found that scientists typically spend 30% or more of their time developing software [19, 52]. However, 90% or more of them are primarily self-taught [19, 52], and therefore lack exposure to basic software development practices such as writing maintainable code, using version control and issue trackers, code reviews, unit testing, and task automation.

We believe that software is just another kind of experimental apparatus [63] and should be built, checked, and used as carefully as any physical apparatus. However, while most scientists are careful to validate their laboratory and field equipment, most do not know how reliable their software is [21, 20]. This can lead to serious errors impacting the central conclusions of published research [43]: recent high-profile retractions, technical comments, and corrections because of errors in computational methods include papers in Science [6], PNAS [39], the Journal of Molecular Biology [5], Ecology Letters [37, 8], the Journal of Mammalogy [33], and Hypertension [26].

In addition, because software is often used for more than a single project, and is often reused by other scientists, computing errors can have disproportional impacts on the scientific process. This type of cascading impact caused several prominent retractions when an error from another group’s code was not discovered until after publication [43]. As with bench experiments, not everything must be done to the most exacting standards; however, scientists need to be aware of best practices both to improve their own approaches and for reviewing computational work by others.

This paper describes a set of practices that are easy to adopt and have proven effective in many research settings. Our recommendations are based on several decades of collective experience both building scientific software and teaching computing to scientists [1, 65], reports from many other groups [22, 29, 30, 35, 41, 50, 51], guidelines for commercial and open source software development [61, 14], and on empirical studies of scientific computing [4, 31, 59, 57] and software development in general (summarized in [48]). None of these practices will guarantee efficient, error-free software development, but used in concert they will reduce the number of errors in scientific software, make it easier to reuse, and save the authors of the software time and effort that can be used for focusing on the underlying scientific questions.

1. Write programs for people, not computers.

Scientists writing software need to write code that both executes correctly and can be easily read and understood by other programmers (especially the author’s future self). If software cannot be easily read and understood it is much more difficult to know that it is actually doing what it is intended to do. To be productive, software developers must therefore take several aspects of human cognition into account: in particular, that human working memory is limited, human pattern matching abilities are finely tuned, and human attention span is short [2, 23, 38, 3, 55].

First, a program should not require its readers to hold more than a handful of facts in memory at once (1.1). Human working memory can hold only a handful of items at a time, where each item is either a single fact or a “chunk” aggregating several facts [2, 23], so programs should limit the total number of items to be remembered to accomplish a task. The primary way to accomplish this is to break programs up into easily understood functions, each of which conducts a single, easily understood, task. This serves to make each piece of the program easier to understand in the same way that breaking up a scientific paper using sections and paragraphs makes it easier to read. For example, a function to calculate the area of a rectangle can be written to take four separate coordinates:

    def rect_area(x1, y1, x2, y2):
        ...calculation...

or to take two points:

    def rect_area(point1, point2):
        ...calculation...

The latter function is significantly easier for people to read and remember, while the former is likely to lead to errors, not
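The excerpt leaves the body of `rect_area` elided. As a minimal sketch only, the two-point version might look like the following, assuming an axis-aligned rectangle described by two opposite corners; the `Point` type is a hypothetical helper introduced here and does not appear in the paper.

```python
from typing import NamedTuple


class Point(NamedTuple):
    """Hypothetical 2D point type (not part of the quoted paper)."""
    x: float
    y: float


def rect_area(point1: Point, point2: Point) -> float:
    """Area of the axis-aligned rectangle with opposite corners point1 and point2."""
    width = abs(point2.x - point1.x)
    height = abs(point2.y - point1.y)
    return width * height


area = rect_area(Point(0, 0), Point(3, 2))
print(area)
```

Grouping each corner into one named value means the caller has two things to remember rather than four, and the order of the corners does not matter.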


1. Write programs for people, not computers.
2. Automate repetitive tasks.
3. Use the computer to record history.
4. Make incremental changes.
5. Use version control.
6. Don’t repeat yourself (or others).
7. Plan for mistakes.
8. Optimize software only after it works correctly.
9. Document the design and purpose of code rather than its mechanics.
10. Conduct code reviews.

Inspiration?

• Technologies

• Planning for mistakes

• Automated testing

• Continuous

• Writing for people: use style guide

Code for people: Use a style guide

• For R: http://r-pkgs.had.co.nz/style.html

R style guide extract

Coding for people: Indent your code!

Programming better

• variable naming

• coding width: 100 characters

• indenting

• Follow conventions, e.g. “Google R Style”

• Versioning: DropBox & http://github.com/

• Automated testing

• “being able to use, understand and improve your code in 6 months & in 60 years” (approximate quote, Damian Conway)

preprocess_snps <- function(snp_table, testing = FALSE) {
  if (testing) {
    # run a bunch of tests of extreme situations.
    # quit if a test gives a weird result.
  }
  # real part of function.
}
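The skeleton above only hints at what "tests of extreme situations" could look like. Here is a minimal sketch of the same idea, written in Python for illustration. The filtering rule (drop any SNP row with a missing genotype call) and the row layout are assumptions invented for this example; the slide does not say what `preprocess_snps` actually does.

```python
def preprocess_snps(snp_table):
    """Keep only SNP rows with no missing genotype calls (None).

    Hypothetical behaviour, chosen only to give the tests something to check.
    """
    return [row for row in snp_table if None not in row["calls"]]


def test_preprocess_snps():
    # Extreme situation: an empty table stays empty.
    assert preprocess_snps([]) == []

    # Extreme situation: every row has a missing call, so nothing survives.
    all_missing = [{"snp": "s1", "calls": [None, "A"]}]
    assert preprocess_snps(all_missing) == []

    # Ordinary situation: complete rows are kept unchanged.
    complete = [{"snp": "s2", "calls": ["A", "G"]}]
    assert preprocess_snps(complete) == complete


test_preprocess_snps()
```

Keeping the tests in a separate function, rather than behind a `testing` flag, lets them run automatically on every change without touching the real code path.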


Line length

Strive to limit your code to 80 characters per line. This fits comfortably on a printed page with a reasonably sized font. If you find yourself running out of room, this is a good indication that you should encapsulate some of the work in a separate function.
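The 80-character guideline is easy to check mechanically. The sketch below is a small stand-alone Python helper that reports over-long lines in a piece of source text; `long_lines` is a hypothetical name, and the helper is not part of any style-checking package mentioned in this talk.

```python
def long_lines(source, limit=80):
    """Return (line_number, length) for every line longer than `limit` characters."""
    return [
        (number, len(line))
        for number, line in enumerate(source.splitlines(), start=1)
        if len(line) > limit
    ]


# A short line followed by a deliberately over-long one.
example = "x <- 1\n" + "y <- c(" + ", ".join(["1"] * 40) + ")\n"
print(long_lines(example))
```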

R style guide extract

ant_measurements <- read.table(file = '~/Downloads/Web/ant_measurements.txt', header=TRUE, sep='\t', col.names = c('colony', 'individual', 'headwidth', 'mass'))

ant_measurements <- read.table(file      = '~/Downloads/Web/ant_measurements.txt',
                               header    = TRUE,
                               sep       = '\t',
                               col.names = c('colony', 'individual', 'headwidth', 'mass')
                               )

Code for people: Use a style guide

• For R: http://r-pkgs.had.co.nz/style.html
• For Ruby: https://github.com/bbatsov/ruby-style-guide

Automatically check your code:

install.packages("lint") # once
library(lint) # every time
lint("file_to_check.R")

Four tools that suck less. (hopefully)

1. SequenceServer

“Can you BLAST this for me?”

• Once I wanted to set up a BLAST server.

Anurag Priyam, Mechanical engineering student, IIT Kharagpur

Aim: An open source idiot-proof web-interface for custom BLAST

Sure, I can help you…


Antgenomes.org SequenceServer BLAST made easy

(well, we’re trying...)

http://antgenomes.org/blast

http://www.sequenceserver.com/

(requires a BLAST+ install)

1. Installing

gem install sequenceserver

2. Configure.

# ~/.sequenceserver.conf
bin: ~/ncbi-blast-2.2.25+/bin/
database: /Users/me/blast_databases/

Do you have BLAST-formatted databases? If not:

sequenceserver format-databases /path/to/fastas

3. Launch.

sequenceserver
### Launched SequenceServer at: http://0.0.0.0:4567

New release (soon)

Demo

http://antgenomes.org/sequenceserver

http://localhost:4567


Web server: Anurag Priyam & Git community - http://sequenceserver.com

BLAST on 48-core, 512 GB fat machine, via ssh

2. Bionode

Module counts

Node = “NPM”

Reusable, small and tested modules

Examples

bionode.io (online shell)

BASH

bionode-ncbi urls assembly Solenopsis invicta | grep genomic.fna
http://ftp.ncbi.nlm.nih.gov/genomes/all/GCA_000188075.1_Si_gnG/GCA_000188075.1_Si_gnG_genomic.fna.gz

bionode-ncbi download sra arthropoda | bionode-sra

bionode-ncbi download gff bacteria

JavaScript

var ncbi = require('bionode-ncbi')
ncbi.urls('assembly', 'Solenopsis invicta', gotData)
function gotData(urls) {
  var genome = urls[0].genomic.fna
  // download() is not defined on the slide; assumed to exist elsewhere
  download(genome)
}

# Get descriptions for papers related to SRA search
bionode ncbi search sra Solenopsis invicta | \
  tool-stream extractProperty uid | \
  bionode ncbi link sra pubmed | \
  tool-stream extractProperty destUID | \
  bionode ncbi search pubmed

Difficulty writing scalable, reproducible and complex bioinformatic pipelines.

Solution: Node.js everywhere

Streams

var ncbi = require('bionode-ncbi')
var tool = require('tool-stream')
var through = require('through2')
var fork1 = through.obj()
var fork2 = through.obj()

ncbi
.search('sra', 'Solenopsis invicta')
.pipe(fork1)
.pipe(dat.reads)

fork1
.pipe(tool.extractProperty('expxml.Biosample.id'))
.pipe(ncbi.search('biosample'))
.pipe(dat.samples)

fork1
.pipe(tool.extractProperty('uid'))
.pipe(ncbi.link('sra', 'pubmed'))
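The fork-and-pipe pattern above can be illustrated without Node.js. The sketch below is a rough Python analogue using generators and `itertools.tee`; it is not Bionode code. The records are hypothetical stand-ins for SRA search results, and `extract_property` is a made-up helper that loosely mirrors what `tool.extractProperty` appears to do on the slide.

```python
from itertools import tee


def extract_property(records, key):
    """Lazily yield the value stored under `key` for each record in the stream."""
    for record in records:
        yield record[key]


# Hypothetical records standing in for SRA search results.
sra_results = [
    {"uid": "1001", "biosample": "SAMN01"},
    {"uid": "1002", "biosample": "SAMN02"},
]

# Fork one stream of results into two independent branches.
fork1, fork2 = tee(iter(sra_results))

uids = list(extract_property(fork1, "uid"))
biosamples = list(extract_property(fork2, "biosample"))

print(uids)
print(biosamples)
```

As with the streams on the slide, each branch consumes the same upstream records one at a time, so a pipeline can be extended by adding a branch rather than rerunning the search.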

Working with Gene predictions

Gene prediction

Dozens of software algorithms: dozens of predictions

20% failure rate:
• missing pieces
• extra pieces
• incorrect merging
• incorrect splitting

Visual inspection... and manual fixing required.

1 gene = 5 minutes to 3 days

Yandell & Ence 2013 NRG

GTCTACAATGCGATTGTAAAATAGCACGAgAGGTGCATATGATGAACGACTATGTTCCACAACCACAGCTCATATATAACATGATTTtGTTTGCCGAATTCATACACGCATTACAACACACATTGAATTCAATAATAATATCAAATTCACATTCAAAGCTTTCAAGTTAGACAAAAGTTTTAATGCCGTTTTtACCTGTTTTtGAAAAGGTAATTTTCTTTAGATATATTATGTTGAATaTTAGGGTTTTTATAAAGAATGTGTATATTGUTTACAATATAAAAGACACAATTGCAAACTAGCATGATTGTAAACAATTGCTAAACGGATCAATATAAATTAAAATTGTAATATTAAGTATCAAACCGATAATTTTTATTTATTGTTCATTGTTTGTTCTTTATTTTGTTATTTGTAAATAATGAAA

Evidence

Evidence

Consensus:

3. GeneValidator

Monica Dragan: https://github.com/monicadragan/GeneValidator

Ismail Moghul: https://github.com/IsmailM/GeneValidatorApp

GeneValidator

Run on:

★whole geneset: identify most problematic predictions

★alternative models for a gene (choose best)

★individual genes (while manually curating)

Warning: Work in Progress

gem install GeneValidator
gem install GeneValidatorApp

http://afra.sbcs.qmul.ac.uk/genevalidator

4. Afra: Crowdsourcing gene model curation

Gene prediction

Dozens of software algorithms: dozens of predictions

20% failure rate:
• missing pieces
• extra pieces
• incorrect merging
• incorrect splitting

Visual inspection... and manual fixing required.

1 gene = 20 minutes to 3 days

15,000 genes * 20 species = impossible.

Yandell & Ence 2013 NRG
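To make the "impossible" estimate concrete, here is a back-of-the-envelope Python sketch using only the most optimistic figure on the slide (20 minutes per gene). The 8-hour working day and 220 working days per year are assumptions added here for illustration and are not from the talk.

```python
GENES_PER_SPECIES = 15_000
SPECIES = 20
MINUTES_PER_GENE = 20  # lower bound quoted on the slide

# Assumptions for illustration only (not from the slide):
HOURS_PER_WORKING_DAY = 8
WORKING_DAYS_PER_YEAR = 220

total_genes = GENES_PER_SPECIES * SPECIES
total_hours = total_genes * MINUTES_PER_GENE / 60
person_years = total_hours / HOURS_PER_WORKING_DAY / WORKING_DAYS_PER_YEAR

print(f"{total_genes:,} genes, {total_hours:,.0f} hours, about {person_years:.0f} person-years")
```

Even in this best case the work adds up to decades of one curator's time, which motivates spreading it across many contributors.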

GTCTACAATGCGATTGTAAAATAGCACGAgAGGTGCATATGATGAACGACTATGTTCCACAACCACAGCTCATATATAACATGATTTtGTTTGCCGAATTCATACACGCATTACAACACACATTGAATTCAATAATAATATCAAATTCACATTCAAAGCTTTCAAGTTAGACAAAAGTTTTAATGCCGTTTTtACCTGTTTTtGAAAAGGTAATTTTCTTTAGATATATACAGTTTGTAATaTTAGGTATTTTATAAACAGTGTGTATATTTCTTACAATATAAAAGACACAATTGCAAACTAGCATGATTGTAAACAATTGCTAAACGGATCAATATAAATTAAAATTGTAATATTAAGTATCAAACCGATAATTTTTATTTATTGTTCATTGTTTGTTCTTTATTTTGTTATTTGTAAATAATGAAA

Evidence

Evidence

Consensus:

Algorithm discovery by protein folding game players

Firas Khatib (a), Seth Cooper (b), Michael D. Tyka (a), Kefan Xu (b), Ilya Makedon (b), Zoran Popović (b), David Baker (a,c,1), and Foldit Players

(a) Department of Biochemistry; (b) Department of Computer Science and Engineering; and (c) Howard Hughes Medical Institute, University of Washington, Box 357370, Seattle, WA 98195

Contributed by David Baker, October 5, 2011 (sent for review June 29, 2011)

Foldit is a multiplayer online game in which players collaborate and compete to create accurate protein structure models. For specific hard problems, Foldit player solutions can in some cases outperform state-of-the-art computational methods. However, very little is known about how collaborative gameplay produces these results and whether Foldit player strategies can be formalized and structured so that they can be used by computers. To determine whether high performing player strategies could be collectively codified, we augmented the Foldit gameplay mechanics with tools for players to encode their folding strategies as “recipes” and to share their recipes with other players, who are able to further modify and redistribute them. Here we describe the rapid social evolution of player-developed folding algorithms that took place in the year following the introduction of these tools. Players developed over 5,400 different recipes, both by creating new algorithms and by modifying and recombining successful recipes developed by other players. The most successful recipes rapidly spread through the Foldit player population, and two of the recipes became particularly dominant. Examination of the algorithms encoded in these two recipes revealed a striking similarity to an unpublished algorithm developed by scientists over the same period. Benchmark calculations show that the new algorithm independently discovered by scientists and by Foldit players outperforms previously published methods. Thus, online scientific game frameworks have the potential not only to solve hard scientific problems, but also to discover and formalize effective new strategies and algorithms.

citizen science ∣ crowd-sourcing ∣ optimization ∣ structure prediction ∣ strategy

Citizen science is an approach to leveraging natural human abilities for scientific purposes. Most such efforts involve visual tasks such as tagging images or locating image features (1–3). In contrast, Foldit is a multiplayer online scientific discovery game, in which players become highly skilled at creating accurate protein structure models through extended game play (4, 5). Foldit recruits online gamers to optimize the computed Rosetta energy using human spatial problem-solving skills. Players manipulate protein structures with a palette of interactive tools and manipulations. Through their interactive exploration Foldit players also utilize user-friendly versions of algorithms from the Rosetta structure prediction methodology (6) such as wiggle (gradient-based energy minimization) and shake (combinatorial side chain rotamer packing). The potential of gamers to solve more complex scientific problems was recently highlighted by the solution of a long-standing protein structure determination problem by Foldit players (7).

One of the key strengths of game-based human problem exploration is the human ability to search over the space of possible strategies and adapt those strategies to the type of problem and stage of problem solving (5). The variability of tactics and strategies stems from the individuality of each player as well as multiple methods of sharing and evolution within the game (group play, game chat), and outside of the game [wiki pages (8)]. One way to arrive at algorithmic methods underlying successful human Foldit play would be to apply machine learning techniques to the detailed logs of expert Foldit players (9). We chose instead to rely on a superior learning machine: Foldit players themselves.

As the players themselves understand their strategies better than anyone, we decided to allow them to codify their algorithms directly, rather than attempting to automatically learn approximations. We augmented standard Foldit play with the ability to create, edit, share, and rate gameplay macros, referred to as “recipes” within the Foldit game (10). In the game each player has their own “cookbook” of such recipes, from which they can invoke a variety of interactive automated strategies. Players can share recipes they write with the rest of the Foldit community or they can choose to keep their creations to themselves.

In this paper we describe the quite unexpected evolution of recipes in the year after they were released, and the striking convergence of this very short evolution on an algorithm very similar to an unpublished algorithm recently developed independently by scientific experts that improves over previous methods.

Results

In the social development environment provided by Foldit, players evolved a wide variety of recipes to codify their diverse strategies to problem solving. During the three and a half month study period (see Materials and Methods), 721 Foldit players ran 5,488 unique recipes 158,682 times and 568 players wrote 5,202 recipes. We studied these algorithms and found that they fell into four main categories: (i) perturb and minimize, (ii) aggressive rebuilding, (iii) local optimize, and (iv) set constraints. The first category goes beyond the deterministic minimize function provided to Foldit players, which has the disadvantage of readily being trapped in local minima, by adding in perturbations to lead the minimizer in different directions (11). The second category uses the rebuild tool, which performs fragment insertion with loop closure, to search different areas of conformation space; these recipes are often run for long periods of time as they are designed to rebuild entire regions of a protein rather than just refining them (Fig. S1). The third category of recipes performs local minimizations along the protein backbone in order to improve the Rosetta energy for every segment of a protein. The final category of recipes assigns constraints between beta strands or pairs of residues (rubber bands), or changes the secondary structure assignment to guide subsequent optimization.

Different algorithms were used with very different frequencies during the experiment. Some are designated by the authors as public and are available for use by all Foldit players, whereas others are private and available only to their creator or their Foldit team. The distribution of recipe usage among different players is shown in Fig. 1 for the 26 recipes that were run over 1,000 times. Some recipes, such as the one represented by the leftmost bar, were used many times by many different players, while others, such as the one represented by the pink bar in the

Author contributions: F.K., S.C., Z.P., and D.B. designed research; F.K., S.C., M.D.T., and F.P. performed research; F.K., S.C., M.D.T., K.X., and I.M. analyzed data; and F.K., S.C., Z.P., and D.B. wrote the paper.

The authors declare no conflict of interest.

Freely available online through the PNAS open access option.

1 To whom correspondence should be addressed. E-mail: [email protected].

This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1115898108/-/DCSupplemental.

www.pnas.org/cgi/doi/10.1073/pnas.1115898108 PNAS ∣ November 22, 2011 ∣ vol. 108 ∣ no. 47 ∣ 18949–18953


http://Fold.it


Crowd-sourcing the visual inspection + correction of gene models.

Challenges

• Recruiting & retaining contributors

Recruiting & retaining contributors

Plan A: get students.

• Increase accessibility:
  • Make tasks small & simple
  • Need excellent tutorials & training
  • Need an intelligent “mothering” user interface.
• Provide rewards:
  • Better grades
  • Learning experience
  • Good karma (helping science)
  • Prestige & pride (on facebook; points & badges “leaderboard”, with certificates, in publications)
  • Opportunities to develop expertise & responsibilities

Challenges

• Ensuring quality

Ensuring quality

• Excellent tutorials/training

• Make tasks small & simple

• Redundancy

• Review of conflicts by senior users.
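The redundancy and conflict-review points above could be automated roughly as follows. This is a sketch under stated assumptions, not Afra's implementation. A gene model is represented here as a tuple of (start, end) exon coordinates, and "consistent" is taken to mean that all redundant submissions are identical. The function name and return values are hypothetical.

```python
def auto_check(submissions):
    """Compare redundant curations of the same gene model.

    Returns "next_task" when every submission agrees, and "review_task"
    when there is a conflict for a senior user to resolve.
    """
    if len(submissions) < 2:
        raise ValueError("need at least two submissions to compare")
    first = submissions[0]
    if all(model == first for model in submissions[1:]):
        return "next_task"
    return "review_task"


# Hypothetical gene models: tuples of (start, end) exon coordinates.
model_a = ((100, 200), (300, 450))
model_b = ((100, 200), (300, 460))  # differs in the end of the second exon

print(auto_check([model_a, model_a, model_a]))
print(auto_check([model_a, model_b, model_a]))
```

A stricter or looser rule (for example, majority agreement) would slot into the same place; exact agreement is simply the easiest rule to reason about.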

Begin

Needs curation

Create initial tasks

Being curated: Curate, then Submit (shown three times, in parallel)

Auto-check
• Inconsistent: create “review” task
• Consistent: create next required task

Done

http://afra.sbcs.qmul.ac.uk

Anurag Priyam http://github.com/yeban/afra

Warning: Work in Progress

Timelines

• Rolled out to:
  • 8 MSc students
  • 20 3rd year students
• Need to improve tutorials/guidance/documentation
• Roll out to 200 first years (few months)
• Expand

Summary

• Ants are cool

• Exciting times & big challenges

• Inspiration from people working with computers more/longer

• SequenceServer - set up custom BLAST servers

• Bionode - modular streams for bioinformatics

• GeneValidator - identifying problems with gene predictions

• Afra - infrastructure to crowdsource gene curation to the masses

Recruiting Genomehacker/Bioinformatics support

GitHub

Thanks!

[email protected] @yannick__

http://yannick.poulet.org

Colleagues & Collaborators @ QMUL & UNIL
• Anurag Priyam @yeban
• Monica Dragan
• Ismail Moghul
• Vivek Rai
• Bruno Vieira @bmpvieira



genome evolution / social evolution

Generally

Single- vs. multiple-queenness
• in fire ants
• in similar independent species
• one or many loci?
• one or many genes?
• convergence?

Social parasitism

Strengths of selection in social evolution

concepts & mechanisms

Medically relevant questions

Candidate gene studies
• Vitellogenin
• Sex determination genes

functional testing....

Tools for genomics work on emerging model organisms

Molecular response to social upheaval
