2013: Trends from the Trenches

Posted on 25-May-2015







Slides from the 2013 "trends talk" as delivered annually at Bio-IT World Boston.


Slide 1: Trends from the Trenches. 2013 Bio-IT World, Boston.

Slide 2: Some less aspirational title slides ...

Slides 3-4: Trends from the Trenches. 2013 Bio-IT World, Boston.

Slide 5: I'm Chris. I'm an infrastructure geek. I work for the BioTeam. www.bioteam.net - Twitter: @chris_dag

Slide 6: BioTeam - who, what, why ...
- Independent consulting shop
- Staffed by scientists forced to learn IT, software & HPC to get our own research done
- 10+ years bridging the gap between science, IT & high-performance computing

Slide 7: If you have not heard me speak ... apologies in advance
- Infamous for speaking very fast and carrying a huge slide deck
- ~70 slides for 25 minutes is about average for me
- Let me mention what happened after my Pharma HPC best-practices talk yesterday ...
- By the time you see this slide I'll be on my ~4th espresso

Slide 8: Why I do this talk every year ...
- BioTeam works for everyone: Pharma, Biotech, EDU, Nonprofit, .Gov, etc.
- We get to see how groups of smart people approach similar problems
- We can speak honestly & objectively about what we see in the real world

Slide 9: Standard Dag Disclaimer - listen to me at your own risk
- I'm not an expert, pundit, visionary or thought leader
- Any career success is entirely due to shamelessly copying what actual smart people do
- I'm biased, burnt-out & cynical; filter my words accordingly

Slide 10: So why are you here? And before 9am!

Slide 11: It's a risky time to be doing Bio-IT.

Slide 12: Big Picture / Meta Issue
- HUGE revolution in the rate at which lab platforms are being redesigned, improved & refreshed
- Example: the CCD sensor upgrade on that confocal microscopy rig just doubled storage requirements
- Example: the 2D ultrasound imager is now a 3D imager
- Example: an Illumina HiSeq upgrade just doubled the rate at which you can acquire genomes, with a massive downstream increase in storage, compute & data-movement needs
- For the above examples, do you think IT was informed in advance?
Slide 13: The central problem: science is progressing way faster than IT can refresh/change
- Instrumentation & protocols are changing FAR FASTER than we can refresh our research-IT & scientific computing infrastructure
- Bench science is changing month-to-month ... while our IT infrastructure only gets refreshed every 2-7 years
- We have to design systems TODAY that can support unknown research requirements & workflows over many years (gulp ...)

Slide 14: The central problem: the easy period is over
- Five years ago we could toss inexpensive storage and servers at the problem, even in a nearby closet or under a lab bench if necessary
- That does not work anymore; real solutions are required

Slide 15: The new normal.

Slide 16: And a related problem ...
- It has never been easier to acquire vast amounts of data cheaply and easily
- The growth rate of data creation/ingest exceeds the rate at which the storage industry is improving disk capacity
- Not just a storage-lifecycle problem: this data *moves* and often needs to be shared among multiple entities and providers ... ideally without punching holes in your firewall or consuming all available internet bandwidth

Slide 17: If you get it wrong ...
- Lost opportunity
- Missing capability
- Frustrated & very vocal scientific staff
- Problems in recruiting, retention, publication & product development

Slide 18: Enough groundwork. Let's talk trends.

Slide 19: Topic: DevOps & Org Charts

Slide 20: The social contract between scientist and IT is changing forever.

Slide 21: You can blame the cloud for this.

Slide 22: DevOps & Scriptable Everything
- On (real) clouds, EVERYTHING has an API
- If it's got an API you can automate and orchestrate it
- Scriptable datacenters are now a very real thing

Slide 23: DevOps & Scriptable Everything
- Incredible innovation in the past few years
- Driven mainly by companies with massive internet fleets to manage ... but the benefits trickle down to us little people
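The "scriptable everything" point on slides 22-23 can be made concrete with a short sketch. The request fields below are hypothetical and belong to no specific cloud vendor's API; the point is only that once infrastructure sits behind an API, a provisioning request becomes plain code that anyone can write.

```python
import json

def provision_request(name, cores, ram_gb, image):
    """Build a server-provisioning request for a hypothetical IaaS API."""
    return {
        "action": "create_instance",
        "name": name,
        "cores": cores,
        "ram_gb": ram_gb,
        "image": image,
    }

# Against a real cloud you would POST this JSON to the provider's
# endpoint; here we only serialize it to show the shape.
payload = json.dumps(provision_request("hpc-node-01", 64, 256, "centos-6"))
print(payload)
```

Wrap a few of these calls in a loop and you have the "scriptable datacenter" the slide describes.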
Slide 24: DevOps will conquer the enterprise
- Over the past few years, cloud automation/orchestration methods have been trickling down into our local infrastructures
- This will have a significant impact on careers, job descriptions and org charts

Slide 25: Scientist/SysAdmin/Programmer 2013: continue to blur the lines between all these roles
- Radical change in how IT is provisioned, delivered, managed & supported (www.opscode.com)
- Technology driver: virtualization & cloud
- Ops driver: configuration mgmt, systems orchestration & infrastructure automation
- SysAdmins & IT staff need to re-skill and retrain to stay relevant

Slide 26: Scientist/SysAdmin/Programmer (continued)
- When everything has an API ... anything can be orchestrated or automated remotely
- And by the way ... the APIs (knobs & buttons) are accessible to all, not just the bearded practitioners sitting in that room next to the datacenter

Slide 27: Scientist/SysAdmin/Programmer (continued)
- IT jobs, roles and responsibilities are going to change significantly
- SysAdmins must learn to program in order to harness automation tools
- Programmers & scientists can now self-provision and control sophisticated IT resources

Slide 28: Scientist/SysAdmin/Programmer - my take on the future ...
- SysAdmins (Windows & Linux) who can't code will have career issues
- Far more control is going into the hands of the research end user
- IT support roles will radically change -- no longer owners or gatekeepers
- IT will own policies, procedures, reference patterns, identity mgmt, security & best practices
- Research will control the what, when and how big

Slide 29: Topic: Facility Observations

Slide 30: Facility 1: Enterprise vs. Shadow IT
- Marked difference in the types of facilities we've been working in
- Discovery Research systems are firmly embedded in the enterprise datacenter ...
- ... moving away from wild-west, unchaperoned locations and mini-facilities

Slide 31: Facility 2: Colo suites for R&D
- Marked increase in the use of commercial colocation facilities for R&D systems
- And they've noticed! Markley Group (One Summer) has a booth; Sabey is on this afternoon's NY Genome panel
- Potential reasons: expensive to build high-density hosting at small scale; easier metro networking to link remote users/sites; direct connect to cloud provider(s); high-speed research nets only a cross-connect away

Slide 32: Facility 3: Some really old stuff ...
- Final facility observation: the average age of the infrastructure we work on seems to be increasing ... very few aggressive 2-year refresh cycles these days
- Potential reasons: recession & consolidation still affecting or deferring major technology upgrades and changes; cloud: local upgrades deferred pending strategic cloud decisions; cloud: economic analysis showing the stark truth that local setups need to be run efficiently and at high utilization in order to justify their existence

Slide 33: Facility 3: Virtualization
- Every HPC environment we've worked on since 2011 has included (or plans to include) a local virtualization environment
- True for big systems: 2K cores / 2 petabytes of disk
- True for small systems: a 96-core CompChem cluster
- Unlikely to change; too many advantages

Slide 34: Facility 3: Virtualization (continued)
- HPC + virtualization solves a lot of problems
- Deals with the valid business/scientific need for researchers to run/own/manage their own servers near the HPC stack
- Solves a ton of research-IT support issues, or at least leaves us a clear boundary line
- Lets us obtain useful cloud features without choking on the endless BS shoveled at us by private-cloud vendors
- Example: server catalogs + self-service provisioning

Slide 35: Topic: Compute

Slide 36: Compute: still feels like a solved problem in 2013
- Compute power is a commodity
- Inexpensive relative to other costs
- Far less vendor differentiation than storage
- Easy to acquire; easy to deploy
Slide 37: Compute: fat nodes are wiping out small and midsized clusters
- This box has 64 CPU cores ... and up to 1TB of RAM
- Fantastic genomics/chemistry system
- A 256GB-RAM version only costs $13,000*
- BioIT homework: go visit the Silicon Mechanics booth and find out the current cost of a box with 1TB of RAM

Slide 38: Possibly the most significant '13 compute trend

Slide 39: Compute: local disk is back - a defensive hedge against Big Data / HDFS
- We've started to see organizations move away from blade servers and 1U pizza-box enclosures for HPC
- The new normal may be 4U enclosures with massive local disk spindles - not occupied, just available
- Why? Hadoop & Big Data: this is a defensive hedge against future HDFS or similar requirements
- Remember the meta problem - science is changing far faster than we can refresh IT; this is a defensive future-proofing play
- Hardcore Hadoop rigs sometimes operate at a 1:1 ratio between core count and disk count

Slide 40: Topic: Network

Slide 41: Network: 10 Gigabit Ethernet still the standard
- ... although not as pervasive as I predicted in prior trends talks
- Non-Cisco options attractive; BioIT homework: listen to the Arista talks and visit their booth
- SDN still more hype than reality in our market; may not see it until the next round of large private-cloud rollouts or new facility construction (if even)

Slide 42: Network: InfiniBand for message passing in decline
- Still see it for comp chem, modeling & structure work; started building such a system last week
- Still see it for parallel and clustered storage
- The decline seems to match the decreasing popularity of MPI for the latest generation of informatics and omics tools
- Hadoop/HDFS seems to favor throughput and bandwidth over latency

Slide 43: Topic: Storage
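Slide 39's 1:1 core-to-spindle figure for hardcore Hadoop rigs implies simple sizing arithmetic when evaluating those 4U enclosures with unoccupied drive bays. A sketch of that back-of-the-envelope calculation; the cluster numbers are hypothetical, not from the talk:

```python
import math

def hdfs_hedge(total_cores, bays_per_chassis):
    """Disks and 4U chassis needed if HDFS arrives and the cluster is
    built out at the 1:1 core-to-spindle ratio quoted on the slide."""
    disks = total_cores  # 1:1 core:disk ratio
    chassis = math.ceil(disks / bays_per_chassis)
    return disks, chassis

# Hypothetical: a 256-core cluster, 36 drive bays per 4U enclosure.
print(hdfs_hedge(256, 36))  # → (256, 8)
```

The point of the hedge is that the bays exist on day one even if the disks are bought later.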
Slide 44: Storage
- Still the biggest expense, biggest headache and scariest systems to design in modern life-science informatics environments
- Most of my slides for last year's trends talk focused on storage & data-lifecycle issues; check http://slideshare.net/chrisdag/ if you want to see what I've said in the past
- Dag accuracy check: it was great yesterday to see DataDirect talking about the KVM hypervisor running on their storage shelves! I'm convinced more and more apps will run directly on storage in the future
- ... not doing that this year: the core problems and common approaches are largely unchanged and don't need to be restated

Slide 45: It's 2013; we know what questions to ask of our storage.

Slide 46: NGS new data generation: 6-month window. Data like this lets us make realistic capacity-planning and purchase decisions.

Slide 47: Storage 2013 advice
- Stay on top of the compute-nodes-with-many-disks trend
- HDFS, if suddenly required by your scientists, can be painful to deploy in a standard scale-out NAS environment

Slide 48: Storage 2013: object storage is getting interesting

Slide 49: Storage 2013: object storage + commodity disk pods
- Object storage is far more approachable ... we used to see it in proprietary solutions for specific niche needs; it is potentially on its way to the mainstream now
- Why? Benefits are compelling across a wide variety of interesting use cases
- Amazon S3 showed what a globe-spanning, general-purpose object store could do; this is starting to convince developers & ISVs to modify their software to support it
- www.swiftstack.com and others are making local object stores easy, inexpensive and approachable on commodity gear
- Most of your Tier-1 storage and server vendors have a fully supported object-store stack they can sell to you (or simply enable in a product you already have deployed in-house)

Slide 50: Remember this disruptive technology example from last year?

Slide 51: 100 terabytes for $12,000 (more info: http://biote.am/8p )
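Slide 46's point is that a measured ingest window turns capacity planning from guesswork into arithmetic. A minimal sketch of that projection; the ingest rate and growth figures below are invented for illustration, not data from the talk:

```python
def projected_storage_tb(tb_per_month, months, annual_growth=0.0):
    """Project total capacity needed from a measured monthly ingest
    rate, compounding an optional annual growth factor (instrument
    upgrades keep raising the rate, per the talk's meta problem)."""
    monthly_growth = (1 + annual_growth) ** (1 / 12) - 1
    total, rate = 0.0, tb_per_month
    for _ in range(months):
        total += rate
        rate *= 1 + monthly_growth
    return total

# Hypothetical: 10 TB/month measured over the window, a 24-month
# purchase horizon, ingest rate itself growing 50% per year.
print(round(projected_storage_tb(10.0, 24, annual_growth=0.5)), "TB")
```

Even this crude model makes the slide's case: a flat-rate projection badly underestimates need once the growth term is compounded in.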
Slide 52: Storage 2013
- There are MANY reasons why you should not build that $12K Backblaze pod ... done wrong, you will potentially inconvenience researchers, lose critical scientific information and (probably) lose your job
- Inexpensive or open-source object-storage software makes the ultra-cheap storage-pod concept viable

Slide 53: Storage 2013
- A single unit like this is risky and should only be used for well-known and scoped use cases; the risks generally outweigh the disruptive price advantage
- However ... what if you had 3+ of these units running an object-store stack with automatic triple-location replication, recovery and self-healing? Then things get interesting
- This is one of the lab projects I hope to work on in '13

Slide 54: Storage 2013 - caveat/warning
- The 2013 editions of Backblaze-like enclosures mitigate many of the earlier availability, operational and reliability concerns
- Still an aggressive play that carries risk in exchange for a disruptive price point
- There is a middle ground: lots of action in the ZFS space with safer & more mainstream enclosures
- BioIT homework: visit the Silicon Mechanics booth and check out what they are doing with Nexenta's Open Storage stuff

Slide 55: Topic: Cloud

Slide 56: Can you do a Bio-IT talk without using the C word?

Slide 57: Cloud 2013: our core advice remains the same; what's changed

Slide 58: Cloud 2013 core advice
- Research organizations need a cloud strategy today; those that don't will be bypassed by frustrated users
- IaaS cloud services are only a departmental credit card away ... and some senior scientists are too big to be fired for violating IT policy

Slide 59: Cloud advice - design patterns. You actually need three tested cloud design patterns:
- (1) To handle legacy scientific apps & workflows
- (2) The special stuff that is worth re-architecting
- (3) Hadoop & big-data analytics

Slide 60: Cloud advice - legacy HPC on the cloud
- MIT StarCluster: http://web.mit.edu/star/cluster/
- This is your baseline; extend as needed
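For the StarCluster baseline recommended on slide 60, a cluster is defined in an INI config file and launched with a single command. The fragment below follows StarCluster's documented config layout of that era; every value shown is a placeholder to replace with your own credentials, key and AMI:

```ini
[aws info]
AWS_ACCESS_KEY_ID = your-access-key
AWS_SECRET_ACCESS_KEY = your-secret-key
AWS_USER_ID = your-account-id

[key mykey]
KEY_LOCATION = ~/.ssh/mykey.rsa

[cluster smallcluster]
KEYNAME = mykey
CLUSTER_SIZE = 4
CLUSTER_USER = sgeadmin
NODE_IMAGE_ID = ami-xxxxxxxx
NODE_INSTANCE_TYPE = m1.large
```

With the config in place, `starcluster start smallcluster` brings the cluster up and `starcluster terminate smallcluster` tears it down - which is the "extend as needed" baseline the slide refers to.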
Slide 61: Cloud advice - cloudy HPC
- Some of our research workflows are important enough to be rewritten for the cloud and the advantages that a truly elastic & API-driven infrastructure can deliver
- This is where you have the most freedom; many published best practices you can borrow
- Warning: cloud-vendor lock-in potential is strongest here

Slide 62: Hadoop & Big Data - what you need to know
- Hadoop and Big Data are now general terms; you need to drill down to find out what people actually mean
- We are still in the period where senior leadership may demand Hadoop or Big Data capability without any actual business or scientific need

Slide 63: Hadoop & Big Data - what you need to know. In broad terms you can break Big Data down into two very basic use cases:
- 1. Compute: Hadoop can be used as a very powerful platform for the analysis of very large data sets. The Google search term here is "map reduce"
- 2. Data stores: Hadoop is driving the development of very sophisticated NoSQL, non-relational databases and data-query engines. The Google search terms include "nosql", "couchdb", "hive", "pig", "mongodb", etc.
- Your job is to figure out which type applies for the groups requesting Hadoop or Big Data capability

Slide 64: Cloud 2013 - what has changed ... let's revisit some of my bile from prior years:
- ... private clouds: still utter crap
- ... some AWS competitors are delusional pretenders
- ... AWS has a multi-year lead on the competition

Slide 65: Private Clouds in 2013
- I'm no longer dismissing them as utter crap
- Usable & useful in certain situations
- BioTeam positi...
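Slide 63's "map reduce" search term is easier to grasp with a toy word count in plain Python. This shows the conceptual model only - map emits key/value pairs, a shuffle groups them by key, reduce aggregates each group - not Hadoop's actual Java API:

```python
from collections import defaultdict

def map_phase(lines):
    """Map: emit a (word, 1) pair for every word in the input."""
    for line in lines:
        for word in line.split():
            yield (word.lower(), 1)

def shuffle(pairs):
    """Shuffle: group emitted values by key, as Hadoop does between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    """Reduce: sum the grouped counts for each word."""
    return {word: sum(counts) for word, counts in groups.items()}

lines = ["big data big compute", "big data"]
print(reduce_phase(shuffle(map_phase(lines))))
# → {'big': 3, 'data': 2, 'compute': 1}
```

Hadoop's value is running exactly this pattern across thousands of disks and cores at once, with the shuffle happening over the network.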