Presentation from ZFS Day, 2 October 2012 (http://zfsday.com)
Disposable Environments at Scale
or: How I Learned to Stop Worrying and Love ZFS
Eric Sproul
Build Engineer
Sysadmin
Consultant
Twitter: @eirescot
Recipe for Success
enablement (n): the act of providing (someone) with adequate power, means, opportunity or authority (to do something)
This is a story of how ZFS enabled business success
Background
Etsy is the world's handmade marketplace
Experiencing rapid growth
Every department needed to understand how the business was evolving
Asked OmniTI for help
The Situation
No data warehouse
Analytical queries against PostgreSQL OLTP
Large initial size (~250 GB)
Forecast to reach 1 TB+ within a year
Problems
Long-running queries to OLTP destroy web performance
Re-running reports produces different results
Inflexible reporting interface
Requirements
Relieve pressure from OLTP database
Make production data continuously available
Enable correlation of other sources
Flexible reporting UI
Solution
Create separate BI analytics DB
Build it on ZFS
Initial Capabilities
ETL to collate table-level data from multiple databases
Run deep analytic queries without impacting website
New web UI enables ad-hoc reporting
Reaping the Benefits of ZFS
Snapshots
Faster backups
Simple replica creation
Reaping the Benefits of ZFS
Compression
Extend usable life of storage:
PgSQL logical: 1.3T
On-disk: 653G (2.0x)
Intelligent resilver
Shorter rebuilds == reduced risk of data loss
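The 2.0x compression figure above follows from the two sizes on the slide. A minimal sketch of the arithmetic, assuming the sizes are binary units so that 1.3T is 1331.2G; the `ratio` helper is hypothetical and only for illustration:

```shell
#!/bin/sh
# Ratio of logical database size to on-disk size (both in the same unit).
ratio() {
    awk -v logical="$1" -v physical="$2" \
        'BEGIN { printf "%.1fx\n", logical / physical }'
}

# 1.3T expressed in G (1.3 * 1024 = 1331.2) against 653G on disk.
ratio 1331.2 653
```

On a live system, `zfs get compressratio <dataset>` reports the ratio that ZFS itself tracks for a dataset.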
We Want More!
Monthly reports now on BI system
Occasional problems require re-running reports
Still get different results, same as before
We Want More!
Need to test report changes
Fine to dev with small mock-up
Staging requires something that looks like production
The Next Level
Disposable environments
Run on slave replica
R/W copy of BI data
Discard when finished
Disposable Environment
Use a non-global zone
set zonepath=/zones/bistage
set autoboot=true
set limitpriv=default,dtrace_proc,dtrace_user
set ip-type=shared
add net
set address=10.1.2.3
set physical=bnx0
end
add dataset
set name=bi01tank/stage
end
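One way the zone configuration above might be applied is sketched below. The zone name `bistage` (guessed from the zonepath) and the command file `bistage.cfg` are assumptions, not from the slides. For a brand-new zone the command file would also need to begin with `create`, and the delegated dataset bi01tank/stage presumably has to exist beforehand. The script only prints the commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Dry-run by default: commands are printed, not executed.
# Set DRY_RUN=0 to actually run them (requires zones support).
DRY_RUN="${DRY_RUN:-1}"
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

ZONE="bistage"      # assumed zone name, taken from the zonepath
CFG="bistage.cfg"   # hypothetical file holding the zonecfg commands

setup_zone() {
    run zonecfg -z "$ZONE" -f "$CFG"   # load the configuration
    run zoneadm -z "$ZONE" install     # install the zone under its zonepath
    run zoneadm -z "$ZONE" boot        # boot it
}

setup_zone
```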
Disposable Environment
Starting state: ZFS datasets
bi01tank/pgsql/data
bi01tank/pgsql/data/91
bi01tank/pgsql/wal_archive
bi01tank/pgsql/wal_archive/91
Disposable Environment
Take snapshots
bi01tank/pgsql/data@stage
bi01tank/pgsql/data/91@stage
bi01tank/pgsql/wal_archive@stage
bi01tank/pgsql/wal_archive/91@stage
zfs snapshot -r bi01tank/pgsql/data@stage
zfs snapshot -r bi01tank/pgsql/wal_archive@stage
Disposable Environment
Create clones
bi01tank/pgsql/data@stage
bi01tank/pgsql/data/91@stage
bi01tank/pgsql/wal_archive@stage
bi01tank/pgsql/wal_archive/91@stage

bi01tank/stage/data
bi01tank/stage/data/91
bi01tank/stage/wal_archive
bi01tank/stage/wal_archive/91
zfs clone <src_dataset>@stage <dst_dataset>
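The snapshot and clone steps for all four datasets could be scripted roughly as follows. This is a sketch under assumptions: the `run` and `make_stage` helpers and the dry-run switch are illustrative and not part of the talk. The dataset names come from the slides. Nothing is executed unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Dry-run by default: commands are printed, not executed.
# Set DRY_RUN=0 to actually run them (requires ZFS and the bi01tank pool).
DRY_RUN="${DRY_RUN:-1}"
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

POOL="bi01tank"
SNAP="stage"

make_stage() {
    # A recursive snapshot covers each parent dataset and its 91 child.
    for parent in data wal_archive; do
        run zfs snapshot -r "$POOL/pgsql/$parent@$SNAP"
    done
    # Clone parents before children so each destination's parent exists.
    for ds in data data/91 wal_archive wal_archive/91; do
        run zfs clone "$POOL/pgsql/$ds@$SNAP" "$POOL/stage/$ds"
    done
}

make_stage
```

Cloning each dataset separately matters because `zfs clone` works on one snapshot at a time and does not descend into child datasets.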
Disposable Environment
Zone now sees a full, writable copy of data
Unchanged data is referenced to origin
Changes accumulate to clone
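To see this accounting on a running system, the clones' `origin`, `used` and `referenced` properties can be listed. A minimal sketch, assuming the clones live under bi01tank/stage as in the slides; the `run` and `inspect_stage` helpers are illustrative. It prints the command unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Dry-run by default: the command is printed, not executed.
DRY_RUN="${DRY_RUN:-1}"
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

POOL="bi01tank"

inspect_stage() {
    # origin: the snapshot each clone was created from
    # used: space charged to the clone itself (its accumulated changes)
    # referenced: all data reachable from the clone, shared blocks included
    run zfs list -r -o name,origin,used,referenced "$POOL/stage"
}

inspect_stage
```

A freshly created clone would be expected to show a small `used` value next to a large `referenced` value, with `used` growing as the staging zone writes data.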
Disposable Environment
[Diagram: live filesystem pgsql/data/91, snapshot pgsql/data/91@stage, clone stage/data/91]
Changes are accounted to the clone
Unchanged data is referenced from the snapshot
Next-Level Results
Any report can be re-run on the same data
Massage existing data or bring more in for ad-hoc report
Test changes to reports and web UI
Next-Level Results
When finished with the environment:
Shut down zone
Delete clone & origin snap
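The teardown steps above might look like the sketch below. The zone name `bistage` and the helper functions are assumptions for illustration. `zoneadm halt` stops the zone abruptly, so a clean shutdown from inside the zone may be preferable where PostgreSQL is still running. Commands are only printed unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Dry-run by default: commands are printed, not executed.
# Set DRY_RUN=0 to actually run them. These commands destroy data.
DRY_RUN="${DRY_RUN:-1}"
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

ZONE="bistage"   # assumed zone name, taken from the zonepath
POOL="bi01tank"
SNAP="stage"

teardown_stage() {
    run zoneadm -z "$ZONE" halt
    for parent in data wal_archive; do
        # Clones go first: a snapshot cannot be destroyed while clones depend on it.
        run zfs destroy -r "$POOL/stage/$parent"
        # Then the origin snapshots on the parent and its 91 child.
        run zfs destroy -r "$POOL/pgsql/$parent@$SNAP"
    done
}

teardown_stage
```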
Return on Investment
BI database runs on 2 machines
OLTP database lifetime extended two years past expectation
Faster, more granular, and ad-hoc reporting enables better decisions by management
Bonus!
With the same technique, we can safely test:
PostgreSQL upgrades
Schema changes
Thank You
ZFS, Zones and much more are available to the community via
illumos and its distributions
Go forth and enable your business!