Vendor lock-in-free storage at MSU
VENDOR LOCK-IN-FREE STORAGE AT MICHIGAN STATE UNIVERSITY
GREG MASON, INSTITUTE FOR CYBER-ENABLED RESEARCH
WHO AM I
• Sysadmin at MSU for over 6 years
• Couple of years in industry before that doing operations.
• Primary engineer for HPC storage
• On the internet: [email protected], @nodoubleg
WHAT IS HPC?
• High Performance Computing
• Built with fast CPUs, low-latency high-bandwidth networks, fast storage, and batch job schedulers
MSU’S SCALE
• ~7,600 cores
• ~50TB RAM
• ~2PB storage
• ~2,000 software titles installed
MSU’S HPC WORKLOAD
• We serve everybody, from Ag Econ to Zoology
• Tuning anything for a specific workload is futile
• Chemistry
• Bioinformatics
MSU’S HPC STORAGE
• Persistent storage is all ZFS. Reasonably fast, reasonably available, cheap. Always safe*
• High-speed parallel storage is Lustre. Currently based on a modified ext4. Fast, only moderately reliable/safe.
• NetApp filer, to support VMware environment
ZFS AT MSU
• In production since 2009. OpenSolaris then, OpenZFS now.
• Over 1.5PB in production at iCER
• Even using it in odd places
TOPICS
• Benefits of ZFS
• Overview of ZFS
• Platforms with ZFS
• Build a ZFS-based system
• Potential pitfalls
• ZFS alternatives, if you must
• Storage of the future
(some) BENEFITS OF ZFS
• Checksum ALL THE THINGS!!1
• Integrated raid understands the objects it stores
• Copy-on-write transactions are atomic
• Snapshots
• Reduces hardware costs
• Simplified administration:
    zfs set refquota=3T tank/filesystem
    zfs snapshot tank/filesystem@beforeupgrade
    zpool status
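The checksum and integrated-raid bullets can be illustrated with a toy model. This is a minimal sketch, not ZFS code: the `MirroredBlock` class and `checksum` helper are hypothetical names, and SHA-256 stands in for whichever checksum a real pool uses. It shows the idea that a checksum stored apart from the data lets a mirror tell which copy is good and repair the bad one.

```python
import hashlib


def checksum(data: bytes) -> str:
    """SHA-256 digest; stands in for a checksum kept apart from the data it covers."""
    return hashlib.sha256(data).hexdigest()


class MirroredBlock:
    """Toy two-way mirror (hypothetical): two copies of the data plus one checksum."""

    def __init__(self, data: bytes) -> None:
        self.copies = [bytearray(data), bytearray(data)]
        self.expected = checksum(data)

    def read(self) -> bytes:
        """Return the first copy that verifies and rewrite any copy that does not."""
        for candidate in self.copies:
            if checksum(bytes(candidate)) == self.expected:
                good = bytes(candidate)
                # Self-heal: overwrite every copy that fails verification.
                for i, other in enumerate(self.copies):
                    if checksum(bytes(other)) != self.expected:
                        self.copies[i] = bytearray(good)
                return good
        raise IOError("all copies failed checksum verification")


block = MirroredBlock(b"research data")
block.copies[0][0] ^= 0x01             # simulate silent bit rot on the first disk
recovered = block.read()               # served from the intact second copy
repaired = bytes(block.copies[0])      # the damaged copy has been rewritten
```

A raid layer that sits below the filesystem has no checksum to consult, so it cannot tell which side of a mirror is the corrupted one.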
OVERVIEW OF ZFS COMPONENTS
• pool: A collection of devices that provides storage for data managed by ZFS
• vdev: A top-level device in a pool. Can be a plain disk, raid group (raidz), or mirror.
• dataset: A zvol or filesystem
• zvol: A block device presented to the OS
• filesystem: A plain ol’ POSIX filesystem
• snapshot: A copy-on-write reference to a dataset at a point in time. Not just a copy.
• zil/log/slog/logzilla: ZFS Intent Log. Synchronous writes not yet committed to the pool are recorded here. Only read from when recovering from an unclean shutdown. Not a buffer.
• ARC/primarycache: Adaptive Replacement Cache. Some of the smarts behind the performance of ZFS. Not just a dumb page cache. Resides in RAM.
• l2arc/cache/secondarycache: A block-device version of the ARC, commonly an SSD. When objects are evicted from the ARC, they might end up on the l2arc.
• For more info: http://bit.ly/zfsdocs
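The snapshot definition above ("a reference, not just a copy") can be sketched in a few lines. This is a toy model under stated assumptions, not how ZFS is implemented: the `Dataset` class is hypothetical, and a Python dict of block numbers stands in for the on-disk block pointer tree.

```python
from typing import Dict


class Dataset:
    """Toy copy-on-write dataset (hypothetical): block number -> immutable contents."""

    def __init__(self) -> None:
        self.blocks: Dict[int, bytes] = {}
        self.snapshots: Dict[str, Dict[int, bytes]] = {}

    def write(self, blkno: int, data: bytes) -> None:
        # Copy-on-write: the block number is bound to a new bytes object.
        # The old object is never modified in place.
        self.blocks[blkno] = data

    def snapshot(self, name: str) -> None:
        # Copies only the pointer map. Block contents are shared with the
        # live dataset, not duplicated.
        self.snapshots[name] = dict(self.blocks)


ds = Dataset()
ds.write(0, b"version 1")
ds.snapshot("beforeupgrade")
shared = ds.snapshots["beforeupgrade"][0] is ds.blocks[0]   # same object, no data copied
ds.write(0, b"version 2")                                   # live dataset moves on
old = ds.snapshots["beforeupgrade"][0]                      # snapshot still sees version 1
```

Taking the snapshot costs one pointer map, and space is consumed only as the live dataset diverges from it.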
PLATFORMS WITH ZFS
• OpenZFS: Illumos, FreeBSD, Linux, Mac OS X
• Oracle ZFS: ZFS Storage Appliance, Solaris 11
BUILD A ZFS-BASED SYSTEM
• You want trustworthy HBAs, disks, and NICs.
• I use LSI HBAs with the IT firmware.
• My NICs are Mellanox and Intel.
• Hardware spec isn’t scary!
• Illumos HCL: http://illumos.org/hcl/
• FreeBSD & Linux: anything these run on. They tend to have better hardware vendor support.
RECOMMENDED CONFIG
• JBODs: Quanta M4600H, Seagate 84-drive JBOD, Sanmina JBODs, or Supermicro SAS JBODs.
• Servers: any decent 2-socket Intel server with lights-out management. Lots of ECC RAM for the cache.
• Network: at least 10-gig. Investigate 40-gig Ethernet or InfiniBand (IB).
• Be sure the number of disks meets the performance requirement
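The last bullet, matching disk count to the performance requirement, can be turned into a back-of-the-envelope check. This is a rough sketch, not a sizing tool: the `PoolLayout` class is hypothetical, the per-disk IOPS figure is an assumed input, and the "one raidz vdev delivers about one disk's worth of random IOPS" rule is a common approximation rather than a guarantee.

```python
from dataclasses import dataclass


@dataclass
class PoolLayout:
    vdevs: int            # number of top-level raidz vdevs
    disks_per_vdev: int   # disks in each vdev, parity included
    parity: int           # 1, 2, or 3 for raidz1/2/3
    disk_tb: float        # capacity of one disk in TB
    disk_iops: int        # random IOPS one disk sustains (assumed figure)

    def usable_tb(self) -> float:
        """Raw capacity minus parity. Ignores metadata, padding, and free-space headroom."""
        return self.vdevs * (self.disks_per_vdev - self.parity) * self.disk_tb

    def random_iops(self) -> int:
        """Rule of thumb: each raidz vdev delivers roughly one disk's random IOPS."""
        return self.vdevs * self.disk_iops


# Example values are illustrative only.
layout = PoolLayout(vdevs=8, disks_per_vdev=10, parity=2, disk_tb=4.0, disk_iops=100)
capacity = layout.usable_tb()      # 8 * (10 - 2) * 4.0
iops = layout.random_iops()        # 8 * 100
```

Under this approximation, the same 80 disks arranged as more, narrower vdevs trade usable capacity for random IOPS.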
POTENTIAL PITFALLS
• Using cheap SATA hard drives
• Using SAS expanders with SATA drives
• Improperly-sized raid stripes
• Picking the wrong SSDs for acceleration
• Using the wrong disk multipathing strategy/algorithm
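One way a raid stripe can be improperly sized is a raidz width that wastes space for the block sizes in use. The sketch below is an assumption about what the bullet covers, and it mirrors the raidz allocation arithmetic in OpenZFS only approximately: parity is added per row of data sectors, and each allocation is rounded up to a multiple of (parity + 1) sectors. The function names are hypothetical.

```python
def raidz_allocated_sectors(data_sectors: int, width: int, parity: int) -> int:
    """Sectors one block consumes on a raidz vdev, parity and padding included."""
    data_disks = width - parity
    # One set of parity sectors for every row of up to `data_disks` data sectors.
    rows = -(-data_sectors // data_disks)      # ceiling division
    total = data_sectors + parity * rows
    # Allocations are rounded up to a multiple of (parity + 1) sectors.
    unit = parity + 1
    return -(-total // unit) * unit


def efficiency(block_bytes: int, sector_bytes: int, width: int, parity: int) -> float:
    """Fraction of the allocated space that holds user data."""
    data_sectors = -(-block_bytes // sector_bytes)
    return data_sectors / raidz_allocated_sectors(data_sectors, width, parity)


# 128 KiB blocks on 4 KiB sectors, raidz2:
wide = efficiency(128 * 1024, 4096, width=10, parity=2)    # 32 data sectors in 42 allocated
narrow = efficiency(128 * 1024, 4096, width=6, parity=2)   # 32 data sectors in 48 allocated
# An 8 KiB block on the same 10-wide raidz2 fills only part of one row:
small = efficiency(8 * 1024, 4096, width=10, parity=2)     # 2 data sectors in 6 allocated
```

The 10-wide layout looks like 80% data on paper, but padding pulls 128 KiB blocks below that, and small blocks fare far worse.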
ZFS ALTERNATIVES (if you must)
• btrfs
• ReFS
• GPFS
• HAMMER
• Ceph
BTRFS
• Default filesystem for some Linux distros. Only very recently considered stable.
• Features checksums, mirroring, integrated double-parity raid that is still maturing.
• Can shrink the “array” or pool of disks, thanks to reused code from Linux MD raid.
• “mostly works ok” “typically doesn’t corrupt itself” as of kernel 3.10
• As of kernel 4.0, things are looking better-ish
ReFS
• Proprietary, successor to NTFS
• Works with Storage Spaces in Windows
• Supports most NTFS features
• 64-bit checksums are stored separately for metadata. Same for data, when enabled.
• Keeps running even after checksum failures, allowing for online recovery
• Performance is very low when data checksums are enabled
GPFS
• Proprietary parallel filesystem from IBM
• Similar to Lustre on ZFS: parallel filesystem with integrated raid, checksumming, and compression.
• Better raid implementation (declustered raid)
• Excellent policy-driven data movement, and truly global namespaces
HAMMER
• Default filesystem in DragonFly BSD
• All data is CRC-checked. Smaller checksum than ZFS, designed for bit rot detection, not blind data verification.
• Raid is left to other software/devices. A bit flip on a raid array is not easily recoverable.
• Single-file history accessible with the undo command
• Smallest maximum filesystem size of the alternatives, at “only” 1 exabyte
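The checksum-size comparison can be made concrete with the Python standard library. This is only an illustration: it assumes a 32-bit CRC on the HAMMER side and uses SHA-256 as an example of a 256-bit checksum, and the sample data is made up.

```python
import hashlib
import zlib

data = bytearray(b"genome reads, block 42")   # made-up sample payload
crc_before = zlib.crc32(bytes(data))
sha_before = hashlib.sha256(bytes(data)).digest()

data[3] ^= 0x04    # flip one bit, as bit rot would

crc_after = zlib.crc32(bytes(data))
sha_after = hashlib.sha256(bytes(data)).digest()

crc_bits = 32                       # CRC-32 always yields a 32-bit value
sha_bits = len(sha_before) * 8      # 256
crc_caught = crc_before != crc_after
sha_caught = sha_before != sha_after
```

Both catch a single flipped bit, which is the bit-rot case. The wider checksum matters when the stored block may be entirely wrong data, because the chance of an accidental match shrinks with checksum width.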
CEPH
• A data storage system, not filesystem
• Superb object store and block device provider (RADOS)
• Objects are the way of the future
STORAGE OF THE FUTURE?
• ZFS still plays an important role for persistent data storage, and robust POSIX filesystems.
• Vendors are publicly committing to Lustre on ZFS.
• The future is object stores: Ceph, Amazon S3, Microsoft Azure, even objects on Lustre.
• Networks will unify, bringing unified storage with them. InfiniBand and Ethernet will converge.
MORE INFORMATION
• OpenZFS: http://www.open-zfs.org
• me: [email protected], @nodoubleg
• ZFS docs: http://bit.ly/zfsdocs
QUESTIONS?