Challenges of Storage in an Elastic Infrastructure.
May 9, 2014
Farid Yavari, Storage Solutions Architect and Technologist
The eBay Inc. Portfolio
Platform
eBay Marketplaces
PayPal
eBay Enterprise
2014 University of Minnesota Storage Workshop
Size of the Managed Infrastructure
• SAN (FC OLTP environment): ~16PB
• NAS: ~6PB
• Cloud Object Store: ~1EB by 2015
• Analytics environment: ~270PB
• ~130 enterprise SAN/NAS/FLASH storage arrays in 3 DCs
• ~1.4 million peak-hour IOPS in SAN environment
• Thousands of servers with external storage
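For a rough sense of scale, the capacity figures above can be totalled. A minimal sketch, assuming decimal units (1EB = 1000PB) and treating the object store figure as a 2015 projection rather than current capacity; the variable names are illustrative only.

```python
# Capacity figures from the slide, in petabytes (PB).
current_pb = {
    "SAN (FC OLTP)": 16,
    "NAS": 6,
    "Analytics": 270,
}

# Assumption: decimal units, so the projected 1EB object store is 1000PB.
projected_object_store_pb = 1 * 1000

total_current_pb = sum(current_pb.values())
ratio = projected_object_store_pb / total_current_pb

print(f"SAN + NAS + analytics today: ~{total_current_pb}PB")
print(f"Projected object store is ~{ratio:.1f}x that total")
```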
Foundation of an Elastic Infrastructure
• Automated Control Plane
• Resource Pool
  – Compute
  – Memory
  – Storage
  – Low latency, high bandwidth interconnect
• Traffic Management
  – PCI Compliance
  – Security
  – QoS
Definition: An infrastructure that can spawn, destroy, grow, shrink and move processes dynamically and efficiently within and across data centers.
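The five verbs in the definition (spawn, destroy, grow, shrink, move) can be sketched as operations on a shared resource pool. This is an illustrative toy model only: the `ResourcePool` and `Process` classes, their fields, and the site names are hypothetical and do not describe any real control plane.

```python
from dataclasses import dataclass, field


@dataclass
class Process:
    name: str
    cpu: int         # cores reserved
    storage_tb: int  # storage reserved, in TB
    site: str        # data center currently hosting the process


@dataclass
class ResourcePool:
    """Hypothetical per-site pool of free compute and storage."""
    free_cpu: dict
    free_tb: dict
    procs: dict = field(default_factory=dict)

    def _reserve(self, site, cpu, tb):
        # Negative amounts release capacity instead of reserving it.
        if self.free_cpu[site] < cpu or self.free_tb[site] < tb:
            raise RuntimeError(f"insufficient capacity in {site}")
        self.free_cpu[site] -= cpu
        self.free_tb[site] -= tb

    def spawn(self, name, cpu, tb, site):
        self._reserve(site, cpu, tb)
        self.procs[name] = Process(name, cpu, tb, site)

    def destroy(self, name):
        p = self.procs.pop(name)
        self._reserve(p.site, -p.cpu, -p.storage_tb)

    def resize(self, name, cpu, tb):
        """Grow or shrink: only the difference is reserved or released."""
        p = self.procs[name]
        self._reserve(p.site, cpu - p.cpu, tb - p.storage_tb)
        p.cpu, p.storage_tb = cpu, tb

    def move(self, name, site):
        """Move across data centers: reserve at the target, then release the source."""
        p = self.procs[name]
        self._reserve(site, p.cpu, p.storage_tb)
        self._reserve(p.site, -p.cpu, -p.storage_tb)
        p.site = site


pool = ResourcePool(free_cpu={"dc1": 64, "dc2": 64},
                    free_tb={"dc1": 100, "dc2": 100})
pool.spawn("job", cpu=8, tb=10, site="dc1")
pool.resize("job", cpu=16, tb=5)   # grow compute, shrink storage
pool.move("job", "dc2")
pool.destroy("job")
```

Reserving at the target before releasing the source keeps a failed move from leaving the process without resources.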
Key Technologies to Enable an Elastic Infrastructure
• Control Plane
  – Virtualization / Containers / Hardware
  – Orchestration of infrastructure resources
  – Normalization of resources
• Resource Pool
  – High Speed Networking (10GbE, 40GbE, 100GbE, beyond)
    • RDMA enabled (routable layer 3)
    • Lossless flow control
  – Memory Class Storage
  – New Media beyond Vertical NAND
• Traffic
  – Virtual LAN
  – Access Control
Image Credit: OpenStack
Key Initiatives to Enable an Elastic Infrastructure
• Separation of Storage and Compute
  – Hadoop use case
• Software defined storage, software defined network
• Cloud, SLA, OLA based services
  – Standardization
  – Automation
  – Show/Chargeback
  – Self Service
More importantly
• Simplicity
• Simplicity
• Simplicity
• Simplicity Scales
• Simplicity is the ultimate sophistication
Shifting Paradigm of Storage
Tech         Bus       BW                Lat                 Capacity (columns 1 / 2 / 3 / 4)
HDD (LFF)    SAS/SATA  600MB/s, 1.2GB/s  3-12ms              5-6TB / 7-8TB / 10TB / ?
SSD          SAS/SATA  600MB/s, 1.2GB/s  0.3-0.8ms           4-8TB / 8-16TB / 24TB / 36-48TB
Flash        PCIe      2GB/s, 3GB/s      2μs-150μs           1-16TB / 16TB+
Beyond NAND  PCIe      2+GB/s            1μs-40μs, 100's ns  8-24TB / 24TB+
• Sectors/Block/Cells -> unbound (bytes/pages)
• Devices becoming consumables (cell failures vs platter failures, shrinking capacities on failure)
• Endurance is temporary (next 2-8 years).
  – Interim: controllers will shield users from endurance.
  – Long term: media becomes tolerant (no read/program disturb)
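The latency column can be turned into a rough upper bound on serial operations per second, since with one outstanding request an operation completes every `latency` seconds. A sketch under stated assumptions: device latency is the only cost, queue depth is 1, and the SSD range is read as 0.3-0.8ms.

```python
# Latency ranges from the table, in seconds: (best case, worst case).
latency_s = {
    "HDD (LFF)":   (3e-3, 12e-3),
    "SSD":         (0.3e-3, 0.8e-3),  # assumption: SSD range read as 0.3-0.8ms
    "Flash PCIe":  (2e-6, 150e-6),
    "Beyond NAND": (1e-6, 40e-6),
}


def qd1_ops_per_second(latency):
    """Serial operations per second with a single outstanding request."""
    return 1.0 / latency


for tech, (best, worst) in latency_s.items():
    low = qd1_ops_per_second(worst)
    high = qd1_ops_per_second(best)
    print(f"{tech:12s} ~{low:,.0f} to ~{high:,.0f} ops/s at queue depth 1")
```

Real devices serve many requests in parallel, so these are floors on what a single thread sees, not device IOPS ratings.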
What about Block, File, Object and Databases?
• Legacy support will keep them around for a long time (Is COBOL/RPG dead yet?)
• Block is still the basis of storage, until software catches up to media (2-4 years delayed after new tech introduced)
• Object Store is the new Block, File and Database
  – No attachments required (Simplicity)
  – Enforceable Access Control
  – Only difference between Block, File and Database in an object store is the richness of metadata
Maps to new technology better than Block, File and Databases today.
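The "richness of metadata" point can be illustrated with a toy in-memory store in which block-like, file-like and database-like data share the same put/get interface and differ only in the metadata attached. The `ObjectStore` class, its methods, and all keys and metadata fields below are hypothetical, chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class StoredObject:
    key: str
    data: bytes
    metadata: dict = field(default_factory=dict)


class ObjectStore:
    """Hypothetical in-memory object store keyed by name."""

    def __init__(self):
        self._objects = {}

    def put(self, key, data, **metadata):
        self._objects[key] = StoredObject(key, data, metadata)

    def get(self, key):
        return self._objects[key]

    def find(self, **criteria):
        """Return sorted keys whose metadata matches every criterion."""
        return sorted(
            k for k, o in self._objects.items()
            if all(o.metadata.get(f) == v for f, v in criteria.items())
        )


store = ObjectStore()
# "Block": an address-like key, fixed-size payload, no metadata.
store.put("vol1/lba/42", b"\x00" * 512)
# "File": a path-like key plus ownership and type metadata.
store.put("home/alice/notes.txt", b"hello", owner="alice", content_type="text/plain")
# "Database row": rich metadata that can be queried.
store.put("orders/1001", b"{}", table="orders", customer="acme", status="shipped")
store.put("orders/1002", b"{}", table="orders", customer="acme", status="open")

open_orders = store.find(table="orders", status="open")
print(open_orders)
```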
5-10 years: We’re back to the future.
Processes have compute and a work area (memory) which may be persistent.
Processed data is stored in an object store, then handed off to further processing or tiered to archival based on policy and flow control.
Scale out across a data center
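A policy of this kind can be sketched as a small decision function. Everything here is an assumption made for illustration: the `ObjectInfo` fields, the 90-day archival threshold, and the three outcome names are not from the talk.

```python
from dataclasses import dataclass


@dataclass
class ObjectInfo:
    key: str
    days_since_access: int
    needs_processing: bool  # another stage still has work to do on this object


def next_step(obj, archive_after_days=90):
    """Hypothetical policy: hand off first, archive cold data, otherwise keep."""
    if obj.needs_processing:
        return "handoff"
    if obj.days_since_access >= archive_after_days:
        return "archive"
    return "keep"


objects = [
    ObjectInfo("logs/today", days_since_access=0, needs_processing=True),
    ObjectInfo("reports/q1", days_since_access=200, needs_processing=False),
    ObjectInfo("reports/q4", days_since_access=10, needs_processing=False),
]
decisions = {o.key: next_step(o) for o in objects}
print(decisions)
```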
The Challenge
• Near Term (3-5 years)
  – Paradigm shift is starting.
  – Storage arrays are starting to understand flash, and flash is being treated as flash rather than disk emulation (slow evolution)
  – Next step is to model filesystems, kernels and applications to understand flash
    • Failure domains, performance, data placement/movement vs RAID
• Strategic (5+ years)
  – Server model / resource model changing
    • Disaggregation of resources brings new challenges
    • Persistent memory replacing standard memory brings system bring-up and fault remediation challenges
    • Object storage evolution and metadata handling: a long way to go
2016 Infrastructure Goals
• Flash Everywhere: Local, Shared, Cold, Hadoop, Media etc.
• No FC or InfiniBand. iSER/IP, 40Gbps+ Ethernet
• Fully REST API driven storage management and automation
• Hyperscale cloud across all classes of service (QA, Dev, Prod, etc.)
• Extremely high densities, 140TB+ per flash device, 54PB+ racks
• Two tiers of flash: High performance, and Archival flash
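The density goals imply a device count per rack, which can be checked with simple arithmetic. A sketch assuming decimal units (1PB = 1000TB) and taking the "140TB+" and "54PB+" figures at their stated minimums; the 1EB comparison reuses the object store projection from earlier in the deck.

```python
import math

device_tb = 140    # "140TB+ per flash device"
rack_pb = 54       # "54PB+ racks"
TB_PER_PB = 1000   # assumption: decimal units
PB_PER_EB = 1000   # assumption: decimal units

# Devices needed to reach the rack capacity goal.
devices_per_rack = math.ceil(rack_pb * TB_PER_PB / device_tb)

# Racks needed to hold 1EB at that density.
racks_per_eb = math.ceil(1 * PB_PER_EB / rack_pb)

print(f"~{devices_per_rack} devices per rack")
print(f"~{racks_per_eb} racks per EB")
```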