sn wf12 amd fabric server (satheesh nanniyur) oct 12
DESCRIPTION
Big Data has influenced the data center architecture in ways unimagined before. This presentation explores the Fabric Compute and Storage architectures to enable extreme scale-out, low power, high density Big Data deployments.
TRANSCRIPT
Fabric Architecture
A Big Idea for the Big Data infrastructure
Satheesh Nanniyur
Senior Product Line Manager
AMD Data Center Server Solutions (formerly, SeaMicro)
Agenda
• Defining Big Data from an Infrastructure perspective
• Fabric Architecture for Big Data
• An overview of the Fabric Server and Fabric Storage
• Illustrating Fabric Architecture Benefits for Hadoop
• Conclusion
Have you come across Big Data?
Apple’s virtual smartphone assistant, Siri, uses complex machine learning techniques
Target’s “pregnancy prediction score” (NY Times, “How companies learn your secrets”, Feb 2012)
So, what really is Big Data?
Business
• “Key basis of competition and growth…”
Observational
• “Too big, moves too fast, or doesn’t fit the structures of your database”
Mathematical
• “Every day, we create 2.5 ‘million trillion’ (quintillion) bytes of data”
Systems
• “Exceeds the processing capacity of conventional database systems”
The Infrastructural definition of Big Data
• Store “all” data not knowing its use in advance
Massive Storage
• Ask a query, and when you do, get the answer fast
Massive Compute
Big Data infrastructure is not business as usual
The IT architectural approach used in clustered environments such as a large Hadoop grid is radically different from the converged and virtualized IT environments
Massive Storage
• Petabyte scale high density storage
• Flexible storage to compute ratio to meet evolving business needs
Massive Compute
• High density scale-out compute
• Power and space efficient infrastructure
IDC White Paper, “Big Data: What It Is and Why You Should Care”
Fabric Architecture for Big Data
The holy grail of Big Data Infrastructure
Imagine a world where you could simply stack up servers, with each server:
• Takes up a fraction of a rack unit
• Shares over 5 PB of storage
• Has a 10GE network with no cabling
• Gets flexibly provisioned storage
A deeper look at the traditional rack-mount architecture
Nodes
ToR
Aggregation
• Compromise between Compute and Storage density
• Rigid compute to storage ratio
• Oversubscribed network suitable for north-south traffic, not the heavy east-west traffic required for Big Data
• Too many adapters (NIC, storage controller) and cabling that can fail
Cabling and Management
Fabric with 3-D Torus for Big Data Infrastructure
Big Data is a big shift from North-South traffic to East-West
Switchless Linear Scalability that avoids bottlenecks
Highly available network minimizing node loss and data reconstruction
High density scale-out architecture with low power and space
High Speed and Low Latency Interconnection
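As a rough illustration of the 3-D torus idea, the sketch below computes a node's six neighbors and the minimum hop count between two nodes. The 8 x 8 x 8 size and the function names are assumptions chosen for illustration only; they do not describe the actual SeaMicro fabric layout or routing.

```python
from itertools import product

def torus_neighbors(node, dims):
    """Return the six neighbors (X+, X-, Y+, Y-, Z+, Z-) of a node in a 3-D torus."""
    neighbors = []
    for axis in range(3):
        for step in (+1, -1):
            coord = list(node)
            # Wrap around at the edges: this is what turns a mesh into a torus.
            coord[axis] = (coord[axis] + step) % dims[axis]
            neighbors.append(tuple(coord))
    return neighbors

def torus_hops(a, b, dims):
    """Minimum hops between two nodes, going the shorter way around each ring."""
    return sum(min(abs(x - y), d - abs(x - y)) for x, y, d in zip(a, b, dims))

def torus_diameter(dims):
    """Worst-case hop count from the origin to any other node."""
    origin = (0, 0, 0)
    all_nodes = product(*(range(d) for d in dims))
    return max(torus_hops(origin, node, dims) for node in all_nodes)

# Hypothetical fabric size, used only for this example.
DIMS = (8, 8, 8)

print(torus_neighbors((0, 0, 0), DIMS))
print(torus_hops((0, 0, 0), (7, 4, 1), DIMS))
print(torus_diameter(DIMS))
```

Every node has the same number of links regardless of its position, and the wraparound keeps the worst-case path short without a central switch. Because each node has several neighbors, traffic can take another path if a single link fails.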
An overview of the Fabric Server
[Diagram labels: x86 Server, PCIe, SeaMicro Fabric Node with IOVT, and torus links X+, X-, Y+, Y-, Z+, Z-]
• 512 x86 cores with 4TB DRAM in 10RU
• Up to 5 petabytes of storage
• Flexible Storage to Compute ratio
• 10GE network per server
• 160GE of uplink bandwidth
Fabric Storage ... for Big Data?
Isn’t Big Data always deployed with DAS?
Flexible Fabric Storage to Compute Ratio
Rigid Storage to Compute Ratio (Traditional Rackmount)
Storage
Compute
Underutilized Compute & Network
• Add storage capacity independent of compute to increase cluster efficiency
• Flexibly provision storage capacity to meet evolving customer needs
“... the rate of change was killing us, where the data volumes were practically doubling every month. Trying to keep up with that growth was an extreme challenge to say the least ...”
Customer quote from IDC white paper, “Big Data: What It Is and Why You Should Care”
Massive capacity scale-out Fabric Storage
Traditional Rackmount
Freedom Fabric
Captive DAS with Rigid Storage to Compute Ratio
Flexible scale-out Fabric Storage up to 5PB
Intel/AMD x86 servers
• Massive scale-out capacity with commodity drives
• Decoupled from Compute and Network to grow storage independently
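To make the rigid-versus-flexible ratio argument concrete, here is a minimal sizing sketch. The workload figures (1440 TB, 320 cores) and the per-server figures (36 TB, 16 cores) are hypothetical inputs chosen for illustration, and the function names are not from any vendor tool.

```python
import math

def rigid_plan(target_tb, cores_needed, tb_per_server, cores_per_server):
    """Captive DAS: storage and compute can only be bought together, one server at a time."""
    servers_for_storage = math.ceil(target_tb / tb_per_server)
    servers_for_compute = math.ceil(cores_needed / cores_per_server)
    # Whichever resource needs more servers dictates the purchase.
    servers = max(servers_for_storage, servers_for_compute)
    return {
        "servers": servers,
        "surplus_cores": servers * cores_per_server - cores_needed,
        "surplus_tb": servers * tb_per_server - target_tb,
    }

def decoupled_plan(target_tb, cores_needed, cores_per_server):
    """Fabric storage: size compute for the workload, then add capacity separately."""
    servers = math.ceil(cores_needed / cores_per_server)
    return {
        "servers": servers,
        "surplus_cores": servers * cores_per_server - cores_needed,
        "fabric_storage_tb": target_tb,
    }

# Hypothetical workload: storage-heavy, modest compute.
print(rigid_plan(target_tb=1440, cores_needed=320, tb_per_server=36, cores_per_server=16))
print(decoupled_plan(target_tb=1440, cores_needed=320, cores_per_server=16))
```

In this made-up case the rigid plan has to buy 40 servers to reach the capacity target, leaving 320 cores beyond what the workload needs. The decoupled plan sizes compute at 20 servers and adds the storage on its own.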
Hadoop and the SMAQ stack
Data Storage: HDFS
Data Processing: MapReduce Framework
Query: Pig, Hive
Built to scale linearly with massive scale-out storage (HDFS) and compute (MapReduce)
Hadoop data processing phases
Fabric Architecture cost efficiently meets the Hadoop infrastructure needs
HDFS Input: Storage Intensive
Map and Intermediate Data Write: Compute Intensive
Shuffle: Network Intensive
Reduce: Compute Intensive
HDFS Output: Storage Intensive
5 Petabytes of storage capacity with independent scale-out
512 x86 cores with 4TB DRAM per Fabric Server in 10RU
10 Gbps Inter-Node Bandwidth per server
160 Gbps shared uplink for Inter-Rack traffic
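The Map, Shuffle, and Reduce phases can be illustrated with a toy, in-memory word count. This is only a sketch of the programming model and does not use Hadoop's API. The function names are assumptions made for this example, and a real job would read from and write to HDFS across many nodes.

```python
from collections import defaultdict

def map_phase(records):
    """Map: emit a (word, 1) pair for every word in every input record."""
    for record in records:
        for word in record.split():
            yield word.lower(), 1

def shuffle_phase(pairs):
    """Shuffle: group values by key. On a cluster this step moves data between
    nodes, which is why it is the network-intensive phase."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    """Reduce: combine the grouped values for each key into a single result."""
    return {key: sum(values) for key, values in groups.items()}

def word_count(records):
    return reduce_phase(shuffle_phase(map_phase(records)))

# Stand-in for input splits that would normally come from HDFS.
print(word_count(["big data", "Big fabric"]))
```

Each map call touches only its own record and each reduce call only its own key, so both phases spread across many servers. The shuffle between them is the step that generates the east-west traffic.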
Hadoop resource usage pattern
Based on Terasort run on SeaMicro SM15000
[Chart: Compute, Storage, and Network utilization across the Map, Shuffle, and Reduce phases]
Deployment Challenges of Hadoop
• Plan for peak utilization
– Hadoop infrastructure utilization is bursty
• Compute, Storage, and Network mix dependent on application workload
– Flexible ratios optimize deployment
• Power and Space Efficiency key to large scale deployment
• Administrative cost can increase as rapidly as your data
– Simplified deployment and reduced hardware components decrease TCO
Fabric Server for Hadoop Deployment
Fabric Server offers 60% more compute and storage in the same power and space envelope
                                       | Traditional Rackmount | SeaMicro Fabric Server
Intel Xeon Cores                       | 320                   | 512
AMD Opteron Cores*                     | 320                   | 1024
Storage                                | 720 TB                | 1136 TB
Storage Scalability                    | None                  | Up to 4PB
Network B/W per server                 | Up to 2Gbps           | Up to 8Gbps
Network Downlinks                      | 40                    | 0
ToR Switch                             | 2                     | 0 (Built-in)
Aggregation (End of Row) switch/router | 1                     | 1
Based on SeaMicro SM15000 and HP DL380 Gen8 2U dual socket octal core servers in a 42U rack
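The headline percentage can be checked against the comparison figures with a few lines of arithmetic. The helper name `percent_more` is an assumption for this sketch; the numbers are the ones quoted in the comparison.

```python
def percent_more(fabric, traditional):
    """Percentage by which the fabric figure exceeds the traditional one."""
    return (fabric / traditional - 1) * 100

# (traditional rackmount, SeaMicro fabric server), as quoted in the comparison.
rows = {
    "Intel Xeon cores": (320, 512),
    "AMD Opteron cores": (320, 1024),
    "Storage (TB)": (720, 1136),
    "Network B/W per server (Gbps)": (2, 8),
}

for name, (traditional, fabric) in rows.items():
    print(f"{name}: {percent_more(fabric, traditional):.0f}% more")

# 320 cores at 16 cores per dual-socket, eight-core server implies 20 servers,
# which at 2U each fills 40U of the 42U rack.
servers = 320 // 16
print(servers, "servers,", servers * 2, "rack units")
```

The Xeon core count works out to exactly 60% more and storage to roughly 58% more, which is consistent with the "60% more compute and storage" claim. The Opteron core count and per-server network bandwidth differ by larger margins.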
Summary
Traditional architectures cannot scale to meet the needs of Big Data
Efficient Big Data deployments need flexible storage to compute ratio
Conventional wisdom of reduced hardware components still holds
Fabric Servers provide unprecedented density, bandwidth, and scalability for Big Data deployments
For more information, visit http://www.amd.com/seamicro or email [email protected]