dig data and ucs
TRANSCRIPT
7/22/2019 dig data and UCS
http://slidepdf.com/reader/full/dig-data-and-ucs 1/50
Big Data and Cisco Unified Computing System (UCS)
BRKAPP-2033
Kapil Bakshi – Technical Solution Architect
Sean McKeown - Technical Solution Architect
Raghunath Nambiar – Distinguished Engineer
© 2013 Cisco and/or its affiliates. All rights reserved. BRKAPP-2033 Cisco Public
Agenda
Big Data Concepts and Overview
– Enterprise data management and big data
– Problems, Opportunities and Use case examples
– Hadoop, NOSQL and MPP Architecture concepts
Cisco UCS for Big Data
– Building a big data cluster with the UCS Common Platform Architecture (CPA)
– UCS Networking, Management, and Scaling for big data
Q&A
Big Data Concepts and Overview
– Enterprise data management and big data
– Problems, Opportunities and Use case examples
– Hadoop, NOSQL and MPP Architecture concepts
What is Big Data?
For our purposes, big data refers to distributed computing architectures specifically aimed at the “3 V’s” of data: Volume, Velocity, and Variety
Traditional Enterprise Data Management
[Diagram: several Operational (OLTP) systems feed ETL, which loads the EDW]
Operational (OLTP): Online Transactional Processing
ETL: Extract, Transform, and Load (batch processing)
EDW: Enterprise Data Warehouse
Traditional Business Intelligence Questions
Transactional Data (e.g. OLTP)
Real-time, but limited reporting/analytics
• What are the top 5 most active stocks traded in the last hour?
• How many new purchase orders have we received since noon?
Enterprise Data Warehouse
High value, structured, indexed, cleansed
• How many more hurricane windows are sold in Gulf-area stores during hurricane season vs. the rest of the year?
• What were the top 10 most frequently back-ordered products over the past year?
So what has changed?
The Explosion of Unstructured Data
• More than 90% is unstructured data
• Approx. 500 quadrillion files
• Quantity doubles every 2 years
• Most unstructured data is neither stored nor analyzed!
1.8 trillion gigabytes of data was created in 2011…
[Chart: GB of data (in billions), 0 to 10,000, for 2005, 2010, and 2015; series: structured data, unstructured data]
Source: Cloudera
Enterprise Data Management with Big Data
[Diagram labels: Machine; Web; Operational (OLTP) (three sources); ETL; BI/…; Da…; In-memory analytics; Big Data (Hadoop, etc.); MPP EDW]
Traditional Business Intelligence Questions
Transactional Data (e.g. OLTP)
Fast data, real-time
• What are the top 5 most active stocks traded in the last hour?
• How many new purchase orders have we received since noon?
Enterprise Data Warehouse
High value, structured, indexed, cleansed
• How many more hurricane windows are sold in Gulf-area stores during hurricane season vs. the rest of the year?
• What were the top 10 most frequently back-ordered products over the past year?
Big Data
Lower value, sem… multi-source, ra…
• Which produc… customers clic… most and/or s… most time bro… without buying…
• How do we op… set pricing for… product in eac… for individual customers ev…
• Did the recen… marketing lau… generate the e… online buzz, a… that translate…
Example: Web and Location Analytics
iPhone searches Amazon for Vizio TV’s in Electronics
1336083635.130 10.8.8.158 TCP_MISS/200 8400 GET http://www.amazon.com/gp/aw/s/ref=is_box_?k=Visio+tv… "Moz… CPU iPhone OS 5_0_1 like Mac OS X) AppleWebKit/534.46 (KHT… Version/5.1 Mobile/9A405 Safari/7534.48.3"
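A log line like the one above becomes useful for analytics only once it is split into typed fields. The following is a minimal sketch, assuming the visible field order (epoch timestamp, client IP, cache result/HTTP status, bytes, method, URL); the `ProxyLogEntry` class and `parse_proxy_log_line` function are hypothetical names for illustration and are not part of the original deck.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class ProxyLogEntry:
    """Typed view of one proxy log line (field names are assumptions)."""
    timestamp: datetime
    client_ip: str
    cache_result: str
    http_status: int
    size_bytes: int
    method: str
    url: str


def parse_proxy_log_line(line: str) -> ProxyLogEntry:
    """Split one space-delimited proxy log line into typed fields."""
    # Only the first six fields are used; anything after the URL
    # (such as the quoted user agent) is ignored in this sketch.
    parts = line.split(maxsplit=6)
    if len(parts) < 6:
        raise ValueError("expected at least six space-separated fields")
    epoch, client_ip, result, size, method, url = parts[:6]
    cache_result, status = result.split("/", 1)
    return ProxyLogEntry(
        timestamp=datetime.fromtimestamp(float(epoch), tz=timezone.utc),
        client_ip=client_ip,
        cache_result=cache_result,
        http_status=int(status),
        size_bytes=int(size),
        method=method,
        url=url,
    )
```

In a Hadoop job, a function of this kind would typically run inside the map step, so that the schema is applied when the raw log is read rather than when it is stored.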
Big Data and Key Infrastructure Attributes
(What big data isn’t)
Usually not blade servers (not enough local storage)
Usually not virtualized (hypervisor only adds overhead)
Usually not highly oversubscribed (significant east-west traffic)
Usually not SAN/NAS
Move the compute to the storage
Low-cost, DAS-based, scale-out clustered filesystem
Cost, Performance, and Capacity
Enterprise
Database
Massive Scale-Out
Column Store
Hadoop
NoSQL
Structured Data: Relational Database
Un… Ma… Stre… Sat… Sen… Blo…
$20K/TB
$10K/TB
$300-$1K/TB
HW:SW $ split 70:30
HW:SW $ split 30:70
Big Data Software Architectures
Three basic categories of big data architectures

MPP Rela… Columnar D…
“Scale-out…
• Structured Data
• Optimized for DW… OLTP (ACID-com…
• Data stored via fre… access columns r… rows for faster ret…
• Rigid schema app… on insert/update
• Read and write (in… many times
• Somewhat limited…
• Queries often invo… subset of data set
• TB to low PB size

Batch-oriented Hadoop
Heavy lifting, processing
• Unstructured Data – emails, syslogs, clickstream data
• Optimized for large streaming reads of large blocks (128-256 MB) comprising large files
• Dynamic schema effectively applied on read
• Optimized to compute data locally at cost of normalization
• Write once, read many
• Linear scaling to thousands of nodes and tens of PB
• Entire data set at play for a given query

Real-time NoSQL
Fast store/retrieve
• Unstructured Data – tweets, sensor data, clickstream
• Data typically stored and retrieved as key-value pairs in flexible column families
• High transaction rates, many reads and writes, small block/chunk sizes (1K-1MB)
• Less well-suited for ad-hoc analysis than Hadoop
• TB to PB scale
Three basic big data software architectures

MPP Relational Database
Scale-out BI/DW
• Greenplum DB (Pivotal DB)*
• ParAccel*
• Vertica
• Netezza
• Teradata

Batch-oriented Hadoop
Heavy lifting, processing
• Cloudera*
• MapR*
• Intel Hadoop*
• Pivotal HD*

Real-time NoSQL
Fast key-value store/retrieve
• HBase (part of Apache Hadoop)*
• DataStax (Cassandra)*
• Oracle NoSQL*
• Amazon Dynamo

*Cisco Partner
What Is Hadoop?
Hadoop is a distributed, fault-tolerant framework for storing and analyzing data.
Its two primary components are the Hadoop Filesystem (HDFS) and the MapReduce application engine.
Hadoop Components and Operations
Hadoop Distributed File System
Scalable & Fault Tolerant
Filesystem is distributed, stored across all data nodes in the cluster
Files are divided into multiple large blocks – 64MB default, typically 128MB – 512MB
Data is stored reliably. Each block is replicated 3 times by default
Types of Node Functions
– Name Node – Manages HDFS
– Job Tracker – Manages MapReduce Jobs
– Data Node/Task Tracker – stores blocks/does work
[Diagram: a file split into Block 1 through Block 6, spread across data nodes grouped under ToR FEX/switches (Datanode 1-5, Datanode 6-10, and a third group), with the Name Node (NN) and Job Tracker]
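The block size and replication factor above translate directly into block counts and raw disk consumption. The following is a minimal arithmetic sketch; `hdfs_footprint` is a hypothetical helper written for illustration, not a Hadoop API, and the 128 MB block size and 3x replication defaults are taken from the figures on this slide.

```python
import math


def hdfs_footprint(file_size_mb: float, block_size_mb: int = 128,
                   replication: int = 3) -> tuple[int, int, float]:
    """Return (blocks, block replicas, raw MB consumed) for one file."""
    # A file is cut into fixed-size blocks; the last block may be partial.
    blocks = math.ceil(file_size_mb / block_size_mb)
    # Every block is stored `replication` times on different data nodes.
    replicas = blocks * replication
    # Partial blocks are not padded, so raw usage scales with file size.
    raw_mb = file_size_mb * replication
    return blocks, replicas, raw_mb


# A 1 GB (1024 MB) file with 128 MB blocks and 3x replication.
blocks, replicas, raw_mb = hdfs_footprint(1024)
```

With these inputs the file occupies 8 blocks, 24 block replicas, and 3072 MB of raw disk; dropping to the 64 MB default doubles the block count without changing the raw footprint.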
HDFS Architecture
[Diagram: Datanode 1-15 in three groups of five, each group behind a ToR FEX/switch, connected through a switch to the Name Node. The Name Node holds file-to-block metadata (/usr/sean/foo.txt: blk_1, blk_…; /usr/jacob/bar.txt: blk_3, blk…) and block-to-node metadata (Data node 1: blk_1; Data node 2: blk_2, blk_3; Data node 3: blk_3). Replicas of blocks 1-4 are shown placed across the data nodes.]
MapReduce Example: Word Count
Stages: Input → Map → Shuffle & Sort → Reduce
Input splits:
  the quick brown fox
  the fox ate the mouse
  how now brown cow
Map output:
  the, 1  brown, 1  fox, 1  quick, 1
  the, 1  fox, 1  the, 1  ate, 1  mouse, 1
  how, 1  now, 1  brown, 1  cow, 1
Reduce output keys: bro…, fo…, ho…, no…, th…, at…, co…, mo…, qu…
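The four stages of the word-count example can be sketched in a few lines of plain Python. This is an in-memory illustration of the data flow only, not Hadoop's MapReduce API; the function names `map_phase`, `shuffle_and_sort`, and `reduce_phase` are chosen here for illustration, and the input splits are the three phrases from the slide.

```python
from collections import defaultdict


def map_phase(split: str) -> list[tuple[str, int]]:
    """Emit a (word, 1) pair for every word in one input split."""
    return [(word, 1) for word in split.split()]


def shuffle_and_sort(mapped) -> dict[str, list[int]]:
    """Group emitted values by key and order the keys."""
    groups = defaultdict(list)
    for pairs in mapped:
        for word, count in pairs:
            groups[word].append(count)
    return dict(sorted(groups.items()))


def reduce_phase(groups: dict[str, list[int]]) -> dict[str, int]:
    """Sum the grouped values for each key."""
    return {word: sum(counts) for word, counts in groups.items()}


splits = ["the quick brown fox", "the fox ate the mouse", "how now brown cow"]
counts = reduce_phase(shuffle_and_sort(map_phase(s) for s in splits))
```

In a real cluster each split is mapped on the node that holds its block, and the shuffle moves the grouped pairs across the network to the reducers, which is why the shuffle stage dominates east-west traffic.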
MapReduce Architecture
[Diagram: TaskTracker 1-15 in three groups of five, each group behind a ToR FEX/switch, connected through a switch to the Job Tracker. Job Tracker assignments shown: Job1: TT1: Mapper1, Mapper…; Job1: TT4: Mapper3, Reduce…; Job2: TT6: Reducer2; Job2: TT7: Mapper1, Mapper…. Task markers on the nodes: M1, M2, R1, M3, M1, M3, R2, M2.]
Design Considerations for MPP Database
MPP Hadoop NOSQL
Design Considerations
• Scale-out with Shared nothing
• Data Redundancy with Local RAID5
Configuration Considerations
(1) High Compute (Fastest CPU)
(2) High IO Bandwidth (15K RPM HDD)
    Flash/SSD and In-memory
(3) Moderate Capacity
Design Considerations for NoSQL Database
MPP Hadoop NOSQL
Design Considerations
• Scale-out with Shared-Nothing
• Data Redundancy Options
  • Key-Value: JBOD + 3-Way Replication
  • Document-Store: RAID or Replication
Configuration Considerations
(1) Moderate Compute
(2) Balanced IOPS (Performance vs. Cost)
    10K RPM HDD, 15K RPM HDD
    SSDs, PCI Flash
(3) Moderate to High Capacity
Cisco UCS and Big Data
Building a big data cluster with the UCS Common Platform Architecture (CPA)
CPA Networking
CPA Sizing and Scaling
“Life is unfair, and the unfairness is distributed un…
- R…
The evolution of big data deployments
Experimental use of Big Data
Deployed into IT Ops mandated infrastructures
“Skunk works”
Small to medium clusters
App team mandated
Purpose built for Big…
Big Data has establis… value
Performance matters
Large or small cluste…
[Diagram labels: IT Infrastructure; Big Data; VMware; WEB; SAP; Generic IT servers]
General Purpose IT Data Center
Dedicated “Pod” for B…
Hadoop Hardware Evolving in the Enterprise
Typical 2009 Hadoop node
• 1RU server
• 4 x 1TB 3.5” spindles
• 2 x 4-core CPU
• 1 x GE
• 24 GB RAM
• Single PSU
• Running Apache
• $

Economics favor “fat” nodes
• 6x-9x more data/node
• 3x-6x more IOPS/node
• Saturated gigabit, 10GE on the rise
• Fewer total nodes lowers licensing/support costs
• Increased significance of node and switch failure

Typic… Hado…
• 2RU serve…
• 12 x 3TB 3… x 1TB 2.5”
• 2 x 8-core C…
• 1-2 x 10GE
• 128 GB RA…
• Dual PSU
• Running commercia… distribution
• $$$
Cisco UCS Common Platform Architecture (CPA)
Building Blocks for Big Data
UCS 6200 Series Fabric Interconnects
Nexus 2232 Fabric Extenders
UCS Manager
UCS 240 M3 Servers
LAN, SAN, Management
CPA Network Design for Big Data
Cisco UCS: Physical Architecture
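[Diagram: two 6200 fabric switches (Fabric A and Fabric B) forming a cluster, with uplink ports to SAN A, SAN…, ETH 1, and ETH 2, an out-of-band management (MGMT) connection, and server ports down to fabric extenders FEX A and FEX B. Chassis 1 holds B200 compute blades (half / full width) with CNAs (virtualized adapters); rack-mount C2… servers with CNAs connect through FEX A and FEX B.]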
CPA: Topology
Single wire for data and management
2 x 10GE links per server for all traffic, data and management
CPA Recommended FEX Connectivity
2 FEX’s and 2 FI’s
• 2232 FEX has 4 buffer groups: ports 1-8, 9-16, 17-24, 25-32
• Distribute servers across port groups to maximize buffer performance and predictably distribute static pinning on uplinks
Cisco Virtual Interface Card (VIC)
Converged Network Adapter
FCoE in hardware
Bare metal and VM deployments
Virtualize in hardware
PCIe compliant
vNIC Fabric Failover
Up to 256 distinct PCIe devices
Ethernet vNIC and FC vHBA QoS
8 queues
vNIC bandwidth guarantees
[Diagram labels: PCIe; 10GbE/…; User Definable vNICs; Eth 0; FC 1, 2]
UCS Fabric Failover
• Fabric provides NIC failover capabilities chosen when defining a service profile
• Traditionally done using NIC bonding driver in the OS
• Provides failover for both unicast and multicast traffic
• Works for any OS on bare metal
• (Also works for any hypervisor-based servers)
[Diagram labels: OS / Hypervisor / VM; vNIC 1; Cisco VIC 1225 physical adapter; 10GE links; FEX; vEth 1; 6200-A with L1/L2 links; physical cable; virtual cable]
Recommended UCS networking with Apache Hadoop
Use 2 VNICs with Fabric Failover on opposite fabrics for internal and ext…
VNIC 1 on Fabric A with FF to B (internal cluster)
VNIC 2 on Fabric B with FF to A (external data)
No OS bonding required
VNIC 0 (management) wiring not shown for clarity (primary on Fabric B, FF to A)
[Diagram labels: Data Node 1 and Data Node 2 with VNIC 0, VNIC 1, and VNIC 2; 6200 A and 6200 B (EHM); L2/L3 Switching; Data ingress/egress]
Cisco Unified IO Grant Bandwidth
• Near Wire Speed without CPU load
• Dynamic bandwidth management according to SL…
[Chart: individual Ethernets vs. prioritized QoS at times t1 and t2 for LAN Traffic (HDFS Import), Application Traffic, and Cluster Traffic (Shuffle); values shown are 3G/s and 4G/s]
HBase + Hadoop Map Reduce with QoS
[Charts: READ average latency (us), 0 to 40,000, over time, with and without QoS; buffer used over a timeline for Hadoop TeraSort and HBase. Annotation: ~60% Read Improvement]
Can Hadoop really push 10GE?
It can, depending on workload, so tune for it!
Analytic workload… lighter on the net…
Transform worklo… heavier on the ne…
Hadoop has num… parameters whic…
Take advantage…
– mapred.reduce.slowsta…
– dfs.balance.bandwidth…
– mapred.reduce.paralle…
– mapred.reduce.tasks
– mapred.tasktracker.red…
– mapred.compress.map…
CPA Sizing and Scaling for Big Data
UCS Reference Configurations for Big Data
Half-Rack UCS Solutions Bundle for MPP, NoSQL
Configuration
• 2 x UCS 6248
• 2 x Nexus 2232 PP
• 8 x C240 M3 (SFF)
• 2x E5-2690
• 256GB
• 24x 600GB 10K SAS

Full Rack UCS Solu… Bundle for Hado… Capacity
• 2 x UCS 6296
• 2 x Nexus 2232 P…
• 16 x C240 M3 (LF…
• E5-2640 (12 core…
• 128GB
• 12x 3TB 7.2K SAT…

Full Rack UCS Solutions Bundle for Hadoop, NoSQL Performance
• 2 x UCS 6296
• 2 x Nexus 2232 PP
• 16 x C240 M3 (SFF)
• 2x E5-2665 (16 cores)
• 256GB
• 24 x 1TB 7.2K SAS
Sizing
Start with current storage requirement
– Factor in replication (typically 3x) and compression (varies by data set)
– Factor in 20-30% free space for temp (Hadoop) or up to 50% for some NoSQL systems
– Factor in average daily/weekly data ingest rate
– Factor in expected growth rate (i.e. increase in ingest rate over time)
If I/O requirement known, use next table for guidance
Most big data architectures are very linear, so more nodes = more… better performance
Strike a balance between price/performance of individual nodes vs… nodes
Part science, part art
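The sizing factors above can be combined into a rough back-of-the-envelope calculation. The following is a minimal sketch under stated assumptions: the function names are hypothetical, ingest is treated as constant over the planning window (growth in the ingest rate is left out), and the example inputs (100 TB today, 0.5 TB/day ingest, 2:1 compression, 25% free space, 24 TB of raw disk per node) are illustrative values rather than figures from the deck.

```python
import math


def projected_data_tb(current_tb: float, daily_ingest_tb: float,
                      days: int) -> float:
    """Logical data volume after `days` of constant daily ingest."""
    return current_tb + daily_ingest_tb * days


def raw_capacity_tb(data_tb: float, replication: int = 3,
                    compression_ratio: float = 1.0,
                    free_fraction: float = 0.25) -> float:
    """Raw disk needed once replication, compression, and free space apply."""
    stored_tb = data_tb * replication / compression_ratio
    # Reserve `free_fraction` of every disk for temp/scratch space.
    return stored_tb / (1.0 - free_fraction)


def nodes_needed(raw_tb: float, node_capacity_tb: float) -> int:
    """Whole number of nodes required to hold `raw_tb` of raw data."""
    return math.ceil(raw_tb / node_capacity_tb)


data_tb = projected_data_tb(current_tb=100, daily_ingest_tb=0.5, days=365)
raw_tb = raw_capacity_tb(data_tb, replication=3, compression_ratio=2.0,
                         free_fraction=0.25)
nodes = nodes_needed(raw_tb, node_capacity_tb=24)
```

A capacity-only estimate like this sets a floor on node count; the I/O and compute requirements may push the final number higher, which is where the "part art" comes in.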
CPA sizing and application guidelines
Server
  CPU                      2 x E5-2690       2 x E5-2665      2…
  Memory (GB)              256               256              …
  Disk Drives              24 x 600GB 10K    24 x 1TB 7.2K    12…
  IO Bandwidth (GB/sec)    2.6               2.0              …
Rack-Level
  Cores                    256               256              …
  Memory (TB)              4                 4                …
  Capacity (TB)            225               384              …
  IO Bandwidth (GB/sec)    41.3              31.9             …
Applications               MPP DB, NoSQL     Hadoop, NoSQL    …
                           Best Performance  Best P…          …
Scaling the CPA
Single Rack: 16 servers
Single Domain: Up to 10 racks, 160 servers
Multiple Dom…
L2/L3 Switchi…
Scaling the Common Platform Architecture
Multiple domains based on 16 servers per rack and 2 x 2232 FEXs
Consider intra- and inter-domain bandwidth:
Columns:
(1) Servers per domain (pair of Fabric Interconnects)
(2) Available northbound 10GE ports (per fabric)
(3) Southbound oversubscription (per fabric)
(4) Northbound oversubscription (per fabric)
(5) Intra-domain server-to-server bandwidth (per fabric, Gbits/sec)
(6) Inte… serve… band… fabric…

(1)    (2)    (3)    (4)    (5)    (6)
160    16     2:1    5:1    5      …
144    24     2:1    3:1    5      …
128    32     2:1    2:1    5      …
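The per-fabric figures in this table can be reproduced with simple port arithmetic. The sketch below is one way to arrive at them, under assumptions that the deck does not state explicitly: a 96-port fabric interconnect (the UCS 6296 named in the reference configurations), one 2232 FEX per rack per fabric with 8 uplinks, 16 servers per FEX, and 10GE links throughout. The function name `domain_oversubscription` is hypothetical.

```python
import math


def domain_oversubscription(servers: int, fi_ports: int = 96,
                            servers_per_fex: int = 16,
                            fex_uplinks: int = 8,
                            link_gbps: float = 10.0) -> dict:
    """Per-fabric port budget and oversubscription for one UCS domain."""
    fex_count = math.ceil(servers / servers_per_fex)
    # Fabric interconnect ports consumed by FEX uplinks (southbound).
    southbound_ports = fex_count * fex_uplinks
    # Whatever is left can face north toward the L2/L3 switching layer.
    northbound_ports = fi_ports - southbound_ports
    if northbound_ports <= 0:
        raise ValueError("no fabric interconnect ports left for uplinks")
    southbound_ratio = servers_per_fex / fex_uplinks
    northbound_ratio = southbound_ports / northbound_ports
    return {
        "northbound_ports": northbound_ports,
        "southbound_ratio": southbound_ratio,
        "northbound_ratio": northbound_ratio,
        # Worst-case server-to-server bandwidth inside the domain.
        "intra_domain_gbps": link_gbps / southbound_ratio,
    }


table = {n: domain_oversubscription(n) for n in (160, 144, 128)}
```

Under these assumptions, 160 servers leave 16 northbound ports at 5:1, 144 leave 24 at 3:1, and 128 leave 32 at 2:1, each with 5 Gbits/sec of intra-domain bandwidth, which matches the rows above. Trading servers per domain for northbound ports is the main lever for inter-domain bandwidth.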
Multi-Domain CPA Customer Example
• 10 Gbits/sec Intra-Domain Server to Server NW Bandwidth
• 5 Gbits/sec Inter-Domain Server to Server NW Bandwidth
• Static pinning from FEX to FI (no port-channel)
Rack Awareness
Rack Awareness provides H… optional ability to group nodes in logical “racks”
Logical “racks” may or may… correspond to physical data… racks
Distributes blocks across di… “racks” to avoid failure dom… single “rack”
It can also lessen block mo… between “racks”
Can be useful to control block placement and movement i… integrated environments
[Diagram: three logical “racks” of five data nodes each (Datanode 1-5, Datanode 6-10, Datanode 11-15), with replicas of blocks 1-4 spread across the racks]
Recommendations: UCS Domains and Rack Awareness

Single Domain Recommendation
Turn off or enable at physical rack level
• For simplicity and ease of use, leave Rack Awareness off
• Consider turning it on to limit physical rack level fault domain (e.g. localized failures due to physical data center issues – water, power, cooling, etc.)

Multi Domain Recommendation
Create one Hadoop rack per UCS domain
• With multiple domains, enable Rack Awareness such that each UCS domain is its own Hadoop rack
• Provides HDFS data protection across domains
• Helps minimize cross-domain traffic
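Hadoop learns rack membership from an administrator-supplied topology script: it passes one or more node addresses as arguments and reads one rack path per address from standard output. The following is a minimal sketch of such a script for the multi-domain case, assuming each UCS domain sits in its own subnet; the subnets and rack names in `DOMAIN_SUBNETS` are hypothetical placeholders, and hostname arguments simply fall back to the default rack here.

```python
#!/usr/bin/env python3
import ipaddress
import sys

# Hypothetical mapping: one subnet per UCS domain, one Hadoop rack each.
DOMAIN_SUBNETS = {
    ipaddress.ip_network("10.1.0.0/16"): "/ucs-domain-1",
    ipaddress.ip_network("10.2.0.0/16"): "/ucs-domain-2",
}
DEFAULT_RACK = "/default-rack"


def rack_for(host: str) -> str:
    """Return the logical rack path for one node address."""
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        # Hostnames are not resolved in this sketch.
        return DEFAULT_RACK
    for network, rack in DOMAIN_SUBNETS.items():
        if addr in network:
            return rack
    return DEFAULT_RACK


if __name__ == "__main__":
    # Print one rack path per argument, in the order received.
    print(" ".join(rack_for(host) for host in sys.argv[1:]))
```

The script is registered through the cluster's topology script property in the Hadoop configuration. Mapping a whole domain to a single rack means replicas land in at least two domains while most shuffle and re-replication traffic stays inside one.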
Further Reading
BRKCOM-2011 - UCSM Integration Hadoop Networking Best Prac…
– Best practices for multi-domain CPA deployment
BRKAPP-2027 - Big Data Architecture and Deployment
– Deep dive on network behavior of Hadoop
Complete Your Online Session Evaluation
Give us your feedback and you could win fabulous prizes. Winners announced daily.
Receive 20 Cisco Daily Challenge points for each session evaluation you complete.
Complete your session evaluation online now through either the mobile app or internet kiosk stations.
Maximize your Cisco Live exp… free Cisco Live 365 account. D… PDFs, view sessions on-demand… live activities throughout the ye… Cisco Live 365 button in your… log in.