N Series Overview: Hardware and Competitive Advantage
Posted on 22-Dec-2015
Innovating to Deliver Choices
1993 – NAS appliance and Snapshots
1996 – Multiprotocol appliance
2001 – Near-line storage appliance
2002 – Unified SAN/NAS appliance
2003 – iSCSI storage system
2004 – RAID-DP™ disk resiliency
2005 – Thin provisioning and virtual cloning
2006 – Scalable grid storage
Addressing Today’s Challenges
• Explosive data growth
• Do more with less
• Scale the infrastructure
• 24x7 global access
• Data security & compliance
Consolidate storage
Operate everywhere
Protect your business
Different Classes of Data
[Chart: classes of data ranked by availability requirement, data criticality, and cost]
Tier 4 – Compliance / Reference / Archive
Tier 3 – Departmental / Distributed
Tier 2 – Operational / Internal
Tier 1 – Business Critical
Different Tiers of Storage
• One architecture
• One application interface
• One management interface
• Total interoperability
Data ONTAP® Operating System – SAN, NAS, iSCSI
[Diagram: all four tiers, from Tier 1 (Business Critical) to Tier 4 (Compliance / Reference / Archive), served by the one Data ONTAP architecture]
Broadest Scalable Storage Architecture
N Series Family of Unified Enterprise Storage Systems
• N3700 – 4 FC ports, 16TB max
• N5000 – 20 FC ports, 168TB max
• N7000 – 32 FC ports, 504TB max
A Fundamentally Simpler Approach
• Unrivaled synergy: everything works together
• Unique leverage: everything can do more
• Simpler administration: one process works everywhere
• Easier to deploy: less to learn means reduced training
[Diagram: one unified architecture covering primary, secondary, backup, compliance, and disaster recovery storage; SAN, iSCSI, and NAS (NFS, CIFS) protocols; high-end FC, midrange FC, midrange ATA, and low-end ATA tiers; and virtualization]
N Series Storage – What is it?
Dedicated Storage Appliances
• Filer head unit
• Disk shelves
• Optimised micro-kernel
• Connectivity
It's all about blocks!
• Data ONTAP – optimised micro-kernel
– A highly optimised, scalable and flexible OS
• Write Anywhere File Layout – WAFL
– Allows for flexible storage containers
– Tightly integrated with NVRAM and RAID
– RAID 4 for performance and flexibility
– RAID-DP for performance, flexibility and increased reliability!
Traditional Implementation
• Classic RAID 4 – dedicated parity drive
[Diagram: stripe of data disks plus a dedicated parity disk]
• For each data block written, parity is also written
• The parity drive becomes the bottleneck!
• 5 data blocks + 5 parity = 10 disk writes!
Traditional Implementation
• RAID 5 – distributed parity
[Diagram: stripe with parity blocks rotated across all disks]
• For each data block written, parity is also written
• 5 data blocks + 5 parity = 10 disk writes!
Performance achieved!
• WAFL and RAID 4
[Diagram: full stripe of five data blocks plus one parity block]
• The stripe is calculated in NVRAM; the parity describes the whole stripe
• No bottleneck! All drives are written to evenly
• 5 data blocks + 1 parity = 6 disk writes!
Performance achieved!
• Let's extend that further…
– RAID 4 with 28 disks = 28 data + 28 parity
• Total 56 disk writes, with a hot parity drive
– RAID 5 with 28 disks = 28 data + 28 parity
• Total 56 disk writes
– WAFL RAID 4 with 28 disks = 28 data + 2 parity
• Total 30 disk writes
• The more disks, the bigger the difference!
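The disk-write arithmetic above can be sketched in a few lines. This is an illustrative model only: it assumes classic RAID 4/5 pays one parity write per updated data block, while a WAFL-style full-stripe write pays one parity write per stripe.

```python
# Illustrative disk-write counting for the slide's examples (assumptions:
# classic RAID 4/5 writes one parity block per data block updated; a
# WAFL-style full-stripe write pays one parity write per stripe).

def classic_raid_writes(data_blocks: int) -> int:
    """Every data-block update also rewrites its parity block."""
    return data_blocks * 2

def wafl_full_stripe_writes(data_blocks: int, stripe_width: int) -> int:
    """WAFL batches blocks into full stripes: one parity write per stripe."""
    stripes = -(-data_blocks // stripe_width)  # ceiling division
    return data_blocks + stripes

print(classic_raid_writes(5))           # 10 (5 data + 5 parity)
print(wafl_full_stripe_writes(5, 5))    # 6  (5 data + 1 parity)
print(classic_raid_writes(28))          # 56
print(wafl_full_stripe_writes(28, 14))  # 30 (28 data + 2 parity)
```

The gap widens with disk count because the parity cost is per stripe, not per block.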
Protecting the Data
• Subsystem resilience
– RAID-DP
• Disk resilience
– Lost write protection
– Momentary disk offline
– Maintenance Center
– Checksums
– Background media scans
– RAID scrub
Storage Resiliency – RAID-DP
• RAID-DP is dual-parity data protection
• NetApp RAID-DP is an implementation of the industry-standard RAID 6 as defined by SNIA
– The SNIA definition was recently updated to include NetApp RAID-DP: "Any form of RAID that can continue to execute read and write requests to all of a RAID array's virtual disks in the presence of any two concurrent disk failures. Several methods, including dual check data computations (parity and Reed Solomon), orthogonal dual parity check data and diagonal parity have been used to implement RAID Level 6"
Storage Resiliency - RAID-DP
• Impact on usable capacity is zero
– Default RAID group sizes with RAID-DP are double those of RAID 4
– Result: even though an extra parity disk is used, the net result is the same number of data disks
– Example: RAID 4 (7D+1P)+(7D+1P) vs. RAID-DP (14D+2P) – both result in 14 data disks
• Comparable performance to RAID 4
– Typically a 1–3% impact on performance
– Competitors' RAID 6 implementations typically see significant degradation on writes compared to their RAID 5
Storage Resiliency – RAID-DP
• RAID-DP provides extra protection over single-parity solutions
• Within the disk drive industry:
– Disk drives are much larger
– Disk drive error-correction capability and reliability have not improved at the same rate
– More data is read while reconstructing a RAID group
– This increases the likelihood of an unrecoverable error occurring when RAID is not available to correct it
• This leads to a higher risk of data loss from a single disk failure followed by an unrecoverable media error during reconstruction (MEDR), or a double disk failure
• This is a disk drive industry risk that RAID-DP protects against
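As a toy illustration of why a second, independent parity matters (a generic XOR sketch, not NetApp's actual row-plus-diagonal RAID-DP layout): single row parity gives one equation, enough to rebuild one lost disk but not two.

```python
# Toy dual-parity motivation: row parity (XOR) rebuilds one lost block;
# two losses need a second independent parity, which is what RAID-DP adds.
from functools import reduce
from operator import xor

def row_parity(stripe):
    return reduce(xor, stripe)

def rebuild_single(stripe_with_hole, parity):
    """Recover the one missing block (None) from row parity alone."""
    known = [b for b in stripe_with_hole if b is not None]
    return reduce(xor, known, parity)

stripe = [0x31, 0x2A, 0x07, 0x5C]
p = row_parity(stripe)

# One disk lost: row parity is sufficient.
damaged = [0x31, None, 0x07, 0x5C]
print(hex(rebuild_single(damaged, p)))  # 0x2a

# Two disks lost: one XOR equation, two unknowns. That case is
# unrecoverable without the second (diagonal) parity RAID-DP maintains.
```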
RAID-DP / RAID 6 – so what?
If a customer requires ATA storage, then RAID 6 surely has to be mandatory!
The competitors' solution is to sell MORE DISK!
Lost Write Protection - Anatomy Of A Lost Write
• How a “lost” write occurs:
1. A disk malfunction occurs during a write
2. The disk reports a successful write
3. In reality, no write happened (silently dropped), or the data was written to a random location (lost)
4. A subsequent read* of the blocks returns bad data
5. Result = data corruption or data loss
* Note: No checksum mismatch occurs (the existing data still matches its checksum, since neither was updated), so no error is reported. A parity inconsistency is detected and fixed by a RAID scrub, but without detection or recovery of the lost-write data.
Lost Write Protection
Step 1: Write New Block
• Write data to a free block
• The block ID (e.g. 1234) is stored in the checksum
• WAFL tracks the block ID
Step 2: Write Updated Data
• Update the data on block 1234 and write block 1234 to a new location
• The previously free block had ID 0000; it changes to block ID 1234
Step 3: Read Data From Disk
• Verify the data block ID against the ID tracked by WAFL
• An incorrect ID indicates a lost write has occurred
• Re-create the data from parity
[Diagram: RAID stripe (D D D D P DP) with per-block IDs; on read, the WAFL-tracked ID (new ID 1234) is compared against the checksum's stored ID (old ID 0000). Legend: WAFL, Checksum]
Benefit: High Data Integrity
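The block-ID check in the three steps above can be modelled in miniature. Everything here (the `Disk` class, the `expected` map) is a hypothetical stand-in for WAFL's internal tracking, not Data ONTAP code:

```python
# Miniature model of lost-write detection: the filesystem remembers which
# block ID it last wrote; a stale ID on read exposes a silently dropped write.

class Disk:
    def __init__(self):
        self.blocks = {}                      # location -> (block_id, data)

    def write(self, loc, block_id, data, lost=False):
        if not lost:                          # a "lost" write is silently dropped
            self.blocks[loc] = (block_id, data)

    def read(self, loc):
        return self.blocks.get(loc, (0, None))

disk = Disk()
expected = {}                                 # WAFL-style block-ID tracking

disk.write(7, 1234, b"v1"); expected[7] = 1234
disk.write(7, 1235, b"v2", lost=True); expected[7] = 1235  # drive drops it

block_id, data = disk.read(7)
if block_id != expected[7]:
    print("lost write detected: re-create data from parity")
```

The key point matches the slide: the stale data still passes its own checksum, so only the out-of-band ID comparison reveals the lost write.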
Momentary Disk Offline
• A feature where RAID temporarily suspends I/O to a drive
– Available with DOT 7.0 for non-disruptive disk firmware upgrades
– Disk offline for SATA drive spasm recovery is supported (7.0.1)
– Aggregate requirements:
• RAID-DP or a mirrored aggregate (SyncMirror)
• Allowed only if the RAID group is in a normal/restricted state and no disk copy is in progress in the group
Momentary Disk Offline
[Diagram: RAID group of Data1–Data6 plus Parity1 and Parity2, with one drive taken offline]
Step 1: Trigger Detection
• Firmware upgrade
• SATA spasm errors
• FC timeout errors
• Media/head errors
Step 2: Drive Offline
• Reads are served from parity
• Writes are logged
• Execute the action (FW update, error recovery, power cycle)
Step 3: Bring Drive Online
• Recovery test (dummy I/O)
• Re-sync the logged writes
• Re-activate I/O to the drive
Maintenance Center
Step 1: Predict Failure → Step 2: Run Diagnostics → Step 3: Fix Problems
• Provides additional storage resiliency
• Predictive and preventative techniques keep system health at its peak
• Customer benefits:
– Fewer storage-related issues
– Lower IT management costs
Checksums
• Fibre Channel drives: block checksums (BCS), 8 x 520-byte sectors per 4kB block (512 bytes of data + 8 bytes of checksum per sector)
• SATA drives: BCS emulation, 9 x 512-byte sectors per 4kB block (512 bytes of data per sector, with 64 bytes of checksum in the 9th sector)
How checksums work – verify every 4kB block checksum on read:
– Read the data
– Re-calculate the checksum
– Compare to the stored checksum
– If needed, re-create the block from parity
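The read-path verification can be sketched as follows. This uses `zlib.crc32` purely as a stand-in checksum; the on-disk BCS sector layout described above is not modelled here:

```python
# Sketch of block-checksum verification on the read path: store a checksum
# next to each 4kB block on write, re-compute and compare on every read.
import zlib

def write_block(data: bytes):
    return data, zlib.crc32(data)             # block + stored checksum

def read_block(data: bytes, stored: int) -> bytes:
    if zlib.crc32(data) != stored:            # mismatch: media corruption
        raise IOError("checksum mismatch: re-create block from parity")
    return data

block, cksum = write_block(b"\x00" * 4096)
assert read_block(block, cksum) == block      # clean read passes

corrupted = b"\x01" + block[1:]               # flip a byte on "disk"
try:
    read_block(corrupted, cksum)
except IOError as err:
    print(err)
```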
Background Media Scans
Step 1: Scan For Media Errors
• Begin scans at disk block 0
• Uses SCSI Verify
• 128-block (512K) verify request size
Step 2: Detect & Fix Errors
• Looks for latent defects
• The drive marks the bad block
• Data is reconstructed from parity
• Re-allocated to an available block
Step 3: Complete Scan
• Continues scanning all blocks
• Background verify process with no performance impact
• Fixed scan rate (sectors/sec)
RAID Scrubs
Step 1: Scrub Disks
• Issue reads to all disks in the RAID group
• Scan for media defects
• Verify checksums
• Compute parity
Step 2: Detect & Fix Errors
• Fix checksum errors
• Fix parity errors
• Fix media errors
Step 3: Complete Scan
• Runs 6 hours/week by default
• The schedule is configurable
• Can resume if interrupted
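A hedged sketch of one scrub pass over a toy RAID group, following the three steps above: read each stripe, verify per-block checksums, rebuild bad blocks from parity, and recompute stale parity. This is an illustrative model only (small ints as blocks, XOR parity, a trivial stand-in checksum), not ONTAP internals.

```python
# Toy scrub pass: detect checksum mismatches, repair from parity, fix parity.
from functools import reduce
from operator import xor

def checksum(block: int) -> int:
    return block & 0xFF                       # stand-in checksum for the sketch

def scrub(stripes):
    fixed = 0
    for s in stripes:
        for i, blk in enumerate(s["data"]):
            if checksum(blk) != s["sums"][i]:             # media/checksum error
                peers = [b for j, b in enumerate(s["data"]) if j != i]
                s["data"][i] = reduce(xor, peers, s["parity"])  # rebuild
                fixed += 1
        good = reduce(xor, s["data"])
        if s["parity"] != good:                           # stale parity
            s["parity"] = good
            fixed += 1
    return fixed

stripe = {"data": [0x31, 0x99, 0x07],         # middle block corrupted on disk
          "sums": [0x31, 0x2A, 0x07],         # stored checksums (correct)
          "parity": 0x31 ^ 0x2A ^ 0x07}
print(scrub([stripe]))                        # 1 repair made
print(hex(stripe["data"][1]))                 # 0x2a (restored)
```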
Infrastructure Utilization Challenges
• Overall storage utilization is low
– Most enterprises are below 50% utilization
• Too many untapped resources
– Static allocation
– Suboptimal performance
– No sharing of resources
Industry Trends
• Disk capacity is growing
• More disks are being used to address performance
• Control requirements drive volume granularity
• Differing data types need different management
• The size of data units is growing unevenly
• Increasing mismatch between tools and building blocks
[Diagram: RAID groups RG1, RG2, and RG3 combined into a single aggregate]
Aggregates and FlexVol™ Volumes: How They Work
1. Create the RAID groups
2. Create the aggregate
3. Create and populate each flexible volume (vol1, vol2, vol3)
• No preallocation of blocks to a specific volume
• WAFL® allocates space from the aggregate as data is written
Flexible Volumes Improve Utilization
14 x 72GB disks = 1TB capacity
[Diagram: traditional fixed volumes – Vol0 (Data + Parity), Database (7 x Data + Parity + Spare), Home Directories (2 x Data + Parity)]
• Vol0 = 1GB max
• 200GB database created
• 3-disk volume for Home Directories / Shares
• 1 hot spare
• Volume sizes of 140GB, 370GB and 40GB leave 550GB of wasted space
Flexible Volumes Improve Utilization
14 x 72GB disks = 1TB capacity
• Vol0 = 1GB max
• 200GB database created
• 3-disk volume for Home Directories / Shares
• 1 hot spare
[Diagram: a single aggregate of 11 x Data + 2 x Parity + Spare holding the Vol0, Database, and Home Dirs volumes]
400GB used – 600GB of free space!
FlexVols™: Enabling Thin Provisioning
[Diagram: 1TB of physical storage presented as 2TB of FlexVols (300GB, 200GB, 200GB, 50GB, 150GB, 100GB, plus a 1TB LUN of which 400GB is used), with container-level and application-level soft allocation]
• FlexVols (container level): flexible provisioning, better utilization
• Application level: higher granularity, application over-allocation containment
• Separates physical allocation from the space visible to users
• Increases control of space allocation
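A toy model of the container-level soft allocation described above (hypothetical classes, not the ONTAP API): volumes promise more logical space than physically exists, and the aggregate hands out real blocks only as data is written.

```python
# Toy thin-provisioning model: 2TB of FlexVols promised on 1TB of physical
# storage; physical space is consumed only when data is actually written.

class Aggregate:
    def __init__(self, physical_gb: int):
        self.physical = physical_gb
        self.used = 0

    def allocate(self, gb: int):
        if self.used + gb > self.physical:
            raise RuntimeError("aggregate out of physical space")
        self.used += gb

class FlexVol:
    def __init__(self, aggr: Aggregate, size_gb: int):
        self.aggr = aggr
        self.size = size_gb                    # the logical promise to users
        self.written = 0

    def write(self, gb: int):
        self.aggr.allocate(gb)                 # real space drawn only on write
        self.written += gb

aggr = Aggregate(1000)                         # 1TB physical
vols = [FlexVol(aggr, 500) for _ in range(4)]  # 2TB provisioned
vols[0].write(300)
print(aggr.used)                               # 300: only written data counts
```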
Causes of Unplanned Downtime
• Technology failures (20%) – addressed by fewer components, redundant components, cluster failover, and SnapMirror™ for DR
• Operator errors (40%) – addressed by appliance simplicity, ease of management, plug-and-play, and low product complexity
• Application errors (40%) – addressed by multiple point-in-time copies with low overhead and fast recovery of an entire filesystem or database
Source: Gartner Group, 2005
Snapshots Defined
• A Snapshot is a reference to a complete point-in-time image of the volume's file system, "frozen" as read-only.
• Taken automatically on a schedule, or manually
• Readily accessible via "special" subdirectories
• Up to 255 Snapshots concurrently for each file system, with no performance degradation
• Snapshots replace a large portion of the "oops!" reasons that backups are normally relied upon for:
– Accidental data deletion
– Accidental data corruption
• Snapshots use minimal disk space (~1% per Snapshot)
Snapshots Defined
[Diagram: FileA.dat before and after a Snapshot, showing the file read and file write paths]
• A Snapshot only copies the pointers to the blocks
• On a write, only the changed block is written back to disk
• The previous block is maintained for the Snapshot version
How not to SnapShot – Copy on Write!
[Diagram: FileA.dat under a copy-on-write snapshot, showing the file read and file write paths]
• Step 1 – The original block must be moved to the snapshot area
• Step 2 – The snapshot index is updated
• Step 3 – The new block is written to disk
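The contrast between the two snapshot styles can be modelled with plain dictionaries (an illustrative sketch, not any vendor's on-disk format): a WAFL-style snapshot copies only the pointer map, so an overwrite costs a single write to a fresh block, while copy-on-write must first move the old block aside.

```python
# WAFL-style snapshot: copy the pointers, write new data to a free block.
active = {"FileA.dat": "blk1"}                # file -> block pointer
snapshot = dict(active)                       # snapshot = pointer copy only

active["FileA.dat"] = "blk2"                  # overwrite goes to a new block
assert snapshot["FileA.dat"] == "blk1"        # old version still reachable
# Cost per overwrite: 1 disk write (the new block).

# Copy-on-write style: the original block must be moved before overwriting
# in place, so each overwrite costs extra I/Os (move + index + new write).
cow_active = {"FileA.dat": "blk1"}
cow_snapshot_area = {}

cow_snapshot_area["FileA.dat"] = cow_active["FileA.dat"]  # 1. move old block
# 2. update the snapshot index (modelled by the assignment above)
cow_active["FileA.dat"] = "blk2"                          # 3. write new block
```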
SnapRestore Defined
• SnapRestore reverts an entire volume (filesystem) to any previous online Snapshot
– Makes the Snapshot the new active file system
• Instant recovery (no reboot)*
• Particularly compelling for database contents or software-testing situations
* Except if restoring the root volume
N Series SnapRestore
Step 1 – The Snapshot's volume index is set as the master, the current volume pointers are removed, and redundant blocks are flagged as available
The volume is restored in seconds, with no performance impact!
Competition's Volume Restore
[Diagram: volume Volx restored from its snapshot area]
• Step 1 – The volume index is restored; the data is inconsistent!
• Step 2 – Blocks are copied back from the snapshot area
SnapRestore for Databases
Provides a unique solution to database recovery. Rather than restoring large amounts of data from backup tape:
1. Simply revert the entire volume back in time to its state when the Snapshot was taken
2. Then play the change logs forward to complete recovery
Effectively protects data without expensive mirroring or replication.
Use SnapRestore where the time to copy data from either a Snapshot or tape into the active filesystem is prohibitive.
Positioning SnapManager Products
[Chart: RTO and RPO timescales from weeks through days and hours down to minutes, mapping File Data, SQL, and Oracle workloads to the appropriate product]
• SnapShot / SnapRestore
• SnapManager for Exchange
• SnapManager for SQL
• SnapManager for Oracle
Information Lifecycle Management
Solutions to Meet Customer Challenges
Simplify for Lowest TCO
Best of Breed Solutions
Customer Satisfaction
• Backup & Recovery
• Storage Consolidation
• Regulatory Compliance
• Disaster Recovery
Storage Consolidation
• High-availability storage
• Effortless, large-scale server consolidation
• Pooled storage with non-disruptive expansion
• Heterogeneous file sharing
• Simplified data management
• Seamless integration with existing software and hardware
[Diagram: primary storage and nearline storage]
Backup and Recovery
• Simplified, centralized backup and restore
• Perform remote backups and restores locally
• Instantaneous access to backup data (file format)
• Uses significantly less storage
• Eliminates backup-window problems (back up hourly)
[Diagram: UNIX and Windows servers in Chicago, San Francisco, NY, and London backing up over the WAN via SnapVault™ to nearline storage in the data center and at a remote site]
Regulated Data
• UK Data Protection Act, European Union Directive 95/46
• SEC Rule 17a-4 (broker-dealers)
• DoD 5015.2 (government)
• HIPAA (healthcare)
• 21 CFR 11 (life sciences/pharmaceuticals)
• Sarbanes-Oxley (public companies over $75m cap.)
• FSA Handbook (UK financial services)
• BSI DISC PD 0008 (evidential weight, code of practice)
• Basel II Accord
• Freedom of Information Act 2000
…to name just a few!
Regulatory Compliance
• Comprehensive solution: data permanence and data security
• Increased data protection
• Retention-date support
• Easy to integrate with existing applications
• Meets requirements for SEC 17a-4, HIPAA, DoD 5015.2, GoBS, and more
• Unmatched flexibility
– Runs on all platforms
– Systems can store compliance and non-compliance data
[Diagram: a DB or e-mail archival application accesses data and moves it to WORM volumes; nearline storage in the data center is mirrored via SnapMirror® over the WAN to a remote site (Chicago, Tokyo, London)]
Disaster Recovery
• Mirror sites for rapid disaster recovery
• Remote-site users fail over to the mirrored site automatically
• A single solution for sync, async, and semi-sync replication
• Runs across all platforms
• Cost-effective for remote sites
• An economical DR solution
[Diagram: primary and nearline storage in Chicago, Tokyo, and London replicating over the WAN via SnapMirror™ to a DR site]
FlexClone™ Software
• Enables multiple, instant data set clones with no storage overhead
• Provides dramatic improvement for application test and development environments
• Renders competitive methods archaic
FlexClone™ Volumes: Ideal for Managing Production Data Sets
• Error containment
– Bug fixing
• Platform upgrades
– ERP
– CRM
• Multiple simulations against a large data set
– ECAD
– MCAD
– Oil and gas
The Pain of Development
[Diagram: a 1.4TB storage solution holding Prod, Pre-Prod, QA, Dev, Test, and Sand Box volumes of 200GB each, leaving 200GB free]
• Creating copies of the volume requires processor time and physical storage
Clones Remove the Pain
[Diagram: a 1.4TB storage solution holding the 200GB Prod volume plus cloned Pre-Prod, QA, Dev, Test, and Sand Box volumes, leaving 1TB free]
• Create clones of the volume – no additional space required
• Start working on the Prod volume and the cloned volumes
• Only changed blocks get written to disk!
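A clone's space behaviour can be sketched the same way (hypothetical structures, not FlexClone internals): creating the clone copies only pointers, and divergence begins with the first changed block.

```python
# FlexClone-style sketch: a clone starts as a shared pointer map, so it
# occupies no new space; only blocks changed afterwards consume storage.

class Volume:
    def __init__(self, blocks: dict):
        self.blocks = blocks                  # block name -> physical block id

def clone(vol: Volume) -> Volume:
    return Volume(dict(vol.blocks))           # instant: pointers only

prod = Volume({f"blk{i}": f"p{i}" for i in range(5)})
dev = clone(prod)                             # no data copied

dev.blocks["blk0"] = "p5"                     # first change diverges one block

shared = sum(dev.blocks[k] == prod.blocks[k] for k in prod.blocks)
print(shared)                                 # 4 of 5 blocks still shared
```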
In an Ideal IBM N Series World….
[Diagram: a primary production array mirrored via SnapMirror to a secondary array]
• Create clones from the read-only mirrored volume
• This removes the development workload from production storage!
IBM N Series / EMC / HP
[Diagram: competitive product ranges – NetApp FAS250 through FAS6070 under one architecture, versus EMC (AX150/S, CX-20/40/80, DMX family, Celerra NS Series, Centera) and HP (DL380, MSA1000/1500/1500cs, RISS, EVA4/6000, EVA8000, XP family)]
Data ONTAP™ Operating System – SAN, NAS, iSCSI
• One architecture
• One application interface
• One management interface
• Total interoperability
A Real-World Scenario
• The customer is looking for a scalable platform to support future growth
• Needs to consider disaster recovery options
• And has a requirement for a compliance solution
Scalability
[Diagram: the IBM N Series range (N3700, N5200, N5500, N5600, N7600, N7800) scaling across the same competitive landscape of EMC and HP product families]
Interoperability
[Diagram: the same IBM N Series range (N3700 through N7800) interoperating across the competitive landscape of EMC and HP product families]
Compliance
[Diagram: the same IBM N Series range (N3700 through N7800) providing compliance across the competitive landscape of EMC and HP product families]
A Real-World Scenario
• The customer is looking for a scalable platform to support future growth
– N Series systems scale from the entry level to the enterprise
• Needs to consider disaster recovery options
– Any N Series system can replicate to any other N Series system
– Natively, over IP or FC
– Can be a mix of FC-SAN or iSCSI
• And has a requirement for a compliance solution
– SnapLock can be added to any N Series system
And don't forget FlexClone!