SAS Grid at Statistics Canada
BY: Yves DeGuire Statistics Canada
June 12, 2014
Agenda
• SAS at Statistics Canada
• What is the StatCan SAS Grid?
• Migration and Use Cases
• Lessons Learned
• Looking Forward
Statistics Canada
• Canada’s central statistical agency.
• Mandate to collect, compile, analyse and publish statistical information on the economic, social and general conditions of the country and its citizens.
• Mandate is fulfilled under the authority of the Statistics Act, which prohibits the disclosure of identifiable information.
Crunching numbers is our business!
SAS@StatCan Where?
[Diagram: Survey Lifecycle. Stages: Collection, Processing, Analysis, Dissemination. Data stores: Input Database, Clean Microdata, Output Database.]
SAS@StatCan What?
• Data processing
• Application development
• Query and reporting
• Statistical analysis
• Exploratory data analysis
• “Specialised” computations (time-series, optimization, matrix operations, etc.)
SAS@StatCan How?
• Base SAS
• SAS/ACCESS
• SAS/AF
• SAS/CONNECT
• SAS/ETS
• SAS/GRAPH
• SAS/IML
• SAS/IntrNet
• SAS/OR
• SAS/SHARE
• SAS/STAT
• SAS/TOOLKIT
• Integration Technologies
• Enterprise Guide
• Enterprise Platform
• DI Server
• JMP
• Grid Manager
SAS@StatCan Some Numbers!
• 2,500,000 SAS jobs run every year
• 4,000 PC-SAS installations
• 2,500 active SAS users
• 450 production applications
• 80 Windows servers
• 25 Unix servers
• 20 platforms
• 3 versions of SAS: 9.1.3, 9.2 and 9.3
• 1 grid!
SAS@StatCan More than 2500 Users!
What is the StatCan SAS Grid?
• A complete SAS Platform deployment utilizing the SAS Grid Manager 9.4.
• Available to the entire Agency via a Hosting service.
• Part of the Network Transformation Initiative (NTI)
• 3 objectives:
– Consolidate 100+ SAS servers (Phase 1)
– Migrate processing from workstations to the grid (Phase 2)
– Enable new computing initiatives/possibilities (Phase 1 & 2)
StatCan Grid Milestones
• 2005-2010: Several “home-made” grids developed over the years using Base SAS and SAS/CONNECT
• 2011: first test grid based on Grid Manager
• 2013: enhanced test grid released
• May 2014: production grid released for IBSP (V1)
• Q3 2014: full production grid will be released for general availability (V2)
A Few Impressive Results while Testing the Grid
• Capital stock calculation: 89% improvement on elapsed time (2005)
• Audit module in G-Confid: Over 90% improvement on elapsed time (2009)
• NHS-Tax Linkage project: from 59 hours to 50 minutes using G-Link V3 (2012)
• Simulations with CCHS data: hundreds of simulations run in a few hours compared to days on a workstation. (2013)
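The figures above mix two ways of reporting a gain: a percentage reduction in elapsed time and a before/after duration. The short Python sketch below puts them on a common footing. It assumes that "89% improvement on elapsed time" means the elapsed time dropped by 89%; the function names are illustrative only and do not come from the presentation.

```python
def speedup(before, after):
    """Return how many times faster the job became (both in the same unit)."""
    return before / after


def reduction_pct(before, after):
    """Return the percentage reduction in elapsed time."""
    return 100.0 * (before - after) / before


def speedup_from_reduction(pct):
    """Convert a percentage reduction in elapsed time into a speedup factor."""
    return 100.0 / (100.0 - pct)


# NHS-Tax Linkage: 59 hours down to 50 minutes (figures from the slide).
before_min = 59 * 60
after_min = 50
linkage_speedup = speedup(before_min, after_min)
linkage_reduction = reduction_pct(before_min, after_min)

# Capital stock calculation: assumed to mean an 89% drop in elapsed time.
capital_stock_speedup = speedup_from_reduction(89)

print(f"Linkage: {linkage_speedup:.1f}x faster, {linkage_reduction:.1f}% less elapsed time")
print(f"Capital stock: about {capital_stock_speedup:.1f}x faster")
```

Under that reading, the linkage project ran roughly 70 times faster, while an 89% reduction corresponds to roughly a 9-fold speedup.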
Why the StatCan Grid?
• Reduced costs $ $ $
• Process Higher Volume of Data.
• Process data in less time.
• Scalable
• Secure
• Centrally managed
• Usage metrics
Implementation Highlights (phase 1)
[Architecture diagram]
• Shared File System: clustered, 2-tier storage, 80 TB
• SAS Metadata Server
• Grid Nodes: Node1 to Node16 (diagram labels: 16 cores, 256 GB RAM, Intel x86_64)
• SAS Mid-Tier
• SAS Platform Clients
• Web Clients and Services
The Transparent Grid
One of the objectives of the grid is to make the user experience as transparent as possible.
• Single sign-on
• Samba shares
• Helpers (Macros, Stored Processes)
SAS Grid Data Tier
• Data Files (must “live” on the CFS)
– Flat files / SAS files
– PC files (Excel spreadsheets, etc.)
– Exposed to Windows via SAMBA
• Databases:
– SQL*Server
– ORACLE
– Sybase
Migration Requirements
The StatCan SAS grid is a “pure” SAS compute service!
Platform clients only such as Enterprise Guide
No host commands available
SAS/ACCESS to PC file formats, with limitations
No direct access to Windows Shares
SAS 9.4 and SAS 9.3M1 supported
Use Cases
• Use Case #1: Ad hoc users
– Users who need to process/analyze data “on-demand”
– Large number of concurrent users
• Use Case #2: Batch Jobs
– SAS Jobs that run unattended.
– A new mainframe!!!
• Use Case #3: Parallel Processing
– Jobs broken into smaller tasks and dispatched to the grid.
– Myth: a SAS program will execute in parallel with no modifications!
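Use Case #3 can be illustrated with a minimal Python sketch of the general split-and-dispatch pattern. It is not SAS code and does not use any Grid Manager interface: a local thread pool stands in for the grid nodes, and `split_into_chunks`, `process_chunk` and `run_on_grid` are hypothetical names chosen for the example.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(records, n_chunks):
    """Split a list into n_chunks contiguous pieces of near-equal size."""
    size, rem = divmod(len(records), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        end = start + size + (1 if i < rem else 0)
        chunks.append(records[start:end])
        start = end
    return chunks


def process_chunk(chunk):
    """Stand-in for one independent task: here, a sum of squares."""
    return sum(x * x for x in chunk)


def run_on_grid(records, n_workers=4):
    """Dispatch one task per chunk to the pool and combine the partial results."""
    chunks = split_into_chunks(records, n_workers)
    # Assumption: a local thread pool plays the role of the grid nodes.
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(process_chunk, chunks))
    return sum(partials)


total = run_on_grid(list(range(1, 101)))
print(total)
```

The point of the "myth" bullet shows up in the structure: the work only runs in parallel because it was explicitly split into independent tasks whose results can be recombined afterwards.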
Lessons Learned
• A SAS grid project is also an infrastructure project.
• Integrating Linux with a Windows environment poses some challenges.
• Managing user expectations is critical.
• Resistance to change must be managed.
• Start simple and build on success.
• Be proactive: plan/think about your next SAS environment.
Looking Forward
• Phase 1: consolidate 80 servers over the next 2 years.
• Phase 2:
– Introduce a new grid at SSC Data Centre.
– Complete the server consolidation started in Phase 1.
– Migrate workstation processing to the grid.
Are there opportunities to collaborate with other departments?
Thank You!
Yves DeGuire
Section Chief
System Engineering Division
Statistics Canada
R.-H.-Coats Building 14 A
100, Tunney’s Pasture driveway
Ottawa, Ont., K1A 0T6
(613) 951-1282