Compute Cluster Server And Networking
Sonia Pignorel
Program Manager, Windows Server HPC
Microsoft Corporation
Key Takeaways
Understand the business motivations for entering the HPC market
Understand the Windows Compute Cluster Server solution
Showcase your hardware’s advantages on the Windows Compute Cluster Server platform
Develop solutions that make it easier for customers to use your hardware
Agenda
Windows Compute Cluster Server V1
Business motivations
Customer case studies
Product overview
Networking
Top500
Key challenges
CCS V1 features
Networking roadmap
Call to action
Business Motivations
“High productivity computing”
Application complexity increases faster than clock speed, so parallelization is needed
Windows application users need cluster-class computing
Make compute clusters ubiquitous and simple, starting at the departmental level
Remove customer pain points in:
Implementing, managing, and updating clusters
Compatibility and integration with existing infrastructure
Testing, troubleshooting, and diagnostics
The HPC market is growing; clusters account for 50% of HPC servers (source: IDC, 2006). This drives the need for resources such as development tools, storage, interconnects, and graphics
Clusters Used On Each Vertical
Finance
Oil and Gas
Digital Media
Engineering
Bioinformatics
Government/Research
Partners
Agenda
Windows Compute Cluster Server V1
Business motivations
Customer case studies
Product overview
Networking
Top500
Key challenges
CCS V1 features
Networking roadmap
Call to action
Investment Banking
Windows Server 2003 simplifies development and operation of HPC cluster solutions
Challenge
Investment banking is driven by time-to-market requirements for structured derivatives
Computation speed translates into competitive advantage in the derivatives business
Fast development and deployment of complex algorithms on different configurations
Results
Enables flexible distribution of pricing and risk engines across client, server, and/or HPC cluster scale-out scenarios
Developers can focus on .NET business logic without porting algorithms to specialized environments
Eliminates separate customized operating systems
“By using Windows as a standard platform, our business IT can concentrate on developing the specific competitive advantages of their solutions.”
Andreas Kokott
Project Manager, Structured Derivatives Trading Platform
HVB Corporates & Markets
Oil and Gas
Microsoft HPC solution helps oil company increase the productivity of research staff
Challenge
Wanted to simplify managing the research center’s HPC clusters
Sought to remove the IT administrative burden from researchers
Needed to reduce time for HPC jobs and increase the research center’s output
Results
Simplified IT management, resulting in higher productivity
More efficient use of IT resources
Scalable foundation for future growth
“With Windows Compute Cluster Server, setup time has decreased from several hours—or even days for large clusters—to just a few minutes, regardless of cluster size.”
IT Manager, Petrobras CENPES Research Center
Engineering
Aerospace firm speeds design, improves performance, and lowers costs with clustered computing
Challenge
Complex, lengthy design cycle with difficult collaboration and little knowledge reuse
High costs due to expensive computing infrastructure
Advanced IT skills required of engineers, slowing design
Results
Reduced design cost through improved engineer productivity
Reduced time to market
Increased product performance
Lower computing acquisition and maintenance costs
“Simplifying our fluid dynamics engineering platform will increase our ability to bring solutions to market and reduce risk and cost to both BAE Systems and its customers.”
Jamil Appa
Group Leader, Technology and Engineering Services
BAE Systems
Agenda
Windows Compute Cluster Server V1
Business motivations
Customer case studies
Product overview
Networking
Top500
Key challenges
CCS V1 features
Networking roadmap
Call to action
Microsoft Compute Cluster Server
Windows Compute Cluster Server 2003 brings together the power of commodity x64 (64-bit x86) computers, the ease of use and security of Active Directory service, and the Windows operating system
Version 1 released 08/2006
CCS Key Features
Easier node deployment and administration
Task-based configuration for head and compute nodes
UI- and command line-based node management
Monitoring with Performance Monitor (Perfmon), Microsoft Operations Manager (MOM), Server Performance Advisor (SPA), and third-party tools
Extensible job scheduler
Simple job management, similar to print queue management
Third-party extensibility at job submission and/or job assignment
Submit jobs from the command line, the UI, or directly from applications
Integrated development environment
OpenMP support in Visual Studio Standard Edition
Parallel debugger in Visual Studio Professional Edition
MPI profiling tool
How CCS Works
[Architecture diagram: users submit jobs from the command line, a desktop application, or the Job Manager UI; administrators use the Admin Console or command line. The head node provides job management, resource management, cluster management, scheduling, policy, and reporting, backed by a database/file server and Active Directory (jobs run as Domain\UserA). Tasks are dispatched to compute nodes, where a node manager handles job execution (user application plus MPI) over a high-speed, low-latency interconnect.]
Agenda
Windows Compute Cluster Server V1
Business motivations
Customer case studies
Product overview
Networking
Top500
Key challenges
Features
Networking roadmap
Call to action
Stretching CCS
Project
Exercise driven by the engineering team prior to shipping CCS V1 (Spring 2006)
Venue: National Center for Supercomputing Applications (NCSA)
Goals
How big will Compute Cluster Server scale?
Where are the bottlenecks in networking, job scheduling, systems management, and imaging?
Identify changes for future versions of CCS
Document tips and tricks for big cluster deployments
Stretching CCS: Hardware
Servers
896 processors
Dell PowerEdge 1855 blades
Two single-core Intel Irwindale 3.2 GHz EM64T CPUs per blade
4 GB memory
73 GB SCSI local disk
Network
Cisco InfiniBand HCA on each compute node
Two Intel Pro/1000 GigE ports on each compute node
Cisco InfiniBand switches
Force10 GigE switches
Stretching CCS: Software
Compute nodes
CCE, CCP CTP4 (CCS released 08/06)
Head node
Windows Server 2003 Enterprise x64 Edition
SQL Server 2005 Enterprise Edition x64
ADS/DHCP server
Windows Server 2003 R2 Enterprise Edition, x86 version
ADS 1.1
DC/DNS server
Windows Server 2003 R2 Enterprise Edition, x64 version
Stretching CCS: Networking
InfiniBand: benchmark traffic
Cisco InfiniBand HCAs with OpenFabrics drivers
Two layers of Cisco InfiniBand switches
Gigabit Ethernet: management and out-of-band traffic
Intel Pro/1000 GigE ports
Two layers of Force10 GigE switches
[Topology diagram: compute nodes connect to the head node over InfiniBand and a private Ethernet network; the head node, ADS/DHCP server, and DC/DNS server are also connected to public and out-of-band Ethernet networks.]
Stretching CCS: Results
Ranked 130 of the 500 fastest computers in the world (Top500, 06/2006)
4.1 TFlops at 72% efficiency
Increased robustness of CCS
Goals reached:
Identified bottlenecks at large scale
Identified changes for future versions of CCS (V1 SP1, V2, hotfixes)
Documented tips and tricks for big cluster deployment (large-scale cluster best practices whitepaper)
Strong partnerships: NCSA and InfiniBand vendors (Cisco, Mellanox, Voltaire, QLogic), plus Intel, Dell, and Foundry Networks
Top500: more coming up
Agenda
Windows Compute Cluster Server V1
Business motivations
Customer case studies
Product overview
Networking
Top500
Key challenges
Features
Networking roadmap
Call to action
Key Networking Challenges
Each application has unique networking needs
Networking technology is often designed for micro-benchmarks rather than applications
Prototype your code to identify your application’s networking behavior, then adjust your cluster:
Cluster resource usage and parallelism behavior
Cluster architecture (e.g., single or dual processor), network hardware, and parameter settings
Data movement over the network takes server resources away from application computation
Barriers to high speed still exist at network endpoints
Managing network equipment is painful:
Network driver deployment and hardware parameter adjustments
Troubleshooting for performance and stability issues
Agenda
Windows Compute Cluster Server V1
Business motivations
Customer case studies
Product overview
Networking
Top500
Key challenges
Features
Networking roadmap
Call to action
CCS Networking Architecture
[Stack diagram: in user mode, applications call the WinSock API; the WinSock Direct provider implements the WSD SPI and bypasses the kernel-mode TCP/IP stack (WinSock, TDI, TCP/IP, IP, NDIS, drivers) to reach RDMA-capable high-speed hardware directly. Traffic classes span deployment (PXE), out-of-band management (IPMI), data and storage (CIFS, NFS, iSCSI), and computation (MSMPI, sockets, .NET).]
Networking Features Used By A Compute Cluster Server
MSMPI (CCP)
Version of the Argonne National Labs open-source MPI2
Microsoft Visual Studio® includes a parallel debugger
End-to-end security over encrypted channels
Network management (CCP)
Auto-configuration for five network topologies
Winsock API (CCE)
Inter-process communication with sockets
Winsock Direct (CCE)
Takes advantage of RDMA hardware capabilities to implement the socket protocol over RDMA:
Removes the context transition from application to kernel
Bypasses TCP
Zero memory copy
Solves the header/data split to enable application-level zero copy
Bypasses the intermediary receive data copy to the kernel
TCP Chimney Offload (CCE)
Manages the hardware doing the TCP offload:
Offloads TCP transport protocol processing
Zero memory copy
Microsoft Message Passing Interface (MSMPI)
Version of Argonne National Labs Open Source MPI2 implementation
Compatible with MPICH2 Reference Implementation
Existing applications should be compatible with Microsoft MPI
Can use low-latency, high-bandwidth interconnects
MSMPI is integrated with the job scheduler, which helps improve user security
MSMPI Security Architecture
Jobs run on the compute cluster with the user’s credentials
Uses Active Directory for single sign-on to all nodes
Provides proper access to data from all nodes
Maintains security
[Diagram: a job submitted by the user, tied to Active Directory credentials, travels from the client over the public network to the head node; compute nodes on the private network access data under the credentials of the job owner.]
Network Types
Public network
Usually the existing business/organizational network
Most users log on to it to perform work
Carries management and deployment traffic if no private or MPI network exists
Private network
Dedicated to intra-cluster communication
Carries management and deployment traffic
Carries MPI traffic if no MPI network exists
MPI network
Dedicated network, preferably high-bandwidth and low-latency
Carries parallel MPI application communication between cluster nodes
Winsock Direct And TCP Chimney
CCS v1 usage per interconnect:
Winsock Direct (sockets over RDMA): low latency, high bandwidth, bypasses TCP
InfiniBand: Yes
GbE, 10GbE: Yes
iWARP GbE, 10GbE: Yes
TCP Chimney: high bandwidth, uses TCP
InfiniBand: N/A*
GbE, 10GbE: Yes
iWARP GbE, 10GbE: N/A**
* InfiniBand doesn’t use TCP for transport
** iWARP offloads networking into hardware, so there is no need for TCP Chimney
CCS Networking Roadmap
2006
CCS v1 networking based on Windows Server 2003
MSMPI and Winsock API, both using Winsock Direct to take advantage of RDMA hardware mechanisms
2008+
Future version based on Windows Server codenamed “Longhorn”; beta in the fall
Networking mission: scale
MSMPI improvements: lower latency, better tracing, multi-threading
Network management: driver and hardware settings configuration, deployment, and tuning from a new UI; a ‘toolbox’ of scripts and tips
Networking References
Whitepaper
Performance Tuning White Paper released: http://www.microsoft.com/downloads/details.aspx?FamilyID=40cd8152-f89d-4abf-ab1c-a467e180cce4&DisplayLang=en
Winsock Direct QFEs for Windows Server 2003 networking
Install only the latest: QFEs are cumulative, so the latest supersedes the others
Latest as of 05/15/07: QFE 924286
CCS v1 SP1 released; contains the fixes of the latest QFE, 924286
Call To Action
Make 64-bit drivers for your hardware and complete WHQL certification for CCS v1
Make Windows Server “Longhorn” drivers for your hardware for CCS v2
Focus on easy-to-deploy, easy-to-manage networking hardware that integrates with CCS v2 network management
Benchmark your hardware with real applications
Dynamic Hardware Partitioning And Server Device Drivers
Server-qualified drivers must meet logo requirements related to:
Hot Add CPU
Resource Rebalance
Hot Replace (“Quiescence/Pseudo S4”)
Reasons
Dynamic Hardware Partition-capable (DHP) systems will become more common
Customers may add arbitrary devices to those systems
This is functionality all drivers should have in any case
Server-qualified drivers must pass these logo tests
DHP tests:
Hot Add CPU
Hot Add RAM
Hot Replace CPU
Hot Replace RAM
Must test with Windows Server “Longhorn” Datacenter, not Windows Vista
A 4-core, 1 GB system is required
A simulator is provided; an actual partitionable system is not required
Links
Compute Cluster Server case studies
http://www.microsoft.com/casestudies/ (search with keyword HPC)
Top500 list
http://www.top500.org/lists/2006/06
Microsoft HPC web site (evaluation copies available)
http://www.microsoft.com/hpc/
Microsoft Windows Compute Cluster Server 2003 community site
http://www.windowshpc.net/
Windows Server x64 information
http://www.microsoft.com/64bit/
http://www.microsoft.com/x64/
Windows Server System information
http://www.microsoft.com/wss/
© 2007 Microsoft Corporation. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries.
The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date
of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.