Intel Knights Landing: performance and usability assessment for scientific communities
E. Boyer, G. Hautreux (GENCI) - Technological watch group, GENCI-CEA-CNRS-INRIA and French Universities
IXPUG BoF ISC17
• In charge of the national HPC strategy for civil research
  • More than 6.5 Pflops available on 3 national centers (CINES, IDRIS and TGCC)
• Partnerships at the regional level
  • Equip@meso, 15 partners
• Represents France in the PRACE research infrastructure
• Promotes the use of supercomputing for the benefit of French scientific communities and for industries
  • Specific action towards SMEs through the Simseo initiative
Frioul: KNL based prototype
• Atos Sequana cell @ CINES, Montpellier (France)
  • 3 partitions across 48 KNL 7250 nodes, 146 Tflops peak
    • 16 Quadrant cluster + Cache memory nodes
    • 16 Quadrant cluster + Flat memory nodes
    • 16 Quadrant cluster + Hybrid memory nodes
• Quadrant mode used for both partitions
  • Poor results in other processor modes
• BIOS versions should be updated
  • Work in progress with Atos
• High level support
  • Thanks to the Atos and Intel teams
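The 146 Tflops peak figure can be cross-checked with a short calculation. This is only a sketch: the per-node parameters (68 cores, 1.4 GHz base clock, 2 AVX-512 units per core with fused multiply-add) are the public Xeon Phi 7250 figures and are assumptions here, since the slides give only the node count and the total.

```python
# Theoretical double-precision peak of the prototype.
# Assumed per-node figures (public Xeon Phi 7250 specs, not from the slides):
# 68 cores, 1.4 GHz base clock, 2 AVX-512 units per core,
# each unit processing 8 double-precision lanes x 2 flops (FMA) per cycle.
NODES = 48
CORES_PER_NODE = 68
CLOCK_GHZ = 1.4
FLOPS_PER_CYCLE_PER_CORE = 2 * 8 * 2


def peak_tflops(nodes, cores, clock_ghz, flops_per_cycle):
    """Return the theoretical peak in Tflops."""
    gflops = nodes * cores * clock_ghz * flops_per_cycle
    return gflops / 1000.0


if __name__ == "__main__":
    total = peak_tflops(NODES, CORES_PER_NODE, CLOCK_GHZ, FLOPS_PER_CYCLE_PER_CORE)
    print(f"{total:.1f} Tflops")
```

Under these assumptions the result rounds to the 146 Tflops quoted on the slide, which suggests the figure is a base-clock theoretical peak for all 48 nodes.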
• The comparison is made between
  • Frioul (Xeon Phi, Knights Landing, EDR interconnect)
  • Occigen (Broadwell node: [email protected], FDR interconnect)
▪ The overall performance for those applications at the moment
  ▪ Node to node comparison
▪ Energy efficiency is an estimation using the TDP ratio
→ Results for 14 applications
→ Assessment of memory modes, vectorization impact, …
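One plausible reading of "estimation using the TDP ratio" is that the node to node speed-up is scaled by the ratio of the two nodes' TDPs. The sketch below assumes that reading; the slides do not give the exact formula. The function name and the reference-node TDP are hypothetical, and the 215 W value is the public Xeon Phi 7250 TDP rather than a number from the slides.

```python
def energy_efficiency_ratio(speedup, tdp_reference_w, tdp_candidate_w):
    """Estimate relative energy efficiency from a speed-up and a TDP ratio.

    speedup: reference run time / candidate run time, node to node.
    A result above 1 means the candidate node does more work per watt of TDP.
    """
    if speedup <= 0 or tdp_reference_w <= 0 or tdp_candidate_w <= 0:
        raise ValueError("speedup and TDP values must be positive")
    return speedup * (tdp_reference_w / tdp_candidate_w)


# Illustrative inputs only, not measurements from the study.
KNL_TDP_W = 215.0             # public Xeon Phi 7250 TDP (assumption)
REFERENCE_NODE_TDP_W = 270.0  # hypothetical dual-socket reference node

if __name__ == "__main__":
    # With equal run times, the estimate reduces to the plain TDP ratio.
    print(energy_efficiency_ratio(1.0, REFERENCE_NODE_TDP_W, KNL_TDP_W))
```

TDP is a design limit, not measured consumption, so this stays an estimate in the same sense as on the slide.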
“Real” applications results
KNL memory modes
▪ Impact of MCDRAM
  ▪ Applications ran in « full flat » and « cache » mode
  ▪ Results provided are for the best test case (favorable to flat mode)
▪ No real impact of using MCDRAM for our applications
  ▪ Test cases do not fit in MCDRAM, real applications are using memory
  ▪ Lots of indirections, leading to latency-bound behavior
▪ Currently testing the hybrid mode, it could be the solution
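To make the flat versus cache distinction concrete, here is a minimal sketch of how a launcher could pick a `numactl` policy. It is not the setup used on Frioul. It assumes that in quadrant + flat mode MCDRAM is exposed as NUMA node 1 (the usual layout, to be checked with `numactl -H`), that the node has 16 GiB of MCDRAM (public Xeon Phi 7250 figure), and the helper name and `./my_app` binary are hypothetical.

```python
import shlex

MCDRAM_GIB = 16        # assumption: public Xeon Phi 7250 MCDRAM capacity
MCDRAM_NUMA_NODE = 1   # assumption: usual node id in quadrant + flat mode


def launch_command(app_argv, footprint_gib, memory_mode):
    """Return the argv used to start an application for a given memory mode."""
    if memory_mode == "cache":
        # MCDRAM acts as a transparent cache in front of DDR: nothing to bind.
        return list(app_argv)
    if memory_mode == "flat":
        # --membind makes allocations fail once MCDRAM is full, so it only
        # suits test cases that fit; --preferred falls back to DDR instead.
        policy = "--membind" if footprint_gib <= MCDRAM_GIB else "--preferred"
        return ["numactl", f"{policy}={MCDRAM_NUMA_NODE}", *app_argv]
    raise ValueError(f"unsupported memory mode: {memory_mode!r}")


if __name__ == "__main__":
    # Hypothetical 40 GiB test case that does not fit in MCDRAM.
    print(shlex.join(launch_command(["./my_app"], 40, "flat")))
```

When the test case does not fit, as reported above, only part of the data lands in MCDRAM under `--preferred`, which is consistent with flat mode bringing little benefit for these applications.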
First conclusions
▪ Ease of use
  ▪ No trouble with portability
  ▪ Good speed-up in a few days' work
▪ Energy efficiency
▪ Still some trouble with scalability
  ▪ Performance of IB drivers? IB parameters? BIOS fixes?
▪ A choice concerning the memory can be made
  ▪ Cache for large infrastructures with a wide variety of users
  ▪ A Flat vs Cache comparison should be done if you only have a few users
▪ Full opening of the platform in April
  ▪ French community at first
  ▪ Help us get better feedback on the platform
  ▪ More applications and a new focus on DL applications
▪ A Knights Mill platform may be considered
[Figure: Effective mean speed up obtained after 2 workshops (Haswell vs KNL). Y axis: speed up (0 to 2). X axis: working time in hours (1 to 64). Legend entries: Performance vs Haswell 24c; Energy efficiency vs Haswell 24c.]