Photonics in Supercomputer Systems for HPC & AI
Duncan Roweth, HPE Fellow & Slingshot Chief Architect
Hewlett Packard Enterprise

Special Focus Photonics
HPE Labs has a long history of developing photonics technology
I am from the HPC & AI business
We develop systems (HPE Cray EX), network ASICs such as Slingshot, and software
We are a consumer of optical networking, a customer
What am I going to talk about?
  Background on today's supercomputer systems
  Role of photonics in future systems
  Challenges
  Call to action
Overview
Cray developed proprietary interconnects for many years
We shipped our first Ethernet-based HPC network in 2018
Easier to add the function we needed to Ethernet
Benefits outweighed the (small) costs
Slingshot network
  Ethernet physical layer
  Ethernet link layer with negotiated enhancements
  Forwards Ethernet, routes IP, and a proprietary reliable transport (ST)
Open network API, our libfabric provider is on GitHub
On track to standardization through Ultra Ethernet Consortium (UEC)
Deployed in direct liquid cooled Cray EX infrastructure or standard racks
HPC interconnects are (slowly) becoming normal
Some success with this approach: #1, #2, #3, #5, #7, #8, #10
There is none, why?
Short version: cost and manufacturability
  An exascale class system: 30,000-100,000 NICs, 3,000-5,000 switches, 100,000-250,000 links
  Network design is optimized for global bandwidth/$ (not absolute performance)
  Optical links cost 10X that of electrical
Dragonfly network
  One long link per pair of NICs
  Electrical links within cabinet
  Optical links between cabinets
Use of photonics today
[Figure: dragonfly topology. Group 0 through Group 288, each with switches S0 to S15 and 288 global links; all-to-all amongst groups; NICs labelled N0 through N36,991.]
ASIC designs for 100 Gbps/lane and 200 Gbps/lane are finished
They us