Photonic Interconnect for Next-Generation AI Systems
Benjamin Lee, NVIDIA
SPECIAL FOCUS: PHOTONICS

GPUs Unlock the AI Revolution
“Today, we're at the cusp of a major shift in computing. The intersection of AI and accelerated computing is set to redefine the future.” Jensen Huang
Ingredients for AI
Large data sets
Algorithms
Efficient compute

AI models and AI data sets are large
AI model parameter sizes have grown 70,000X in a decade
Parallelized across 4 dimensions (data, pipeline, tensor, and expert)
The number of GPUs used for training and inference of state-of-the-art generative AI models can be in the 10,000s to 100,000s
[Tirumala & Wong, HotChips 2024]

Single-chip Inference Performance
[Figure: chart with GPU labels H100, A100, Q8000, K20X, M40, P100, V100; annotations: 1000X in 10 years; 2 years]
[B. Dally, HotChips 2023]
[J. Huang, GTC 2024]

FULL DATA CENTER WITH 32,000 GPUs
AI FACTORY FOR THE NEW INDUSTRIAL REVOLUTION
645 exaFLOPS of AI performance
13 PB of fast memory
58 PB/s of aggregate NVLink bandwidth
16.4 petaFLOPS of In-Network Computing

What is Needed from the Networks to Power Future Generations?
Cost-effective and energy-efficient bandwidth scaling for:
both scale-out and scale-up networks
both switch I/O and GPU I/O
both switches and cables
both electrical and optical technologies

Switch Scaling
Public data from commercial switch ASICs from a variety of vendors over the past 20 years: 2X every 2 years
Energy per bit has decreased due in part to CMOS scaling, but not fast enough to keep power from increasing.
This is expected to get worse as CMOS scaling slows.
I/O power is growing disproportionately relative to core power consumption.
Need a low-power I/O solution that can be adopted for both switches and GPUs.
All bandwidths are per direction

Switch Scaling
Public data from commercial switch ASICs from a variety of