Scaling AI: The Case for 100T Switching
Matt Roman, Senior Director PLM, Celestica
2025 OCP Global Summit, Expo Hall Session
Tue, October 14, 12:27pm-12:42pm, SJCC, Concourse Level, Expo Hall Theater

Celestica: A Global Leader in Innovative End-to-End Product Lifecycle Solutions
$9.65 billion in 2024 revenue
40+ locations in 16 countries, headquartered in North America
27,000+ employees worldwide
100+ customers across multiple markets
Focused on enabling the world's leading technology brands
Tailoring customer-centric solutions for the markets we serve
Operating a global network of sites with specialized Centers of Excellence

Session Abstract
The exponential growth of AI clusters, with thousands of interconnected GPUs, demands a fundamental shift in network architecture. Traditional scale-out designs struggle with the high-bandwidth, low-latency requirements of collective communication. This presentation explores the critical role of 100T scale-up switching in meeting these challenges. We'll analyze how cutting-edge switch ASICs, leveraging high-speed 100G/200G SerDes, enable a single-stage, non-blocking fabric. This approach dramatically increases intra-cluster bandwidth and reduces latency by eliminating intermediate network hops. The talk will provide a technical overview of this architecture, detailing its benefits for memory-semantic communication and coordinated computing. By adopting a scale-up fabric, we can unlock the full potential of AI supercomputers, ensuring future-proof performance for the most demanding workloads.

Key Challenges in Deploying Large Scale AI Clusters
SerDes Speeds: 100G to 200G
Cooling: Air to Liquid
GPU Cadence: 12 to 18 Months
Optics: 800G to 1.6T
Single stage, non-blocki