Hyve Solutions Confidential

Powering Next Generation Switch Architecture for AI Hyperscale Infrastructure
Michael Lane, VP, Networking, Hyve Solutions
OCP Global Summit 2025

Traditional Data Center Network Struggles
o Congestion with large AI models
o High overhead in scaling
o Inefficient GPU/CPU interconnects

AI Workloads Are Data-Intensive and Require
o High bandwidth for training
o Ultra-low latency
o Scalable fabric for multi-tenancy

Why Next-Gen Networking for AI
1. Bandwidth: Multi-terabit per rack
2. Latency: Sub-microsecond within racks, low latency between racks
3. Telemetry: Real-time monitoring and dynamic traffic optimization
4. Security: Zero-trust for distributed AI workloads
5. Topology: Clos and Dragonfly networks for scalability

AI Cluster Networking Requirements
Five Key Factors for Successfully Building a Network for an AI Cluster

Topology: Clos vs. Dragonfly networks for scalability
[Figure labels: global link, router, node, group, local link; Core, Aggregation, Edge]

Emerging Trends
o Software Defined Networking (SDN): Managing traffic dynamically
o Massive Scale-Out Traffic: For exascale AI clusters
o Infrastructure Alignment: DLC-based network switches
o Edge Networking: Extending network infrastructure to the edge

Considerations
Challenge    Mitigation
Cost Optimization
Interoperability
Scalability
Cabling Infrastructure
Thermals
Align D