Shaping the Future of Open Infrastructure for AI.pdf

Uploaded by: 明****

ID: 1011815

2025-12-21

PDF, 18 pages, 4.16 MB


Ian Buck, VP of HPC and Hyperscale, NVIDIA
Shaping the Future of Open Infrastructure for AI

Giga-Scale AI is Transforming Data Centers
Driving extreme co-design from chip to grid with open collaboration
NVIDIA Giga-Scale Reference Designs: Power, Cooling, Networking, Compute, Mechanical
Diagram labels: Scale-Up, Spectrum-X Ethernet, Open Collaboration, CPX, Power Smoothing, 45C Liquid Cooling, MGX

Blackwell Optimizations Achieve 5X Throughput in 2 Months
Multi-fold reduction in token costs
Chart labels: GPT-OSS Launch, InferenceMAX, TensorRT-LLM + Spec Decode, Aug 2025; GPT-OSS Launch, Today, Cost per Million Tokens
Chart axes: Throughput (TPS per GPU) vs. Interactivity (TPS per User), GPT-OSS-120B
Callouts: $0.11, $0.02, 5X; 1000, 30,000 TPS/GPU; 5x Throughput in 2 months

Extreme Hardware-Software Co-Design for Inference Performance
$5M GB200 NVL72 investment can generate $75M token revenue
Chart: Measured DeepSeek-R1; Throughput (TPS per GPU) vs. Interactivity (TPS per User); H200 vs. GB200; 15x
Optimization labels: NVL72, FP4, Dynamo, TRT-LLM, TRT Model Optimizer, CUDA Graphs
AI Factory ROI: $75M Revenue, $5M Cost
Legend: H200 NVL8, GB200 NVL72; Non-GPU Costs, GPU Costs, Profit
Revenue estimates assume 3-year operation on 72 GPUs at 50 TPS/User with DeepSeek R1 and $1.45/M token cost, based on InferenceMAX results and SemiAnalysis TCO model; actual ROI may vary.

Inference Complexity is Exploding
More parameters, experts, reasoning, kernels & shapes, and context
Models: DS-R1, GPT OSS, Kimi K2, Llama4, Qwen3, Cosmos, Gemini, LTM-2-mini, Sora2
Chart: Inference Complexity over time (2018, 2023, 2024, 2025); stages: BERT, Llama3; Dense Transformers, Dense LLMs, Mixture of Experts, Massive Context (video generation, software application development)
Callouts: 1 Expert; 10K Kernels, Shapes; 300+ Experts; 10M Kernels, Shapes; 1M+ Context Tokens (2,000x vs. BERT)

Next Generation Vera Rubin for Giga-Scale AI
OCP MGX compatible infrastructure
Labels: Vera Rubin NVL144, Vera Rubin CPX; Compute, Memory, Bandwidth, NVL
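The AI Factory ROI footnote gives enough numbers ($75M revenue, $1.45 per million tokens, 72 GPUs, 3 years) to back out the sustained throughput the estimate implies. The sketch below is a rough check, not the SemiAnalysis TCO model: it assumes 365-day years, continuous operation with no downtime, and that $1.45/M is the price at which tokens are sold. The function and variable names are illustrative only.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600  # assumption: 365-day years, 100% uptime


def implied_tps_per_gpu(revenue_usd, usd_per_million_tokens, gpus, years):
    """Sustained tokens/s per GPU needed to reach a given token revenue."""
    total_tokens = revenue_usd / usd_per_million_tokens * 1_000_000
    total_seconds = years * SECONDS_PER_YEAR
    return total_tokens / total_seconds / gpus


# Figures quoted in the slide footnote.
tps = implied_tps_per_gpu(
    revenue_usd=75e6,
    usd_per_million_tokens=1.45,  # assumed to be the selling price per 1M tokens
    gpus=72,
    years=3,
)
roi_multiple = 75e6 / 5e6  # quoted revenue divided by quoted investment

print(f"implied throughput: {tps:,.0f} tokens/s per GPU")
print(f"revenue / investment: {roi_multiple:.0f}x")
```

Under these assumptions the estimate corresponds to roughly 7,600 tokens/s per GPU sustained over the full three years, and a revenue-to-investment ratio of 15. Lower utilization or a lower token price would scale the revenue down proportionally.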
