© 2025, Amazon Web Services, Inc. or its affiliates. All rights reserved.

AIM251
Accelerating AI innovation with NVIDIA GPUs on AWS

Dvij Bajpai, Principal Product Manager, AWS
Sreekar Reddy, Senior Product Manager, AWS
Ersin Yumer, Senior Director of Engineering, Adobe

Agenda
Customer Use-Cases in Gen AI
Key GPU Infrastructure Investments in 2025
Adobe Firefly on AWS

Partnering with NVIDIA for 15 Years
Deep engineering collaboration to provide the best GPU performance for customers
Timeline (2010 to 2025): CG1 (2010), P3 (2017), P5 (2023), P6e-GB200 (2025)
Introducing GPU-accelerated compute in the cloud
NVLink across GPUs to enable bigger models
UltraCluster networks to scale training to 10k+ GPUs
EC2 UltraServers for Trillion-parameter AI

“GenAI 1.0”
Large-Scale Pretraining: Parallelism techniques to train at 10k+ GPUs
Model Distillation: Smaller models to optimize inference economics
Single-Node Inference: Inference within a single GPU or instance

Simplistically

Reasoning through Chain-of-Thought
Scale Test-time vs train-time compute
Higher GPU compute and aggregate memory bandwidth for inference
Reasoning in Production
Break complex problems down and reason through them step-by-step

Prefill (compute-intensive) vs decode (memory-bandwidth-intensive)
Disaggregate inference to decouple