AIM366
Optimize AI infrastructure performance with SageMaker HyperPod

Siamak Nariman
Sr. Product Manager Technical, Amazon SageMaker
Amazon Web Services

Ankit Anand
Principal Business Development, Amazon SageMaker
Amazon Web Services

2025, Amazon Web Services, Inc. or its affiliates. All rights reserved.

AGENDA
1. Generative AI model development trends and challenges
2. Key considerations & challenges for model training
3. Introducing Amazon SageMaker HyperPod
4. HyperPod task governance: key capabilities and demo

Foundation model compute demand outpaces development efficiency innovations
Data source: Epoch (2025)

Efficiency innovations:
LoRA and other parameter-efficient fine-tuning techniques
Flash Attention 2.0 for optimized GPU memory access
Mixture of Experts architecture for efficient layer activation
Disaggregated serving for higher per-GPU throughput
GRPO for streamlined reinforcement learning

[Chart: MARKET ADOPTION MILESTONES. Time axis ticks: Oct 20, 2022; May 5, 2023; Nov 21, 2023; Jun 8, 2024; Dec 25, 2024; Jul 13, 2025. Compute axis ticks: 100 million, 1 billion, 10 billion, 100 billion. Annotation: disclosure required at 100 billion petaFLOP under the Executive Order. Models plotted: Grok-3, Claude 3.7 Sonnet, DeepSeek-R1, DeepSeek-V3, Amazon Nova Pro, GPT-4o, Grok-1, Llama 2-70B.]

Model customization: Progression of options
LEVEL OF CUSTOMIZATION / LEVEL OF EFFORT

OPTIMIZING AN FM
Prompt engineering (e.g., few-shot prompting, chain-of-thought prompting)
Retrieval augmented generation (RAG)

CUSTOMIZING A MODEL
POST-TRAINING:
Fine-tuning (SFT)
Model distillation
Direct preference optimization (DPO)
Reinforcement learning (RL)
PRE-TRAINING:
Continued pre-training (CPT)

High-performance compute is scarce and difficult to scale
Hardware failures disrupt and extend tr