© 2025, Amazon Web Services, Inc. or its affiliates. All rights reserved.

AIM371
Build, fine-tune & deploy AI models with SageMaker HyperPod CLI & SDK

Giuseppe A. Porcelli (he/him)
Principal ML Solutions Architect, Amazon Web Services

Arun Nagarajan (he/him)
Principal Software Engineer, Amazon Web Services

Agenda
  Introduction to Amazon SageMaker HyperPod
  Amazon SageMaker HyperPod CLI and SDK
  Getting started
  Training AI models
  Deploying AI models
  Optimizing resource utilization
  IDEs and notebooks on HyperPod
  Resources

Introduction to Amazon SageMaker HyperPod

Amazon SageMaker HyperPod
Scale and accelerate generative AI model development across thousands of AI accelerators
  Improved efficiency: Tools for maximizing compute resources utilization, advanced observability, and seamless cluster customization
  Reduced time-to-train: Resilience features and distributed training libraries help reduce time to train by up to 40%
  Lower costs: Less time invested in hardware maintenance and more efficient cluster engagement reduces FM training TCO

HyperPod benefits
Scalable
  Single-spine node topology
  Pre-configured EFA for optimal inter-node communication speeds
  Flexible paths to securing compute capacity
  Rapid cluster scale-up without performance degradation
Resilient
  Proactively screen health of inbound nodes
  Continuous cluster hardware monitoring
  Automated repair and job resumption
  Spare capacity dedicated to self-healing
Efficien