AIM424
Scaling foundation model inference on Amazon SageMaker AI

Vivek Gangasani, Principal GenAI Specialist Architect, Amazon Web Services
Kareem Syed-Mohammed, Principal Product Manager-Tech, Amazon Web Services
Richard Chang, Software Architect-AI/ML, Salesforce

© 2025, Amazon Web Services, Inc. or its affiliates. All rights reserved.

Agenda
  Trends in 2025
  Deploying models on Amazon SageMaker AI
  Optimizing for price performance
  Flexibility and ease of use
  Agentic AI architecture
  Salesforce Agentforce

Trends in 2025

Agentic AI will help realize the true promise of LLMs

GenAI
  Primary purpose: Content generation
  Dependency: Prompts and input data
  Agency level: None (requires more human prompting)
  Ability to execute workflows: None/low

Agentic AI
  Primary purpose: Goal achievement
  Dependency: Access to tools, data, and agents
  Agency level: Higher (less to no human oversight)
  Ability to execute workflows: Medium/high

Rise of test-time compute

[Figure: inferencing compute demands across zero-shot prompting, RAG, reasoning models, and agentic systems with reasoning]

  1. The best performing LLMs utilize chain-of-thought (CoT) reasoning
  2. CoT significantly increases inferencing compute needs
  3. Agentic systems utilize reasoning models, amplifying test-time compute requirements

Bottom line: inferencing compute resource demand is increasing rapidly

Sources: Gartner, "Top Strategic Technology Trends for 2025," October 2024; Gartner, "Top Strategic Technology Trends: Agentic AI - the Evolution of Experience," February 2025

33% of enterprise software apps will include agentic AI by 2028, up from less than 1% in 2024
15% of day-to-da