Open, Heterogeneous Infrastructure for Sustainable Agentic AI

Anil Nanduri
Vice President, Go-to-Market & Product Management, AI Accelerators
Intel

Commitment to sustainable, flexible, and scalable tech infrastructure
Choice: Breaking free from proprietary constraints
100 Contributions: From OCP-NIC to DC-MHS to open systems for AI

[Figure: four panels]
Context Window Reaches Milestone: context window sizes on an axis of 8K, 16K, 32K, 128K, 1M, 2M+ (250x). Labeled models: Mistral 0.1 (2023), Mistral 0.2 (2023), GPT 3.5 (2023), GPT 4o (2024), Claude S (2026), Gemini 3 Pro (2026).
Explosive Token Growth: Google tokens processed per day, 9.7 Tr to 1300 Tr (130x). Axis labels: May 2024, Oct 2025.
Rapid Growth in Capability: % accuracy on Humanity's Last Exam* (axis 10 to 50), Nov 2024 to Mar 2026. Labeled models: Gemini 3.1 Pro, Gemini 3 Pro, GPT-5, Grok 4, Gemini 2.5 Pro, GPT-o3, Gemini 2.5 Pro-Exp, GPT-o3-mini, DS-R1, GPT-o1, GPT-4o.
Inference Dominates AI Workloads: inference reaches 80% of AI workloads by 2030.

*Source: Context window from the respective model labs' company websites. Token growth from Google I/O (May 2025) and Q3 earnings. Capability growth from Humanity's Last Exam. AI inference as % of AI workloads from Morgan Stanley (Mar 2026) and Bank of America (Feb 2026) reports. Visualization for illustrative purposes only.

Software
Proprietary/vendor lock-in
Orchestration across heterogeneous hardware
Multi-stage, multimodal AI serving

Systems
Power consumption and cooling
Memory & BW constraints
Supply, component & system costs

Hardware
Distinct compute requirements for different inference workload phases
Larger CPU role
Networking limits

Heterogeneous systems
Inference & agents
Agentic systems
Higher memory capacity & BW
Long-context economics & low turn latency
Future
Workload-aware scheduling
Right silicon for the right job
AI serv