Overcoming The AI Memory Wall: Why AI Agents Need a New Foundation
Val Bercovici, Chief AI Officer

The Rise of Agent Swarms

2025 Paradox: Upside-Down Tokenomics
[Chart: Token Cost vs Token Volume]
$3B
$4B (1-QRR)
$550M ARR
Pricing Ripple Effect

Agentic AI is Hitting Hard Limits

The AI Memory Wall
[Diagram labels: GPU Bound, Model Parameters, Memory Capacity, Memory Bound; Prefill, Decode; Input, GPU, Prefill, Memory, Decode, Output]

Key Metric: KV Cache Hit Rate

Scaling The Memory Wall
Token Warehousing
[Diagram labels: GPU, Prefill]

The Augmented Memory Revolution
[Diagram labels: Input, GPU, Prefill, followed by repeated Memory, Decode, Output stages; "1 Prefill", "Decode"]

Insights from Our Labs
4X More Users & Agent Sessions Per GPU

Real-World Agent Inference Performance
[Chart: Time to First Token (TTFT) and Output T/s vs Working Set Size; series: WEKA TTFT, DRAM TTFT, WEKA Output T/s, DRAM Output T/s; axis labels: Tokens Per Second, TTFT]

What it Takes to Win More Tokens
Track KV Cache Hit Rate
Prefill Once, Decode Forever
Leverage GPU, Network, Memory
Abundant Quality & Safety Tokens (Every Agent Step)

Profitable AI Requires Overcoming The Memory Wall
Learn How to Maximize Your AI Token Production

THANKS FOR YOUR TIME