Memory technology optimized for at-scale AI systems

Siamak Tavallaei, Sr. Principal Engineer, Samsung Semiconductor, Inc.
Millind Mittal, Founder, MemWize.AI

SERVER: COMPOSABLE MEMORY SYSTEMS (CMS)


Outline

  Levels of memory tiers in AI infrastructure
  Memory growth drivers and mapping of workloads to memory tiers
  Example SW frameworks for AI LLM inference
  A candidate cluster architecture for memory scaling
  Considerations and role of optics in addressing the memory scaling challenge


Baseline Server Node

  Memory tiers (from high-BW, low-latency to larger capacity to networked bulk capacity):

    T0   SRAM/Cache
    T1   CPU-Mem
    T2   Local Node Storage
    T3   Storage on DC Network

  Figure legend:

    M: Local DDRx Memory
    C: CPU
    S: NVMe/PCIe SSD Storage
    N: NIC


AI Infra Memory Tiers

    T0             SRAM/Cache
    T1             GPU-HBM
    T2 (T2+)       CPU-Mem (+CXL)
    T3-SO          Storage on SO
    T1-SU          GPU-HBM-SU
    T2 (T2+)-SU    CPU-Mem-SU (+CXL)
    T3             Storage
    T3-SU          Storage on SU
    T4             Storage on DC Network


Server-centric Memory Tier Pyramid vs. AI Infra Memory Tier Pyramids

  Scale-up (SU) and Scale-out (SO) Fabrics

  Baseline Server Node

    T0   SRAM/Cache
    T1   CPU-Mem
    T2   Local Node Storage
    T3   Storage on DC Network

  AI Infrastructure: Server Node with memory expansion

    T0       SRAM/Cache
    T1       CPU-Mem
    T2-CXL   CPU-CXL Fabric Memory
    T1+      CPU-CXL Mem
    T4       Storage on DC Network
    T3       Local Node Storage
    T2-SO    Memory SO

  Remote Memory T1(+) SO/DC


Growing Memory Capacity Needs

  Reasoning and multi-modal models, and Agentic AI, are driving accelerated growth in memory capacity and bandwidth.

  Multi-fold increase in active KV contexts
  Longer-lived contexts: multi-turn, shared contexts (e.g. code generation)
  Growing knowledge DBs and databases of past conversations
  Growing model sizes
  Multi-terabyte capacity for embeddings for recommendation models
  Growing size of memory-resident models: Mixture-of-experts, Collection-of-experts, ...


Mapping AI use cases to Memory Tiers

  Scratch pad for computation units
  Model wei