Co-Designing for Scale: CXL-Based Memory Solution for Data-Centric Workloads

Gaurav Agarwal, Distinguished Engineer, Marvell
Anil Godbole, Sr Datacenter Marketing Manager, Intel
Seema Mehta, Product Management, Ampere Computing
Jianping Jiang, SVP Business and Product, XConn Technologies
Xinjun Yang, VP, Alibaba

SERVER: COMPOSABLE MEMORY SYSTEMS (CMS)

Scaled Dataset Trends And Challenges

Data Centric Workloads: Generative AI, Rec. Systems, Databases, Caches, Analytics

  Exponential growth of LLM model sizes
  Million+ tokens LLM context window
  High-dimensional vector data
  Poor utilization of accelerator due to memory bound operations
  Escalating power requirements
  High cost of serving
  Tightly integrated designs aren't suitable for diverse workloads

[Chart: Estimated Global Data Center Capacity Demand (Gigawatts). Source: McK]

Flexible and composable open systems are a must for efficient and scalable solutions.

Traditional Memory Tiers

  Tier-1: xPUs w/ Integrated HBM
    Extremely limited capacity
    Fallback to CPU attached DRAM
  Tier-2: CPUs w/ DDR DRAM
    Limited memory BW and capacity
    Oversubscribed by multiple consumers
  Tier-3: Storage
    Large capacity low performance tier

  Diagram labels: Overflow, Swap

Disaggregated Memory w/ CXL

  Tier-1: xPUs w/ Integrated HBM
    Extremely limited capacity
    Fallback to CPU attached DRAM
  Tier-2: CPUs w/ DDR DRAM
  Scalable Large Memory Tier
    Dedicated assignments
    Predictable performance
  Tier-3: Storage
    Large capacity low performance tier

  Diagram labels: Overflow, CXL Fabric, Capacity Expansion, Swap

Scalable Composability w/ CXL Accelerators

Near Memory Compute Acceleration: unlock internal DRAM BW, hide CXL latency

  Diagram labels: CPU; 200 GiB/s Memory BW (shown twice); 64 GiB/s (shown twice)

                            Host    CXL Accelerators    Overall System
  Total Cores Count         128     128                 256
  Aggregate DRAM BW (GB/s)  800     1600                2400
  Memory BW/Core
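
The per-core bandwidth implied by the core counts and aggregate DRAM bandwidth figures in the table can be worked out with simple division. The sketch below is illustrative only: the function name `bw_per_core` and the `SYSTEMS` dictionary are hypothetical, and the results are derived from the two rows quoted above rather than taken from the slide's own "Memory BW/Core" row, which may round or present the values differently.

```python
# Figures copied from the table: (total cores, aggregate DRAM BW in GB/s).
# The dictionary layout and names are assumptions made for this example.
SYSTEMS = {
    "Host": (128, 800),
    "CXL Accelerators": (128, 1600),
    "Overall System": (256, 2400),
}


def bw_per_core(cores: int, aggregate_bw_gb_s: float) -> float:
    """Return aggregate DRAM bandwidth divided evenly across cores (GB/s per core)."""
    if cores <= 0:
        raise ValueError("cores must be positive")
    return aggregate_bw_gb_s / cores


# Derived values, assuming bandwidth is shared evenly by all cores.
per_core = {name: bw_per_core(cores, bw) for name, (cores, bw) in SYSTEMS.items()}

if __name__ == "__main__":
    for name, value in per_core.items():
        print(f"{name:<18}{value:>8.3f} GB/s per core")
```

Under the even-sharing assumption, the cores on the CXL accelerators see twice the per-core DRAM bandwidth of the host cores, which is consistent with the slide's point about unlocking internal DRAM bandwidth near memory.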