The Journey of CXL-PNM Research: Architecture Evolution and Software Environment
Kwangsik Shin

The Data-Centric Bottleneck
- Compute scaled exponentially; memory performance failed to keep up.
- Explosive growth of AI and data is outpacing traditional computer architecture.
- Memory wall: data movement, not compute, is the bottleneck.
Source: Google, "What's Next for the Foundations of AI", AI Infra Summit, Sep. 10, 2025.

PNM Explained
PNM (Processing-Near-Memory) design principles:
- Minimize data movement (bandwidth wall)
- Leverage memory parallelism (multi-channel)
- Maximize memory bandwidth utilization (streaming)
- Match compute to memory access patterns (sequential)
CXL-attached approaches:
- Compute behind CXL endpoints
- Memory capacity expansion
[Figure: Host, DDR, and PNM, with numbered steps 1 to 3]

CXL-PNM Architecture Taxonomy
Three Architectural Approaches with Trade-offs
Comparison Table
Architecture    Implementation       Performance    Flexibility    Dev Risk
Logic-only      Fixed
Core-only       ARM/RISC-V only
Hybrid          Fixed + ARM Cores

CMM-Ax Architecture: From Logic-Only to Hybrid
Logic-only prototype (initial CMM-Ax): demonstrated huge speedups and energy savings by co-locating compute and data.
Limitations: only supported a narrow set of pre-defined operators; no control for new algorithms.
Hybrid design (current CMM-Ax): adds ARM cores to run a device-side runtime with flexible microcode, plus streaming data-paths.
Programmable offloads without sacrificing DRAM burst efficiency.
[Block diagram labels: CXL Endpoint; Device Memory; DDR (x4); MC (x4); Interconnect; PFL IPs; VA-PA Trans.; arm; MAC; ACC; CMP; Data Path; Control Path; PNM Engine; .mem; .io & Command; Data; Command]

Software Platform, Ready for Service-Level Evaluation
Three-Layer Software Architecture
1. Application Layer: FAISS Adaptor, Vector Search Lib.
2. Programming Model: User Defined Operations
3. System In