Technical paths to the new era of GPU-initiated storage
CJ Newburn, Distinguished Engineer, NVIDIA GPU Cloud
Vikram Mailthody, Senior Researcher, NVIDIA Research
STORAGE

App taxonomy: bottlenecks, dynamism, granularity
Designing storage solutions hinges on understanding app requirements
Traditional focus for GPUs has been compute-intensive apps like training
The explosion of innovation for VecDB and predictive AI drives new technologies to fill the gaps
[Taxonomy diagram]
All apps
Bottleneck: Compute-intensive | Data-intensive
Storage access: CPU-initiated (predictable, coarse-grained) | GPU-initiated (dynamic, fine-grained)
VecDB search/index, predictive AI, relational graphs
Inference model load, inference KV$, multi-modal small-model training
LLM training

Styles of compute node storage interaction
Compute-intensive apps | Data-intensive apps
LLM training/inference | GNNs, vector DB, relational graphs
Generative AI | Predictive AI, search
Working sets fit in memory | Working sets spill to storage
Low bandwidth, not perf critical | High bandwidth, perf critical
cuFile/cuObject, POSIX, S3 | SCADA
Coarse grained, standard NVMe | Fine grained, customize front end
NAND: large pages, low IOPS | NAND: many dies*planes, BCH, high IOPS
TB/TCO | IOPS/TCO
CPU, file/object | GPU with cache, item

Getting ahead of the trend
Anticipating the needs of emerging usage models while sustaining core volume for legacy
Break out of artificial memory constraints
But then storage needs to keep up with memory's sparse IOPS
Category | Disruptive trend | What we want to do
Size relative to memory | 10 TB to 1 PB | Access storage like memory with an API
Concurrency | O(100)/CPU to O(100K)/GPU thread accesses | New programming model: accesses from the GPU
Access pattern | Sparse, random on vector data | New storage SKUs optimized for sparse IOPS/TCO
Granularity | Coarse to fine
Programming model | GPU autonomy | GPU-initiated, fine-grained (SCADA)
Disaggregation | Can't fit TB