Beyond SSD: SK hynix AIN Family Redefines Storage as the Core Enabler of AI at Scale (published by SK hynix), PDF, 16 pages.
Redefining Memory Storage as the Core Enabler of AI at Scale
CS (Chunsung) Kim, SVP of eSSD PD, SK hynix

[Figure: output tokens (million tokens) per model, traditional LLM models versus reasoning models. Models shown: Nova Premier, GPT-4o, Llama 4 Maverick, DeepSeek V3, Claude 4 Opus, Llama Nemotron, o4-mini, Qwen3 235B, Gemini 2.5 Flash, Gemini 2.5 Pro, DeepSeek R1, Grok 4. Output tokens are answering plus reasoning tokens. Source: Artificial Analysis.]
Not just compute speed: token throughput matters too.

[Figure: TPS for 1 user versus total TPS/MW.]
1. Improving token throughput per QoS with DRAM and SSD
2. Maximizing token throughput per QoS with next AI solutions

[Figure: RAG over GNN. Labels: vector embedding, vector DB search, model inference, prompt augmentation, multi-user queries, output delivery, preprocessing, query embedding, user query + retrieved context, prefill, decode. GPU HBM holds model parameters, others, and the KV cache (KV$). Open question marked on the slide: KV$?]

[Figure: prefill and decode. Flow: prompt, tokenization, LLM (weight matrix), KV cache, LLM (weight matrix), detokenization, output. Prefill is compute intensive; decode is memory intensive. Timeline: user request, prompt, TTFT, TPOT, response, response completed; the whole span is the LLM response time.]
TTFT: Time to First Token. TPOT: Time per Output Token.
300x more tokens than today.
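The TTFT and TPOT metrics on the prefill/decode slide can be combined into a rough end-to-end response time. The sketch below assumes the commonly used approximation that the first token costs TTFT and every later output token costs one TPOT; that formula, the function names, and all numeric values are illustrative assumptions and do not come from the slides.

```python
def response_time_s(ttft_s: float, tpot_s: float, output_tokens: int) -> float:
    """Approximate LLM response time in seconds.

    Assumption (not from the slides): the first token arrives after TTFT
    (prefill), and each remaining token takes one TPOT (decode).
    """
    if output_tokens < 1:
        raise ValueError("output_tokens must be at least 1")
    return ttft_s + tpot_s * (output_tokens - 1)


def decode_tokens_per_second(tpot_s: float) -> float:
    """Per-user decode rate implied by a given TPOT."""
    if tpot_s <= 0:
        raise ValueError("tpot_s must be positive")
    return 1.0 / tpot_s


if __name__ == "__main__":
    # Hypothetical values chosen only to illustrate the arithmetic.
    ttft_s, tpot_s = 0.5, 0.02
    baseline_tokens = 200
    scaled_tokens = baseline_tokens * 300  # the "300x more tokens" scenario

    for n in (baseline_tokens, scaled_tokens):
        total = response_time_s(ttft_s, tpot_s, n)
        decode_share = (total - ttft_s) / total
        print(f"{n} tokens: {total:.2f} s total, decode share {decode_share:.1%}")
```

Under these assumed numbers, the decode term grows linearly with output length while TTFT stays fixed, so longer outputs shift almost all of the response time into the memory-intensive decode phase.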
[Figure: agent workflow. Labels: Planner Agent (Reasoning LLM), Multimodal (Vision + Text), Coding (LLM), Data Analysis (SQL gen LLM), Memory Agent.]

Application access patterns in inference
[Table residue. Columns: usage model, access pattern, throughput, granularity. Recoverable row: KV caching, SR, Med, 4GB. Other labels: search speed, index size, search quality; high throughput, low latency; fast random access to metadata; latency-sensitive; high capacity; fast sequential read.]

Up to 7x higher IOPS with next-gen fast NAND
Innovative controll