Source file: NVIDIA-徐添豪-大模型时代对基于GPU的软硬件系统设计的思考.pdf (34 pages)
Thoughts on GPU-Based Hardware and Software System Design in the Era of Large Models
徐添豪, NVIDIA, Technical Lead for the Consumer Internet Industry

About the speaker
徐添豪 leads the solutions architect team for the consumer internet industry at NVIDIA, and has long worked on building and deploying GPU-based solutions, with many years of experience in GPU hardware and software, CUDA, deep learning algorithms, engineering, and architecture. Recent work focuses on engineering acceleration for generative AI and large language models.

Agenda
Hardware and System Evolution for AI
NVIDIA's Full-stack Ecosystem
NeMo Framework for LLM
What about Inference

Hardware and System Evolution for AI
GPU architecture and FLOPS
Collaboration via inter-GPU connection: NVLink and NVSwitch
Best-of-breed infrastructure for AI development built on NVIDIA DGX

NVIDIA's Full-stack Ecosystem
The NVIDIA full stack

NeMo Framework for LLM
Estimating GPT3-175B training time (take 128*A800 node as an example)
Training Optimization
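The extracted text keeps only the title of the "Estimating GPT3-175B training time" slides, not the calculation itself. As a rough illustration of how such an estimate is commonly made, the sketch below uses the widely used approximation that training costs about 6 FLOPs per parameter per token. It is not taken from the slides, and every number in it is an assumption: 300 billion training tokens (the figure reported for GPT-3), "128*A800 node" read as 128 nodes with 8 GPUs each, a dense BF16 peak of 312 TFLOPS per A800, and a hypothetical 50% sustained utilization. The function name estimate_training_days is illustrative only.

```python
SECONDS_PER_DAY = 86400.0


def estimate_training_days(params, tokens, num_gpus, peak_flops_per_gpu, utilization):
    """Rough training time in days from the 6 * params * tokens FLOPs rule of thumb."""
    # Approximate total compute: forward plus backward pass, no activation recomputation.
    total_flops = 6.0 * params * tokens
    # Sustained cluster throughput: peak rate scaled by the assumed utilization.
    sustained_flops_per_second = num_gpus * peak_flops_per_gpu * utilization
    return total_flops / sustained_flops_per_second / SECONDS_PER_DAY


# All values below are assumptions for illustration, not figures from the slides.
PARAMS = 175e9            # GPT3-175B parameter count
TOKENS = 300e9            # assumed number of training tokens
NUM_GPUS = 128 * 8        # assumed: 128 nodes with 8 A800 GPUs each
PEAK_FLOPS = 312e12       # assumed dense BF16 peak per A800, in FLOP/s
UTILIZATION = 0.5         # hypothetical sustained fraction of peak

days = estimate_training_days(PARAMS, TOKENS, NUM_GPUS, PEAK_FLOPS, UTILIZATION)
print(f"Estimated training time: {days:.1f} days")
```

Under these assumptions the estimate comes out at a little over three weeks. The result scales inversely with utilization and GPU count, so the figure presented in the talk may differ depending on the settings it used, for example if activation recomputation adds extra forward-pass compute.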