Enable IBGDA Support in a GPU-agnostic Manner for Open AI Systems.pdf

ID: 1011734 | PDF | 15 pages | 939.82 KB

Eddie Wai
Enable GDA Support in a GPU-agnostic Manner for Open AI Systems

What is IBGDA and how does it help?

IBGDA is an extension to the GPUDirect family
  GPUDirect enables direct GPU memory placement from a peer device
  GPUDirect Async enables a GPU to directly initiate the transfer
  IBGDA allows the GPU not only to initiate the transfer but also to carry out the transfer work
IBGDA improves message rates for small message sizes
  As compared to the CPU proxy conduit used without IBGDA
IBGDA optimizes the Prefill-Decode (PD) disaggregation phases used for inference
  PD disaggregation splits the Prefill and the Decode work into separate GPU nodes
  IBGDA helps to improve the communication efficiency between these separate nodes

Non-GDA Fast Path

1. GPU produces data in HBM
2. GPU writes a work request to the proxy buffer
3. CPU calls into the NIC userlib to post_send
4. NIC userlib translates WR to WQE and rings DB
5. NIC reads the WQE from the SQ
6. NIC DMAs payload data from GPU memory
7. NIC sends data over the network
8. NIC updates the SCQ
9. CPU polls CQ for completion
10. CPU notifies GPU of completion

GDA Fast Path

1. GPU produces data in HBM
2. NIC kernel directly writes a WQE to the SQ
3. NIC kernel rings DB
4. NIC reads the WQE from the SQ
5. NIC DMAs payload data from GPU memory
6. NIC sends data over the network
7. NIC updates the SCQ
8. NIC kernel polls CQ for completion

No CPU intervention

GDA Results

All-to-all latency
GPUs x Nodes
Ranks: 8, 16, 32
RC latency increases proportionally as the number of ranks increases: 22us, 28us, 51us
GDA latency increases minimally up to the number of parallel executed SMs: 32us, 33us, 35us

Non-standard APIs
Parallel Execution: threads, warps/waves, thread blocks
Concurrent Work building and Completion handling
Synchronization and Locking mechanics
Memory coherency issues
Atomic Ops support in some GPUs
Doorbell Upd
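
The difference between the Non-GDA and GDA fast paths described earlier can be illustrated with a small host-side simulation. This is only a sketch under stated assumptions, not the presenter's implementation: `ToyNic`, `non_gda_send`, `gda_send` and `cpu_step_count` are hypothetical names, the queues are plain Python deques rather than real NIC rings, and "NIC kernel" is treated simply as an actor label taken from the slides. The sketch replays each path step by step and counts how many steps involve the CPU.

```python
from collections import deque


class ToyNic:
    """Toy NIC (hypothetical): a send queue (SQ), a send completion queue (SCQ), a wire."""

    def __init__(self):
        self.sq = deque()
        self.scq = deque()
        self.wire = []

    def ring_doorbell(self, gpu_memory):
        # NIC side: read each WQE from the SQ, DMA the payload from GPU memory,
        # send it over the network, then update the SCQ.
        while self.sq:
            wqe = self.sq.popleft()
            self.wire.append(gpu_memory[wqe["addr"]])
            self.scq.append(wqe["id"])


def non_gda_send(nic, gpu_memory, addr, payload):
    """CPU-proxy path; returns the (actor, step) pairs that were performed."""
    steps = []
    gpu_memory[addr] = payload
    steps.append(("GPU", "produce data in HBM"))
    proxy_buffer = deque([{"id": 1, "addr": addr}])   # work request (WR)
    steps.append(("GPU", "write work request to proxy buffer"))
    wr = proxy_buffer.popleft()
    steps.append(("CPU", "call NIC userlib post_send"))
    nic.sq.append(dict(wr))                           # WR translated to a WQE
    steps.append(("CPU", "translate WR to WQE and ring DB"))
    nic.ring_doorbell(gpu_memory)
    steps.append(("NIC", "read WQE, DMA payload, send, update SCQ"))
    nic.scq.popleft()
    steps.append(("CPU", "poll CQ for completion"))
    steps.append(("CPU", "notify GPU of completion"))
    return steps


def gda_send(nic, gpu_memory, addr, payload):
    """GDA path; the WQE is written and the doorbell rung without the CPU."""
    steps = []
    gpu_memory[addr] = payload
    steps.append(("GPU", "produce data in HBM"))
    nic.sq.append({"id": 1, "addr": addr})
    steps.append(("NIC kernel", "write WQE directly to SQ"))
    steps.append(("NIC kernel", "ring DB"))
    nic.ring_doorbell(gpu_memory)
    steps.append(("NIC", "read WQE, DMA payload, send, update SCQ"))
    nic.scq.popleft()
    steps.append(("NIC kernel", "poll CQ for completion"))
    return steps


def cpu_step_count(steps):
    """Number of steps in a path that were carried out by the CPU."""
    return sum(1 for actor, _ in steps if actor == "CPU")


if __name__ == "__main__":
    for name, send in (("non-GDA", non_gda_send), ("GDA", gda_send)):
        nic = ToyNic()
        steps = send(nic, {}, 0x10, b"kv-block")
        print(name, "CPU steps:", cpu_step_count(steps), "sent:", nic.wire)
```

In this toy model the CPU-proxy path records four CPU steps (post_send, WR-to-WQE translation with the doorbell, CQ polling, and the completion notification), while the GDA path records none, which matches the "No CPU intervention" note in the slides. The simulation says nothing about real latency or message rates.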
