Pengzhi Zhu, Dec 2020
Network Optimization for GPU Applications in Virtualized Environments

2 INFINIBAND OPENSTACK - ORCHESTRATING AI CLOUDS

3 ACCELERATING MACHINE LEARNING WITH RDMA

VM & Storage
- Zero CPU
- Higher throughput
- NVMe drives
- Storage is no longer the bottleneck
- NVMe-oF brings direct-attached-like performance

Machine Learning
- Low latency (1-2 µs E2E)
- High bandwidth (200 Gbps+)
- GPUDirect uses RDMA to scale and increase performance

[Figure: TCP/IP vs. RDMA operations between Rack 1 and Rack 2, showing Buffer 1 at the user, kernel (OS), and hardware (NIC) layers]

4 GPUDIRECT RDMA TECHNOLOGY

[Figure: data path with GPUDirect RDMA vs. without GPUDirect (same data copied 3x), over HDR 200Gb InfiniBand]

5 SINGLE ROOT I/O VIRTUALIZATION (SR-IOV)

- PCIe device presents multiple instances to the OS/hypervisor
- Enables application direct access
- Bare-metal performance for VMs
- Reduces CPU overhead
- Enables advanced NIC features, RDMA

[Figure: para-virtualized vs. SR-IOV; VMs, hypervisor, and an HDR 200Gb SR-IOV NIC with a Physical Function (PF), Virtual Functions (VF), and net devices]

6 O
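The SR-IOV slide says a PCIe device presents multiple instances to the OS/hypervisor. On a Linux host this can be observed through the standard PCI sysfs attributes `sriov_totalvfs` and `sriov_numvfs` and the `virtfn*` links under the Physical Function. The sketch below is not from the slides: the helper names, the interface name `ib0`, and the `sysfs_root` parameter (added so the code can be exercised against a fake tree) are assumptions made for illustration.

```python
from pathlib import Path


def read_sriov_counts(iface, sysfs_root="/sys"):
    """Return the VF counts the PF driver exposes for a network interface.

    Reads the Linux sysfs attributes sriov_totalvfs (maximum VFs the device
    supports) and sriov_numvfs (VFs currently enabled). A value is None when
    the attribute is absent, e.g. the device is not SR-IOV capable.
    """
    dev = Path(sysfs_root) / "class" / "net" / iface / "device"
    counts = {}
    for attr in ("sriov_totalvfs", "sriov_numvfs"):
        path = dev / attr
        counts[attr] = int(path.read_text().strip()) if path.is_file() else None
    return counts


def list_vf_links(iface, sysfs_root="/sys"):
    """List the virtfnN entries under the PF, ordered by VF index."""
    dev = Path(sysfs_root) / "class" / "net" / iface / "device"
    if not dev.is_dir():
        return []
    names = [p.name for p in dev.glob("virtfn*")]
    # Sort numerically so virtfn10 comes after virtfn2.
    return sorted(names, key=lambda n: int(n[len("virtfn"):]))
```

Calling `read_sriov_counts("ib0")` on a host whose PF netdev is named `ib0` (a hypothetical name) would return both counts. Each `virtfnN` entry corresponds to one Virtual Function that the hypervisor can pass through to a VM.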