IBM TechXchange 2025
Orlando, FL, October 6-9
4136

Accelerating Data Processing with Velox, Presto and CuDF

Deepak Majeti - IBM
Zoltán Arnold Nagy - IBM Research
Luis Garcés-Erice - IBM Research
Karthikeyan Natarajan - NVIDIA

Agenda
01 Presto & Velox & GPUs
02 Multiple GPUs: Exchange
03 Faster I/O: Storage
04 Feeding the GPUs faster
05 Future

IBM TechXchange | © 2025 IBM Corporation

What is Presto?

Presto Users

Presto Architecture

Presto C++ & Velox & cuDF
A composable and fully extensible C++ execution engine library for data management systems.

Why and how can we exchange data faster?

System diagram (DGX A100)

Distributed JOINs are very exchange heavy

Hardware accelerated exchange for Presto

HTTP-based exchange
- CPU I/O overhead
- TCP
- Many threads to achieve good performance

New exchange
- UCX for communication (through NVIDIA's UCXX)
- Seamless hardware acceleration
- full hardware offload - 0 CPU
- NVLink intra-node
- AWS (EFA) and on-prem (InfiniBand or RoCE) inter-node

Hardware accelerated exchange for Presto (8 A100 GPUs, single node)
1 minute 56 seconds down to 15 seconds

Presto Generalized Exchange
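
The claim that distributed JOINs are exchange heavy can be illustrated with a small CPU-only sketch. It is not Presto, Velox or UCX code: the function names (shuffle_by_key, local_hash_join, distributed_join), the integer keys and the modulo partitioner are all assumptions made for illustration. It shows why rows have to move: before workers can join locally, both inputs are repartitioned so that equal keys land on the same worker.

```python
from collections import defaultdict


def shuffle_by_key(partitions, num_workers):
    """Repartition (key, value) rows so equal keys share a worker.

    Returns the new partitions and the number of rows that had to
    leave the worker they started on (the "exchange" traffic).
    """
    out = [[] for _ in range(num_workers)]
    moved = 0
    for src, rows in enumerate(partitions):
        for key, value in rows:
            dst = key % num_workers  # assumed partitioner: integer key modulo worker count
            out[dst].append((key, value))
            if dst != src:
                moved += 1
    return out, moved


def local_hash_join(left, right):
    """Plain in-memory hash join on the key, run by a single worker."""
    build = defaultdict(list)
    for key, value in left:
        build[key].append(value)
    return [(key, lv, rv) for key, rv in right for lv in build.get(key, [])]


def distributed_join(left_parts, right_parts, num_workers):
    """Shuffle both inputs, join per worker, and report rows exchanged."""
    left_sh, moved_left = shuffle_by_key(left_parts, num_workers)
    right_sh, moved_right = shuffle_by_key(right_parts, num_workers)
    result = []
    for worker in range(num_workers):
        result.extend(local_hash_join(left_sh[worker], right_sh[worker]))
    return sorted(result), moved_left + moved_right


# Hypothetical input: two workers, four rows per table.
left_parts = [[(1, "a"), (2, "b")], [(3, "c"), (4, "d")]]
right_parts = [[(4, "w"), (3, "x")], [(2, "y"), (1, "z")]]

joined, rows_moved = distributed_join(left_parts, right_parts, 2)
print(joined)
print(rows_moved, "of 8 input rows crossed workers")
```

In this toy input half of the rows change workers before any join work starts. With randomly placed keys the expected fraction is (N-1)/N for N workers, which is why the transport used for the exchange has such a large effect on end-to-end time.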

How can we read, decode, (de)compress data faster?

GPUDirect RDMA

InfiniBand NIC
GPU
PCIe Switc