Accelerating Presto with Velox and NVIDIA cuDF
Karthikeyan Natarajan*, Devavret Makkar, Shruti Shivakumar, Greg Kimball, Todd Mostak
NVIDIA
IBM TechXchange 2025
4135
2025-10-06

Agenda
Introduction to Presto GPU
Single node performance results
Operator development for Presto GPU
Next steps

Open Source Software for Data Processing
"Data processing" workloads include SQL query processing, dataframe operations, feature engineering and more.
Apache Spark (2010) and Presto (2012) are both JVM-based (Java) distributed query engines with a long history of open source development.
GPU execution is efficient and cost-effective for many data processing workloads.
Presto GPU is a new, open-source integration based on Presto, Velox and NVIDIA cuDF.
[Figure: Launch date of (selected) open source projects (all Apache 2.0 license). Surviving labels: GPU data processing; Distributed SQL; Interactive SQL; Standard columnar format; JVM-to-native interface; Extensible execution engine; GPU-accelerated Spark; Spark-RAPIDS; cuDF; Presto GPU.]

Bringing Accelerated Workers to Presto
Presto is a fast, scalable SQL query engine for modern data analytics.
The Presto coordinator generates query plans and distributes the tasks to the cluster of Presto workers.
Presto workers can include:
Default, Java-based workers
Software-accelerated, C++ Velox workers
Hardware-accelerated, CUDA C++ Velox-cuDF workers
[Diagram: the Presto Coordinator sends a Query Plan to the workers. Presto Java Worker: Presto Java. Presto C++ Worker: Presto C++, Velox Execution, Velox Operators, Velox Algorithms. Presto GPU Worker: Presto C++, Velox Execution, Velox-cuDF Operators, cuDF Algorithms.]
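
The worker types and the coordinator's role described on this slide can be pictured with a small toy model. This is only a sketch: the `Worker` class, the `distribute` function, the round-robin policy and the task strings are all hypothetical and are not part of Presto, Velox or cuDF. The layer names are taken from the diagram labels, and how they group into per-worker stacks is an assumption.

```python
from dataclasses import dataclass
from itertools import cycle

# Layer names come from the slide's diagram labels. Grouping them into
# per-worker stacks is an assumption made for illustration only.
WORKER_STACKS = {
    "java": ["Presto Java"],
    "cpp": ["Presto C++", "Velox Execution", "Velox Operators", "Velox Algorithms"],
    "gpu": ["Presto C++", "Velox Execution", "Velox-cuDF Operators", "cuDF Algorithms"],
}


@dataclass
class Worker:
    """Hypothetical stand-in for a Presto worker process."""
    name: str
    kind: str  # one of "java", "cpp", "gpu"

    def run(self, task: str) -> str:
        # Describe which software layers the task would pass through.
        stack = " -> ".join(WORKER_STACKS[self.kind])
        return f"{self.name} ran {task!r} via {stack}"


def distribute(tasks, workers):
    """Hypothetical coordinator step: hand tasks out round-robin.

    The real coordinator's scheduling policy is not described in the
    slides; round-robin is assumed here purely to keep the sketch short.
    """
    return [worker.run(task) for task, worker in zip(tasks, cycle(workers))]


workers = [Worker("w0", "gpu"), Worker("w1", "gpu")]
results = distribute(["scan lineitem", "aggregate", "join orders"], workers)
for line in results:
    print(line)
```

Changing `kind` from `"gpu"` to `"cpp"` swaps only the two lower layers of the stack, which is the point of the diagram: the CPU and GPU workers share the Presto C++ and Velox execution layers and differ in the operators and algorithms underneath.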
Building up the Presto GPU Worker
Presto C++ workers translate plans from the Presto coordinator into plans compatible with Velox execution.
Velox workers use the Velox-cuDF backend to replace CPU operators with GPU operators.
Velox executes the query plan, and GPU operators call into cuDF algorithms.
cuD