Gluten GPU backend

Velox LibCUDF

A Velox DriverAdapter is used to replace CPU operators with GPU operators that call cuDF's C++ code. This DriverAdapter can be registered at startup by the application using Velox to enable the cuDF backend. Each operator will have a cuDF equivalent. For example, OrderBy will be replaced by CudfOrderBy. In between CPU operators and GPU operators, another conversion operator is inserted to handle CPU-GPU and GPU-CPU data movement. This allows cuDF operators to be used alongside existing Velox operators. The conversion currently uses Arrow (Velox to Arrow, then Arrow to cuDF).
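
The operator replacement and conversion-operator insertion described above can be sketched in isolation. This is a minimal illustration, not Velox's DriverAdapter API: the pipeline is modeled as a list of operator names, and `hasGpuEquivalent`, `adaptPipeline`, `ToGpu`, `ToCpu` and the set of supported operators are all hypothetical names chosen for the example.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical: which CPU operators have a GPU equivalent (illustrative list).
inline bool hasGpuEquivalent(const std::string& op) {
  static const std::set<std::string> supported = {
      "OrderBy", "HashJoin", "Aggregation"};
  return supported.count(op) > 0;
}

// Rewrites a linear pipeline of CPU operator names. Supported operators get a
// "Cudf" prefix; a conversion operator is inserted wherever data has to cross
// the CPU/GPU boundary. "ToGpu" and "ToCpu" are placeholder names.
std::vector<std::string> adaptPipeline(const std::vector<std::string>& cpuOps) {
  std::vector<std::string> out;
  bool onGpu = false;  // assume the pipeline input lives in CPU memory
  for (const auto& op : cpuOps) {
    const bool gpu = hasGpuEquivalent(op);
    if (gpu && !onGpu) out.push_back("ToGpu");  // CPU -> GPU boundary
    if (!gpu && onGpu) out.push_back("ToCpu");  // GPU -> CPU boundary
    out.push_back(gpu ? "Cudf" + op : op);
    onGpu = gpu;
  }
  if (onGpu) out.push_back("ToCpu");  // assume the consumer expects CPU data
  assert(out.size() >= cpuOps.size());  // operators are never dropped
  return out;
}
```

Under these assumptions, `adaptPipeline({"TableScan", "OrderBy", "Project"})` yields `TableScan, ToGpu, CudfOrderBy, ToCpu, Project`: the unsupported operators stay on the CPU and the supported one is bracketed by conversions.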
A direct Velox-to-cuDF interop without Arrow may be built in the future for higher performance. Currently, no custom CUDA kernels are needed for this code. All functionality is implemented in pure C++ calling cuDF, which implements the CUDA kernels. There is a lot more to say about tuning for performance (GPU batch sizes, CUDA streams, number of Velox drivers, ...) but I'm leaving that out of this document for the moment.

Link: Experimental RAPIDS cuDF Backend for Velox #12412

Wave: Velox on CUDA

- Experimental subproject in Velox to support GPU
- The main logic is in Velox; only basic CUDA APIs are used

[Figure: layer diagram. Labels: SQL Operators; HashtableBuild(), HashtableProbe(); gpu:common (memory, event); CUDA API; Velox; CUDA]

Gluten Architecture

Velox GPU backend

- Validation: cuDF only supports some operators
- Operator implementation: currently it copies all the operator implementations and rewrites them with cuDF
  - How to sync with Velox? Mostly copy; accept behavior differences
- Spill support
- Memory
  - CUDA global pool
  - cudf::detail::cuda_stream_pool
- Conversion
  - Cudf to Arrow conversion: cache all the Velox vectors, combine them into one big vector, convert it to Arrow, then to a cuDF table
  - Insert a format conversion when one operator is not supported, maybe table scan
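
The "cache, combine, then convert" step in the Conversion bullet can be sketched as a small accumulator. This is a minimal illustration under stated assumptions, not the Gluten or Velox implementation: `Batch` is a stand-in for a Velox vector, and `BatchCombiner`, `targetRows`, `add` and `flush` are hypothetical names. The point is that many small inputs are merged so that the Arrow and cuDF conversion runs once per large batch.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <utility>
#include <vector>

// Stand-in for a Velox vector; a real batch would hold typed columns.
using Batch = std::vector<int>;

// Hypothetical helper: caches small input batches and hands out one combined
// batch once at least `targetRows` rows are available.
class BatchCombiner {
 public:
  explicit BatchCombiner(std::size_t targetRows) : targetRows_(targetRows) {
    assert(targetRows > 0);
  }

  // Caches `input`. Returns the combined batch when the threshold is reached,
  // otherwise std::nullopt (nothing to convert yet).
  std::optional<Batch> add(const Batch& input) {
    cached_.insert(cached_.end(), input.begin(), input.end());
    if (cached_.size() < targetRows_) {
      return std::nullopt;
    }
    return flush();
  }

  // Emits whatever is cached; call this at end of input.
  Batch flush() {
    Batch out = std::move(cached_);
    cached_.clear();  // leave the moved-from cache in a known empty state
    return out;
  }

 private:
  std::size_t targetRows_;
  Batch cached_;
};
```

In this sketch the caller would convert each batch returned by `add` or `flush` to Arrow and then to a cuDF table. A larger `targetRows` means fewer conversions and larger GPU batches, at the cost of more data held in CPU memory before the transfer.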