Improvements to RISC-V Vector code generation in LLVM
Luke Lau, Alex Bradbury
RISC-V Summit Europe 2025

RVV codegen development
- Basic experimental RVV enablement
- Enablement of RVV codegen by default
- Expansion of additional RVV extension support
- Further tuning of performance of generated code
We are here

Improving RVV code generation
Objective: faster execution time!
Might be achieved by:
- Avoiding vectorisation when it isn't profitable
- Reducing overhead such as CSR switching
- Minimising spilling
- Better exploiting capabilities of RVV
Note: this talk gives an overview of recent improvements covering contributions from many companies.

Non-power-of-two vectorization
- Unique to RVV, the vl vector length register can handle vectors of arbitrary (not just power-of-two) sizes
- New in LLVM 20: support for non-power-of-2 vector widths in the SLP (Superword Level Parallelism) vectorizer

    struct rgb { float r, g, b; };

    void brighten(struct rgb *x, float f) {
      x->r *= f;
      x->g *= f;
      x->b *= f;
    }

clang -O3 -march=rva23u64

    vsetivli zero, 2, e32, mf2, ta, ma
    vle32.v  v8, (a0)
    flw      fa5, 8(a0)
    vfmul.vf v8, v8, fa0
    fmul.s   fa5, fa0, fa5
    vse32.v  v8, (a0)
    fsw      fa5, 8(a0)

clang -O3 -march=rva23u64 -mllvm -slp-vectorize-non-power-of-2

    vsetivli zero, 3, e32, m1, ta, ma
    vle32.v  v8, (a0)
    vfmul.vf v8, v8, fa0
    vse32.v  v8, (a0)

vl tail folding
GCC already performs tail folding
=> Avoid the need for a separate loop to h