Accelerating GenAI Workloads by Enabling RISC-V Microkernel Support in IREE.pdf

2025-10-27

PDF, 17 pages, 1.73 MB

Accelerating GenAI Workloads by Enabling RISC-V Microkernel Support in IREE
Adeel Ahmad, Ahmad Tameem, Nouman Amir, Bilal Zafar, Saad Bin Nasir
10xEngineers

Outline
- Generative AI workloads
- IREE compilation with custom microkernels (ukernels)
- Custom RISC-V matrix multiplication ukernels: implementation
- Kernel- and model-level results
- Summary

Generative AI Workloads
- Generative AI workloads are dominated by transformer-based auto-regressive large language models (LLMs)
- Text/image/code generation, chatbots, content writing, video generation and other common use cases heavily employ LLMs
- Matrix-matrix and matrix-vector multiplications dominate these workloads
[Figure: Conversational LLMs. Source: ChatGPT]

IREE Compilation with Custom Kernels
- Open-source, MLIR-based compiler and runtime with direct code generation
- Host/device programming model with multiple target architectures through a hardware abstraction layer (HAL)
- The stack is mostly architecture agnostic, a step towards heterogeneous compilation
- Host does scheduling; vm-bytecode for runtime portability
- Device-side codegen; upstream IREE has RVV codegen through LLVM

Microkernels
- Intended to prevent the dichotomy between compiler and kernels
- Perform arithmetic but no memory allocation
- Standalone development and unit testing in C leads to quicker development

Matrix Multiplication ukernel (mmt4d) Compilation in IREE
- For x86_64 and ARM64 architectures, IREE leverages the linalg dialect's mmt4d op for matrix multiplication
- The mmt4d op is meticulously optimized to exploit hardware-specific vector instructions and cache hierarchies
[Figure: compilation flow for matmul.mlir. Passes: MaterializeHostEncodingPass, CPULowerToUKernelsPass, LowerUKernelOpsToCallsPass, ConvertToLLVMPass. Stage labels: matmul; pack + mmt4d + unpack; mmt4d; iree_uk_mmt4d; ukernel call. Other labels: Precompiled ukernel bitcode (ukernel_bitcode_*.bc); Static l]
+ Only relevant parts of MLIR and pass pipeline are shown
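
The pack + mmt4d + unpack decomposition named above can be illustrated with a small reference in plain C. This is a minimal sketch under stated assumptions, not the IREE ukernel: the function names (pack_lhs, pack_rhs, ref_mmt4d_f32, unpack_out, mmt4d_matches_naive_matmul) and the matrix and tile sizes are hypothetical, the code uses scalar loops instead of RVV instructions, and it assumes the dimensions divide evenly by the tile sizes so no padding is needed. The tiled layouts follow the linalg mmt4d convention: LHS as [M1][K1][M0][K0], RHS stored transposed as [N1][K1][N0][K0], and the accumulator as [M1][N1][M0][N0].

```c
#include <assert.h>

/* Illustrative sizes (assumptions, not values from the slides). */
enum { M = 4, N = 4, K = 8, M0 = 2, N0 = 2, K0 = 4 };
enum { M1 = M / M0, N1 = N / N0, K1 = K / K0 };

/* Pack row-major A[M][K] into the tiled layout lhs[M1][K1][M0][K0]. */
static void pack_lhs(const float *a, float *lhs) {
  for (int m = 0; m < M; ++m)
    for (int k = 0; k < K; ++k) {
      int m1 = m / M0, m0 = m % M0, k1 = k / K0, k0 = k % K0;
      lhs[((m1 * K1 + k1) * M0 + m0) * K0 + k0] = a[m * K + k];
    }
}

/* Pack row-major B[K][N] into the transposed tiled layout rhs[N1][K1][N0][K0]. */
static void pack_rhs(const float *b, float *rhs) {
  for (int k = 0; k < K; ++k)
    for (int n = 0; n < N; ++n) {
      int n1 = n / N0, n0 = n % N0, k1 = k / K0, k0 = k % K0;
      rhs[((n1 * K1 + k1) * N0 + n0) * K0 + k0] = b[k * N + n];
    }
}

/* Scalar reference for mmt4d: accumulate each M0 x N0 output tile over K1 x K0.
   Like a ukernel, it only does arithmetic on caller-provided buffers. */
static void ref_mmt4d_f32(const float *lhs, const float *rhs, float *out) {
  for (int m1 = 0; m1 < M1; ++m1)
    for (int n1 = 0; n1 < N1; ++n1)
      for (int k1 = 0; k1 < K1; ++k1)
        for (int m0 = 0; m0 < M0; ++m0)
          for (int n0 = 0; n0 < N0; ++n0)
            for (int k0 = 0; k0 < K0; ++k0)
              out[((m1 * N1 + n1) * M0 + m0) * N0 + n0] +=
                  lhs[((m1 * K1 + k1) * M0 + m0) * K0 + k0] *
                  rhs[((n1 * K1 + k1) * N0 + n0) * K0 + k0];
}

/* Unpack out[M1][N1][M0][N0] into row-major C[M][N]. */
static void unpack_out(const float *out, float *c) {
  for (int m = 0; m < M; ++m)
    for (int n = 0; n < N; ++n) {
      int m1 = m / M0, m0 = m % M0, n1 = n / N0, n0 = n % N0;
      c[m * N + n] = out[((m1 * N1 + n1) * M0 + m0) * N0 + n0];
    }
}

/* Returns 1 if pack + mmt4d + unpack reproduces a naive row-major matmul. */
int mmt4d_matches_naive_matmul(void) {
  float a[M * K], b[K * N], c[M * N], ref[M * N];
  float lhs[M * K], rhs[K * N], out[M * N] = {0};

  /* Small integer-valued inputs keep float arithmetic exact. */
  for (int i = 0; i < M * K; ++i) a[i] = (float)(i % 7) - 3.0f;
  for (int i = 0; i < K * N; ++i) b[i] = (float)(i % 5) - 2.0f;

  /* Naive reference: ref = A * B. */
  for (int m = 0; m < M; ++m)
    for (int n = 0; n < N; ++n) {
      float acc = 0.0f;
      for (int k = 0; k < K; ++k) acc += a[m * K + k] * b[k * N + n];
      ref[m * N + n] = acc;
    }

  pack_lhs(a, lhs);
  pack_rhs(b, rhs);
  ref_mmt4d_f32(lhs, rhs, out);
  unpack_out(out, c);

  for (int i = 0; i < M * N; ++i)
    if (c[i] != ref[i]) return 0;
  return 1;
}
```

Calling mmt4d_matches_naive_matmul() from a test harness checks the tiled path against the naive product. Because both operands are packed so that the innermost K0 elements are contiguous, the inner loop reads two unit-stride vectors, which is the property a vectorized kernel would exploit.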
