Performance engineering on Neuron: How to optimize your LLM with NKI.pdf


2025-12-21

PDF, 17 pages, 542.88 KB


AIM414
Performance engineering on Neuron: How to optimize your LLM with NKI

Scott Perry, Principal Solutions Architect, AI/ML Performance, Annapurna Labs, AWS
Sadaf Rasool, Solutions Architect, AI/ML Performance, Annapurna Labs, AWS

© 2025, Amazon Web Services, Inc. or its affiliates. All rights reserved.

Innovating at the silicon level

AWS AI Chips: AWS Trainium, AWS Inferentia

AWS AI Chips for Generative AI

- AWS Inferentia: Deep learning models
- AWS Inferentia2: Medium to large-scale inference: LLMs, multi-modal models
- AWS Trainium: Medium to large-scale training and inference: LLMs, multi-modal models
- AWS Trainium2: Training and inference for Gen AI models
- AWS Trainium3: Next-gen agentic, reasoning, and video generation applications

NeuronCore Architecture

[Diagram labels: Host (CPU) Memory, HBM, DMA Engines, NeuronCore, Tensor Engine, Vector Engine, Scalar Engine, GPSIMD Engine, SBUF, PSUM]

Memory Hierarchy

SRAM
- Size: MBs
- Bandwidth: 10 TB/s

Accelerator HBM
- Size: 10s GBs
- Bandwidth: TB/s

Host Memory
- Size: 10s GBs to TBs
- Bandwidth: GB/s

How do we improve performance?

- Pipeline operations
- Minimize data movement
- Maximize data throughput

[Figure: timelines of collectives and compute/data ops over time]
[Figure: Performance versus Arithmetic Intensity (ops/byte), with the compute-bound region labeled]

Neuron Developer Stack

Audiences: ML Developers, Data Scientists, Performance Engineers
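The slide plotting Performance against Arithmetic Intensity (ops/byte) appears to be a roofline-style chart, although the deck does not name it as such. The sketch below is one way to reason about that chart in plain Python. It is not taken from the deck and does not use NKI. The function names and the peak-compute and bandwidth figures are hypothetical placeholders, not specifications of any AWS chip.

```python
# Minimal roofline-style sketch. All hardware numbers are made-up placeholders.

def matmul_arithmetic_intensity(m, k, n, bytes_per_element=2):
    """Ops per byte for C[m,n] = A[m,k] @ B[k,n], assuming each matrix
    crosses the memory interface exactly once (an idealized assumption)."""
    ops = 2 * m * k * n  # one multiply and one add per inner-product term
    bytes_moved = bytes_per_element * (m * k + k * n + m * n)
    return ops / bytes_moved


def attainable_performance(intensity, peak_ops_per_s, bandwidth_bytes_per_s):
    """Upper bound on ops/s: the lower of the compute roof and the memory roof."""
    return min(peak_ops_per_s, intensity * bandwidth_bytes_per_s)


def is_compute_bound(intensity, peak_ops_per_s, bandwidth_bytes_per_s):
    """True when intensity is at or beyond the ridge point (peak / bandwidth)."""
    return intensity >= peak_ops_per_s / bandwidth_bytes_per_s


# Hypothetical accelerator: 100 Tops/s peak, 1 TB/s memory bandwidth.
PEAK_OPS_PER_S = 100e12
BANDWIDTH_BYTES_PER_S = 1e12

# Large square matmul (prefill-like shape) versus a single-row matmul
# (decode-like shape), both with 2-byte elements.
square = matmul_arithmetic_intensity(4096, 4096, 4096)
single_row = matmul_arithmetic_intensity(1, 4096, 4096)

for name, intensity in [("4096x4096x4096", square), ("1x4096x4096", single_row)]:
    bound = "compute" if is_compute_bound(
        intensity, PEAK_OPS_PER_S, BANDWIDTH_BYTES_PER_S) else "memory"
    perf = attainable_performance(intensity, PEAK_OPS_PER_S, BANDWIDTH_BYTES_PER_S)
    print(f"{name}: {intensity:.2f} ops/byte, {bound} bound, "
          f"at most {perf / 1e12:.2f} Tops/s")
```

Under these assumed numbers the single-row case sits far to the left of the ridge point, which is why the deck's advice to minimize data movement and maximize data throughput matters most for shapes like it.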
