HotChips_tesla_dojo_uarch.pdf

PDF, 28 pages, 9.72 MB

The Microarchitecture of Tesla's Exa-Scale Computer
Emil Talpes, Douglas Williams, Debjit Das Sarma

What is DOJO?
- Tesla's in-house supercomputer for Machine Learning
- Highly scalable and fully flexible distributed system
  - Optimized for Neural Network training workloads
  - General-purpose system capable of adapting to new algorithms and applications
- Built from the ground up with large systems in mind
  - Not evolved from existing small systems

Anatomy of a distributed system
- Distributed systems are built as hierarchies of nesting boxes
  - CPU - Die - Module - Board - Rack - Cabinet - System
- Integration gets looser as we move outward: lower bandwidth, higher latencies
- System is described by three models
  - Compute: architecture of the inner box
  - Communication: how data moves between boxes
  - Synchronization: how events get ordered across the entire system
- This talk describes our way of filling these boxes

High throughput, general purpose CPU
- DOJO nodes are full-fledged computers
  - Dedicated CPU, local memory, communication interface
- Superscalar, multi-threaded organization
  - Optimized for high-throughput math applications rather than control-heavy code
- Custom ISA optimized for ML kernels

Microarchitecture of the DOJO node

Processing pipeline
- 32B fetch window holding up to 8 instructions
- 8-wide decode handling 2 threads per cycle
- 4-wide scalar scheduler, 4-way SMT
  - 2 integer ALUs
  - 2 address units
  - Register file replicated per thread
- 2-wide vector scheduler, 4-way SMT
  - 64B wide SIMD unit
  - 8x8x4 matrix multiplication units
- SMT support focuses on single-threaded applications
  - No virtual memory, limited protection mechanisms, SW-managed sharing of resources
  - Typical application uses 1 or 2 compute threads and 1-2 communication threads
- 1.25MB SRAM per node
  - 400 GBps load, 270 GBps store
- Gather engine
  - 8B and 16B granularity
- Load, store, load+execute from local memory
  - Ex...
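As a rough sanity check on the node figures quoted above, the following sketch derives two back-of-envelope numbers: the average instruction size implied by a full fetch window, and the time needed to stream the entire local SRAM once at the quoted load and store rates. The unit conventions (MB as 2**20 bytes, GBps as 10**9 bytes per second) and the helper names are assumptions made for illustration; the slides do not define them.

```python
# Back-of-envelope arithmetic on figures quoted in the slides.
# ASSUMPTIONS (not stated in the source): 1 MB = 2**20 bytes,
# 1 GBps = 10**9 bytes per second. Helper names are hypothetical.

FETCH_WINDOW_BYTES = 32           # "32B fetch window"
MAX_INSTRUCTIONS_PER_WINDOW = 8   # "holding up to 8 instructions"
SRAM_BYTES = 1.25 * 2**20         # "1.25MB SRAM per node"
LOAD_BYTES_PER_S = 400e9          # "400 GBps load"
STORE_BYTES_PER_S = 270e9         # "270 GBps store"


def avg_instruction_bytes_when_full(window_bytes: int, count: int) -> float:
    """Average instruction size if the window holds `count` instructions."""
    return window_bytes / count


def sweep_time_us(size_bytes: float, bytes_per_s: float) -> float:
    """Microseconds to move `size_bytes` once at a sustained rate."""
    return size_bytes / bytes_per_s * 1e6


if __name__ == "__main__":
    avg = avg_instruction_bytes_when_full(
        FETCH_WINDOW_BYTES, MAX_INSTRUCTIONS_PER_WINDOW
    )
    print(f"avg instruction size with a full window: {avg:.1f} B")
    print(f"read whole SRAM once:  {sweep_time_us(SRAM_BYTES, LOAD_BYTES_PER_S):.2f} us")
    print(f"write whole SRAM once: {sweep_time_us(SRAM_BYTES, STORE_BYTES_PER_S):.2f} us")
```

Under these assumptions a window that holds all 8 instructions averages 4 bytes per instruction, and the whole SRAM can be read in a few microseconds. These are peak-rate estimates only; the slides do not say whether the quoted bandwidths are sustained.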
