2026-3-16

Kimodo: Scaling Controllable Human Motion Generation

Davis Rempe*, Mathis Petrovich*, Ye Yuan, Haotian Zhang, Xue Bin Peng, Yifeng Jiang, Tingwu Wang, Umar Iqbal, David Minor, Michael de Ruyter, Jiefeng Li, Chen Tessler, Edy Lim, Eugene Jeong, Sam Wu, Ehsan Hassani, Michael Huang, Jin-Bey Yu, Chaeyeon Chung, Lina Song, Olivier Dionne, Jan Kautz, Simon Yuen, Sanja Fidler

NVIDIA
*Co-First Authors
https:/

[...] human motion data is becoming increasingly important for applications in robotics, simulation, and entertainment. Recent generative models offer a potential data source, enabling human motion synthesis through intuitive inputs like text prompts or kinematic constraints on poses. However, the small scale of public mocap datasets has limited the motion quality, control accuracy, and generalization of these models. In this work, we introduce Kimodo, an expressive and controllable kinematic motion diffusion model trained on 700 hours of optical motion capture data. Our model generates high-quality motions while being easily controlled through text and a comprehensive suite of kinematic constraints including full-body keyframes, sparse joint positions/rotations, 2D waypoints, and dense 2D paths. This is enabled through a carefully designed motion representation and two-stage denoiser architecture that decomposes root and body prediction to minimize motion artifacts while allowing for flexible constraint conditioning. Experiments on the large-scale mocap dataset justify key design decisions and analyze how the scaling of dataset size and model size affect performance.

1. Introduction

While human motion data has always been central to games and other media, recent advances in robotics and physical AI have increased the demand for such data. In robotics, human demonstrations allow humanoids to move realistically and complete complex