Source: 《多环境下的 LLM Agent 应用与增强.pdf》 (LLM Agents in Multiple Environments: Applications and Enhancement), 81 pages.
LLM Agents in Multiple Environments: Applications and Enhancement
Speaker: Bang Liu (刘邦), Assistant Professor, University of Montreal & Mila; Canada CIFAR AI Chair

CONTENTS
01 Large Models and Agents
02 Multimodal Embodied Agents
03 Reasoning-Intensive Agents
04 Agents for Scientific Domains

01 LLM and Agents
Definition, framework, and challenges

What is an Agent?
"Give me the definition of Agent."

Definition of an Agent
"An agent is anything that can be viewed as perceiving its environment through sensors and acting upon that environment through actuators." (Stuart J. Russell and Peter Norvig)

In short
"An agent is a system that can help complete tasks intelligently."

Just Do It
But how? We need tools.

Just Say It
But how? We need tools. Tools as subagents of an agent system.

The mastermind behind the scenes
But how? We need tools. Tools as subagents of an agent system. We need an LLM.

LLM-based natural language processing
[Figure labels: language understanding (NLU), language generation (NLG), machine, LLM, Env, Tools, Perception, Action]

Agent system framework
[Figure labels: World Model, World, Memory, Cost, Sensor, Actor, Backbone; Percept, Think, Act, Predict; Input, Output, Response, Error, Read, Write, Query, Feedback; Dynamic environment state, Static data, User Inputs]

Some existing agents
Mobile ALOHA, AI Scientist, Alpha Geometry, Cradle, GPT-4V, GPT-3.5, DeepMind SIMA, Generative Agents, Voyager, ThinkThrice, M3A With AndroidWorld, AndroidControl Agent, Data Interpreter, OPEx

Classification of agents
[Figure labels: Cognition, Perception, Action; GPT-3.5, Generative Agents, Mobile ALOHA, OPEx, Alpha Geometry, Cradle, DeepMind SIMA, GPT-4V, Data Interpreter, Voyager]

Some core challenges for LLM agents
[Figure: the agent system framework diagram, repeated]
How to represent and align multimodal input signals?
How to achieve real-time perception in dynamic settings?
How can agents handle incomplete or noisy data robustly?
How to perform complex tasks?
How to deal with unseen tasks?
How to learn and utilize domain knowledge?
How to effectively execute actions?
How to se