
Fudan University: 2025 Report on the Sources and Boundaries of Large Model Capabilities (55 pages).pdf

Uploaded by: 杨*** | ID: 654000 | 2025-05-21 | 55 pages | 12.49 MB


Zhang Qi (张奇), Fudan University
Sources and Boundaries of Large Language Model Capabilities

Input: Fudan University is located in

LLaMA2-13B (PPL 5.877): Fudan University is located in Shanghai, China. It is locally known as 复旦大学. The university was established in 1905. It is accredited by Ministry of Education of the Peoples Republic of China. There are over 40,000 students studying in various courses offered by Fudan University. The language of instruction is Chinese.

LLaMA2-13B, modified outside the language core region, other LayerNorm dimensions scaled 10x (PPL 5.914): Fudan University is located in Shanghai, China, the largest city with the most economic and cultural activities in China. With the most advanced infrastructure and the best living condition, it has become the international education center with the largest oversea students. It consists of Jinan, Kangqiao and Fenglin campus, which boasts the best resources from both education and research. Fudan University has been a famous and attractive university for international students, especially in the past one decade from 2001-2010.

LLaMA2-13B, modified inside the language core region, 1 dimension scaled 10x (PPL 376079936): Fudan University is located in NoSYouThereThatAThis#ThisThistThe/Whatthdv

Modifying just 1 of the 13 billion parameters throws the model into complete disarray.

Breakthroughs in the foundational theory of large language models; 80+ papers published on large models.
Unveiling Linguistic Regions in Large Language Models, ACL 2024
This work is the first internationally to propose a theory of language core regions and dimension dependence in large language models, which can effectively guide the LLM training process.

Disrupting the Arabic/Vietnamese regions
ArabicMMLU: Assessing Massive Multitask Language Understanding in Arabic (Koto et al., arXiv 2024)

1. Language core regions and dimension dependence in large language models

1. Where are the capability boundaries of large models?

Knowledge utilization hierarchy diagram (Wang et al., Knowledge Mechanisms in Large Language Models: A Survey and Perspective, EMNLP 2024)
Which level are large models at today? Which level can they reach in the future? Currently?
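The PPL values quoted in the model comparison above are perplexities. As a rough illustration of how such a number is usually derived, the sketch below computes perplexity as the exponential of the negative mean per-token log-probability. The function name `perplexity` and the probability values are hypothetical, chosen only for illustration; they are not taken from the report, and the report does not state how its own figures were computed.

```python
import math


def perplexity(token_log_probs):
    """Return exp(-mean(log p)) for a list of natural-log token probabilities."""
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    mean_nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(mean_nll)


# Hypothetical per-token probabilities (assumed values, not from the report).
# A model that finds the text plausible assigns fairly high probabilities...
plausible = [math.log(p) for p in (0.5, 0.25, 0.5, 0.25)]
# ...while a broken model assigns tiny probabilities to the same tokens.
broken = [math.log(p) for p in (1e-4, 1e-5, 1e-4, 1e-5)]

print("plausible text PPL:", perplexity(plausible))
print("broken model PPL:", perplexity(broken))
```

A model that guesses uniformly over a vocabulary of V tokens has a perplexity of exactly V, so a perplexity far above the vocabulary size means the model is assigning very low probability to the text it is scored on.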

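This report mainly discusses the sources and boundaries of the capabilities of the large language model LLaMA2-13B. Fudan University is located in Shanghai, China, was founded in 1905, and is a comprehensive university accredited by China's Ministry of Education, with more than 40,000 students. The report attributes large language model capabilities to pre-training and post-training: pre-training lets the model memorize knowledge and learn distributed semantic representations, while post-training fine-tunes that knowledge and activates the abilities acquired in pre-training. Reinforcement learning is also used to improve the model's reasoning ability. Large models still have boundaries, however: on gaokao (college entrance exam) math problems, even when the model answers correctly, the proportion of cases in which the computation does not match the answer is high. The report also notes that data requirements differ widely across LLMs, and that post-training on data with a higher memorization level can improve an LLM's performance at the corresponding knowledge level. In short, large model capabilities come from several sources, including pre-training, post-training, and reinforcement learning, but challenges and boundaries remain in practical use.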