
Stanford University: Introduction to Large Language Models (LLM) (2024) (English edition) (72 pages).pdf

Uploaded by: Kell****reet | ID: 620742 | 2024-12-30 | PDF | 72 pages | 6.11 MB


Large Language Models

Introduction to Large Language Models

Language models
  Remember the simple n-gram language model:
    Assigns probabilities to sequences of words
    Generates text by sampling possible next words
    Is trained on counts computed from lots of text
  Large language models are similar and different:
    Assign probabilities to sequences of words
    Generate text by sampling possible next words
    Are trained by learning to guess the next word

Large language models
  Even though pretrained only to predict words
  Learn a lot of useful language knowledge
  Since training on a lot of text

Three architectures for large language models
  Decoders: GPT, Claude, Llama, Mixtral
  Encoders: BERT family, HuBERT
  Encoder-decoders: Flan-T5, Whisper

Pretraining for three types of architectures
  The neural architecture influences the type of pretraining, and natural use cases.
  Decoders: Language models! What we've seen so far. Nice to generate from; can't condition on future words.
  Encoders: Get bidirectional context; can condition on future! How do we train them to build strong representations?
  Encoder-decoders: Good parts of decoders and encoders? What's the best way to pretrain them?
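The n-gram model recalled at the start of these slides can be made concrete with a small sketch. The code below is a minimal bigram (n = 2) model written for illustration only: the toy corpus, the function names, and the fixed random seed are all assumptions and do not come from the slides. It shows the three properties the slides list: training on counts, assigning a probability to a word sequence, and generating text by sampling the next word.

```python
import random
from collections import Counter, defaultdict


def train_bigram(tokens):
    """Count how often each word follows each preceding word."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1
    return counts


def next_word_probs(counts, prev):
    """Normalize the counts for one context into a probability distribution."""
    total = sum(counts[prev].values())
    return {word: c / total for word, c in counts[prev].items()}


def sequence_prob(counts, tokens):
    """Probability of tokens[1:] given tokens[0], as a product of bigram probabilities."""
    prob = 1.0
    for prev, nxt in zip(tokens, tokens[1:]):
        total = sum(counts[prev].values())
        if total == 0:
            return 0.0  # unseen context: this unsmoothed model assigns zero
        prob *= counts[prev][nxt] / total
    return prob


def generate(counts, start, max_len=10, seed=0):
    """Sample one next word at a time until max_len or a dead-end context."""
    rng = random.Random(seed)
    out = [start]
    while len(out) < max_len and counts[out[-1]]:
        probs = next_word_probs(counts, out[-1])
        words = list(probs)
        weights = [probs[w] for w in words]
        out.append(rng.choices(words, weights=weights)[0])
    return out


# Toy corpus (hypothetical, chosen only to keep the counts easy to check by hand).
corpus = "the cat sat on the mat . the dog sat on the rug .".split()
model = train_bigram(corpus)

the_probs = next_word_probs(model, "the")
p_short = sequence_prob(model, "the cat sat".split())
sample = generate(model, "the")
print(the_probs)
print(p_short)
print(" ".join(sample))
```

A large language model keeps the same interface (probabilities over next words, generation by sampling) but, as the slides note, replaces the count table with a network trained to guess the next word.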


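Summary: This document introduces the principles, applications, and challenges of large language models (LLMs).
1. LLMs learn language knowledge by predicting the next word, and can be used for tasks such as text generation, text completion, sentiment analysis, and question answering.
2. LLM training has two stages: pretraining and fine-tuning. Pretraining is usually run on a large text corpus and learns language knowledge by predicting the next word; fine-tuning adjusts the model parameters on a specific task to adapt the model to a new application.
3. LLM performance is closely tied to model size, training data size, and compute. As model scale grows, performance improves following a power law.
4. In practical use LLMs have problems such as hallucination, copyright infringement, privacy leakage, and toxic content generation.
5. To cope with the scale of LLMs, researchers have proposed methods such as parameter-efficient fine-tuning (PEFT), which make fine-tuning more efficient by updating only a subset of the model's parameters.
6. Overall, LLMs have driven major progress in natural language processing, but they still face many challenges that call for further research and improvement.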
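To give a feel for why updating only a subset of parameters helps, here is a rough parameter-count sketch. It assumes a low-rank adapter in the style of LoRA, which is one well-known PEFT method but is not named in the summary above: a frozen weight matrix of shape d_out x d_in is left untouched, and only two small matrices of shapes d_out x rank and rank x d_in are trained. The layer sizes and the rank are illustrative values, not figures from the document.

```python
def full_finetune_params(d_out, d_in):
    """Trainable parameters when the whole weight matrix is updated."""
    return d_out * d_in


def low_rank_adapter_params(d_out, d_in, rank):
    """Trainable parameters for a low-rank update B @ A added to a frozen matrix.

    B has shape (d_out, rank) and A has shape (rank, d_in).
    """
    return rank * (d_out + d_in)


# Illustrative sizes only (assumed, not taken from the source).
d_out, d_in, rank = 4096, 4096, 8

full = full_finetune_params(d_out, d_in)
adapter = low_rank_adapter_params(d_out, d_in, rank)
fraction = adapter / full

print(f"full fine-tuning: {full:,} trainable parameters")
print(f"low-rank adapter: {adapter:,} trainable parameters")
print(f"fraction trained: {fraction:.4%}")
```

Under these assumed sizes the adapter trains 1/256 of the parameters of the single layer, which is the kind of saving the summary's point 5 refers to.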

