
OpenAI: 2025 GPT-5.1-Codex-Max Technical Report (Chinese translation) (27 pages).pdf

Uploaded by: 1****1 · ID: 976579 · 2025-11-27 · PDF · 27 pages · 2.56 MB · 34 figures and tables



GPT-5.1-Codex-Max System Card
OpenAI
November 18, 2025

Contents

1 Introduction
2 Baseline Model Safety Evaluations
  2.1 Disallowed Content Evaluations
  2.2 Jailbreaks
  2.3 Vision
3 Product-Specific Risk Mitigations
  3.1 Agent sandbox
  3.2 Network access
4 Model-Specific Risk Mitigations
  4.1 Harmful Tasks
    4.1.1 Risk description
    4.1.2 Mitigation
      4.1.2.1 Safety training
  4.2 Prompt Injection
    4.2.1 Risk description
    4.2.2 Mitigation
      4.2.2.1 Safety training
  4.3 Avoid data-destructive actions
    4.3.1 Risk description
    4.3.2 Mitigation
      4.3.2.1 Safety training
5 Preparedness
  5.1 Capabilities Assessment
    5.1.1 Biological and Chemical
      5.1.1.1 Long-form Biological Risk Questions
      5.1.1.2 Multimodal Troubleshooting Virology
      5.1.1.3 ProtocolQA Open-Ended
      5.1.1.4 Tacit Knowledge and Troubleshooting
      5.1.1.5 Troubleshooting Bench
    5.1.2 Cybersecurity
      5.1.2.1 Capture-the-flag (professional)
      5.1.2.2 CVE-Bench
      5.1.2.3 Cyber Range
      5.1.2.4 External Evaluations by Irregular
      5.1.2.5 Preparing for High Cyber Capability
    5.1.3 AI Self-Improvement
      5.1.3.1 SWE-Lancer
      5.1.3.2 Paperbench-10 (n=10)
      5.1.3.3 MLE-bench-30 (n=30)
      5.1.3.4 OpenAI PRs
      5.1.3.5 OpenAI-Proof Q&A
      5.1.3.6 External Evaluations by METR
  5.2 Research Category Update: Sandbagging
    5.2.1 External Evaluations by Apollo Research

1 Introduction

GPT-5.1-Codex-Max is our new frontier agentic coding model. It is built on an update to our foundational reasoning model trained on agentic tasks across software engineering, math, research, medicine, computer use and more. It is our first model natively trained to operate across multiple context windows through a process called compaction, coherently working over millions of tokens in a single task. Like its predecessors, GPT-5.1-Codex-Max was trained on real-world software engineering tasks like PR creation, code rev


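Key points of the GPT-5.1-Codex-Max System Card, summarized:

1. **Model capabilities**: GPT-5.1-Codex-Max is OpenAI's new-generation coding model. It can handle multiple tasks and operate across multiple context windows.
2. **Safety measures**: The model underwent rigorous safety evaluation and training, including protections against harmful tasks, prompt injection, and data-destructive actions.
3. **Safety assessment**: GPT-5.1-Codex-Max is assessed as High in the biological and chemical domain. It does not reach the High level in cybersecurity or in AI self-improvement.
4. **Safety training**: The model is trained to refuse harmful tasks and to be more resistant to prompt injection and data-destructive behavior.
5. **Evaluation results**: In the cybersecurity capability evaluations, GPT-5.1-Codex-Max performs well on CVE-Bench and the CTF challenges, but does not reach the High level in the Cyber Range evaluation.
6. **Outlook**: OpenAI expects model capabilities to keep improving rapidly and possibly to reach the High cybersecurity level in the near future.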

