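Shanghai Artificial Intelligence Laboratory: 2025 Frontier AI Risk Management Framework in Practice: A Risk Analysis Technical Report (English original + translation) (97 pages).pdf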

SafeWork

Frontier AI Risk Management Framework in Practice: A Risk Analysis Technical Report

Shanghai Artificial Intelligence Laboratory

Abstract

To understand and identify the unprecedented risks posed by rapidly advancing artificial intelligence (AI) models, this report presents a comprehensive assessment of their frontier risks. Drawing on the E-T-C analysis (deployment environment, threat source, enabling capability) from the Frontier AI Risk Management Framework (v1.0) (SafeWork-F1-Framework) (Shanghai AI Lab & Concordia AI, 2025), we identify critical risks in seven areas: cyber offense, biological and chemical risks, persuasion and manipulation, uncontrolled autonomous AI R&D, strategic deception and scheming, self-replication, and collusion. Guided by the “AI-45° Law,” we evaluate these risks using “red lines” (intolerable thresholds) and “yellow lines” (early warning indicators) to define risk zones: green (manageable risk for routine deployment and continuous monitoring), yellow (requiring strengthened mitigations and controlled deployment), and red (necessitating suspension of development and/or deployment). Experimental results show that all recent frontier AI models reside in green and yellow zones, without crossing red lines. Specifically, no evaluated models cross the yellow line for cyber offense or uncontrolled AI R&D risks. For self-replication, and strategic deception and scheming, most models remain in the green zone, except for certain reasoning models in the yellow zone. In persuasion and manipulation, most models are in the yellow zone due to their effective influence on humans. For biological and chemical risks, we are unable to rule out the possibility of most models residing in the yellow zone, although detailed threat modeling and in-depth assessment