Fashion Forward, Security First: Supporting Gen AI at Scale
Florence Mottay, CISO, Zalando

Stand up if you've worked on an AI system.
Remain standing if you've ever worked on a red team or security assessment.
Remain standing if you've conducted prompt injection testing or similar techniques.

Fashion-forward security first: Where it all started

ChatGPT-powered Zalando assistant
"With our Zalando Assistant, we can help customers find what to wear for a certain occasion - a birthday party, a business meeting or even hiking to Machu Picchu. Customers can get inspired by a certain style, celebrity, or cultural moment - the possibilities are almost endless."

Security assessment
The risks we faced: Privacy, Security
But also: Biases; Inappropriate content; Misinformation, hallucination and robustness issues
A new world!

Threat modelling
[Diagram, with numbered callouts 1-3. Components: User, Adversary, App (e.g. semantic), LLM (e.g. ChatGPT), API, External resources, Persistent storage, Output (e.g. candidates), Outputs. Zones: External, Third-party, Zalando. Attack paths: Prompt Injection, Indirect Prompt Injection.]
LLM may have access to external sources (e.g. web or DBs)
LLM may have the capability to write to some persistent storage

Security assessment: A few examples
Will ZA fabricate information regarding Terms and Conditions, refund policy, shipping, ... at Zalando?
Will ZA provide the same outcome for all genders, all backgrounds of customers?
Is ZA susceptible to jailbreak attacks?

Remediation time

Fine tuning
Fine-tuned the model with classifier training
80K prompts
Today, every customer message is parsed by our safety classifier as well as the OpenAI Moderation API.

Fashion-forward security first: How it evolved

Two main pillars
AI threat modeling
AI red teaming

Framework Highlights
C