训练专家模型：自动化恶意软件开发.pdf

上传人：竿***

编号：981876

2025-11-29

PDF 26页 734.39KB

《训练专家模型：自动化恶意软件开发.pdf》由会员分享，可在线阅读，更多相关《训练专家模型：自动化恶意软件开发.pdf（26页珍藏版）》请在三个皮匠报告上搜索。

1、#BHUSA BlackHatEventsTraining Specialist ModelsTraining Specialist ModelsAutomating Malware DevelopmentKyle Avery#BHUSA BlackHatEventsKyle AveryKyle Avery R&D Outflank Red team background AI hobbyistwhoamiwhoamikyleaverykyleavery_#BHUSA BlackHatEventsagendaagendaProblems with current modelsIntro to

2、LLM trainingRL with verifiable rewardsCase study:Automating malware development#BHUSA BlackHatEventsToo bigDependent on third-party APIsToo smallLacks reasoning or accuracytwo types of LLM:two types of LLM:#BHUSA BlackHatEventswhat makes big models smarter?what makes big models smarter?Number of Ski

3、llsQuality per Skill8b70b405bModel Size#BHUSA BlackHatEventswhat makes big models smarter?what makes big models smarter?Number of SkillsQuality per Skill8b70b405bModel Size#BHUSA BlackHatEventswhat makes big models smarter?what makes big models smarter?Number of SkillsQuality per Skill8b70b405bModel

4、 Size#BHUSA BlackHatEventswhat makes big models smarter?what makes big models smarter?Number of SkillsQuality per Skill8b70b405bModel Size?#BHUSA BlackHatEventsCan a small,focused model outperform large generalists on a single task?#BHUSA BlackHatEventsCompress knowledge into the model Next-token pr

5、ediction on books,blogs,GitHub,Wikipedia,Reddit,etc.Results in a sort of“auto-completion”model,not a chatbotLLM preLLM pre-training training What is 2+2?Isnt it 4?What is 2-2?Isnt it 0?And what is 2x2?Isnt it 4?And what isThe sky isthe limit for UHV professors hobbies.Many children dream of flying h

6、igh in the sky.For one University of#BHUSA BlackHatEventsSupervised fine-tuning(SFT)Teaches model to follow instructions and format answers May also include tool examplesLLM postLLM post-training training systemYou are a helpful assistant.userWhat is 2+2?assistant2+2=4#BHUSA BlackHatEventsReinforcem

训练专家模型：自动化恶意软件开发.pdf

相关报告