IBM TechXchange 2025
Orlando, FL, October 6-9

1480
GPU Job Scheduling Using Deep Reinforcement Learning

Dr. Aijun An
Professor, Department of Electrical Engineering and Computer Science, York University

Michael Feiman
Principal software architect, IBM, Spectrum Computing and IBM Cloud HPC

Agenda
01 Introduction to IBM Spectrum
02 Spectrum LSF and GPU scheduling
03 Partnership between IBM & York U
04 Research objectives
05 Methods
06 Evaluation

IBM TechXchange / 2025 IBM Corporation

IBM Spectrum and HPC workloads
HPC Simulations
Analytics, ML/DL
HPC Data analysis

Analytics, Machine & Deep Learning
HPC Simulation
Life Sciences, Material Science, CAE, Oil & Gas
High Performance Data Analysis
Financial Analytics, Social Analytics, Big Data, Business Intelligence

Who is using IBM Spectrum Computing?
14 of top 20 Global Banks
7 of top 10 Aero & Defense Companies
Top 3 Cancer Centers in the US
10 of top 12 Automotive Companies
23 of top 25 Electronic Companies
23 of the 30 largest commercial enterprises in the world

Hybrid HPC Cloud
On Premise / Cloud
Data is staged from on prem to cloud before hosts are provisioned. It also caches files to avoid repeatedly moving the same files, and it returns results to on prem out of band.
Cloud resources are autoscaled based upon workload demands and policies.
Portal, command line and RESTful API for submission and monitoring.
Workload is forwarded to the appropriate cloud based on site-defined policies.

IBM Spectrum LSF: GPUs
Feature-rich workload manager for Hybrid HPC Cloud.
Intelligent autoscaling
Integrat