IBM TechXchange 2025
Orlando, FL
October 6-9

Increasing productivity and reducing cognitive load in reliability engineering with AI
1491

Marios Fokaefs, Associate Professor, Director, Economics and Administration of Software Engineering Lab, York University
Michael Harrison, Software Engineer, IBM

Putting AI in your architecture diagram isn't enough, but strategic AI implementation can transform your systems

IBM TechXchange | 2025 IBM Corporation

Agenda
01 Real challenges at IBM scale
02 Our three-pronged AI approach
03 Case study 1: Anomaly in Billing
04 Case study 2: VULN management
05 Technical deep-dive (research)
06 Future directions

By the end of this session, you'll understand:
How to move from AI concepts to practical implementation
Specific techniques for reducing cognitive load in reliability engineering
Measurable approaches to AI-driven incident management
Real-world metrics and business impact
Research directions for intelligent reliability systems

IBM's Scale Challenge
Nearly 1 million customer resources under management
Hundreds of thousands of pods generating data
Billions of logs processed every day
Global infrastructure spanning multiple continents

Our Three-Pronged Approach
Multimodal incident data
Reduce mean-time-to-repair
Detect anomalies within hours/minutes
Root cause analysis through explainable models

Predictive Detection with Explainable AI
Traditional anomaly detection: find deviations from normal patterns
What happens when monitoring does not produce discernible patterns?
Is MTTR reduction still possible?

Pattern-Free Anomaly Dete