WPS202

Chaos & continuity: Using GenAI to improve humanitarian workload resilience

Mike George (he/him)
Principal Solutions Architect
Amazon Web Services

© 2025, Amazon Web Services, Inc. or its affiliates. All rights reserved.

Categories of failure (SEEMS)

Single points of failure: Lack of redundancy or fault tolerance
Excessive load: Not having sufficient capacity/resources/limits
Excessive latency: Not responding in the expected time
Misconfiguration and bugs: Incorrect execution
Shared fate: Violating intended fault isolation

Resilience

Ability of a workload to recover from infrastructure or service disruptions

The mental model

Continuous improvement: CI/CD, observability, moving beyond pre-deployment testing towards chaos engineering patterns
High availability: Resistance to common failures through design and operational mechanisms at a primary site. Core services, design goals to meet availability goals
Disaster recovery: Returning to normal operation
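
The "excessive latency" category and the move towards chaos engineering patterns can be illustrated with a small experiment. The sketch below is not from the slides: it is a minimal, assumed example in which `slow_dependency`, `call_with_timeout`, and `run_experiment` are hypothetical names, the injected delay is simulated with `time.sleep`, and the timeout values are arbitrary. It checks one hypothesis: when a dependency does not respond in the expected time, the caller degrades to a fallback instead of hanging.

```python
import concurrent.futures
import time


def slow_dependency(injected_delay_s: float) -> str:
    """Hypothetical downstream call; the sleep stands in for injected latency."""
    time.sleep(injected_delay_s)
    return "live"


def call_with_timeout(injected_delay_s: float, timeout_s: float) -> str:
    """Return the dependency's answer, or a fallback if it misses the deadline."""
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = pool.submit(slow_dependency, injected_delay_s)
    try:
        return future.result(timeout=timeout_s)
    except concurrent.futures.TimeoutError:
        # The dependency did not respond in the expected time: degrade gracefully.
        return "fallback"
    finally:
        # Do not block the caller on the still-sleeping worker thread.
        pool.shutdown(wait=False)


def run_experiment(timeout_s: float = 0.5) -> dict:
    """Compare steady-state behavior with behavior under injected latency."""
    return {
        "baseline": call_with_timeout(0.0, timeout_s),
        "latency_injected": call_with_timeout(1.0, timeout_s),
    }


if __name__ == "__main__":
    print(run_experiment())
```

In a real workload the delay would be injected by a fault-injection tool rather than a sleep, and the observation would come from the workload's own metrics. The structure stays the same: define steady state, inject one fault, and compare the two.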