INV508
LLMs Reflecting on Reasoning: A Probabilistic VC-Theory Approach

Jae Oh Woo
Senior Applied Scientist
Mengdie (Flora) Wang
Data Scientist
AWS GenAI Innovation Center

© 2025, Amazon Web Services, Inc. or its affiliates. All rights reserved.

Agenda
- Introduction
- Background
- Experiment Design
- Results
- Practical Implications
- Conclusion

Introduction

Motivation & Problem Statement

The Challenge
As LLMs tackle increasingly complex reasoning tasks, one question emerges: Can they accurately evaluate their own reasoning quality?
- Essential for trustworthy AI systems
- Enables autonomous error detection
- Critical for high-stakes applications

Current Gaps
- Focus limited to final answer accuracy
- No theoretical self-assessment framework
- Confidence calibration unmeasured

Our Approach: Extending VC Theory

Classical VC Theory Limitations
- Designed for deterministic binary classifiers
- Cannot handle probabilistic outputs
- Inadequate for modern LLM evaluation

Our Solution: Probabilistic VC Framework
- PVC Dimension: Measures pattern recognition capacity
- C-PVC Dimension: Adds calibration for meaningful confidence
- Maintains theoretical rigor for probabilistic reasoning

Core Insight: Calibrated Self-reflection Capacity = Assessment Accuracy + Calibrated Confidence

Background