Issue Brief
February 2025

Putting Explainable AI to the Test
A Critical Look at AI Evaluation Approaches

Authors
Mina Narayanan
Christian Schoeberl
Tim G. J. Rudner

Center for Security and Emerging Technology

Executive Summary

Policymakers frequently invoke explainability and interpretability as key principles that responsible and safe AI systems should uphold. However, it is unclear how evaluations of explainability and interpretability methods are conducted in practice. To
examine evaluations of these methods, we conducted a literature review of studies that focus on the explainability and interpretability of recommendation systems, a type of AI system that often uses explanations. Specifically, we analyzed how researchers (1) describe explainability and interpretability and (2) evaluate their explainability and interpretability claims in the context of AI-enabled recommendation systems. We focused on evaluation approaches in the research literature because data on AI developers' evaluation approaches is not always publicly available, and researchers' approaches can guide the types of evaluations that AI developers adopt.

We find that researchers describe explainability and interpretability in variable ways across papers and do not clearly differentiate explainability from interpretability. We also identify five evaluation approaches that researchers adopt (case studies, comparative evaluations, parameter tuning, surveys, and operational evaluations) and observe that research papers strongly favor evaluations of system correctness over evaluations of system effectiveness. These evaluations serve important but distinct purposes. Evaluations of system correctness test whether