Methods for evaluating your GenAI application quality
Michael Carbin, Alkis Polyzotis
June 2024
2024 Databricks Inc. All rights reserved

Agenda
Iterating on quality
Defining and measuring the quality of GenAI apps
Building high-quality GenAI apps in Databricks: Mosaic AI Agent Evaluation

Iterating GenAI App Quality
[Figure labels: PoC, Release Candidate, Production; Cost, Quality]

RAG Chain
[Figure labels: User request, RAG Chain, Response to user; Embedding and foundation models; Vector/search index; Enterprise data]

RAG Chain and Data Pipeline
[Figure labels: User query, Response to user]
RAG Chain: Understand query, Retrieve supporting data, Prompt augmentation, LLM Generation, Post-Processing, Guardrails
Supporting services: Embedding and foundation models, Vector/search index
Data Pipeline: Parse raw documents, Extract document metadata, Chunk documents, Embed documents, Sync to index
Source: Enterprise data

Iterating GenAI App Quality
[Figure labels: PoC, Production, Release Candidate; Cost, Quality]

Evaluation-Driven Development
[Figure labels: Gather requirements, Deploy POC, Define & Measure Quality, Improve Quality, Deploy, Monitor]

Evaluation-Driven Development
[Figure labels: Gather requirements, Deploy POC, Define & Measure Quality, Improve Quality, Deploy, Monitor; Evaluation Set]

Representative: r