THE BEAUTY OF DELTA FOR POLYGLOT DATA AND ML WORKLOADS
Micha Kunze
Date: 2024-06-13
2024 Databricks Inc. All rights reserved

CONTEXT
We ship data and ML
- Transported by Maersk
- 20+ ML products
- 1200+ datasets for analytics and operational data
- Mixed batch and streaming

DATA PLATFORM
Batch & Streaming / Analytics & Operations / Decision Automation
- Apache Spark for batch and Structured Streaming
- Delta Lake
- Pandas
- Apache Flink
- All datasets available in a metastore
- Structured Streaming to feed operational stores
- 20+ ML products
- Simulations using historical data
- Integration with operational apps/services

THE BEAUTY OF DELTA METADATA
Self-described open table format
- No extra component needed (catalog)
- Protocol works with many engines
- Easy setup and testing
Rich metadata
- Transaction log + table history
- Change data feed
- Version control

RUN JOBS ONLY WHEN NEEDED

DELTA & ORCHESTRATION
Without Delta Lake
- Reactive in-house scheduler
- Runs when upstream ran
With Delta Lake
- Run only if delta log shows data changed
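
The "run only if delta log shows data changed" idea can be sketched roughly as follows. This is a minimal, standard-library-only illustration, not the scheduler described in the talk. It assumes the table lives on a local filesystem, that the JSON commit files in _delta_log are still present (it ignores checkpoints and log cleanup), and it relies on the dataChange flag that the Delta protocol records on add and remove actions. The function names commit_versions and data_changed_since are hypothetical.

```python
import json
import re
from pathlib import Path

# Commit files in a Delta table's _delta_log are named <20-digit version>.json
COMMIT_FILE = re.compile(r"^(\d{20})\.json$")


def commit_versions(table_path):
    """Return the sorted commit versions found in <table_path>/_delta_log."""
    log_dir = Path(table_path) / "_delta_log"
    if not log_dir.is_dir():
        return []
    versions = []
    for entry in log_dir.iterdir():
        match = COMMIT_FILE.match(entry.name)
        if match:
            versions.append(int(match.group(1)))
    return sorted(versions)


def data_changed_since(table_path, last_seen_version):
    """Return True if a commit newer than last_seen_version changed data.

    Pass last_seen_version=-1 for a table the job has never processed.
    """
    log_dir = Path(table_path) / "_delta_log"
    for version in commit_versions(table_path):
        if version <= last_seen_version:
            continue
        commit_file = log_dir / f"{version:020d}.json"
        with commit_file.open() as handle:
            # Each line of a commit file is one JSON-encoded action.
            for line in handle:
                if not line.strip():
                    continue
                action = json.loads(line)
                for kind in ("add", "remove"):
                    # dataChange=False marks rewrites such as compaction,
                    # which do not alter the table's logical content.
                    if kind in action and action[kind].get("dataChange", True):
                        return True
    return False
```

A downstream job could store the last table version it processed and skip its run whenever data_changed_since(table_path, last_version) returns False. Both the table path and the stored version are placeholders here.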