3741 - Improving Pinterest's Data Query Performance with Apache Gluten.pdf


IBM TechXchange, October 2025

Enhancing Pinterest's Data Platform
A 2025 Update on Apache Gluten Integration & our Spark Platform

About
- Felix, Software Engineer
- Zaheen, Engineering Manager
- Big Data Query Platform Team: SparkSQL, Trino

Agenda
1. Our Platform
2. A History of Spark
3. Platform Integration
4. Performance
5. Challenges & Learnings
6. Future Plans
7. Spark Task Retry

Our Platform: SparkSQL
- 60k+ Daily Scheduled SparkSQL queries
- 8k+ Daily Adhoc SparkSQL queries
- 500+ Daily Adhoc users
- 15k+ Worker Instances
- K8s (Migrating from YARN)
- Celeborn Shuffle Service
Source: Pinterest internal data; Global analysis; Q3 2025

2025 Pinterest

A History of Spark - Timeline
- Moved from Hive to Spark 2.4 in late 2021
- Spark provided significant performance gains over Hive
- In late 2022 we moved to Spark 3.2
- Spark costs were exploding
- Compute became memory bound
- Various projects introduced to reduce memory
- Little we can do with vanilla Spark to improve performance

A History of Spark - Architecture

A History of Spark - Always Improving
- The query platform team is always looking into ways to improve query performance
- Even a 10% improvement would result in significant savings for the business
- So we began looking at products that could help us

A History of Spark - Our Requirements
- Produce speed ups of at least 10% to make the ROI worth it
- Migration impact to users and difficulty of Migration
- Reduction in memory usage
- A Frictionless experience
- Users should be able to have speed ups without doing anything
- We shouldn't have to re-architect our entire system

A History of Spark - The market
There are many solutions on the market:
- Photon
- Nvidia Rapids
- DataFusion
- Starrocks
- ClickHouse
- Comet
- Velox

Why Gluten + Velox
- Increasing job requirements from customers
- ML jobs
- Clusters are often memory bound
- Larg
