Scaling an Embedded Database for the Cloud: Challenges and Trade-Offs

About Me
Hello, my name is Stephanie!
Staff Software Engineer, MongoDB
ex-Founding Engineer, MotherDuck

What is DuckDB?
DuckDB: fast, in-process database system
- In-process: The database is inside your application!
- Single user: No coordination overhead.
- Embedded: The database comes to your data.

Popularity in the open-source community
Meteoric rise in database rankings

MotherDuck
Serverless DuckDB
A Cloud Data Warehouse built on top of DuckDB for easy and cost-effective data analytics.

Why is it hard?
Just toss DuckDB on EC2 and call it a Cloud database, right?

The Core Challenge: DuckDB is not a Client-Server Database

Client-Server Database Execution vs. DuckDB Execution
[Diagram: two panels. Client-server: Application, driver, Server; labels SQL, data, results. DuckDB: Application, DuckDB; labels SQL, data, results.]

DuckDB: fast, in-process database system
X Networked query execution
X External metadata management
X Multi-user concurrency controls
X Persistent, long-lived compute
In-process. Single user. Embedded.

Challenges of Cloudifying DuckDB
- Coupled compute and application
- Coupled compute and storage
- Limited concurrent reads and writes
- Distribution challenge

Coupled compute and application
Embedded Lifecycle Coupling
- DuckDB runs inside your application (e.g., Jupyter, CLI, API server).
- When your app starts, DuckDB starts.
- When your app exits, DuckDB disappears, along with all its memory, state, and open resources.
- DuckDB is a library, not a service.
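
The lifecycle coupling described above can be sketched in a few lines. This is only an illustration under stated assumptions: it uses Python's standard-library sqlite3 module as a stand-in for DuckDB (both are in-process engines, and sqlite3 needs no install), and the APP script, the events table, and the run_app helper are hypothetical names invented for the example. Each call starts a fresh "application" process that embeds the engine in memory; the second run finds none of the state the first run created.

```python
import subprocess
import sys

# Hypothetical "application": it embeds the database engine (sqlite3 here,
# standing in for DuckDB), checks for existing state, loads rows, then exits.
APP = """
import sqlite3
con = sqlite3.connect(":memory:")  # in-memory database owned by this process
tables = con.execute(
    "SELECT count(*) FROM sqlite_master WHERE name = 'events'"
).fetchone()[0]
if tables == 0:
    con.execute("CREATE TABLE events (id INTEGER)")
    con.execute("INSERT INTO events VALUES (1), (2), (3)")
rows = con.execute("SELECT count(*) FROM events").fetchone()[0]
print(f"tables_at_start={tables} rows={rows}")
"""


def run_app() -> str:
    """Start the application as a new process and return what it printed."""
    result = subprocess.run(
        [sys.executable, "-c", APP],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


# The database lives and dies with each application process.
first_run = run_app()
second_run = run_app()
print(first_run)
print(second_run)
```

Both runs report that no table existed at startup, because nothing outlives the process that embedded the engine. A service-style database would instead keep its compute and state alive between client connections, which is the gap the slide points at.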