Improving Developer Experience using automated data CI/CD pipelines
Simona Pencea & Nomi Vnyi
April 2024
By Elisabeth Baumann - SMB Digital, Public Domain

Agenda
  01 Testing with separate data branches
  02 Zero downtime migrations

Testing with data branches
  01 Code development flow
  02 Data testing improvements
  03 Data copying deep dive
  04 Data masking

1. Code development workflow

2. Data testing improvements - using production data
  Pros
    It's real
    It's fast
    It's large
  Cons
    Privacy issues - PII & PHI data
    Large does not mean complete
    Refreshing takes time
    Data incompatibility

  Data privacy
  Size of the dataset

  Use production data, but make it safe. And fast! And automated.

  1. Automated workflow
    GitHub as a standard workflow
    Database branch creation triggered by Git PR creation
    Copy data after new branch gets created

  2. Use production data
    Fast
      Preemptive copy
      Copy on write
      Smaller data set
    Complete
      Links as data type
      "Resolve" the links
      Optimize for time, not size

3. Data copy deep dive
  Complete subset data copy
    Where do we start
    How do we advance
    How do we know we got a complete data set
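The three questions on the subset-copy slide (where to start, how to advance, how to know the data set is complete) can be illustrated with a small sketch. This is not the speakers' implementation. It assumes tables held in memory as dictionaries, foreign keys described as (column, target table) pairs, and hypothetical table names (users, teams, tasks). It starts from seed rows, advances by following every link of every selected row, and treats the subset as complete when no selected row points outside it.

```python
from collections import deque


def collect_subset(tables, foreign_keys, seeds):
    """Return {table: set of row ids} closed under foreign-key links.

    tables:       {table: {row_id: {column: value}}}  (hypothetical layout)
    foreign_keys: {table: [(column, target_table), ...]}
    seeds:        iterable of (table, row_id) starting points
    """
    selected = {name: set() for name in tables}
    queue = deque()
    for table, row_id in seeds:
        if row_id not in selected[table]:
            selected[table].add(row_id)
            queue.append((table, row_id))

    while queue:
        table, row_id = queue.popleft()
        row = tables[table][row_id]
        # Follow every link of this row, not only the first one found.
        for column, target in foreign_keys.get(table, []):
            ref = row.get(column)
            if ref is None:
                # The schema allows this link, but this row does not use it.
                continue
            if ref not in selected[target]:
                selected[target].add(ref)
                queue.append((target, ref))
    return selected


def is_complete(tables, foreign_keys, selected):
    """True if every non-empty link of a selected row resolves inside the subset."""
    for table, ids in selected.items():
        for row_id in ids:
            row = tables[table][row_id]
            for column, target in foreign_keys.get(table, []):
                ref = row.get(column)
                if ref is not None and ref not in selected[target]:
                    return False
    return True


# Hypothetical example data, for illustration only.
tables = {
    "users": {1: {"name": "a"}, 2: {"name": "b"}},
    "teams": {10: {"owner_id": 1}, 11: {"owner_id": 2}},
    "tasks": {
        100: {"team_id": 10, "assignee_id": 2},
        101: {"team_id": 11, "assignee_id": None},
    },
}
foreign_keys = {
    "teams": [("owner_id", "users")],
    "tasks": [("team_id", "teams"), ("assignee_id", "users")],
}

subset = collect_subset(tables, foreign_keys, [("tasks", 100)])
```

Seeding with task 100 pulls in team 10 and both users, because the task links to user 2 directly and to user 1 through the team owner. The sketch only follows outgoing links; whether rows that point at the selected rows should also be copied is a separate design decision that the slides do not settle here.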
The anatomy of the schema
  The table order gives us the starting point: T7, T1 T2, T3 T4, T5 T6

Complete subset data copy
  Where do we start
  How do we advance
  How do we know we got a complete data set

  The schema says what is possible, but it is not mandatory that all the links are full.
  The static analysis we did on the table projected on individual rows.
  Always look for the next step exhaustively; do not stop at the first "next" record.
  Always allow