Slack's Migration to a Cellular Architecture
Cooper Bethea

Drainable Cellular Design Goals
1. Remove as much traffic as possible within 5 minutes
2. Drains must not add user-visible errors
3. Drains/undrains must be incremental
4. Draining mechanism must not rely on the impacted AZ

Option 1: Siloing

Option 2: Internally Managed Draining

What Service Goes Where?

Draining Mechanism

Coordination Headwind
https:/

Organizations Are Like Slime Molds
Independent teams will tend to go in random directions, or even actively different directions.
This means over time they tend to diverge.
But if they're all sighting off the same moon, over time they'll tend to naturally converge.
This allows an ideal pattern of loosely coupled, tightly aligned.

With a widely agreed upon strategy, there will be a smaller risk of disagreements.
1. Write a proposal and circulate.
2. Engage deeply with several high-value services.
3. Expand to all critical services.
4. Wind down, following the long tail.

Project Cadence
Not everything is worth doing.
Some things are worth doing but will take a long time.
Measure process with weekly drains.
Roadmaps are key.

When Is Good Enough Good Enough?

Where Does That Leave Us?
Siloed services drainable in 60s
Vitess automation can reparent at the speed of replication
Remaining critical services have roadmaps
“Happy path” for new services is to silo
Drains happen for incident response, rollout, even drills

Do We Actually Use This Thing?
Single-AZ AWS outages, of course
Internal service mesh clients can self-serve drains for rollout/rollback
Siloing helps prevent contagion cross-AZ
Drains are quick and painless, so easy to try

And What Did We Learn?
Listening to people is important.
Systems have been shaped by evolutionary pr