The Harsh Reality of Building a Realtime ML Feature Platform
Ivan Burmistrov
Principal Software Engineer, ShareChat

Story time
Partners
Models don't simply tell what's wrong
The least we can do is to ensure the data is OK

Hi! I'm Ivan Burmistrov
Principal Software Engineer, ShareChat
Data infrastructure for ML
In the past:

Background: Moj, a Short Video App
TikTok-like app
Fully Personalised Feed
20M DAU, 100M MAU
8K RPS for Feed
2K candidates being ranked on each request

Agenda
01 Background: Introduction to Feature Platform
02 Architecture: High-level overview of Feature Platform components
03 Challenges & Solutions: What to expect & how to overcome
04 Takeaways: Final remarks

Background: Features
Window Counters: “Post likes in the last 30m”, “User clicks in the last 1d”
Properties: “Post language”, “Post creation date”
Last N: “Last 100 user actions”, “Last 10 viewed posts”
Lifetime counters: “Post likes total”, “User total engagements”

What is Feature Platform?

High-level architecture

High-level architecture: Window Counters

Story 1: Database & Streaming Engines
The boring one

Story 2: Stream Processing
About being naive

Streaming SQL for features

SQL-DataStream: Changelog

SQL+DataStream Architecture
fun updateState(rowUpdate: Row)
var featuresState
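
The changelog slides suggest that results computed in streaming SQL reach the DataStream side as row-level changes, which are then folded into per-entity feature state. The sketch below illustrates that general idea only; it is not the speaker's implementation. `ChangeKind`, `ChangeRow`, `FeatureState` and `applyChange` are hypothetical names, the state is a plain in-memory map rather than engine-managed state, and the four change kinds are an assumption modelled on the insert / update-before / update-after / delete row kinds used by engines such as Apache Flink.

```kotlin
// Hypothetical change kinds (assumption: the four kinds a changelog stream typically carries).
enum class ChangeKind { INSERT, UPDATE_BEFORE, UPDATE_AFTER, DELETE }

// Hypothetical changelog row: which entity and feature changed, the value, and the kind of change.
data class ChangeRow(
    val kind: ChangeKind,
    val entityId: String,
    val feature: String,
    val value: Long
)

// Illustrative in-memory state: entityId -> (feature name -> latest value).
class FeatureState {
    private val features = mutableMapOf<String, MutableMap<String, Long>>()

    fun applyChange(row: ChangeRow) {
        when (row.kind) {
            // A new or updated value overwrites whatever was stored before.
            ChangeKind.INSERT, ChangeKind.UPDATE_AFTER -> {
                val entity = features.getOrPut(row.entityId) { mutableMapOf() }
                entity[row.feature] = row.value
            }
            // The retraction of the old value is ignored here,
            // because the UPDATE_AFTER row that follows overwrites it.
            ChangeKind.UPDATE_BEFORE -> Unit
            // A delete removes the feature, and the entity once it has no features left.
            ChangeKind.DELETE -> {
                val entity = features[row.entityId] ?: return
                entity.remove(row.feature)
                if (entity.isEmpty()) features.remove(row.entityId)
            }
        }
    }

    fun get(entityId: String, feature: String): Long? = features[entityId]?.get(feature)
}

fun main() {
    val state = FeatureState()
    state.applyChange(ChangeRow(ChangeKind.INSERT, "post-1", "likes_30m", 1))
    state.applyChange(ChangeRow(ChangeKind.UPDATE_BEFORE, "post-1", "likes_30m", 1))
    state.applyChange(ChangeRow(ChangeKind.UPDATE_AFTER, "post-1", "likes_30m", 2))
    println(state.get("post-1", "likes_30m")) // prints 2
    state.applyChange(ChangeRow(ChangeKind.DELETE, "post-1", "likes_30m", 2))
    println(state.get("post-1", "likes_30m")) // prints null
}
```

Ignoring `UPDATE_BEFORE` is only safe for "latest value wins" features like the one shown; an aggregate that sums deltas would have to subtract the retracted value instead.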