Unified DataStream API for Streaming and Batch Execution

马国维(黎钢), Alibaba Group
高赟(云骞), Alibaba Group

#1 Status and Targets
#2 Semantics
#3 Implementation
Summary

Users should use operators with specified time characteristics.

* In 1.12, the processing-time interfaces (processing-time windows, processing-time timer registration, etc.) are not fully disabled in Batch mode, so jobs that use processing time can still run there. This allows all jobs to run in both modes. However, the results may be nondeterministic and, as noted earlier, you need to consider whether such a job is actually meaningful.

Event Time?
[Figure: record stream with watermarks: 11, 15, 13, W(16), 14, 19, 20, W(22), 24. Label: Stale Records.]
* Too complex for batch mode

config.set(ExecutionOptions.RUNTIME_MODE, RuntimeExecutionMode.BATCH);
// or: config.set(ExecutionOptions.RUNTIME_MODE, RuntimeExecutionMode.STREAMING);
env.configure(config, getClass().getClassLoad
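To make the stale-record idea from the event-time slide concrete, here is a minimal plain-Java sketch. It does not use Flink, and it is not how Flink implements watermarks. It assumes the run of digits on the slide reads as the timestamps 11, 15, 13, then watermark W(16), then 14, 19, 20, then W(22), then 24. The class name `StaleRecordSketch` and the method `findStaleRecords` are hypothetical names chosen for illustration. A record is treated as stale when its timestamp is less than or equal to the highest watermark seen so far.

```java
import java.util.ArrayList;
import java.util.List;

public class StaleRecordSketch {

    /**
     * Scans a stream of tokens that are either timestamps ("14") or
     * watermarks ("W(16)") and returns the timestamps that arrive after a
     * watermark that already covers them. Hypothetical helper, not a Flink API.
     */
    public static List<Long> findStaleRecords(String[] events) {
        long watermark = Long.MIN_VALUE;      // no watermark seen yet
        List<Long> stale = new ArrayList<>();
        for (String event : events) {
            if (event.startsWith("W(")) {
                // Watermarks only move forward.
                long w = Long.parseLong(event.substring(2, event.length() - 1));
                watermark = Math.max(watermark, w);
            } else {
                long timestamp = Long.parseLong(event);
                if (timestamp <= watermark) {
                    stale.add(timestamp);     // arrived after its watermark
                }
            }
        }
        return stale;
    }

    public static void main(String[] args) {
        // Assumed reading of the sequence on the slide.
        String[] events = {"11", "15", "13", "W(16)", "14", "19", "20", "W(22)", "24"};
        System.out.println(findStaleRecords(events)); // prints [14]
    }
}
```

Under this reading, only 14 is stale, because it arrives after W(16). If the same bounded input were sorted by timestamp and a single watermark were emitted at the end, no record would be stale, which may be part of why the slide calls this machinery too complex for batch mode.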