Power and Performance Optimization on Intel Platforms
Zhang Rui (张锐), Senior Software Engineer, Intel

Understanding power features from a performance point of view

Power vs. Performance
- Performance
  - Latency
  - Throughput
- Power
  - core pstate
  - core cstate
  - uncore
  - RAPL
  - SST

Core cstate
- Why a custom table
  - Aggressive C1 vs. POLL idle
  - Separate C1/C1E states for finer-grained cstate control
  - Accurate latency descriptions for deep cstates
  - Work around firmware problems
- A larger exit latency may hurt latency-sensitive workloads
- More power savings allow busy CPUs to run at a higher frequency, for a longer time

Core pstate
- A higher frequency allows higher throughput
- EPP decides how fast/aggressively a CPU scales up to a higher frequency, so it affects latency as well

SST (Intel Speed Select Technology)
- Higher frequency for fewer cores, or for cores with higher priority
- SST-PP: fewer cores at a higher frequency
- SST-BF: high-priority cores have a higher base frequency
- SST-TF: high-priority cores have a higher turbo frequency
- SST-CP: low-priority cores are power capped first

PLR (Perf Limit Reasons)
- Perf limit reasons hint at where the frequency/performance bottleneck is
- FREQUENCY / CURRENT / POWER / THERMAL / PLATFORM / MCP / RAS / MISC / QOS / DFC

UFS (Uncore Frequency Scaling)
- Interface: /sys/devices/system/cpu/intel_uncore_frequency
- Hierarchy
  - Package
  - Power domain
  - Fabric cluster: the minimum granularity sharing the same frequency
  - Agent
- Control policy
  - User setting
  - ELC (Efficiency Latency Control) mode
  - Performance mode
- Uncore frequency control for better power distribution

UFS ELC (Efficiency Latency Control)
  if domain CPU utilization < elc_low_threshold_percent
      use elc_floor_freq_khz
  else if domain CPU utilization < elc_high_threshold_percent
      firmware heuristic/telemetry based frequency selection
  else if elc_high_threshold_enable = 1
      increase uncore frequency in 100 MHz steps until the power limit is reached
  else
      firmware heuristic/telemetry based frequency selection
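
The ELC decision chain above can be modeled as a small pure function. The following is a minimal Python sketch, not the firmware's actual implementation. It assumes both threshold comparisons mean "utilization is below the threshold". The names `UncoreAction`, `ElcSettings`, `elc_decide` and `read_elc_settings` are hypothetical and exist only for this illustration, as do the numeric values in the demo. `read_elc_settings` assumes the four `elc_*` attributes are plain integer files in a single directory passed in by the caller; the exact directory layout under `/sys/devices/system/cpu/intel_uncore_frequency` is not given here, so no path is hardcoded.

```python
from dataclasses import dataclass
from enum import Enum
from pathlib import Path


class UncoreAction(Enum):
    """Possible outcomes of the ELC decision chain (names are illustrative)."""
    USE_FLOOR = "use elc_floor_freq_khz"
    FIRMWARE_HEURISTIC = "firmware heuristic/telemetry based frequency selection"
    STEP_UP_100MHZ = "increase in 100 MHz steps until the power limit is reached"


@dataclass(frozen=True)
class ElcSettings:
    elc_floor_freq_khz: int
    elc_low_threshold_percent: int
    elc_high_threshold_percent: int
    elc_high_threshold_enable: int  # 1 = enabled, 0 = disabled


def elc_decide(domain_util_percent: float, s: ElcSettings) -> UncoreAction:
    """Mirror the if / else-if chain from the slide.

    Assumption: both comparisons are "utilization < threshold".
    """
    if domain_util_percent < s.elc_low_threshold_percent:
        return UncoreAction.USE_FLOOR
    if domain_util_percent < s.elc_high_threshold_percent:
        return UncoreAction.FIRMWARE_HEURISTIC
    if s.elc_high_threshold_enable == 1:
        return UncoreAction.STEP_UP_100MHZ
    return UncoreAction.FIRMWARE_HEURISTIC


def read_elc_settings(domain_dir: Path) -> ElcSettings:
    """Read the four ELC attributes from one directory.

    Assumption: each attribute is a text file holding a single integer.
    """
    def read_int(name: str) -> int:
        return int((domain_dir / name).read_text().strip())

    return ElcSettings(
        elc_floor_freq_khz=read_int("elc_floor_freq_khz"),
        elc_low_threshold_percent=read_int("elc_low_threshold_percent"),
        elc_high_threshold_percent=read_int("elc_high_threshold_percent"),
        elc_high_threshold_enable=read_int("elc_high_threshold_enable"),
    )


if __name__ == "__main__":
    # Illustrative values only; they are not recommended settings.
    settings = ElcSettings(
        elc_floor_freq_khz=800_000,
        elc_low_threshold_percent=10,
        elc_high_threshold_percent=95,
        elc_high_threshold_enable=1,
    )
    for util in (5, 50, 99):
        print(f"{util}% -> {elc_decide(util, settings).value}")
```

Keeping the decision separate from the file reads means the branch logic can be checked on any machine, including one without the uncore frequency interface.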