OCP Telemetry White Paper
A summary of the work led by hyperscalers on DC telemetry guidelines
Scott Sharp, Google
DATA CENTER FACILITY (DCF)

Telemetry
Telemetry is the collection of measurements or other data at remote points. Think of it like a hospital's EKG machine monitoring a patient's vital signs. Telemetry is how we monitor the data center's vital signs.
Telemetry gives real-time monitoring of critical systems for troubleshooting, optimization, and predicting workloads or problems before they arise.
New AI workloads are creating challenges and are evolving all the time.

Third Party Data Centers and Telemetry
Generally, the building management and power management systems are part of the base building and are owned by the service provider.
Base Building Systems:
Chillers, pumps, cooling towers, etc.
Medium voltage switchgear, low voltage switchboards, UPS systems, etc.
Tenant Systems:
Cold aisle temperatures and humidity
Bus duct end feed unit monitoring, tap box monitoring (if used), etc.
Challenge: How do we get data center vitals back to the customers for monitoring, adjusting workloads, and reporting?

Telemetry Importance for Future AI
Mechanical systems (chillers, generators, etc.) react more slowly than AI workloads.
Workloads fluctuate on a shorter timescale than mechanical equipment.
Increased rack densities mean the fluctuations increase in amplitude.
All other variables held constant, as rack density increases, the potential fluctuations will increase, which means more variability in infrastructure operation.

Telemetry Security
Building telemetry is critical, and security of those systems is imperative to operations.
Security applies to both the service provider and the tenants.
Building Management System (B