arXiv:2505.19414v2 Announce Type: replace
Abstract: The revolution in artificial intelligence (AI) has brought sustainability challenges to data center management due to the high carbon emissions and short cooling response time associated with high-power-density racks. While machine learning (ML) offers promise for intelligent management, its adoption is hindered by safety and reliability concerns. To address this, we propose a multiphysics-informed machine learning (MPIML) framework that integrates physical priors into data-driven models for enhanced accuracy and safety. We introduce an integrated system architecture comprising three core engines: DCLib for versatile facility modeling, DCTwin for high-fidelity multiphysics simulation, and DCBrain for decision-making optimization. This system enables critical predictive and prescriptive applications, such as carbon-aware IT provisioning, safety-aware intelligent cooling control, and battery health forecasting. An illustrative example on industry-grade data center cooling control demonstrates that our MPIML approach reduces annual carbon emissions by up to 200 kilotons compared with conventional methods while ensuring that operational constraints are met. We conclude by outlining key challenges and future directions for developing autonomous and sustainable data centers.