
Data center ops: How AI and ML are boosting efficiency and resilience


Apr 13, 2023
This article is part of a VB special issue. Read the full series here: Data centers in 2023: How to do more with less.

Data centers must deliver more granular, real-time data to keep retailers’ operations resilient, responsive and online despite potential security and outage threats. Unpredictable supply chains, chronic labor shortages, spiraling inflation and energy costs are just a few of the challenges that retail CIOs and senior management teams face when optimizing their data centers. 

AI and machine learning (ML) can help identify how existing data centers can be redesigned to make them less rigid, less siloed and more reliable. One of the key goals of using AI and ML is to troubleshoot why so many outages are occurring on-premises and in the cloud. Add to that spiraling electricity costs and the need to optimize data center performance against aggressive sustainability targets, and data centers become a perfect use case for solving complex problems with AI and ML.

“Workload volumes are set to continue growing at around 20% a year between now and 2025. Traditional data center approaches are struggling to meet these escalating requirements,” writes Tracy Collins, VP of Americas at EkkoSense.

According to Brons Larson, AI strategy lead at Dell, “data centers can leverage AI/ML to improve performance and optimize configuration and deployments.” 


Wendy Zhao, senior director and principal engineer at Alibaba Cloud Intelligence, adds that “AI and ML continue to make great strides in their evolution, and they are now having a tangible impact on data center operations and IT management.”

And according to IDC, 50% of IT assets in data centers will run autonomously due to embedded AI functionality. For enterprises investing in AI to automate their IT infrastructure, the firm says, improving customer satisfaction and automating decision-making and repetitive tasks are the top organization-wide benefits. 

AI and ML gaining adoption

Last year, more than half (57%) of data center operators said they would trust AI to make routine operational decisions, up from 49% in 2021. Given how manually intensive many tasks are in data centers, AI and ML could significantly reduce costs and increase efficiency.

CIOs tell VentureBeat that reducing outages, strengthening multisite resiliency, optimizing direct liquid cooling (DLC) and improving capacity planning and security are the challenging problems where they’re interested in applying AI- and ML-based solutions.

Energy costs are soaring, which makes operating data centers under budget more challenging. CIOs and data center operators are concentrating on evaluating how software-defined power and AI can help substantially reduce energy and cooling costs.

Equinix is a global provider of data center services and network infrastructure for many of the world’s leading enterprises. Their CIO Milind Wagle says that the company operates a fleet of more than 220 data centers in 26 countries. They’re using AI to tune up their internet ‘engine room’ by estimating how much power and space will be consumed in their data centers.

Using AI to optimize data center operations and reduce power and cooling usage is predicted to significantly reduce operating costs over the next three years. Source: Uptime Institute Global Data Center Survey 2022: Resiliency remains critical in a volatile world

Where AI can help optimize data center performance

Utilizing AI, CIOs and data center operators can optimize power consumption and improve power usage effectiveness (PUE) for future efficiency gains. As sustainability pressures increase industry-wide, many operators are unprepared to meet carbon emissions reporting requirements.
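For readers unfamiliar with the metric, PUE is the ratio of total facility energy to the energy consumed by IT equipment alone, so a value closer to 1.0 means less overhead from cooling and power distribution. The sketch below is illustrative only: the function name, site names and kWh readings are hypothetical and do not come from any operator cited here.

```python
def power_usage_effectiveness(total_facility_kwh: float, it_equipment_kwh: float) -> float:
    """Return PUE = total facility energy / IT equipment energy."""
    if it_equipment_kwh <= 0:
        raise ValueError("IT equipment energy must be positive")
    if total_facility_kwh < it_equipment_kwh:
        raise ValueError("total facility energy cannot be below IT equipment energy")
    return total_facility_kwh / it_equipment_kwh


# Hypothetical monthly meter readings: (total facility kWh, IT equipment kWh).
readings = {
    "site_a": (1_500_000, 1_000_000),
    "site_b": (1_200_000, 1_000_000),
}

pue_by_site = {
    site: power_usage_effectiveness(total, it)
    for site, (total, it) in readings.items()
}
# site_a -> 1.5, site_b -> 1.2; site_b spends less energy on non-IT overhead.
```

Tracking this ratio per site over time gives an optimization effort, AI-driven or otherwise, a concrete number to move.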

In addition, outages continue to be costly and frequent, with cloud applications especially susceptible. AI has the potential to help resolve a number of these issues by enhancing efficiency, decreasing outages and streamlining operations. The following are key areas where AI can help optimize data center performance:

Improve capacity planning and resource allocation

Real-time data is critical to capacity planning and resource allocation across any data center. Real-time data holds insights into where, how and what needs to be optimized to improve performance. One critical area is identifying any bottlenecks in capacity planning and load balancing. These are constraint-based problems that supervised ML algorithms excel at solving. Getting capacity planning and resource allocation right is critical to running a thriving data center under budget.
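As a rough illustration of turning monitoring data into a capacity-planning signal, the sketch below fits a least-squares trend line to recent peak power readings and estimates how many periods remain before the trend reaches a capacity limit. It is a deliberately simple stand-in, not the supervised models the article refers to; the function names, readings and 160 kW limit are all hypothetical.

```python
import math


def fit_linear_trend(values):
    """Least-squares line through the points (0, values[0]), (1, values[1]), ..."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(values))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def periods_until_capacity(values, capacity):
    """Periods after the last observation until the trend reaches capacity.

    Returns None when the fitted trend is flat or falling.
    """
    slope, intercept = fit_linear_trend(values)
    if slope <= 0:
        return None
    crossing = math.ceil((capacity - intercept) / slope)
    return max(0, crossing - (len(values) - 1))


# Hypothetical monthly peak power draw for one hall, in kW.
monthly_peak_kw = [100, 110, 120, 130]
months_left = periods_until_capacity(monthly_peak_kw, capacity=160)
# months_left == 3: the trend reaches 160 kW three months after the last reading.
```

A production model would account for seasonality and workload mix, but even a trend line like this can flag which halls need attention first.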

AI and ML can help improve data center security

By learning a network’s normal behavior and detecting anomalies and deviations, AI can help prevent massive data breaches and hacks. AI cybersecurity tools can thoroughly screen and analyze all incoming and outgoing data for security threats.
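The idea of learning normal behavior and flagging deviations can be sketched with a basic z-score check: compute the mean and standard deviation of a baseline window, then flag any new reading that falls too far from it. Commercial AI security tools use far richer models; the function name, traffic figures and threshold of 3.0 below are hypothetical.

```python
from statistics import mean, stdev


def find_anomalies(baseline, observed, threshold=3.0):
    """Return (index, value) pairs in `observed` whose z-score against
    the baseline exceeds `threshold`."""
    mu = mean(baseline)
    sigma = stdev(baseline)
    if sigma == 0:
        raise ValueError("baseline has no variation; z-scores are undefined")
    return [
        (i, value)
        for i, value in enumerate(observed)
        if abs(value - mu) / sigma > threshold
    ]


# Hypothetical outbound traffic per minute (MB) during normal operation.
baseline_mb = [100, 102, 98, 101, 99, 100, 103, 97]
# New readings to screen; the third one is a large spike.
observed_mb = [101, 99, 140, 100]

anomalies = find_anomalies(baseline_mb, observed_mb)
# anomalies == [(2, 140)]
```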

“Never trust; always verify” underpins zero trust enterprise security. This approach trusts no user, application or device unless explicitly allowed by a security policy. Organizations can improve hybrid environment visibility, security, and compliance while reducing costs by adopting a zero trust mindset.

Get in front of carbon footprint reduction and reporting

AI excels at identifying diverse data patterns and helps form-fit models for how data changes over time. Supervised ML has proven effective in solving complex constraint-based carbon reduction problems that involve hundreds of potential variables and factors that impact emissions.

Getting sustainability right means combining the strengths of AI and ML to excel at carbon footprint reduction. It’s too important to leave it to chance, and it has a significant impact on any retail brand in the future. CIOs say they are seeing their peers’ compensation plans indexed to ESG targets, making sustainability a high priority with carbon footprint reduction and reporting core to the effort.

Improve uptime maintenance levels and benchmark data center performance over time

Knowing why a given type of server needs rebuilding more often than others, identifying what’s causing interruptions to power management systems and troubleshooting why resource balancing isn’t working are all types of problems that ML can help solve. The key is getting real-time data monitoring in place and building a data set that can track all available variables to troubleshoot performance bottlenecks.

Supervised ML models excel at predictive accuracy. Mining machine data and building models that predict when a given server will need preventative maintenance can save thousands of dollars and hours of lost availability. Think of the real-time data generated by every asset in a data center as the intelligence needed to track performance over time and find new ways to improve.
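To make the predictive maintenance idea concrete, here is a minimal supervised learning sketch: a small logistic regression, trained with stochastic gradient descent, that estimates the probability a server will need maintenance. The two features (normalized inlet temperature and normalized error count), the training data and the hyperparameters are all invented for illustration; a real deployment would likely use an established ML library and far more telemetry.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict(weights, bias, x):
    """Probability that a server with features `x` needs maintenance."""
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))


def train_logistic(features, labels, lr=0.5, epochs=2000):
    """Fit logistic regression weights with plain stochastic gradient descent."""
    weights = [0.0] * len(features[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(features, labels):
            error = predict(weights, bias, x) - y
            weights = [w - lr * error * xi for w, xi in zip(weights, x)]
            bias -= lr * error
    return weights, bias


# Hypothetical history: [normalized inlet temperature, normalized error count],
# labeled 1 if the server later needed maintenance and 0 otherwise.
history = [[0.1, 0.0], [0.2, 0.1], [0.3, 0.2], [0.7, 0.8], [0.8, 0.7], [0.9, 0.9]]
needed_maintenance = [0, 0, 0, 1, 1, 1]

weights, bias = train_logistic(history, needed_maintenance)
risk_hot_server = predict(weights, bias, [0.9, 0.8])
risk_cool_server = predict(weights, bias, [0.1, 0.1])
```

Servers whose predicted risk crosses a chosen cutoff could then be scheduled for inspection before they fail.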

Combine the strengths of AI and ML to automate cooling, electricity, power and security systems

The goal is to have a data center that can operate autonomously. It’s possible to accomplish that by capturing real-time data that tracks air temperature, cooling, power loads, internal air pressure, resource loads and server performance. What motivates CIOs and data center operators to collaborate to accomplish this is the need to measure data center performance against sustainability and ESG goals set by senior management.

Using ML to interpret and create models based on environmental monitoring and control is essential for measuring progress to ESG targets. It’s a given that AI and ML need to be extensively used for tracking power and cooling consumption, two of the most expensive areas of running a data center.

Identifying AI use cases in data centers

AI is proving effective at reducing energy and power consumption and improving predictive maintenance, compliance, and capacity planning. Strengthening zero trust with AI will help protect every identity and endpoint in a data center, reducing the risk of a breach bringing an entire facility down. Source: AI in data centers: Reality vs. myth, Uptime Institute blog, July 29, 2019

Identifying where AI can make the most significant contribution to securing and optimizing a data center must start where risks to operating costs and security are the greatest. CIOs tell VentureBeat that taking on the challenge of finding new ways to reduce energy consumption to meet carbon reduction and sustainability goals needs to be balanced against the staff shortage they continue to experience. 

Getting cooling, space, power and server optimization right is core to keeping a data center running under budget and averting potential outages. It’s estimated that 35% of the energy used in a data center is consumed through cooling infrastructure alone. Optimizing data center cooling, implementing more renewable energy options and improving IT utilization can improve sustainability gains by 57%, based on Uptime Institute’s Global Data Center Survey 2022.

Nascent use cases for AI in the data center include efficiency risk analysis, capacity planning, security and budget impact forecasting. In cybersecurity, using AI to close the gap between IT and OT systems is a given, as is defining least-privilege access and identity management for every data center and system.
