Debprakash Patnaik, Manish Marwah, Ratnesh Sharma, Naren Ramakrishnan

Abstract

Motivation: Data centers are a critical component of modern IT infrastructure but are also among the worst environmental offenders through their increasing energy usage and the resulting large carbon footprints. Efficient management of data centers, including power management, networking, and cooling infrastructure, is hence crucial to sustainability. In the absence of a 'first-principles' approach to manage these complex components and their interactions, data-driven approaches have become attractive and tenable.

Results: We present a temporal data mining solution to model and optimize the performance of data center chillers, a key component of the cooling infrastructure. It helps bridge the gap between raw, numeric time-series information from sensor streams and higher-level characterizations of chiller behavior suitable for a data center engineer. To aid in this transduction, temporal data streams are first encoded into a symbolic representation, next run-length encoded segments are mined to form frequent motifs in time series, and finally these motifs are evaluated by their contributions to sustainability. A key innovation in our application is the ability to intersperse "don't care" transitions (e.g., transients) in continuous-valued time series data, an advantage we inherit by applying frequent episode mining to symbolized representations of numeric time series. Our approach provides both qualitative and quantitative characterizations of the sensor streams to the data center engineer, to aid in tuning chiller operating characteristics. This system is currently being prototyped for a data center managed by HP, and experimental results from this application reveal the promise of our approach.
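
The three stages named above (symbolic encoding, run-length encoding, motif mining) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the fixed breakpoints, and the sample readings are all hypothetical, and counting contiguous symbol sequences is a much simpler stand-in for frequent episode mining (it does not handle "don't care" transitions).

```python
from collections import Counter
from itertools import groupby


def symbolize(values, breakpoints, alphabet="abcdefghij"):
    """Map each reading to a letter; the letter index is the number of
    breakpoints the reading meets or exceeds (breakpoints are assumed)."""
    return "".join(alphabet[sum(v >= b for b in breakpoints)] for v in values)


def run_length_encode(symbols):
    """Collapse runs of identical symbols into (symbol, run_length) pairs."""
    return [(s, len(list(group))) for s, group in groupby(symbols)]


def frequent_motifs(rle, length, min_count):
    """Count contiguous sequences of `length` symbol transitions and keep
    those seen at least `min_count` times. Run durations are ignored."""
    transitions = [s for s, _ in rle]
    counts = Counter(
        tuple(transitions[i:i + length])
        for i in range(len(transitions) - length + 1)
    )
    return {motif: n for motif, n in counts.items() if n >= min_count}


# Hypothetical sensor readings (e.g., chiller utilization in percent).
readings = [10, 11, 30, 31, 32, 55, 54, 12, 11, 33, 34, 56, 57, 10]

symbols = symbolize(readings, breakpoints=[20, 40])
rle = run_length_encode(symbols)
motifs = frequent_motifs(rle, length=3, min_count=2)

print(symbols)
print(rle)
print(motifs)
```

Run-length encoding before mining means a motif describes the order of state changes rather than how long each state lasted, which is why the two low-medium-high cycles in the sample data count as the same motif despite differing durations.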

People

Naren Ramakrishnan


Publication Details

Date of publication:
June 28, 2009
Conference:
SIGKDD International Conference on Knowledge Discovery and Data Mining
Page number(s):
1305–1314