Approximate Data Collection in Sensor Networks using Probabilistic Models

Abstract
Wireless sensor networks are proving to be useful in a variety of settings. A core challenge in these networks is to minimize energy consumption. Prior database research has proposed to achieve this by pushing data-reducing operators like aggregation and selection down into the network. This approach has proven unpopular with early adopters of sensor network technology, who typically want to extract complete "dumps" of the sensor readings, i.e., to run "SELECT *" queries. Unfortunately, because these queries do no data reduction, they consume significant energy in current sensornet query processors. In this paper we attack the "SELECT " problem for sensor networks. We propose a robust approximate technique called Ken that uses replicated dynamic probabilistic models to minimize communication from sensor nodes to the network’s PC base station. In addition to data collection, we show that Ken is well suited to anomaly- and event-detection applications. A key challenge in this work is to intelligently exploit spatial correlations across sensor nodes without imposing undue sensor-to-sensor communication burdens to maintain the models. Using traces from two real-world sensor network deployments, we demonstrate that relatively simple models can provide significant communication (and hence energy) savings without undue sacrifice in result quality or frequency. Choosing optimally among even our simple models is NPhard, but our experiments show that a greedy heuristic performs nearly as well as an exhaustive algorithm.

This publication has 21 references indexed in Scilit: