Recover Corrupted Data in Sensor Networks: A Matrix Completion Solution

Abstract
Affected by hardware and wireless conditions in WSNs, raw sensory data usually have notable data loss and corruption. Existing studies mainly consider the interpolation of random missing data in the absence of the data corruption. There is also no strategy to handle the successive missing data. To address these problems, this paper proposes a novel approach based on matrix completion (MC) to recover the successive missing and corrupted data. By analyzing a large set of weather data collected from 196 sensors in Zhu Zhou, China, we verify that weather data have the features of low-rank, temporal stability, and spatial correlation. Moreover, from simulations on the real weather data, we also discover that successive data corruption not only seriously affects the accuracy of missing and corrupted data recovery but even pollutes the normal data when applying the matrix completion in a traditional way. Motivated by these observations, we propose a novel Principal Component Analysis (PCA)-based scheme to efficiently identify the existence of data corruption. We further propose a two-phase MC-based data recovery scheme, named MC-Two-Phase, which applies the matrix completion technique to fully exploit the inherent features of environmental data to recover the data matrix due to either data missing or corruption. Finally, the extensive simulations with real-world sensory data demonstrate that the proposed MC-Two-Phase approach can achieve very high recovery accuracy in the presence of successively missing and corrupted data.
Funding Information
  • National High Technology Research and Development Program of China (863 Program) (2015AA01A705)
  • Jiangsu Future Networks Innovation Institute (BY2013095-4-06)
  • National Natural Science Foundation of China (61572184, 61300219, 61472283, 61271185, 61472131)
  • US National Science Foundation (CNS 1526843)

This publication has 35 references indexed in Scilit: