In today's information-driven workplaces, data is constantly being moved around and undergoing transformation. The typical business-as-usual approach is to use email attachments, shared network locations, databases, and more recently, the cloud. More often than not, there are multiple versions of the data sitting in different locations, and users of this data are confounded by the lack of metadata describing its provenanceĀor in other words, its lineage. The ProvDMS project at the Oak Ridge National Laboratory (ORNL) described in this article aims to solve this issue in the context of sensor data.
ORNL's Building Technologies Research and Integration Center has reconfigurable commercial buildings deployed on flexible research platforms (FRPs). Figure 1 is a Google Earth model of a medium-size commercial office building that is part of the ORNL's FRP apparatus. These buildings (metal warehouse and office) are instrumented with a large number of sensors that measure variables such as HVAC efficiency, relative humidity, and temperature gradients across doors, windows, and walls. The sensors acquire sub-minute resolution data from hundreds of channels. Scientists conduct experiments, run simulations, and analyze the data. The sensor data is also used in elaborate quality assurance exercises to study the effect of systemic faults. The two types of commercial buildings comprising the FRPs stream data at a 30-second resolution for a total of 1,071 channels for both buildings.