Researchers at the Massachusetts Institute of Technology (MIT) have developed the Synthetic Data Vault (SDV), a machine-learning system that automatically creates synthetic data. Such artificial data can be used in data science efforts that otherwise would be thwarted due to limited access to authentic data.
Because the use of authentic data raises significant privacy concerns, synthetic data offers an alternative that can still be used to develop and test data science algorithms and models.
The SDV algorithm, known as recursive conditional parameter aggregation, exploits the hierarchical organization of data common to all databases.
The researchers found the synthetic data can successfully replace real data in software writing and testing. They also note the SDV can be scaled to create very small or very large synthetic datasets, facilitating rapid development cycles or stress tests for big data systems.
From MIT News
Abstracts Copyright © 2017 Information Inc., Bethesda, Maryland, USA