Home → News → Tool Detects Patterns Hidden in Vast Data Sets → Full Text

Tool Detects Patterns Hidden in Vast Data Sets

By Broad Institute

December 19, 2011

[article image]

Researchers at the Broad Institute and Harvard University have developed a tool that can analyze large data sets.

The tool is part of a suite of statistical tools known as Maximal Information-based Nonparametric Exploration (MINE), which can find multiple patterns hidden in massive data sets. "This toolkit gives us a way of mining the data to look for relationships," says the Broad Institute's Pardis Sabeti.

In one test, the researchers used MINE to make more than 22 million comparisons, focusing on a few hundred patterns of interest that had not been observed before in a data set of microorganisms.

"We view this as an exploration tool--it can find patterns and rank them in an equitable way," says Harvard professor Michael Mitzenmacher.

One of the tool's strengths is it can detect a wide range of patterns and organize them according to several different variables. "What’s exciting about our method is that it looks for any type of clear structure within the data, attempting to find all of them," says Harvard graduate student David Reshef. "This ability to search for patterns in an equitable way offers tremendous exploratory potential in terms of searching for patterns without having to know ahead of time what to search for."

From Broad Institute
View Full Article

Abstracts Copyright © 2011 Information Inc. External Link, Bethesda, Maryland, USA 



No entries found