Communications of the ACM,
Vol. 59 No. 6, Pages 15-16
You can hardly browse technology news or dive into an industry report without seeing a reference to "big data," a term used to describe the massive amounts of information companies, government organizations, and academic institutions can use to do, well, anything. The problem is, the term "big data" is so amorphous that it hardly has a tangible definition.
Although the term is not clearly defined, we can define it for our purposes as the use of large datasets to improve how companies and organizations work.
Maybe the big data is not big enough to cover sufficiently many historical outbreaks with similar domain dynamics. To be able to predict something, a model is required, and in this context maybe the assumption is that special techniques are needed to filter large amounts of data so that some sort of machine learning algorithm can construct the model or draw conclusions about the future based on past data.
Thanks for the comment, Martti! You raise some great points. In the case of Google's efforts, it definitely seems like their filtering techniques for the data suffered from their organization's biases.
It is an error to evaluate a pre-intervention model against the post-intervention course of the epidemic when the model itself influences the intervention.
Thanks for the comment, Stephen. Are you referring to the Ebola models used right when the outbreak began? These models may have influenced the epidemic's course, but also suffered from a lack of flexibility when conditions on the ground changed.