The ability to manipulate and understand data is increasingly critical to discovery and innovation. As a result, we see the emergence of a new field—data science—that focuses on the processes and systems that enable us to extract knowledge or insight from data in various forms and translate it into action. In practice, data science has evolved as an interdisciplinary field that integrates approaches from such data-analysis fields as statistics, data mining, and predictive analytics and incorporates advances in scalable computing and data management. But as a discipline, data science is only in its infancy.
The challenge of developing data science in a way that achieves its full potential raises important questions for the research and education community: How can we evolve the field of data science so it supports the increasing role of data in all spheres? How do we train a workforce of professionals who can use data to its best advantage? What should we teach them? What can government agencies do to help maximize the potential of data science to drive discovery and address current and future needs for a workforce with data science expertise? Convened by the Computer and Information Science and Engineering (CISE) Directorate of the U.S. National Science Foundation as a Working Group on the Emergence of Data Science (https://www.nsf.gov/dir/index.jsp?org=CISE), we present a perspective on these questions with a particular focus on the challenges and opportunities for R&D agencies to support and nurture the growth and impact of data science. For the full report on which this article is based, see Berman et al. [2].