BLOG@CACM

Opportunities of Data Science Education

Orit Hazzan and Koby Mike of Technion

In a previous blog post from July 2020, entitled "Ten Challenges of Data Science Education," we presented the challenges of data science education. We now wish to highlight six new and exciting opportunities that data science presents:

  • Teaching STEM (science, technology, engineering, and mathematics) in a real-world context 

  • Teaching STEM using real-world data 

  • Bridging gender gaps in STEM education

  • Teaching 21st century skills

  • Interdisciplinary pedagogy

  • Professional development for teachers

As the following description of the six opportunities reveals, these opportunities are largely based on and derived from the interdisciplinarity of data science.

1. Teaching STEM in a real-world context. Since data science attributes a great deal of importance to the application domain, that is, to a real-world context, it offers an opportunity to extend this perspective to the other STEM subjects. Although attempts have been made to implement this approach in different subject matters (e.g., Asamoah et al., 2015), specifically in the disciplinary education of the data science components (i.e., mathematics education, statistics education, and computer science education), such methods are not common, since they are not simple for either teachers or learners to implement. In data science education, however, the application domain is inherently integrated, so teaching mathematics, statistics, and computer science in a real-world context, represented by the application domain, is not only natural but also essential and unavoidable.

2. Teaching STEM using real-world data. One method to teach STEM in a real-world context (see Opportunity 1) is to teach it using real-world data. Indeed, it is common practice to teach data science and statistics using real data. Nevertheless, although studies tell us that learning computer science using real data can motivate students and support their learning processes (e.g., Anderson et al., 2015; Burlinson et al., 2016; Tartaro and Chosed, 2015), recent comprehensive literature reviews on trends in computer science education reveal that it is still not commonplace to teach computer science using real-world data (Becker and Quille, 2019; Luxton-Reilly et al., 2018). As far as we know, this is also true of mathematics education (Edelen et al., 2020; Matthews, 2018). The lack of such courses can be explained by the complexity involved in developing and teaching interdisciplinary courses (Way and Whidden, 2014). Courses that teach mathematics and computer science in the context of data science, therefore, provide opportunities to teach these subjects using real-world data. For example, we designed an interdisciplinary Introduction to Computer Science course for psychology graduate students in which the students not only use real-world data, but also gather these data in experiments that they design and execute (Mike and Hazzan, 2022a).

3. Bridging gender gaps in STEM education. Data show that women are underrepresented in STEM subjects in K-12, academia, and industry. For example, statistics presented by the U.S. Department of Education (Digest of Education Statistics, 2020) on bachelor's degrees earned in 2018-19 show that women were awarded only 21% of computer science degrees. Since data science is an emerging discipline, Berman and Bourne (2015) propose that data science has the potential to narrow the gender gap in STEM subjects. For example, a significant majority of the participants in two data science workshops for social sciences and digital humanities researchers that we facilitated in 2020 self-identified as women (44 of 53, 83%), although the base rate of women among social sciences and humanities graduates in Israel is approximately 66% (Mike, Hartal, and Hazzan, 2021). This gender proportion, which is the opposite of that prevailing in STEM studies, led us to examine the workshop from a gender perspective. Our findings indicate that the women who participated in the data science workshop perceived it as an opportunity to acquire research tools rather than programming tools. In other words, framing the workshop as a research workshop, rather than as a programming workshop, lowered the gender barriers that prevail in STEM, encouraging a majority of women researchers to participate. Therefore, following Berman and Bourne (2015), we propose that it is important to frame data science as a field of its own (rather than as a subfield of computer science) and to introduce data science skills as professional and research skills, not as programming skills (Mike, Hartal, and Hazzan, 2021). Such framing of data science and data science skills not only promotes gender equality in data science, but also, on a larger scale, promotes gender equality in STEM education.

4. Teaching 21st century skills. Data science is an important 21st century competence, which includes many cognitive skills (e.g., data thinking (Mike et al., 2022), computational thinking, and statistical thinking), social skills (e.g., teamwork and storytelling), and organizational skills (e.g., cross-team collaborations). Data science also promotes data literacy, which is considered a basic 21st century skill. As it is accepted that skills should be taught and practiced within a specific context, rather than in isolation, data science education is an opportunity to promote these skills in a real-world context, as part of the data science learning processes.  

5. Interdisciplinary pedagogy. Researchers identify several levels of integration between two or more distinct disciplines, two of which are multidisciplinarity and interdisciplinarity (Alvargonzález, 2011). Multidisciplinarity is the lowest level of integration, in which learners are expected to gain knowledge and understanding in each discipline separately. Interdisciplinarity, on the other hand, represents a higher level of integration. In interdisciplinary programs, which should be implemented after learners have gained basic knowledge and understanding in each discipline separately, learners are expected to grasp the interconnections between the disciplines and to solve problems that require them to apply knowledge and methods from the various disciplines.

The term 'interdisciplinary pedagogy' is discussed in the literature mostly as a synonym for interdisciplinary education; for example, see Cargill (2005) and Penny (2009). Others define interdisciplinary pedagogy as an integrative pedagogical approach based on pedagogies of different disciplines (Chesley et al., 2018). In the spirit of interdisciplinary education, however, we propose a new definition of the term interdisciplinary pedagogy: solving pedagogical challenges in interdisciplinary education based on the integration of pedagogical methods from each of the different disciplines.

Based on this definition, we propose that data science education offers an opportunity to implement interdisciplinary pedagogy. An example of such an interdisciplinary pedagogy is teaching machine learning by applying the pedagogical principles of mathematics education (Mike and Hazzan, 2022b). Data science offers an opportunity to further develop this kind of pedagogy, which has been receiving a great deal of attention recently due to the kinds of problems the world faces.

6. Professional development for teachers. Since data science is relevant for a variety of disciplines, including mathematics, statistics, computer science, and many other application domains, data science education is relevant for educators from an equally large variety of disciplines. It offers an opportunity for the professional development of educators, who may familiarize themselves with the discipline of data science as well as with methods of teaching data science. In such a professional development process, it is important to encourage the educators to practice data science using data taken either from their field of expertise (e.g., history, sociology, or psychology) or from the educational field. Thus, not only will the educators learn about data science, but they will also improve their understanding of the application domain they teach and enrich their arsenal of teaching methods.

Conclusion 

In this blog post we presented six new opportunities in data science education. Our research indicates that the challenges presented in our blog post "Ten Challenges of Data Science Education" and the opportunities presented here cannot be mitigated or leveraged (respectively) in the individual contexts of mathematics education, statistics education, computer science education, and the educational aspects of the application domains. This amplifies the necessity of a new discipline, data science education, that focuses on the pedagogy of data science and on research into how it is taught. As this field gains more and more attention, it is evident that a new discipline is actually being born: Data Science Education.

References 

Alvargonzález, D. (2011). Multidisciplinarity, interdisciplinarity, transdisciplinarity, and the sciences. International Studies in the Philosophy of Science, 25(4), 387–403. https://doi.org/10.1080/02698595.2011.623366 

Anderson, R. E., Ernst, M. D., Ordóñez, R., Pham, P., and Tribelhorn, B. (2015). A data programming CS1 course. Proceedings of the 46th ACM Technical Symposium on Computer Science Education, 150–155. 

Asamoah, D., Doran, D., and Schiller, S. (2015). Teaching the foundations of data science: An interdisciplinary approach. ArXiv Preprint ArXiv:1512.04456. 

Becker, B. A. and Quille, K. (2019). 50 years of CS1 at SIGCSE: A review of the evolution of introductory programming education research. Proceedings of the 50th ACM Technical Symposium on Computer Science Education, 338–344. 

Berman, F. and Bourne, P. E. (2015). Let's make gender diversity in data science a priority right from the start. PLoS Biology, 13(7), e1002206. 

Burlinson, D., Mehedint, M., Grafer, C., Subramanian, K., Payton, J., Goolkasian, P., Youngblood, M., and Kosara, R. (2016). BRIDGES: A system to enable creation of engaging data structures assignments with real-world data and visualizations. Proceedings of the 47th ACM Technical Symposium on Computing Science Education, 18–23. 

Cargill, K. (2005). Food Studies in the Curriculum: A Model for Interdisciplinary Pedagogy. Food, Culture & Society, 8(1), 115–123. https://doi.org/10.2752/155280105778055371 

Chesley, A., Parupudi, T., Holtan, A., Farrington, S., Eden, C., Baniya, S., Mentzer, N., and Laux, D. (2018). Interdisciplinary pedagogy, integrated curriculum, and professional development. https://docs.lib.purdue.edu/cgi/viewcontent.cgi?article=1014&context=aseeil-insectionconference

Digest of education statistics. (2020). National Center for Education Statistics. https://nces.ed.gov/programs/digest/d20/tables/dt20_322.50.asp 

Edelen, D., Bush, S. B., Simpson, H., Cook, K. L., and Abassian, A. (2020). Moving toward shared realities through empathy in mathematical modeling: An ecological systems theory approach. School Science and Mathematics, 120(3), 144–152. 

Luxton-Reilly, A., Albluwi, I., Becker, B. A., Giannakos, M., Kumar, A. N., Ott, L., Paterson, J., Scott, M. J., Sheard, J., and Szabo, C. (2018). Introductory programming: A systematic literature review. Proceedings Companion of the 23rd Annual ACM Conference on Innovation and Technology in Computer Science Education, 55–106. 

Matthews, J. S. (2018). When am I ever going to use this in the real world? Cognitive flexibility and urban adolescents' negotiation of the value of mathematics. Journal of Educational Psychology, 110(5), 726. 

Mike, K., Hartal, G., and Hazzan, O. (2021). Widening the shrinking pipeline: The case of data science. 2021 IEEE Global Engineering Education Conference (EDUCON), 252–261. 

Mike, K. and Hazzan, O. (2022a). Interdisciplinary CS1 for non-majors: The case of graduate psychology students. 2022 IEEE Global Engineering Education Conference (EDUCON), 86–93. 

Mike, K. and Hazzan, O. (2022b). Machine learning for non-majors: A white box approach. Statistics Education Research Journal, 21(2), 10–10. 

Mike, K., Ragonis, N., Rosenberg-Kima, R. B., and Hazzan, O. (2022). Computational thinking in the era of data science. Communications of the ACM, 65(8), 31–33.

Penny, S. (2009). Rigorous interdisciplinary pedagogy: Five years of ACE. Convergence, 15(1), 31–54. 

Tartaro, A. and Chosed, R. J. (2015). Computer scientists at the biology lab bench. Proceedings of the 46th ACM Technical Symposium on Computer Science Education, 120–125. https://doi.org/10.1145/2676723.2677246 

Way, T. and Whidden, S. (2014). A loosely-coupled approach to interdisciplinary computer science education. Proceedings of the International Conference on Frontiers in Education: Computer Science and Computer Engineering (FECS), 1.

 

Koby Mike is a Ph.D. student at the Technion's Department of Education in Science and Technology under the supervision of Orit Hazzan. Mike's research focuses on data science education. Orit Hazzan is a professor at the Technion's Department of Education in Science and Technology. Her research focuses on computer science, software engineering and data science education. For additional details, see https://orithazzan.net.technion.ac.il/.
