The Process-Object Duality in Computer Science and Data Science Education

How would you describe reduction to a friend who is not a computer scientist? How would you describe the KNN algorithm to a friend who is not a data scientist?

As we will see, even such simple exercises can teach us a lot about learners' conceptions and understanding of these and many other computer science and data science concepts, and can guide us in the design of teaching materials and learning processes related to these concepts.

It all started about 30 years ago, when a movement in mathematics education developed theories that distinguished mainly between process and object conceptions of mathematical concepts (Dubinsky, 1991; Dubinsky & McDonald, 2002; Sfard, 1991). One decade later, those theories were applied in computer science education (Hazzan, 2003a, 2003b; Hazzan & Hadar, 2005; Sakhnini & Hazzan, 2008). Nowadays, with data science pedagogy developing so rapidly, we propose to apply these theories in data science education as well (Mike & Hazzan, 2022). In this blog post, we illustrate this development and discuss how each one of us can use these theories in class.

Inspiration: The process-object duality theory

According to the process-object duality theory, abstract mathematical concepts can be represented in the human mind as either processes or objects (Sfard, 1991). As a process, a mathematical concept is conceived of as an algorithm or a computation that generates an output from an input. On a higher level of abstraction, as an object, the same mathematical concept is conceived of as a fixed construct.

In the learning processes of most mathematical concepts, the learner goes through three phases. First, the concept is conceived of as a process. Second, the process is mentally packed (encapsulated), and an object representation is created and captured in the learner's mind as a single compact entity. Conceiving of a mathematical concept as an object, as an entity of its own, enables the learner to examine it from various viewpoints based on an analysis of its properties without delving into its details, to analyze its relationships to other mathematical notions, and to apply operations to it. Because an object conception enables this variety of activities, it reflects a deeper understanding of the concept than a process conception does. In the third and final phase, after the concept has been packed and has become an object, the learner can use it as a component of a more compound concept.

Interpretation: The concept of function in mathematics, computer science, and data science from the perspective of the process-object duality

The mathematics education research community has discussed the process-object duality extensively with respect to the concept of function (see Dubinsky & Harel, 1992). As a process, a function can be represented in the human mind as the steps required to calculate the function's output value y_i for a given input value x_i. As an object, on the other hand, a function can be conceived of as a set of ordered pairs {(x_i, y_i)}. Students often conceive of a function as a process, yet even this conception enables them to solve a wide array of problems that deal with functions.
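The two conceptions of a function can be sketched in a few lines of Python. This is only an illustration under our own assumptions: the function square, the five-point domain, and the helper names is_injective and image are hypothetical and do not come from the cited studies.

```python
# Process conception: a rule that turns an input x into an output y.
def square(x):
    return x * x

# Object conception: the same function on a finite domain, held as one
# entity, namely a set of ordered pairs (x_i, y_i).
domain = [-2, -1, 0, 1, 2]
square_as_pairs = {(x, square(x)) for x in domain}

# Once the function is an object, we can ask about its properties
# without re-tracing the computation step by step.
def is_injective(pairs):
    outputs = [y for _, y in pairs]
    return len(outputs) == len(set(outputs))

def image(pairs):
    return {y for _, y in pairs}

print(is_injective(square_as_pairs))   # False: -2 and 2 share the output 4
print(sorted(image(square_as_pairs)))  # [0, 1, 4]
```

The first definition answers the question "how do I compute y from x?", whereas the last two helpers treat the whole function as an input to further reasoning, which is what an object conception makes possible.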

Based on this extensive discussion of the process-object duality with respect to the concept of function in the context of mathematics education, we examine the function conception in the context of computer science education and in the context of data science education through the lens of the process-object duality.

In computer science, reductions are in fact functions, and so it is reasonable to assume that some students conceive of them as processes. As in the case of mathematics education, a process conception of reduction is sufficient to successfully solve computability theory questions of a certain type (Hazzan, 2003b). For example, according to Hazzan, when solving computability problems of a certain kind, students prefer defining a reduction to using Rice's theorem, which, although simpler in terms of the details involved, requires one to conceive of many concepts as objects. Furthermore, a reduction can sometimes be defined through an automatic process, without understanding its subtle details. For example, the dominance of the halting problem in defining reductions enables one to construct a reduction to the halting problem almost automatically, by mimicking known repeatable steps, even without analyzing the properties of the source of the reduction (that is, the original problem). Such behavior reflects a process conception of a reduction.
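To make the claim that reductions are functions concrete, here is a minimal sketch that uses two deliberately trivial, decidable toy problems. It is not one of the halting-problem reductions discussed by Hazzan (2003b); all problem names and helper functions below are hypothetical and chosen only for illustration.

```python
# Toy decision problems (illustrative assumptions, not from the cited studies).
def in_even(n):
    """Problem A: is the non-negative integer n even?"""
    return n % 2 == 0

def in_ends_with_zero(s):
    """Problem B: does the binary string s end in '0'?"""
    return s.endswith("0")

def in_starts_with_zero(s):
    """Problem C: does the binary string s start with '0'?"""
    return s.startswith("0")

# Process conception: a reduction is a recipe that converts an instance.
def even_to_ends_with_zero(n):
    return format(n, "b")  # binary representation without the '0b' prefix

def ends_to_starts(s):
    return s[::-1]         # reverse the string

# Object conception: a reduction is an entity with a property we can check,
# namely that membership is preserved: x in A  <=>  f(x) in B.
def preserves_membership(f, in_a, in_b, samples):
    return all(in_a(x) == in_b(f(x)) for x in samples)

# As an object, a reduction can also be composed with another reduction.
def compose(g, f):
    return lambda x: g(f(x))

even_to_starts = compose(ends_to_starts, even_to_ends_with_zero)
print(preserves_membership(even_to_starts, in_even, in_starts_with_zero, range(50)))
```

Writing even_to_ends_with_zero requires only a process conception. Checking preserves_membership on a sample of instances (which is evidence, not a proof) and building even_to_starts by composition both treat reductions as objects that have properties and can be operated on.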

In the case of data science, we examine the concept of function from the perspective of the process-object duality theory by referring to the concept of machine learning algorithms, which can be described as "mathematical function mapping [of] some set of input values to output values" (Goodfellow et al., 2016, p. 5). Specifically, we use the KNN algorithm to illustrate how the process-object duality theory is applied in the analysis of students' understanding of the concept of machine learning algorithms. Here are several illustrative answers students gave to the second of the two questions that opened this blog: How would you describe the KNN algorithm to a friend who is not a data scientist? (Mike & Hazzan, 2022):

KNN is a particular way of classifying a particular datum or image according to other examples.
Tell me who your neighbors are and I'll tell you who you are.
Classify something into a category according to its proximity to other known objects.
If there were two different groups and you had to choose which group to join, how would you choose? Expected answer: According to someone like me.
The ability to classify an example according to the examples that are most similar to it.
KNN is a machine learning algorithm.
You tend to act like your neighbors.

Our readership is invited to sort these descriptions of the KNN algorithm according to the conception of the KNN algorithm that they reflect, as either an object or a process. Interested readers can select additional data science concepts and describe each of them in two ways: one that reflects a process conception and another that reflects an object conception.
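The same distinction can be expressed in code. The sketch below is a simplified illustration under our own assumptions: the six-point training set, the Euclidean distance, and the names knn_predict, make_knn, and accuracy are hypothetical, and ties among votes are resolved arbitrarily.

```python
from collections import Counter
import math

# Tiny hypothetical training set: (feature vector, label).
train = [((1.0, 1.0), "A"), ((1.5, 2.0), "A"), ((2.0, 1.0), "A"),
         ((6.0, 5.0), "B"), ((7.0, 7.0), "B"), ((6.5, 6.0), "B")]

# Process conception: follow the steps for one query point.
def knn_predict(train, query, k=3):
    # 1. Compute the distance from the query to every training example.
    distances = [(math.dist(x, query), label) for x, label in train]
    # 2. Keep the k nearest examples.
    nearest = sorted(distances, key=lambda d: d[0])[:k]
    # 3. Return the most common label among them.
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Object conception: a fitted classifier is a single entity, a function
# from inputs to labels, that can be passed around and compared.
def make_knn(train, k):
    return lambda query: knn_predict(train, query, k)

def accuracy(classifier, labeled_points):
    hits = sum(classifier(x) == y for x, y in labeled_points)
    return hits / len(labeled_points)

test_points = [((1.2, 1.1), "A"), ((6.8, 6.2), "B")]
for k in (1, 3, 5):
    print(k, accuracy(make_knn(train, k), test_points))
```

The numbered comments inside knn_predict correspond to a process description of KNN, whereas accuracy never looks inside the classifier it receives: it treats the algorithm as a function mapping input values to output values, in the spirit of an object conception.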

Implications for data science education

In addition to its use as a way of identifying learners' conceptions of data science concepts, the process-object duality has several educational implications for the design of teaching and learning processes.

For example, to support learning processes of machine learning algorithms, first as processes and then as objects, we recommend offering students a sequence of exercises of the following kinds and in the following order: visualization exercises that are presented visually and can be solved based on an analysis of the visual representation; hands-on activities that require manual tracking of the machine learning algorithm, for which a process conception is sufficient; and, in preparation for object conception, programming tasks that require the identification of each property of the algorithm for its proper implementation in the programming language.

This post illustrates the use of a theory from mathematics education to analyze students' understanding of data science concepts. From a broader perspective, we propose that since data science is based on mathematics, statistics, computer science, and application domains, when designing data science pedagogy, we should also rely on and use the knowledge gained in the educational areas of all of the disciplines that make up data science.

References

Dubinsky, E. (1991). Reflective abstraction in advanced mathematical thinking. In Advanced mathematical thinking (pp. 95–123). Dordrecht.

Dubinsky, E. and Harel, G. (1992). The concept of function: Aspects of epistemology and pedagogy. USA: Mathematical Association of America (MAA).

Dubinsky, E. and McDonald, M. A. (2002). APOS: A constructivist theory of learning in undergraduate mathematics education research. In D. Holton, M. Artigue, U. Kirchgräber, J. Hillel, M. Niss, & A. Schoenfeld (Eds.), The Teaching and Learning of Mathematics at University Level (Vol. 7, pp. 275–282). Kluwer Academic Publishers. https://doi.org/10.1007/0-306-47231-7_25

Goodfellow, I., Bengio, Y., and Courville, A. (2016). Deep learning. MIT Press.

Hazzan, O. (2003a). How students attempt to reduce abstraction in the learning of mathematics and in the learning of computer science. Computer Science Education, 13(2), 95–122.

Hazzan, O. (2003b). Reducing abstraction when learning computability theory. Journal of Computers in Mathematics and Science Teaching, 22(2), 95–117.

Hazzan, O. and Hadar, I. (2005). Reducing abstraction when learning graph theory. Journal of Computers in Mathematics and Science Teaching, 24(3), 255–272.

Mike, K. and Hazzan, O. (2022). Machine learning for non-majors: A white box approach. Statistics Education Research Journal, 21(2), Article 10.

Sakhnini, V. and Hazzan, O. (2008). Reducing abstraction in high school computer science education: The case of definition, implementation, and use of abstract data types. Journal on Educational Resources in Computing (JERIC), 8(2), 1–13.

Sfard, A. (1991). On the dual nature of mathematical conceptions: Reflections on processes and objects as different sides of the same coin. Educational Studies in Mathematics, 22(1), 1–36. https://doi.org/10.1007/BF00302715

Orit Hazzan is a professor at the Technion's Department of Education in Science and Technology. Her research focuses on computer science, software engineering, and data science education. For additional details, see https://orithazzan.net.technion.ac.il/. Koby Mike is a Ph.D. graduate of the Technion's Department of Education in Science and Technology, where he studied under the supervision of Professor Orit Hazzan. He is currently a postdoctoral researcher at Bar-Ilan University.