In my most recent blog post in this space [Hill 2019], I claimed that (1) "a vote is symbolic, discrete, and devoid of connotation; not an act of communication, but an act of declaration, single-shot, unnegotiated, unilateral." I also claimed that (2) a vote is a first-class artifact (existing on its own, subject to creation, destruction, examination, and modification), so it can be passed as a parameter (to tallying functions) and compared to others, and that it lacks a description or identifier by design (to preserve anonymity). Over the last couple of months, I have remained intrigued by the nature of the datum we call a vote (and, just so you know, Wyoming has celebrated 150 years of female suffrage). What are the features of the vote, and how do they affect its processing? What is the ontology of the vote?
Not only is the vote data of importance, but so is the vote processing. Voting is a package, a bundle of concepts, as in object-oriented programming. But let's confine ourselves to a discussion of the data structure, independent of its processing, so far as possible. We treat the vote as a datum, adopting the correct term "datum" not (only) to be pedantic, but for precision. Perhaps if we figure out what a vote is, we will be better equipped to protect it from mistreatment during processing.
First, note the absolute integrity or purity of this indivisible object. In data modeling, we sometimes alter data from the real world by allowing for truncation (names), or forcing values into restricted structures (addresses) or limited choices (Zip Codes), or default values. But not with votes. We accept no approximation. There is no estimate or probability or shorthand that substitutes for the raw intact value. This is a domain where technology must be governed—in fact, dictated—by human need, not vice versa.
On the surface, a vote appears to be a scalar, a variable with a simple rather than compound value. This variable should be read-only, or, more precisely, write-once (a venerable term of art in computer science that refers to storage devices, such as a CD-R). The variable should be of a type that allows for aggregation in some manner, where the aggregates can be compared for relative magnitude. And, of course, integrity requires that a vote persist on that write-once medium (for possible recount), and confidentiality requires that a vote be detached from its origin.
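To make the scalar, write-once picture concrete, here is a minimal sketch in Python. The names `Choice`, `Vote`, and `tally` are hypothetical, invented for illustration. A frozen dataclass only approximates write-once behavior inside a running program; it says nothing about persistent storage.

```python
from collections import Counter
from dataclasses import dataclass, FrozenInstanceError
from enum import Enum


class Choice(Enum):
    YES = "yes"
    NO = "no"


@dataclass(frozen=True)
class Vote:
    """A scalar vote: one value, no voter, no timestamp, no identifier."""
    choice: Choice


def tally(votes):
    """Aggregate votes into per-choice counts that can be compared."""
    return Counter(v.choice for v in votes)


votes = [Vote(Choice.YES), Vote(Choice.NO), Vote(Choice.YES)]
counts = tally(votes)
yes_wins = counts[Choice.YES] > counts[Choice.NO]

# Assumption: "write-once" is modeled as "assignment after creation fails".
try:
    votes[0].choice = Choice.NO
except FrozenInstanceError:
    rewrite_blocked = True
else:
    rewrite_blocked = False
```

The aggregate here is a plain count per choice, which is one way, among others, to obtain values that can be compared for relative magnitude.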
We can formulate a simple scenario. A standard ballot lists offices or issues, and a set of choices for each, which we will construe as yes/no questions. Assuming a context B, denoting a particular ballot, let's boil the set of votes down to (instances of) an individual boolean variable V, the vote, set to Yes or No on a particular question Q by an individual voter R at some point in time T. First, what distinguishes V? If Q and T are fixed, is V a function of R? Then R cannot have any other value of V, ever.
Our secret ballot dictates that V not be computable from the value of the voter R, and vice versa; V cannot give the identity of R. V is an attribute of R at T, but never thereafter; that is, R and T must be not just left behind, but stripped away. V is now a free-floating boolean datum, existing on its own. If this all sounds a little goofy, take this passage as the phenomenological reduction [Hill 2018], in which we disregard the constraints of program processing or any algorithmic treatment; that is, we are trying to figure out what happens to the vote as it goes along its way, insofar as it penetrates our first-person experience.
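The stripping away of R and T can be sketched as a function that accepts the full casting event and returns only the value. This is an illustration under assumptions, not a secure design: `CastEvent`, `cast`, and the sample voter identifier are all hypothetical, and a real system would also have to ensure that no copy of the event survives elsewhere.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class CastEvent:
    """The moment of voting: voter R answers question Q at time T."""
    voter: str       # R (hypothetical identifier)
    question: str    # Q
    time: datetime   # T
    value: bool      # V: True for Yes, False for No


def cast(event: CastEvent) -> bool:
    """Return only V; R and T are deliberately not carried forward."""
    return event.value


event = CastEvent("voter-17", "Q1", datetime(2020, 1, 1, 9, 30), True)
v = cast(event)

# v is now a bare boolean with no attribute leading back to the voter.
carries_voter = hasattr(v, "voter")
```

After `cast` returns, nothing in `v` itself records who voted or when; that is the sense in which V becomes a free-floating boolean datum.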
Because a vote is a datum that goes into a count, we can ask whether our conception applies to other data under other counting tasks. I believe not; either those data are subsumed by the aggregation or they persist in attachment to the source. If we're counting transactions, or files, or employees, then each of those objects still exists with its identifying transaction-id or file-ptr or empnum after it contributes to the count. If we're counting CPU cycles, then there is no need to retain any object after it contributes to the count and hence, no identifier is preserved.
So our vote V, after it is cast, becomes a different type of thing. V has changed into a naked value, its existence justified by the fact that it was, right up until now, associated with an entity—the voter R. Yet it will immediately become amalgamated with other entities in the counting process, associated with the datum representing the choice it signifies. Its value (Yes or No) will become its association, coinciding with paradigmatic counting tasks.
And what's more, it must retain some identity, somewhere, somehow, integral and distinct. As noted above, a vote is an object subject to audit or recount. How can this be accomplished under the practice of detaching a vote from its origin to preserve confidentiality? Can a vote be re-named; that is, does it have an identifier as an attribute? We think of paper ballots in a pile, each vote V on R's ballot B recorded and verified by the voter R. What is the identity status of those votes V? An artificial ID can be assigned to an object, of course, but normally these are given as increasing integers, a simple practice to guarantee that each is unique. But that pattern affords a way to determine where a vote occurred in a sequence, if not the actual time of the event. Any pattern used to generate unique artificial IDs leaves such a trail, which may allow identification, a problem akin to preserving anonymity in data collected from people. The paper ballots in the stack are distinguished by their separate physical presences; that's how things work in the real world. In the case of some nefarious attempt to connect a vote with a particular voter, our limited mental capacity delivers confidentiality by virtue of the difficulty of distinguishing one ballot in the stack from another. But translating this wholly into computation is tricky. No solution is offered here.
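The trail left by increasing integer IDs can be shown in a few lines. The sketch below illustrates the problem only, not a remedy; the voter names, the poll book, and the variable names are all hypothetical.

```python
from itertools import count

# Hypothetical poll book: voters in the order they arrived.
arrival_order = ["Ada", "Grace", "Alan"]
# Hypothetical values cast, in that same order (True for Yes).
cast_values = [True, False, True]

# Assign increasing integer IDs, the common practice for uniqueness.
next_id = count(1)
ballot_box = [(next(next_id), value) for value in cast_values]

# Anyone holding both lists can rejoin voter and vote by position.
linked = {arrival_order[vote_id - 1]: value for vote_id, value in ballot_box}
```

The IDs do their job of keeping votes distinct for a recount, but the same IDs encode sequence, and sequence plus an outside record of arrivals is enough to undo the detachment.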
Should we construe a vote as some kind of object other than a Boolean variable? What we perceive on a ballot is commonly a choice among options, not a fine-grained array of booleans. And in our minds, as we formulate our vote, perhaps the salient arrangement is an ordering among them. In Kenneth Arrow's significant analysis of voting, defined over a group of voters, the voter variable R takes values 1 to n and the set of alternatives on the ballot is X, with individual choices x, y, z, ..., among which one is to be selected. No degree of commitment is available. A preference profile is a set of weak orderings (which allow ties) of X, one for each voter R. In this model, the well-known Arrow's Theorem [SEP-ArrowsThm] exposes limits on how well the will of the people can be determined. Explanation of Arrow's Theorem will have to wait. The point to take away is that representation of the vote informs processing, and processing may turn out to have odd characteristics. (Low-level Boolean variables are not subject to Arrow's Theorem because they are not ordered.)
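A preference profile of this kind can be written down directly. In the sketch below, each weak ordering is a list of tiers, best first, with tied alternatives sharing a tier; this encoding and the function names are assumptions made for illustration. Pairwise majority is used as one possible way to process the profile. The cycle it produces is the classic Condorcet paradox, a simpler relative of Arrow's result, not the theorem itself.

```python
# One weak ordering per voter R = 1..n over alternatives X = {x, y, z}.
# Each ordering is a list of tiers, best first; a tier with several
# members would represent a tie.
profile = {
    1: [{"x"}, {"y"}, {"z"}],
    2: [{"y"}, {"z"}, {"x"}],
    3: [{"z"}, {"x"}, {"y"}],
}


def rank(ordering, alternative):
    """Return the tier index of an alternative (0 is most preferred)."""
    for position, tier in enumerate(ordering):
        if alternative in tier:
            return position
    raise ValueError(f"{alternative!r} is not in the ordering")


def majority_prefers(profile, a, b):
    """True if more voters strictly prefer a to b than b to a."""
    for_a = sum(rank(o, a) < rank(o, b) for o in profile.values())
    for_b = sum(rank(o, b) < rank(o, a) for o in profile.values())
    return for_a > for_b


# Each pairwise contest has a clear majority, yet together they cycle.
cycle = (
    majority_prefers(profile, "x", "y")
    and majority_prefers(profile, "y", "z")
    and majority_prefers(profile, "z", "x")
)
```

Every individual ordering here is perfectly coherent; the oddity appears only in the processing, which is the sense in which representation informs what the tally can and cannot say.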
It's possible that no declared variable, scalar or not, can capture the individual's wishes. Perhaps R, faced with yes or no, really wants "maybe," or a way to express that "both are equally bad." Or R wants to record a degree of "no," or "yes," between zero and 10. Or R wants a set of contingencies based on factors, such as the results of tomorrow's elections elsewhere, outside of the ballot B. Neither the scalar variable model nor Arrow's model offers such comprehensive flexibility (nor do other common voting methods). But R understands this. At some point, a discrete value must be passed on; best to make it here, now, in the hands of the voter.
If the type of the vote, in the sense of a declaration as used in programming, matters so much, consider this: What is the type of the collective will of the electorate? If we were to dig into the heads, or the average head, of a group of voters, what would we find there? Would we find, in the context of a single question Q, a Boolean value? Is it rather a score? A range? Or is it best conceived as a relation or even a function of arbitrary factors beyond Q? As programmers know, a great deal of conceptual work is captured in the type declaration of a variable.
Philosophy of this kind is neither necessary nor sufficient for writing a working voting program. The reader can tell that these musings are abstruse considerations, inspired by computing but colored by philosophy. More grounded views would come from the social sciences, which boast a voluminous and diverse body of work on voting. In our own field, actual flesh-and-blood software engineers are working on actual bits-and-circuits voting systems out there, writing actual declarations of vote objects or related objects, and we commend them.
[Hill 2018] Hill, Robin K. 2018. Examples of Phenomenology in Computing. Blog@CACM. March 29, 2018.
[Hill 2019] Hill, Robin K. 2019. Voting, Coding, and the Code. Blog@CACM. November 27, 2019.
[SEP-ArrowsThm] Morreau, Michael. "Arrow’s Theorem", The Stanford Encyclopedia of Philosophy (Winter 2019 Edition). Edward N. Zalta (ed.).
Robin K. Hill is a lecturer in the Department of Computer Science and an affiliate of both the Department of Philosophy and Religious Studies and the Wyoming Institute for Humanities Research at the University of Wyoming. She has been a member of ACM since 1978.