I’ve been working on usable privacy and security for about 20 years, and have recently started looking at issues in Algorithmic Fairness, Accountability, Transparency, and Ethics (FATE), which also includes issues of AI Bias. I’ve noticed that there are many similarities between FATE and privacy, and believe that there are a lot of lessons that the FATE community can learn from past work, successes, and failures in privacy.
Similarities Between FATE and Privacy
First, what are some ways that FATE and privacy are similar? Both are problems that companies are deeply struggling with, with failures often ending up as headline news. More interestingly, both FATE and privacy are ill-defined. There is no clear, widely accepted definition of privacy. There are many specific and narrow definitions, for example different forms of anonymity, the right to be forgotten, the right to be let alone, contextual integrity, and so on, but no single conceptualization that captures the wide umbrella of concerns that people have. The same is true of FATE. There are many useful concepts, such as allocative versus representational harms, group fairness, equalized odds, and more, but no single framing that covers the wide range of issues that have been raised.
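To make concrete how narrow each of these individual definitions is, here is a minimal sketch of just one of them, equalized odds, which asks whether a classifier's true positive rate and false positive rate are the same across demographic groups. The function name and toy data below are made up for illustration; a real audit would need far more than this one metric.

```python
# Sketch of equalized odds, one narrow statistical fairness definition.
# A classifier satisfies equalized odds if its true positive rate (TPR)
# and false positive rate (FPR) are equal across demographic groups.
# All names and data here are hypothetical, for illustration only.

def rates_by_group(y_true, y_pred, groups):
    """Return {group: (TPR, FPR)} for a binary classifier."""
    out = {}
    for g in set(groups):
        idx = [i for i, grp in enumerate(groups) if grp == g]
        tp = sum(1 for i in idx if y_true[i] == 1 and y_pred[i] == 1)
        fn = sum(1 for i in idx if y_true[i] == 1 and y_pred[i] == 0)
        fp = sum(1 for i in idx if y_true[i] == 0 and y_pred[i] == 1)
        tn = sum(1 for i in idx if y_true[i] == 0 and y_pred[i] == 0)
        tpr = tp / (tp + fn) if (tp + fn) else 0.0
        fpr = fp / (fp + tn) if (fp + tn) else 0.0
        out[g] = (tpr, fpr)
    return out

# Toy data: two groups, "a" and "b", with identical ground truth
# but different predictions, so equalized odds is violated.
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 0, 1, 1, 1, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]
print(rates_by_group(y_true, y_pred, groups))
# Group "a" gets TPR 0.5 and FPR 0.0; group "b" gets TPR 1.0 and FPR 0.5.
```

Note that satisfying this one definition says nothing about representational harms, contextual appropriateness, or most of the other concerns under the FATE umbrella, which is exactly the definitional gap described above.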
I would also argue that there is currently an overemphasis on statistical techniques for addressing FATE and privacy problems. This is not to say that statistical approaches do not have value, but rather these methods are useful only in limited situations. Instead, both FATE and privacy need to draw on ideas and methods from law, policy (both public policy and corporate policy), ethics, user experience design, systems design, and more, to successfully address the wide range of concerns people have.
Both FATE and privacy also involve challenges throughout the software development lifecycle, including requirements, design, implementation, and evaluation. Perhaps another way of thinking about this is functional versus non-functional requirements. Functional requirements are features that are often in a specific part of the source code and have clear criteria for whether they work or not. In contrast, non-functional requirements, such as usability and security, are often cross-cutting and sometimes emergent properties of a system: they span many different aspects of the system (data collection, storage, transmission, etc), involve many different parts of the source code, aren’t necessarily the responsibility of a single developer or a small team, and are hard to measure well. Here, FATE and privacy are much more like non-functional requirements, and just like usability and security, FATE and privacy need people dedicated specifically to addressing those issues.
Another similarity between FATE and privacy is a lack of awareness and knowledge among developers as to what the problems are and what can be done. In various studies, our team and others have found that most developers have little knowledge of privacy issues, both broadly and also in terms of problems with the apps they are building (see for example this survey paper, interviews with developers, and analysis of developer discussion forums). Informally, we have seen the same with FATE. In our interviews with ML developers about how they evaluate their systems, very few raised issues of fairness or bias. In both cases, while there are many aspirational guidelines (think of Asimov’s Three Laws of Robotics), we have also seen a lack of clear practical guidelines and best practices for developers and organizations, though things have been slowly improving for privacy.
What worked and didn’t work for privacy?
I started looking at privacy in the early 2000s, and since then there has been a great number of research papers probing people’s concerns, large-scale studies about what data is being collected, and proposals for new user interfaces, data processing techniques, and system architectures. I would say that most of this research was good at raising awareness, both among the public in general and regulators in particular, but in the vast majority of cases it did not have much success in directly moving the needle.
Privacy policies were also adopted early on by many web sites and are now commonplace. This kind of transparency is good in theory, but arguably has not helped much in practice since it places the onus of privacy primarily on end-users. In general, attempts to improve privacy that focused on helping end-users have not worked well for this reason. The vast majority of people don’t have the time, expertise, or desire to understand all the nuances of privacy. Furthermore, even if a user did understand these nuances, there often aren’t many good alternatives for users to choose from. If I’m in a hospital and don’t like their HIPAA policies, should I go to another hospital?
There were also many attempts by industry to self-regulate, for example web standards like P3P and Do Not Track. However, these pushes repeatedly failed. Incentives between the various stakeholders, and advertisers in particular, were misaligned with what consumers wanted. In particular, advertisers had an extremely strong incentive to collect as much sensitive data as possible. There was also a lack of developer support for these standards, and perhaps most importantly, no enforcement mechanisms or penalties for lack of compliance.
So what has actually worked for privacy? Market forces have had a positive but small impact on privacy so far. The problem is that it’s difficult to compete on privacy: privacy is hard for people to assess, and they rarely see others gaining benefit from it, leading to what economists call a market failure. Imagine you want to buy a webcam. You can easily compare webcams based on price, form factor, and color, but it’s nearly impossible to compare them based on privacy. There are some new approaches to address this market failure, like Apple’s Privacy Labels and Google’s Data Safety Section for smartphone apps, as well as privacy nutrition labels for smart devices. While these labels are ostensibly for end-users, I think they will actually be more effective in influencing product managers, who will want to compete on these labels, as well as product reviewers, who can easily incorporate aspects of privacy into their reviews and their scores.
Social forces have also had a positive but marginal effect on privacy. Perhaps the most effective tactic has been shame and embarrassment. For example, my colleagues and I built a site called PrivacyGrade.org (sadly, now defunct) that assigned privacy grades to all of the free Android apps available. Our original intent was to raise people’s awareness of privacy issues, but one unexpected result was that some app developers publicly declared how they were changing their app data collection behaviors in response to our site, in part due to bad press resulting from our site.
Overall, the most substantive lever for improving privacy has been comprehensive legislation and regulation such as GDPR and the California Online Privacy Protection Act (CalOPPA), as well as fines imposed by regulators like the FTC. A decade ago, a question I often got from journalists and other researchers was how to get organizations and developers to actually care about privacy. Nowadays, companies and developers have to care about privacy, due to the potential for massive fines. Also, while I’m still skeptical of some parts of GDPR and CalOPPA (do those cookie notices help anyone?), they have forced companies to think a lot more about what data they are collecting and whether they really want to collect it or not. It’s also worth pointing out that this legislation and regulation happened because of repeated failures in the form of data breaches and scandals over data use, as well as industry’s inability to self-regulate.
The next most effective lever for privacy has been smartphone app stores. The centralized nature of app stores and their dominant position for distributing apps made it possible for Apple and Google to dictate certain standards for privacy. Some examples include limiting scanning for what other apps are installed on a person’s phone, requiring better APIs for data storage, and requiring privacy nutrition labels. These requirements strongly incentivized developers to care about privacy, since if they did not comply with these standards they risked having their app removed from the app store.
What might work for FATE?
Given all this, what can we learn from privacy that might work for FATE? One lesson is that industry self-regulation is unlikely to work. While there is strong desire from industry to improve, FATE is too diverse an issue, with too many actors, cutting across too many different kinds of systems and industry sectors. Industry simply has no way to monitor the sheer number of algorithmic systems being deployed or to sanction actors that fail to comply. There also isn’t a centralized app store or platform where a small number of organizations can dictate and enforce policy.
Market forces are also unlikely to make a large difference. There currently are no clear metrics by which to compare the fairness or ethics of different algorithmic systems, which limits the effects of transparency mechanisms like privacy nutrition labels. Also, as above, the sheer diversity of algorithmic systems coupled with lack of alternatives in many domains (e.g. algorithms for health care or housing for the homeless) makes it hard for the market to solve FATE issues on its own.
As with privacy, social forces have been surprisingly effective in getting individual companies to change their products regarding FATE issues. In particular, audits by third parties have had success in getting companies to change, and we as a community should consider how to support these audits more. Some of these audits have been conducted by experts. Gender Shades is perhaps the most prominent example, where independent researchers found that many commercial face recognition systems performed poorly on people with darker skin and on women, and especially poorly on dark-skinned women. This work led Microsoft, Amazon, IBM, and others to significantly revise their face recognition systems or halt their services. Surprisingly, other audits have been conducted by everyday users.
In a recent paper, my colleagues and I documented how people on Twitter came together to conduct audits of Twitter’s photo cropping algorithm, Apple’s credit card algorithm, and a computer vision system trained on ImageNet, each of which led to significant interventions either by the company involved or by government agencies. The popular press can also be a partner and an amplifier in these kinds of user-driven third-party audits. However, external audits are hard to generalize, in that they only work for services that are easily accessible and understandable by people. Also, external audits only help after a system is already deployed.
In the long term, I think legislation and regulation will be key to making substantive improvements in FATE, mostly because the other knobs and levers for FATE will turn out to be insufficient. If FATE follows a similar trajectory as privacy, there will be many more years of embarrassing FATE failures that make headline news and a continued failure by industry to self-regulate before policy makers finally step in. One possible analog is the recent National Cybersecurity Strategy which proposes minimum cybersecurity measures for critical infrastructure and imposes liability on companies that fail to comply.
While I may have seemed negative on the value and impact of privacy research, it’s more that research by itself is not sufficient in the majority of cases. One way that research can help is to be more immediately actionable for policy makers or industry, for instance offering more evidence of FATE or privacy problems, which means more ammunition for policy makers. For example, my team’s research showing that some smartphone apps were vastly overcollecting data was helpful to the FTC in imposing fines on those apps. As another example, research examining dark patterns on web sites at scale led to a workshop at the FTC to better understand the problem and what might be done to mitigate it.
Research can also help by offering more tools for addressing problems, making it harder for industry to make excuses. Again, I think a particularly promising area for both privacy and for FATE is better support for auditing, by first parties who want to systematically evaluate their systems before they are deployed, and by third parties such as researchers, journalists, consumer advocates, and public policy makers. Trust but verify, as the old saying goes.
Lastly, there is potentially strong synergy for people concerned about FATE issues to work closely with privacy advocates already in an organization. Both groups are interested in transparency, minimizing risks to users, and proper use of demographic data and other sensitive data. Both also advocate for end-users. This approach may also make it easier for FATE issues to be addressed in an organization, as privacy engineers and Chief Privacy Officers are starting to become commonplace, and much of their work is focused on compliance with policies and procedures across the entire software development lifecycle.
In summary, FATE can learn much from the trajectory of privacy, and by working together, the two communities can take significant steps toward making systems that are responsible and trustworthy.
Jason I. Hong is a professor in the School of Computer Science and the Human Computer Interaction Institute of Carnegie Mellon University.