News
Artificial Intelligence and Machine Learning

Holding Algorithms Accountable

Getting algorithms to explain their reasoning is a big part of the problem.
Artificial intelligence programs are extremely good at finding subtle patterns in enormous amounts of data, but don't understand the meaning of anything.

Whether you are searching the Internet on Google, browsing your news feed on Facebook, or finding the quickest route on a traffic app like Waze, an algorithm is at the root of it. Algorithms have permeated our daily lives; they help simplify, distill, and process massive amounts of data, and draw insights from it.

According to Ernest Davis, a professor of computer science at New York University's Courant Institute of Mathematical Sciences whose research centers on the automation of common-sense reasoning, the technologies that currently exist for artificial intelligence (AI) programs are extremely good at finding subtle patterns in enormous amounts of data. "One way or another," he says, "that is how they work."

"These technologies don't understand the meaning of anything," Davis explains, adding that AI is good at detecting correlations, but not great at telling us which correlations are meaningful. "They don't know anything about the real world that they can connect that meaning to."

Algorithms can also be biased, discriminating or making mistakes that can affect the job you are applying for, which apartments are displayed to you on real estate Web sites, or the loan you are trying to obtain from the bank.

AI generally lacks transparency, so it is difficult to determine how these technologies may be discriminating against us. Since it is nearly impossible to see how algorithms actually "work," they may depend on biased assumptions or data that unintentionally reinforce discrimination. Davis warns that you have to be very careful when using AI, to make sure your decisions are not reinforcing biases that are built into your data sources.

The public sector is also increasingly using AI to automate both simple and complex decision-making processes. For instance, in May 2018, New York City Mayor Bill de Blasio announced the creation of the Automated Decision Systems Task Force to explore how the city uses algorithms and to study possible biases in its algorithmic decision systems. The task force was mandated by Local Law 49, which called for "the creation of a task force that provides recommendations on how information on agency automated decision systems may be shared with the public and how agencies may address instances where people are harmed by agency automated decision systems." In theory, the task force would identify errors and bias in the city's algorithms and determine, say, whether automation was favoring certain neighborhoods over others, or who receives government services first.

Fast-forward a year, and the task force has failed to make any discernible progress. At a hearing before the New York City Council Committee on Technology in April 2019, the task force's co-chair said the task force did not know what automated decision systems were in use, and that it did not plan to create or disclose a list of the systems the city uses. Although the task force had met more than 20 times, its members were not able to agree on which technologies even meet the definition of "automated decision systems." Lacking consensus on what it should investigate, the odds are slim that the task force will fulfill its mandate to issue policy recommendations by this fall.

Janet Haven, executive director of Data & Society, a non-profit research institute dedicated to advancing public understanding of the social implications of data-centric technologies and automation, says administrative or bureaucratic barriers to examining existing governmental algorithmic systems appear to be a harbinger of battles to come. 

One area where algorithmic risk assessments have been widely adopted is the criminal justice system, where they are used in an effort to reduce bias and discrimination in judicial decision-making, from pretrial decisions to sentencing and parole. Given that bias is often inherent in the algorithms themselves, this is deeply problematic.

Iyad Rahwan, director of the Center for Humans and Machines at the Max Planck Institute for Human Development in Berlin, Germany, observes that such algorithmic risk assessments can fix one problem while introducing another. "Suppose a risk assessment algorithm that predicts recidivism reduces the total number of classification mistakes made, but increases the racial inequality in those mistakes, with more mistakes made when classifying people of color?"

This very scenario occurs on a regular basis. In 2016, a ProPublica study reached the disturbing conclusion that software used across the country to predict future criminals is biased against black people. The software, called COMPAS, produces risk scores designed to inform judges, before sentencing, of the likelihood that a defendant will be a repeat offender. The ProPublica analysis found COMPAS was twice as likely to label black defendants as likely recidivists as it was white defendants, while it was more likely to falsely label white defendants as being at low risk of reoffending. ProPublica found supposedly high-risk black defendants did not commit crimes at the rates predicted by COMPAS, while white defendants labeled low-risk went on to commit crimes at higher rates than predicted.
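The disparity ProPublica measured comes down to comparing error rates across groups. The Python sketch below is a minimal illustration of that kind of analysis, not ProPublica's actual methodology or data: given each defendant's group, the tool's risk label, and whether the person actually reoffended, it computes each group's false positive rate (labeled high-risk but did not reoffend) and false negative rate (labeled low-risk but did reoffend). The records are hypothetical.

```python
from collections import defaultdict

# Hypothetical records of (group, labeled_high_risk, reoffended); a real audit
# would use thousands of actual case outcomes.
records = [
    ("A", True,  False), ("A", True,  True),  ("A", False, False),
    ("A", True,  False), ("A", False, True),
    ("B", False, True),  ("B", False, False), ("B", True,  True),
    ("B", False, True),  ("B", True,  False),
]

counts = defaultdict(lambda: {"fp": 0, "fn": 0, "neg": 0, "pos": 0})
for group, labeled_high_risk, reoffended in records:
    c = counts[group]
    if reoffended:
        c["pos"] += 1                # actually reoffended
        if not labeled_high_risk:
            c["fn"] += 1             # called low-risk, but reoffended
    else:
        c["neg"] += 1                # did not reoffend
        if labeled_high_risk:
            c["fp"] += 1             # called high-risk, but did not reoffend

for group, c in sorted(counts.items()):
    fpr = c["fp"] / c["neg"] if c["neg"] else float("nan")
    fnr = c["fn"] / c["pos"] if c["pos"] else float("nan")
    print(f"group {group}: false positive rate {fpr:.2f}, false negative rate {fnr:.2f}")
```

Rahwan's point follows directly from numbers like these: a system can lower the overall count of such mistakes while widening the gap between the groups' rates.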

"You now face a difficult question about which value is more important: keeping more people out of jail versus treating everyone the same," Rahwan continues. He maintains that this is where science stops, and societal and legal norms come in. It is important to separate the factual from the normative discussion about these types of algorithms, he notes.

The bigger question is how to address such bias and discrimination in AI; how do we hold algorithms accountable?

Haven feels we need to put governance in place for algorithmic decision-making systems, both in corporate use and in applications for the public sector. "What's at stake is huge: people's health care, access to public services, and even basic freedoms are at risk," Haven says. "We're at a frustratingly early stage, given how extensively algorithmic decision-making systems have moved into daily life."

According to Haven, the spectrum of what "governance" means is very broad, with different actors in the "responsible AI" world advocating different definitions or approaches. These range from Facebook's creation of a "Supreme Court" to moderate content, to regulatory approaches like the European Union's General Data Protection Regulation (GDPR), which requires organizations to be able to explain their algorithmic decisions, and the newly introduced U.S. Algorithmic Accountability Act of 2019, which calls for companies to assess their algorithms for bias and for whether they pose privacy or security risks to consumers.

"Auditing of decision-making processes has always been a fundamental part of a functioning society," says Rahwan. "We audit the financials of companies and government functions all the time. When algorithms start making financial or governmental decisions, it seems natural to do the same."

However, Rahwan wonders what the reasonable scope of regulation should be. What, exactly, is being audited: an algorithm's source code, or just its behavior?

Recently, Rahwan has become a proponent of "machine behavior," recognizing that some algorithmic decision-making systems defy interpretation and audit through source-code inspection alone. However, he thinks algorithms could be tested for bias and other behavioral issues in much the same way a psychologist might test a person for cognitive or emotional difficulties. Rahwan acknowledges there is not yet a toolkit of behavioral tests designed to inspect the behavior of intelligent machines, but he hopes the quantitative behavioral sciences can be applied to the study of machines in the future.
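One way to picture such a behavioral test is a black-box audit: probe the deployed system with matched inputs that differ only in a single sensitive attribute and measure how often its decision changes. The sketch below is a hypothetical illustration of that idea; decision_system, the applicant fields, and the threshold are invented stand-ins for whatever opaque system is actually under audit.

```python
import random

def decision_system(applicant):
    # Stand-in for the opaque model under audit; in a real audit this would be
    # a deployed system we can query but not read.
    score = applicant["income"] / 1000 + applicant["years_employed"]
    if applicant["neighborhood"] == "X":   # a proxy attribute that may encode bias
        score -= 5
    return score > 60                      # True = approve

def flip_rate(applicants, attribute, value_a, value_b):
    """Fraction of matched pairs whose decision changes when only `attribute` changes."""
    flips = 0
    for applicant in applicants:
        a = dict(applicant, **{attribute: value_a})
        b = dict(applicant, **{attribute: value_b})
        if decision_system(a) != decision_system(b):
            flips += 1
    return flips / len(applicants)

random.seed(0)
applicants = [
    {"income": random.randint(20_000, 90_000),
     "years_employed": random.randint(0, 20),
     "neighborhood": "X"}
    for _ in range(1_000)
]

print(f"Decision changed for {flip_rate(applicants, 'neighborhood', 'X', 'Y'):.1%} of matched pairs")
```

A flip rate well above zero for an attribute that should be irrelevant is the behavioral analogue of a failed psychological test: it signals bias without requiring access to the system's source code.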

To further complicate this nascent situation, numerous other organizations are advocating different solutions. For example, the AI Now Institute at New York University, dedicated to understanding the social implications of artificial intelligence, has called for algorithmic impact assessments (AIAs), akin to environmental impact reports, but for AI. O'Neil Risk Consulting & Algorithmic Auditing (ORCAA) helps companies and organizations manage and audit their algorithms for bias; ORCAA issues its own certification mark, reminiscent of organic food labels, as a seal of approval indicating tested algorithms are unbiased. Consulting firm Deloitte is also developing a practice around algorithmic risk.

Given the tech industry's "move fast and break things" philosophy, it seems that without common standards and some type of agreement on transparency, little can be done at an organized level to mitigate algorithmic bias and discrimination in the future. As the recent bans on facial recognition by some cities demonstrate, responses are likely to be individual and local.

"If we are to build intelligent machines," says NYU's Davis, "we need programmers to integrate an understanding of human common sense into the design process." This includes incorporating philosophy and cognitive psychology (perception, learning, memory, reasoning), Davis says.

Haven believes interrogating the values that inform technology design, and ensuring these are visible and intentionally chosen with respect for human dignity, is key. Haven advocates using social science to explore the social and cultural implications of data-centric and automated technologies, which will help us arrive at better governance solutions.

"We need a culturally relevant evidence base that tells us how AI and automated technologies impact different groups of people," Haven says, "which will allow for us to clearly understand the trade-offs inherent in governance."

Rahwan thinks there are a variety of regulatory tools that can be used to audit and regulate algorithms. He feels it is important to recognize this, because it forces the discussion out of the abstract (and somewhat politicized) notion of machine ethics, and more towards the notion of institutional regulation.

"I see the discussion today as too focused on ethics alone, as if there was some magic set of ethical principles that, once implemented, would render AI systems intrinsically good," Rahwan says. He points out that we do not make people ethical by just teaching them ethics at school, but by also having social norms, laws, and institutional checks and balances. The same, he says, goes for machines, so the solution has to be societal and institutional, not just a set of rules programed into an algorithm.

John Delaney is a freelance writer based in Manhattan, NY, USA.
