Opinion

Global Data Justice

A new research challenge for computer science.

When the world’s largest biometric population database, India’s Aadhaar system, was challenged by activists, the country’s supreme court issued a historic judgment. It is not acceptable, the court said, to allow commercial firms to request details from population records that government has gathered from citizens for purposes of providing representation and care. The court’s logic was important because this database had long been turning into a point of contact between firms that wanted to conduct ID and credit checks, and government records of who was poor, who was vulnerable, and who was on which type of welfare program. The court also said, however, that this problem of public-private function creep was not sufficiently bad to outweigh the potential good a national population database could do for the poor. Many people, the court said, were being cheated out of welfare entitlements because they had no official registration, and this was more unfair than the monetization of their official records.

This judgment epitomizes the problem of global data justice. The databases and analytics that allow previously invisible populations to be seen and represented by authorities, and which make poverty and disadvantage harder to ignore, are a powerful tool for the marginalized and vulnerable to claim their rights and entitlements, and to demand fair representation.2 This is the claim the United Nations is making5 in relation to new sources of data such as cellphone location records and social media content: if the right authorities can use them in the right way, they can shine a light on need and deprivation, and can help evaluate progress toward achieving the Sustainable Development Goals. If data technologies are used in a good cause, they confer unprecedented power to make the world a fairer place.


That ‘if’, though, deserves some attention. The value of the new data sources to the United Nations, to humanitarian actors, and to development and rights organizations is matched only by their market value. If it is possible to monitor who is poor and vulnerable, it is also possible to manipulate and surveil. Surveillance scholar David Lyon3 has said that all surveillance operates along a spectrum between care and control: a database like Aadhaar can be used to channel welfare to the needy, but it could also be used to target consumers for marketing, voters for political campaigns, or transgender people or HIV sufferers for exclusion. The possibilities for monetizing the data of millions of poor and vulnerable people are endless, and may be irresistible if hard boundaries are not set. But how to set boundaries for powerful international actors is a question yet to be solved in any field.

Data technologies have very different effects in different social, economic, and political environments. WhatsApp, for example, allows parents’ groups to message each other about carpooling. It also facilitates ethnic violence in India and Myanmar and extremist politics in Brazil. Technology almost always has unintended consequences, and given the global reach of apps and services, the consequences of our global data economy are becoming less and less predictable.1

Figure. A woman has her eyes scanned while others wait during the Aadhaar registration process in India circa October 2018. Aadhaar identification numbers are issued to individuals by the Unique Identification Authority of India on behalf of the Government of India for the purpose of establishing the identity of every single person.

Global data justice researchers are aiming to frame new governance solutions that can help with this global level of unpredictability. In this emerging research field, we are exploring how the tools we have are globalizing: regulation, research ethics, professional standards, and guidelines all have to be translated into new environments, and are understood differently in different places. Nigeria, the U.S., and India, for example, will each have a different idea of what is ‘good’ or ‘necessary’ to do with data technologies, and of how to regulate their development and use. Our research asks how to reconcile those different viewpoints, given that each of those international actors (plus myriad others) will have the power to develop and sell data technologies that will affect people all around the world.

Currently, much of the international discussion revolves around harmonizing data protection among countries and getting technology developers to agree on ethical principles and guidelines. Neither of these is a bad idea, but each can go in a radically different direction depending on local views of what is good and desirable. Strongly neoliberal, pro-market countries will develop different principles from more socialist ones and, even if they work from similar templates, will apply them differently. Democracies will set boundaries for data collection and use that are different from those of authoritarian states, yet we all have to work together on this problem. Like climate change, any unregulated data market affects us all.

So neither harmonized data protection nor ethical principles is the answer, or at least not on their own. Ethics, at the moment at least, is too frequently just a cover for self-regulation.6 We need to ask global questions about global problems, but we are often stuck looking at our own environment and our own set of tools, without understanding what kind of toolkit can address the international-level consequences of our growing data economy.

If we instead ask the global question of how to draw on approaches that are working in different places, and how to set boundaries and goals collectively for our global data economy, we arrive at questions about both justice and intercultural understandings of it. We need not only to be able to articulate principles of justice and fairness, but also to have a productive discussion about them with nations that see things very differently.

Research on global data justice4 is starting from this larger question of how to pick and articulate principles that people seem to agree on around the world; we will then work on how those should be turned into tools for governing data—and creating the institutions we need to do so, if they do not exist. Researchers working on this problem (who now include philosophers, social scientists, lawyers, computer scientists and informatics scholars, doing research in Europe, the U.S., Africa, and Asia) have to try to capture at least three conflicting ideas about what data technologies do and what their value is.

These conflicting ideas offer three main principles. First, our visibility through data should work for us, not against us. We should be visible through our data when we need to be, in ways that are necessary for our well-being, but this visibility should be part of a reasonable social contract in which we are aware of it and can withdraw it to avoid exploitation. Second, we should have full autonomy with regard to our use of technology. We should be able to adopt technology that is beneficial for us, but using a smartphone or being connected should not be linked to our ability to exercise our citizenship. Someone who has to use social media to get a national identity document, or who has to provide biometrics through a private company in order to register for asylum, is not using data technologies so much as being used by them. Lastly, the duty of preventing data-related discrimination should be held by both individuals and governments. It is not enough to demand transparency so that people can protect themselves from the negative effects of profiling: people should be proactively protected from discrimination by authorities who have the power to control and regulate the use of data.


These principles form a starting point for understanding how similar challenges play out in different places. The task of research is to identify where common responses to those challenges are emerging, to draw out lessons for governance, and to suggest ways to operationalize them. Translating this vision to the global level is a huge challenge. To do this, we have to place different visions of data’s value and risks in relation to each other, and seek common principles that can inform governance. Framing what global data justice might mean involves law, human rights, the anthropology of data use and sharing, the political economy of the data market and of data governance more broadly, and international relations.

This global problem is also becoming part of the agenda of computer science and engineering. The agenda of justice in relation to digitization is under formation, and needs input from all the fields doing conceptual and applied work in relation to the digital. It is not a task any individual field can address on its own, because work on data technology has evolved beyond the point where those who conceptualize and develop systems can understand what effects they will have on the global level. What is fair or innocuous in one place may be unfair or harmful in another.

Data justice should provide a lens through which we can address questions about how to integrate values into technology, but it is a higher-level question that cannot be answered with guidelines or with toolkits for privacy or explainability (despite the importance of these approaches). It is a conceptual question, though it leads to practical questions of governance: we wish to conceptualize how data should be governed to promote freedom and equality. This is not something academia can do on its own, but is a long-term challenge to be addressed in collaboration with policy-makers, and in consultation with everyone affected by the data economy.

Computer scientists are already part of this process. When they conceptualize and build systems, they make choices that determine how data gets constructed and used. Understanding how computer science research connects to the human and social world, and how it contributes to particular outcomes, is the first step. Making connections between that understanding and social scientific research is a necessary next step. This process is taking place at some computer science conferences (notably ACM FAT*, which is now integrating social science and law tracks), but is also visible in smaller workshops and interdisciplinary programs where social scientists and computer scientists come together to work on the social implications of data science and AI, to publish together, and to build a research agenda. This work will grow in scale and importance in the coming years, with the notion of global data justice as a benchmark for the inclusiveness and breadth of the debate.

References

    1. Dencik, L., Hintz, A., and Cable, J. Towards data justice? The ambiguity of anti-surveillance resistance in political activism. Big Data & Society 3, 2 (Feb. 2016), 1–12; https://bit.ly/2VxoF0A

    2. Heeks, R. and Renken, J. Data Justice For Development: What Would It Mean? (Development Informatics Working Paper Series No. 63). Manchester, U.K., 2016; https://bit.ly/2UKVIRr

    3. Lyon, D. Surveillance Studies: An Overview. Polity Press, Cambridge, 2007.

    4. Taylor, L. What Is Data Justice? The Case for Connecting Digital Rights and Freedoms on the Global Level. Big Data & Society, 2017; https://bit.ly/2uZjxXb

    5. United Nations. A World that Counts: Mobilising the Data Revolution for Sustainable Development. New York, 2014; https://bit.ly/1it3l8P

    6. Wagner, B. Ethics as an escape from regulation: From ethics-washing to ethics-shopping? In M. Hildebrandt, Ed. Being Profiled: Cogitas Ergo Sum. Amsterdam University Press, Amsterdam, 2018, 84–90.
