What Machine Learning Assumes About the World: A Conversation with Abeba Birhane

Abeba Birhane is a cognitive scientist researching human behavior, social systems, and responsible and ethical artificial intelligence through an interdisciplinary practice that weaves together cognitive science, AI, complexity science, and theories of decoloniality. Her PhD thesis explores the challenges and pitfalls of automating human behavior through critical examination of existing computational models and audits of large-scale datasets. Currently, she is a Senior Fellow in Trustworthy AI at the Mozilla Foundation and an Adjunct Lecturer / Assistant Professor at the School of Computer Science at University College Dublin, Ireland. C/Change spoke with Birhane about what AI misses by reducing behavior and intelligence to data that can be modeled, as well as the potential for building AI according to values of decoloniality.


C/Change: Tell us more about how your background in cognitive science brought you to thinking about ethical frameworks for artificial intelligence. What does the need for ethical AI look like from a cognitive science perspective?

Abeba Birhane: So I am a cognitive scientist by training. At the start of my PhD, I was fully focused on questions around cognition – what is cognition? What’s human nature? What’s a person? What are various cognitive phenomena? What are emotions? What’s intelligence? Those kinds of questions. And also, I’ve always been interested in how we capture these human phenomena, these human elements, in data, and how we build models of them. My background is in a kind of niche area called embodied and enactive cognitive science. The idea behind embodied cognitive science is to push back against the orthodox or traditional approach to cognition, which is somewhat limited, simplistic and reductive: the focus is on the brain, so, for example, in understanding cognition you might analyze various brain activation patterns, or when trying to understand a person, your sole focus is the individual itself. Enactive, embodied cognitive sciences, broadly speaking, are a pushback against this. Instead, they posit the idea that cognition extends outside the brain; the technology you interact with is an element of your cognition, it aids and extends your cognition. The person doesn’t end at the skin; rather, the person is necessarily also tangled up with the environment, the ecology, the milieu, and that includes the history, the context, the background. So the idea is that if you want to model the person, it’s not that the person exists as an island; rather, you have to take account of all those contingent factors, all those contextual phenomena that aren’t necessarily part of what being a person is from the “objective” perspective.

At the start of my PhD, coming from this disposition was already, for me, at least an improvement on the traditional approach, because I like including social factors. I like looking at these messy but important factors that contribute to who we are. So I was working within that framework, but there was always a frustration, because these approaches that see themselves as purely scientific tend to ignore social factors such as historical inequalities, or existing and past power dynamics. They tend to push these aside as not a scientific question, not something that has to do with understanding cognition. That’s when I turned to Black feminist studies and decolonial theories, because there you get a rich and in-depth understanding of the power dynamics that permeate society and the inequalities that have existed throughout history, that also exist now, and that are important contributing factors to racism, sexism, white supremacy, and so on. So I tried to bring together embodied cognitive science, systems thinking, and decolonial and Black feminist theories. At University College Dublin, where I did my PhD, the cognitive science program is under the School of Computer Science, so naturally, because I was working in a lab with a lot of computer scientists, I became more and more interested in how computer scientists build models of cognition, models of intelligence, models of emotion, models of social interaction. I was focusing more and more on how we acknowledge and account for inequalities that exist within society, and how we incorporate that into our modeling practices.

C/C: I want to dig a little bit more into the perspective of cognitive science, and your specific perspective regarding embodied cognition and Black feminist theories. I’m thinking about the picture of intelligence that artificial intelligence operates under. So if one way to describe artificial intelligence is as a prediction machine that turns things like language into vector representations, adjusts weights to build up this knowledge, and then uses that to predict the next token in a sentence, then we’re left with a model of human behavior that’s extremely predictable. Everything that anyone could possibly do can be represented mathematically and then predicted. So I’m wondering if you can reflect on what this picture of intelligence is and what it assumes about human behavior.

So what you find is that you might be able to build a model, but your model is always reductive, and at best you might be able to capture a snapshot of that moving target. But at worst, it might mislead you, it might be a really biased or stereotypical or misguided understanding of what you think you are modeling.

AB: So putting language in a vector and building that kind of model of language fits really neatly into traditional approaches to cognition and language, or to understanding the human condition in general. These traditions inherit a lot of individualistic thinking, where the idea is to get as “objective” as possible, and what you end up doing is simplifying language, simplifying intelligence, into something that can be measured, something that’s quantifiable, something that can be captured in data so you can model it. This way of thinking obviously has its own benefits, because instead of constantly questioning assumptions, you just take some things for granted. You cut yourself loose from these difficult questions and you just get on with the modeling and produce something. But the problem is that what you are producing does not really reflect language, does not reflect culture, does not reflect human cognition, because language or intelligence or cognition are not things that can be defined once and for all; they can’t be neatly captured in data and measured. These are things that are really messy. These are really ambiguous things that are open to interpretation, that are open to different points of view. There is no clear boundary, as I tried to describe earlier, between individual people – we exist in a web of relations. So, for example, if you want to build a model of me doing some kind of task, it becomes a subjective call to say, this is you, I’m modeling you. In order to build a model, you have to do that, but by doing it you create a boundary, saying this is where you end and where the environment begins. So what you find is that you might be able to build a model, but your model is always reductive, and at best you might be able to capture a snapshot of that moving target. But at worst, it might mislead you; it might be a really biased or stereotypical or misguided understanding of what you think you are modeling.
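As a rough illustration of the reduction being discussed, here is a minimal sketch, in Python, of next-token prediction from co-occurrence counts. This is not how any production language model is built (those learn vector representations and neural network weights rather than raw counts), and the names corpus, following, and predict_next are purely illustrative. The point is only that the “model” can do nothing but echo whatever patterns happen to be in its data.

```python
from collections import Counter, defaultdict

# Toy corpus: the "data" that language gets reduced to.
corpus = "the cat sat on the mat and the dog sat on the rug".split()

# Count which token follows which -- a crude stand-in for the learned
# weights of a large language model.
following = defaultdict(Counter)
for current, nxt in zip(corpus, corpus[1:]):
    following[current][nxt] += 1

def predict_next(token):
    """Return the continuation seen most often in the training data."""
    counts = following.get(token)
    return counts.most_common(1)[0][0] if counts else None

print(predict_next("sat"))  # 'on' -- the model only reflects its data
print(predict_next("the"))  # whichever word happened to follow "the" most often
```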

C/C: I guess we should take them at their word when they call them models, because they are models of reality, not reality itself.

AB: This is one of the biggest misunderstandings: people often mistake the model for reality and assume the model captures reality, when in fact no model can capture reality in its entirety; you might capture a tiny slice of it, as I said earlier.

C/C: One take on AI that I hear a lot, especially in the wake of all the creative text-to-image models that are coming out, is that these models are a sort of average of our entire collective intelligence because they’re trained on the entire internet of human words and images; they encode the entire collective body of art that anyone has ever produced. And then I think of your work on auditing training datasets and the biases that are encoded in those, and how this kind of thinking about models as reflecting collective intelligence actually reifies biases as what stands in for our collective imagination, which is not a very pleasant picture, to say the least. I’m wondering if this is an argument that you’ve come across and if you have any thoughts on this articulation of collective intelligence?

Image dataset used to train AI.

AB: I don’t know where to begin. There are so many issues with the current models that have been put forward. I guess I’ll start with the issue of dataset collection and curation. I feel that’s an issue people haven’t been paying much attention to. On the one hand, with these communities, these groups who claim to “democratize” datasets, one of the positive things coming out of such a practice of open sourcing datasets is that we are able to delve in and take a look at what these gigantic internet-sourced datasets look like. This is in a way positive, because previously these kinds of datasets have been closed off for various proprietary reasons. These datasets exist in huge corporate labs at Google, Facebook, and Amazon, or even OpenAI – despite the name “Open” AI, it does not open source much. The open source datasets are giving us insights into what large datasets collected from the internet look like.

My colleagues and I audited one of the biggest open source datasets, called LAION-400M. It’s been not even a year since it was released, and we audited it as soon as it came out, when it was 400 million image and text pairs. Within that amount of time, the dataset has grown to 5 billion image-text pairs. What we found was that the dataset, which comes mainly from the internet, contains very problematic content that shouldn’t be there, for example depictions of child rape, explicit images of children, and loads of images sourced from pornographic sites. So in its original release form, when we were probing the dataset for really benign words, such as mom, auntie, or Black woman, many of the images that were returned were really very unpleasant images of naked women from pornographic sites. This is really disturbing. Any dataset that’s sourced from the internet will always be problematic, because I remember reading somewhere that 65 to 70% of the internet is pornographic content. So if you don’t filter that out, or if you don’t do active curation and filtering, your datasets are going to be problematic. When you train your model on that dataset, the model is going to reflect the data, whether it’s producing art or it’s some other generative model.
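As a hedged sketch of what probing a released image-text dataset can look like in practice, the snippet below filters a metadata table by caption keyword. The file name image_text_pairs.parquet and the columns url and caption are assumptions for illustration (real LAION releases ship tabular metadata, but with their own schema), and keyword matching is only a first pass; the audit described above depends on actually inspecting the returned images.

```python
import pandas as pd

# Hypothetical metadata table of image-text pairs; the file name and the
# "url"/"caption" columns are illustrative assumptions, not a real schema.
pairs = pd.read_parquet("image_text_pairs.parquet")

def probe(term: str, n: int = 20) -> pd.DataFrame:
    """Return up to n pairs whose caption mentions the query term."""
    hits = pairs[pairs["caption"].str.contains(term, case=False, na=False)]
    return hits[["url", "caption"]].head(n)

# Probe with benign query words, as in the audit described above, then
# manually inspect what the returned images actually depict.
for query in ["mom", "auntie", "black woman"]:
    print(query, len(probe(query)))
```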

C/C: This reminds me of the point we arrived at earlier about the model versus reality and what happens when the model, which is not reality, starts affecting reality. That’s a great segue to talk about your work at the Mozilla Foundation and with the Trustworthy AI research group. Can you tell us a little bit more about what you’re working on there?

AB: I sit at the intersection of many disciplines, so I see myself as an interdisciplinary scholar. And I’ve always had projects that might at first seem unrelated, but in fact are connected. At the moment, I work on various projects that seem to go in different directions, but I’ll describe some of them.

One of the projects I’m working on is taxonomizing audit tools and mapping the landscape of algorithmic audit work. Auditing is a relatively new phenomenon. You can argue that algorithmic auditing really started with the Gender Shades paper; some might argue that there were a few works prior to that, but generally it’s recent. As a result, there are so many approaches, so many methodologies, so many papers, so many case studies, so many perspectives. You find auditing work driven by policy and regulation, motivated by holding accountable those responsible; you have auditing work that comes from the computer science tradition, where people are working on fairness questions; and you have various journalistic traditions, such as The Markup or ProPublica, where they delve into an algorithmic system. The point I’m trying to make is that the auditing landscape is huge, very broad. We are trying to assemble existing auditing tools into one space, taxonomized by various themes, and to create a report examining and mapping the auditing landscape.

I also work on dataset audits themselves, as a continuation of my previous work, where again we work as a team to try to get a handle on these newly released open source multimodal datasets. We are trying to find out whether they contain toxic content and whether there are flawed representations of cultures, individuals, communities, and genders. This is ongoing work.

I’ve also been doing theoretical work where, again as a collaboration, we are trying to flesh out the relationship between decolonial theories and machine learning: are they incompatible by nature? On the one hand, you have machine learning, where the very objective is to find similarities, to cluster, and to predict based on the patterns that have been found. On the other hand, if you go to decolonial theories, decolonization means first and foremost understanding; it means undoing past wrongs. Decolonization also means building anti-oppressive, anti-capitalist tools. So when you highlight all these characteristics of decolonization, what you find is that they stand in stark contrast to machine learning. We are working toward a way to have machine learning that is fully decolonial.
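To make the dataset-audit strand slightly more concrete, here is one possible first-pass check, again only a sketch under assumed file and column names and placeholder word lists: counting how often identity terms co-occur in captions with a small flag list. Real audits of multimodal datasets combine this kind of crude text screen with image-level and human review; nothing here should be read as the team’s actual methodology.

```python
from collections import Counter
import pandas as pd

# Illustrative assumptions: the file name, column names, and both word
# lists below are placeholders, not the audit team's actual choices.
pairs = pd.read_parquet("image_text_pairs.parquet")   # columns: "url", "caption"
identity_terms = ["woman", "girl", "mother"]
flag_terms = ["explicit", "nude", "porn"]

cooccurrence = Counter()
for caption in pairs["caption"].dropna().str.lower():
    for term in identity_terms:
        if term in caption and any(flag in caption for flag in flag_terms):
            cooccurrence[term] += 1

# How often each identity term co-occurs with a flagged term -- a signal
# for where closer manual inspection is needed, not a verdict in itself.
print(cooccurrence.most_common())
```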

C/C: All three projects are fascinating; I’d like to zero in on the last point about this potential incompatibility between machine learning and decolonial theory and practice. I know you’ve written a lot about the extension of colonialism through Big Tech and its global reach. A lot of what we’re thinking about in this project is how to foster a cultural exchange that is safe and inclusive across borders, especially in places around the world where online spaces are restricted. The challenges we face in the present moment with the pandemic seem to necessitate global coordination and communication. But these efforts are mediated by technologies that seem to be incompatible with, or at least not built according to, decolonial practices. So I’m wondering if you can talk about this tension and this paradox where technology creates a lot of problems that we then need to come together and work across borders to try to solve, and how we might effectively use technology to address these problems.

AB: That’s a really good question. 20 years ago, or even 15 or 10 years ago, if you looked at the most powerful businesses, they were automotive companies, food chains such as McDonald’s, and so on. But now, if you look at the highest revenue-generating companies, you will find Google, Facebook, Amazon. The tech industry has quickly become the richest, most powerful, most monopolizing industry, and a lot of that tech development is driven by profit maximization. On the one hand, you can say much of technology, especially machine learning, is really driven by objectives such as profit maximization and efficiency. We’ve actually done research on the underlying values of machine learning, where we analyzed the 100 most cited machine learning papers, and we found that for machine learning researchers, the most important values are things such as performance, efficiency, novelty, and so on. On the other hand, you go to the objectives of decoloniality. Again, decoloniality is not just one thing; rather, it has multiple objectives. But generally, it’s about justice, it’s about equality. It’s about undoing past wrongs, it’s about reviving, for example, languages lost due to white supremacy. So the objectives really are community building and equity, and they are far from commercial motives. These two sets of objectives always exist in conflict.

In the current work, where we are looking at the tension that exists between machine learning and theories of decoloniality, we actually look at the Maori community in New Zealand as an example of how to build machine learning models that are decolonial, that prioritize the interests of communities. What the Maori did in one of their projects is collect huge amounts of data, over 300 hours of voice data from elders, about the language itself and the culture. The idea is to get knowledge from them into data and build language models and machine translation systems that the younger generation can learn from. In the process, they were really careful about not giving any control or any hold of the data to corporate actors. They have developed their own data principles to ensure that they remain in control of the data. So I see that community as an example of where machine learning can be reborn, can be used for anti-oppressive purposes, where we can truly build decolonial AI.


Portrait Photograph Credit: Vincent Hoban/UCD