It’s no secret — even if it hasn’t yet been clearly or widely articulated — that our lives and our data are increasingly intertwined, almost indistinguishable. To be able to function in modern society is to submit to demands for ID numbers, for financial information, for filling out digital fields and drop-down boxes with our demographic details. Such submission, in all senses of the word, can push our lives in very particular and often troubling directions. It’s only recently, though, that I’ve seen someone try to work through the deeper implications of what happens when our data — and the formats it’s required to fit — become an inextricable part of our existence, like a new limb or organ to which we must adapt. “I don’t want to claim we are only data and nothing but data,” says Colin Koopman, chairman of the philosophy department at the University of Oregon and the author of “How We Became Our Data.” “My claim is you are your data, too.” Which at the very least means we should be thinking about this transformation beyond the most obvious data-security concerns. “We’re strikingly lackadaisical,” says Koopman, who is working on a follow-up book, tentatively titled “Data Equals,” “about how much attention we give to: What are these data showing? What assumptions are built into configuring data in a given way? What inequalities are baked into these data systems? We need to be doing more work on this.”
Can you explain more what it means to say that we have become our data? Because a natural reaction to that might be, well, no, I’m my mind, I’m my body, I’m not numbers in a database — even if I understand that those numbers in that database have real bearing on my life.

The claim that we are data can also be taken as a claim that we live our lives through our data in addition to living our lives through our bodies, through our minds, through whatever else. I like to take a historical perspective on this. If you wind the clock back a couple hundred years or go to certain communities, the pushback wouldn’t be, “I’m my body”; the pushback would be, “I’m my soul.” We have these evolving perceptions of our self. I don’t want to deny anybody that, yeah, you are your soul. My claim is that your data has become something that is increasingly inescapable, and certainly inescapable in the sense of being obligatory for your average person living out their life. There’s so much of our lives that is woven through or made possible by various data points that we accumulate around ourselves — and that’s interesting and concerning. It now becomes possible to say: “These data points are essential to who I am. I need to tend to them, and I feel overwhelmed by them. I feel like they’re being manipulated beyond my control.” A lot of people have that relationship to their credit score, for example. It’s both very important to them and very mysterious.
When it comes to something like our credit scores, I think most of us can understand on a basic level that, yes, it’s weird and troubling that we don’t have clear ideas about how our personal data is used to generate those scores, and that unease is made worse by the fact that those scores then limit what we can and can’t do. But what does the use of our data in that way in the first place suggest, in the biggest possible sense, about our place in society?

The informational sides of ourselves clarify that we are vulnerable. Vulnerable in the sense of being exposed to big, impersonal systems or systemic fluctuations. To draw a parallel: I may have this sense that if I go jogging and take my vitamins and eat healthy, my body’s going to be good. But then there’s this pandemic, and we realize that we’re actually supervulnerable. The control that I have over my body? That’s actually not my control. That was a set of social structures. So with respect to data, we see that structure set up in a way where people have a cleaner view of that vulnerability. We’re in this position of, I’m taking my best guess at how to optimize my credit score or, if I own a small business, how to optimize my search-engine ranking. We’re simultaneously loading more and more of our lives into these systems and feeling that we have little to no control or understanding of how these systems work. It creates a big democratic deficit. It undermines our sense of our own ability to engage democratically in some of the basic terms through which we’re living with others in society. A lot of that is not an effect of the technologies themselves. A lot of it is the ways in which our culture tends to want to think of technology, especially information technology, as this glistening, exciting thing, and its importance is premised on its being beyond your comprehension. But I think there’s a lot we can come to terms with concerning, say, a database into which we’ve been loaded. I can be involved in a debate about whether a database should store data on a person’s race. That’s a question we can see ourselves democratically engaging in.
But it’s almost impossible to function in the world without participating in these data systems that we’re told are mandatory. It’s not as if we can just opt out. So what’s the way forward?

There are two basic paths that I see. One is what I’ll call the liberties or freedoms or rights path, which is a concern with: How are these data systems proscribing my freedoms? It’s something we ought to be attentive to, but it’s easy to lose sight of another question that I take to be as important. This is the question of equality and the implications of these data systems’ being obligatory. Any time something is obligatory, that becomes a terrain for potential inequality. We see this in the case of racial inequality a hundred years ago, where you get profound impacts through things like redlining. Some people were systematically locked out because of these data systems. You see that happening in domain after domain. You get these data systems that load people in, but it’s clear there wasn’t sufficient care taken for the unequal effects of this datafication.
But what do we do about it?

We need to realize there’s a debate to be had about what equality means and what equality requires. The good news, to the extent that there is any, about the evolution of democracy over the 20th century is that you get the extension of this basic commitment to equality to more and more domains. Data is one more space where we need that attention to, and cultivation of, equality. We’ve lost sight of that. We’re still in this Wild West, highly unregulated terrain where inequality is just piling up.
I’m still not quite seeing what the alternative is. I mean, we live in an interconnected world of billions of people. So isn’t it necessarily the case that there have to be collection and flows and formatting of personal information that we’re not going to be fully aware of or understand? How could the world operate otherwise?

What we need is not strikingly new: Industrialized liberal democracies have a decent track record of putting in place policies, regulations and laws that guide the development and use of highly specialized technologies. Think of all the F.D.A. regulations around the development and delivery of pharmaceuticals. I don’t see anything about data technology that breaks the model of administrative state governance. The problem is basically a tractable one. I also think this is why it’s important to understand that there are two basic components to a data system. There’s the algorithm, and there are the formats, or what computer scientists call the data structures. The algorithms feel pretty intractable. People could go and learn about them or teach themselves to code, but you don’t even have to go to that level of expertise to get inside formatting. There are examples that are pretty clear: You’re signing up for some new social-media account or website, and you’ve got to put in personal information about yourself, and there’s a gender drop-down. Does this drop-down say male-female, or does it have a wider range of categories? There’s a lot to think about with respect to a gender drop-down. Should there be some regulations or guidance around the use of gender data in K-12 education? Might those regulations look different in higher education? Might they look different in medical settings? That basic regulatory approach is a valuable one, but we’ve run up against the wall of unbridled data acquisition by these huge corporations. They’ve set up this model of: You don’t understand what we do, but trust us that you need us, and we’re going to vacuum up all your data in the process. These companies have really evaded regulation for a while.
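To make Koopman’s algorithm-versus-format distinction concrete, here is a minimal editorial sketch (not from the interview; the field names and category lists are hypothetical) of how a drop-down’s allowed values get fixed in a data structure before any algorithm ever runs:

```python
# Editorial sketch (hypothetical): a "format" decision is made before any
# algorithm runs. The set of allowed values for a gender field is fixed
# in the data structure itself.
from dataclasses import dataclass

# Version 1: a narrow format. Anyone outside these two categories simply
# cannot be represented in the system.
GENDER_V1 = ("male", "female")

# Version 2: a wider format, including an option to decline.
GENDER_V2 = ("male", "female", "nonbinary", "self-described", "prefer not to say")

@dataclass
class SignupRecord:
    user_id: int
    gender: str

def fits_format(record: SignupRecord, allowed: tuple[str, ...]) -> bool:
    """Return True only if the record's gender value exists in the chosen format."""
    return record.gender in allowed

record = SignupRecord(user_id=1, gender="nonbinary")
print(fits_format(record, GENDER_V1))  # False: the narrow format cannot represent this person
print(fits_format(record, GENDER_V2))  # True: the wider format can
```

The point of the sketch is that the consequential choice, which categories exist at all, is a formatting decision anyone can read and debate; no coding expertise is required.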
Where do you see the most significant personal-data inequalities playing out right now?

In the literature on algorithmic bias, there’s a host of examples: facial-recognition software misclassifying Black faces, cases of bias in medical-informatics A.I. systems. These cases are clear-cut, but the problem is they’re all one-offs. The challenge that we need to meet is: How do we develop a broader regulatory framework around this? How do we get a more principled approach so that we’re not playing whack-a-mole with issues of algorithmic bias? The way the mole gets whacked now is that whatever company developed a problematic system just kind of turns it off and then apologizes — taking cues from Mark Zuckerberg and all the infinite ways he’s mucked things up and then squeaked out with this very sincere apology. All the talk about this now tends to focus on “algorithmic fairness.” The spirit is there, but a focus on algorithms is too narrow, and a focus on fairness is also too narrow. You also have to consider what I would call openness of opportunity.
Which means what in this context?

To try to illustrate this: You can have a procedurally fair system that does not take into account the different opportunities that differently situated individuals coming into the system might have. Think about a mortgage-lending algorithm. Or another example is a court. Different people come in differently situated, with different opportunities by virtue of social location, background, history. If you have a system that’s procedurally fair in the sense of, We’re not going to make any of the existing inequalities any worse, that’s not enough. A fuller approach would be reparative with respect to the ongoing reproduction of historical inequalities. Those would be systems that take into account the ways in which people are differently situated and what we can do to create a more equal playing field while maintaining procedural fairness. Algorithmic fairness swallows up all the airtime, but it’s not getting at those deeper problems. I think a lot of this focus on algorithms is coming out of think tanks and research institutes that are funded by or started up by some of these Big Tech corporations. Imagine if the leading research in environmental regulation or energy policy were coming out of think tanks funded by Big Oil. People ought to be like: If Microsoft is funding this think tank that is supposed to be providing guidance for Big Tech, shouldn’t we be skeptical? It ought to be scandalous. That’s kind of a long, winding answer. But that’s what you get when you talk to a philosophy professor!
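As a rough editorial illustration of that distinction (the threshold rule and the numbers are made up, not Koopman’s), a procedurally fair rule can apply one identical test to everyone while still passing historical inequality straight through, because the input feature itself encodes unequal starting points:

```python
# Editorial sketch (hypothetical numbers): procedural fairness means one
# identical rule for every applicant, but the rule never sees why the
# inputs differ in the first place.

def approve_mortgage(credit_history_years: float, threshold: float = 5.0) -> bool:
    """Apply the same threshold to every applicant: procedural fairness."""
    return credit_history_years >= threshold

# Applicant B's shorter credit history may reflect having been historically
# locked out of credit (e.g., by redlining), which the procedure never sees.
applicant_a_years = 8.0
applicant_b_years = 2.0

print(approve_mortgage(applicant_a_years))  # True
print(approve_mortgage(applicant_b_years))  # False: same procedure, unequal opportunity
```

A reparative system in Koopman’s sense would have to ask where the inputs come from, not only whether the rule is applied evenly.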
This interview has been edited and condensed from two conversations.
David Marchese is a staff writer for the magazine and writes the Talk column.