The State's Politics of Fake Data

Chuncheng Liu, Microsoft Research
danah boyd, Microsoft Research

States monopolize symbolic power to legitimize the data they create, use, and share. State data, in turn, serve to legitimize the state. ‘Fake data’ are commonly understood as either technically mistaken data (failed data) or politically misleading data (fraudulent data). Under this understanding, state data are characterized in an essentialist fashion as either true or false. In this paper, we deconstruct the common understanding of ‘fake data’ through two contrasting public sector cases drawn from our original research, which involved detailed ethnographic, interview-based, and archival data. The first focuses on how Chinese street-level bureaucrats produce data about neighborhood problems and activities. The second examines an aspect of the United States Census Bureau’s production of the 2020 census data. We map out how the state produces, understands, and navigates ‘fake data,’ while also considering how ‘fake data’ are practiced and interpreted by diverse stakeholders. We problematize the notion that ‘fake data’ are an essential, isolated, and stable object. This conceptualization reinforces the idea that ‘real data’ can be found “in the wild” and ignores the social construction of both realness and fakeness. Instead, we argue that the ‘fakeness’ of ‘fake data’ is spectral, processual, and performative. We elaborate on these arguments by tracing how ‘fake data’ are produced across four data-making stages: creation, correction, rendering, and practicing. In each stage, ‘fake data’ are produced by multiple actors. However, our examination shows that these ‘fake data’ have contrasting meanings and serve diverse purposes. As a result, that which is labeled ‘fake’ does not always fit the essentialist conceptualization, while stakeholders’ reactions to ‘fake data’ are often rooted in an essentialist notion of ‘real data.’ We urge all who are invested in data to move beyond debates over what data represent and to consider what work data do across their social life and trajectories.

No extended abstract or paper available

Presented in Session 30. History of Data and Statistics