Scaling Up: The radical challenge of democratic data governance

Published on February 24, 2022

Professor Sabina Leonelli

University of Exeter and Wissenschaftskolleg zu Berlin


Commentary

Professor Sabina Leonelli
February 24, 2021

The utopia of a globalized data space, in which information seamlessly flows to support decision making across domains and locations, has long haunted the dreams of innovators. From the ideal of universal libraries that made Nineveh, Babylon, and Alexandria into capitals of the ancient world, to the circulation of knowledge and goods overseen by modern colonial empires, to the information networks envisaged by neoliberal economists as the foundation of free-market capitalism, data sharing has taken various forms through the ages. Yet a common denominator remained: the intuition that controlling information is a primary source of power and that the ability to access and use data can foster economic and social development for certain members of society—without relying on democratic rule and benefit-sharing mechanisms.

In the age of digitalization and the internet, with data technologies increasingly defining social life on and off screen, this intuition has again proved correct. Questions of data ownership and control are now often considered individual decisions, as data are formally recognized and traded as commodities; yet individual agency in the data economy has shrunk, with a few organisations dominating the conditions under which information can be exchanged and used, to the detriment of individual rights and collective action. This has come as a shock to those who hoped digital technologies would serve as democratizing tools, enabling dialogue across diverse social groups and a more equal distribution of information and related power. Given the history of data-sharing efforts, however, the realization that a globalized data economy is not supportive of democracy should not come as a surprise.

Matt Prewitt and Divya Siddarth ask how we can turn things around: whether data can be used as a force for social good, and the data economy governed according to democratic rule. Their solution lies in what they call “accountable intermediaries”, institutions that steward data resources in ways that consider relevant stakeholders, thus “unlocking public benefits while protecting individual and community choice”. This proposal is truly radical, flying in the face of deeply rooted contemporary data hegemonies and the capital-oriented, competition-driven system that feeds the current data economy. And yet, somehow, it does not sound new.

Over the last two decades of working with many stewards of data infrastructures, I have witnessed several attempts to muster the latest technologies and organizational insights to develop and maintain reliable, extensive, and widely accessible databases. My focus has been on the management of research data, particularly data produced by publicly funded bodies. Among the bodies responsible for data governance are: institutes, like the Secure Anonymised Information Linkage database in Wales, which developed from a data-provision service to a recognized authority on whether and how to use medical data and how to do so responsibly; consortia, tasked with maintaining the semantics of research databases like Gene Ontology, Crop Ontology, and many other networks dedicated to managing keywords for the retrieval and reutilization of data; countless databases, devoted to the collection of data on specific diseases, organisms, ecosystems, and/or locations; and expert groups, formed to discuss and define data standards, such as the hundreds of working groups hosted by the Research Data Alliance and multiple task forces sponsored by national funders, charities, and non-governmental organisations around the world.

Many of these are wonderful examples of precisely the accountable intermediation that Prewitt and Siddarth propose. In a recent report for the Food and Agriculture Organisation, we discussed such intermediation as a form of meso-governance, layered strategically between local initiatives and top-down policy. This type of data governance is grounded in the recognition that multiple stakeholders matter, that data infrastructures are accountable to both their users and whoever may be affected by the knowledge produced, and that accountable data governance involves a serious restructuring of the values and motivations driving research. Hence, data stewards must work to devise venues and formats to regularly engage multiple audiences in their work. Prime examples here are the communities of practice brought together to debate how to share plant genetic data, which encompass data scientists, infrastructure providers, farmers, breeders, policy-makers, and others.

Such meso-level initiatives play a critical role in the data landscape. So why are they not more widely known and used as models around the world? The key issue is that of scale. The obstacles to scaling up these efforts should not be underestimated. These forms of data governance have never worked beyond specific domains; their effectiveness is profoundly tied to local circumstances. The truly radical challenge of our time is scaling up these efforts to achieve democratic data governance. This has long been recognized by organisations like the European Commission, whose own conflicting accountabilities to member states mirror the problem of making data governance accountable to an increasingly large and diversified public. As in the case of the European Union, the effective federation of data efforts is not just a question of investment, though such an operation undoubtedly requires money. Scaling up requires a cultural shift in both how research is organized and how data governance is approached.

On the one hand, accountable intermediation demands a reorganization of priorities and evaluative systems for research, which places infrastructures and community service at the centre of the academic ethos, with competition playing a secondary role. Data stewards often have unorthodox careers in the name of fostering good science. Their work is viewed as “service” rather than an integral part of research and does not necessarily lead to high-impact publications. Hence such work is not typically rewarded through existing hiring and promotion trajectories, which adds both fragility to the data ecosystem and friction to the dialogue between data stewards and domain experts. This situation is not sustainable, nor is it conducive to wide-ranging debate and long-term collaboration around the potential uses of data.

On the other hand, accountable intermediation demands a recognition that value choices lie at the heart of scientific efforts. Data governance cannot be apolitical. The meaning of democracy in relation to data must be spelled out from the outset of any data initiative, and constantly debated and revised as projects and infrastructures develop and expand. In practical terms, data stewards should not just tolerate, but actively foster pluralism and engagement among participants. Democratic values can and do emerge in all aspects of data governance, particularly technical choices about the conditions under which data can be accessed, analysed, and shared. Standards play an important role, but they are not magical, one-off solutions, nor are they neutral instruments with infinite flexibility. Standards are social objects whose adoption and scope depend on their fit to diverse and rapidly changing conditions. They define who is included or excluded from using data systems, and their effectiveness cannot be evaluated without consideration of their wider social roles.

This cultural and political mandate has become evident in many of the data-governance efforts cited above. Several data infrastructures have found themselves playing an unanticipated socio-political role. This political turn has taken some data stewards—especially those coming from hard-science backgrounds—by surprise, causing discomfort among scientists who thought they were creating a purely technical platform only to discover that such a platform could not exist without a normative vision for the role of data in society. The use of data to address the COVID-19 pandemic has brought such issues into stark relief, as questions of data access, accuracy, and use became a matter of daily dispute on mass media and social networks.

Acknowledging the political foundations and goals of data governance today means re-inventing existing approaches to developing data infrastructures. The question underpinning data sharing should never be whether data should be made openly accessible. The problem of data access merely distracts from the real issue with data governance, which is who decides how data are used and under which conditions. This is the key question today for both democracy and research. Answering it requires significant collective action.

 

*The author has received funding from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (grant agreement No. 101001145) as well as the Alan Turing Institute and the Wissenschaftskolleg zu Berlin. This paper reflects only the author’s view, and the funders are not responsible for any use that may be made of the information it contains.

