Decentralization and Data Flows
There is more to decentralization than just where the data is stored. How the data flows and whether it is portable are important to achieving the true benefits of decentralization.
Chris Riley
Chris Riley is the executive director of the nonprofit Data Transfer Initiative and a distinguished research fellow at the Annenberg Public Policy Center at the University of Pennsylvania; he holds a Ph.D. in computer science from Johns Hopkins University and a J.D. from Yale Law School.

Decentralized is the new disruptive; a large piece of the narrative power motivating enthusiasm around distributed ledger technologies (or blockchain, or Web3, depending on the author and context) is the potential for transforming power dynamics and resources as compared to existing ecosystems. But more is needed to realize the normative goals under the concept of decentralization than merely setting up servers in a different way. Notably, like the proverbial spice, the data must flow — both in and out, as users dictate. Where data flows are unduly limited, so too is decentralized infrastructure. Data portability reflects a particular way in which data can flow or be blocked, and designing policy around data portability to maximize the control of technology users and data subjects — including promoting reciprocity of transfers — ultimately promotes both effective data flow and meaningful decentralization.

One large and well-studied category of obstacles to free data flows arises from the tension between the global internet and national law. Some of these issues arise from laws and regulations that fall clearly into the category of protectionism, such as Russia’s notorious data localization mandate. Others arise from differences in protections for fundamental rights, including the long-running tensions between the European Union and the United States over data protection. Despite years of investment in mechanisms for legitimate data transfers (such as the “Privacy Shield” processes), obstacles remain, including the Irish data protection authority’s 2023 response to Meta’s transfer of data on European users to the United States for processing.

Separate from cross-border concerns, intranational considerations also impose (often highly legitimate and necessary) limitations on the free flow of data. In particular, privacy laws ensure that individuals maintain ultimate control over the use and transfer of their data in various ways, and these obligations generally supersede the value of free flow of (personal) data. Data portability, through public policy and tools, works at this intersection to ensure that users can transfer their data to the services of their choice; thus, the General Data Protection Regulation in the EU includes an explicit right to data portability. Here too, despite the best of intentions, things can go awry, notably where motivations other than protection of the subject of the data are given excess weight.

Notably, the European Union’s forthcoming Data Act allows users to request certain data regarding their use of connected devices, but expressly prohibits users from sharing such data with entities designated as “gatekeepers.” Restrictions on user choice motivated by competition considerations, such as the Data Act’s gatekeeper language, appear to force a split between decentralization and data flows, effectively forcing data to pool in multiple places. The policy intends to motivate users to migrate from large platforms to small ones, but should they then wish to move back, or to a different large platform, they will be unable to do so.

Like decentralization, the free flow of data is not an unequivocal good, and the design of the mechanisms of data flow contributes substantially to its proclivity for good or bad outcomes. Mark Nottingham’s IETF proposal notes that “not all centralization is avoidable, and in some cases, it is even desirable.” Nottingham’s characteristics of the kinds of centralization that should be regarded as harmful also broadly apply to those restrictions on data flows that should be concerning: a restriction on data flow “is most concerning when it is not broadly held to be necessary, when it has no checks, balances, or other mechanisms of accountability, when it selects ‘favorites’ which are difficult (or impossible) to displace, and when it threatens to diminish the success factors that enable the Internet to thrive.”

One of the foundational principles of the Data Transfer Initiative, established in the earliest days of development of the Data Transfer Project codebase, is reciprocity: services that let users transfer data in should also allow users to transfer their data out. As a baseline, data portability is in general a user right, so all services should allow users to download their data, regardless. But there is substantial value for both users and businesses in going beyond this minimum and actively facilitating direct transfers. Tools supported by DTI, such as Meta’s Transfer Your Information tool for Facebook, make it easy for users to transfer their photos and other personal data from one service to another, with safeguards to ensure the legitimacy of the request as well as its proper scope.

For users, direct transfer technologies eliminate the need for potentially slow and costly downloads and uploads to devices that may not have adequate storage or processing power. Also, the use of adapters to translate between services minimizes inconsistencies in how data is stored and used between two different services. Businesses, in turn, benefit through improved user experience, trust, and sentiment, while reducing costs and technical challenges associated with importing data.

Thus, for businesses seeking to benefit from these advantages through DTI-supported tools, reciprocity is expected. While the language and the execution of the principle may be oriented towards service providers, user interests are at its core. Reciprocity not only ensures that users can move their data to the services of their choice; it also encourages users to experiment with new services, because a user who decides not to continue with the experiment remains free to move any new data they have created back to their original service provider.

Decentralization depends on data flows, and balancing the technology’s policies and public policy’s technicalities in a manner that keeps user interests at the core, including promoting reciprocity, offers the best path forward for promoting both data flows and decentralization.