
Unlocking Research Opportunities: The Vision for a UK National Data Library

24 Feb 2025 by Magdalena Getler

Smart Data Foundry (SDF) has a clear vision of opening private sector financial data for societal good. Our part in achieving this future is to support research and innovation that addresses urgent challenges facing everyone, from cost-of-living crises and economic inequality to improving economic well-being.

This is no small task. Responsible access to this data encompasses everything from managing complex data operations in the Trusted Research Environment (TRE) to ensuring that only accredited researchers can access sensitive financial data. It also involves forming partnerships with private-sector data providers, engaging the public on ethics, and establishing guardrails for the use of data for the public good (a phrase that can carry different meanings for different people), all with the ultimate aim of driving demand among researchers. The notion of ‘build it, and they will come’ does not readily apply here, either.

I have been reading the proposals for the National Data Library (NDL), initiated by the Wellcome Trust and the Economic and Social Research Council (ESRC), with great interest. Some points resonated with me, especially those concerning the responsible handling of sensitive data and the future scope of the NDL. 

“Data is not like oil, it’s more like nuclear material.” 

Comparing data to oil has become somewhat of a cliché, but in the case of sensitive data, as one of the NDL proposals outlines, the reality is that ‘Data is not like oil; it’s more like nuclear material. Personal data has huge potential to do good; but also intrinsic risk. Small amounts of data achieve little and pose few risks. Data becomes powerful only when brought together, and refined: then it also becomes more dangerous…Data needs secure environments because if data leaks, it can’t be unleaked. Today sending data around to multiple locations, chaotically, for one off projects that are often trivial we sometimes treat national data like we treated nuclear material in the early days of the atomic age: enthusiastic amateurs, in unsafe conditions, painting glowing material on their teeth, in a clock factory’ [1]

So, how do we provide safe access to data for impactful research whilst protecting individuals' privacy and maintaining their trust?

Data minimisation is important; we need to be strategic about the data we acquire and curate for research purposes to counter the tendency to accumulate data for its own sake. One approach is to understand how the data we hold and access can answer our research questions or illuminate the problems we want to solve. For example, focusing on areas like productivity and prosperity, health and well-being, digital society, and sustainability [2] makes great sense, providing the focus for both those curating data and those requesting access to it.

In addition, a federated model to minimise data at the source represents a step in the right direction, especially considering the potential for linking data across sectors. For instance, at Smart Data Foundry, we are currently exploring how to link individual-level health and finance data using a privacy-preserving records linkage methodology. As you can imagine, this presents a significant challenge, and we will share more about this project as it progresses. 
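To make the idea of privacy-preserving record linkage more concrete, here is a minimal sketch of one simple variant: both data holders replace direct identifiers with a keyed hash, and records are joined on the resulting tokens only. This is an illustrative assumption, not a description of the methodology Smart Data Foundry is exploring. The function names, the shared key, and the sample records are all hypothetical, and real-world schemes often use more tolerant encodings (for example Bloom filters) to cope with typos and missing values.

```python
import hashlib
import hmac


def link_token(secret_key: bytes, name: str, date_of_birth: str, postcode: str) -> str:
    """Return a keyed hash of normalised identifiers (hypothetical helper)."""
    # Normalise so trivial formatting differences do not break the match.
    parts = (name, date_of_birth, postcode)
    normalised = "|".join(p.strip().lower().replace(" ", "") for p in parts)
    return hmac.new(secret_key, normalised.encode("utf-8"), hashlib.sha256).hexdigest()


def tokenise(records, secret_key):
    """Map each record's link token to its non-identifying payload."""
    return {
        link_token(secret_key, r["name"], r["dob"], r["postcode"]): r["payload"]
        for r in records
    }


def link(health_tokens, finance_tokens):
    """Join two tokenised datasets on shared tokens; raw identifiers are never compared."""
    shared = health_tokens.keys() & finance_tokens.keys()
    return [(health_tokens[t], finance_tokens[t]) for t in sorted(shared)]


# Assumption: both data holders have agreed this key out of band.
SHARED_KEY = b"hypothetical-key-agreed-by-both-data-holders"

# Entirely made-up sample records.
health = [
    {"name": "Ada Smith", "dob": "1980-01-02", "postcode": "EH1 1AA",
     "payload": {"condition": "asthma"}},
    {"name": "Ben Jones", "dob": "1975-06-30", "postcode": "G1 2BB",
     "payload": {"condition": "none"}},
]
finance = [
    {"name": "ada smith ", "dob": "1980-01-02", "postcode": "eh11aa",
     "payload": {"income_band": "B"}},
    {"name": "Cara Lee", "dob": "1990-09-09", "postcode": "AB1 3CC",
     "payload": {"income_band": "A"}},
]

linked = link(tokenise(health, SHARED_KEY), tokenise(finance, SHARED_KEY))
print(linked)
```

In this sketch only the tokens and the non-identifying payloads would need to meet in a secure environment. Exact-match hashing is the simplest possible design and fails on any spelling difference that normalisation does not remove, which is one reason linkage across sectors is a significant challenge.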

The role of a National Data Library for research 

Ensuring that “the NDL adds something new, or improves or replaces what already exists” [3] is an obvious yet critically important statement. The authors of this response to Wellcome’s National Data Library challenge provide a crash course on how crowded the UK public sector data ecosystem is and emphasise the need for long-term funding for data infrastructure to ensure sustainability, regardless of shifting political interests.

They also emphasise the importance of establishing the right infrastructure for researchers to improve access to government-held data and to maintain a clear vision. Recent comments from Science and Technology Minister Peter Kyle seem to conflate data for research with improving operational data access for the public sector. There are technical and legal differences between research data and operational data, so we must be cautious not to confuse the needs of service delivery with those of researchers.

Also crucial for enhancing researcher access is the machine-readability of data catalogues, curation, and clear documentation. In particular, appropriate licensing and provenance information should be attached to datasets so researchers understand what data they are accessing and how it can be used.  
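As a rough illustration of what a machine-readable catalogue entry with licensing and provenance attached might look like, here is a minimal sketch. The field names, the `CatalogueEntry` class, the `missing_fields` check, and the sample dataset are all hypothetical; they are not drawn from any NDL proposal or from a specific metadata standard.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class CatalogueEntry:
    """Hypothetical catalogue record for one dataset."""
    title: str
    description: str
    licence: str             # how the data may be used
    provenance: list         # where the data came from and how it was processed
    access_conditions: str   # who may access it, and where
    keywords: list = field(default_factory=list)


# Fields a researcher needs in order to know what the data is and how it can be used.
REQUIRED = ("title", "description", "licence", "provenance", "access_conditions")


def missing_fields(entry: CatalogueEntry) -> list:
    """Return the names of required fields that are empty."""
    data = asdict(entry)
    return [name for name in REQUIRED if not data[name]]


# Made-up example entry.
entry = CatalogueEntry(
    title="Example income volatility indicators",
    description="Aggregated monthly indicators derived from de-identified account data.",
    licence="Research use only, under a data sharing agreement",
    provenance=[
        "Supplied by a hypothetical banking data partner",
        "De-identified and aggregated before cataloguing",
    ],
    access_conditions="Accredited researchers, inside a Trusted Research Environment",
    keywords=["income", "volatility"],
)

# Serialise to JSON so other tools can harvest and search the catalogue.
record = json.dumps(asdict(entry), indent=2)
print(record)
print(missing_fields(entry))
```

A check like `missing_fields` could run before an entry is published, so that datasets without licence or provenance information never reach the catalogue in the first place.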

The future scope of a National Data Library 

From Smart Data Foundry’s perspective, we are also interested in the potential for the NDL to include private-sector data. Naturally, this is easier said than done, but wouldn’t it be great if we could facilitate the safe and effective use and flow of data across administrative and research contexts, as well as among the private, governmental, and academic sectors?  

Integrating data infrastructures from traditional and administrative sources with smart data infrastructure would fundamentally enhance the discoverability of data available for research and insights, while ensuring equitable access to this data for the public good. The primary challenge will be building a cross-cutting infrastructure that research users want to engage with, enabling smooth collaboration across various data types, disciplines, and domains. 

Assuring private sector data partners 

Another critical piece of the puzzle is private-sector data partners, who must also be assured that allowing researchers access to their data is a good idea, compatible with their commercial interests, and acceptable to their customers. The Data (Use and Access) Bill, currently making its way through Parliament, will create an obligation for private-sector organisations to share data for research purposes on request, and it will be interesting to see how the industry responds to this.

Currently, private-sector data partners approach data sharing for research with caution. They understandably express concerns about customer privacy, security implications, legal compliance, and the potential undermining of commercial interests through information disclosure. Data partners require reassurance that sharing data will not adversely affect them and that sharing data for research will support their corporate objectives. Our role at Smart Data Foundry centres on providing such assurances and on demonstrating how data can provide socially beneficial insights into human behaviour, leading to improved policy analysis and decision-making. At the same time, we mitigate risks with robust data sharing agreements and safeguards: for example, anonymised individual-level data never leaves our TRE, and we follow best practices such as the Five Safes Framework.

Final thoughts 

A National Data Library for researchers would be a welcome addition to UK research data infrastructure, provided it solves the challenges researchers face: the complexity of data silos, a lack of clarity over what data is accessible, and uncertainty about whether the data can answer their questions.

The challenges of opening up public and private sector data for research multiply the deeper we delve, but our work gives us hope that, with hard work and robust partnerships, we will move towards a better future. Now, more than ever, there is a pressing need to bring together government, NGOs, academia and the public and private sectors to tackle the socioeconomic challenges we all face. A National Data Library could help us reach this goal sooner.


Get in touch

Magdalena Getler
Head of Academic Engagement
magdalena.getler@smartdatafoundry.com
