19 April 2020

NHS and technology partners develop Covid-19 Datastore

As the Covid-19 pandemic continues, NHS England has announced plans to partner with several tech companies to build a data platform to better inform the response to coronavirus. In this article we report on the plans and consider some of the data protection issues they raise.

The UK Government has announced that NHS England is to partner with a number of technology companies to develop and build a comprehensive data platform (the Covid-19 ‘Datastore’) to better inform the national response to Covid-19.

Those plans follow an announcement by the Secretary of State for Health, Matt Hancock, that notice had been served under the Health Service (Control of Patient Information) Regulations 2002 (COPI), requiring organisations to process confidential patient information to support the Secretary of State’s response to Covid-19. If the notice is valid, it would compel the dissemination of confidential patient information to persons and organisations permitted to process that information under COPI. Those organisations are “Organisations providing health services, General Practices, Local Authorities [and] Arm’s (sic) Length Bodies of the Department of Health and Social Care”.

The overarching aim of the Datastore initiative is to facilitate gathering and collecting in one place the enormous amount of disparate data that is continuously generated throughout the NHS and social care sector so that informed decisions may be made.

The power to utilise data

In order to respond to the Covid-19 pandemic in a timely manner, the Government and NHS decision-makers need consolidated, accurate and real-time information. While those data have already been collected in various parts of the NHS, their value is enhanced if they are linked to and integrated with other datasets so as to produce a single body of reliable and up-to-date information.

Mining the Datastore to create digital dashboards, for instance, will enable healthcare researchers and advisers to better understand how and where the virus is spreading. This will enable the identification of emerging hotspots which, in turn, will inform how the NHS and social care service providers can best respond, for instance by determining the optimum length of a hospital stay for a symptomatic Covid-19 patient. Simulations using those data will also help officials understand the efficacy of interventions such as social distancing, school closures, and local household quarantines. The Datastore will also assist in anticipating and fulfilling supply needs, for example by determining locations where ventilators and other equipment are likely to be needed.

NHSX, the digital transformation arm of the NHS, and NHS England and Improvement have been tasked with contracting and entering into data processing agreements with tech companies to develop the Covid-19 Datastore. In particular, Palantir will perform a processing function, so as to play a central role in developing the Datastore. Palantir will be responsible for integrating NHS datasets, using its data-management platform, known as Foundry.

Palantir has experience in such projects, having helped the US government coordinate its response to the cholera outbreak in Haiti following the earthquake in 2010. The London-based machine learning company Faculty AI, previously known as ASI Data Science, is also supporting the wider development of the project and will provide analytic support with the necessary models and simulations.

Microsoft, Amazon Web Services, and Google are also understood to be involved in the Covid-19 Datastore: Microsoft is helping to build the Datastore using its cloud platform, Azure, so as to hold the data in a single secure location; Amazon Web Services will be providing infrastructure and technologies; and certain software tools in Google’s G Suite family will allow the NHS to collect real-time information on hospital responses to coronavirus, including aggregated operational data on hospital occupancy levels and A&E capacity.

Data being used

It is understood that the data being used to create the interactive Datastore dashboards will be drawn from existing NHS data sources, and will include information derived from calls to the NHS 111 and Covid-19 test results enquiry lines. Data concerning where ventilators are being deployed, the incidence of healthcare staff illness, and levels of patient occupancy will also be included.

However, the Datastore is not intended to include personally identifiable health data, although those data may be present on the dashboard in an ‘anonymised and aggregated form’. Those data may include clinical information about patients in intensive care, and data concerning gender, symptoms, and prescription details.

Other personal data concerning geographical location, addresses and phone numbers may be included, though it is unclear whether those data will be blurred – a process known as Barnardisation. It has also been reported that a ‘pseudo NHS number’ may be used to cross-match large datasets.
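By way of illustration only, the reported use of a ‘pseudo NHS number’ to cross-match datasets could work along the following lines. This is a minimal sketch and not a description of the Datastore itself: the function names, field names and the choice of a keyed hash (HMAC-SHA256) are all assumptions made for the example, and the NHS number shown is a dummy value. Note that data linked in this way would normally remain pseudonymised rather than anonymised, because anyone holding the key can regenerate the pseudonym from the real number.

```python
import hashlib
import hmac


def pseudo_id(nhs_number: str, secret_key: bytes) -> str:
    """Derive a stable pseudonym from an identifier using a keyed hash.

    Hypothetical helper: the same number and key always give the same
    pseudonym, so records can be matched without exposing the number.
    """
    normalised = nhs_number.replace(" ", "")
    return hmac.new(secret_key, normalised.encode("utf-8"), hashlib.sha256).hexdigest()


def link_records(left, right, secret_key):
    """Join two lists of dicts on the pseudonym, dropping the raw identifier."""
    index = {}
    for row in left:
        key = pseudo_id(row["nhs_number"], secret_key)
        index[key] = {k: v for k, v in row.items() if k != "nhs_number"}

    linked = []
    for row in right:
        key = pseudo_id(row["nhs_number"], secret_key)
        if key in index:
            merged = {"pseudo_id": key, **index[key]}
            merged.update({k: v for k, v in row.items() if k != "nhs_number"})
            linked.append(merged)
    return linked


# Illustrative use with dummy records (all values are made up).
key = b"example-key-held-by-the-data-controller"
calls = [{"nhs_number": "943 476 5919", "called_111": True}]
tests = [{"nhs_number": "9434765919", "test_result": "positive"}]
linked = link_records(calls, tests, key)
```

In this sketch the linked record carries only the pseudonym and the substantive fields; the raw identifier never appears in the output.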

Data protection issues

Although the advent of the Covid-19 pandemic is unprecedented, it is clear that the obligations of the GDPR remain. The GDPR already has provisions to help in these exceptional times, and we have reported in our previous article how the GDPR provides a legal basis by which employers and public health authorities may process personal data, in accordance with the processing principles, in the context of epidemics, without the need to obtain the consent of the data subject.

The NHS has offered the assurance that the computer code and data involved in developing the Datastore will be made ‘open source wherever we can,’ that those data will only be used for the Covid-19 project and not for any other purpose, and that only relevant information will be collected. It is also understood that requests to view those data will be reviewed on an individual basis by NHS England and NHSX, and that all data in the Datastore will remain under NHS England and Improvement’s control.

However, the rapid development of the Datastore raises many questions concerning the extent to which data controllers are compliant with the applicable legislation, namely the e-Privacy Regulation, the General Data Protection Regulation (‘GDPR’), and the Data Protection Act 2018 (together, the ‘Data Protection Legislation’).

Personal data and anonymisation

Under the GDPR, personal data is any information relating to an identified individual or to an individual who can be identified, directly or indirectly. Special category data is personal data that needs more protection because of its sensitive nature. This includes genetic data, biometric data, and data concerning health. However, personal data or special category data that has been properly anonymised (and not just pseudonymised) is not considered to be personal data for the purposes of Data Protection Legislation.

NHSX has offered assurances that any personal data will be anonymised, so that individual patients cannot be identified. However, the relevant announcement goes on to state that this will involve removing names, addresses and other identifiers, and replacing them with a “pseudonym”.  Pseudonymisation and true anonymisation are different concepts under Data Protection Legislation, and more clarity is required from NHSX and NHS England and Improvement to understand how any personal data are being categorised and processed, and what steps are actually being taken to properly anonymise personal data.
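The practical difference can be shown with a short, hedged sketch. Replacing a name with a pseudonym still leaves one row per patient, whereas publishing only grouped counts, and withholding very small groups, removes the individual rows altogether. The function name, the field names and the threshold of five below are assumptions chosen for illustration; whether any particular output is ‘properly anonymised’ in law depends on the wider context and the risk of re-identification, not on the technique alone.

```python
from collections import Counter


def aggregate_with_suppression(records, group_fields, min_count=5):
    """Count records per group and suppress groups smaller than min_count.

    Hypothetical helper: a suppressed group is reported as None, so that
    very small groups do not point to particular individuals.
    """
    counts = Counter(tuple(r[f] for f in group_fields) for r in records)
    result = {}
    for group, n in counts.items():
        result[group] = n if n >= min_count else None
    return result


# Illustrative use with made-up, already pseudonymised rows.
rows = (
    [{"pseudo_id": f"a{i}", "region": "London", "sex": "F"} for i in range(6)]
    + [{"pseudo_id": f"b{i}", "region": "Leeds", "sex": "M"} for i in range(2)]
)
summary = aggregate_with_suppression(rows, ["region", "sex"])
```

Here the London group of six would be reported as a count, while the Leeds group of two would be withheld.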

Although a pandemic calls for swift action, ‘getting it right’ is important not only from a legal perspective, but also from a reputational perspective, as the NHS relies on patients having confidence that any data provided are being appropriately protected.

Data retention

The GDPR imposes strict rules on the retention of data. Specifically, personal data should not be kept for longer than is necessary for the purposes for which it is processed. NHSX has confirmed that the Covid-19 Datastore will be closed after the crisis has passed, and that all data will be destroyed. However, further detail remains to be clarified.

Even though the pandemic will ultimately pass and the Covid-19 Datastore will eventually be closed, it is hoped that lessons will be learned, which will enable sophisticated data collection, aggregation and analysis to become embedded in the NHS in the future, whilst at the same time protecting the privacy of patients.

The Government’s announcement can be found here.

For more information please contact Partner, James Tumbridge at