In May, Norway became one of the latest countries to abandon its COVID-19 contact-tracing app amid data privacy concerns. As countries begin the slow process of emerging from lockdown, test and trace is an essential tool in minimizing the risk of spikes in infection. However, the need to share information makes sensitive data vulnerable to breach and misuse. Data tokenization can deliver a secure means to exchange data while maintaining the privacy of personal information.
Norway is just another name on the quickly growing list of countries that have so far failed to deliver a COVID-19 contact-tracing app, along with the UK, Germany, and Singapore. Apart from the complexity of building large-scale apps at very short notice, data privacy has been at the forefront of the issues besetting these programs.
A centralized approach to collecting COVID-19 data
Most of the early contact-tracing apps followed the traditional, centralized client/server approach that underpins most internet services and applications. An application installed on the user’s mobile regularly records a person’s location and proximity data and transmits the information to a data center that stores it.
That location data can be consolidated with other information such as body temperature, blood pressure, and other vital signs obtained from wearable medical devices. When a person contracts COVID-19 or another infectious disease, the server application queries its database for all users who might have come in contact with or been in the vicinity of the infected person. The server can then notify those people, instruct them to self-quarantine, and prioritize them for testing.
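To make the centralized lookup concrete, here is a minimal sketch of the kind of query the server application might run. The in-memory records, field names, and 15-minute window are illustrative assumptions, not details from any real system:

```python
from datetime import datetime, timedelta

# Hypothetical stand-in for the central database described above.
# Each record is (user_id, location_cell, timestamp).
records = [
    ("alice", "cell-42", datetime(2020, 6, 1, 10, 0)),
    ("bob",   "cell-42", datetime(2020, 6, 1, 10, 5)),
    ("carol", "cell-99", datetime(2020, 6, 1, 10, 5)),
]

def find_contacts(infected_id, window=timedelta(minutes=15)):
    """Return users seen in the same location cell as the infected
    user within the given time window."""
    infected_visits = [(loc, ts) for uid, loc, ts in records
                       if uid == infected_id]
    contacts = set()
    for uid, loc, ts in records:
        if uid == infected_id:
            continue
        for i_loc, i_ts in infected_visits:
            if loc == i_loc and abs(ts - i_ts) <= window:
                contacts.add(uid)
    return contacts

print(find_contacts("alice"))  # {'bob'}
```

The key point is that this query only works because the server holds everyone's raw location history, which is exactly the privacy exposure the rest of this article is about.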
It’s clear, however, that this approach also relies on a vast amount of sensitive personal data passing through the system. That gives rise to significant data privacy concerns – concerns that are only too real. When Nature Medicine analyzed 50 COVID-19 apps worldwide, it found that only 16 promised to anonymize, encrypt and secure the data they collect.
Does Google have a better solution?
Many countries that are now abandoning their own centralized contact-tracing app are turning to the alternative developed by Google and Apple.
The Google and Apple solution uses low-energy Bluetooth signals to alert anyone whose Android device or iPhone has come near a phone owned by an infected person in the previous two weeks. If a user is diagnosed with the coronavirus, it is up to them to inform the app’s public registry, which then notifies anyone whose phone has been near that phone.
This approach relies on rolling proximity identifiers, or RPIDs, that are used to ping other Bluetooth devices. RPIDs are changed every few minutes, and users who believe they are infected can share their previous RPIDs with a public registry; once the registry verifies the diagnosis, it alerts any devices that recently exchanged pings with that user’s device.
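A simplified sketch of the rolling-identifier idea follows. The real Google/Apple protocol derives identifiers with HKDF and AES; this example substitutes an HMAC construction purely to illustrate how rotating IDs can be re-derived from a published daily key, and all names and parameters here are assumptions:

```python
import hashlib
import hmac
import os

def daily_key():
    # Random per-day key that never leaves the device unless the
    # user chooses to publish it after a positive diagnosis.
    return os.urandom(16)

def rpid(key, interval):
    # Derive a short, unlinkable identifier for one broadcast interval.
    return hmac.new(key, interval.to_bytes(4, "big"),
                    hashlib.sha256).digest()[:16]

key = daily_key()
# The device broadcasts a fresh RPID every interval (e.g. every
# 10-15 minutes, so roughly 144 intervals per day).
broadcasts = {rpid(key, i) for i in range(144)}

# A nearby phone only stores the RPIDs it hears. Later, when the
# infected user publishes their daily key, the listener re-derives
# all of that day's RPIDs and checks for a match -- no identity is
# ever exchanged over the air.
heard = rpid(key, 37)
matched = any(rpid(key, i) == heard for i in range(144))
print(matched)  # True
```

This is also why the EFF warning quoted below matters: once the daily key is public, anyone who logged RPIDs can link together everything that key derived for the day.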
Although the companies have added encryption to help secure those pings, they remain vulnerable. The Electronic Frontier Foundation has pointed out: “A well-resourced adversary could collect RPIDs from many different places at once by setting up static Bluetooth beacons in public places, or by convincing thousands of users to install an app. […] But once a user uploads their daily diagnosis keys to the public registry, the tracker can use them to link together all of that person’s RPIDs from a single day.”
Where trust is key, tokenization is the answer
However, data privacy isn’t really a technology issue at all. It’s an emotional one. For any contact-tracing app to be successful, it has to be voluntarily adopted by the majority, or a very significant minority, of a country’s population. And that adoption is built solely on trust: trust that the app works, and trust that its providers, and the governments and third parties handling your personal data, will not misuse it.
But, a PrivSec article suggests that the idea that privacy has to be sacrificed for utility is false, saying: “People are wrongly focused on a traditional and binary approach to privacy protection, which forces the trade-off between privacy and data use to occur … New approaches, such as newly defined GDPR-compliant pseudonymization, as well as the concept of data protection by design and by default, do not degrade the accuracy and effectiveness of data, while at the same time providing greater privacy protection.”
One of the most popular forms of pseudonymization is data tokenization. Tokenization is a relatively simple concept to understand. A token, or surrogate value, replaces the original sensitive data, such as personal health information or a social security number, in systems, applications, and databases.
The power of tokenization lies in two capabilities. First, once the token is created, it can be safely used by every application, system, and database across the enterprise without ever exposing the original data. That means sensitive data can still feed analytics and decision-support systems.

Second, tokenization is built around a central data vault where all sensitive data is held, so you have a ‘single source of the truth’ for an individual’s data that is never exposed to the public sector body, healthcare providers, or other app users. This helps a contact-tracing app comply with regulations such as HIPAA and the GDPR.
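The vault-based design can be sketched in a few lines. This is a minimal illustration, assuming an in-memory dictionary stands in for the secured central vault; a real deployment would use hardened storage and strict access control:

```python
import secrets

class TokenVault:
    """Toy token vault: issues random surrogate values and keeps the
    only mapping back to the originals."""

    def __init__(self):
        self._vault = {}    # token -> original value (kept secured)
        self._reverse = {}  # original value -> token (for reuse)

    def tokenize(self, value):
        # Reuse the existing token so the same person always maps to
        # the same surrogate across systems.
        if value in self._reverse:
            return self._reverse[value]
        # The token is random, so it reveals nothing about the value.
        token = "tok_" + secrets.token_hex(8)
        self._vault[token] = value
        self._reverse[value] = token
        return token

    def detokenize(self, token):
        # Only callers authorized to reach the vault can recover this.
        return self._vault[token]

vault = TokenVault()
t = vault.tokenize("123-45-6789")  # e.g. a social security number
assert t != "123-45-6789"                  # token exposes nothing
assert vault.tokenize("123-45-6789") == t  # stable across systems
assert vault.detokenize(t) == "123-45-6789"
```

Because the token is random rather than derived from the data, downstream applications can join, count, and analyze records on tokens alone, while the vault remains the only place the real values exist.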
We’re still in the early stages of building effective contact-tracing apps. But, sadly, it’s highly unlikely that COVID-19 is the last pandemic we’ll face. Building data tokenization into apps from the outset can address the data privacy issues that have so dogged the first generation of apps.
If you’d like to find out more about data tokenization, read our Data Tokenization eGuide.
Want to know more about how OpenText is helping public sector organizations respond to COVID-19? Visit our website.