This past year, as the Covid-19 virus began to spread, so did efforts to digitize the contact tracing process. As fast as the virus grew, so did the number of technical efforts by countries, institutions, enterprises and hobbyists [1]. While people across the world showed themselves eager to play their part in helping to overcome the virus, many expressed concerns over the security of their personal health and location data. This article attempts to give a scoped overview of the current Covid app ecosystem and to explore in depth some of its security implications.
An Overview of the Covid-Tracing Ecosystem
Covid tracing apps can largely be separated into two distinct camps: those which attempt to associate an individual with a specific time and place, and those that focus on associating individuals with each other.
Time and place based apps
The efforts focused on associating individuals with a time and place allow health agencies to better retrace the steps of individuals, identify potential hotspots and inform individuals of potential exposures. They tend to work either individually (through contact tracers or application alerts) or in bulk (through publication of hotspots or paths taken by individuals who tested positive).
SafeEntry was one of the first technical efforts to receive mainstream press attention. Implemented by the Singaporean government, SafeEntry is a check-in based system. When members of the public enter a school, workplace, hotel or taxi, they will see either a posted QR code or a business-maintained reader; they can scan the code with their device, or have their device or national ID card scanned. They may also check in manually through an app and are encouraged to “check out” when leaving one of these places. By law, certain types of businesses are mandated to participate in this program, either actively or by posting a QR code in an easy-to-see place. Check-ins are stored centrally on government servers for “as long as needed” and are used to assist health agencies.
CovidSafePaths, developed in coordination with MIT, uses GPS data provided voluntarily by individuals who test positive. The app itself has two parts. In the first, a user’s location is recorded once every five minutes and saved locally to the user’s device. If the user tests positive, they may voluntarily share this information with public health authorities. The second feature allows public health authorities to alert users to potential exposures: users download information about potential hotspots and compare it with their locally stored location data. A match alerts the user to a potential exposure and may provide information on next steps.
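The local matching step described above can be sketched roughly as follows. This is an illustrative sketch, not the actual CovidSafePaths implementation: the function names, 50-meter radius and 30-minute time window are all assumptions.

```python
# Illustrative sketch: compare a locally stored (timestamp, lat, lon) trail
# against downloaded hotspot entries. Thresholds and data shapes are
# assumptions, not the real CovidSafePaths code.
import math

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in meters."""
    r = 6_371_000  # mean Earth radius in meters
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = math.radians(lat2 - lat1)
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def find_exposures(trail, hotspots, radius_m=50, window_s=1800):
    """Return trail points that fall near a hotspot in both space and time."""
    matches = []
    for t, lat, lon in trail:
        for ht, hlat, hlon in hotspots:
            if abs(t - ht) <= window_s and haversine_m(lat, lon, hlat, hlon) <= radius_m:
                matches.append((t, lat, lon))
                break
    return matches
```

Because the comparison runs entirely on the device, the hotspot list can be published in bulk without the server ever learning where any individual user has been.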
The apps above rely on user-submitted data sent to individual agencies. The data is fragmented, incomplete in what it collects, and depends on large-scale adoption. An alternative is to use existing data sources and collectors. A wealth of location information is already being collected by telecoms, corporations and governments through already-installed apps, location service providers, Bluetooth ad beacons and various other means. In mid-March, the Israeli government announced that it would use information acquired through its “Shin Bet” internal security service to cross-reference data collected through cell phones and “special means” for use in contact tracing. Google has released “aggregate, anonymized” community mobility reports to help health agencies determine the effectiveness of and need for social distancing policies. US-based SafeGraph, a commercial provider of geospatial datasets, expanded its metrics to include information about social distancing, which could also help contact tracers determine if an area is “high risk”.
Connection based apps
Other protocols and apps focus on the connections between individual devices rather than on linking specific individuals with specific times and places. These apps rely largely on Bluetooth and involve devices sending messages to each other. This method generally aims to be more privacy-centric and is seen in some of the more popularly covered apps and frameworks. Once again, these efforts can be centralized or decentralized.
The Bluetrace protocol is currently used by the TraceTogether and CovidSafe apps, implemented by the Singaporean and Australian governments respectively. It aims to assist a “centralized contact tracing operation” with “decentralized recall”. These apps continuously measure the distance and duration between two devices running Bluetrace-based apps while exchanging messages containing temporary keys. The protocol provides a means for users to share information with public health authorities and for those authorities to contact potentially infected individuals. When users are confirmed positive, they may choose to share their encounter history with public health authorities, who in turn may contact other potentially infected users. Bluetrace also supports federation between health authorities using different apps while still attempting to respect the privacy of individuals.
While the Bluetrace protocol itself allows for a decentralized option without the need for centralized contact tracers, the authors of the protocol recommend against it, opting instead to keep a “human in the loop”. To quote the authors: “because a human contact tracer would seek to incorporate information beyond just physical proximity, he/she can correct for systematic biases introduced by a purely automated notification system” [2]. They also note that contact tracing may produce anxiety, which can be mitigated by person-to-person interactions.
Bluetrace promotes the idea of using a centralized backend to generate temporary keys, which may be downloaded in bulk to allow the app to continue functioning in areas without internet access. The authors acknowledge that it is possible to generate temporary keys on the device using asymmetric encryption, but advise against it because it requires more computing and battery power; they also argue that centralizing key generation can serve as a secondary way to track app adoption. This design means that in order for the app to work, a user must register with the public health authority operating it.
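The centralized batch-generation idea can be sketched as below. Note the hedges: Bluetrace’s actual TempIDs are produced by encrypting the user ID and a validity window with a server-held key (so the authority can decrypt tokens directly); this stdlib-only stand-in uses an HMAC tag plus a server-side lookup table, which preserves the key property that only the authority can map a token back to a user. All names, lifetimes and sizes here are assumptions.

```python
# Illustrative, stdlib-only sketch of centralized batch generation of
# time-limited temporary IDs. The real Bluetrace protocol encrypts
# (UserID, start, expiry) server-side; an HMAC tag plus a lookup table
# stands in for that here.
import hashlib
import hmac
import secrets

SERVER_KEY = secrets.token_bytes(32)   # held only by the health authority
LIFETIME_S = 15 * 60                   # assumed 15-minute TempID lifetime

def issue_batch(user_id: str, start: int, count: int):
    """Generate `count` consecutive TempIDs so the app can work offline."""
    batch, lookup = [], {}
    for i in range(count):
        s = start + i * LIFETIME_S
        token = hmac.new(SERVER_KEY, f"{user_id}|{s}".encode(),
                         hashlib.sha256).hexdigest()[:32]
        batch.append({"tempid": token, "start": s, "expiry": s + LIFETIME_S})
        lookup[token] = user_id        # lets the authority re-identify later
    return batch, lookup
```

Because tokens are opaque to everyone but the authority, a device can broadcast them freely, yet only the authority can turn an uploaded encounter log back into phone numbers to call.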
The Exposure Notification API, also known as the “Apple-Google” approach, is also serving as a basis for many new apps. In a way, it is the “official” way to bring contact tracing apps to market, because Apple does not let Bluetooth tracking apps run in the background unless they use this API. The Exposure Notification API itself is based on the DP-3T protocol, which aims to allow contact tracing while preserving privacy. Like Bluetrace, it relies on individual public health agencies for the actual app implementation and instead focuses on providing the protocol and framework for potentially federated contact tracing. Unlike Bluetrace, the Exposure API is designed to be decentralized, with almost no identifying data collected by public health agencies by default (although it may include a mechanism for agencies to request information). With the Exposure API, temporary keys are generated on the device and never shared with a public health authority unless the user intentionally uploads them. The master key these keys are derived from is ideally never shared at all. When users are confirmed positive and upload their data, their temporary keys are published publicly, and other devices check whether they have recently encountered those keys for a prolonged period of time.
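The decentralized flow can be sketched as follows: a daily key never leaves the device, rolling proximity IDs are derived from it, and matching against published keys happens locally. The real Exposure Notification spec derives these values with HKDF and AES; HMAC-SHA256 stands in here, and every name and constant is illustrative.

```python
# Hedged sketch of decentralized exposure matching: daily keys stay on the
# device, rotating IDs are derived from them, and matching runs locally
# against published keys. Derivation here is an HMAC stand-in for the real
# HKDF/AES scheme.
import hashlib
import hmac
import secrets

INTERVAL_S = 600  # spec-like ten-minute rolling interval

def rolling_id(daily_key: bytes, interval: int) -> str:
    return hmac.new(daily_key, interval.to_bytes(4, "big"),
                    hashlib.sha256).hexdigest()[:32]

def ids_for_day(daily_key: bytes, day_start: int):
    """Re-derive all 144 ten-minute IDs a key would have broadcast that day."""
    return {rolling_id(daily_key, (day_start + i * INTERVAL_S) // INTERVAL_S)
            for i in range(144)}

def check_exposure(heard_ids: set, published_daily_keys, day_start: int) -> bool:
    """Intersect IDs we overheard with IDs derived from published keys."""
    for key in published_daily_keys:
        if heard_ids & ids_for_day(key, day_start):
            return True
    return False
```

The key design point is that the server only ever sees the daily keys a positive user chooses to upload; who actually encountered that user is computed on each phone, not centrally.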
False positives, spoofing and replay attacks
In various situations a user may end up reporting a false positive. This can be a user mistakenly or maliciously self-confirming a positive test, or a bad actor attempting to mislead the system, impersonate a user, or carry out a replay attack. In a replay attack, previously sent messages of a positive-testing user are resent in order to simulate a fake outbreak. All of these scenarios can have an adverse impact on individuals and the community as a whole, as they could cause panic, distrust in the system, or misallocation of a health agency’s already limited resources.
Individuals wrongfully reported or alerted to an exposure may face an unnecessary quarantine. In Israel, people have been ordered to quarantine based on cell phone exposure data alone, such as a woman who had only waved to her boyfriend from outside their house. In another incident, contacts of a health employee (including coworkers) were ordered to quarantine after mixed-up lab results. Even after the error was discovered, the employee was still unable to contact those affected to let them know it was safe to come back to work [3]. Even if not under legal order, individuals receiving contact alerts may impose a voluntary “self quarantine”. Whatever the cause, unnecessary quarantines can have detrimental effects on individuals: they may miss out on work-related income, be unable to care for others such as aging relatives or patients, and be prevented from seeking unrelated but still potentially critical medical care themselves.
Uploading data with TraceTogether
Having a “human in the loop” is a strong mitigation against these types of attacks. A human can verify positive tests, sanity-check situations like those described above, vet potential “replays” and give context to those who receive an alert saying they’ve been “exposed”. This is the preferred behavior of the Bluetrace-based apps and is how CovidSafe and TraceTogether work. The Exposure API does allow (but does not require) a human to verify positive tests and choose which of a user’s temporary keys are published. However, Exposure API based apps require an affected user to proactively reach out to a contact tracer when presented with an “exposure alert”. To prevent users from unknowingly sharing information with bad actors impersonating officials, Bluetrace provides verification in the form of PINs that contact tracers must provide when requesting data.
Technical measures can also be put in place to protect against replay and impersonation attacks. Encrypting metadata and including timestamps in shared messages can help limit the attack window for Bluetooth based approaches. Using temporary keys prevents individuals from “minting” new messages unless they otherwise compromise the “seed” those temporary keys are created from.
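The timestamp check above can be sketched in a few lines. This is a minimal illustration, not any specific protocol’s logic; the two-minute skew tolerance and the message shape are assumptions.

```python
# Minimal sketch of replay mitigation via embedded timestamps: a receiver
# rejects encounter messages stamped outside a small freshness window, and
# exact duplicates inside it. Window size and message shape are assumptions.
SKEW_S = 120  # accept messages stamped within +/- 2 minutes of local time

seen = set()  # recently observed (token, timestamp) pairs

def accept(token: str, stamped_at: int, now: int) -> bool:
    if abs(now - stamped_at) > SKEW_S:
        return False            # stale: likely a replayed capture
    if (token, stamped_at) in seen:
        return False            # exact duplicate within the window
    seen.add((token, stamped_at))
    return True
```

An attacker who recorded a broadcast yesterday cannot usefully resend it today, because the embedded timestamp no longer falls inside the receiver’s freshness window.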
Unwanted location tracking
There are, of course, users who do not want their physical location tracked, either directly by the health agencies and app providers, or indirectly by third parties exploiting the apps’ functionality.
Limiting the information collected and the amount of time it is stored can help with this. TraceTogether removes locally stored information about encounters after 25 days. As mentioned above, CovidSafePaths only records location once every five minutes. This reduction in granularity reduces the amount of information that can be obtained about an individual’s habits. Apps may also choose to fuzz the exact GPS coordinates of an individual. Both of these approaches help mitigate location tracking concerns, though it is up to the health authorities to decide whether they degrade the effectiveness of their apps. For CovidSafePaths, sending location data is opt-in and should only be submitted when a user tests positive for the virus, which greatly reduces the amount of data collected. SafeEntry collects and stores all check-ins by its users indefinitely; however, this information is also in a way “opt-in”, as it is up to users to choose if and when they manually check in at a place.
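The two mitigations just mentioned, coarser sampling and coordinate fuzzing, can be sketched together. The precision (three decimal places, roughly 110 m of latitude) and the five-minute bucket are illustrative assumptions, not values from any of the apps discussed.

```python
# Hypothetical sketch of location minimization: round ("fuzz") coordinates
# to a coarser grid and keep at most one point per time bucket before
# anything is stored. Precision and interval are assumptions.
def fuzz(lat: float, lon: float, decimals: int = 3):
    """Three decimal places is roughly 110 m of latitude resolution."""
    return round(lat, decimals), round(lon, decimals)

def downsample(trail, interval_s: int = 300):
    """Keep at most one (timestamp, lat, lon) point per 5-minute bucket."""
    kept, last_bucket = [], None
    for t, lat, lon in sorted(trail):
        bucket = t // interval_s
        if bucket != last_bucket:
            kept.append((t, *fuzz(lat, lon)))
            last_bucket = bucket
    return kept
```

Applying both steps before storage means that even a full device compromise yields only a blurred, sparse trail rather than a precise movement history.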
The Bluetooth model is attractive in this regard because it does not log location, but rather relationships with other peers. In the Exposure API model, the app is specifically forbidden from requesting location permissions. This is incompatible, however, with certain older versions of Android, which require the location permission in order to access Bluetooth. This is the case with TraceTogether, though its developers explicitly state that location is not collected. This and other threats are mitigated by the fact that the application is open source, allowing users to verify these claims themselves.
If an adversary knows the seed used to generate temporary tokens, they may be able to use that information to track a user in real time. This assumes they have a means to listen for messages, which is not beyond the realm of feasibility, as many stores and locations already have Bluetooth trackers installed for use with advertising. In the Bluetrace model, the seed is stored centrally by the provider. This increases the likelihood that the central authority will be able to exploit it, as well as the impact of a breach. Storing the seed and generating the tokens locally reduces this risk, but makes it easier for an attacker to retrieve the seed from a targeted individual by compromising their device.
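The risk of a leaked seed can be made concrete with a sketch: an adversary who has the seed simply regenerates the victim’s rotating tokens and matches them against beacons overheard at fixed sensors, linking separate sightings back to one person. The token derivation here is an illustrative HMAC stand-in, not any specific protocol’s scheme.

```python
# Sketch of seed-leak tracking: regenerate a victim's rotating tokens from
# a leaked seed and link sensor sightings back to one person. HMAC-based
# derivation is an illustrative stand-in.
import hashlib
import hmac

def token(seed: bytes, interval: int) -> str:
    return hmac.new(seed, interval.to_bytes(4, "big"),
                    hashlib.sha256).hexdigest()[:16]

def link_sightings(leaked_seed: bytes, sensor_logs):
    """sensor_logs: list of (sensor_id, interval, heard_token) tuples."""
    return [(sensor, i) for sensor, i, heard in sensor_logs
            if heard == token(leaked_seed, i)]
```

Token rotation defeats passive observers, but as the sketch shows, it offers no protection against anyone who holds the seed, which is why where the seed lives matters so much.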
Reidentification
Reidentification is the threat that a positive-testing individual can be identified after submitting their interactions. While no previously discussed method can completely eliminate this threat, there are several ways to mitigate it.
In the Bluetooth models, a user may upload their temporary keys to a public health authority. With the Exposure API method, this information is available to all users, whose devices then locally check it against the list of messages they have received, alerting them if there has been a match. Reidentification happens when a third party can use this information to identify who that specific person is. If a person had contact with only one other person on a given day, for instance, they could effortlessly deduce who that contact was.
The “PACT” whitepaper [4], which describes an approach similar to the Exposure API, suggests that the app itself may withhold additional information such as the exact time and date of encounters, admitting that this is only a mild mitigation, as sophisticated users would still be able to retrieve the exact messages from their device. In the centralized model, instead of publishing all encounters, the public health authority may choose to ignore certain encounters deemed irrelevant, or to withhold or fuzz the exact time of an encounter from the end user. A “human in the loop” may reach out to potentially exposed individuals directly, eliminating the need to publish those encounters at all. This is possible because the authority, having access to all users’ seeds, can identify the encountered individuals from the positive-testing user’s uploaded messages.
Relying on third parties to properly anonymize PII is not itself without risk. A whitepaper by the ACLU described an incident in which South Korean officials, working on a policy to anonymize and publicize location history, published a description of a “43 year old man, resident of Nowon district” who was “at his work in Mapo district attending a sexual harassment class.”
Insider threats and breaches
Once a health authority receives PII about a user, it assumes an (oftentimes legal) responsibility to make a best effort to protect that data. Once that information is “in the system”, it may be made available to contact tracers, technical engineers, other affiliated agencies and vendors, any of which may include bad actors. The Apple and Google app stores already limit who they allow to publish Covid-related apps. Use of the Exposure API requires even stricter vetting and is generally limited strictly to public health agencies.
In many respects, safeguarding this data is no different from the traditional safeguarding of PII, and the risks can be mitigated by best practices: vetting staff, enforcing the principle of least privilege, and keeping and reviewing logs. Minimizing the amount of data collected and the length of time it is stored also limits what bad actors can do. In the case of Covid tracing apps, information about individuals loses its value very quickly, so there may be less incentive to keep it indefinitely in a non-anonymized state.
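Retention limiting is simple to express in code. The sketch below mirrors the 25-day local retention period mentioned earlier for TraceTogether; the record shape and the idea of purging on every pass are assumptions for illustration.

```python
# Minimal sketch of retention limiting: on every pass, drop records older
# than a fixed window. The 25-day figure mirrors TraceTogether's local
# retention period; the record shape is an assumption.
RETENTION_S = 25 * 24 * 3600  # keep encounter records for 25 days

def purge_expired(records, now: int):
    """records: list of dicts with a 'stored_at' epoch timestamp."""
    return [r for r in records if now - r["stored_at"] <= RETENTION_S]
```

Data that has been purged cannot be breached, subpoenaed, or abused by an insider, which is why a short, enforced retention window is one of the cheapest mitigations available.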
Given the immediate demand for contact tracers, it may be infeasible to properly vet all contact tracing candidates. Researchers at George Washington University have estimated that the United States alone may require 100,000 to 300,000 contact tracers [5]. In this case, it may actually be more secure to collect more information. The PACT whitepaper proposes “mobile assisted interviews” in which a person may programmatically submit the majority of their PII, of which only some may be shared with the contact tracers directly. Phone numbers may also be obscured from contact tracers, who can instead be asked to contact an individual through a supplied application.
Hardware and software threats
In addition to the human element, any application or device will have the usual technical threat vectors. In the examples described here, the device is typically a smartphone which must accept arbitrary inputs. There may be a malicious QR code or a specially crafted Bluetooth message intended to cause a buffer overrun or other undesired behavior. The device itself may be compromised by unrelated malware, through other apps or by physical means. The Bluetooth card or other device hardware may also be vulnerable.
Applications may have intentional or unintentional backdoors, whether put in by the developers, snuck in by attackers or introduced through vulnerable dependencies. Software may contain bugs and unknown vulnerabilities. These flaws can be mitigated by software development lifecycle best practices such as vulnerability scanning and code reviews, and further reduced by open sourcing the code and promoting bug bounty programs.
Conclusion
We live in novel and challenging times, as people and companies rise to the occasion and attempt to help in whatever way they know how. The struggle between security, convenience and effectiveness is nothing new, and neither are many of the threat vectors described in this article. As our global population continues to look toward technology, we must make sure that we are doing our best to keep safe the very people we are trying to help.