Although you might feel like you have some right to information about yourself, in a legal sense just about any type of data that can be collected about you without invading your personal space is fair game. Online click-through agreements can further expand the rights of companies to collect and distribute data about you. It turns out that, other than your thoughts, you don’t own much of your own data at all.
Data is intellectual property, and under most circumstances the default rights holder is the individual or organization that creates the data point. This is true even if the data point happens to describe something intensely personal, such as your favorite color, Social Security number, or genome.
That type of information is broadly referred to as personal data, or personally identifiable information (PII).
There are many reasons why both governments and individuals might have an interest in establishing controls over the collection and ownership of personally identifiable information. Legally, however, there is no single, comprehensive federal law regarding collection or ownership of personal data. Instead, a patchwork of existing consumer protections has been adapted to questions of personal data, sometimes in unclear or contradictory ways.
A Patchwork of Rules Governs Personal Data Use and Collection
The Federal Trade Commission has broad responsibilities for consumer protection in the United States and has frequently brought actions against corporations for failing to comply with their own posted privacy policies. But these actions have been brought under the auspices of unfair or deceptive business practices rather than privacy violations. The issue is not so much that information was misused as that consumers were lied to about how it would be used.
Some rules exist that restrict health care providers, financial institutions, and other organizations from collecting information on children, but there is no federal requirement to have a general privacy policy. Some states, however, including California, do mandate such a policy.
The FCC (Federal Communications Commission) also has rules that can impact personal data through the Electronic Communications Privacy Act and the Computer Fraud and Abuse Act. HIPAA (Health Insurance Portability and Accountability Act) creates controls for medical data, and various financial regulations control banking and other investment-related information.
But the greater concerns for most people are more mundane: the embarrassing selfies posted to Facebook, or the Google location data that pinpoints them at an ex-girlfriend’s house. In some cases, state regulations apply to how companies can use and sell this kind of data. But the hodgepodge of state rules makes it difficult to sort out what information is covered by these laws, and to what extent.
This question is about to become considerably more pressing for both consumers and data scientists. Gartner Research predicts that by 2020, 25 billion smart devices will be connected to the internet as part of the coming onslaught of the Internet of Things (IoT). Each of those gadgets, everything from coffee makers to industrial robots, will be generating and sending back vast amounts of detailed information about their own activities and how they are being used, as well as the condition of other things around them, in both the physical and virtual environments.
That data may be used to provide unprecedented detail about the personal and professional lives of everyone involved. And it’s not clear if the owner of the device, or the vendor who created it, will be considered the owner of the mass of data being generated.
Data Ownership May Be Less Important Than Data Control
In the end it may be less important who owns your data, in a legal sense, than who controls it.
Since commerce and data collection have gone global, data scientists also have to consider the position of other governments on the matter of data ownership. Most companies have a multinational presence, and those operating in, or collecting data from, any country in the European Union have to contend, as of 2018, with the EU’s General Data Protection Regulation (GDPR). The GDPR defines personal data as information about a natural person “…that can be used to directly or indirectly identify the person.”
Although the GDPR does not define ownership in any conventional sense, it does offer unprecedented access and control to individuals over their personal data. Companies will face new requirements for:
- Obtaining clear and unambiguous consent for collection
- Notifying individuals of breaches in which their data was lost
- Allowing individuals complete access to their own data in common formats
- Erasing personal data on request… the so-called “Right to be Forgotten”
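To make those four requirements concrete, here is a minimal Python sketch of how a data store might honor them. Everything in it is a hypothetical illustration: the `PersonalDataStore` class, its method names, and the choice of JSON as the "common format" are assumptions made for this example, not anything the GDPR prescribes, and a real system would also need audit logs, backups, identity verification, and legal review.

```python
import json
from dataclasses import dataclass, field


@dataclass
class PersonalDataStore:
    """Hypothetical in-memory store; an illustration, not a compliance tool."""
    records: dict = field(default_factory=dict)  # subject_id -> personal data
    consent: set = field(default_factory=set)    # subject_ids who opted in

    def give_consent(self, subject_id):
        # Consent is recorded explicitly before anything is collected.
        self.consent.add(subject_id)

    def collect(self, subject_id, key, value):
        # Refuse to store anything for a subject without recorded consent.
        if subject_id not in self.consent:
            raise PermissionError(f"no consent recorded for {subject_id}")
        self.records.setdefault(subject_id, {})[key] = value

    def subjects_to_notify(self, breached_ids):
        # Breach notification: which affected subjects do we hold data on?
        return sorted(set(breached_ids) & set(self.records))

    def export(self, subject_id):
        # Access request: return the subject's data as JSON (assumed format).
        return json.dumps(self.records.get(subject_id, {}), sort_keys=True)

    def erase(self, subject_id):
        # Erasure request: drop both the data and the consent record.
        self.records.pop(subject_id, None)
        self.consent.discard(subject_id)
```

In this sketch, a call such as `store.erase("alice")` removes the consent record along with the data, so any later `collect` for the same person fails until consent is given again. That is one possible design choice, not a legal requirement.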
As with any regulation written by non-technical bureaucrats, these strictures contain a number of inherent technical problems, but they also introduce some significant ethical issues for data scientists to face. The “Right to be Forgotten” provision, for example, has already been incorporated into EU law, and its implementation has proven problematic in terms of weighing public interest against private right. One request made of search giant Google in 2014 came from a doctor asking that search results about a series of botched medical procedures he had performed be delisted.
Google did not fully comply with that request because of the obvious public interest in that information being widely available. But such decisions will become even more problematic when the requests concern information in huge data sets that is not directly relevant to the public interest, but which, in aggregate, could impact that interest nonetheless.
An example might be information collected by a service such as DigitalGlobe that creates predictive policing strategies. Although DigitalGlobe’s approach does not attempt to forecast individual crimes or identify individual criminals, it does incorporate personal data in making its predictions. It is easy to imagine likely criminals opting out of that dataset and skewing the results, to the detriment of their future victims. Similar scenarios can be imagined with respect to medical studies or environmental protection efforts.
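A few lines of Python arithmetic show how selective opt-outs can skew an aggregate. The numbers below are invented purely for illustration (1,000 people, 100 of whom carry the trait being modeled, 80 of whom ask to be erased) and do not describe any real dataset or any vendor's method.

```python
def observed_rate(records):
    """Share of records flagged True."""
    return sum(records) / len(records)


# Hypothetical dataset: 1,000 people, 100 of whom have the trait being modeled.
population = [True] * 100 + [False] * 900

# Assume 80 of the 100 flagged individuals exercise their right to erasure,
# while nobody in the unflagged group does.
after_opt_out = [True] * 20 + [False] * 900

true_rate = observed_rate(population)        # 100 / 1000
skewed_rate = observed_rate(after_opt_out)   # 20 / 920

print(f"true rate:   {true_rate:.3f}")
print(f"skewed rate: {skewed_rate:.3f}")
```

Under these assumed numbers, the rate the model sees falls from 10 percent to a little over 2 percent, even though nothing about the underlying population has changed.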
These types of issues mean that there is unlikely ever to be a clear-cut answer to who owns your data. But they do mean that important steps are being taken to decide who controls it.