Protecting information privacy: challenges and opportunities in federal legislation


By NCL Google Public Policy Fellow Pollyanna Turner-Ward

On September 11, 2019, policymakers, industry stakeholders, and consumer advocates gathered at The Brookings Institution to discuss the pressing question of how to protect information privacy through federal legislation. Representing the National Consumers League was Executive Director Sally Greenberg.

How did we get here?

To set the scene, panelists first discussed why there is consensus on the need for federal legislation to address privacy and data security. The Snowden revelations showed consumers how much of their data is out there, and they began to question whether companies could be trusted to keep their data safe from the government. More recently, in light of the Cambridge Analytica scandal and increasing instances of identity theft and fraud resulting from data breaches, consumers have begun to question whether companies themselves can be trusted with their data.

Businesses are worried that a lack of consumer trust will interfere with the adoption of digital products and services. For instance, parents’ refusal to consent to the collection and use of data about their children’s academic performance prevents the personalization of their children’s learning experiences. Businesses hope that providing individuals with greater privacy protections will increase participation in the digital economy.

In response to consumer privacy concerns, a patchwork of state bills on privacy and data security is also popping up. Businesses claim to be overwhelmed by the prospect of complying with these differing regulatory schemes, especially in light of the EU’s General Data Protection Regulation (GDPR), which has already pushed many organizations to comply with privacy and data security rules. To support businesses and to regain U.S. privacy leadership, greater international interoperability is necessary.

What should federal legislation look like?

Each panelist set forth their idea of what federal legislation should aim to achieve. Intel drafted a privacy bill that includes various protections but lacks a private right of action (that is, the ability to take wrongdoers to court if they violate privacy laws). If companies promise not to use your information in certain ways and then do it anyway, in violation of the law, you should have the right to take them to court. NCL’s Sally Greenberg directed audience members towards the Public Interest Privacy Principles signed by thirty-four consumer advocacy and civil rights organizations. Advocating in favor of strong protections, strong enforcement, and preemption, and highlighting the importance of “baking data privacy into products and services,” she offered NCL’s vision of a strong, agile, and adaptive national standard.

Panelists drew comparisons between this approach and that of the EU’s GDPR but criticized the time-consuming and resource-intensive nature of that legislation. They agreed that U.S. legislation should avoid being too prescriptive in the details. Rather than requiring documentation of policies, practices, and data flow maps, legislation should focus on high-level issues.

Breaking down these issues according to consensus and complexity, Cameron F. Kerry listed covered information, de-identification, data security, state enforcement, accountability, and FTC authority as solvable issues. Implementation issues, he said, include notice and transparency and individual rights (access, portability, the right to object to processing, deletion, and nondiscrimination). However, Mr. Kerry noted that disagreement clouds a number of complex issues relating to algorithmic transparency, algorithmic fairness, and data processing limitations (use restrictions). Until consensus is reached in these areas, disagreements about preemption and a private right of action are unlikely to be resolved.

Notice and Transparency 

While notice and transparency are important aspects of a comprehensive approach towards privacy and data security, it is difficult for consumers to process the volume of information contained in privacy policies. Consumers also often have little choice but to “agree” to services that are essential to everyday life. As such, legislators may wish to explore the extent to which a company may force an individual to waive their privacy rights as a condition of service. Consent should play only a limited role in relation to sensitive data uses, and companies should focus on designing user interfaces that enable meaningful consumer consent. Panelists criticized the California Consumer Privacy Act (CCPA) for its lack of detail and for putting the burden on individuals to protect themselves. It was agreed that federal standards should move beyond notice-and-consent and put the burden back on businesses.

De-identification 

One panelist called de-identification the “secret sauce” of privacy. Preserving the utility of data while removing identifiability puts the focus on data processing harms. It is important to get de-identification right for valuable research purposes. However, de-identification is often done poorly, and confusion lurks around pseudonymization, a technique that replaces personally identifiable information fields within a data record with artificial identifiers. Because pseudonymized data remains identifiable, data security and privacy risks remain. Companies must be incentivized to effectively de-identify data, to refrain from re-identifying it, and to contractually restrict downstream users from re-identifying it. To avoid conflating data security levels with pseudonymization levels, a universal and adaptable de-identification standard must be developed.
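
To make the distinction concrete, here is a minimal Python sketch of pseudonymization as described above; the field names, secret key, and record are purely illustrative assumptions. Anyone holding the key can link the tokens back to individuals, which is why pseudonymized data is still treated as identifiable.

```python
import hmac
import hashlib

# Hypothetical key; in practice it would be stored and rotated separately from the data.
SECRET_KEY = b"example-secret-key"

def pseudonymize(record, pii_fields=("name", "email")):
    """Replace directly identifying fields with keyed artificial identifiers.

    The substitution is deterministic and reversible by anyone holding the key,
    so the output is pseudonymized -- not anonymized -- data.
    """
    out = dict(record)
    for field in pii_fields:
        if field in out:
            digest = hmac.new(SECRET_KEY, str(out[field]).encode(), hashlib.sha256)
            out[field] = digest.hexdigest()[:16]  # artificial identifier
    return out

record = {"name": "Jane Doe", "email": "jane@example.com", "grade": "B+"}
print(pseudonymize(record))
# The academic data survives intact, but the direct identifiers are replaced with tokens.
```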

Data security 

Because data security is critical to privacy, panelists agreed that it is the foundation upon which privacy legislation should be built. Panelists warned against an overly prescriptive approach towards data security but suggested that the Federal Trade Commission (FTC) should offer more guidance. “Reasonable” data security depends upon the nature and scope of data collection and use. This affords organizations flexibility when adopting measures that make sense in terms of information sensitivity, context, and risk of harm.

However, determining data security standards according to the risk of privacy harm is difficult because “risk of privacy harm” is an unsettled and controversial concept. It was also debated whether “information sensitivity” should be used to determine the reasonableness of data security standards. Public Knowledge argued that all data should be protected in the same way because the distinction between sensitive and non-sensitive data is increasingly questionable. When data is aggregated and sophisticated technologies such as machine learning are applied, each and every data point can lead back to an identifiable person.

While use of off-the-shelf software should generally be considered reasonable, higher standards should apply to companies that are more aggressive in their data collection and use. Organizations must continually develop physical, technical, and legal safeguards that extend to third-party processors and service providers. To ensure the infrastructure securing their data is robust, they should run tests, conduct impact assessments, and put resources towards data mapping.

Data processing limitations

In sectors ranging from education to healthcare, the use of data undoubtedly has the potential to help us solve many societal problems. However, data use is pervasive, and new and unpredictably harmful outcomes are also possible. Consumers want data to be used in ways that benefit them, for data not to be used in ways that harm them, and for their data to be protected. However, information collection and sharing is largely unbounded. If Congress wishes to move beyond a notice-and-consent model and put the burden back on organizations that handle data, then the boundaries of how data should be collected, retained, used, and shared must be confronted. Without limitations, the high value of data will continue to incentivize organizations to collect and retain data for its own sake, practices that increase cybersecurity and privacy risks to unforeseen levels.

Calling out data brokers, Intel’s David Hoffman stated that databases containing lists of rape victims are simply “unacceptable.” However, transfer restrictions are likely to be one of the hardest areas in which to reach consensus. Use restrictions, which govern what organizations can and cannot do with data at a granular level, may be approached by creating presumptively allowed and presumptively prohibited lists. Use and sharing could be presumptively allowed for responsible advertising, legal process and compliance, data security and safety, authentication, product recalls, research purposes, and the fulfillment of product and service requests. Meanwhile, using data for eligibility determinations, fraud, stalking, or other unreasonable practices could be presumptively prohibited.

However, it is difficult to determine the standards by which a particular data use should be “green-lighted” or “red-lighted.” To determine whether a data use is related to the purpose for which a user originally shared the data, factors such as whether the use is primary or secondary, how far down the chain of vendors processing occurs, and whether the processor has a direct or indirect relationship with the data subject may be considered. The FTC has done work to articulate “unreasonable” data processing and sharing, and the Center for Democracy and Technology’s Consumer Bill of Rights emphasizes respect for context (user expectations) by laying out applicable factors such as consumer privacy risk and information sensitivity.

However, “context” is difficult to operationalize. One option may be to grant the FTC rulemaking authority to determine issues such as which data uses are per se unfair, or which information is sensitive. The deception and unfairness standard has guided the FTC for decades; however, panelists were concerned about giving the FTC a blank check to use the abusiveness standard to deal with data abuses. Instead, the FTC could be given a clear set of instructions in the form of FTC guidance, a legislative preamble, or detailed provisions in the legislation itself. If this approach is taken, it would be necessary to confront the difficult question of what harm the legislation should seek to address. Because privacy injury is not clear or quantifiable, it is difficult to agree on an appropriate harm standard. A specific, though not exhaustive, list of the types of injury resulting from data processing would give the harm standard substance, and algorithmic data processing ought to be directly confronted.

Because the purpose of data analysis is to draw distinctions, the privacy debate cannot be separated from the discrimination debate. Intent to engage in prohibited discrimination is difficult to prove, especially when proxies are used. For instance, rather than directly using a protected characteristic such as race to decide who to target with payday loan advertisements, an algorithm could use zip code or music taste as a proxy for race. To provide clarity and to promote algorithmic fairness, existing discrimination laws could be augmented with privacy legislation that defines unfair discrimination according to disparate impact on protected classes (disadvantaged groups). Privacy legislation should ensure that data use does not contribute to prohibited discrimination by requiring risk assessments and outcome monitoring.
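
As a rough illustration of the outcome monitoring described above, the hypothetical Python sketch below compares the rate at which different groups receive a favorable outcome and flags large gaps using the informal “four-fifths” rule of thumb. The group labels, data, and threshold are assumptions for illustration, not anything prescribed by the panelists or by existing law.

```python
def selection_rates(outcomes):
    """outcomes: mapping of group label -> list of 0/1 favorable-outcome flags."""
    return {group: sum(flags) / len(flags) for group, flags in outcomes.items()}

def disparate_impact_ratios(outcomes, reference_group):
    """Ratio of each group's favorable-outcome rate to the reference group's rate.

    A ratio below roughly 0.8 (the informal "four-fifths rule") is commonly
    treated as a signal that the outcomes deserve closer review.
    """
    rates = selection_rates(outcomes)
    reference_rate = rates[reference_group]
    return {group: rate / reference_rate for group, rate in rates.items()}

# Hypothetical monitoring data: 1 = shown the mainstream offer, 0 = shown the payday loan ad.
observed = {
    "group_a": [1, 1, 1, 0, 1, 1, 1, 1],  # 87.5% favorable
    "group_b": [1, 0, 0, 1, 0, 1, 0, 0],  # 37.5% favorable
}
print(disparate_impact_ratios(observed, reference_group="group_a"))
# group_b's ratio of roughly 0.43 would flag these outcomes for review.
```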

To increase consumer trust and to provide them with recourse when they suspect that they are the victims of unfair discrimination, legislation should directly confront algorithmic transparency and burden of proof. Consumers cannot be expected to understand the mechanisms that determine what advertisements they are presented with or how automatic decisions are made about them. However, organizations should not be able to escape liability by claiming that they do not have access to the data or algorithm necessary to prove discrimination claims.

Enforcement

Panelists agreed that state attorneys general need to be able to enforce the law and that the FTC requires increased resources and enforcement powers. As Congress cannot anticipate every possible scenario, it is appropriate to give the FTC narrow rulemaking authority, the authority to fine for first offenses, the power to approve codes of conduct, and the ability to clarify guidance on how to comply with the law on issues such as de-identification. The FTC needs vastly more resources to accomplish this oversight and enforcement role. The jury is out as to whether Congress will pony up.

Sally Greenberg described the importance of also including an option for private parties to bring class-action suits. However, there was disagreement between panelists about whether individuals should be able to privately enforce their rights where the government lacks the resources or will to act. David Hoffman highlighted evidentiary problems associated with the difficulty in proving privacy harms. To better serve the public, he argued in favor of the creation of a uniform standard with strong protections.

Preemption of state laws 

A consistent federal standard was emphasized as a key factor driving industry support for a federal bill. For industry, a bill that does not preempt state law is close to a “deal-breaker”: businesses claim that complying with a patchwork of fifty different data breach notification standards is already hard today. It was suggested that states could be given a five-year window with no preemption to allow them to adapt and innovate, after which the situation could be reviewed. Or the reverse: preempt for five years and sunset the federal law. Both suggestions have merit, but how the questions of preemption and a private right of action will ultimately be answered remains to be seen.

Developing an approach towards consumer privacy and data security


By NCL Google Public Policy Fellow Pollyanna Sanderson

This blog post is the first of a series of blogs offering a consumer perspective on developing an approach towards consumer privacy and data security.

For more than 20 years, Congressional inaction on privacy and data security has coincided with increased data breaches impacting millions of consumers. In the absence of Congressional action, states and the executive branch have increasingly stepped in. A key part of the White House’s response is the National Telecommunications and Information Administration’s (NTIA) September Request for Comment (RFC).

While a “Request for Comment” sounds incredibly wonky, it is a key part of the process that informs the government’s approach to consumer privacy. The NTIA’s process gathers input from interested stakeholders on ways to advance consumer privacy while protecting prosperity and innovation. Stakeholder responses provide a glimpse into where consensus and disagreements lie among consumer and industry players on key issues. We have read through the comments and in this series of blogs are pleased to offer a consumer perspective.

This first blog focuses on a fundamental aspect of any proposed approach to privacy and data security: its scope. Reflecting the risks of big data classification and predictive analytics, the Center for Digital Democracy (CDD) suggested framing the issues according to data processing outputs, which would cover inferences, decisions, and other data uses that undermine individual control and privacy. Focusing instead on data inputs, many interested stakeholders agreed that privacy legislation must cover “personal information.”

The Center for Democracy and Technology noted that personal information is an evolving concept, the scope of which is “unsettled…as a matter of law, policy, and technology.” Various legal definitions exist at the state, federal, and international levels. The Federal Trade Commission’s (FTC) 2012 definition covers information capable of being associated with, or reasonably linked or linkable to, a consumer, household, or device. Subject to certain conditions, de-identified information is excluded from this definition. To help address privacy concerns while enabling collection and use, many stakeholders agree that regulatory relief should be provided for effective de-identification techniques. This would incentivize the development and implementation of privacy-enhancing techniques and de-identification technologies such as differential privacy and encryption. Federal law should avoid classifying covered data in a binary way as personal or non-personal; an all-or-nothing approach requiring irreversible de-identification would set a difficult, perhaps impossible, standard.
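
As a deliberately simplified example of one privacy-enhancing technology named above, the Python sketch below applies the classic Laplace mechanism from differential privacy to a simple counting query. The epsilon value, dataset, and query are illustrative assumptions; real deployments require careful sensitivity analysis and privacy-budget accounting.

```python
import random

def laplace_noise(scale):
    """Sample Laplace(0, scale) noise as the difference of two exponential draws."""
    return random.expovariate(1.0 / scale) - random.expovariate(1.0 / scale)

def private_count(values, predicate, epsilon=0.5):
    """Return a differentially private count of the values satisfying predicate.

    A counting query has sensitivity 1 (adding or removing one person's record
    changes the count by at most 1), so Laplace noise with scale 1/epsilon
    provides epsilon-differential privacy for this query.
    """
    true_count = sum(1 for v in values if predicate(v))
    return true_count + laplace_noise(scale=1.0 / epsilon)

# Hypothetical use: count records in a sensitive category without exposing any one person.
records = ["a", "b", "a", "a", "c", "a"]
print(private_count(records, predicate=lambda v: v == "a", epsilon=0.5))
# The true count is 4; the reported value is noisy enough to mask any single record.
```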

In an attempt to recognize that identifiability rests on a spectrum, the EU’s General Data Protection Regulation (GDPR) excludes anonymized information and introduces the concept of pseudonymized data. These concepts demand federal consideration, having been introduced into United States law via the California Consumer Privacy Act (CCPA). The law should clarify how it applies to aggregated, de-identified, pseudonymous, identifiable, and identified information. For data to be considered de-identified and subject to lower standards, it must not be linkable to an individual, the risk of re-identification must be minimal, the entity must publicly commit not to attempt to re-identify the data, and effective legal, administrative, technical, and/or contractual controls must be applied to safeguard that commitment.

While de-identified and other anonymized data may be subject to lower privacy standards, they should not be removed from protection altogether. In its NTIA comment, the CDD highlights that third-party personal data, anonymized data, and other forms of non-personal data may be used to make sensitive inferences and to develop profiles, which could be used for purposes ranging from persuading voters to targeting advertisements. However, individual privacy rights may only be exercised after inferences or profiles have been applied at the individual level. Because profiles and inferences can be made without identifiability, this aspect of corporate data practice would largely escape accountability if de-identified and other anonymized data were not subject to standards of some kind.

This loophole must be closed. Personal information should be broadly defined to address risks of re-identification and to capture evolving business practices that undermine privacy. While the GDPR does not include inferred information in its definition of personal information, inspiration could be taken from the definition of personal information given by the CCPA, which includes inferred information drawn from personal information and used to create consumer profiles.

Our next blog will explore “developing an approach for handling privacy risks and harms.” In its request for comment, the NTIA established a risk- and outcome-based approach towards consumer privacy as a high-level goal for federal action. However, within industry and society, there is a lack of consensus about what constitutes a privacy risk. Stay tuned for a deep dive into the key issues that arise.

The author completed her undergraduate degree in law at Queen Mary University of London and her Master of Laws at William & Mary. She has focused her career on privacy and data security.