California Jury to Decide if Facebook’s Deep Learning Facial Recognition Creates Regulated Biometric Information

Following a recent decision issued by Judge James Donato of the U.S. District Court for the Northern District of California, a jury to be convened in San Francisco in July will decide whether a Facebook artificial intelligence technology creates regulated “biometric information” under Illinois’ Biometric Information Privacy Act (BIPA).  In some respects, the jury’s decision could reflect general sentiment toward AI during a time when vocal opponents of AI have been widely covered in the media.  The outcome could also affect how US companies, already impacted by Europe’s General Data Protection Regulation (GDPR), view their use of AI technologies to collect and process user-supplied data. For lawyers, the case could highlight effective litigation tactics in highly complex AI cases, where black box algorithms are often opaque and unexplainable, even to their own developers.

What’s At Stake? What Does BIPA Cover?

Uniquely personal biometric identifiers, such as a person’s face and fingerprints, are often seen as needing heightened protection from hackers because, unlike a stolen password that one can reset, a person cannot change their face or fingerprints if someone makes off with digital versions and uses them to steal the person’s identity or gain access to the person’s biometrically protected accounts, devices, and secure locations. The now 10-year-old BIPA (740 ILCS 14/1 (2008)) was enacted to ensure users are made aware of instances when their biometric information is being collected, stored, and used, and to give users the option to opt out. The law imposes requirements on companies and penalties for non-compliance, including liquidated and actual damages. At issue here, the law addresses “a scan” of a person’s “face geometry,” though it falls short of explicitly defining those terms.

Facebook users voluntarily upload to their Facebook accounts digital images depicting them, their friends, and/or family members. Some of those images are automatically processed by an AI technology to identify the people in the images. The Plaintiffs, the named individuals in a putative class action, argue that Facebook’s facial recognition feature involves a “scan” of a person’s “face geometry” such that it collects and stores biometric data in violation of BIPA.

Summary of the Court’s Recent Decision

In denying the parties’ cross-motions for summary judgment and allowing the case to go to trial, Judge Donato found that the Plaintiffs and Facebook “offer[ed] strongly conflicting interpretations of how the [Facebook] software processes human faces.” See In re Facebook Biometric Information Privacy Litigation, slip op. (Dkt. 302), No. 3:15-cv-03747-JD (N.D. Cal. May 14, 2018). The Plaintiffs, he wrote, argued that “the technology necessarily collects scans of face geometry because it uses human facial regions to process, characterize, and ultimately recognize face images.” On the other hand, “Facebook…says the technology has no express dependency on human facial features at all.”

Addressing Facebook’s interpretation of BIPA, Judge Donato considered the threshold question of what BIPA’s drafters meant by a “scan” in “scan of face geometry.” He rejected Facebook’s suggestion that BIPA relates to an express measurement of human facial features such as “a measurement of the distance between a person’s eyes, nose, and ears.” In doing so, he relied on extrinsic evidence in the form of dictionary definitions, specifically Merriam-Webster’s 11th, for an ordinary meaning of “to scan” (i.e., to “examine” by “observation or checking,” or “systematically . . . in order to obtain data especially for display or storage”) and “geometry” (which, in everyday use, means simply a “configuration,” which in turn denotes a “relative arrangement of parts or elements”).  “[N]one of these definitions,” the Judge concluded, “demands actual or express measurements of spatial quantities like distance, depth, or angles.”

The Jury Could Face a Complex AI Issue

Digital images contain a numerical representation of what is shown in the image, specifically the color (or grayscale), transparency, and other information associated with each pixel of the image. An application running on a computer can render the image on a display device by reading the file data to identify what color or grayscale level each pixel should display. When a person scans a physical image or takes a digital photo with a smartphone, the device systematically generates this pixel-level data. Digital image data may be saved to a file having a particular format designated by a file extension (e.g., .GIF, .JPG, .PNG, etc.).
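The pixel-level representation described above can be sketched in a few lines of code. This is a generic illustration of how image data is structured, not anything specific to Facebook’s systems: a tiny image is modeled as a grid of (red, green, blue) tuples, and a standard luminance formula (the ITU-R BT.601 weights) converts each pixel to a single grayscale value.

```python
# A tiny 2x2 RGB "image" as a nested list of per-pixel (r, g, b) values,
# each channel an integer from 0 to 255. Real image files (.JPG, .PNG, etc.)
# ultimately encode the same kind of per-pixel numbers, plus compression
# and metadata.
image = [
    [(255, 0, 0), (0, 255, 0)],      # row 0: a red pixel, a green pixel
    [(0, 0, 255), (128, 128, 128)],  # row 1: a blue pixel, a gray pixel
]

def to_grayscale(img):
    """Convert each RGB pixel to one luminance value (ITU-R BT.601 weights)."""
    return [
        [round(0.299 * r + 0.587 * g + 0.114 * b) for (r, g, b) in row]
        for row in img
    ]

gray = to_grayscale(image)
print(gray)  # [[76, 150], [29, 128]]
```

A renderer does the reverse of this flattening: it reads the stored per-pixel numbers and lights up the corresponding pixels on the display.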

A deep convolutional neural network, a type of AI, can be used to further process a digital image file’s data to extract features from the data. In a way, the network replicates the human cognitive process of examining a photograph. For instance, when we examine a face in a photo, we take note of features and attributes, like the shape and contours of the nose and lips, as well as eye color and hair. Those and other features may help us recall from memory whose face we are looking at even if we have never seen the image before.
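The core operation of a convolutional network can be illustrated with a toy example. A small numeric filter (kernel) slides across the pixel grid, and large responses flag low-level features such as edges; deeper layers combine such responses into progressively more abstract features. The kernel and image below are invented for illustration and have nothing to do with Facebook’s actual network:

```python
def convolve(img, kernel):
    """Valid-mode 2D convolution (strictly, cross-correlation) over a 2D list."""
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for i in range(len(img) - kh + 1):
        row = []
        for j in range(len(img[0]) - kw + 1):
            # Multiply the kernel against the pixel window and sum the products.
            acc = sum(
                img[i + di][j + dj] * kernel[di][dj]
                for di in range(kh)
                for dj in range(kw)
            )
            row.append(acc)
        out.append(row)
    return out

# A grayscale image that is dark on the left, bright on the right.
image = [
    [0, 0, 255, 255],
    [0, 0, 255, 255],
    [0, 0, 255, 255],
]

# A classic vertical-edge detector (Sobel kernel).
sobel_x = [[-1, 0, 1],
           [-2, 0, 2],
           [-1, 0, 1]]

print(convolve(image, sobel_x))  # [[1020, 1020]]: strong response at the edge
```

A trained network learns its kernel values from data rather than using hand-picked ones like the Sobel kernel here, which is part of what Facebook means when it says the system “learned for itself” which pixel patterns matter.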

A deep neural network, once it is fully trained using many different face images, essentially works in a similar manner. After processing image file data to extract and “recognize” features, the network uses the features to classify the image by associating it with an identity, assuming it has “seen” the face before (in which case it may compare the extracted features to a template image of the face, preferably several images of the face). Thus, a digital image file contains a numerical representation of what is shown in the image, and a deep neural network creates a numerical representation of features shown in the digital image to perform classification.  A question for the jury, then, may involve deciding whether the processing of uploaded digital images using a deep convolutional neural network involves “a scan” of “a person’s face geometry.” This question will challenge the parties and their lawyers to help the jury understand digital image files and the nuances of AI technology.
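The template-comparison step described above can be sketched as a nearest-neighbor search over feature vectors. The names, vectors, and similarity threshold below are invented for illustration; actual face-recognition systems use much higher-dimensional features and different matching logic:

```python
import math

def cosine_similarity(a, b):
    """Similarity of two feature vectors: 1.0 means identical direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

# Hypothetical enrolled "templates": feature vectors previously extracted
# from known faces.
templates = {
    "alice": [0.9, 0.1, 0.3],
    "bob": [0.2, 0.8, 0.5],
}

def identify(features, templates, threshold=0.9):
    """Return the enrolled identity most similar to the extracted features,
    or None if nothing is similar enough (an unrecognized face)."""
    name, score = max(
        ((n, cosine_similarity(features, t)) for n, t in templates.items()),
        key=lambda pair: pair[1],
    )
    return name if score >= threshold else None

print(identify([0.85, 0.15, 0.35], templates))  # alice
```

The legal question is whether the intermediate feature vector, which the network derives from the pixels rather than from explicit distance measurements, amounts to “a scan” of “face geometry.”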

For Litigators, How to Tackle AI and Potential AI Bias?

The particulars of advanced AI have not been central to a major federal jury case to date.  Thus, the Facebook case offers an opportunity to evaluate a jury’s reaction to a particular AI technology.

In its summary judgment brief, Facebook submitted expert testimony that its AI “learned for itself what features of an image’s pixel values are most useful for the purposes of characterizing and distinguishing images of human faces” and it “combines and weights different combinations of different aspects of the entire face image’s pixel value.” This description did not persuade Judge Donato to conclude that an AI with “learning” capabilities escapes BIPA’s reach, at least not as a matter of law.  Whether it will be persuasive to a jury is an open question.

It is possible some potential jurors may have preconceived notions about AI, given the hype surrounding use cases for the technology.  Indeed, outside the courthouse, AI’s potential dark side and adverse impacts on society have been widely reported. Computer vision-enabled attack drones, military AI systems, jobs being taken over by AI-powered robots, algorithmic harm due to machine learning bias, and artificial general intelligence (AGI) taking over the world appear regularly in the media.  If juror bias for or against AI is not properly managed, the jury’s final decision might be viewed by some as a referendum on AI.

For litigators handling AI cases in the future, the outcome of the Facebook case could provide a roadmap for effective trial strategies involving highly complex AI systems that defy simple description.  That is not to say that the outcome will create a new paradigm for litigating tech. After all, many trials involve technical experts who try to explain complex technologies in a way that resonates with a jury, and complex technology often lies at the heart of disputes involving intellectual property, medical malpractice, finance, and other fields.  But those cases usually don’t involve technologies that “learn” for themselves.

How Will the Outcome Affect User Data Collection?

The public is becoming more aware that tech companies are enticing users to their platforms and apps as a way to generate user-supplied data. While the Facebook case itself may not usher in a wave of new laws and regulations or even self-policing by the tech industry aimed at curtailing user data collection, a sizeable damages award from the jury could have a measured chilling effect. Indeed, some companies may be more transparent about their data collection and provide improved notice and opt-out mechanisms.

How Privacy Law’s Beginnings May Suggest An Approach For Regulating Artificial Intelligence

A survey conducted in April 2017 by Morning Consult suggests most Americans are in favor of regulating artificial intelligence technologies. Of 2,200 American adults surveyed, 71% said they strongly or somewhat agreed that there should be national regulation of AI, while only 14% strongly or somewhat disagreed (15% did not express a view).

Technology and business leaders speaking out on whether to regulate AI fall into one of two camps: those who generally favor an ex post, case-by-case, common law approach, and those who prefer establishing a statutory and regulatory framework that, ex ante, sets forth clear do’s and don’ts and penalties for violations. (If you’re interested in learning about the challenges of ex post and ex ante approaches to regulation, check out Matt Scherer’s excellent article, “Regulating Artificial Intelligence Systems: Risks, Challenges, Competencies, and Strategies,” published in the Harvard Journal of Law and Technology (2016)).

Advocates for a proactive regulatory approach caution that the alternative is fraught with predictable danger. Elon Musk, for one, notes that, “[b]y the time we’re reactive in A.I., regulation’s too late.” Others, including leaders of some of the biggest AI technology companies in the industry, backed by lobbying organizations like the Information Technology Industry Council (ITI), feel that the hype surrounding AI does not justify quick Congressional action at this time.

Musk criticized this wait-and-see approach. “Normally, the way regulation’s set up,” he said, “a whole bunch of bad things happen, there’s a public outcry, and then after many years, a regulatory agency is set up to regulate that industry. There’s a bunch of opposition from companies who don’t like being told what to do by regulators, and it takes forever. That in the past has been bad but not something which represented a fundamental risk to the existence of civilization.”

Assuming AI regulation is inevitable, how should regulators (and legislators) approach such a formidable task? After all, AI technologies come in many forms, and their uses extend across multiple industries, including some already burdened with regulation. The history of privacy law may provide the answer.

Without question, privacy concerns, and privacy laws, touch on AI technology use and development. That’s because so much of today’s human-machine interactions involving AI are powered by user-provided or user-mined data. Search histories, images people appear in on social media, purchasing habits, home ownership details, political affiliations, and many other data points are well-known to marketers and others whose products and services rely on characterizing potential customers using, for example, machine learning algorithms, convolutional neural networks, and other AI tools. In the field of affective computing, human-robot and human-chatbot interactions are driven by a person’s voice, facial features, heart rate, and other physiological features, which are the percepts that the AI system collects, processes, stores, and uses when deciding actions to take, such as responding to user queries.

Privacy laws evolved from a period during late nineteenth century America when journalists were unrestrained in publishing sensational pieces for newspapers or magazines, basically the “fake news” of the time. This Yellow Journalism, as it was called, prompted legal scholars to express a view that people had a “right to be let alone,” setting in motion the development of a new body of law involving privacy. The key to regulating AI, as it was in the development of regulations governing privacy, may be the recognition of a specific personal right that is, or is expected to be, infringed by AI systems.

In the case of privacy, attorneys Samuel Warren and Louis Brandeis (later, Justice Brandeis) were the first to articulate a personal privacy right. In The Right to Privacy, published in the Harvard Law Review in 1890, Warren and Brandeis observed that “the press is overstepping in every direction the obvious bounds of propriety and of decency. Gossip…has become a trade.” They contended that “for years there has been a feeling that the law must afford some remedy for the unauthorized circulation of portraits of private persons.” They argued that a right of privacy was entitled to recognition because “in every [] case the individual is entitled to decide whether that which is his shall be given to the public.” A violation of the person’s right of privacy, they wrote, should be actionable.

Soon after, courts began recognizing the right of privacy in civil cases. By 1960, in his seminal review article entitled Privacy (48 Cal. L. Rev. 383), William Prosser wrote that, “[i]n one form or another,” the right of privacy “was declared to exist by the overwhelming majority of the American courts.” That broad recognition pushed the law toward more uniform standards. Some states enacted limited or sweeping state-specific statutes, replacing the common law with statutory provisions and penalties. Federal appeals courts weighed in when conflicts between state laws arose. This slow progression from initial recognition of a personal privacy right in 1890, to today’s modern statutes and expansive development of common law, won’t appeal to those pushing for regulation of AI now.

Even so, the process has to begin somewhere, and it could very well start with an assessment of the personal rights that should be recognized arising from interactions with or the use of AI technologies. Already, personal rights recognized by courts and embodied in statutes apply to AI technologies. But there is one personal right, potentially unique to AI technologies, that has been suggested: the right to know why (or how) an AI technology took a particular action (or made a decision) affecting a person.

Take, for example, an adverse credit decision by a bank that relies on machine learning algorithms to decide whether a customer should be given credit. Should that customer have the right to know why (or how) the system made the credit-worthiness decision? Fast Company writer Cliff Kuang explored this proposition in his recent article, “Can A.I. Be Taught to Explain Itself?” published in the New York Times (November 21, 2017).

If AI could explain itself, the banking customer might want to ask it what kind of training data was used and whether the data was biased, or whether there was an errant line of Python code to blame, or whether the AI gave the appropriate weight to the customer’s credit history. Given the nature of AI technologies, some of these questions, and even more general ones, may only be answered by opening the AI black box. But even then it may be impossible to pinpoint how the AI technology made its decision. In Europe, “tell me why/how” regulations are expected to become effective in May 2018. As I will discuss in a future post, many practical obstacles face those wishing to build a statutory or regulatory framework around the right of consumers to demand that businesses’ AI explain why it made or took a particular adverse action.

Regulation of AI will likely happen. In fact, we are already seeing the beginning of direct legislative/regulatory efforts aimed at the autonomous driving industry. Whether interest in expanding those efforts to other AI technologies grows or lags may depend at least in part on whether people believe they have personal rights at stake in AI, and whether those rights are being protected by current laws and regulations.