California Appeals Court Denies Defendant Access to Algorithm That Contributed Evidence to His Conviction

One of the concerns expressed by those studying algorithmic decision-making is the apparent lack of transparency. Those impacted by adverse algorithmic decisions often seek transparency to better understand the basis for those decisions. In the case of software used in legal proceedings, parties who seek explanations about how the software works face a number of obstacles, including those imposed by evidentiary rules, by criminal and civil procedural rules, and by software companies that resist discovery requests.

The closely followed issue of algorithmic transparency was recently considered by a California appellate court in People v. Superior Court of San Diego County, slip op., Case No. D073943 (Cal. Ct. App. 4th Dist. Oct. 17, 2018), in which the People sought relief from a discovery order requiring the production of software and source code used in the conviction of Florencio Jose Dominguez. Following a hearing and review of the record and of amicus briefs filed in support of Dominguez by the American Civil Liberties Union, the American Civil Liberties Union of San Diego and Imperial Counties, the Innocence Project, Inc., the California Innocence Project, the Northern California Innocence Project at Santa Clara University School of Law, Loyola Law School’s Project for the Innocent, and the Legal Aid Society of New York City, the appeals court granted the People the relief they sought. In doing so, the court considered, but was not persuaded by, the defense team’s “black box” and “machine testimony” arguments.

At issue on appeal was Dominguez’s motion to compel production of a DNA testing program called STRmix used by local prosecutors in their analysis of forensic evidence (specifically, DNA found on the inside of gloves). STRmix is a “probabilistic genotyping” program that expresses a match between a suspect and DNA evidence in terms of the probability of a true match as compared to the probability of a coincidental match. Probabilistic genotyping is said to reduce subjectivity in the analysis of DNA typing results. Dominguez’s counsel moved the trial court for an order compelling the People to produce the STRmix software program and related updates, as well as its source code, arguing that the defendant had a right to look inside the software’s “black box.” The trial court granted the motion, and the People sought writ relief from the appellate court.
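For context, probabilistic genotyping programs of this kind typically report their results as a likelihood ratio. A simplified sketch of that form (the hypotheses and notation here are illustrative, and are not drawn from the opinion or from STRmix’s own documentation) is:

\[
LR = \frac{\Pr(E \mid H_p)}{\Pr(E \mid H_d)}
\]

where E is the observed DNA profile, H_p is the hypothesis that the suspect contributed to the sample, and H_d is the hypothesis that an unknown, unrelated person did. The larger the ratio, the more strongly the evidence favors the prosecution’s hypothesis over the coincidental-match alternative.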

On appeal, the appellate court noted that “computer software programs are written in specialized languages called source code” and that “source code, which humans can read, is then translated into [a] language that computers can read.” Cadence Design Systems, Inc. v. Avant! Corp., 29 Cal. 4th 215, 218 n.3 (2002). The lab that used STRmix testified that it had no way to access the source code, which it licensed from an authorized seller of the software. Thus, the court considered whether the company that created the software should produce it. In concluding that the company was not obligated to produce the software and source code, the court, citing precedent, found that the company would have had no knowledge of the case but for the defendant’s subpoena duces tecum, and that it did not act as part of the prosecutorial team such that it was obligated to turn over exculpatory evidence (assuming the software itself is exculpatory, which the court was reluctant to find).

With regard to the defense team’s “black box” argument, the appellate court found nothing in the record to indicate that the STRmix software suffered a problem, as the defense team argued, that might have affected its results. Calling this allegation speculative, the court concluded that the “black box” nature of STRmix was not itself sufficient to warrant its production.

Moreover, the court was unpersuaded by the defense team’s argument that the STRmix program essentially usurped the lab analyst’s role in providing the final statistical comparison, and so the software program—not the analyst using the software—was effectively the source of the expert opinion rendered at trial. The lab, the defense argued, merely acted in a scrivener’s capacity for STRmix’s analysis, and since the machine was providing testimony, Dominguez should be able to evaluate the software to defend against the prosecution’s case against him.

The appellate court disagreed. While acknowledging the “creativity” of the defense team’s “machine testimony” argument (which relied heavily on Berkeley law professor Andrea Roth’s “Machine Testimony” article, 126 Yale L.J. 1972 (2017)), the panel pointed to testimony that STRmix did not act alone and that there were humans in the loop: “[t]here are still decisions that an analyst has to make on the front end in terms of determining the number of contributors to a particular sample and determin[ing] which peaks are from DNA or from potentially artifacts,” and the program then performs a “robust breakdown of the DNA samples,” based at least in part on “parameters [the lab] set during validation.” Moreover, after STRmix renders “the diagnostics,” the lab “evaluate[s] … the genotype combinations … to see if that makes sense, given the data [it’s] looking at.” After the lab “determine[s] that all of the diagnostics indicate that the STRmix run has finished appropriately,” it can then “make comparisons to any person of interest or … database that [it’s] looking at.”

While the appellate court’s decision largely followed precedent and established procedure, it could easily have gone the other way and affirmed the trial judge’s order compelling production of the STRmix software and source code, which would have given Dominguez better insight into the nature of the software’s algorithms, its parameters and limitations in view of validation studies, and the range of outputs the model could have produced given a set of inputs. In particular, the court might have affirmed the trial judge’s decision to grant access to the STRmix software had the policy of imposing transparency on STRmix’s algorithmic decisions been weighed against the actual harm that might occur if software and source code are produced. Here, the source code owner’s objection to production was based in part on trade secret and other confidentiality concerns; however, procedures already exist to handle those concerns. Indeed, source code reviews happen all the time in the civil context, such as in patent infringement matters involving software technologies. While software makers are right to be concerned about the harm to their businesses if their code ends up in the wild, the real risk of that happening can be kept low if proper procedures, embodied in a suitable court-issued protective order, are followed by lawyers on both sides of a matter, and if the court maintains oversight and demands status updates from the parties to ensure compliance and integrity in the review process. Instead of following the trial court’s approach, however, the appellate court conditioned access to STRmix’s “black box” on a demonstration of specific errors in the program’s results, which seems intractable: only by looking into the black box in the first place can a party understand whether problems exist that affect the result.

Interestingly, artificial intelligence had nothing to do with the outcome of the appellate court’s decision, yet the panel noted: “We do not underestimate the challenges facing the legal system as it confronts developments in the field of artificial intelligence.” The judges acknowledged that the notion of “machine testimony” in algorithmic decision-making matters is a subject about which there are widely divergent viewpoints in the legal community, a possible prelude to what is ahead when artificial intelligence software cases make their way through the courts in criminal and civil matters. On that point, the judges cautioned that, “when faced with a novel method of scientific proof, we have required a preliminary showing of general acceptance of the new technique in the relevant scientific community before the scientific evidence may be admitted at trial.”

Lawyers in future artificial intelligence cases should consider how best to frame arguments concerning machine testimony in both civil and criminal contexts to improve their chances of overcoming evidentiary obstacles. They will need to articulate clearly the nature of artificial intelligence decision-making algorithms, as well as the relative roles of the data scientists and model developers who make decisions about model architecture, hyperparameters, data sets, model inputs, training and testing procedures, and the interpretation of results. Today’s artificial intelligence systems do not operate autonomously; there will always be humans associated with a model’s output or result, and those persons may need to provide expert testimony beyond the machine’s “testimony.” Even so, transparency will be important to understanding algorithmic decisions and to developing an evidentiary record in artificial intelligence cases.

How Privacy Law’s Beginnings May Suggest An Approach For Regulating Artificial Intelligence

A survey conducted in April 2017 by Morning Consult suggests most Americans are in favor of regulating artificial intelligence technologies. Of 2,200 American adults surveyed, 71% said they strongly or somewhat agreed that there should be national regulation of AI, while only 14% strongly or somewhat disagreed (15% did not express a view).

Technology and business leaders speaking out on whether to regulate AI fall into one of two camps: those who generally favor an ex post, case-by-case, common law approach, and those who prefer establishing a statutory and regulatory framework that, ex ante, sets forth clear do’s and don’ts and penalties for violations. (If you’re interested in learning about the challenges of ex post and ex ante approaches to regulation, check out Matt Scherer’s excellent article, “Regulating Artificial Intelligence Systems: Risks, Challenges, Competencies, and Strategies,” published in the Harvard Journal of Law and Technology (2016)).

Advocates for a proactive regulatory approach caution that the alternative is fraught with predictable danger. Elon Musk, for one, notes that “[b]y the time we’re reactive in A.I., regulation’s too late.” Others, including leaders of some of the biggest AI technology companies in the industry, backed by lobbying organizations like the Information Technology Industry Council (ITI), feel that the hype surrounding AI does not justify quick Congressional action at this time.

Musk criticized this wait-and-see approach. “Normally, the way regulation’s set up,” he said, “a whole bunch of bad things happen, there’s a public outcry, and then after many years, a regulatory agency is set up to regulate that industry. There’s a bunch of opposition from companies who don’t like being told what to do by regulators, and it takes forever. That in the past has been bad but not something which represented a fundamental risk to the existence of civilization.”

Assuming AI regulation is inevitable, how should regulators (and legislators) approach such a formidable task? After all, AI technologies come in many forms, and their uses extend across multiple industries, including some already burdened with regulation. The history of privacy law may provide the answer.

Without question, privacy concerns, and privacy laws, touch on AI technology use and development. That’s because so many of today’s human-machine interactions involving AI are powered by user-provided or user-mined data. Search histories, images people appear in on social media, purchasing habits, home ownership details, political affiliations, and many other data points are well known to marketers and others whose products and services rely on characterizing potential customers using, for example, machine learning algorithms, convolutional neural networks, and other AI tools. In the field of affective computing, human-robot and human-chatbot interactions are driven by a person’s voice, facial features, heart rate, and other physiological signals, which are the percepts that the AI system collects, processes, stores, and uses when deciding what actions to take, such as responding to user queries.

Privacy laws evolved from a period in late nineteenth-century America when journalists were unrestrained in publishing sensational pieces for newspapers and magazines, basically the “fake news” of the time. This yellow journalism, as it was called, prompted legal scholars to express the view that people had a “right to be let alone,” setting in motion the development of a new body of law involving privacy. The key to regulating AI, as it was in the development of regulations governing privacy, may be the recognition of a specific personal right that is, or is expected to be, infringed by AI systems.

In the case of privacy, attorneys Samuel Warren and Louis Brandeis (later, Justice Brandeis) were the first to articulate a personal privacy right. In The Right to Privacy, published in the Harvard Law Review in 1890, Warren and Brandeis observed that “the press is overstepping in every direction the obvious bounds of propriety and of decency. Gossip…has become a trade.” They contended that “for years there has been a feeling that the law must afford some remedy for the unauthorized circulation of portraits of private persons.” They argued that a right of privacy was entitled to recognition because “in every [] case the individual is entitled to decide whether that which is his shall be given to the public.” A violation of the person’s right of privacy, they wrote, should be actionable.

Soon after, courts began recognizing the right of privacy in civil cases. By 1960, in his seminal review article entitled Privacy (48 Cal. L. Rev. 383), William Prosser wrote that, “[i]n one form or another,” the right of privacy “was declared to exist by the overwhelming majority of the American courts.” That recognition led, over time, to more uniform standards. Some states enacted limited or sweeping state-specific statutes, replacing the common law with statutory provisions and penalties. Federal appeals courts weighed in when conflicts between state laws arose. This slow progression, from initial recognition of a personal privacy right in 1890 to today’s modern statutes and expansive development of common law, won’t appeal to those pushing for regulation of AI now.

Even so, the process has to begin somewhere, and it could very well start with an assessment of the personal rights that should be recognized as arising from interactions with, or the use of, AI technologies. Already, personal rights recognized by courts and embodied in statutes apply to AI technologies. But there is one personal right, potentially unique to AI technologies, that has been suggested: the right to know why (or how) an AI technology took a particular action (or made a decision) affecting a person.

Take, for example, an adverse credit decision by a bank that relies on machine learning algorithms to decide whether a customer should be given credit. Should that customer have the right to know why (or how) the system made the creditworthiness decision? Fast Company writer Cliff Kuang explored this proposition in his recent article, “Can A.I. Be Taught to Explain Itself?,” published in the New York Times (November 21, 2017).

If AI could explain itself, the banking customer might want to ask it what kind of training data was used and whether that data was biased, whether there was an errant line of Python code to blame, or whether the AI gave the appropriate weight to the customer’s credit history. Given the nature of AI technologies, some of these questions, and even more general ones, may only be answered by opening the AI black box. But even then it may be impossible to pinpoint how the AI technology made its decision. In Europe, “tell me why/how” regulations are expected to become effective in May 2018. As I will discuss in a future post, many practical obstacles face those wishing to build a statutory or regulatory framework around the right of consumers to demand that a business’s AI explain why it made or took a particular adverse action.
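To make the “tell me why” idea concrete, below is a minimal, hypothetical sketch in Python of the kind of explanation a consumer might demand from a simple credit-scoring model. The feature names, weights, and applicant data are invented for illustration; real credit models, and the explanation tools applied to them, are far more complex.

```python
import math

# Hypothetical logistic-regression-style credit model: one weight per feature.
# These feature names and numbers are invented for illustration only.
WEIGHTS = {
    "credit_history_years": 0.35,
    "debt_to_income_ratio": -2.1,
    "late_payments_last_year": -0.8,
    "annual_income_usd_10k": 0.12,
}
BIAS = -0.5

def score(applicant):
    """Return the estimated approval probability and each feature's contribution."""
    contributions = {name: weight * applicant[name] for name, weight in WEIGHTS.items()}
    logit = BIAS + sum(contributions.values())
    probability = 1.0 / (1.0 + math.exp(-logit))
    return probability, contributions

def explain(applicant):
    """Print a crude 'tell me why' report: which features pushed the score up or down."""
    probability, contributions = score(applicant)
    decision = "approve" if probability >= 0.5 else "deny"
    print(f"Decision: {decision} (estimated approval probability {probability:.2f})")
    for name, contribution in sorted(contributions.items(), key=lambda item: item[1]):
        direction = "lowered" if contribution < 0 else "raised"
        print(f"  {name} {direction} the score by {abs(contribution):.2f}")

if __name__ == "__main__":
    explain({
        "credit_history_years": 4,
        "debt_to_income_ratio": 0.6,
        "late_payments_last_year": 2,
        "annual_income_usd_10k": 5.5,
    })
```

Even a toy model like this shows how much of the “explanation” depends on choices made by the people who built it: the features selected, the weights learned from training data, and the threshold for approval.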

Regulation of AI will likely happen. In fact, we are already seeing the beginning of direct legislative/regulatory efforts aimed at the autonomous driving industry. Whether interest in expanding those efforts to other AI technologies grows or lags may depend at least in part on whether people believe they have personal rights at stake in AI, and whether those rights are being protected by current laws and regulations.