Judges Must Weigh Admissibility of AI Evidence

Evidence that comes from algorithms or that might be deepfake will have to go before a judge, who must then decide based on a number of mitigating factors whether it is admissible.

Retired Judge Paul Grimm speaks during the 2023 American Association for the Advancement of Science's Scientific Evidence and the Courts conference.
Findings from algorithms, as well as any content that may have been produced or altered by AI, are likely to show up in court soon, and judges must be able to assess whether to allow them before the jury.

This isn’t a new responsibility for judges, who already need to evaluate relevance and authenticity of scientific or technical materials before admitting them as evidence. But the need to understand often opaque and complicated algorithmic systems, as well as the risk of undetected deepfakes biasing the jury, present particular challenges.

Judges are guided by the Federal Rules of Evidence or an equivalent state policy, but there are not yet AI-specific rules within those, and updates typically take between two and five years to get approved, said Paul Grimm, retired judge of the U.S. District Court of Maryland and current director of the Duke Law School Bolch Judicial Institute. He made his remarks as part of the American Association for the Advancement of Science's (AAAS) recent Scientific Evidence and the Courts conference.

As such, judges must find ways to apply current rules to this emerging tech challenge.

JUDGES' DUTY


Judges must determine whether expert testimony or scientific or technical evidence is sufficiently relevant, accurate and authentic, as well as whether it is likely to unfairly bias a jury.

Considering authenticity means assessing whether the AI system does what it was intended to and also produces results consistently when used in similar conditions.

Judges assessing AI evidence should look, in part, to the Daubert standard to assess the scientific quality of the evidence, Grimm said. Per the American Bar Association, the standard requires judges to consider whether the theory put forth in expert testimony can be, and has been, tested; whether it has been peer-reviewed; what the error rate is; and whether the theory or methodology is generally accepted by the relevant scientific community.

And for judges to make reasoned assessments on authenticity and opposing parties to be able to mount arguments against evidence, they need transparency into how the algorithmic model was trained. Judges should use protective orders to ensure even owners of proprietary AI allow the other party to see details about the algorithm design and training data. For example, knowing a facial recognition technology was trained primarily on faces of white men has important implications because it indicates the tool would likely then perform worse at recognizing faces of other demographics.

When assessing evidence’s accuracy and authenticity, judges aren’t looking for absolute proof that it's genuine rather than a deepfake, just that this is more likely than not the case. After all, it’s the jury that will weigh how convincing the evidence and the parties’ arguments are.

Before admitting evidence, the judge also needs to consider whether it should be held back from the jury because it would cause unfair or excessive prejudice. In such a situation, the risk of harm from the evidence turning out to be fake, unreliable or invalid is so severe that it outweighs the potential value of the evidence to prove something in the case. For example, if the evidence is being used to influence the length of a person’s sentence and it turns out to be a deepfake, that could mean someone being locked up longer than necessary.

And deepfakes present particularly high risks of prejudicing jurors. People who are shown a video — even if they are told it might be deepfaked — are still likely to struggle to disbelieve it when considering facts, Grimm and co-authors note in a recent Duke Law & Technology Review article. They point to a 2023 study that found “jurors who hear oral testimony along with video testimony are 650 percent more likely to retain the information,” and that “video evidence powerfully affects human memory and perception of reality.”

To weigh such risks, judges can look to factors such as whether the algorithm-provided information is the critical evidence in the case and whether other, non-tech-generated evidence corroborates the algorithm’s findings, Grimm said.

THE JURY CONSIDERS DEEPFAKES


Jurors may be presented with evidence that one party claims is genuine and another claims is a deepfake. After seeing the media and hearing both arguments, jurors will decide whether to trust it as genuine or dismiss it as faked.

But the jury “really does not have the technical expertise to decide” such matters, Grimm said. In fact, the rise of deepfakes has prompted some calls for removing the jury’s authority to determine whether “digital and audiovisual evidence” is genuine, per the Duke Law article, but “such a change would involve a substantial departure from the current evidentiary framework and would take considerable time to adopt, making it infeasible as a practical solution.”

Right now, there are two key ways to prove a piece of content is deepfaked: detecting “trace evidence that was left in the digital components of the image” or having a copy of the original content that was used in making the deepfake, said David Doermann, director of the State University of New York-Buffalo’s Institute for Artificial Intelligence and Data Science, during the AAAS panel.

But because there are ways of spoofing marks of originality, and because deepfake technology is only getting more convincing, “you can’t really say that something is not a deepfake,” he said.
Jule Pattison-Gordon is a senior staff writer for Governing and former senior staff writer for Government Technology, where Jule specialized in cybersecurity. Jule also previously wrote for PYMNTS and The Bay State Banner and holds a B.A. in creative writing from Carnegie Mellon.