AI legal scrutiny intensifies

A judge criticised a key OpenAI witness in the publishers’ copyright case for having 'hazy recollections,' underlining that courts are forcing AI companies to explain training and retrieval practices under adversarial pressure. That kind of scrutiny is spreading and means product and engineering teams will increasingly have to make design and audit decisions that are legally defensible, not just technically optimal. The case is a reminder that governance will shape product scope going forward. (nydailynews.com)

A federal judge in Manhattan said an OpenAI witness was not adequately prepared, after the witness struggled to answer basic questions about an internal project and the judge said his answers turned “increasingly evasive” after repeated objections from company lawyers. The court ordered another 3.5 hours of testimony on April 8 and warned that sanctions could still follow. (justia.com)

That sounds procedural, but it goes to the center of how generative artificial intelligence cases now work in court. Judges are no longer just asking whether a chatbot produced a bad answer; they are asking who built the data pipeline, what copies were made, what metadata was stripped, and which employees knew what at the time. (justia.com, bloomberglaw.com)

The witness fight sits inside a much bigger bundle of lawsuits. In April 2025, the Judicial Panel on Multidistrict Litigation transferred a set of OpenAI copyright cases from New York and Northern California into coordinated pretrial proceedings in the Southern District of New York before Judge Sidney Stein, with Magistrate Judge Ona Wang handling discovery disputes. (courtlistener.com) One of those publisher cases was filed on April 30, 2024 by the Daily News, the Chicago Tribune, the Orlando Sentinel, the San Jose Mercury News, and other newspapers against Microsoft and OpenAI. The complaint put newspaper articles, copying, and chatbot outputs into the same courtroom record instead of treating them as separate fights. (courtlistener.com)

Courts are pressing on a simple question that turns technical fast: when an artificial intelligence company trains a model, does it make unauthorized copies on the way in. A Second Circuit panel hearing a related publisher appeal in March 2026 sounded open to the idea that copying itself can be the injury, with Judge Richard Wesley saying reproduction is an “age-old” core of copyright law. (bloomberglaw.com)

That is why witness preparation suddenly matters so much. If a company cannot clearly explain how a training set was assembled, how a retrieval system surfaces text, or why a dataset was deleted, a judge can treat the gap as a discovery failure instead of a mere public-relations problem. (justia.com, hollywoodreporter.com)

That pressure is already spreading beyond one deposition. In late 2025, OpenAI lost a discovery fight over internal communications tied to the deletion of two book datasets known as “books1” and “books2,” and the court allowed probing into the company’s reasons for deleting them because that could bear on willful infringement and evidence preservation. (hollywoodreporter.com)

The product implication is blunt: engineering choices now need a paper trail. Data retention, source attribution, logging, model-evaluation records, and decisions about whether a system quotes, summarizes, or retrieves source text can all become exhibits if a case reaches discovery. (justia.com, bloomberglaw.com)

That changes what “best” design means inside an artificial intelligence company. A feature that is fast, cheap, and accurate in a benchmark may still be risky if nobody can later explain where the inputs came from, which copies were made, and what safeguards existed when copyrighted material entered the system. (bloomberglaw.com, justia.com)

The OpenAI cases are starting to look less like a narrow fight over one chatbot and more like a stress test for the whole industry’s memory. In court, “we think that is how it worked” is turning out to be a much weaker answer than a dated log, a preserved dataset record, or a witness who can name the people in the room. (justia.com, nydailynews.com)
