Power

Judge rules Anthropic training on books it purchased was “fair use,” but not for the ones it stole

Anthropic still faces litigation for training its models on millions of pirated texts.

When AI companies like OpenAI, Anthropic, and Meta were racing to build and train new large language models, they scrambled to find enough text to train their systems on. Countless web pages, photos, YouTube videos, Disney movies, Reddit threads, and book texts were slurped up to feed the models billions and billions of tokens.

Resulting litigation initiated by copyright holders has shown that the legality of the process was on the minds of some AI company employees, like researchers at Meta who raised concerns while training its Llama model, only to be told that the use of LibGen, a corpus of pirated texts, was approved by “MZ.”

But yesterday, a court decided a case partially in favor of AI companies, with far-reaching consequences for all the companies that were sucking copyrighted material into their models.

A federal judge in the Northern District of California has ruled that Anthropic was not violating the copyright of authors of the books it purchased and scanned for training.

A group of authors filed the suit against Anthropic last August, alleging that Anthropic had acknowledged training its Claude AI model using “The Pile,” a mass of text shared online that contained millions of copyrighted works, including some written by the plaintiffs.

Judge William Alsup determined that the process of buying, scanning, and ingesting the text for use in training the Claude model was “exceedingly transformative and was a fair use under Section 107 of the Copyright Act,” a key test of the fair use doctrine in intellectual property law.

But what about the “over seven million copies of books” that Anthropic admitted were pirated and that it did not pay for? The judge said that was not fair use, and that it warrants its own trial.

Judge Alsup wrote:

“The downloaded pirated copies used to build a central library were not justified by a fair use. Every factor points against fair use. Anthropic employees said copies of works (pirated ones, too) would be retained ‘forever’ for ‘general purpose’ even after Anthropic determined they would never be used for training LLMs. A separate justification was required for each use. None is even offered here except for Anthropic’s pocketbook and convenience.”

The case is the first of its kind to be decided in the US, and lays out a potentially legal way for AI companies to safely train their models using copyrighted works — as long as they purchase them. That said, there are still many other cases pending and many factors at play before the industry has clear rules.

But companies that are caught knowingly using pirated, copyrighted works to train AI models may face new legal exposure.

An Anthropic spokesperson told Sherwood News:

“We are pleased that the Court recognized that using ‘works to train LLMs was transformative — spectacularly so.’ Consistent with copyright’s purpose in enabling creativity and fostering scientific progress, ‘Anthropic’s LLMs trained upon works not to race ahead and replicate or supplant them — but to turn a hard corner and create something different.’”

More Power

Congress votes to end shutdown

The government shutdown, which lasted more than 40 days, came to an end without a guarantee that the ACA tax credits will be extended.

OpenAI: The New York Times is forcing us to turn over 20 million ChatGPT conversations

A judge in The New York Times’ copyright lawsuit against OpenAI (and Microsoft) has ordered that the ChatGPT maker hand over the conversations of 20 million users to the Times’ lawyers, in an effort to find examples of copyright violations.

Today, OpenAI is lobbying the public in a last-ditch effort to prevent the release, which is due Friday:

“The New York Times is demanding that we turn over 20 million of your private ChatGPT conversations. They claim they might find examples of you using ChatGPT to try to get around their paywall. This demand disregards long-standing privacy protections, breaks with common-sense security practices, and would force us to turn over tens of millions of highly personal conversations from people who have no connection to the Times’ baseless lawsuit against OpenAI.”

If the company’s final appeals to the court do not succeed, OpenAI explains that it will de-identify the chat logs and scrub any personally identifying information from the chats, and that technical experts hired by The New York Times’ legal team will be the only ones who can examine the data, which will be tightly controlled.

Big four airlines sink as Transportation Secretary Duffy says parts of US airspace could close if shutdown continues

The US may close parts of its airspace as early as next week if the government shutdown continues, according to comments made by Transportation Secretary Sean Duffy on Tuesday.

“If you bring us to a week from today, Democrats, you will see mass chaos. You will see mass flight delays. You’ll see mass cancellations, and you may see us close certain parts of the airspace, because we just cannot manage it,” Duffy said at a news briefing on Tuesday.

The shutdown, which entered its 35th day on Tuesday, has fueled already problematic shortages of air traffic controllers. This week, airlines said 3.2 million passengers have faced delays or cancellations because of the shortages. Last week, about 13,000 air traffic controllers and 50,000 TSA agents received their first $0 paycheck amid the shutdown.

Shares of the big four US airlines all sank on Duffy’s comments, with United Airlines, American Airlines, and Delta Air Lines all down more than 5%.

Jon Keegan

Trump’s deal offering top Nvidia chips to China was nixed at last minute, the WSJ reports

Nvidia’s CEO, Jensen Huang, really wants to sell the chipmaker’s most powerful Blackwell GPUs to China. He almost had his way.

According to a report from The Wall Street Journal, President Trump was ready to put Blackwell chips on the negotiating table for his meeting with Chinese President Xi to seek relief from China’s decision to block crucial rare earth exports to the US.

But according to the report, Trump advisers presented a unified front and were able to dissuade him from giving up the most powerful chips to China at the last minute. Secretary of State Marco Rubio, Commerce Secretary Howard Lutnick, and US Trade Representative Jamieson Greer were among those opposed to the chip deal. After the meeting, Trump said he did not talk with Xi about Nvidia’s “super duper” chips.

Reportedly, those opposed to the deal cited national security concerns, as well as a desire to keep a competitive edge as China seeks to challenge the US’s current dominance of the AI industry.

Sherwood Media, LLC produces fresh and unique perspectives on topical financial news and is a fully owned subsidiary of Robinhood Markets, Inc., and any views expressed here do not necessarily reflect the views of any other Robinhood affiliate, including Robinhood Markets, Inc., Robinhood Financial LLC, Robinhood Securities, LLC, Robinhood Crypto, LLC, or Robinhood Money, LLC.