> But to make anyone pay specifically for the use of a book each time they read it, each time they recall it from memory, each time they later draw upon it when writing new things in new ways would be unthinkable.
This feels like an unwarranted anthropomorphization of what LLMs are doing.
I feel like the fundamental issue, and the thing people really have a problem with, is that the speed and scale at which LLMs operate completely break the use cases for which fair use was originally envisioned. IMO, existing copyright law is just wholly unsuited to deal with the consequences of AI.
That is, I don't think anyone (especially on this website) would have a problem if someone read a ton of books and then opened a website where you could chat with them and ask them questions about the books. But if this person had "super abilities", where they could read every book that ever existed, respond almost instantly to questions about any book they had read, and answer millions of questions simultaneously, I think that "fair use" as it exists now would never have come into existence - it completely breaks the economic model that copyright was supposed to incentivize in the first place. I'm not arguing which position is right or wrong, but I am arguing that "if a human did it, it would be fair use" is a very bad analogy.
As a similar example: in the US, courts have regularly held that people walking around outside don't have an expectation of privacy. But what if computers could record you, upload the footage to a website, and run facial recognition so that anyone else in the world could set an alert to be notified whenever you appeared on a certain camera? The original logic that fed into the "no expectation of privacy when in public" rulings breaks down solely due to the speed and scale with which computers can operate.
Like corporations, the machines will be human for purposes of rights and abstract, ephemeral entities for purposes of responsibility.
I'm unsure if this is true. I'm far from an expert in the current legal framework, but so far the court cases regarding liability in autonomous vehicle crashes have held humans responsible. That may change as driverless vehicles reach higher levels of automation, but in my understanding the jury is still out.
I don't see why it would be different for LLMs.
Not a lawyer, but how do you think the law would react if I sold a computer for authors that came pre-installed with PDFs of pirated books as 'reference' material for aspiring authors to look at, without permission from the publishers?
The issue is the recall LLMs have over copyrighted content.
That's not a bad analogy. I like that it makes clear that the storage mechanism isn't relevant.
Personally, my read is that the issue with most of these cases is that we are treating and talking about LLMs as if they do things that humans do. They don't. They don't reason. They don't think. They don't know. They just map input to probabilistic output. LLMs are a tool like any other for more easily achieving some outcome.
It's precisely because we insist on treating LLMs as if they are more than an inefficient storage device (with a neat/useful trick) that we run into questions like this. I personally think the illegal status of current models should be pretty clear simply based on the pirated nature of their input material. To my understanding, fair use has never before applied to works that were obtained illegally.
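To make the "map input to probabilistic output" point concrete, here is a minimal sketch in Python. The distribution is hand-written and every token and probability is made up for illustration; a real LLM derives this mapping from billions of learned parameters, but the interface is the same: context in, distribution over next tokens out.

    import random

    # Hypothetical "learned" distribution: context -> {next token: probability}.
    # In a real model this table is implicit in the weights, not stored literally.
    NEXT_TOKEN_PROBS = {
        ("the", "cat"): {"sat": 0.6, "ran": 0.3, "slept": 0.1},
        ("cat", "sat"): {"on": 0.8, "quietly": 0.2},
    }

    def sample_next(context):
        # Sample the next token in proportion to its probability.
        probs = NEXT_TOKEN_PROBS[context]
        tokens, weights = zip(*probs.items())
        return random.choices(tokens, weights=weights)[0]

    print(sample_next(("the", "cat")))  # prints "sat" about 60% of the time

If a passage appeared often enough in the training data, the highest-probability continuation can be the passage itself, which is one way the "recall over copyrighted content" concern upthread arises.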
Indeed, we don't charge humans for breathing, but we do attempt to discourage CO2 emissions for machines. These are completely different things on a completely different scale.
Misanthropic has convinced this particular judge, but there are many others, especially in other countries.
DDoSes are illegal attacks due to the speed and scale of the automation.
If a website gets organically DoSed by Slashdot, that is not an illegal attack.
LLMs 'reading' a book is not the same as a human reading a book, in the exact same way that following a very popular link is not participating in a DDoS.
No, DDoSes are illegal (and the Slashdot effect is legal) due to intent. That is usually the most important distinction in criminal matters.
> Then, its service providers stripped the books from their bindings, cut their pages to size, and scanned the books into digital form — discarding the paper originals.
This is basically the plot to Vinge's Rainbows End, AI and all.
Is that the one where the books were shredded and the shreds scanned, being reconstructed from the shreds as part of the process?
Yes
I wonder how this is going to affect the Disney+Universal vs OpenAI trial.
OpenAI is being sued by Disney? Do you mean Disney+Universal vs Midjourney?
Stealing? Which book store did they burgle? Was it a publisher's warehouse?