German court held OpenAI liable for violating copyright in 15-word passage (besides longer ones), rejected non-profit argument: full decision

Earlier today we broke the news on the GEMA v. OpenAI landmark copyright ruling by the Landgericht München I (Munich I Regional Court) and quoted the prevailing plaintiff’s chief lawyer on the significance of the decision and further steps. In this follow-up, we make the full text (in German) of the 65-page decision available and highlight a few items that we found particularly interesting.

First, here’s the public redacted version of the full judgment (in German):

42 O 14139-24 Endurteil

Court deems OpenAI to reproduce and memorize copyrighted works, chides it for negligence

The decision begins with a detailed discussion of how large language models (LLMs) operate. This is likely the most technical copyright judgment ever issued by a German (if not European) court. But all those technical arguments got OpenAI nowhere. At the end of the day, reproduction is reproduction and memorization is memorization. The court explains that both reproduction and memorization are disallowed in the absence of a license.

With respect to what’s lawful and what’s not, the Munich court ruling reads like a formal and technical version of what the world’s most successful lead counsel behind a copyright action, Susman Godfrey’s Justin Nelson, told ai fray last month when he said: “That is theft, it’s pure theft.”

The German court’s condemnation of OpenAI’s conduct became particularly clear when Presiding Judge Dr. Elke Schwager said in the oral explanation of today’s judgment that OpenAI was found guilty of (at minimum) negligence. That finding led the court to deny OpenAI a six-month grace period for making the changes necessary to keep its German service available. The case was filed last year, so OpenAI had plenty of time to take the necessary steps, the court said. And even though there was no direct precedent, OpenAI should have understood that it was at risk of losing this lawsuit.

Low copyrightability hurdle: smallest infringed work is 15-word passage

GEMA pointed to 30 examples of infringement from the lyrics of nine songs. The judgment names several of the infringements identified. The shortest one consists of only 15 words.

By comparison, the headline of this article is 20 words long (plus a numeral that one could also deem to constitute a word).

To be fair, there are significantly longer passages at issue as well. The 15-word passage is an outlier. But it shows that the quantitative hurdle for copyrightability is low in a case where the court identifies certain creative elements concerning the rhythm and the grammatical structure (four parallelisms).

No “rare bug” defense

The judgment refers to a “rare bug” argument by OpenAI, but that one was unavailing. OpenAI argued that its LLM’s answers are not deterministic as they are derived from statistical correlations between tokens. But the examples of infringing output that GEMA provided were sufficient to convince the court that an injunction was warranted.

No non-profit research institute

Elon Musk famously accuses OpenAI of having departed from its non-profit nature. In the GEMA copyright litigation, OpenAI sought to benefit from an exemption that applies to non-profit research institutes. OpenAI conceded that some of its legal entities pursue commercial objectives, but argued that the parent entity had been founded as a non-profit and had remained one.

The court gave that argument short shrift. In order to be eligible for that exemption, OpenAI would have to plead and prove that it reinvests 100% of its profits in research and development or that it is acting in the public interest with a governmentally recognized mandate.

No proportionality defense, much less so when licenses are available and alternatives exist

OpenAI sought to avert an injunction on the basis that such a drastic remedy would be disproportionate. The court noted, however, that right holders or those who obtain an exclusive license from them are entitled to injunctive relief. Infringements must stop, and some hardship inevitably comes with compliance.

The court counted against OpenAI both the availability of a license and the fact that the market offers other LLMs and chatbots if need be.

EU copyright and digital laws envision high level of protection for intellectual property

One of OpenAI’s numerous arguments was that the EU’s legislative institutions didn’t anticipate LLMs when they enacted the laws applicable today, such as the InfoSoc (information society) Directive or the Directive on Copyright in the Digital Single Market. That did not convince the court. EU law broadly refers to “direct or indirect, temporary or permanent reproduction by any means and in any form, in whole or in part.” While the European legislature wanted to ensure that the adoption of new technologies would not be impeded, the court relied on recitals 3(4) and 8 of the DSM Directive:

“while keeping a high level of protection of copyright and related rights”
“Where no exception or limitation applies, an authorisation to undertake such acts is required from rightholders.”

Improbability of accidental reproduction

Germany doesn’t have the equivalent of U.S. pretrial discovery. Instead, plaintiffs have to obtain and present the best evidence they can find, and if it is good enough, the burden shifts to the defendants.

The court rejected OpenAI’s defense that GEMA cannot show exactly where in OpenAI’s server data the models memorize the copyrighted works in question. The reproduced passages were long enough to rule out a coincidence. From a statistical point of view, that is undoubtedly true. Even the shortest infringed work (15 words) would not realistically be generated from scratch. There is a causal chain that starts with the use of training data, and that’s why OpenAI lost.
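To put a rough number on that statistical intuition, here is a back-of-the-envelope sketch. It is our own illustration, not the court's reasoning and not a model of how an LLM actually generates text. The function name and the 1-in-10 per-word figure are hypothetical assumptions, with the per-word chance chosen to be generous, and the calculation treats each word as an independent guess.

```python
from fractions import Fraction


def chance_of_exact_match(num_words: int, per_word_probability: Fraction) -> Fraction:
    """Probability of reproducing one specific passage word for word,
    assuming (simplistically) that each word is matched independently
    with the same probability."""
    return per_word_probability ** num_words


# Hypothetical, deliberately generous assumption: a 1-in-10 chance per word.
p = chance_of_exact_match(15, Fraction(1, 10))
print(f"about 1 in {p.denominator:,}")
```

Even under that generous assumption, the chance of hitting a specific 15-word passage by accident is one in a quadrillion, and the longer passages at issue are more improbable still.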

Free availability of song lyrics on third-party websites

The court also declined to let OpenAI off the hook merely because there are third-party websites where the song lyrics in question can be obtained for free. Someone else’s use, which may be unlicensed, doesn’t legalize memorization and reproduction without permission.

GEMA had made it clear, on the lyricists’ behalf, that the use of the asserted works by LLMs was not generally authorized. Therefore, OpenAI needed a license, not only in GEMA’s opinion but also in the court’s.