OpenAI Copyright Lawsuit: Britannica Sues Over AI Training

Encyclopedia Britannica and Merriam-Webster have sued OpenAI for copyright infringement, claiming that nearly 100,000 articles were copied to train ChatGPT.

This article was produced with AI assistance and reviewed by a named GenZ NewZ editor before publication.

Encyclopedia Britannica and its subsidiary Merriam-Webster have sued OpenAI, alleging that the company unlawfully copied nearly 100,000 of their articles to train the models behind ChatGPT. The lawsuit, filed in Manhattan federal court on Friday, marks another major escalation in the ongoing battle between copyright holders and AI companies over the use of content without permission.

According to the complaint, OpenAI used Britannica's online articles, along with its encyclopedia and dictionary entries, to teach ChatGPT to respond to human prompts. The lawsuit claims that GPT-4 has "memorized" much of Britannica's copyrighted content and will output near-verbatim copies of significant portions on demand, as reported by The Verge. This memorization, Britannica argues, constitutes direct copyright infringement rather than fair use, and sets a dangerous precedent for intellectual property rights in the AI era.

The Legal Claims Against OpenAI

Britannica's complaint alleges multiple violations, chief among them the systematic, unauthorized copying of nearly 100,000 of its articles to train GPT large language models. Beyond the initial training, Britannica claims that OpenAI has been actively "cannibalizing" its web traffic by generating AI summaries of its content that substitute for or directly compete with Britannica's own offerings, destroying the publisher's ability to monetize its own work.

Instead of directing users to Britannica's website the way a traditional search engine would, ChatGPT provides complete answers that eliminate the need to visit the original source. This behavior, Britannica argues, goes beyond fair use and constitutes commercial exploitation of its intellectual property. The lawsuit also accuses OpenAI of trademark infringement in cases where ChatGPT hallucinates false information and attributes it to Britannica or Merriam-Webster, according to coverage from Bloomberg Law.

A Growing Legal Battle Against AI Companies

Britannica's lawsuit mirrors similar legal challenges that OpenAI faces from other content creators. The New York Times has made comparable claims in its ongoing lawsuit against OpenAI, accusing the company of copying large amounts of its copyrighted journalism, according to Reuters. The Times alleges that OpenAI copied its content without permission and used it to create competing products that threaten the newspaper's business model and revenue streams.

Britannica filed a related lawsuit against Perplexity AI last year that is still ongoing, and that case could provide guidance for the new suit against OpenAI. The Perplexity case has already survived initial motions and is proceeding to discovery, which suggests that courts are willing to entertain these copyright claims against AI companies. Both cases raise important questions about the boundaries of fair use in the age of artificial intelligence and whether AI companies can continue to scrape the open web for training data.

This latest legal action highlights the growing tension between AI companies and traditional content creators. As large language models become more sophisticated, the question of what constitutes fair use in AI training becomes increasingly complex. The outcome of these cases could fundamentally reshape how AI companies approach training data and how they generate responses to user queries.

The Britannica suit is one of many high-stakes lawsuits filed by copyright owners, including authors, news outlets, and now reference publishers, against tech companies for using their material to train AI systems without permission. Other notable cases include lawsuits from prominent authors and creative professionals who claim their works were used without compensation or consent to train AI models that now compete with human creators in the marketplace.

The technology industry will likely watch the case closely, as it could set legal precedents affecting the future of AI development. If Britannica prevails, AI companies may be required to license content or develop alternative training methods that don't rely on copyrighted material. The lawsuit represents a pivotal moment in defining the boundaries of AI training and intellectual property rights in the digital age.
