Grammarly Faces Class Action Over AI Model Training Data
Journalist Julia Angwin has initiated a federal class action lawsuit against Grammarly, Inc., accusing the grammar-checking giant of systematically using copyrighted works without consent to build its AI-powered editing tools. The suit alleges violations of privacy, publicity rights, and copyright law, and seeks damages for an entire class of affected authors.
The lawsuit, filed on March 12, 2026, in the United States District Court for the Northern District of California, names Grammarly, Inc. as the sole defendant. The lead plaintiff is Julia Angwin, a renowned investigative journalist, co-founder of The Markup, and author of multiple books including Dragnet Nation. The suit alleges that Grammarly trained its artificial intelligence models on a vast corpus of copyrighted text scraped from the internet, including Angwin's articles and books, without her permission, compensation, or even notification.
What Happened: The Core Allegations
The complaint asserts that Grammarly's AI, which powers its suggestions for grammar, tone, and clarity, was built on a dataset containing copyrighted material from a wide array of published authors. Angwin's legal team argues this constitutes unlawful misappropriation of her name, likeness, and literary style for commercial gain—a violation of her common-law and statutory publicity rights. The suit further claims Grammarly violated her privacy by creating a 'digital profile' of her writing without consent and infringed on her copyrights.
Fundamentally, the plaintiffs contend they have been transformed into unwitting 'AI editors.' The legal filing states, 'By using Plaintiffs’ and Class members’ copyrighted works to train its AI, Grammarly has effectively turned them into unwilling contributors to and editors of its AI model, appropriating their creative expression, style, and expertise.' The suit seeks to represent all U.S. authors and rights holders whose copyrighted works were used to train Grammarly's models without authorization.
Why This Matters: The Stakes for AI and Creative Labor
This case is not merely a dispute over a single company's practices; it is a direct challenge to the foundational data-gathering methods of the modern AI industry. Most large language models and AI writing tools have been trained on massive datasets scraped from the open web, often with little regard for copyright or creator consent, under controversial interpretations of 'fair use.' Angwin's lawsuit tests whether using creative works to train commercial AI models constitutes an unauthorized commercial exploitation of an individual's identity and style.
A victory for the plaintiff class could establish a new legal boundary, forcing AI companies to either obtain licenses for training data, develop rigorous provenance and consent frameworks, or face significant financial liability. It directly impacts the business model of countless AI startups and giants that rely on scraped data. For writers, journalists, and content creators, the case is about economic and creative sovereignty—whether their life's work can be ingested by a machine to create a product that potentially competes with them, all without a say or a share in the profits.
The People and Context: Angwin vs. The AI Status Quo
Julia Angwin is a formidable opponent for the AI industry. Her career at The Wall Street Journal, ProPublica, and The Markup has been defined by holding powerful technology companies accountable for their societal impacts, particularly around privacy and algorithmic bias. Her decision to spearhead this case signals a strategic shift in the backlash against AI, moving from public criticism to targeted, high-stakes litigation.
Grammarly, founded in 2009, has grown from a simple grammar checker into a comprehensive AI writing assistant with millions of users. Its valuation reportedly soared past $13 billion in its last funding round. The company's rise is emblematic of the AI-powered productivity software wave. This lawsuit targets the core of its product's intelligence. The competitive context is intense, with rivals like Microsoft Editor, Google's AI writing tools, and countless startups in the space. A ruling against Grammarly would immediately put all similar services under the legal microscope, potentially triggering a wave of follow-on litigation.
What Happens Next: Legal Pathways and Industry Reckoning
The immediate next steps are procedural. Grammarly will file a motion to dismiss, likely arguing that its data use falls under fair use doctrine and that the claims are preempted by federal copyright law. The court's decision on that motion will be the first major indicator of the case's viability. If it proceeds, the discovery phase will be explosive, potentially forcing Grammarly to disclose the exact contents and sources of its training datasets—information most AI companies guard as a core trade secret.
Beyond this specific suit, the industry will be watching for several signals:
- Legislative Action: This case could spur faster movement on federal AI legislation aimed at clarifying training data rights.
- Licensing Markets: A plaintiff win would catalyze the nascent market for licensed AI training data from publishers and content archives.
- Model Retraining: Companies may be forced to begin the costly and complex process of 'purging' unauthorized works from training datasets or proving their models were not trained on copyrighted material.
Regardless of the outcome, Angwin v. Grammarly has already succeeded in framing the public and legal debate around AI training data not as an abstract copyright issue, but as a tangible matter of personal and economic rights for creators.
Source and attribution
TechCrunch AI
A writer is suing Grammarly for turning her and other authors into 'AI editors' without consent