News outlets sue Cohere over alleged copyright infringement

This lawsuit continues the ongoing battle between AI companies and news publishers over copyright issues.

More than a dozen top news publishers, including Forbes, Vox, The Guardian and Politico, have filed a joint lawsuit against the Canadian artificial intelligence (AI) company Cohere over allegations of “systematic copyright and trademark infringement”.

The complaint, which was filed yesterday (13 February) at the Southern District Court of New York, claims that Cohere scrapes copies of published articles, trains its AI models using the data and, in turn, uses the outputs to compete with the outlets it ‘stole’ the data from.

According to the plaintiffs’ filing, Cohere’s AI models deliver full verbatim copies and substantial summaries of publishers’ articles, including articles protected by paywalls, through an interface.

The lawsuit also alleges that Cohere’s AI hallucinates and “blatantly manufactures fake pieces” that it represents as coming from the plaintiffs, misleading the public and tarnishing the publishers’ brand reputation.

The Toronto-based AI start-up was founded by Aidan Gomez in 2019 and is valued at more than $5bn. It’s backed by Cisco, AMD and Fujitsu, as well as the Canadian pension investment manager PSP Investments and Canada’s export credit agency EDC.

Gomez has claimed in interviews that Cohere does not scrape data that it shouldn’t scrape. He also told The Verge – one of the plaintiffs in this lawsuit against the company – that “we [the company] don’t want to be training on stuff that people don’t want us training on, full stop”. However, the lawsuit alleges that Gomez’s claims are false.

Responding to the filing, a Cohere spokesperson told news outlets that the company “strongly stands by its practices for responsibly training its enterprise AI”.

“We have long prioritised controls that mitigate the risk of IP [intellectual property] infringement and respect the rights of holders. We would have welcomed a conversation about their specific concerns – and the opportunity to explain our enterprise-focused approach – rather than learning about them in a filing.”

The fresh lawsuit continues the ongoing battle between AI companies and news publishers, who claim that AI models infringe their intellectual property by using their copyrighted content without permission.

Just days ago, Thomson Reuters, a Canadian technology company and the parent company of the Reuters news agency, won a partial summary judgement against AI start-up Ross Intelligence in a copyright infringement lawsuit. The judge presiding over the case overturned his previous summary judgement and disallowed the AI company from using ‘fair use’ as a defence for training models on proprietary data without permission.

However, two news outlets, Raw Story Media and AlterNet Media, lost a copyright lawsuit against OpenAI at the Southern District of New York last year because they were unable to prove that ChatGPT caused them “concrete injury”.

Meanwhile, The New York Times launched a similar legal battle against OpenAI and Microsoft in 2023, which is still ongoing.
