Elon Musk’s AI War: New AI rankings spark power shift in chatbot world

Contents

Musk’s “smartest AI in the world” just came third.
What is Grok – and why is Elon groaning?
Is the leaderboard legit – or just a vibe-fest?
What are the real champs doing differently?

AI showdown: the new global rankings are in.

Grok logo displayed on a smartphone with Elon Musk and Xai seen in the background.

Credit: Lucia Fdez, Shutterstock

Elon Musk called Grok 4 the smartest AI alive – but the global rankings just dropped, and the real winner might surprise you. Who rules the bots?

Musk claimed it was brainier than grad students – but the scoreboard tells a different tale.

He called it a genius. The scoreboard called it average. Elon Musk’s shiny new AI bot, Grok 4, just got schooled in front of the entire tech world – and the result is more Oppenheimer than Iron Man.

Fresh off declaring Grok 4 ‘smarter than almost all graduate students in all disciplines,’ Musk is now facing a brutal dose of reality. The UC Berkeley Chatbot Arena – basically the Premier League of AI smarts – just dropped its latest rankings. And guess what? Grok didn’t even make the top two.

Musk’s “smartest AI in the world” just came third.

Topping the table was Google’s Gemini 2.5, followed by OpenAI’s GPT-4o and GPT-4.5. Grok 4 limped in tied for third – which would count as a very decent effort had your PR team not already plastered ‘world’s smartest AI’ all over social media.

Let’s be honest – bronze isn’t bad, and it’s a work in progress. But when you’ve been telling everyone your robot could outthink Oxford, finishing third behind the usual suspects stings just a bit.

What is Grok – and why is Elon groaning?

Grok is Musk’s answer to ChatGPT – an edgy, opinionated chatbot cooked up by his AI startup, xAI. It lives inside X (formerly Twitter), and was pitched as a free-thinking, free-speaking, fearless alternative to the supposedly “woke” competition.

But it’s had a rocky start. Not long ago, Grok was caught spewing antisemitic and racist content when prompted – behaviour that had even Musk fans wondering if this thing had a screw loose. Others see the episode as a blatant media trick: bait an AI into saying mean things, then publish negative press about Musk and his companies.

It didn’t stop the Pentagon, mind you – the US Department of Defense reportedly awarded xAI a contract worth up to $200 million.

Is the leaderboard legit – or just a vibe-fest?

Some experts are questioning the scoreboard itself. According to a damning report by researchers at Cohere, the Chatbot Arena has some dodgy practices behind the scenes, like private pre-testing, score deletions, and even model swaps before rankings go public.

Meta was caught doing just that – entering a special, non-public version of its Llama 4 model in the competition. It’s the AI equivalent of showing up to a job interview with a twin who’s actually qualified.

So if the system’s flawed, does Grok’s bronze even mean anything? It depends on who you ask. But even in this chaotic competition, the best models keep rising to the top – and Grok’s still trailing.

What are the real champs doing differently?

Google’s Gemini 2.5 is no slouch. It handles text, images, code, and more – and it’s been trained to reason like a scientist, not just repeat internet fluff. OpenAI’s GPT-4o is famous for smooth, human-like dialogue, while GPT-4.5 packs some of the sharpest problem-solving skills seen in any model to date.

Grok, in contrast, has focused more on attitude than academics, and it shows.

Musk made bold claims. But once again, the reality came up short. Or so it appears.

