Elon Musk’s AI War: New AI Rankings Spark Power Shift In Chatbot World

AI showdown: the new world rankings are in. Grok logo displayed on a smartphone with Elon Musk and xAI seen in the background. Credit: Lucia Fdez, Shutterstock

Elon Musk called Grok 4 the smartest AI alive – but the world rankings just dropped, and the real winner might surprise you. Who rules the bots?

Musk claimed it was brainier than grad students – but the scoreboard tells a different tale.

He called it a genius. The scoreboard called it average. Elon Musk’s shiny new AI bot, Grok 4, just got schooled in front of the whole tech world – and the result is more Oppenheimer than Iron Man.

Fresh off declaring Grok 4 ‘smarter than almost all graduate students in all disciplines,’ Musk is now facing a brutal dose of reality. The UC Berkeley Chatbot Arena – basically the Premier League of AI smarts – just dropped its latest rankings. And guess what? Grok didn’t even make the top two.

Musk’s “smartest AI in the world” just came third.

Topping the table was Google’s Gemini 2.5, followed by OpenAI’s GPT-4o and GPT-4.5. Grok 4 limped in tied for third – a very decent effort if your PR team hadn’t already plastered ‘world’s smartest AI’ all over social media.

Let’s be honest – bronze isn’t bad, and it’s a work in progress. But when you’ve been telling everyone your robot could outthink Oxford, finishing third behind the usual suspects stings just a bit.

What is Grok – and why is Elon groaning?

Grok is Musk’s answer to ChatGPT – an edgy, opinionated chatbot cooked up by his AI startup, xAI. It lives inside X (formerly Twitter), and was billed as a free-thinking, free-speaking, fearless alternative to the supposedly “woke” competition.

But it’s had a rocky start. Not long ago, Grok was caught spewing antisemitic and racist content when prompted – behaviour that had even Musk fans wondering if this thing had a screw loose. Others see it as a blatant media trick: baiting an AI into saying mean things so you can score negative press about Musk and his companies.

It didn’t stop the Pentagon, mind you – they reportedly pumped $200 million into Grok’s development.

Is the leaderboard legit – or just a vibe-fest?

Some experts are questioning the scoreboard itself. According to a damning study by researchers at Cohere, the Chatbot Arena has some dodgy practices behind the scenes, like private pre-testing, score deletions, and even model swaps before rankings go public.

Meta was caught doing just that – sending a secret version of its LLaMA 4 model to compete. It’s the AI equivalent of showing up to a job interview with a double who’s actually qualified.

So if the system’s flawed, does Grok’s bronze even mean anything? It depends on who you ask. But even in this wild contest, the best models keep rising to the top – and Grok’s still trailing.

What are the real champs doing differently?

Google’s Gemini 2.5 is no slouch. It handles text, images, code, and more – and it’s been trained to reason like a scientist, not just repeat internet fluff. OpenAI’s GPT-4o is famous for smooth, human-like dialogue, while GPT-4.5 packs some of the sharpest problem-solving skills seen in any model to date.

Grok, in contrast, has focused more on attitude than academics, and it shows.

Musk made bold claims. But once again, the reality came up short. Or so it appears.

Want more AI drama, tech tantrums, and brainy bots behaving badly? Stay tuned to Euro Weekly News Tech.
