
Grok 4 Falls Hard To OpenAI’s o3 In Finals

Sun 10 Aug 2025 ▪ 4 min read ▪ by Luc Jose A.



On the chessboard, two visions of AI faced each other. Sam Altman, head of OpenAI, and Elon Musk, founder of xAI, pitted their models against each other in a chess tournament organized by Google. For three days, OpenAI’s o3 and xAI’s Grok 4 competed without any specialized assistance. Much more than a simple exhibition match, the event proved revealing: the final score exposed the real gap between two artificial intelligences and two strategies.

An OpenAI robot capturing Grok’s queen from xAI in a chess match between AI models.

In brief

Sam Altman and Elon Musk faced off via AI in a chess tournament organized by Google.
OpenAI’s o3 and xAI’s Grok 4 played without a chess engine or specialized training.
OpenAI largely dominated the match, winning the final 4–0.
The tournament revealed the current limits of generalist AIs facing strict rules.

A one-sided final

In the final, held on August 7, OpenAI’s o3 model inflicted a decisive 4–0 defeat on xAI’s Grok 4. Only a few days earlier, o3 had been criticized and passed over in favor of GPT-5, which is itself already proving disappointing.

The tournament, called the “Kaggle Game Arena AI Chess Exhibition”, prohibited any use of a chess engine or dedicated training, leaving the models to rely on the general knowledge they had gathered from the Internet.

From the first games, the difference was noticeable. Magnus Carlsen, world number one, former classical world champion and commentator for the event, compared the two AIs to “a gifted child who doesn’t know how the pieces move”, estimating their level at around 800 Elo, far from competitive standards.
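
For a rough sense of what an 800 Elo estimate implies, the standard Elo expected-score formula can be sketched as follows. The function name `expected_score` and the comparison ratings of 1500 and 2800 are illustrative assumptions, not figures from the tournament; only the 800 estimate comes from the commentary quoted above.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score (0 to 1) for player A against player B
    under the standard Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


if __name__ == "__main__":
    # 800 is the estimate quoted in the article.
    # 1500 and 2800 are hypothetical comparison ratings chosen for illustration.
    for opponent in (800, 1500, 2800):
        score = expected_score(800, opponent)
        print(f"800 vs {opponent}: expected score {score:.6f}")
```

Under this model, every 400-point gap multiplies the odds against the weaker player by ten, which is why a player near 800 would be expected to score almost nothing against strong competitive opposition.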

Grok’s blunders multiplied throughout the final:

Important pieces given away for free, handing OpenAI an immediate material advantage;
A failed “poisoned pawn”, with a poorly chosen target that immediately cost Grok its queen;
Solid middlegame positions wasted through a series of incoherent moves;
Poor handling of its early advantage in the fourth game, which allowed o3 to turn the situation around.

Hikaru Nakamura, grandmaster and streamer covering the event, summed up the difference between the two opponents: “OpenAI didn’t make the mistakes Grok did”. He also praised o3’s spectacular comeback in the last game, where its bad start had pointed to a possible win for Elon Musk’s xAI.


The limits of generalist AIs exposed

Beyond the score, the tournament revealed the structural difficulties generalist AIs face when confronted with a strict framework like chess. Many models were disqualified in the preliminary phase after attempting impossible actions: teleporting pieces, bringing captured pieces back onto the board, or making illegal pawn moves.

Even in the final, understanding of the rules seemed fragile, alternating between brilliant moves and absurd decisions. As Carlsen pointed out, “these AIs know how to count captured pieces, but not how to conclude a winning game”.

This observation is not isolated. Earlier this year, international master Levy Rozman organized a similar tournament in which language models piled up illegal moves, even summoning pieces that were no longer on the board. Stockfish, a dedicated chess engine, won the event hands down. These episodes show that, despite the promises and claims made about their versatility, language models remain far from mastering tasks that require coherence and procedural rigor.

For Elon Musk, this defeat to Sam Altman, his second loss in direct competition this year, comes at a bad time: xAI has just raised $10 billion and is seeking to position itself as a credible player in the race for general AI. For the sector as a whole, however, Google’s exhibition is mostly a reminder that current large models excel at natural language processing but are far weaker at strictly applying complex rules. AI may one day rival the best chess players, but by then it will also have proven capable of reasoning well beyond black and white squares.


Luc Jose A.

A graduate of Sciences Po Toulouse and holder of a blockchain consultant certification issued by Alyra, I joined the Cointribune adventure in 2019. Convinced of blockchain's potential to transform many sectors of the economy, I have committed to raising awareness and informing the general public about this constantly evolving ecosystem. My goal is to help everyone better understand blockchain and seize the opportunities it offers. Every day I strive to provide an objective analysis of the news, decode market trends, relay the latest technological innovations, and put into perspective the economic and societal stakes of this ongoing revolution.

DISCLAIMER

The views, thoughts, and opinions expressed in this article belong solely to the author, and should not be taken as investment advice. Do your own research before taking any investment decisions.