
Frontier AI Models Solve an Open Math Problem That Stumped Humans for Years

Michael Ouroumis · 3 min read

AI systems have solved an open mathematics problem that had stumped human researchers since 2019.

Epoch AI reported that GPT-5.4 Pro became the first model to clear FrontierMath's open-problem track, solving a conjecture on Ramsey hypergraphs that the original authors had been unable to resolve. Gemini 3.1 Pro and Claude Opus 4.6 subsequently also solved it.

The distinction matters: these are not problems where the solution exists and AI found it faster. These are problems that were genuinely unsolved — by the humans who created them — when the models encountered them.

What Ramsey Hypergraphs Are

Ramsey theory is a branch of combinatorics concerned with conditions under which order must appear in structures that seem chaotic. Hypergraph problems in this space involve understanding how colors, connections, or patterns must emerge across high-dimensional graph structures once certain size or density thresholds are crossed.
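The flavor of Ramsey theory is easiest to see in its classic finite case: color the edges of a complete graph on six vertices with two colors however you like, and a single-color triangle must appear, while on five vertices it can be avoided (the Ramsey number R(3,3) = 6). The brute-force check below is purely illustrative; the 2019 hypergraph conjecture operates at a scale and level of abstraction far beyond anything this kind of enumeration can touch.

```python
from itertools import combinations, product

def has_mono_triangle(n, coloring):
    """True if some triangle has all three edges the same color."""
    for tri in combinations(range(n), 3):
        colors = [coloring[frozenset(pair)] for pair in combinations(tri, 2)]
        if colors[0] == colors[1] == colors[2]:
            return True
    return False

def all_colorings_have_mono_triangle(n):
    """Check every 2-coloring of the complete graph on n vertices."""
    edges = [frozenset(e) for e in combinations(range(n), 2)]
    for assignment in product([0, 1], repeat=len(edges)):
        if not has_mono_triangle(n, dict(zip(edges, assignment))):
            return False  # found a coloring with no monochromatic triangle
    return True

# On 6 vertices, order is forced; on 5 it is not.
print(all_colorings_have_mono_triangle(6))  # True
print(all_colorings_have_mono_triangle(5))  # False
```

The 6-vertex case enumerates all 2^15 = 32,768 edge colorings in well under a second; for hypergraphs the analogous search space explodes, which is why such problems resist both computers and humans.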

The specific 2019 conjecture that AI solved involved predicting the existence or properties of certain Ramsey configurations in hypergraphs. The original authors could not find a proof. Neither could subsequent researchers. FrontierMath — a benchmark specifically designed to contain problems beyond current human solving capacity — had listed it as an open problem.

GPT-5.4 Pro produced a valid solution.

The IQ Trajectory That Makes This Less Surprising

Epoch AI's announcement landed in the same week that researcher Charbel-Raphael Ségerie published a striking data point: in March 2023, Claude had an estimated IQ equivalent of approximately 64 on standardized reasoning tests. Today, Claude Opus 4.6 scores 133 on the Mensa Norway test. GPT-5.2 Thinking scores 141. Gemini 3 Pro reaches 142.

That's a jump from cognitively impaired to gifted in approximately three years. No human population in recorded history has ever improved that fast on standardized cognitive assessments.

The Ramsey hypergraph result fits this trajectory. Models aren't just getting better at producing fluent text — they're getting better at mathematical reasoning, at decomposing novel problems into tractable subproblems, and at generating and verifying proofs. The same week that Claude proved it can do original theoretical physics research, another cluster of frontier models proved they can extend human mathematics.

What FrontierMath Is

FrontierMath is a benchmark developed specifically to stay ahead of AI capability. Standard math benchmarks like MATH and GSM8K were saturated — models were scoring at or near 100% — and stopped measuring meaningful differences between frontier systems.

FrontierMath collects problems from working mathematicians, many of which involve research-level difficulty or genuinely open questions. The open-problem track is its most extreme tier: problems listed there have no known human solution at the time they're added.

The fact that frontier models have now cleared this track doesn't mean AI has solved mathematics. It means the tier of problems that can serve as a meaningful test of frontier AI capability has moved again, further into territory that was previously considered uniquely human.

Implications

The practical significance of AI solving open math problems is still being worked out. Mathematical research doesn't produce immediate products, but it underpins fields ranging from cryptography and materials science to fundamental physics. A system that can advance mathematics could, in principle, accelerate progress across all of those areas.

More immediately, the result updates the timeline for when AI might be considered a genuine research collaborator in formal domains. The cautious view — that AI systems are good at pattern matching but can't do real mathematical reasoning — has become harder to hold. What happened with the Ramsey hypergraph conjecture is closer to genuine mathematical discovery than anything AI systems had previously demonstrated.

