A Leaky-Sieve Reasoning System Just Started Doing Real Math

What really sent a chill down my spine in the AI world these past few days wasn't a funding round. It wasn't a product launch.

It was this:

An unreleased general-purpose reasoning model from OpenAI reportedly solved the Erdős planar unit distance problem, first posed in 1946.

A decades-old problem that mathematicians couldn't crack.

What shook me wasn't that "it solved it."

It was how it solved it.

The full chain of thought reportedly runs to 125 pages when printed.

Not the kind of genius epiphany you see in movies.

Quite the opposite.

All trial and error. Dead ends. Backtracking. Repeated reversals and detours.

Like a mentally shattered grad student grinding away in a mountain of scratch paper.

But here's the thing:

It actually found the door in the end.

And what's most interesting:

This wasn't a math-specific model.

It was a general-purpose reasoning model.

Most people haven't grasped what this means yet.

Over the past few years, there's been a loud contrarian voice in the AI world.

The most prominent being the LeCun camp.

Their long-held position:

LLMs don't do real reasoning. Just statistical language modeling.

Then, as reasoning capabilities visibly improved, they doubled down:

These so-called reasoning outputs are merely crude imitations of human reasoning.

They're not entirely wrong.

Today's large-model reasoning does resemble a sieve riddled with holes.

Wild speculation. Frequent wrong turns. Logic that collapses mid-stream.

But the LeCun camp may have underestimated one thing:

Intelligence doesn't always require "perfect reasoning."

With enough scale, sufficiently broad search, and strong self-reflection and course-correction,

a kind of "rough but effective" intelligence can suddenly emerge.

And mathematics and programming happen to be where this breaks through first.

Because they share one crucial property:

Verifiability.

You can flail. Generate wildly. Even "guess blindly."

But in the end, the verifier tells you:

Right.

Or wrong.
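
To make that loop concrete, here is a toy sketch in Python. It is only an illustration of "guess blindly, let the verifier decide," not a description of how any real reasoning model works. The names `verify` and `guess_and_check`, the seed, and the target number 8051 are all made up for this example.

```python
import random

def verify(n: int, d: int) -> bool:
    """The verifier: a cheap, exact check that d is a nontrivial divisor of n."""
    return 1 < d < n and n % d == 0

def guess_and_check(n: int, max_tries: int = 100_000, seed: int = 0):
    """The 'leaky' generator: propose uninformed guesses until one passes.

    Returns (accepted_candidate, attempts_used), or (None, max_tries)
    if nothing passed within the budget.
    """
    rng = random.Random(seed)  # fixed seed only so the run is repeatable
    for attempt in range(1, max_tries + 1):
        candidate = rng.randrange(2, n)  # a blind guess, no insight at all
        if verify(n, candidate):         # the verifier says: right
            return candidate, attempt
        # otherwise the verifier says: wrong, and we simply try again
    return None, max_tries

# Hypothetical target for the demo: 8051 = 83 * 97
factor, attempts = guess_and_check(8051)
print(f"accepted {factor} after {attempts} guesses")
```

Almost every guess here is wrong, and none of that matters: only the one the verifier accepts survives. The generator never needs to "understand" factoring for the final answer to be correct.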

And so for the first time, AI enters a deeply unsettling state:

It may not actually "understand the world,"

yet it can already conduct effective exploration near the frontier of human knowledge.

That alone is staggering.

Now, LeCun isn't entirely without a point.

He says:

The real world isn't like mathematics.

In the real world, many problems cannot be verified, cannot be exhaustively enumerated, and cannot be formalized in language.

I agree with that.

But the problem is:

His critique of LLMs and reasoning is far too absolute.

Especially now, after mainstream approaches have broken through time and again,

his "tear it all down to build anew" contrarian stance feels increasingly out of touch.

More critically:

The "bypass language, prioritize visual world models" approach he's championed for years

has yet to produce any truly industry-shaking result.

At least not so far.

So the real question today is no longer:

"Do LLMs have genuine intelligence?"

It's a more dangerous one:

If a reasoning system "as leaky as a sieve" is already contributing to real mathematical discoveries,

then with more scaling...

Isn't that superintelligence?

by Tuya

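Published by

立委

Dr. 立委 is a consultant on multimodal large-model applications. He was formerly VP of Engineering for the large-model team at Mobvoi (出门问问), focusing on large models and their AIGC applications. Before that he spent 10 years as Chief Scientist at Netbase, where he directed the development of language understanding and application systems for 18 languages. Those systems were robust, ran at linear speed, scaled up to social media big data, and grounded their semantics in public-opinion mining products, making the company a front-runner in industrial NLP deployment in the US. Earlier, during eight years as VP of R&D at Cymfony, he won first place in the first question-answering evaluation (TREC-8 QA Track) and won 17 Small Business Innovation Research projects in information extraction (PI for 17 SBIRs).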
