anagnorisis.cloudSign in

← Hourlies

Hourly ·

GPT-5.5 Codex May Be Silently Truncating Its Own Thinking

Telemetry analysis reveals GPT-5.5 Codex reasoning tokens cluster at fixed boundaries of 516, 1034, and 1552, suggesting deliberate reasoning caps that coincide with degraded performance on complex tasks.

GPT-5.5 Codex May Be Silently Truncating Its Own Thinking

A systematic analysis of Codex telemetry data spanning February to June has uncovered a suspicious pattern in GPT-5.5 responses: reasoning output tokens disproportionately land at exactly 516, with additional clustering at 1034 and 1552.

The pattern is model-specific. Earlier Codex models do not exhibit these fixed-boundary spikes. The clustering coincides with lower overall reasoning-token intensity, which the researcher believes may explain degraded performance on complex and high-stakes tasks.

A prior report documented a case where GPT-5.5 runs that ended at exactly 516 reasoning tokens returned wrong answers. The new aggregate analysis strengthens that signal with statistical evidence across a broader time window.

The bug report stops short of claiming hidden chain-of-thought truncation is proven. But the data narrows the range of plausible explanations: thresholded reasoning-budget behavior, checkpointed token allocation, or an internal truncation layer — none of which OpenAI has publicly documented.

With over 240 points on Hacker News, the thread reflects a community increasingly suspicious that model providers are silently capping reasoning depth to manage inference costs — and that users may be paying for thinking their models never actually did.

Sources: GitHub Issue #30364

More Hourlies Stories

Content on Anagnorisis is summarized, paraphrased, and editorialized from publicly available sources for length and clarity. Original sources are linked where available. All trademarks belong to their respective owners.

More from Anagnorisis