June 23, 2026

Codex CLI is open-source, and every release note links straight to the merged pull request behind each fix. So when OpenAI closes a security-relevant gap, the note ships with the receipt: the maintainer's own account of what was wrong and why it changed. We did not have to guess.
Most security disclosures make you reconstruct what broke. Codex hands you the evidence: update and you move forward; stay pinned below the fix and you keep carrying the old behavior, with no CVE, advisory, or bulletin to tell you. We read seven weeks of stable releases and pulled the PRs behind every security-relevant line - the maintainers already wrote the disclosure for us.
This is not a "Codex is uniquely unsafe" post. Every fix below is OpenAI doing the right thing: finding a trust-boundary weakness and closing it. The point is what happens to the fleet that never updated, and what its own release notes already admit.
An agent's version is part of its control plane
A Codex CLI install is far more than a model client. It wires model reasoning into local files and shells (behind an approval layer), a sandbox with a configurable network egress policy, credentials at rest, MCP servers, plugins, hooks, app connectors, and local WebSocket listeners.
Every one of those is a boundary. When a release note says a boundary changed, the pinned version keeps the old boundary. That is the entire exposure, in one sentence.
The security fixes (the signals)
Seven weeks of stable Codex CLI releases - 0.128.0 (Apr 30, 2026) through 0.141.0 (Jun 18, 2026) - produced this set of trust-boundary fixes:
What changed in each release, and what a fleet pinned below it still carries.
/diff ran code chosen by the repository (floor 0.136.0)
This is the cleanest issue of the batch, because OpenAI classified it as a security one: PR #24954 cites internal security ticket PSEC-4395, "codex-cli /diff executes repository-selected…". In the maintainer's words:
/diff is intended to display working-tree changes, but its Git invocations honored repository-selected executable helpers. A repository could configure diff/text conversion helpers, clean/process filters, core.fsmonitor, or post-index-change hooks that execute when a user runs /diff.
Read the threat model: clone or open an untrusted repository, run /diff to review it - a perfectly normal, "I'm being careful" action - and the repository's own Git config executes code on your machine. It needs no model cooperation and no approval prompt; it is a property of how /diff shelled out to Git. Anyone pinned below 0.136.0 still has this. (We separately confirmed the underlying Git vector is real and trivially weaponizable - see the appendix.)
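The generic Git behavior behind that quote can be sketched in a few lines of Python. This is a minimal illustration of the textconv vector only, not Codex's /diff code path and not our lab script: the repository layout, the `demo` driver name, the helper script, and the sentinel file are all hypothetical, and it assumes `git` is on PATH.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def git(repo: Path, *args: str) -> str:
    """Run git inside `repo` with a throwaway identity and return stdout."""
    cmd = ["git", "-c", "user.name=demo", "-c", "user.email=demo@example.invalid",
           "-c", "commit.gpgsign=false", *args]
    return subprocess.run(cmd, cwd=repo, check=True,
                          capture_output=True, text=True).stdout


def textconv_demo() -> tuple:
    """Return (helper ran under plain `git diff`, helper ran with --no-textconv)."""
    with tempfile.TemporaryDirectory() as tmp:
        repo = Path(tmp) / "repo"
        repo.mkdir()
        sentinel = Path(tmp) / "sentinel"
        helper = Path(tmp) / "helper.py"
        # Hypothetical helper: leaves a sentinel, then echoes the file unchanged.
        helper.write_text(
            "import pathlib, sys\n"
            f"pathlib.Path({str(sentinel)!r}).write_text('ran')\n"
            "sys.stdout.write(pathlib.Path(sys.argv[1]).read_text())\n"
        )
        git(repo, "init", "-q")
        (repo / ".gitattributes").write_text("*.txt diff=demo\n")
        (repo / "notes.txt").write_text("one\n")
        git(repo, "add", ".")
        git(repo, "commit", "-q", "-m", "init")
        # Repository-local config selects the executable that diff will run.
        git(repo, "config", "diff.demo.textconv", f'"{sys.executable}" "{helper}"')
        (repo / "notes.txt").write_text("two\n")

        git(repo, "diff", "--no-textconv")   # hardened invocation
        ran_hardened = sentinel.exists()
        git(repo, "diff")                    # default invocation
        ran_plain = sentinel.exists()
        return ran_plain, ran_hardened
```

Calling `textconv_demo()` should report that the helper ran under the default `git diff` and did not run under `--no-textconv`, which is the class of behavior the PR describes closing.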
Hooks are a defense mechanism - a PostToolUse hook can veto a tool result. PR #28365:
Previously, a PostToolUse hook could block a completed tool result, but code mode would still return the original typed result to JavaScript. The hook appeared blocked in hook telemetry while the running script continued with the result.
That is the worst kind of control failure: it reports success while doing nothing. Anyone enforcing policy through PostToolUse hooks in code mode, pinned below 0.141.0, has a hook that lies.
deny_read could be dropped during escalation and by "safe" commands (floors 0.131.0 and 0.136.0)
Two PRs close two ways an administrator's read-deny could be bypassed:
deny_read. Managed deny-read is an admin control on specific paths (think ~/.ssh, credential files). Below 0.131.0, escalation can silently re-open them.
allow) was enough to run it outside the filesystem sandbox - so cat/ls on a deny-listed path would read it anyway. Fixed at 0.136.0.
For any team relying on Codex's managed permission profiles to keep secrets unreadable, the effective floor is 0.136.0.
Release 0.140.0 landed a four-PR credential-storage stack (#27504 → #27535 → #27539 → #27541). Read precisely, it does not add encryption for the first time: the credential-store settings cli_auth_credentials_store (file/keyring/auto/ephemeral) and mcp_oauth_credentials_store (auto/file/keyring) are present unchanged at least as far back as 0.137.0 - and keyring already means OS-encrypted storage. What 0.140 changes is the backend: keyring-mode CLI auth now keeps only the encryption key in the OS keyring and stores the payload in an encrypted local-secrets file (a workaround for the Windows Credential Manager 2,560-byte limit), adds auth-specific encrypted namespaces, and extends the encrypted backend to MCP OAuth credentials (#27541).
The fleet implication is therefore narrower than "no encryption below 0.140": below 0.140.0 you don't get the encrypted-local-secrets backend (large keyring payloads can fail, notably on Windows) and MCP OAuth credentials aren't on the reworked encrypted store. The plaintext-vs-encrypted choice itself is a pre-existing config setting; we did not establish the shipped default, so we make no claim that credentials are plaintext "by default" on either version.
Codex runs local WebSocket listeners. Two fixes harden them:
Origin header (floor 0.136.0).
Browser-origin requests to a localhost developer service are the classic DNS-rebinding / CSRF path to local code execution; our repro demonstrates the missing guard, not a full exploit chain. Below these floors, that boundary is simply absent. We reproduced this one end-to-end: a raw WebSocket upgrade carrying Origin: http://evil.example is accepted (101 Switching Protocols) on 0.135.0 but refused (403 Forbidden) on 0.136.0, while a no-Origin client still upgrades on both versions (details in the appendix).
Same binary, same request, one version apart.
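For readers who want to run the same kind of check against a listener they control, the probe is just a raw HTTP upgrade request. The sketch below is a minimal stand-in and not the script from our lab: the host, port, and path are assumptions you would supply for your own listener, and it only reads the status line rather than speaking the WebSocket protocol.

```python
import base64
import os
import socket
from typing import Optional


def upgrade_status(host: str, port: int, origin: Optional[str] = None,
                   path: str = "/", timeout: float = 3.0) -> int:
    """Send a raw WebSocket upgrade request and return the HTTP status code."""
    key = base64.b64encode(os.urandom(16)).decode("ascii")
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}:{port}",
        "Upgrade: websocket",
        "Connection: Upgrade",
        f"Sec-WebSocket-Key: {key}",
        "Sec-WebSocket-Version: 13",
    ]
    if origin is not None:
        # A browser would attach this header; a native client usually would not.
        lines.append(f"Origin: {origin}")
    request = "\r\n".join(lines) + "\r\n\r\n"

    data = b""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(request.encode("ascii"))
        # Read only until the status line is complete.
        while b"\r\n" not in data:
            chunk = sock.recv(1024)
            if not chunk:
                break
            data += chunk

    parts = data.split(b"\r\n", 1)[0].decode("ascii", "replace").split()
    if len(parts) < 2 or not parts[1].isdigit():
        raise ConnectionError("no HTTP status line received")
    return int(parts[1])
```

Run it twice against the same listener, once with `origin="http://evil.example"` and once with `origin=None`. The second call is the negative control: it shows that a refusal on the first call comes from the Origin check rather than from the listener being down.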
The minimum version each security guarantee requires.
The Claude Code article asked three questions. None of them have easy answers for Codex, and that is the point - the gap lives in the space between them:
exec / app-server sessions?
Most teams can answer (1) only by hand today. (2) is changelog-plus-PR archaeology. (3) is a policy decision most teams have not made yet, because nothing forced the question.
Codex's release notes are not product news. They are security signals, and because the project is open-source, they come with the receipts attached. In seven weeks, Codex closed an untrusted-repo code-execution path on /diff, two ways to bypass an administrator's read-deny, a browser-reachable local WebSocket, a long-lived-token remote-control design, a reworked at-rest credential backend, and a hook that reported blocking while letting code through.
We did not just read these fixes. We ran one. On the exec-server WebSocket, a connection carrying Origin: http://evil.example is accepted (101 Switching Protocols) on 0.135.0 and refused (403 Forbidden) on 0.136.0. Same binary, same request, one version apart: the gap is present on one and gone on the other. The whole argument, in a single test.
None of this shipped with a CVE. Every fix is a boundary the vendor already decided was wrong, still live on any fleet that has not caught up. The receipts are attached to every release. The only question left is whether anyone on your side is reading them.
CHANGELOG.md; the release page is the changelog, and the alpha tags are stubs - the real notes live in the stable releases.
0.128.0 (Apr 30, 2026) through 0.141.0 (Jun 18, 2026) - roughly seven weeks.
PSEC-4395).
0.135.0-0.140.0 installed side by side) for the boundaries we can probe without credentials or network.
We separate three things on purpose, because conflating them turns a changelog reading into an overclaim:
Every fix in the table is at least PR-confirmed. The version floors are not new exploits we discovered; they are public vendor fixes, read for their fleet implications - the same posture as the Claude Code post.
Origin: http://evil.example returns 101 Switching Protocols on 0.135.0 but 403 Forbidden on 0.136.0; a no-Origin upgrade still returns 101 on both (built-in negative control). This is the actual Codex binary's behavior across the version boundary - a true credential-less before/after, not an analogue (scripts/exec-server-origin-repro.py).
/diff attack vector. A throwaway repo with a diff.<driver>.textconv helper (plus a one-line .gitattributes) executes that helper the instant git diff renders the file - our sentinel fired; the same diff under a hardened --no-textconv invocation did not. This confirms the class of issue #24954 closes. It does not drive Codex's interactive /diff end-to-end (that path needs a TUI session and auth), so the Codex-specific version delta stays PR-confirmed.
Could not resolve host), and a host-side negative control (example.com → HTTP 200 outside the sandbox) isolates that failure to codex sandbox, not a broken resolver.
Everything in this report is already public in OpenAI's merged PRs, so the lab was never about establishing the facts. It was about confirming the version deltas are real on the actual binary. A few boundaries (credential-at-rest, sandbox proxy enforcement, and interactive /diff) sit outside what a credential-less, network-off probe can exercise, so we left them PR-confirmed rather than chase a reproduction the published commit already settles.
Floors follow the first stable release whose notes ship the fix - not the PR merge date, which can mislead. Example: #27035 merged the same day as the 0.138.0 tag, but that tag does not contain the commit (git compare shows it behind_by: 1); it first ships in 0.139.0.
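The rule can be written down as a small check. The sketch below assumes the comparison was run with the fix commit as base and the release tag as head, so that `behind_by` counts commits the tag is missing relative to the fix; that orientation is our reading of the field, and the payloads are trimmed, illustrative stand-ins rather than captured API responses.

```python
from typing import Mapping, Optional, Sequence


def tag_contains_commit(compare: Mapping[str, int]) -> bool:
    """True when the release tag (head) is not behind the fix commit (base)."""
    return compare["behind_by"] == 0


def first_release_with_fix(tags: Sequence[str],
                           compares: Mapping[str, Mapping[str, int]]) -> Optional[str]:
    """Walk tags in release order and return the first one containing the fix."""
    for tag in tags:
        if tag_contains_commit(compares[tag]):
            return tag
    return None


# Illustrative inputs for #27035: 0.138.0 is one commit behind the fix,
# and 0.139.0 is assumed to be not behind it.
compares = {
    "0.138.0": {"behind_by": 1},
    "0.139.0": {"behind_by": 0},
}
floor = first_release_with_fix(["0.138.0", "0.139.0"], compares)
```

With those inputs the floor resolves to 0.139.0, even though the merge date alone would suggest 0.138.0.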
Map your installed version against the floor table above: each row is a security boundary and the first release that moved it. Anything below 0.136.0 is still carrying the /diff code-execution path, the read-deny bypasses, the browser-reachable WebSocket, and the old token handling; the sandbox-egress boundary moved at 0.139.0, the credential backend at 0.140.0, and PostToolUse hook enforcement at 0.141.0. The point is not a single "safe" number. It is that the gap grows quietly with every release you skip.
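That mapping is mechanical enough to script. The sketch below hard-codes only the floors this post states explicitly; the boundary labels are our own shorthand, the token-handling change is left out because we give no exact floor for it here, and pre-release suffixes are ignored, so treat the output as a starting point rather than a verdict.

```python
import re
from typing import List, Tuple

Version = Tuple[int, int, int]

# (boundary, first stable release that moved it), as stated in this post.
FLOORS: List[Tuple[str, Version]] = [
    ("deny_read kept during escalation", (0, 131, 0)),
    ("/diff no longer runs repository-selected helpers", (0, 136, 0)),
    ("deny_read enforced for allow-listed commands", (0, 136, 0)),
    ("exec-server WebSocket rejects browser Origin", (0, 136, 0)),
    ("sandbox-egress boundary", (0, 139, 0)),
    ("reworked credential backend", (0, 140, 0)),
    ("PostToolUse hook enforcement in code mode", (0, 141, 0)),
]


def parse_version(text: str) -> Version:
    """Extract the first X.Y.Z triple from text such as a --version line."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"no version found in {text!r}")
    return int(match.group(1)), int(match.group(2)), int(match.group(3))


def missing_boundaries(installed: str) -> List[str]:
    """List the boundaries whose floor is above the installed version."""
    version = parse_version(installed)
    return [name for name, floor in FLOORS if version < floor]
```

For example, `missing_boundaries("0.139.0")` returns the credential-backend and PostToolUse rows, while `missing_boundaries("0.141.0")` returns an empty list.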
No. Codex CLI does not auto-update. It may notify you when a newer version exists, but you update manually through the channel you installed with: npm install -g @openai/codex@latest, brew upgrade codex, or by re-running the install script. A running session keeps the version it launched with, so restart to pick up an update. That is exactly how a pinned or long-lived install drifts below the floor with no signal.
codex --version on one machine. Across a fleet (laptops, CI images, containers, long-lived exec/app-server sessions) you need a version inventory; a single check misses background sessions still running their launch version.
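A fleet inventory can start as nothing more than collecting that one command's output per host and grouping it. In the sketch below, the host names and the sample output strings are hypothetical, and the parser assumes only that the output contains an X.Y.Z version somewhere; how you gather the raw strings (SSH, CI job, MDM) is up to you.

```python
import re
import shutil
import subprocess
from collections import defaultdict
from typing import Dict, List, Mapping, Optional

VERSION_RE = re.compile(r"\b(\d+\.\d+\.\d+)\b")


def extract_version(raw: str) -> Optional[str]:
    """Return the first X.Y.Z found in raw command output, if any."""
    match = VERSION_RE.search(raw)
    return match.group(1) if match else None


def local_codex_version() -> Optional[str]:
    """Run `codex --version` on this machine; None if codex is not on PATH."""
    exe = shutil.which("codex")
    if exe is None:
        return None
    result = subprocess.run([exe, "--version"], capture_output=True,
                            text=True, timeout=10)
    return extract_version(result.stdout + result.stderr)


def group_by_version(reports: Mapping[str, str]) -> Dict[str, List[str]]:
    """Group hosts by reported version; unparseable output lands in 'unknown'."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for host, raw in sorted(reports.items()):
        groups[extract_version(raw) or "unknown"].append(host)
    return dict(groups)
```

Note that this reports the installed binary on each host. A long-lived session that launched before an upgrade still runs its launch version, so running processes need their own check.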
We have no evidence of in-the-wild exploitation, and this is not an exploit report. Every item is a public vendor fix, read for what it means for a fleet still pinned below it.
No. Open-source is why these fixes are this legible: the release note links the PR that explains what changed. Tools that disclose less are not safer, just quieter. The same trust-boundary classes show up across coding agents.