DO NOT MERGE YET: harden Codex structured output handling #86

Draft
PhilosophiMoonbeam wants to merge 1 commit into Agent-Field:main from PhilosophiMoonbeam:codex-native-structured-output-hardening

PhilosophiMoonbeam commented Jun 29, 2026

Status

Draft only. Do not merge without further live Codex/SWE-AF runtime testing.

Why

The upstream Codex fixes address auth and default-model behavior, but AgentField SDK 0.1.96 still does not appear to drive Codex through its native --output-schema / --output-last-message flags. In SWE-AF runs, AgentField can therefore still end up relying on the generic .agentfield_output.json write-tool contract, which is fragile for Codex and is associated with empty/null structured outputs.
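
To make the empty/null failure mode concrete, here is a minimal sketch of a defensive reader for the output file. `load_structured_output` is a hypothetical helper, not AgentField SDK code. It assumes only the file name mentioned above, and treats a missing file, an empty file, invalid JSON, and a literal `null` as the same "no structured output" case.

```python
import json
from pathlib import Path
from typing import Any, Optional

# File name taken from the description above; everything else is illustrative.
OUTPUT_FILENAME = ".agentfield_output.json"


def load_structured_output(project_dir: str) -> Optional[dict[str, Any]]:
    """Return the parsed output object, or None if nothing usable was written."""
    path = Path(project_dir) / OUTPUT_FILENAME
    if not path.is_file():
        return None  # the agent never wrote the file
    text = path.read_text(encoding="utf-8").strip()
    if not text:
        return None  # the file exists but is empty
    try:
        data = json.loads(text)
    except json.JSONDecodeError:
        return None  # partial or malformed write
    # A literal `null` (or any non-object value) is not a structured result.
    return data if isinstance(data, dict) else None
```

A caller can then tell "no output" apart from a valid object and decide whether to retry or fail the step.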

Changes

  • Use Codex native structured-output flags and write the final answer to AgentField's expected .agentfield_output.json path.
  • Keep non-Codex providers on AgentField's original output-file prompt suffix.
  • Preserve model selection and project directory handling for Codex.
  • Make HITL form choice options an explicit Pydantic model so strict structured-output schemas remain valid.
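
As a rough illustration of the first and third bullets, the command construction could look like the sketch below. `build_codex_command` and its parameters are hypothetical names and are not taken from `swe_af/runtime/codex_harness_patch.py`. The sketch assumes the `codex exec` subcommand with its `--cd` and `--model` options alongside the two structured-output flags named above.

```python
from pathlib import Path
from typing import Optional


def build_codex_command(
    prompt: str,
    project_dir: str,
    schema_path: str,
    model: Optional[str] = None,
) -> list[str]:
    """Build an argv list that asks Codex for schema-constrained final output."""
    # Point Codex's final message at the file AgentField already expects.
    output_path = Path(project_dir) / ".agentfield_output.json"
    cmd = ["codex", "exec", "--cd", project_dir]
    if model:
        # Preserve an explicit model choice; otherwise Codex uses its default.
        cmd += ["--model", model]
    cmd += [
        "--output-schema", schema_path,
        "--output-last-message", str(output_path),
        prompt,
    ]
    return cmd
```

Returning an argv list rather than a shell string avoids quoting problems when the prompt contains spaces or quotes.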

Permission-mode note

This draft intentionally preserves SWE-AF's existing Codex permission mapping. permission_mode: "auto" still uses --dangerously-bypass-approvals-and-sandbox; explicit read-only, workspace-write, and danger-full-access values are still passed through; and unspecified/unknown values still fall back to --sandbox workspace-write.
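
The mapping described above can be sketched as a small function. `codex_permission_args` is a hypothetical name, and passing explicit values through `--sandbox` is an assumption about how the pass-through works; the actual SWE-AF code may be organized differently.

```python
from typing import Optional

# Sandbox values that are passed through unchanged (per the note above).
EXPLICIT_SANDBOX_MODES = {"read-only", "workspace-write", "danger-full-access"}


def codex_permission_args(permission_mode: Optional[str]) -> list[str]:
    """Translate a permission_mode value into Codex CLI arguments."""
    if permission_mode == "auto":
        # Existing behavior, intentionally left unchanged by this draft.
        return ["--dangerously-bypass-approvals-and-sandbox"]
    if permission_mode in EXPLICIT_SANDBOX_MODES:
        return ["--sandbox", permission_mode]
    # Unspecified or unknown values fall back to workspace-write.
    return ["--sandbox", "workspace-write"]
```

For example, `codex_permission_args("read-only")` yields `["--sandbox", "read-only"]`, while `codex_permission_args(None)` falls back to `["--sandbox", "workspace-write"]`.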

We briefly tried mapping auto to workspace-write, but live SWE-AF git initialization can fail because Codex cannot create .git/refs/...lock files inside that sandbox. This PR does not attempt to change permissions; safer sandboxing should be treated as separate future work requiring a git-management design that does not need Codex to mutate .git metadata.

Local verification

  • AGENTFIELD_SERVER=http://localhost:9999 .venv/bin/python -m pytest tests/test_codex_harness_patch.py tests/test_ask_user.py tests/test_environment_scout.py -q -> 40 passed
  • AGENTFIELD_SERVER=http://localhost:9999 .venv/bin/python -m ruff check swe_af/runtime/codex_harness_patch.py swe_af/hitl/ask_user.py tests/test_codex_harness_patch.py tests/test_ask_user.py tests/test_environment_scout.py -> passed
  • AGENTFIELD_SERVER=http://localhost:9999 .venv/bin/python -m pytest tests/test_model_config.py tests/test_fatal_error.py tests/test_empty_build_guard.py tests/fast/test_app.py -q -> 123 passed
  • Full suite: 1017 passed, 1 skipped, with one expected guard failure because this branch intentionally modifies non-swe_af/fast files.

Still needed before merge

  • Live end-to-end SWE-AF run using runtime: "codex".
  • Confirm HITL pause/resume behavior still works during active child executions.
  • Confirm no regressions for claude_code or open_code providers.

CLAassistant commented Jun 29, 2026

CLA assistant check
All committers have signed the CLA.

PhilosophiMoonbeam force-pushed the codex-native-structured-output-hardening branch from 968bf10 to 2a308d2 (June 30, 2026 02:39)
PhilosophiMoonbeam force-pushed the codex-native-structured-output-hardening branch from 2a308d2 to fd7597f (June 30, 2026 02:43)