Skip to content

Refactor RFC: extract a deep materialize module from handle_execution_result (toward rfcs#3 run/ingest split) #717

@lewisjared

Description

@lewisjared

Refactor RFC generated via /improve-codebase-architecture. Contributes toward the run/ingest split in Climate-REF/rfcs#3 (execution lifecycle consolidation).

Problem

handle_execution_result (packages/climate-ref/src/climate_ref/executor/result_handling.py:345) is a very commonly used function that fuses four distinct concerns into one body:

Concern Lines Touches
A. Promote artifacts scratch→results (log, metric bundle, output bundle + its referenced plots/data/html, series) 373-430 + _copy_file_to_results (90), _copy_output_bundle_files (456) filesystem, config.paths.scratch/results, output_fragment
B. Status state machine (mark_failed on missing log → keep dirty/retry; unsuccessful → retryable vs diagnostic-error; mark_successful) 386-400, 453 Execution model
C. DB ingest (load CV, register outputs, scalar+series values, nested txn, swallow errors) 432-444 → ingest_execution_result (234) DB, CV file
D. Dirty / pending-work flag 398-399, 449-450 ExecutionGroup

The ordering is load-bearing and implicit (copy must precede ingest), and the failure handling is interleaved with copying, so no concern is reusable in isolation.

Concrete friction: "copying the results is not obvious"

When an execution is copied (reingest, reingest.py:211), the promotion happens in two hops through two unrelated code paths:

  1. shutil.copytree(scratch_dir, new_scratch_dir) (reingest.py:284) — copies scratch → new scratch
  2. handle_execution_result(...) (reingest.py:312) — which internally copies new-scratch → results, buried among DB writes and status mutation

Concern A — "materialize the durable results tree from a scratch tree" — has no name and no callable entry point. It is hidden inside a function called "handle result," entangled with DB and status. That is the non-obviousness this RFC removes.

This is exactly the run/ingest split of Climate-REF/rfcs#3: every backend completes by writing content to disk (run/materialize), then a transport-agnostic, idempotent, replayable step loads it into the DB (ingest). Today A+B+D and C are fused.

Proposed Interface

A new module packages/climate-ref/src/climate_ref/executor/materialize.py owning only concern A (scratch→results promotion). It never touches the DB or mutates Execution; it reports an outcome and lets the caller drive status (B/D) and ingest (C).

Two-function split — a decoupled core plus a thin Config-aware adapter:

from attrs import frozen
import pathlib
from climate_ref_core.diagnostics import ExecutionResult


@frozen
class MaterializeOutcome:
    """What materialize did. No DB state, no status mutation — the caller decides."""

    log_copied: bool
    artifacts_copied: tuple[pathlib.Path, ...] = ()  # relative paths actually written

    @property
    def log_missing(self) -> bool:
        return not self.log_copied


def materialize_outputs(
    result: ExecutionResult,
    *,
    source_dir: pathlib.Path,   # already <root>/<fragment>
    dest_dir: pathlib.Path,     # already <root>/<fragment>
) -> MaterializeOutcome:
    """Promote an execution's artifacts from source_dir to dest_dir.

    Always attempts the execution log first (even on failure). On a successful
    result, also copies the metric bundle, the output bundle and every
    plot/data/html file it references, plus the series file. Idempotent.

    Takes two fragment-resolved dirs and an ExecutionResult — NO Config, NO
    Execution, NO Database. Decoupling is enforced by the signature.
    """


def materialize_execution(
    config: "Config",
    execution: "Execution",
    result: ExecutionResult,
) -> MaterializeOutcome:
    """Default form for the executor path: derive source/dest from config + fragment."""
    fragment = execution.output_fragment
    return materialize_outputs(
        result,
        source_dir=config.paths.scratch / fragment,
        dest_dir=config.paths.results / fragment,
    )

Usage

Common executor path (handle_execution_result) — replaces the log-copy block, the three success copies, and _copy_output_bundle_files (result_handling.py:374-431) with one call:

outcome = materialize_execution(config, execution, result)
if outcome.log_missing:
    logger.error("Missing log file in scratch; likely a killed process (will retry).")
    execution.mark_failed()      # status mutation (concern B) stays in the caller
    return                        # leave dirty=True for retry
# ... caller then handles failure branches (B/D), then ingest (C), then mark_successful

Reingest: needs zero changes under the current flow (its source is already config.paths.scratch / new_fragment, so it keeps calling through). The explicit source_dir/dest_dir override of materialize_outputs exists for any caller whose source is not under scratch (a future archive-restore, or staging dir).

What it hides

  • The copy-set / EMDS artifact taxonomy (callers never enumerate log/metric/output/series).
  • CMEC output-bundle parsing to enumerate referenced plots/data/html (CMECOutput.load_from_json + the _copy_output_bundle_files walk).
  • ensure_relative_path relative-vs-absolute resolution (diagnostics.py:27) — the word "relative" never appears in a caller.
  • mkdir(parents=True, exist_ok=True) + overwrite idempotency, and the results_dir != scratch_dir invariant.
  • The "log always, even on failure" ordering, and the translation of a missing log into a reportable outcome rather than buried try/except + mark_failed.

Design note — uniform reporting, not asymmetric raises. An alternative considered (the "minimal" candidate) reported a missing log via the outcome but raised FileNotFoundError on a missing success-payload ("a successful result that lied is a bug"). This RFC deliberately rejects that asymmetry: rfcs#3 wants ingest to be idempotent and replayable, and uniform outcome reporting is friendlier to replay than exception control-flow. Missing payload should be reported in MaterializeOutcome, letting the caller decide, not crash mid-promotion.

Dependency Strategy

Category 2 — local-substitutable. Pure filesystem + CMEC JSON parse.

  • Config is not a dependency of materialize_outputs; it is confined to the one-line materialize_execution adapter (imported under TYPE_CHECKING). The core's signature has no Config to reach into — decoupling enforced structurally, not by convention.
  • Filesystem: reuse the existing shutil.copy mechanics; fold _copy_file_to_results into a private _copy_one(source_dir, dest_dir, rel) in the new module. Delete the originals from result_handling.py (no dead aliases).
  • CMEC parse: reuse CMECOutput.load_from_json directly (pure JSON→model, itself category 2 — no mock).

Deferred (named, not built here) — the ArtifactStore port

A ports-&-adapters variant was evaluated: an ArtifactStore Protocol (bytes at relative keys: exists/read_bytes/write_bytes/iter_keys) + a transfer() free function, with LocalArtifactStore and InMemoryArtifactStore adapters. It would make the scratch/results filesystems two instances of one store and give rfcs#3's rsync/remote ingest a drop-in adapter (with iter_keys as the replay primitive).

It is deliberately deferred. The dependency is local-substitutable today (a tmp_path already substitutes it), and rfcs#3 marks cross-deployment portability as a separate, deferred RFC — so the port has no second adapter to justify it yet. When that work is scheduled, materialize_outputs's body swaps Path operations for store calls with near-zero churn. Build the port then, not now.

Testing Strategy

New boundary tests (against materialize_outputs, the deep core):

  • Failed result → only the log is copied; log_copied=True, no payload artifacts.
  • Missing log → log_missing=True, nothing copied, no raise.
  • Successful result whose output bundle references plots → all artifacts incl. referenced plots present under dest_dir; enumeration proven.
  • Idempotency → calling twice yields the same files and the same outcome, no error.

These need only two tmp_path subdirs + a hand-built ExecutionResult + a fake output.jsonno Config.default(), no DB session, no ORM Execution row. That is the core testability win: the expensive-to-construct deps are no longer entangled with the copy logic. The materialize_execution adapter gets one thin test with a real Config to confirm the path join.

Old tests to delete / replace: the per-file copy assertions currently embedded in packages/climate-ref/tests/unit/executor/test_result_handling.py (esp. test_handle_execution_result_with_files ~162-220 and test_handle_execution_result_missing_log_file_leaves_dirty ~250) — the copy behavior moves to the new boundary tests; the surviving handle_execution_result tests assert only the status/ingest orchestration (B/C/D).

Environment: tmp_path only. No stand-ins, no adapters.

Implementation Recommendations

  • Owns: the EMDS artifact taxonomy (which files comprise a result), the log-always ordering, CMEC bundle enumeration, relative-path resolution, mkdir/overwrite idempotency, and the source→dest transport.
  • Hides: ensure_relative_path, CMECOutput parsing, shutil/pathlib mechanics, the results != scratch invariant.
  • Exposes: materialize_outputs(result, *, source_dir, dest_dir) -> MaterializeOutcome (decoupled core) and materialize_execution(config, execution, result) -> MaterializeOutcome (ergonomic default). Outcome reports log_copied/log_missing + artifacts_copied; it never mutates DB or Execution.
  • Migration: (1) extract the new module and fold in _copy_file_to_results + _copy_output_bundle_files; (2) rewrite handle_execution_result to call materialize_execution, then keep only the status (B/D) and ingest (C) logic; (3) leave reingest calling through handle_execution_result unchanged. Follow-up (separate PR, toward rfcs#3): move ingest to read from results rather than scratch (result_handling.py:441 reads scratch today) — that is the actual run/ingest decoupling, and it is what makes an rsync'd results tree reingestable without a scratch dir.

Concern map (what each future verb owns)

  • materialize (this RFC) = concern A.
  • status resolver (natural follow-on extraction) = concerns B + D — a pure (result, outcome) → status/dirty function, table-testable with no DB/files.
  • ingest = concern C, reading from results, idempotent + replayable (rfcs#3).

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions