Refactor RFC: extract a deep `materialize` module from `handle_execution_result` (toward rfcs#3 run/ingest split)

> Refactor RFC generated via `/improve-codebase-architecture`. Contributes toward the run/ingest split in **Climate-REF/rfcs#3** (execution lifecycle consolidation).

## Problem

`handle_execution_result` (`packages/climate-ref/src/climate_ref/executor/result_handling.py:345`) is a very commonly used function that fuses **four distinct concerns** into one body:

| Concern | Lines | Touches |
|---|---|---|
| A. Promote artifacts scratch→results (log, metric bundle, output bundle + its referenced plots/data/html, series) | 373-430 + `_copy_file_to_results` (90), `_copy_output_bundle_files` (456) | filesystem, `config.paths.scratch/results`, `output_fragment` |
| B. Status state machine (`mark_failed` on missing log → keep dirty/retry; unsuccessful → retryable vs diagnostic-error; `mark_successful`) | 386-400, 453 | `Execution` model |
| C. DB ingest (load CV, register outputs, scalar+series values, nested txn, swallow errors) | 432-444 → `ingest_execution_result` (234) | DB, CV file |
| D. Dirty / pending-work flag | 398-399, 449-450 | `ExecutionGroup` |

The ordering is load-bearing and implicit (copy must precede ingest), and the failure handling is interleaved with copying, so **no concern is reusable in isolation**.

### Concrete friction: "copying the results is not obvious"

When an execution is *copied* (reingest, `reingest.py:211`), the promotion happens in **two hops through two unrelated code paths**:

1. `shutil.copytree(scratch_dir, new_scratch_dir)` (`reingest.py:284`) — copies scratch → new scratch
2. `handle_execution_result(...)` (`reingest.py:312`) — which *internally* copies new-scratch → results, buried among DB writes and status mutation

Concern A — "materialize the durable results tree from a scratch tree" — has no name and no callable entry point. It is hidden inside a function called "handle result," entangled with DB and status. That is the non-obviousness this RFC removes.

This is exactly the **run/ingest split** of Climate-REF/rfcs#3: every backend completes by writing content to disk (`run`/materialize), then a transport-agnostic, idempotent, replayable step loads it into the DB (`ingest`). Today A+B+D and C are fused.

## Proposed Interface

A new module `packages/climate-ref/src/climate_ref/executor/materialize.py` owning **only concern A** (scratch→results promotion). It never touches the DB or mutates `Execution`; it reports an outcome and lets the caller drive status (B/D) and ingest (C).

Two-function split — a decoupled core plus a thin Config-aware adapter:

```python
from attrs import frozen
import pathlib
from climate_ref_core.diagnostics import ExecutionResult


@frozen
class MaterializeOutcome:
    """What materialize did. No DB state, no status mutation — the caller decides."""

    log_copied: bool
    artifacts_copied: tuple[pathlib.Path, ...] = ()  # relative paths actually written

    @property
    def log_missing(self) -> bool:
        return not self.log_copied


def materialize_outputs(
    result: ExecutionResult,
    *,
    source_dir: pathlib.Path,   # already <root>/<fragment>
    dest_dir: pathlib.Path,     # already <root>/<fragment>
) -> MaterializeOutcome:
    """Promote an execution's artifacts from source_dir to dest_dir.

    Always attempts the execution log first (even on failure). On a successful
    result, also copies the metric bundle, the output bundle and every
    plot/data/html file it references, plus the series file. Idempotent.

    Takes two fragment-resolved dirs and an ExecutionResult — NO Config, NO
    Execution, NO Database. Decoupling is enforced by the signature.
    """


def materialize_execution(
    config: "Config",
    execution: "Execution",
    result: ExecutionResult,
) -> MaterializeOutcome:
    """Default form for the executor path: derive source/dest from config + fragment."""
    fragment = execution.output_fragment
    return materialize_outputs(
        result,
        source_dir=config.paths.scratch / fragment,
        dest_dir=config.paths.results / fragment,
    )
```

### Usage

Common executor path (`handle_execution_result`) — replaces the log-copy block, the three success copies, and `_copy_output_bundle_files` (`result_handling.py:374-431`) with one call:

```python
outcome = materialize_execution(config, execution, result)
if outcome.log_missing:
    logger.error("Missing log file in scratch; likely a killed process (will retry).")
    execution.mark_failed()      # status mutation (concern B) stays in the caller
    return                        # leave dirty=True for retry
# ... caller then handles failure branches (B/D), then ingest (C), then mark_successful
```

Reingest: needs **zero changes** under the current flow (its source is already `config.paths.scratch / new_fragment`, so it keeps calling through). The explicit `source_dir`/`dest_dir` override of `materialize_outputs` exists for any caller whose source is not under scratch (a future archive-restore, or staging dir).

### What it hides

- The copy-set / EMDS artifact taxonomy (callers never enumerate log/metric/output/series).
- CMEC output-bundle parsing to enumerate referenced plots/data/html (`CMECOutput.load_from_json` + the `_copy_output_bundle_files` walk).
- `ensure_relative_path` relative-vs-absolute resolution (`diagnostics.py:27`) — the word "relative" never appears in a caller.
- `mkdir(parents=True, exist_ok=True)` + overwrite idempotency, and the `results_dir != scratch_dir` invariant.
- The "log always, even on failure" ordering, and the translation of a missing log into a reportable outcome rather than buried `try/except` + `mark_failed`.

> **Design note — uniform reporting, not asymmetric raises.** An alternative considered (the "minimal" candidate) reported a missing *log* via the outcome but *raised* `FileNotFoundError` on a missing success-*payload* ("a successful result that lied is a bug"). This RFC deliberately rejects that asymmetry: rfcs#3 wants ingest to be idempotent and replayable, and uniform outcome reporting is friendlier to replay than exception control-flow. Missing payload should be reported in `MaterializeOutcome`, letting the caller decide, not crash mid-promotion.

## Dependency Strategy

**Category 2 — local-substitutable.** Pure filesystem + CMEC JSON parse.

- `Config` is **not** a dependency of `materialize_outputs`; it is confined to the one-line `materialize_execution` adapter (imported under `TYPE_CHECKING`). The core's signature has no `Config` to reach into — decoupling enforced structurally, not by convention.
- Filesystem: reuse the existing `shutil.copy` mechanics; fold `_copy_file_to_results` into a private `_copy_one(source_dir, dest_dir, rel)` in the new module. Delete the originals from `result_handling.py` (no dead aliases).
- CMEC parse: reuse `CMECOutput.load_from_json` directly (pure JSON→model, itself category 2 — no mock).

### Deferred (named, not built here) — the `ArtifactStore` port

A ports-&-adapters variant was evaluated: an `ArtifactStore` Protocol (bytes at relative keys: `exists`/`read_bytes`/`write_bytes`/`iter_keys`) + a `transfer()` free function, with `LocalArtifactStore` and `InMemoryArtifactStore` adapters. It would make the scratch/results filesystems two instances of one store and give rfcs#3's rsync/remote ingest a drop-in adapter (with `iter_keys` as the replay primitive).

**It is deliberately deferred.** The dependency is local-substitutable *today* (a `tmp_path` already substitutes it), and rfcs#3 marks cross-deployment portability as a *separate, deferred* RFC — so the port has no second adapter to justify it yet. When that work is scheduled, `materialize_outputs`'s body swaps `Path` operations for store calls with near-zero churn. Build the port then, not now.

## Testing Strategy

**New boundary tests** (against `materialize_outputs`, the deep core):
- Failed result → only the log is copied; `log_copied=True`, no payload artifacts.
- Missing log → `log_missing=True`, nothing copied, **no raise**.
- Successful result whose output bundle references plots → all artifacts incl. referenced plots present under `dest_dir`; enumeration proven.
- Idempotency → calling twice yields the same files and the same outcome, no error.

These need only two `tmp_path` subdirs + a hand-built `ExecutionResult` + a fake `output.json` — **no `Config.default()`, no DB session, no ORM `Execution` row.** That is the core testability win: the expensive-to-construct deps are no longer entangled with the copy logic. The `materialize_execution` adapter gets one thin test with a real `Config` to confirm the path join.

**Old tests to delete / replace:** the per-file copy assertions currently embedded in `packages/climate-ref/tests/unit/executor/test_result_handling.py` (esp. `test_handle_execution_result_with_files` ~162-220 and `test_handle_execution_result_missing_log_file_leaves_dirty` ~250) — the *copy* behavior moves to the new boundary tests; the surviving `handle_execution_result` tests assert only the status/ingest orchestration (B/C/D).

**Environment:** `tmp_path` only. No stand-ins, no adapters.

## Implementation Recommendations

- **Owns:** the EMDS artifact taxonomy (which files comprise a result), the log-always ordering, CMEC bundle enumeration, relative-path resolution, mkdir/overwrite idempotency, and the source→dest transport.
- **Hides:** `ensure_relative_path`, `CMECOutput` parsing, `shutil`/`pathlib` mechanics, the `results != scratch` invariant.
- **Exposes:** `materialize_outputs(result, *, source_dir, dest_dir) -> MaterializeOutcome` (decoupled core) and `materialize_execution(config, execution, result) -> MaterializeOutcome` (ergonomic default). Outcome reports `log_copied`/`log_missing` + `artifacts_copied`; it never mutates DB or `Execution`.
- **Migration:** (1) extract the new module and fold in `_copy_file_to_results` + `_copy_output_bundle_files`; (2) rewrite `handle_execution_result` to call `materialize_execution`, then keep only the status (B/D) and ingest (C) logic; (3) leave reingest calling through `handle_execution_result` unchanged. **Follow-up (separate PR, toward rfcs#3): move ingest to read from `results` rather than `scratch`** (`result_handling.py:441` reads scratch today) — that is the actual run/ingest decoupling, and it is what makes an rsync'd results tree reingestable without a scratch dir.

### Concern map (what each future verb owns)

- **materialize (this RFC)** = concern A.
- **status resolver (natural follow-on extraction)** = concerns B + D — a pure `(result, outcome) → status/dirty` function, table-testable with no DB/files.
- **ingest** = concern C, reading from `results`, idempotent + replayable (rfcs#3).


Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Refactor RFC: extract a deep `materialize` module from `handle_execution_result` (toward rfcs#3 run/ingest split) #717

Problem

Concrete friction: "copying the results is not obvious"

Proposed Interface

Usage

What it hides

Dependency Strategy

Deferred (named, not built here) — the `ArtifactStore` port

Testing Strategy

Implementation Recommendations

Concern map (what each future verb owns)

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Concern	Lines	Touches
A. Promote artifacts scratch→results (log, metric bundle, output bundle + its referenced plots/data/html, series)	373-430 + `_copy_file_to_results` (90), `_copy_output_bundle_files` (456)	filesystem, `config.paths.scratch/results`, `output_fragment`
B. Status state machine (`mark_failed` on missing log → keep dirty/retry; unsuccessful → retryable vs diagnostic-error; `mark_successful`)	386-400, 453	`Execution` model
C. DB ingest (load CV, register outputs, scalar+series values, nested txn, swallow errors)	432-444 → `ingest_execution_result` (234)	DB, CV file
D. Dirty / pending-work flag	398-399, 449-450	`ExecutionGroup`

Refactor RFC: extract a deep materialize module from handle_execution_result (toward rfcs#3 run/ingest split) #717

Description

Problem

Concrete friction: "copying the results is not obvious"

Proposed Interface

Usage

What it hides

Dependency Strategy

Deferred (named, not built here) — the ArtifactStore port

Testing Strategy

Implementation Recommendations

Concern map (what each future verb owns)

Metadata

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Issue actions

Refactor RFC: extract a deep `materialize` module from `handle_execution_result` (toward rfcs#3 run/ingest split) #717

Deferred (named, not built here) — the `ArtifactStore` port