diff --git a/.claude/rules/common-pitfalls.md b/.claude/rules/common-pitfalls.md index 9f4fe3a480..a9b69accf4 100644 --- a/.claude/rules/common-pitfalls.md +++ b/.claude/rules/common-pitfalls.md @@ -1,63 +1,68 @@ # Common Pitfalls -## Array Bounds & Ghost Cells -- Grid dimensions: `m`, `n`, `p` (cells in x, y, z). 1D: n=p=0. 2D: p=0. -- Interior domain: `0:m`, `0:n`, `0:p` -- Buffer/ghost region: `-buff_size:m+buff_size` (similar for n, p in multi-D) -- `buff_size` is **not** a single formula: it's set per reconstruction scheme (WENO/MUSCL/IGR) in - `s_configure_coordinate_bounds` (`m_helper_basic.fpp`) and floored higher for Lagrange bubbles and IB. - Read that routine for the current value rather than assuming one. -- Domain bounds: `idwint(1:3)` (interior `0:m`), `idwbuff(1:3)` (with ghost cells) -- Cell-center coords: `x_cc(-buff_size:m+buff_size)`, `y_cc(...)`, `z_cc(...)` -- Cell-boundary coords: `x_cb(-1-buff_size:m+buff_size)` -- Riemann solver indexing: left state at `j`, right state at `j+1` -- Off-by-one errors in ghost cell regions are a common source of bugs - -## Field Variable Indexing -- Conserved variables: `q_cons_vf(1:sys_size)`. Primitive: `q_prim_vf(1:sys_size)`. -- All equation indices live in the unified `eqn_idx` struct (`eqn_idx_info` type in `m_derived_types.fpp`). - Index ranges depend on `model_eqns` and enabled features (set in `m_global_parameters.fpp`): - - `eqn_idx%cont` — continuity range (partial densities, one per fluid) - - `eqn_idx%mom` — momentum range - - `eqn_idx%E` — total energy (scalar) - - `eqn_idx%adv` — volume fractions (advection equations) - - `eqn_idx%bub`, `eqn_idx%stress`, `eqn_idx%xi`, `eqn_idx%species`, `eqn_idx%B` — optional - - `eqn_idx%gamma`, `eqn_idx%pi_inf`, `eqn_idx%alf`, `eqn_idx%int_en` — additional scalars/ranges -- Use `eqn_idx%cont%beg`/`eqn_idx%cont%end`, `eqn_idx%mom%beg`/`eqn_idx%mom%end`, etc. (old `contxb`/`contxe`, `momxb`/`momxe` shorthands are gone) -- `sys_size` = total number of conserved variables (computed at startup) -- Changing `model_eqns` or enabling features changes ALL index positions - -## Blast Radius -- `src/common/` is shared by ALL three executables (pre_process, simulation, post_process) -- Any change to common/ requires testing all three targets -- Public subroutine signature changes affect all callers across all targets -- Parameter default changes affect all existing case files - -## Physics Consistency -- Pressure formula MUST match `model_eqns` setting -- Model-specific conservative ↔ primitive conversion paths exist -- Volume fractions must sum to 1.0 -- Boundary condition symmetry requirements must be maintained - -## Compiler-Specific Issues -- See the compiler-backend matrix in `.claude/rules/gpu-and-mpi.md` for which compilers - are CI-gated and which backends each supports. -- Each compiler has different strictness levels and warning behavior -- Fypp macros must expand correctly for both GPU and CPU builds - -## Test System -- Tests are generated **programmatically** in `toolchain/mfc/test/cases.py`, not standalone files -- Each test is a parameter modification on top of `BASE_CFG` defaults -- Test UUID = CRC32 hash of the test's trace string; `./mfc.sh test -l` lists all -- To add a test: modify `cases.py` using `CaseGeneratorStack` push/pop pattern -- Golden files: `tests//golden.txt` — tolerance-based comparison, not exact match -- If your change intentionally modifies output, regenerate golden files: - `./mfc.sh test --generate --only -j 8` -- Do not regenerate ALL golden files unless you understand every output change - -## PR Checklist -The base loop (format → precheck → build → test → one logical commit) is the -Development Workflow Contract in `CLAUDE.md`. Beyond it, watch for: -- [ ] If adding parameters: definitions.py (_r + _nv) updated; cmake reconfigured; case_validator.py if constraints -- [ ] If modifying `src/common/`: all three targets tested -- [ ] If changing output: golden files regenerated for affected tests (`./mfc.sh test --generate --only `) +Traps that live in no other doc, collected because their failure modes are silent — +wrong answers and wrong indices, not error messages. General development pitfalls are +covered in `docs/documentation/contributing.md`. + +## Indexing and Ghost Cells + +- Grid dimensions `m`, `n`, `p` (cells in x, y, z); 1D: n=p=0, 2D: p=0. Interior `0:m`; + ghost region `-buff_size:m+buff_size`; bounds structs `idwint(1:3)` (interior) and + `idwbuff(1:3)` (with ghosts); cell boundaries `x_cb(-1-buff_size:m+buff_size)`. +- `buff_size` is **not** a single formula: it's set per reconstruction scheme in + `s_configure_coordinate_bounds` (`m_helper_basic.fpp`) and floored higher for Lagrange + bubbles and IB. Read that routine for the current value rather than assuming one. +- Riemann solvers: left state at `j`, right state at `j+1`. +- All equation indices live in the `eqn_idx` struct (`eqn_idx_info` in + `m_derived_types.fpp`, populated in `m_global_parameters.fpp`): `%cont`, `%mom`, `%E`, + `%adv`, plus optional ranges (`%bub`, `%stress`, `%species`, `%B`, ...). The old + `contxb`/`momxb` shorthands are gone. Index positions depend on `model_eqns` and + enabled features — changing either moves ALL indices; never hard-code one. + +## GPU + +- WARNING: do NOT wrap `GPU_LOOP` in `GPU_PARALLEL` for spatial loops — `GPU_LOOP` emits + empty directives on Cray and AMD, causing silent serial execution. Spatial loops always + use `GPU_PARALLEL_LOOP`/`END_GPU_PARALLEL_LOOP`. Macro API: + `docs/documentation/gpuParallelization.md`; signatures: + `src/common/include/parallel_macros.fpp`. Never call the `ACC_*`/`OMP_*` + implementation layers directly. +- Only `src/simulation/` is GPU-accelerated. Backends: OpenACC (nvfortran primary, Cray) + and OpenMP offload (Cray primary, AMD flang, nvfortran). The CPU-only build must always + work — every `#ifdef` needs a path for all configurations (CPU, ACC, OMP, with/without + MPI). Gates: `MFC_GPU`, `MFC_OpenACC`, `MFC_OpenMP`, `MFC_MPI`, `MFC_DEBUG`, + `MFC_SINGLE_PRECISION`/`MFC_MIXED_PRECISION`, `MFC_PRE_PROCESS`/`MFC_SIMULATION`/ + `MFC_POST_PROCESS`, and compiler macros (`_CRAYFTN`, `__PGI`, ...). +- `@:ACC_SETUP_VFs(...)`/`@:ACC_SETUP_SFs(...)` GPU pointer setup compiles only under + Cray. Around MPI: `GPU_UPDATE(host=...)` before send, `GPU_UPDATE(device=...)` after + receive. + +## Parameters + +- Adding one: `_r()` definition + `_nv()` `NAMELIST_VARS` registration in + `toolchain/mfc/params/definitions.py`; `case_validator.py` only if physics-constrained + (with a `PHYSICS_DOCS` entry). Fortran declarations and namelist bindings are + auto-generated at CMake configure time — re-run cmake (or `./mfc.sh build`) after editing. +- Still manual: array variables and derived-type members (declare in + `src/*/m_global_parameters.fpp` / the relevant type), and case-optimization parameters + (`CASE_OPT_PARAMS` + the `#:else` block in `src/simulation/m_global_parameters.fpp`). + Gotcha: under `--case-optimization` those are baked into the binary and dropped from + the namelist, so changing one needs a *rebuild*, not a case edit. +- Runtime checks (`@:PROHIBIT`) go where they run: shared → + `src/common/m_checker_common.fpp`; simulation-only → `src/simulation/m_checker.fpp`; + pre/post-only → `src/{pre,post}_process/m_checker.fpp` (their `s_check_inputs` are + currently empty — that IS the right place, not m_checker_common). +- Analytic ICs are compiled into the binary (syntax error = build failure, not runtime). + Each IC variable maps to an `eqn_idx%…` expression in `QPVF_IDX_VARS` + (`toolchain/mfc/case.py`); a new patch-settable conserved variable means updating that + map AND the Fortran `eqn_idx` builder to agree — a mismatch is a silent wrong index. + Variables available in expressions: `docs/documentation/case.md`. + +## Tests + +- Tests are generated programmatically in `toolchain/mfc/test/cases.py` (parameter + modifications on `BASE_CFG` via the `CaseGeneratorStack` push/pop pattern); test UUID = + CRC32 of the trace string; `./mfc.sh test -l` lists all. +- Golden files are tolerance-compared. Regenerate only the affected tests + (`./mfc.sh test --generate --only `) — an unexplained golden-file diff is a bug + report, not noise to be regenerated away. diff --git a/.claude/rules/fortran-conventions.md b/.claude/rules/fortran-conventions.md deleted file mode 100644 index 7202f66ce2..0000000000 --- a/.claude/rules/fortran-conventions.md +++ /dev/null @@ -1,72 +0,0 @@ -# Fortran Conventions - -## File Format -- Source files use `.fpp` extension (Fortran + Fypp preprocessor macros) -- Fypp preprocesses `.fpp` → `.f90` at build time via CMake -- Fypp supports conditional compilation, code generation, and regex macros - -## Module Structure -Every Fortran module follows this pattern: -- File: `m_.fpp` -- Module: `module m_` -- `implicit none` required -- Explicit `intent(in)`, `intent(out)`, or `intent(inout)` on ALL subroutine/function arguments -- Initialization subroutine: `s_initialize__module` -- Finalization subroutine: `s_finalize__module` - -## Naming -See "Naming Conventions" in `CLAUDE.md`. - -## Forbidden Patterns - -All checks below are enforced by `python3 toolchain/mfc/lint_source.py` -(runs via `./mfc.sh precheck` and CI). See that file for the full list. - -Fortran/Fypp source (`src/`): -- `dsqrt`, `dexp`, `dlog`, `dble`, `dabs`, `dcos`, `dsin`, `dtan`, etc. → use generic intrinsics -- `1.0d0`, `2.5d-3` (Fortran `d` exponent literals) → use `1.0_wp`, `2.5e-3_wp` -- `double precision` → use `real(wp)` or `real(stp)` -- `real(8)`, `real(4)` → use `wp` or `stp` kind parameters -- Raw `!$acc` or `!$omp` directives → use Fypp GPU_* macros from `parallel_macros.fpp` -- `int(8._wp, ...)` hardcoded byte size → use `storage_size(0._stp)/8` -- Bare integer kind like `2_wp` → use `2.0_wp` -- Junk patterns (`...`, `---`, `===`) in code or comments (no separator comments) -- Duplicate entries in Fypp `#:for ... in [...]` lists -- Identical adjacent non-trivial lines (copy-paste bugs) - -Python (`examples/`, `benchmarks/`, `toolchain/`): -- `===` separator comments → remove -- `----` or longer separator comments → remove (3 dashes `---` is allowed for markdown) - -Shell (`.github/`, `toolchain/`): -- `===` or `----` separator comments → remove -- Echo separators longer than 20 characters → shorten - -Enforced by convention/code review (not automated): -- `goto`, `COMMON` blocks, global `save` variables -- `stop`, `error stop` → use `call s_mpi_abort()` or `@:PROHIBIT()`/`@:ASSERT()` - -## Error Checking Macros (from macros.fpp) -- `@:PROHIBIT(condition, message)` — Runtime constraint check; aborts with file/line info -- `@:ASSERT(predicate, message)` — Invariant assertion; aborts if predicate is false -- `@:LOG(expr)` — Debug logging, active only in `MFC_DEBUG` builds -- Fortran-side runtime validation also exists in `m_checker*.fpp` files using `@:PROHIBIT` - -## Precision Types -`wp`/`stp` are defined in `CLAUDE.md` (`wp` = computation, `stp` = field-data storage + I/O). Detail: -- Modes: default both double; `--single` → both single; `--mixed` → wp=double, stp=half. -- MPI type matching: `mpi_p` must match `wp`, `mpi_io_p` must match `stp`. -- Always use generic intrinsics: `sqrt` not `dsqrt`, `abs` not `dabs`. -- Cast with `real(..., wp)` or `real(..., stp)`, never `dble(...)`. - -Key derived types (`m_derived_types.fpp`): -- `scalar_field` — `real(stp), pointer :: sf(:,:,:)`. Uses `stp`, NOT `wp`. -- `vector_field` — allocatable array of `scalar_field` components. -- New field arrays MUST use `stp` for storage precision consistency. - -## Size Guidelines (soft) -- Subroutine: ≤500 lines -- Helper routine: ≤150 lines -- Function: ≤100 lines -- File: ≤1000 lines -- Arguments: ≤6 preferred diff --git a/.claude/rules/gpu-and-mpi.md b/.claude/rules/gpu-and-mpi.md deleted file mode 100644 index 3a7c400e3e..0000000000 --- a/.claude/rules/gpu-and-mpi.md +++ /dev/null @@ -1,120 +0,0 @@ -# GPU and MPI Patterns - -## GPU Offloading Architecture - -Only `src/simulation/` is GPU-accelerated. Pre/post_process run on CPU only. - -MFC uses a **backend-agnostic GPU abstraction** via Fypp macros. The same source code -compiles to either OpenACC or OpenMP target offload depending on the build flag: - -- `./mfc.sh build --gpu acc` → OpenACC backend (NVIDIA nvfortran, Cray ftn) -- `./mfc.sh build --gpu mp` → OpenMP target offload backend (Cray ftn, AMD flang) -- `./mfc.sh build` (no --gpu) → CPU-only, GPU macros expand to plain Fortran - -### Macro Layers (in src/common/include/) -- `parallel_macros.fpp` — **Use these.** Generic `GPU_*` macros that dispatch to the - correct backend based on `MFC_OpenACC` / `MFC_OpenMP` compile definitions. -- `acc_macros.fpp` — OpenACC-specific `ACC_*` implementations (do not call directly) -- `omp_macros.fpp` — OpenMP target offload `OMP_*` implementations (do not call directly) - - OMP macros generate **compiler-specific** directives: NVIDIA uses `target teams loop`, - Cray uses `target teams distribute parallel do simd`, AMD uses - `target teams distribute parallel do` -- `shared_parallel_macros.fpp` — Shared helpers (collapse, private, reduction generators) - -### Key GPU Macros (always use the `GPU_*` prefix) - -Full set with signatures in `parallel_macros.fpp`. The ones you reach for most: -- `$:GPU_PARALLEL_LOOP(collapse=N, private=[...], reduction=[...], reductionOp='+')` - + `$:END_GPU_PARALLEL_LOOP()` — parallel spatial loop; by far the most common (see pattern below). -- `$:GPU_LOOP(collapse=N, ...)` — inner loop *within* a parallel region. -- `$:GPU_UPDATE(host=[...])` / `$:GPU_UPDATE(device=[...])` — device↔host copies (around MPI; see below). -- `#:call GPU_PARALLEL(...)` — block region for scalar reductions (`maxval`/`minval`). - -Others in `parallel_macros.fpp`: `GPU_ENTER_DATA`/`GPU_EXIT_DATA`, `GPU_DECLARE`, `GPU_ROUTINE`, -`GPU_ATOMIC`, `GPU_WAIT`, and the block macros `GPU_DATA`, `GPU_HOST_DATA`. - -Typical GPU loop pattern (the dominant spatial-loop idiom): -``` -$:GPU_PARALLEL_LOOP(private='[i,j,k,l]', collapse=3) -do l = idwbuff(3)%beg, idwbuff(3)%end - do k = idwbuff(2)%beg, idwbuff(2)%end - do j = idwbuff(1)%beg, idwbuff(1)%end - ! loop body - end do - end do -end do -$:END_GPU_PARALLEL_LOOP() -``` - -WARNING: Do NOT use `GPU_PARALLEL` wrapping `GPU_LOOP` for spatial loops. `GPU_LOOP` -emits empty directives on Cray and AMD compilers, causing silent serial execution. -Use `GPU_PARALLEL_LOOP` / `END_GPU_PARALLEL_LOOP` for all parallel spatial loops. - -NEVER write raw `!$acc` or `!$omp` directives. Always use `GPU_*` Fypp macros. -The precheck source lint will catch raw directives and fail. - -### Memory Management Macros (from macros.fpp) -- `@:ALLOCATE(var1, var2, ...)` — Fortran allocate + `GPU_ENTER_DATA(create=...)` -- `@:DEALLOCATE(var1, var2, ...)` — `GPU_EXIT_DATA(delete=...)` + Fortran deallocate -- `@:PREFER_GPU(var1, var2, ...)` — NVIDIA unified memory page placement hint -- Every `@:ALLOCATE` MUST have a matching `@:DEALLOCATE` in finalization -- Conditional allocation MUST have conditional deallocation - -### GPU Field Setup (Cray-specific, from macros.fpp) -- `@:ACC_SETUP_VFs(...)` / `@:ACC_SETUP_SFs(...)` — GPU pointer setup for vector/scalar fields -- These compile only for Cray (`_CRAYFTN`); other compilers skip them - -### Compiler-Backend Matrix - -CI-gated compilers (must always pass): gfortran, nvfortran, Cray ftn, Intel ifx. -AMD flang is additionally supported for GPU builds but not in the CI matrix. - -| Compiler | `--gpu acc` (OpenACC) | `--gpu mp` (OpenMP) | CPU-only | -|-----------------|----------------------|------------------------|----------| -| GNU gfortran | No | Experimental (AMD GCN) | Yes | -| NVIDIA nvfortran| Yes (primary) | Yes | Yes | -| Cray ftn (CCE) | Yes | Yes (primary) | Yes | -| Intel ifx | No | Experimental (SPIR64) | Yes | -| AMD flang | No | Yes | Yes | - -## Preprocessor Defines (`#ifdef` / `#ifndef`) - -Raw `#ifdef` / `#ifndef` preprocessor guards are **normal and expected** in MFC. -They are NOT the same as raw `!$acc`/`!$omp` pragmas (which are forbidden). - -Use `#ifdef` for feature, target, compiler, and library gating: - -### Feature gating -- `MFC_MPI` — MPI-enabled build (`--mpi` flag, default ON) -- `MFC_OpenACC` — OpenACC GPU backend (`--gpu acc`) -- `MFC_OpenMP` — OpenMP target offload backend (`--gpu mp`) -- `MFC_GPU` — Any GPU build (either OpenACC or OpenMP) -- `MFC_DEBUG` — Debug build (`--debug`) -- `MFC_SINGLE_PRECISION` — Single-precision mode (`--single`) -- `MFC_MIXED_PRECISION` — Mixed-precision mode (`--mixed`) - -### Target gating (for code in `src/common/` shared across executables) -- `MFC_PRE_PROCESS` — Only in pre_process builds -- `MFC_SIMULATION` — Only in simulation builds -- `MFC_POST_PROCESS` — Only in post_process builds - -### Compiler gating (for compiler-specific workarounds) -Compiler/feature macros: `_CRAYFTN`, `__NVCOMPILER_GPU_UNIFIED_MEM` (NVIDIA unified mem, GH-200 / -`--unified`), `__PGI` (legacy PGI/NVIDIA), `__INTEL_COMPILER`, `FRONTIER_UNIFIED`. Library code is -similarly gated (FFTW in `m_fftw.fpp` on `MFC_GPU`/`__PGI`; CUDA Fortran `cudafor` on -`__NVCOMPILER_GPU_UNIFIED_MEM`; SILO/HDF5 paths). Grep the relevant file for exact usage. - -When adding new `#ifdef` blocks, always provide an `#else` or `#endif` path so -the code compiles in all configurations (CPU-only, GPU-ACC, GPU-OMP, with/without MPI). - -## MPI - -### Halo Exchange -- Pack/unpack offset calculations are error-prone — verify carefully -- Buffer sizing depends on dimensionality and QBMM state -- GPU coherence: always `GPU_UPDATE(host=...)` before MPI send, - `GPU_UPDATE(device=...)` after MPI receive - -### Error Handling -- Use `call s_mpi_abort()` for fatal errors, never `stop` or `error stop` -- MPI must be finalized before program exit diff --git a/.claude/rules/parameter-system.md b/.claude/rules/parameter-system.md deleted file mode 100644 index ac1a7839d7..0000000000 --- a/.claude/rules/parameter-system.md +++ /dev/null @@ -1,80 +0,0 @@ -# Parameter System - -## Overview -MFC's simulation parameters are defined in Python and read by Fortran via namelist files. - -## Parameter Flow: Python → Fortran - -1. **Definition**: `toolchain/mfc/params/definitions.py` — source of truth - - Parameters are indexed families: `patch_icpp(i)%attr`, `fluid_pp(i)%attr`, etc. - - Each has type, default, constraints, and tags - -2. **Validation** (two layers): - - `toolchain/mfc/case.py` / `toolchain/mfc/params/registry.py` — JSON schema validation - via fastjsonschema (type checking, defaults) - - `toolchain/mfc/case_validator.py` — Physics constraint checking - (e.g., volume fractions sum to 1, dependency validation) - -3. **Input Generation**: `toolchain/mfc/run/input.py` - - Python case dict → Fortran namelist `.inp` file - - Format: `&user_inputs` ... `&end/` - -4. **Fortran Reading**: `src/*/m_start_up.fpp` - - Reads `&user_inputs` namelist - - Each parameter must be declared in the namelist statement - -## Adding a New Parameter (2-location checklist) - -Fortran declarations and namelist bindings are now auto-generated from definitions.py -at CMake configure time — no manual Fortran edits needed for simple scalar parameters. - -1. **`toolchain/mfc/params/definitions.py`**: Add parameter with `_r()` (type, default, - constraints) AND add it to `NAMELIST_VARS` via `_nv()` for the relevant target(s). - After editing, re-run cmake (or `./mfc.sh build`) to regenerate the Fortran includes. -2. **`toolchain/mfc/case_validator.py`**: Add validation rules if the parameter has - physics constraints. Include `PHYSICS_DOCS` entry with title, category, explanation. - -**Exceptions — still require manual Fortran edits:** -- Array variables (e.g. `logical, dimension(num_fluids_max)`) → declare in `src/*/m_global_parameters.fpp` -- Derived-type members (`fluid_pp%attr`, `patch_icpp(i)%attr`) → declare in the relevant derived type -- Case-optimization parameters → add to `CASE_OPT_PARAMS` and the `#:else` block in `src/simulation/m_global_parameters.fpp`. - Gotcha: under `--case-optimization` these are baked into the binary and dropped from the simulation namelist - (`case_dicts.py` filters them), so changing one needs a *rebuild*, not just a case edit — and building without - the flag makes them read from `.inp` again. - -## Case Files -- Case files are Python scripts (`.py`) that define a dict of parameters -- Validated with `./mfc.sh validate case.py` -- Examples in `examples/` directory -- Create new cases with `./mfc.sh new ` -- Search parameters with `./mfc.sh params ` - -## Fortran-Side Runtime Validation -Runtime parameter validation uses `@:PROHIBIT(condition, message)`. Put a check where it runs: -- **Shared across all three targets** → `src/common/m_checker_common.fpp` (`s_check_inputs_common`, - with `#ifndef MFC_*` gates for target-specific exclusions). This holds most checks. -- **Simulation-only** → `src/simulation/m_checker.fpp` (WENO/MUSCL/IGR/time-stepping/compiler checks). -- **Pre/post-only** → `src/{pre,post}_process/m_checker.fpp`. Note: their `s_check_inputs` are - currently empty — that's the right place for a pre/post-only constraint, not `m_checker_common.fpp`. - -Add Fortran-side checks in addition to `case_validator.py`. - -## Analytical Initial Conditions -String expressions in parameters become Fortran code via `case.py.__get_analytic_ic_fpp()`. -These are compiled into the binary, so syntax errors cause build failures, not runtime errors. - -Gotcha: each IC variable (`alpha_rho`, `vel`, `pres`, `alpha`, `Y`, `Bx`...) maps to an `eqn_idx%…` -expression in `QPVF_IDX_VARS` (`case.py`). Adding a conserved variable that patches can set means -updating that map *and* the Fortran `eqn_idx` builder to agree — a mismatch is a silent wrong-index, not -an error. (This is also why `Bx`/`By`/`Bz` use `eqn_idx%B%end-1/%end`, to stay valid in 1D/2D.) - -Available variables in analytical IC expressions: -- `x`, `y`, `z` — cell-center coordinates (mapped to `x_cc(i)`, `y_cc(j)`, `z_cc(k)`) -- `xc`, `yc`, `zc` — patch centroid coordinates -- `lx`, `ly`, `lz` — patch lengths -- `r` — patch radius; `eps`, `beta` — vortex parameters -- `e` — Euler's number (2.71828...) -- Standard Fortran math intrinsics available: `sin`, `cos`, `exp`, `sqrt`, `abs`, etc. -- For moving immersed boundaries: `t` (simulation time) is also available - -Example: `'patch_icpp(1)%vel(2)': '(x - xc) * exp(-((x-xc)**2 + (y-yc)**2))'` diff --git a/CLAUDE.md b/CLAUDE.md index 75b6abbe4a..b0c165ddde 100644 --- a/CLAUDE.md +++ b/CLAUDE.md @@ -6,129 +6,134 @@ toolchain for building/running/testing, and supports GPU acceleration via OpenAC OpenMP target offload. It must compile with gfortran, nvfortran, Cray ftn, and Intel ifx (CI-gated). AMD flang is additionally supported for OpenMP target offload GPU builds. -## Working Style - -Make surgical changes: every changed line should trace to the request. Don't refactor, -reformat, or "improve" adjacent code — in a four-compiler, golden-file-gated codebase, -incidental edits are how regressions slip in. For general behavioral guidance (simplicity, -surfacing assumptions, verifiable success criteria), invoke the `karpathy-guidelines` skill. +## How to Work Here + +You are editing this repository as a conservative maintainer. Three facts shape everything: + +1. **Four compilers, one truth.** Every line must compile and behave identically under four + CI-gated compilers and three GPU configurations. Code that is merely clever on one is + broken on another. Prefer the established idiom over the elegant one. +2. **Failures here are silent.** The worst bugs in a CFD code are not crashes — they are + answers that are subtly wrong: a wrong index, a serial loop that should be parallel, a + precision mix. Golden-file regression tests are the safety net, and incidental edits are + how regressions slip past them. +3. **Trust the toolchain.** `./mfc.sh` already solves environments, module loading, + dependencies, and build configuration; the lint (`./mfc.sh precheck`) encodes the + project's forbidden patterns. When unsure of a flag or rule, ask the tooling + (`--help`, precheck) rather than guessing. + +**Primary rule: make the smallest correct change. Prefer deleting code to adding code.** + +Diff discipline: +- Every changed line should trace to the request. Keep changes localized. +- Do not rewrite, reformat, or rename unrelated code. +- Do not add new files, dependencies, or net-positive LOC without clear justification. + +Do not add bloat: +- abstractions for one call site, or defensive code for hypothetical future callers +- validation layers around trusted internal data — constraint checks belong in + `case_validator.py` and the `m_checker*.fpp` files, not scattered through the solver +- optional parameters or config knobs for behavior with one correct value +- broad try/except blocks in toolchain Python +- comments explaining obvious code; compatibility shims unless explicitly requested + +Tests: add one only when it protects real behavior — it would fail before your change and +covers behavior a real case depends on. Prefer one targeted case over many broad ones. + +Before editing, state: the smallest viable fix, what changes, what deliberately does not +change, and whether a test is needed. After editing, report: files changed, net LOC, and +anything that could still be deleted or simplified. If a patch exceeds roughly 100 net new +lines, stop and justify before continuing. + +For general behavioral guidance (simplicity, surfacing assumptions, verifiable success +criteria), invoke the `karpathy-guidelines` skill. ## Commands -Prefer using `./mfc.sh` as the entry point for building, running, testing, formatting, -and linting. It handles virtual environments, module loading, dependency bootstrapping, -and build configuration. Avoid invoking CMake, Python toolchain scripts, or Fortran -compilers directly unless you have a specific reason. - -All commands run from the repo root via `./mfc.sh`. - -Run `./mfc.sh --help` for the full flag set; the most-used invocations: +All commands run from the repo root via `./mfc.sh`; avoid invoking CMake, Python toolchain +scripts, or Fortran compilers directly. Run `./mfc.sh --help` for flags +(`-j N` = parallel jobs). ```bash -# Build / run / test (-j N = parallel jobs) -./mfc.sh build -j 8 # all 3 targets; flags: -t , --gpu acc|mp, --debug, - # -i case.py --case-optimization (10x speedup) -./mfc.sh run case.py -n 4 # run with 4 MPI ranks; --no-build; -e batch (toolchain/templates/) -./mfc.sh test -j 8 # full suite; --only <1D|Bubbles|UUID>, -l, -% N (sample), - # --generate (regenerate golden files after an intended output change) - -# Verify before committing -./mfc.sh precheck -j 8 # all CI lint checks -./mfc.sh format -j 8 # auto-format Fortran (.fpp/.f90) + Python -./mfc.sh lint # ruff lint + Python unit tests (spelling: ./mfc.sh spelling) - -# Case files -./mfc.sh validate case.py # validate without running +./mfc.sh build -j 8 # all 3 targets; -t , --gpu acc|mp, --debug +./mfc.sh run case.py -n 4 # run with 4 MPI ranks; -e batch for clusters +./mfc.sh test -j 8 # full suite; --only , -l to list, + # --generate to refresh golden files after an intended output change +./mfc.sh format -j 8 # auto-format Fortran + Python +./mfc.sh precheck -j 8 # all CI lint checks — run before every commit +./mfc.sh validate case.py # validate a case without running ./mfc.sh params # search case parameters -./mfc.sh new # new case from template (clean: ./mfc.sh clean) ``` -Module loading (`source ./mfc.sh load -c -m `) is covered under System Identification below. - -## System Identification and Module Loading - -On an HPC cluster, load modules before building: `source ./mfc.sh load -c -m ` -(`-m g`/`gpu` or `c`/`cpu`). The `source` is required — plain `./mfc.sh load` errors, since -the command sets environment variables in the current shell. - -Slugs live in `toolchain/modules` (e.g. `p` Phoenix, `f` Frontier, `tuo` Tuolumne, `d` Delta, -`b` Bridges2, `c`/`cc` Carpenter GNU/Cray, `o` Oscar, `h` HiPerGator; GPU backend per system -is defined there). To identify the current system, check `$LMOD_SYSHOST` (most reliable), -then a non-empty `$CRAY_LD_LIBRARY_PATH` (→ Cray: Frontier / Carpenter-Cray), then `hostname` -— login and compute nodes may differ. Batch templates for `./mfc.sh run -e batch -c ` -are in `toolchain/templates/`. +On HPC clusters, load modules before building: `source ./mfc.sh load -c -m ` +(the `source` is required). Slugs and per-system GPU backends: `toolchain/modules`; batch +templates: `toolchain/templates/`. Identify the system via `$LMOD_SYSHOST`, then a +non-empty `$CRAY_LD_LIBRARY_PATH` (→ Cray), then `hostname`. ## Development Workflow Contract -IMPORTANT: Follow this loop for ALL code changes. Do not skip steps. - -1. **Read first** — Read and understand relevant code before modifying it. -2. **Plan** — For multi-file changes, outline your approach before implementing. -3. **Implement** — Make small, focused changes. One logical change per commit. -4. **Format** — Run `./mfc.sh format -j 8` to auto-format. -5. **Verify** — Run `./mfc.sh precheck -j 8` (same checks as the CI lint gate). -6. **Build** — Run `./mfc.sh build -j 8` to verify compilation. -7. **Test** — Run relevant tests: `./mfc.sh test --only -j 8`. - For changes to `src/common/`, test ALL three targets: `./mfc.sh test -j 8`. -8. **Commit** — Only after steps 4-7 pass. Do not commit untested code. +For ALL code changes: read the relevant code first, plan multi-file changes before +implementing, then **format → precheck → build → test → commit**, in that order, with one +logical change per commit. -YOU MUST run `./mfc.sh precheck` before any commit. This is enforced by pre-commit hooks. -YOU MUST run tests relevant to your changes before claiming work is done. -NEVER commit code that does not compile or fails tests. -NEVER use heredocs for git commit messages. Use simple `git commit -m "message"` instead. +- YOU MUST run `./mfc.sh precheck` before any commit (pre-commit hooks enforce this). +- YOU MUST run the tests relevant to your change before claiming work is done. + Changes to `src/common/` are shared by all three executables — test all three. +- NEVER commit code that does not compile or fails tests. +- NEVER use heredocs for git commit messages. Use simple `git commit -m "message"`. ## Architecture ``` -src/ - common/ # Shared code (used by ALL three executables — wide blast radius) - pre_process/ # Grid generation and initial conditions - simulation/ # CFD solver (GPU-accelerated via OpenACC / OpenMP target offload) - post_process/ # Data output and visualization -toolchain/ # Python CLI, build system, testing, parameter management - mfc/params/definitions.py # parameter definitions (source of truth) - mfc/case_validator.py # Physics constraint validation - mfc/test/ # Test runner and case generation -examples/ # Example simulation cases (case.py files) -tests/ # regression test golden files +src/common/ # shared by ALL three executables — wide blast radius +src/pre_process/ # grid generation and initial conditions +src/simulation/ # CFD solver (the only GPU-accelerated target) +src/post_process/ # data output and visualization +toolchain/ # Python CLI; params/definitions.py is the parameter source of truth +examples/ # example cases (case.py); tests/ holds regression golden files ``` Source files are `.fpp` (Fortran + Fypp macros), preprocessed to `.f90` by CMake. ## Critical Rules -NEVER use raw OpenACC/OpenMP pragmas (`!$acc`, `!$omp`). Use `GPU_*` Fypp macros instead. - Raw `#ifdef`/`#ifndef` preprocessor guards for feature/compiler/library gating ARE normal. -NEVER use double-precision intrinsics: `dsqrt`, `dexp`, `dlog`, `dble`, `dabs`, `real(8)`, `real(4)`. - Use generic intrinsics (`sqrt`, `exp`, `log`) and precision types (`wp`, `stp`). -NEVER use `d` exponent literals (`1.0d0`). Use `1.0_wp` instead. -NEVER use `stop` or `error stop`. Use `call s_mpi_abort()` or `@:PROHIBIT()`/`@:ASSERT()`. -NEVER use `goto`, `COMMON` blocks, or global `save` variables. - (Headline subset; full lint-enforced list — incl. Python/shell rules — in `.claude/rules/fortran-conventions.md`.) - -Every `@:ALLOCATE(...)` MUST have a matching `@:DEALLOCATE(...)`. -Every new parameter MUST be added in at least 2 places (3 if it has constraints): - 1. `toolchain/mfc/params/definitions.py` (parameter definition + NAMELIST_VARS target set) - 2. `toolchain/mfc/case_validator.py` (only if parameter has physics constraints) - Note: Fortran declarations and namelist bindings are auto-generated from definitions.py - at CMake configure time. Simple scalars need no manual Fortran edits. Array/derived-type - variables still require a manual declaration in `src/*/m_global_parameters.fpp`. - -Changes to `src/common/` affect ALL three executables. Test comprehensively. - -## Naming Conventions - -- Modules: `m_` (e.g., `m_bubbles`) -- Public subroutines: `s__` (e.g., `s_compute_pressure`) -- Public functions: `f__` -- Private/local variables: no prefix required. Constants: descriptive names, not ALL_CAPS. -- 2-space indentation, lowercase keywords, explicit `intent` on all arguments - -## Precision System - -- `wp` = working precision (computation). `stp` = storage precision (field data arrays and I/O). -- Both double by default. See `.claude/rules/fortran-conventions.md` for single/mixed - modes, casting rules, and MPI type matching (`mpi_p` ↔ `wp`, `mpi_io_p` ↔ `stp`). +The exhaustive forbidden-pattern list is `toolchain/mfc/lint_source.py` (enforced by +precheck and CI). The rules to internalize — they exist because MFC must behave +identically across compilers, precisions, and GPU backends: + +- GPU directives only via `GPU_*` Fypp macros — NEVER raw `!$acc`/`!$omp` pragmas. + (Raw `#ifdef`/`#ifndef` guards for feature/compiler/library gating ARE normal.) +- Precision only via `wp`/`stp` kinds and generic intrinsics — NEVER `dsqrt`/`dble`/ + `real(8)` or `d` exponent literals. Write `sqrt`, `1.0_wp`, `real(..., wp)`. +- Abort only via `call s_mpi_abort()` or `@:PROHIBIT()`/`@:ASSERT()` — NEVER `stop`/`error stop`. +- NEVER `goto`, `COMMON` blocks, or global `save` variables. +- Every `@:ALLOCATE(...)` MUST have a matching `@:DEALLOCATE(...)`. +- New parameters are defined in `toolchain/mfc/params/definitions.py` (plus + `case_validator.py` if physics-constrained); Fortran declarations and namelist bindings + are auto-generated. Exceptions: `.claude/rules/common-pitfalls.md`. + +## Naming and Style + +Modules `m_` with `s_initialize_/s_finalize__module` pairs; public +subroutines `s__`, functions `f__`; 2-space indent, lowercase +keywords, explicit `intent` on all arguments; constants get descriptive names, not +ALL_CAPS. Full hard/soft rule tables: `docs/documentation/contributing.md`. + +## Precision + +`wp` = working precision (computation); `stp` = storage precision (field arrays and I/O). +Both double by default; `--single` → both single; `--mixed` → wp=double, stp=half — so +wp/stp mixing is a silent precision bug, not a style issue. `scalar_field%sf` and all new +field arrays use `stp`. MPI types must match: `mpi_p` ↔ `wp`, `mpi_io_p` ↔ `stp`. + +## Where Things Are Documented + +`docs/documentation/` is freshness-checked by precheck (`lint_docs.py`) — prefer pointing +there over restating it. Most relevant: `contributing.md` (standards, architecture, +general pitfalls), `gpuParallelization.md` (GPU macro API), `testing.md` (test system), +`case.md` (case parameters, analytic ICs). MFC-specific traps with silent failure modes +live in `.claude/rules/common-pitfalls.md` — read it before touching indexing, GPU loops, +parameters, or tests. ## Code Review Priorities