feat(tui): auto-discover .codewhale/rules/ and .claude/rules/ directories as project context by yekern · Pull Request #3892 · Hmbown/CodeWhale

yekern · 2026-07-02T09:45:36Z

PR: feat(tui) — auto-discover `.codewhale/rules/` and `.claude/rules/` directories as project context

Summary

Add rules-directory auto-discovery to load_project_context(): on every session start,
CodeWhale automatically scans .codewhale/rules/ (native) and .claude/rules/ (Claude compat)
for .md files, loads them in filename order, and appends them to the project-context block
injected into the system prompt. Each rule is wrapped in a <project_rule source="…"> element.

This implements scheme D from the design anchor issue #3867: the same trust model as
AGENTS.md (workspace-contained content only, no absolute-path escape), with no #417
project-config relaxation required.

Motivation

Before this PR, CodeWhale's instruction system was nearly unusable in multi-project workflows:

- instructions config key blocked at project scope since v0.8.8 (PRIOR: Ignore dangerous project-level config keys #417): users could
  only list rule files in ~/.codewhale/config.toml, making it painful to maintain
  per-project rules across many repositories.
- No rules-directory auto-discovery: Claude Code's .claude/rules/ auto-loads all
  .md files; CodeWhale had no equivalent and no mechanism to load multiple rule files
  without manual config.
- No glob support in instructions_paths(), so even instructions = [".claude/rules/*.md"]
  was impossible.

The recommended path from the #3867 design discussion was D first — rules-directory
auto-discovery sits in the same trust class as AGENTS.md, needs no #417 relaxation, and
delivers the majority of multi-project pain relief on its own. This PR implements that slice.

Design decisions

`rules_block` vs mixing into `instructions`

Rules are stored in a separate rules_block: Option<String> field on ProjectContext,
not mixed into instructions. This is essential for mono-repo support:

- has_instructions() controls whether the parent-directory traversal searches for a root
  AGENTS.md. If rules alone set instructions, they would block parent discovery.
- By keeping rules in rules_block, has_instructions() stays unchanged (only reflects
  main instructions), and parent traversal works correctly.
- as_system_block() appends rules_block after instructions at render time, so both
  are present in the final system prompt.
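
The separation can be illustrated with a minimal sketch. It assumes a simplified ProjectContext that models only the two fields discussed here, and `should_search_parents` is a hypothetical helper standing in for the real traversal logic, so treat it as an illustration rather than the PR's code.

```rust
/// Simplified stand-in for the real `ProjectContext` (assumption: only the
/// two fields discussed above are modeled).
#[derive(Default)]
struct ProjectContext {
    instructions: Option<String>,
    rules_block: Option<String>,
}

impl ProjectContext {
    /// Reflects main instructions only; `rules_block` is deliberately ignored.
    fn has_instructions(&self) -> bool {
        self.instructions.is_some()
    }
}

/// Hypothetical helper: keep walking up to parent directories for a root
/// AGENTS.md only while no main instructions have been found.
fn should_search_parents(ctx: &ProjectContext) -> bool {
    !ctx.has_instructions()
}
```

With this shape, a sub-project that ships only rule files still reports `has_instructions() == false`, so the search for a root AGENTS.md continues.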

Security model

Same trust class as AGENTS.md:

- Workspace-subtree only: rules live in .codewhale/rules/ or .claude/rules/ within
  the project. No absolute-path escape.
- Symlink refusal: load_context_file() (shared with AGENTS.md) rejects symlinked files,
  matching the existing precedent in read_project_config_file.
- Capped at 50 files per directory (MAX_RULES_FILES) to prevent abuse.
- 100 KB per file (MAX_CONTEXT_SIZE) inherited from the context loader.

No #417 relaxation

merge_project_config's rejection of project-scope instructions is left unchanged.
Scheme D is orthogonal to #417 — it doesn't touch the config key at all.

Changes

`crates/tui/src/project_context.rs` (+~190 lines)

New constants:

- RULES_DIRS = [".codewhale/rules", ".claude/rules"]: directories scanned in order
- MAX_RULES_FILES = 50: per-directory file cap

New field on ProjectContext:

- rules_block: Option<String>, which holds the assembled rules XML, separate from instructions

New function load_rules_from_dir():

- Scans a rules directory for *.md files
- Sorts by filename for deterministic order
- Reuses load_context_file() for size checking + symlink safety + empty-file rejection
- Returns Vec<(PathBuf, String)>; silently returns empty on missing/unreadable directories
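
As a rough sketch of the behavior listed above (not the PR's code): the per-file checks below are an inline stand-in for load_context_file(), whose real signature is not shown in this thread. Two details are assumptions: the cap is applied after sorting, and 100 KB is taken to mean 100 * 1024 bytes.

```rust
use std::fs;
use std::path::{Path, PathBuf};

const MAX_RULES_FILES: usize = 50;
const MAX_CONTEXT_SIZE: u64 = 100 * 1024; // assumption: "100 KB" means 100 * 1024 bytes

/// Sketch of rules discovery: `*.md` only, filename order, capped per directory.
fn load_rules_from_dir(rules_dir: &Path) -> Vec<(PathBuf, String)> {
    // A missing or unreadable directory yields an empty list, not an error.
    let Ok(read_dir) = fs::read_dir(rules_dir) else {
        return Vec::new();
    };

    let mut paths: Vec<PathBuf> = read_dir
        .filter_map(|entry| entry.ok().map(|e| e.path()))
        .filter(|p| p.extension().and_then(|e| e.to_str()) == Some("md"))
        .collect();
    // All paths share one parent directory, so sorting paths sorts by filename.
    paths.sort();
    paths.truncate(MAX_RULES_FILES);

    let mut rules = Vec::new();
    for path in paths {
        // Inline stand-in for `load_context_file()`: refuse symlinks,
        // non-files, oversized files, and empty files.
        let Ok(meta) = fs::symlink_metadata(&path) else {
            continue;
        };
        if meta.file_type().is_symlink() || !meta.is_file() || meta.len() > MAX_CONTEXT_SIZE {
            continue;
        }
        match fs::read_to_string(&path) {
            Ok(text) if !text.trim().is_empty() => rules.push((path, text)),
            _ => {}
        }
    }
    rules
}
```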

Modified load_project_context():

- After loading PROJECT_CONTEXT_FILES (AGENTS.md etc.), iterates RULES_DIRS and calls
  load_rules_from_dir()
- Wraps each rule file in <project_rule source="…">…</project_rule>
- Stores assembled rules in ctx.rules_block (not ctx.instructions, preserving parent traversal)
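
The wrapping step might look roughly like the sketch below. `build_rules_block` is a hypothetical name, and both the exact whitespace inside the element and the workspace-relative `source` value are assumptions inferred from the system prompt diagram in this description.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical helper: render loaded rules as `<project_rule>` elements,
/// using workspace-relative paths for the `source` attribute.
fn build_rules_block(workspace: &Path, rules: &[(PathBuf, String)]) -> Option<String> {
    if rules.is_empty() {
        // No rules found: leave `rules_block` as `None`.
        return None;
    }
    let mut block = String::new();
    for (path, content) in rules {
        // Fall back to the full path if it is not under the workspace.
        let source = path.strip_prefix(workspace).unwrap_or(path.as_path());
        block.push_str(&format!(
            "<project_rule source=\"{}\">\n{}\n</project_rule>\n",
            source.display(),
            content.trim_end()
        ));
    }
    Some(block)
}
```

The returned value would be stored in ctx.rules_block, leaving ctx.instructions untouched.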

Modified as_system_block():

- Appends rules_block inside the project-context block when instructions exist
- Emits rules_block standalone when no main instructions are present
- Emits rules_block after constitution when constitution exists but instructions don't
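
The ordering in these three cases can be sketched as a single concatenation. This is a simplification under stated assumptions: `render_system_block` is a hypothetical free function, the blank-line separator is invented for illustration, and the wrapper around the project-context block is omitted.

```rust
/// Hypothetical rendering helper: emit whichever parts exist, always in the
/// order constitution, instructions, rules.
fn render_system_block(
    constitution: Option<&str>,
    instructions: Option<&str>,
    rules_block: Option<&str>,
) -> Option<String> {
    let parts: Vec<&str> = [constitution, instructions, rules_block]
        .iter()
        .flatten() // skip the parts that are `None`
        .copied()
        .collect();
    if parts.is_empty() {
        None
    } else {
        Some(parts.join("\n\n"))
    }
}
```

Because absent parts are simply skipped, rules render on their own when nothing else is present and after whichever earlier part exists otherwise.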

Modified project_context_cache_candidate_paths():

- Scans RULES_DIRS for *.md files and adds them to the cache-key candidate list
- Ensures rules changes invalidate the project-context cache (editing a rule file,
  adding/removing rule files all produce a different cache key)
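
To show why adding rule files to the candidate list is enough to invalidate the cache, here is a small sketch. `rules_signature` is a hypothetical function that hashes each candidate path together with its contents. The real cache key may be built differently (for example from file metadata), so this only demonstrates the invalidation property.

```rust
use std::collections::hash_map::DefaultHasher;
use std::fs;
use std::hash::{Hash, Hasher};
use std::path::PathBuf;

/// Hypothetical signature over the cache-key candidate list: any edit to a
/// listed file, or any change to the list itself, changes the result.
fn rules_signature(candidates: &[PathBuf]) -> u64 {
    let mut hasher = DefaultHasher::new();
    for path in candidates {
        path.hash(&mut hasher);
        // Unreadable or missing files hash as empty content in this sketch.
        fs::read(path).unwrap_or_default().hash(&mut hasher);
    }
    hasher.finish()
}
```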

9 new tests:

Test	What it covers
`rules_from_codewhale_dir_are_loaded_as_project_context`	Basic discovery + `<project_rule>` wrapper
`rules_are_loaded_in_filename_order`	Deterministic filename sort (aaa < mmm < zzz)
`rules_from_claude_dir_are_compat_loaded`	`.claude/rules/` compatibility
`rules_directory_missing_does_not_crash`	Graceful handling of missing directories
`rules_coexist_with_agents_md`	AGENTS.md + rules coexist, AGENTS.md precedes rules
`non_md_files_in_rules_dir_are_ignored`	Only `*.md` files are loaded
`rules_cap_truncates_excess_files`	MAX_RULES_FILES=50 enforced
`rules_rejects_symlinked_files`	Symlinked rule files are refused (unix only)
`rules_from_both_dirs_are_loaded_together`	Dual directory support + correct priority order

`crates/tui/src/context_report.rs` (+18 lines)

- /context report now includes rules_block content when rules are present
- When only rules exist (no main instructions), they appear as a separate "Project rules" entry

`crates/tui/src/project_context_cache.rs` (+28 lines, 2 tests)

- signature_changes_when_rules_file_changes: verifies content change triggers cache invalidation
- signature_changes_when_rules_file_is_added_or_removed: verifies file addition/removal triggers invalidation

Verification

Check	Result
`cargo fmt --all -- --check`	clean
`cargo clippy -p codewhale-tui` (our files only)	clean
`cargo test -p codewhale-tui --bin codewhale-tui -- project_context`	56 passed, 0 failed
`cargo test -p codewhale-tui --bin codewhale-tui -- project_context_cache`	7 passed, 0 failed
`cargo test -p codewhale-tui --bin codewhale-tui -- context_report`	9 passed, 0 failed

System prompt structure (with rules)

┌─ System Prompt ──────────────────────────────────────────────┐
│ [mode prompt + constitution]                                  │
│                                                                │
│ <project_instructions source="AGENTS.md">                      │
│   ...AGENTS.md content...                                     │
│ </project_instructions>                                       │
│                                                                │
│ <project_rule source=".codewhale/rules/coding-style.md">      │
│   ...rule content...                                          │
│ </project_rule>                                               │
│ <project_rule source=".codewhale/rules/testing.md">           │
│   ...rule content...                                          │
│ </project_rule>                                               │
│                                                                │
│ ── volatile boundary ──                                       │
│ ## Environment …                                              │
│ <instructions source="~/global.md">…</instructions>           │
└────────────────────────────────────────────────────────────────┘

Audit summary

A comprehensive cross-system audit (2 rounds, 5 dimensions) was performed to ensure no
regressions or unexpected interactions:

Audit scope	Verdict	Details
Prompt byte-stability	✅ Safe	Rules in static layer (same as AGENTS.md). KV cache busts on rule changes — by design.
All prompt construction paths	✅ Covered	TUI, engine init, refresh_system_prompt, `build_system_prompt` all go through `as_system_block()`.
Sub-agent / Fleet	🟡 Pre-existing	Model-visible `agent` tool ✔️ inherits rules via fork_context. Background `/agent` path ❌ uses static prompt — same pre-existing limitation as AGENTS.md.
WhaleFlow	✅ No interaction	Independent crate, no project-context references.
Project-context cache	✅ Fixed	Cache key now includes rules directory files. Tested for content change + file addition/removal.
Parent-directory AGENTS.md	✅ Preserved	`rules_block` separated from `instructions` — `has_instructions()` unchanged.
#417 project-config	✅ Unchanged	`merge_project_config`'s `instructions` rejection untouched.

What this PR does NOT do (deferred to future milestones)

- Glob support in instructions_paths() (scheme C)
- Path restriction for project-scope instructions relaxation (scheme B)
- Conditional rule loading with YAML frontmatter / paths matching (scheme E)
- Trust gating for project-scope instructions (scheme A)

These are tracked in #3867 as separate workstreams.

Migration path

- New projects: create .codewhale/rules/ (or .claude/rules/) and drop .md files.
  No config changes needed; rules are auto-discovered on next session start.
- Existing .claude/rules/ users: rules are picked up automatically, with zero migration cost.
- Existing global instructions users: both channels are additive (project rules + global
  instructions coexist in the system prompt), so no conflict.

Closes #3867


LeoLin990405

Thanks for the ping — really nice work, and cleanly scoped exactly as discussed: no #417 relaxation, merge_project_config untouched, and keeping rules_block separate from instructions so has_instructions() isn't poisoned is the right call for parent-directory AGENTS.md traversal in mono-repos. The cache-invalidation catch (adding the rules *.md to project_context_cache_candidate_paths) is a good find, the 50-file cap + deterministic filename order are sensible, and reusing load_context_file gets you the size check + per-file symlink safety for free.

One security gap worth closing before this lands, since it's exactly the "escape the workspace subtree" class Hunter flagged:

A symlinked rules directory escapes the workspace. rules_rejects_symlinked_files covers a symlinked .md file, but nothing checks whether .codewhale/rules / .claude/rules is itself a symlink. load_rules_from_dir (and the cache-path enumerator) call fs::read_dir(workspace.join(dir)) directly, which follows a directory symlink. The files behind it are real, so the per-file is_symlink() check in load_context_file passes them through:

$ ln -s /some/outside/dir .codewhale/rules     # real .md files live in /some/outside/dir
# symlink_metadata(".codewhale/rules/secret.md") → is_symlink=false, is_file=true
# → load_context_file reads /some/outside/dir/secret.md and injects it into project context

Confirmed locally: a repo shipping .codewhale/rules -> /some/outside/dir gets that directory's *.md read into the prompt at load_project_context time — before any command approval, including in read-only/plan mode. It's .md-only, so it's information disclosure rather than arbitrary read, but it still reads files the repo doesn't own, which is the #417 concern.

Suggested guard — refuse a symlinked rules dir (mirrors the existing file-level Refusing symlinked context file precedent):

// A repo could point .codewhale/rules at a path outside the workspace;
// refuse a symlinked rules directory so real .md files behind it aren't read.
if fs::symlink_metadata(&rules_dir)
    .map(|m| m.file_type().is_symlink())
    .unwrap_or(false)
{
    tracing::warn!(target: "project_context", dir = %rules_dir.display(), "Refusing symlinked rules directory");
    return entries;
}

Two follow-ups: the same guard needs to go in project_context_cache_candidate_paths (it re-scans the directory independently), and a rules_rejects_symlinked_directory test would lock it in. Since the directory scan is now duplicated in both places, it might be worth a small shared rules_md_files(workspace, dir) helper so the symlink guard can't drift between the load path and the cache path.
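
A possible shape for that shared helper, as a sketch only: the name rules_md_files(workspace, dir) comes from the suggestion above, while the body and return type are assumptions. Note that fs::symlink_metadata inspects only the final path component, so this sketch would not catch a symlinked parent such as .codewhale itself.

```rust
use std::fs;
use std::path::{Path, PathBuf};

/// Sketch of a shared enumerator for the load path and the cache path.
/// Returns the sorted `*.md` files in `workspace/dir`, or nothing if the
/// rules directory itself is a symlink.
fn rules_md_files(workspace: &Path, dir: &str) -> Vec<PathBuf> {
    let rules_dir = workspace.join(dir);

    // `symlink_metadata` does not follow the link, so a symlinked rules
    // directory is detected here instead of being traversed by `read_dir`.
    let is_symlink = fs::symlink_metadata(&rules_dir)
        .map(|m| m.file_type().is_symlink())
        .unwrap_or(false);
    if is_symlink {
        return Vec::new();
    }

    let Ok(read_dir) = fs::read_dir(&rules_dir) else {
        return Vec::new();
    };
    let mut files: Vec<PathBuf> = read_dir
        .filter_map(|entry| entry.ok().map(|e| e.path()))
        .filter(|p| p.extension().and_then(|e| e.to_str()) == Some("md"))
        .collect();
    files.sort();
    files
}
```

Routing both load_rules_from_dir and project_context_cache_candidate_paths through one enumerator like this would keep the guard in a single place.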

Everything else looks solid. 🐳

aidaiprivate-source · 2026-07-02T11:00:39Z

PR: feat(tui) — auto-discover .codewhale/rules/ and .claude/rules/ directories as project context

Closes #3867

Summary

Add rules-directory auto-discovery to load_project_context(): on every session start,
CodeWhale automatically scans .codewhale/rules/ (native) and .claude/rules/ (Claude compat)
for .md files, loads them in filename order, and appends them to the project-context block
injected into the system prompt. Each rule is wrapped in a <project_rule source="…"> element.

This completes solution D from the design anchor issue #3867 — the same trust model as
AGENTS.md (workspace-contained content only, no absolute-path escape), with no #417
project-config relaxation required.

Motivation

Before this PR, CodeWhale's instruction system was nearly unusable in multi-project workflows:

instructions config key blocked at project scope since v0.8.8 (PRIOR: Ignore dangerous project-level config keys #417) — users could
only list rule files in ~/.codewhale/config.toml, making it painful to maintain
per-project rules across many repositories.

No rules-directory auto-discovery — Claude Code's .claude/rules/ auto-loads all
.md files; CodeWhale had no equivalent and no mechanism to load multiple rule files
without manual config.

No glob support in instructions_paths(), so even instructions = [".claude/rules/*.md"]
was impossible.

The recommended path from the #3867 design discussion was D first — rules-directory
auto-discovery sits in the same trust class as AGENTS.md, needs no #417 relaxation, and
delivers the majority of multi-project pain relief on its own. This PR implements that slice.

Design decisions

rules_block vs mixing into instructions

Rules are stored in a separate rules_block: Option<String> field on ProjectContext,
not mixed into instructions. This is essential for mono-repo support:

has_instructions() controls whether the parent-directory traversal searches for a root
AGENTS.md. If rules alone set instructions, they would block parent discovery.

By keeping rules in rules_block, has_instructions() stays unchanged (only reflects
main instructions), and parent traversal works correctly.

as_system_block() appends rules_block after instructions at render time, so both
are present in the final system prompt.

Security model

Same trust class as AGENTS.md:

Workspace-subtree only — rules live in .codewhale/rules/ or .claude/rules/ within
the project. No absolute-path escape.

Symlink refusal — load_context_file() (shared with AGENTS.md) rejects symlinked files,
matching the existing precedent in read_project_config_file.

Capped at 50 files per directory (MAX_RULES_FILES) to prevent abuse.

100 KB per file (MAX_CONTEXT_SIZE) inherited from the context loader.

No #417 relaxation

merge_project_config's rejection of project-scope instructions is left unchanged.
Scheme D is orthogonal to #417 — it doesn't touch the config key at all.

Changes

crates/tui/src/project_context.rs (+~190 lines)

New constants:

RULES_DIRS = [".codewhale/rules", ".claude/rules"] — directories scanned in order

MAX_RULES_FILES = 50 — per-directory file cap

New field on ProjectContext:

rules_block: Option<String> — holds the assembled rules XML, separate from instructions

New function load_rules_from_dir():

Scans a rules directory for *.md files

Sorts by filename for deterministic order

Reuses load_context_file() for size checking + symlink safety + empty-file rejection

Returns Vec<(PathBuf, String)> — silently returns empty on missing/unreadable directories

Modified load_project_context():

After loading PROJECT_CONTEXT_FILES (AGENTS.md etc.), iterates RULES_DIRS and calls
load_rules_from_dir()

Wraps each rule file in <project_rule source="…">…</project_rule>

Stores assembled rules in ctx.rules_block (not ctx.instructions, preserving parent traversal)

Modified as_system_block():

Appends rules_block inside the project-context block when instructions exist

Emits rules_block standalone when no main instructions are present

Emits rules_block after constitution when constitution exists but instructions don't

Modified project_context_cache_candidate_paths():

Scans RULES_DIRS for *.md files and adds them to the cache-key candidate list

Ensures rules changes invalidate the project-context cache (editing a rule file,
adding/removing rule files all produce a different cache key)

9 new tests:

Test What it covers

rules_from_codewhale_dir_are_loaded_as_project_context Basic discovery + <project_rule> wrapper

rules_are_loaded_in_filename_order Deterministic filename sort (aaa < mmm < zzz)

rules_from_claude_dir_are_compat_loaded .claude/rules/ compatibility

rules_directory_missing_does_not_crash Graceful handling of missing directories

rules_coexist_with_agents_md AGENTS.md + rules coexist, AGENTS.md precedes rules

non_md_files_in_rules_dir_are_ignored Only *.md files are loaded

rules_cap_truncates_excess_files MAX_RULES_FILES=50 enforced

rules_rejects_symlinked_files Symlinked rule files are refused (unix only)

rules_from_both_dirs_are_loaded_together Dual directory support + correct priority order

crates/tui/src/context_report.rs (+18 lines)

/context report now includes rules_block content when rules are present

When only rules exist (no main instructions), they appear as a separate "Project rules" entry

crates/tui/src/project_context_cache.rs (+28 lines, 2 tests)

signature_changes_when_rules_file_changes — verifies content change triggers cache invalidation

signature_changes_when_rules_file_is_added_or_removed — verifies file addition/removal triggers invalidation

Verification

Check Result

cargo fmt --all -- --check clean

cargo clippy -p codewhale-tui (our files only) clean

cargo test -p codewhale-tui --bin codewhale-tui -- project_context 56 passed, 0 failed

cargo test -p codewhale-tui --bin codewhale-tui -- project_context_cache 7 passed, 0 failed

cargo test -p codewhale-tui --bin codewhale-tui -- context_report 9 passed, 0 failed

System prompt structure (with rules)
┌─ System Prompt ──────────────────────────────────────────────┐
│ [mode prompt + constitution]                                  │
│                                                                │
│ <project_instructions source="AGENTS.md">                      │
│   ...AGENTS.md content...                                     │
│ </project_instructions>                                       │
│                                                                │
│ <project_rule source=".codewhale/rules/coding-style.md">      │
│   ...rule content...                                          │
│ </project_rule>                                               │
│ <project_rule source=".codewhale/rules/testing.md">           │
│   ...rule content...                                          │
│ </project_rule>                                               │
│                                                                │
│ ── volatile boundary ──                                       │
│ ## Environment …                                              │
│ <instructions source="~/global.md">…</instructions>           │
└────────────────────────────────────────────────────────────────┘
Audit summary

A comprehensive cross-system audit (2 rounds, 5 dimensions) was performed to ensure no
regressions or unexpected interactions:

Audit scope Verdict Details

Prompt byte-stability ✅ Safe Rules in static layer (same as AGENTS.md). KV cache busts on rule changes — by design.

All prompt construction paths ✅ Covered TUI, engine init, refresh_system_prompt, build_system_prompt all go through as_system_block().

Sub-agent / Fleet 🟡 Pre-existing Model-visible agent tool ✔️ inherits rules via fork_context. Background /agent path ❌ uses static prompt — same pre-existing limitation as AGENTS.md.

WhaleFlow ✅ No interaction Independent crate, no project-context references.

Project-context cache ✅ Fixed Cache key now includes rules directory files. Tested for content change + file addition/removal.

Parent-directory AGENTS.md ✅ Preserved rules_block separated from instructions — has_instructions() unchanged.

#417 project-config ✅ Unchanged merge_project_config's instructions rejection untouched.

What this PR does NOT do (deferred to future milestones)

Glob support in instructions_paths() (scheme C)

Path restriction for project-scope instructions relaxation (scheme B)

Conditional rule loading with YAML frontmatter / paths matching (scheme E)

Trust gating for project-scope instructions (scheme A)

These are tracked in #3867 as separate workstreams.

Migration path

New projects: create .codewhale/rules/ (or .claude/rules/) and drop .md files.
No config changes needed — rules are auto-discovered on next session start.

Existing .claude/rules/ users: rules are picked up automatically — zero migration cost.

Existing global instructions users: both channels are additive (project rules + global
instructions coexist in the system prompt), so no conflict.

PR：feat(tui) — 自动发现 .codewhale/rules/ 和 .claude/rules/ 目录作为项目上下文

Closes #3867

概述

为 load_project_context() 新增 rules 目录自动发现：每次会话启动时，CodeWhale
自动扫描 .codewhale/rules/（原生）和 .claude/rules/（Claude 兼容）目录下的 .md
文件，按文件名排序加载，追加到注入 system prompt 的项目上下文块中。每条规则包裹在
<project_rule source="…"> 元素中。

这是设计锚点 issue #3867 中方案 D 的实现——与 AGENTS.md 相同的安全模型（仅限
工作区内容，无绝对路径逃逸），不需要 relax #417 项目级配置限制。

动机

此 PR 之前，CodeWhale 在多项目场景下的规则系统几乎不可用：

instructions 配置项被项目级禁止（自 v0.8.8 PRIOR: Ignore dangerous project-level config keys #417）——用户只能在
~/.codewhale/config.toml 中列举规则文件，跨多个仓库维护极其痛苦。

无 rules 目录自动发现——Claude Code 的 .claude/rules/ 自动加载所有 .md
文件；CodeWhale 没有对应机制，且无法批量加载多文件规则。

instructions_paths() 不支持 glob，即使写 instructions = [".claude/rules/*.md"]
也是无效的。

#3867 设计讨论的推荐路径是 D 优先——rules 目录自动发现与 AGENTS.md 同安全等级，
无需改动 #417，且能独立解决多项目痛点的大部分。本 PR 实现该方案。

设计决策

rules_block 分离 vs 混入 instructions

Rules 存储在 ProjectContext 的独立字段 rules_block: Option<String> 中，不混入
instructions。这对 mono-repo 场景至关重要：

has_instructions() 控制是否向上搜索父目录的 AGENTS.md。若 rules 单独设置了
instructions，会阻止父目录发现。

将 rules 保持在 rules_block 中，has_instructions() 保持不变（仅反映主指令），
父目录遍历正常工作。

as_system_block() 在渲染时将 rules_block 追在 instructions 之后，两者都出现在
最终 system prompt 中。

安全模型

与 AGENTS.md 同等级：

仅限工作区子树——rules 位于项目内的 .codewhale/rules/ 或 .claude/rules/。
无绝对路径逃逸。

拒绝软链接——load_context_file()（与 AGENTS.md 共享）拒绝软链接文件，与
read_project_config_file 中的现有先例一致。

每目录上限 50 文件（MAX_RULES_FILES）防止滥用。

每文件 100 KB（MAX_CONTEXT_SIZE）继承自上下文加载器。

不触碰 #417

merge_project_config 对项目级 instructions 的拒绝保持原样。方案 D 与 #417
完全正交——不涉及配置项。

改动

crates/tui/src/project_context.rs（+~190 行）

新增常量：

RULES_DIRS = [".codewhale/rules", ".claude/rules"] — 按顺序扫描的目录

MAX_RULES_FILES = 50 — 每目录文件上限

ProjectContext 新增字段：

rules_block: Option<String> — 存放组装好的 rules XML，与 instructions 分离

新增函数 load_rules_from_dir()：

扫描 rules 目录中的 *.md 文件

按文件名排序，保证确定性顺序

复用 load_context_file() 做大小检查 + 软链接安全 + 空文件拒绝

返回 Vec<(PathBuf, String)> — 目录缺失或不可读时静默返回空 vector

修改 load_project_context()：

加载 PROJECT_CONTEXT_FILES（AGENTS.md 等）后，遍历 RULES_DIRS 调用
load_rules_from_dir()

将每条规则包裹在 <project_rule source="…">…</project_rule> 中

组装结果存入 ctx.rules_block（而非 ctx.instructions，保留父目录遍历）

修改 as_system_block()：

instructions 存在时，将 rules_block 追在项目上下文块中

无主指令时，独立输出 rules_block

constitution 存在但 instructions 不存在时，constitution 后输出 rules_block

修改 project_context_cache_candidate_paths()：

扫描 RULES_DIRS 中的 *.md 文件，加入缓存 key 候选列表

确保 rules 变更触发项目上下文缓存失效（编辑规则文件、新增/删除规则文件均产生不同缓存 key）

9 个新测试：

测试覆盖

rules_from_codewhale_dir_are_loaded_as_project_context 基础发现 + <project_rule> 包裹

rules_are_loaded_in_filename_order 确定性文件名排序（aaa < mmm < zzz）

rules_from_claude_dir_are_compat_loaded .claude/rules/ 兼容

rules_directory_missing_does_not_crash 目录缺失不崩溃

rules_coexist_with_agents_md AGENTS.md + rules 共存，AGENTS.md 在前

non_md_files_in_rules_dir_are_ignored 仅加载 *.md

rules_cap_truncates_excess_files MAX_RULES_FILES=50 强制

rules_rejects_symlinked_files 拒绝软链接规则文件（仅 unix）

rules_from_both_dirs_are_loaded_together 双目录共存 + 正确优先级

crates/tui/src/context_report.rs（+18 行）

/context report 现在在 rules 存在时包含 rules_block 内容

仅 rules 存在（无主指令）时显示为独立的"Project rules"条目

crates/tui/src/project_context_cache.rs（+28 行，2 个测试）

signature_changes_when_rules_file_changes — 验证内容变更触发缓存失效

signature_changes_when_rules_file_is_added_or_removed — 验证文件增删触发失效

验证

检查项结果

cargo fmt --all -- --check clean

cargo clippy -p codewhale-tui（仅本次改动文件） clean

cargo test -p codewhale-tui --bin codewhale-tui -- project_context 56 passed, 0 failed

cargo test -p codewhale-tui --bin codewhale-tui -- project_context_cache 7 passed, 0 failed

cargo test -p codewhale-tui --bin codewhale-tui -- context_report 9 passed, 0 failed

System prompt 结构（含 rules）
┌─ System Prompt ──────────────────────────────────────────────────┐
│ [mode prompt + constitution]                                      │
│                                                                    │
│ <project_instructions source="AGENTS.md">                          │
│   ...AGENTS.md 内容...                                             │
│ </project_instructions>                                           │
│                                                                    │
│ <project_rule source=".codewhale/rules/coding-style.md">          │
│   ...规则内容...                                                   │
│ </project_rule>                                                   │
│ <project_rule source=".codewhale/rules/testing.md">               │
│   ...规则内容...                                                   │
│ </project_rule>                                                   │
│                                                                    │
│ ── volatile boundary ──                                           │
│ ## Environment …                                                  │
│ <instructions source="~/global.md">…</instructions>               │
└────────────────────────────────────────────────────────────────────┘
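The static part of the layout above (AGENTS.md first, then one `<project_rule>` element per file) could be assembled along these lines. This is a sketch under assumptions: `render_rule` and `render_static_context` are hypothetical names, the exact whitespace between elements is a guess, and the caller is assumed to pass rules already in filename order.

```rust
/// Wraps one rule file's text in a <project_rule> element.
/// Hypothetical helper; the element name and `source` attribute come from
/// the PR description, the whitespace handling is an assumption.
fn render_rule(source: &str, body: &str) -> String {
    format!(
        "<project_rule source=\"{}\">\n{}\n</project_rule>",
        source,
        body.trim_end()
    )
}

/// Builds the static project-context text: AGENTS.md (if any) first,
/// followed by the rules in the order given.
fn render_static_context(agents_md: Option<&str>, rules: &[(&str, &str)]) -> String {
    let mut parts: Vec<String> = Vec::new();
    if let Some(text) = agents_md {
        parts.push(format!(
            "<project_instructions source=\"AGENTS.md\">\n{}\n</project_instructions>",
            text.trim_end()
        ));
    }
    for (source, body) in rules {
        parts.push(render_rule(source, body));
    }
    parts.join("\n\n")
}
```

Keeping the rules as separate elements with a `source` attribute means each rule stays attributable to its file once it is inside the prompt.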
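Audit summary

A full cross-system audit (2 rounds, 5 dimensions) was carried out to make sure there are no regressions or unexpected interactions:

| Audit scope | Conclusion | Details |
| --- | --- | --- |
| Prompt byte stability | ✅ Safe | Rules live in the static layer (same as AGENTS.md). The KV cache refreshes when rules change, which is by design. |
| All prompt construction paths | ✅ Fully covered | TUI, engine init, refresh_system_prompt, and build_system_prompt all go through as_system_block(). |
| Sub-agents / Fleet | 🟡 Pre-existing limitation | The model-visible agent tool ✔️ inherits rules via fork_context. The background /agent path ❌ uses a static prompt, the same pre-existing limitation as AGENTS.md. |
| WhaleFlow | ✅ No interaction | Separate crate with no references to project context. |
| Project context cache | ✅ Fixed | The cache key now includes the rules-directory files. Verified for content changes and for file additions and removals. |
| Parent-directory AGENTS.md | ✅ Preserved | rules_block is kept separate from instructions, so has_instructions() is unchanged. |
| #417 project config | ✅ Untouched | The instructions rejection in merge_project_config stays as it is. |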

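What this PR does NOT do (deferred to future milestones)

Glob support in instructions_paths() (solution C)

Path-restricted relaxation of project-level instructions (solution B)

On-demand loading via YAML frontmatter / paths matching (solution E)

Trust gating for project-level instructions (solution A)

These are tracked as separate workstreams in #3867.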

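Migration path

New projects: create .codewhale/rules/ (or .claude/rules/) and add .md files.
No config change is needed; rules are discovered automatically at the next session start.

Existing .claude/rules/ users: rules take effect as they are, with zero migration cost.

Existing global instructions users: the two channels are additive (project rules and global
instructions coexist in the system prompt), with no conflict.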

yekern · 2026-07-02T11:07:09Z

@LeoLin990405 Good catch — just pushed a commit adding the symlink-directory guard.

Two places patched: load_rules_from_dir() (the load path) and project_context_cache_candidate_paths() (the cache-key path), both with the same fs::symlink_metadata(…).is_symlink() check. Added rules_rejects_symlinked_directory test to lock it in. All green (59 tests, 0 failures).

Appreciate the thorough review. 🐳

Hmbown · 2026-07-02T17:19:49Z

Review — solid feature; needs a rebase onto the merged v0.8.67 context work

This is a nice addition and it directly addresses #3867 (project-scope instructions being overly denied). The design is thoughtful and the security model is handled well:

Symlink containment is correct. Refusing a symlinked rules directory (fs::symlink_metadata(...).is_symlink()) is the right call — the comment nails why a per-file is_symlink check alone wouldn't suffice (the real .md files behind a symlinked dir would pass the per-file check and be read from outside the workspace). Each file also goes through load_context_file (size + symlink safety). 👍
Bounded: MAX_RULES_FILES = 50 per dir with a truncation warning; deterministic filename ordering.
rules_block kept separate from instructions so rules alone don't suppress parent-directory AGENTS.md discovery via has_instructions() — good, subtle call.
.claude/rules/ compat alongside .codewhale/rules/ is a sensible bridge.
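The directory-level guard praised in the first point can be sketched as below. This is an illustration only: `rules_dir_is_loadable` is a hypothetical name, while the `fs::symlink_metadata(...)` check is the one quoted in the thread.

```rust
use std::fs;
use std::path::Path;

/// Hypothetical guard: true only when `dir` is a real directory.
/// symlink_metadata inspects the path itself without following links, so a
/// rules directory that is a symlink (possibly pointing outside the
/// workspace) is refused before any file inside it is looked at.
fn rules_dir_is_loadable(dir: &Path) -> bool {
    match fs::symlink_metadata(dir) {
        Ok(meta) => !meta.file_type().is_symlink() && meta.is_dir(),
        // A missing or unreadable path is treated as "no rules directory".
        Err(_) => false,
    }
}
```

The check has to sit at the directory level because the files behind a symlinked directory are ordinary files, so a per-file symlink test would let them through.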

Blocker: rebase required

This now conflicts with main (2 markers in project_context.rs) — the merged #3861 v0.8.67 work reworked that file (constitution loading + repo-law protected_invariants). A rebase onto current main is needed before it can land, and CI here only ran partially (SKIPPED,SUCCESS), so a full run post-rebase is worth confirming.

The interaction is only textual, not behavioral: this feature reads rules as project context, which is orthogonal to the repo-law write enforcement that also lives in project_context.rs now — they don't fight, they just touch the same file.

One minor suggestion (non-blocking)

Per-file content is capped (MAX_CONTEXT_SIZE, 100 KB), but 50 files × 100 KB ≈ 5 MB of rules could be injected in the pathological case. Consider a total byte budget across the assembled rules_block (truncate with a note once the cumulative size crosses a threshold) so a large rules dir can't dominate the context window. Compaction would eventually handle it, but bounding at assembly time is cheaper.

Net: sound and safe design; rebase + full CI, optional total-size cap. Happy to help with the rebase against the new project_context.rs if useful.

(Reviewed against the current diff; not a merge/approve — for @Hmbown's decision.)

…ries as project context

Add rules-directory auto-discovery as solution D from Hmbown#3867.

- Scans .codewhale/rules/ (native) and .claude/rules/ (Claude compat) for *.md files
- Loads in filename order, wraps each in <project_rule source="…"> elements
- Separates rules into rules_block field to avoid blocking parent AGENTS.md traversal
- Reuses load_context_file() for size checking + symlink safety (MAX_CONTEXT_SIZE 100KB)
- Caps at MAX_RULES_FILES=50 per directory to prevent abuse
- Adds rules files to project_context_cache_candidate_paths for proper cache invalidation
- Updates /context report to surface rules

Files: project_context.rs (+~190), context_report.rs (+18), project_context_cache.rs (+28)
Tests: 11 new (9 rules + 2 cache), 67 passed, clippy clean

@LeoLin990405

A symlinked rules directory (e.g. .codewhale/rules -> /outside) would allow real .md files behind it to pass per-file symlink checks and be read from outside the workspace subtree — same escape class as Hmbown#417. Adds fs::symlink_metadata guard in load_rules_from_dir() and project_context_cache_candidate_paths() to skip symlinked directories. New test rules_rejects_symlinked_directory locks in the guard. Reported-by: @LeoLin990405

50 files × 100KB could reach ~5MB. Caps cumulative rules_block at MAX_RULES_BLOCK_BYTES=500KB with truncation marker to prevent a large rules directory from dominating the context window. Suggested-by: review on Hmbown#3892

yekern · 2026-07-03T01:05:27Z

@Hmbown Rebased onto latest main (v0.8.67 constitution work from #3861). Two review-driven additions:

Symlinked rules directory guard (caught by @LeoLin990405): load_rules_from_dir() and project_context_cache_candidate_paths() now refuse symlinked directories with the same fs::symlink_metadata(…).is_symlink() check used for files. Added rules_rejects_symlinked_directory test.
Total byte budget: MAX_RULES_BLOCK_BYTES = 500 KB caps the assembled rules_block to prevent a pathological 50 × 100 KB scenario from dominating the context window. Truncation includes an explicit marker.
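A cumulative budget like the one described in the second point might look roughly like this. The sketch makes several assumptions: truncation happens at whole-rule granularity, the marker text is invented, `assemble_rules_block` is a hypothetical name, and 500 KB is taken to mean 500 × 1024 bytes, which the thread does not specify.

```rust
/// Assumption: the PR says 500 KB; the exact byte value is not shown.
const MAX_RULES_BLOCK_BYTES: usize = 500 * 1024;

/// Hypothetical marker text; the PR only says truncation is marked explicitly.
const TRUNCATION_MARKER: &str = "<!-- rules truncated: total byte budget exceeded -->";

/// Joins already-rendered rules with newlines until adding the next one
/// would push the block past `budget` bytes, then appends the marker and
/// stops. The marker itself is not counted against the budget.
fn assemble_rules_block(rendered_rules: &[String], budget: usize) -> String {
    let mut block = String::new();
    for rule in rendered_rules {
        let separator = if block.is_empty() { 0 } else { 1 };
        if block.len() + separator + rule.len() > budget {
            if !block.is_empty() {
                block.push('\n');
            }
            block.push_str(TRUNCATION_MARKER);
            break;
        }
        if !block.is_empty() {
            block.push('\n');
        }
        block.push_str(rule);
    }
    block
}
```

With the per-file cap at 100 KB, a 500 KB budget admits at most a handful of maximum-size files, instead of the roughly 5 MB that 50 such files would otherwise add.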

All green: 62 tests pass (including 10 new rules tests + 2 cache tests), fmt clean, clippy clean. Conflict was textual only — the ignored_project_whale_warnings line from #3861 sits right above our rules loading block.

yekern requested a review from Hmbown as a code owner July 2, 2026 09:45

yekern mentioned this pull request Jul 2, 2026

Project-scope instructions are overly denied — need glob + rules directory auto-discovery #3867

Closed

LeoLin990405 reviewed Jul 2, 2026

View reviewed changes

yekern added 3 commits July 3, 2026 08:58

yekern force-pushed the codex/rules-dir-auto-discovery branch from fbd881a to 20926cf Compare July 3, 2026 01:03

Hmbown merged commit 296f050 into Hmbown:main Jul 3, 2026
13 checks passed
