ci(test): land the #185 residuals — flake-repro workflow + stuck-state timeout diagnostic#197

Merged: rbuergi merged 2 commits into main from ci/flake-repro-and-timeout-diag on Jul 2, 2026

@rbuergi rbuergi commented Jul 2, 2026

Closes #185 (residual 1 already landed via #194).

Residual 2 — reusable 2-vCPU flake-repro workflow

.github/workflows/flake-repro.yml: manually dispatched only (workflow_dispatch; the branch's TEMP self-serve push: trigger has been stripped, so merging this cannot start any run). It loops one suspect test on the real ubuntu 2-vCPU runner until it fails. Inputs are project / filter / iterations / log level; the log level is set via ENV, never by editing committed log levels. On failure it uploads the failing iteration's trx, the MonolithMeshTestBase traces, and the blame hang-dump. It reliably reproduced the CodeEditRecompile flake on iterations 4/19/80 and can target any project/test. Zero src/ risk.
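
The loop-until-failure idea can be sketched outside of GitHub Actions. What follows is a minimal Python illustration, not the contents of flake-repro.yml: `run_until_failure` and `dotnet_test` are hypothetical names, and the exact `dotnet test` arguments the workflow passes are an assumption (only the project / filter / iterations inputs come from the description above).

```python
import subprocess
from typing import Callable, Optional


def run_until_failure(run_once: Callable[[int], int], iterations: int) -> Optional[int]:
    """Call run_once(i) for i = 1..iterations and stop at the first failure.

    Returns the 1-based iteration whose exit code was non-zero,
    or None if every iteration passed.
    """
    for i in range(1, iterations + 1):
        if run_once(i) != 0:
            return i
    return None


def dotnet_test(project: str, test_filter: str) -> Callable[[int], int]:
    """Build a runner that shells out to `dotnet test` (assumed invocation).

    Each iteration writes its own trx file so the failing one can be uploaded.
    """
    def run_once(i: int) -> int:
        cmd = [
            "dotnet", "test", project,
            "--filter", test_filter,
            "--logger", f"trx;LogFileName=iter-{i}.trx",
        ]
        return subprocess.run(cmd).returncode

    return run_once
```

With a runner that fails on its fourth call, `run_until_failure(runner, 100)` returns 4 and makes no further calls. A caller would then exit non-zero on a non-None result, so that a reproduced flake shows up as a red run.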

Residual 3 — stuck-state timeout diagnostic

CodeEditRecompileTest.WaitForLatestRelease now dumps a discriminating diagnostic when its 50 s wait times out. It compares the MIRROR view (the cross-hub cache handle the test reads) with the INDEX view (persisted+indexed state + Release children) of the same node, which separates owner-side "never produced v2" from delivery/mirror staleness. It then issues a fresh re-trigger to split a one-time missed emission (recovers → sub-case a) from a persistent clobber / dead subscription (stays stuck → sub-case b). This runs only in the already-failing timeout path; the happy path is untouched.
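
The decision logic of the diagnostic can be written down as two small classifiers. This is a hedged Python sketch of the reasoning only, not the C# in CodeEditRecompileTest: the `Snapshot` type, the integer version model, and all function names are assumptions introduced for illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Staleness(Enum):
    OWNER_NEVER_PRODUCED = "owner-side: index does not hold the expected version"
    DELIVERY_OR_MIRROR = "delivery/mirror: index is current, mirror is behind"
    NONE_OBSERVED = "mirror and index both hold the expected version"


class Retrigger(Enum):
    SUB_CASE_A = "one-time missed emission (recovered after re-trigger)"
    SUB_CASE_B = "persistent clobber / dead subscription (stayed stuck)"


@dataclass(frozen=True)
class Snapshot:
    # Version seen through the cross-hub cache handle; None if nothing was seen.
    mirror_version: Optional[int]
    # Version in the persisted+indexed state; None if the node is absent.
    index_version: Optional[int]


def classify_staleness(snap: Snapshot, expected: int) -> Staleness:
    """Compare the two views of the same node against the expected version."""
    if snap.index_version is None or snap.index_version < expected:
        return Staleness.OWNER_NEVER_PRODUCED
    if snap.mirror_version is None or snap.mirror_version < expected:
        return Staleness.DELIVERY_OR_MIRROR
    return Staleness.NONE_OBSERVED


def classify_retrigger(recovered: bool) -> Retrigger:
    """A re-trigger that recovers points at a missed emission; otherwise stuck."""
    return Retrigger.SUB_CASE_A if recovered else Retrigger.SUB_CASE_B
```

For example, a mirror at version 1 with an index at version 2 (expected 2) classifies as delivery/mirror staleness, while an index still at version 1 classifies as owner-side.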

Provenance & verification

🤖 Generated with Claude Code

…e timeout diagnostic

Residual 2: .github/workflows/flake-repro.yml — a manually-dispatched
(workflow_dispatch-only; the branch's TEMP push trigger stripped) job that
loops one suspect test on the real 2-vCPU ubuntu runner until it fails and
uploads the failing trx + traces + blame hang-dump, with logging cranked via
ENV (never by editing committed log levels). It reliably manufactured the
CodeEditRecompile flake (iterations 4/19/80) and targets any project/filter.

Residual 3: CodeEditRecompileTest.WaitForLatestRelease now dumps a full
discriminating diagnostic on its 50s timeout — MIRROR (cross-hub cache handle)
vs INDEX (persisted+indexed state + Release children) views of the node to
separate owner-side "never produced v2" from delivery/mirror staleness, plus a
fresh re-trigger to split one-time missed emission (recovers) from persistent
clobber/dead subscription (stays stuck). Timing-neutral: only runs in the
already-failing timeout path.

Extracted from the never-PR'd ci/flake-repro-workflow branch WITHOUT the
abandoned #124 watcher refinement it also carried (superseded by #194's
commit-path high-water fix). Completes #185 (residual 1 landed in #194).

Fixes #185.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

Copilot AI left a comment

Pull request overview

Adds a reusable manual GitHub Actions workflow to reproduce CI-only flakes on the real 2‑vCPU runner, and enhances CodeEditRecompileTest.WaitForLatestRelease to emit richer diagnostics when the existing 50s wait times out (mirror vs index views + a bounded re-trigger attempt).

Changes:

  • Add .github/workflows/flake-repro.yml (workflow_dispatch only) to loop a specified test until failure and upload TRX + dumps + MeshWeaver traces.
  • Extend WaitForLatestRelease timeout path to dump discriminating state (MIRROR vs INDEX + Release children) and attempt a re-trigger to classify “one-time missed emission” vs “persistently stuck”.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| test/MeshWeaver.Hosting.Monolith.Test/CodeEditRecompileTest.cs | Adds timeout-path diagnostics (mirror vs index, release listing, re-trigger classification). |
| .github/workflows/flake-repro.yml | New manual workflow to reproduce flakes on ubuntu-latest and upload diagnostics artifacts. |

Comment thread test/MeshWeaver.Hosting.Monolith.Test/CodeEditRecompileTest.cs
Comment thread test/MeshWeaver.Hosting.Monolith.Test/CodeEditRecompileTest.cs
Comment thread test/MeshWeaver.Hosting.Monolith.Test/CodeEditRecompileTest.cs
Comment thread .github/workflows/flake-repro.yml Outdated

github-actions Bot commented Jul 2, 2026

Test Results (shard 3)

792 tests  ±0   685 ✅ ±0   2m 22s ⏱️ -16s
 13 suites ±0   107 💤 ±0 
 13 files   ±0     0 ❌ ±0 

Results for commit cc980cb. ± Comparison against base commit 77df906.

♻️ This comment has been updated with latest results.

github-actions Bot commented Jul 2, 2026

Test Results (shard 1)

1 396 tests  ±0   1 395 ✅ ±0   6m 31s ⏱️ +9s
   14 suites ±0       1 💤 ±0 
   14 files   ±0       0 ❌ ±0 

Results for commit cc980cb. ± Comparison against base commit 77df906.

♻️ This comment has been updated with latest results.

github-actions Bot commented Jul 2, 2026

Test Results (shard 2)

   15 files  ± 0     15 suites  ±0   6m 15s ⏱️ -44s
1 164 tests  - 43  1 161 ✅  - 43  3 💤 ±0  0 ❌ ±0 
1 165 runs   - 45  1 162 ✅  - 45  3 💤 ±0  0 ❌ ±0 

Results for commit cc980cb. ± Comparison against base commit 77df906.

This pull request removes 43 tests.
MeshWeaver.Persistence.Test.DocumentationCodeBlockCompilationTest ‑ ExecutedCsharpBlocks_MustCompile(embeddedResourceName: "MeshWeaver.Documentation.Data.AI.ChatCommands.md")
MeshWeaver.Persistence.Test.DocumentationCodeBlockCompilationTest ‑ ExecutedCsharpBlocks_MustCompile(embeddedResourceName: "MeshWeaver.Documentation.Data.AI.TeamsBot.md")
MeshWeaver.Persistence.Test.DocumentationCodeBlockCompilationTest ‑ ExecutedCsharpBlocks_MustCompile(embeddedResourceName: "MeshWeaver.Documentation.Data.Architecture.AccessC"···)
MeshWeaver.Persistence.Test.DocumentationCodeBlockCompilationTest ‑ ExecutedCsharpBlocks_MustCompile(embeddedResourceName: "MeshWeaver.Documentation.Data.Architecture.DataSyn"···)
MeshWeaver.Persistence.Test.DocumentationCodeBlockCompilationTest ‑ ExecutedCsharpBlocks_MustCompile(embeddedResourceName: "MeshWeaver.Documentation.Data.Architecture.EmailIn"···)
MeshWeaver.Persistence.Test.DocumentationCodeBlockCompilationTest ‑ ExecutedCsharpBlocks_MustCompile(embeddedResourceName: "MeshWeaver.Documentation.Data.Architecture.HubInit"···)
MeshWeaver.Persistence.Test.DocumentationCodeBlockCompilationTest ‑ ExecutedCsharpBlocks_MustCompile(embeddedResourceName: "MeshWeaver.Documentation.Data.Architecture.ImageCl"···)
MeshWeaver.Persistence.Test.DocumentationCodeBlockCompilationTest ‑ ExecutedCsharpBlocks_MustCompile(embeddedResourceName: "MeshWeaver.Documentation.Data.Architecture.MemexCl"···)
MeshWeaver.Persistence.Test.DocumentationCodeBlockCompilationTest ‑ ExecutedCsharpBlocks_MustCompile(embeddedResourceName: "MeshWeaver.Documentation.Data.Architecture.MeshGra"···)
MeshWeaver.Persistence.Test.DocumentationCodeBlockCompilationTest ‑ ExecutedCsharpBlocks_MustCompile(embeddedResourceName: "MeshWeaver.Documentation.Data.Architecture.Notific"···)
…

♻️ This comment has been updated with latest results.

github-actions Bot commented Jul 2, 2026

Test Results (shard 0)

2 151 tests  +673   2 146 ✅ +668   8m 41s ⏱️ + 4m 58s
   13 suites +  1       5 💤 +  5 
   13 files   +  1       0 ❌ ±  0 

Results for commit cc980cb. ± Comparison against base commit 77df906.

♻️ This comment has been updated with latest results.

github-actions Bot commented Jul 2, 2026

Test Results

   55 files  +  1     55 suites  +1   23m 50s ⏱️ + 4m 6s
5 503 tests +630  5 387 ✅ +625  116 💤 +5  0 ❌ ±0 
5 504 runs  +628  5 388 ✅ +623  116 💤 +5  0 ❌ ±0 

Results for commit cc980cb. ± Comparison against base commit 77df906.

This pull request removes 43 and adds 673 tests. Note that renamed tests count towards both.
MeshWeaver.Persistence.Test.DocumentationCodeBlockCompilationTest ‑ ExecutedCsharpBlocks_MustCompile(embeddedResourceName: "MeshWeaver.Documentation.Data.AI.ChatCommands.md")
MeshWeaver.Persistence.Test.DocumentationCodeBlockCompilationTest ‑ ExecutedCsharpBlocks_MustCompile(embeddedResourceName: "MeshWeaver.Documentation.Data.AI.TeamsBot.md")
MeshWeaver.Persistence.Test.DocumentationCodeBlockCompilationTest ‑ ExecutedCsharpBlocks_MustCompile(embeddedResourceName: "MeshWeaver.Documentation.Data.Architecture.AccessC"···)
MeshWeaver.Persistence.Test.DocumentationCodeBlockCompilationTest ‑ ExecutedCsharpBlocks_MustCompile(embeddedResourceName: "MeshWeaver.Documentation.Data.Architecture.DataSyn"···)
MeshWeaver.Persistence.Test.DocumentationCodeBlockCompilationTest ‑ ExecutedCsharpBlocks_MustCompile(embeddedResourceName: "MeshWeaver.Documentation.Data.Architecture.EmailIn"···)
MeshWeaver.Persistence.Test.DocumentationCodeBlockCompilationTest ‑ ExecutedCsharpBlocks_MustCompile(embeddedResourceName: "MeshWeaver.Documentation.Data.Architecture.HubInit"···)
MeshWeaver.Persistence.Test.DocumentationCodeBlockCompilationTest ‑ ExecutedCsharpBlocks_MustCompile(embeddedResourceName: "MeshWeaver.Documentation.Data.Architecture.ImageCl"···)
MeshWeaver.Persistence.Test.DocumentationCodeBlockCompilationTest ‑ ExecutedCsharpBlocks_MustCompile(embeddedResourceName: "MeshWeaver.Documentation.Data.Architecture.MemexCl"···)
MeshWeaver.Persistence.Test.DocumentationCodeBlockCompilationTest ‑ ExecutedCsharpBlocks_MustCompile(embeddedResourceName: "MeshWeaver.Documentation.Data.Architecture.MeshGra"···)
MeshWeaver.Persistence.Test.DocumentationCodeBlockCompilationTest ‑ ExecutedCsharpBlocks_MustCompile(embeddedResourceName: "MeshWeaver.Documentation.Data.Architecture.Notific"···)
…
MeshWeaver.AI.Test.ActivityLogStreamTest ‑ Progress_Messages_Stream_Gradually_Not_Just_At_The_End
MeshWeaver.AI.Test.ActivityLogStreamTest ‑ Script_Failure_Flips_ActivityLog_Status_To_Failed
MeshWeaver.AI.Test.ActivityLogStreamTest ‑ Script_Log_Messages_Land_On_ActivityLog_Node
MeshWeaver.AI.Test.AgentChatClientDeadlockTest ‑ GetOrderedAgentsAsync_WithContextPath_ConcurrentCallers_DoNotDeadlock
MeshWeaver.AI.Test.AgentChatClientDeadlockTest ‑ GetOrderedAgentsAsync_WithContextPath_SingleCaller_ResolvesQuickly
MeshWeaver.AI.Test.AgentChatClientDeadlockTest ‑ GetOrderedAgentsAsync_WithMarkdownContext_DoesNotDeadlock
MeshWeaver.AI.Test.AgentChatClientTest ‑ AgentChatClient_Initialize_SurfacesPartitionAgents_NotNodeTypeNamespaceAgents
MeshWeaver.AI.Test.AgentChatClientTest ‑ AgentChatClient_Initialize_SurfacesPlatformAgentsForAnyContext
MeshWeaver.AI.Test.AgentChatClientUnitTest ‑ FindCyclicDelegations_Chain_ReturnsEmpty
MeshWeaver.AI.Test.AgentChatClientUnitTest ‑ FindCyclicDelegations_DelegationsWithNoMatchingTarget_ReturnsEmpty
…

♻️ This comment has been updated with latest results.

- The three facts using WaitForLatestRelease get [Fact(Timeout = 120000)] with
  rationale: the happy path completes in seconds; the budget is for the FAILURE
  path — the 50s primary wait plus the discriminating diagnostic (probes +
  decisive re-trigger, worst ~50s) must fit inside the xUnit method timeout or
  the diagnostic is cancelled before it can be emitted.
- Index probes are best-effort and tightly bounded (first emission, 5s): waiting
  for a non-empty snapshot could stall the whole bound when the node is
  genuinely absent from the index — itself a diagnostic result.
- flake-repro.yml exits non-zero when the flake reproduces so the run is
  clearly red in the Actions list; diagnostics-collection/upload steps still
  run via if: always().
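
The timeout budget and the bounded index probe described in the bullets above can be checked with a few lines. This is an illustrative Python sketch under stated assumptions: the constants come from the text (50 s primary wait, roughly 50 s worst-case diagnostic, 120000 ms method timeout, 5 s probe), while `fits_budget`, `first_emission`, and the use of a queue to stand in for the index stream are hypothetical.

```python
import queue
from typing import Optional, TypeVar

T = TypeVar("T")

PRIMARY_WAIT_S = 50       # primary wait in WaitForLatestRelease
DIAGNOSTIC_WORST_S = 50   # probes + decisive re-trigger, worst case (approximate)
METHOD_TIMEOUT_S = 120    # [Fact(Timeout = 120000)]
INDEX_PROBE_S = 5         # bound on each best-effort index probe


def fits_budget(primary_s: float, diagnostic_s: float, method_timeout_s: float) -> bool:
    """True when the failure path can finish before the method timeout fires."""
    return primary_s + diagnostic_s <= method_timeout_s


def first_emission(source: "queue.Queue[T]", timeout_s: float) -> Optional[T]:
    """Take whatever arrives first within timeout_s; None means nothing arrived.

    The probe does not wait for a non-empty snapshot: an empty result is
    itself a diagnostic finding (the node may be absent from the index).
    """
    try:
        return source.get(timeout=timeout_s)
    except queue.Empty:
        return None
```

With the values above, `fits_budget(PRIMARY_WAIT_S, DIAGNOSTIC_WORST_S, METHOD_TIMEOUT_S)` holds with about 20 s of headroom; with a 60 s method timeout it would not, and the diagnostic would be cancelled before it could be emitted.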

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
@rbuergi rbuergi merged commit c39b7f5 into main Jul 2, 2026
11 checks passed

Development

Successfully merging this pull request may close these issues.

Residuals from abandoned #124: watcher eager high-water advance + standalone flake-repro harness