HYPERFLEET-1199 - feat: Add /e2e-debug skill #63

Open
tirthct wants to merge 3 commits into openshift-hyperfleet:main from tirthct:e2e-debug

@tirthct tirthct commented Jun 16, 2026

Contributor

Summary

  • Add /e2e-debug skill to the hyperfleet-devtools plugin — an AI-powered forensic debugger that automates root cause analysis of failed E2E CI pipeline runs
  • Accepts Prow URLs, GCS artifact links, GitHub Actions URLs, job names, or JIRA tickets as input
  • Validated against 3 real Prow failures — all 3 diagnoses matched actual fix commits (PR #99, PR #107, HYPERFLEET-1225)

What it does

The skill runs a 6-step forensic workflow:

  1. Artifact Walk — recursively reads every file in the GCS artifact tree (setup, test, cleanup logs, component logs, junit.xml, ci-operator metrics)
  2. Pattern Matching — matches errors against 30+ documented failure patterns including GKE node upgrades, Maestro DB pagination, Helm ownership conflicts, adapter conditions, and RFC 9457 error codes
  3. Change Verification — time-bounded commit/PR queries across 6+ repos, JIRA bug search, prior run pass/fail history via GCS finished.json, cross-run comparison (commit, chart versions, GKE cluster version, node names)
  4. Synthesis — timeline reconstruction with timestamps, node assignments, and infrastructure context; preliminary confidence scoring
  5. Live Cluster Inspection (when kubectl/gcloud available) — Maestro DB accumulation, orphaned resources, GKE node operations (gcloud container operations list), maintenance policy, pod health, Sentinel metrics
  6. Forensic Certification Gate — contradiction check, symptom-vs-cause check, two-source corroboration requirement; forces LOW confidence if evidence is insufficient
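
The certification gate in step 6 can be pictured as a small decision function. The sketch below is illustrative only and is not taken from SKILL.md: the function name `certify_confidence`, its arguments, and the exact rule (any contradiction, or fewer than two corroborating sources, forces LOW) are assumptions based on the step description above.

```shell
# certify_confidence PRELIMINARY SOURCES CONTRADICTION   (hypothetical helper)
#   PRELIMINARY   : HIGH, MEDIUM, or LOW from the synthesis step
#   SOURCES       : number of independent evidence sources backing the root cause
#   CONTRADICTION : "yes" if any evidence contradicts the hypothesis, else "no"
certify_confidence() {
  local preliminary="$1" sources="$2" contradiction="$3"
  # Assumed rule: a contradiction or fewer than two sources forces LOW.
  if [ "$contradiction" = "yes" ] || [ "$sources" -lt 2 ]; then
    echo "LOW"
    return 0
  fi
  # Otherwise the preliminary score from synthesis is kept unchanged.
  echo "$preliminary"
}

certify_confidence HIGH 1 no    # single source: downgraded
certify_confidence HIGH 2 no    # two sources, no contradiction: kept
```

The gate only ever lowers confidence; it never raises a score that the synthesis step assigned.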

Key design decisions

  • No shortcuts on artifacts: the skill walks the entire GCS tree including cleanup logs, which contain post-failure cluster state (e.g., whether Helm uninstall succeeded, namespace deletion status)
  • No incorrect inferences: Helm uninstall succeeding does NOT prove a pod was alive (Helm metadata is in etcd). The skill explicitly warns against this in 3 locations
  • Time-bounded everything: commit queries use since=/until=, PR queries use merged:>, gcloud operations use date filters. No --limit flags anywhere
  • GKE node upgrade detection: cross-run comparison catches GKE version changes; gcloud container operations list finds UPGRADE_NODES overlapping the test window (see HYPERFLEET-1225)
  • Graceful degradation: core diagnosis (Steps 1-4) works with gh CLI alone. kubectl/gcloud add live cluster validation but are optional. Confidence capped at MEDIUM if kubectl is available but not used
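
As a rough illustration of the time-bounded queries described above, the sketch below builds a `since=`/`until=` commits path for `gh api` and a `merged:>` search term for `gh pr list`. The helper names, the example dates, and the choice of repository are assumptions for illustration; the actual commands in SKILL.md may differ. The commands are printed rather than executed so the sketch runs without network access.

```shell
# commits_api_path OWNER/REPO SINCE UNTIL   (hypothetical helper)
# Builds a GitHub REST path that lists commits inside a time window.
commits_api_path() {
  printf 'repos/%s/commits?since=%s&until=%s\n' "$1" "$2" "$3"
}

# merged_pr_search DATE   (hypothetical helper)
# Builds a search qualifier for PRs merged after DATE (YYYY-MM-DD).
merged_pr_search() {
  printf 'merged:>%s\n' "$1"
}

# Example window around a failure; these values are made up.
REPO="openshift-hyperfleet/hyperfleet-api"
SINCE="2026-06-09T00:00:00Z"
UNTIL="2026-06-16T00:00:00Z"

# Print the commands instead of running them.
printf 'gh api "%s"\n' "$(commits_api_path "$REPO" "$SINCE" "$UNTIL")"
printf 'gh pr list --repo %s --state merged --search "%s"\n' \
  "$REPO" "$(merged_pr_search "${SINCE%%T*}")"
```

Bounding the window by the failure time, rather than using a result-count limit, keeps a week-old failure from being compared against today's commits.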

Negative scenarios handled

  • Passing run URL → stops immediately
  • ABORTED run → different investigation path (checks prowjob.json for abort reason)
  • In-progress run → detects missing finished.json, reports status
  • Setup-only failure (no test directory) → falls back to setup/ci-operator logs
  • Wrong kubectl context → verifies against cluster name from setup logs
  • Wrong gcloud project → verifies against project ID from setup logs
  • Missing all-resources.txt → notes "node assignments unknown" and proceeds
  • Expired cross-run artifacts → skips comparison, does not guess
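
The first three scenarios above amount to a triage on finished.json. A minimal sketch follows, assuming the file has already been downloaded locally and carries a `result` field with values such as SUCCESS, FAILURE, or ABORTED; the function name `classify_run` and the returned labels are hypothetical.

```shell
# classify_run PATH_TO_FINISHED_JSON   (hypothetical helper)
# Prints one of: IN_PROGRESS, PASSED, ABORTED, FAILED, UNKNOWN
classify_run() {
  local finished="$1" result
  # A missing finished.json is treated as a run that has not completed yet.
  if [ ! -f "$finished" ]; then
    echo "IN_PROGRESS"
    return 0
  fi
  # Extract the value of the "result" field without requiring jq.
  result=$(sed -n 's/.*"result"[[:space:]]*:[[:space:]]*"\([A-Z_]*\)".*/\1/p' "$finished" | head -n 1)
  case "$result" in
    SUCCESS) echo "PASSED" ;;    # stop immediately
    ABORTED) echo "ABORTED" ;;   # inspect prowjob.json for the abort reason
    FAILURE) echo "FAILED" ;;    # continue with the artifact walk
    *)       echo "UNKNOWN" ;;   # field absent or value not recognised
  esac
}
```

Called with a path that does not exist, the function reports IN_PROGRESS instead of guessing an outcome.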

Files changed

| File | Change |
| --- | --- |
| hyperfleet-devtools/skills/e2e-debug/SKILL.md | New: 546-line skill definition |
| hyperfleet-devtools/skills/e2e-debug/references/known-failure-patterns.md | New: 30+ error patterns with investigation steps |
| hyperfleet-devtools/skills/e2e-debug/references/ci-quick-reference.md | New: Prow jobs, GCS structure, environment details |
| hyperfleet-devtools/.claude-plugin/plugin.json | Version 0.5.1 → 0.6.0, description updated |
| hyperfleet-devtools/README.md | Added skill section, external systems, version bump |
| .claude-plugin/marketplace.json | Description updated |
| AGENTS.md | Inventory: 3 → 4 skills, version 0.6.0 |
| hyperfleet-devtools/docs/ | Presentation deck + build script (supplementary) |

Test plan

  • Install locally: claude --plugin-dir ./hyperfleet-devtools
  • Verify skill appears: /hyperfleet-devtools:e2e-debug in available skills
  • Run against a passing run URL — should stop with "This run passed"
  • Run against a failing Prow URL — should produce structured output with confidence score
  • Run against a job name (no URL) — should fetch latest-build.txt and resolve
  • Verify kubectl checks run when kubectl is available and connected
  • Verify graceful degradation when kubectl/gcloud are not available
  • Verify time-bounded commits (debug a week-old failure — should show that week's commits, not today's)

@openshift-ci openshift-ci Bot requested review from jsell-rh and rafabene June 16, 2026 00:32
openshift-ci Bot commented Jun 16, 2026


[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign vkareh for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

coderabbitai Bot commented Jun 16, 2026

📝 Walkthrough

Summary by CodeRabbit

Release Notes

  • New Features

    • Introduced the E2E CI Failure Debugger (e2e-debug) to help perform end-to-end HyperFleet pipeline forensic diagnosis, including confidence scoring and structured root-cause output.
  • Documentation

    • Added a complete debugger runbook (including presentation outline), a CI quick reference, and a known failure pattern guide to speed up triage and classification.
    • Updated installation and usage examples for the new debugger tool.
  • Release / Versioning

    • Updated devtools plugin documentation and bumped the plugin to v0.6.0, including expanded skill inventory and updated availability/version metadata.

Walkthrough

The hyperfleet-devtools plugin is bumped from 0.5.x to 0.6.0 across plugin.json, marketplace.json, and AGENTS.md. A new e2e-debug skill is introduced, defining a 6-step forensic procedure that covers:

  • input classification from Prow/GitHub Actions URLs, job names, or JIRA tickets
  • GCS artifact retrieval with recursive log parsing and UTC timeline reconstruction from heterogeneous timestamp formats
  • handbook and known-failure-pattern cross-reference, with a short-circuit for Prow infrastructure failures
  • time-bounded git/GitHub/JIRA queries with cross-run comparison against prior Prow executions
  • confidence-scored hypothesis synthesis (HIGH/MEDIUM/LOW tiers)
  • mandatory live kubectl inspection when available (context discovery, Maestro ResourceBundles, pod health, node state, orphaned manifests)
  • optional gcloud-based GKE node operations, maintenance policy checks, and Pub/Sub leak detection
  • a forensic certification gate enforcing contradiction checks and two-source corroboration
  • mandatory structured output (failure point, root cause, evidence, action), with guardrails against hallucination and a verify-before-claiming rule

Two reference documents provide Prow job naming, GCS artifact structure, manual rerun via the gangway API, and error-signature-to-category mappings covering API timeouts, validation failures, adapter health, Kubernetes etcd saturation, DNS/image pull failures, and persistent state leaks. The README documents the new tool and its capabilities.

A 12-slide presentation outlines the current debugging pain (20+ minute root-cause time), pipeline health statistics, failure costs, the forensic workflow, a worked example with validation, prerequisites with graceful degradation, and a future automation roadmap (GitHub Action trigger, Slack integration, confidence-gated auto-triage).

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~22 minutes

🚥 Pre-merge checks | ✅ 10 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| No Injection Vectors | ⚠️ Warning | JIRA ticket validation (lines 50-56 in SKILL.md) lacks exit control flow: regex validates format but if statement has no exit/return to prevent passing invalid input to jira CLI (CWE-78). | Add 'exit 1' after error message on line 54 to stop execution when JIRA format validation fails. |
✅ Passed checks (10 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | Title clearly summarizes the primary change: adding a new /e2e-debug skill to hyperfleet-devtools. It is specific, concise, and directly reflects the main contribution. |
| Description check | ✅ Passed | Description is detailed and directly related to the changeset, covering the skill's purpose, workflow steps, design decisions, and test plan. It clearly explains what is being added and why. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Sec-02: Secrets In Log Output | ✅ Passed | PR adds documentation and config only (4 Markdown files, 2 JSON configs). No Go code files or executable scripts. Zero log statements to evaluate; check inapplicable. |
| No Hardcoded Secrets | ✅ Passed | No hardcoded secrets found. All examples use placeholders, command substitution, or variables; URLs contain no embedded credentials; JIRA input validated with grep pattern ^[A-Z]+-[0-9]+$; base64 u... |
| No Weak Cryptography | ✅ Passed | PR contains only documentation and configuration (Markdown, JSON). No cryptographic primitives (MD5, DES, RC4, SHA1 for security, ECB, HMAC), custom crypto implementations, or non-constant-time com... |
| No Privileged Containers | ✅ Passed | PR contains no Kubernetes manifests, Helm templates, or Dockerfiles. No privileged container configurations or security context settings found in any added/modified files. |
| No Pii Or Sensitive Data In Logs | ✅ Passed | Skill outputs infrastructure diagnostics (node names, cluster identifiers, error messages) without exposing PII, credentials, session IDs, or customer data. No logging statements found; skill is Ma... |

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 3

🧹 Nitpick comments (1)
hyperfleet-devtools/skills/e2e-debug/SKILL.md (1)

6-6: 💤 Low value

Remove unused Read from allowed-tools.

Line 6 declares allowed-tools: Bash, Read, WebFetch, AskUserQuestion, but the skill only uses Bash (for gh, gcloud, kubectl, jira commands), WebFetch (for GCS artifacts), and AskUserQuestion (line 26). The Read tool is not exercised. Per coding guidelines, do not request tools the skill does not use.

♻️ Proposed fix
-allowed-tools: Bash, Read, WebFetch, AskUserQuestion
+allowed-tools: Bash, WebFetch, AskUserQuestion
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@hyperfleet-devtools/skills/e2e-debug/SKILL.md` at line 6, The allowed-tools
declaration includes the Read tool, but reviewing the skill implementation shows
it only uses Bash, WebFetch, and AskUserQuestion. Remove Read from the
allowed-tools list in the SKILL.md file to ensure only the tools actually used
by the skill are declared, per coding guidelines.

Source: Coding guidelines

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@hyperfleet-devtools/skills/e2e-debug/SKILL.md`:
- Line 266: The jira issue list command on line 266 is vulnerable to JQL
injection because the keyword-from-error is interpolated directly into the query
string without validation or escaping. To fix this, validate the keyword before
interpolation by filtering it to only alphanumeric characters, underscores, and
hyphens (removing or replacing any special characters or JQL operators), then
use the sanitized keyword in the query string. Alternatively, if the jira CLI
supports structured parameter passing or environment variables for query
parameters, use those mechanisms instead of string interpolation to avoid
injection entirely.
- Line 366: The kubectl port-forward commands at
hyperfleet-devtools/skills/e2e-debug/SKILL.md lines 366, 419, and 424 lack
timeout protection, causing indefinite hangs if the service is unreachable or
kubectl context is misconfigured. For each of these three locations, replace the
current `kubectl port-forward ... & PF_PID=$!; sleep 2; curl ...` pattern with a
timeout wrapper around the port-forward command (e.g., `timeout 5 kubectl
port-forward ...`), followed by a check to verify the process started
successfully using `kill -0 $PF_PID`, and only proceed with the curl command if
the process is running. This ensures the skill fails safely if port-forward
cannot establish within the timeout period, satisfying the fail-safe requirement
for dynamic context.
- Line 47: The SKILL.md file accepts a JIRA ticket input in the format
HYPERFLEET-XXXX without validating its format before passing it to the jira CLI
query around lines 262-270. This creates a security vulnerability where
malformed or attacker-controlled input could inject JQL metacharacters. Add
format validation early in the step (before line 266 where the jira query is
executed) to ensure the JIRA_TICKET variable matches the expected pattern of
uppercase project key characters, followed by a hyphen, followed by one or more
digits using a regex pattern like ^[A-Z]+-[0-9]+$. If the format is invalid,
output an error message and exit the step. This validation should be documented
or referenced at line 47 where the argument-hint is defined, and the actual
validation logic should be placed before the jira CLI execution in the 262-270
line range.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Central YAML (base), Organization UI (inherited)

Review profile: CHILL

Plan: Enterprise

Run ID: 4e34d43d-e223-4891-a79e-9d2861e10f4e

📥 Commits

Reviewing files that changed from the base of the PR and between c1f42c3 and 518fa78.

📒 Files selected for processing (8)
  • .claude-plugin/marketplace.json
  • AGENTS.md
  • hyperfleet-devtools/.claude-plugin/plugin.json
  • hyperfleet-devtools/README.md
  • hyperfleet-devtools/docs/e2e-debug-presentation.md
  • hyperfleet-devtools/skills/e2e-debug/SKILL.md
  • hyperfleet-devtools/skills/e2e-debug/references/ci-quick-reference.md
  • hyperfleet-devtools/skills/e2e-debug/references/known-failure-patterns.md
🔗 Linked repositories identified

CodeRabbit considers these linked repositories for cross-repo context during reviews:

  • openshift-hyperfleet/architecture (manual)
  • openshift-hyperfleet/hyperfleet-api (manual)
  • openshift-hyperfleet/hyperfleet-sentinel (manual)
  • openshift-hyperfleet/hyperfleet-adapter (manual)
  • openshift-hyperfleet/hyperfleet-broker (manual)

Comment thread hyperfleet-devtools/skills/e2e-debug/SKILL.md Outdated

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@hyperfleet-devtools/skills/e2e-debug/SKILL.md`:
- Line 47: Add input validation to prevent JIRA and JQL injection
vulnerabilities by implementing two complementary fixes. At
hyperfleet-devtools/skills/e2e-debug/SKILL.md#L47-L47, add a format validation
check using the regex pattern `^[A-Z]+-[0-9]+$` to validate the JIRA ticket
input before it is used in Step 3c jira execution command. At
hyperfleet-devtools/skills/e2e-debug/SKILL.md#L262-L270, add keyword
sanitization that filters out non-alphanumeric characters from the extracted
keywords before line 266 where the keywords are interpolated into the JQL query.
Both fixes address the root cause of missing input validation on external input
passed directly to CLI commands.
- Around line 266-270: The jira issue list commands on lines 266 and subsequent
lines interpolate error keywords directly into JQL queries without sanitization,
creating a query injection vulnerability. Before interpolating the
keyword_from_error variable into the -q parameter, sanitize it to contain only
alphanumeric characters, hyphens, and underscores by using sed or similar
filtering (e.g., sed 's/[^a-zA-Z0-9_-]/ /g'). Apply this sanitization to all
locations where error text is extracted from logs and inserted into JQL query
strings to prevent injection of JQL operators or quotes that could modify the
query logic.
- Line 366: The kubectl port-forward commands at line 366 (Maestro DB check),
line 419 (Sentinel metrics), and line 424 (API status check) lack timeout
protection, which can cause indefinite hangs and block skill execution. For each
of these three instances, wrap the kubectl port-forward command with a timeout
wrapper (e.g., timeout 5), add a process-alive check using kill -0 on the PF_PID
variable to verify the port-forward started successfully, execute the curl
command only if the process is running, and ensure the process is cleaned up
with kill. This prevents hangs and ensures fail-safe behavior when services are
unreachable.
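
A minimal sketch of the keyword sanitization requested above, assuming the keyword is later placed inside a quoted JQL text search. The function name `sanitize_jql_keyword` is hypothetical, and `tr` is swapped in for the suggested `sed` expression so that embedded newlines are replaced too; the allowed character set is the same.

```shell
# sanitize_jql_keyword RAW_TEXT   (hypothetical helper)
# Keeps letters, digits, underscores, and hyphens; everything else,
# including quotes, operators, and newlines, becomes a single space.
sanitize_jql_keyword() {
  printf '%s' "$1" \
    | LC_ALL=C tr -cs 'a-zA-Z0-9_-' ' ' \
    | sed 's/^ //; s/ $//'
}

# Quotes and the != operator are stripped from a hostile-looking log fragment.
sanitize_jql_keyword 'context deadline exceeded" OR project != "X'
echo
```

Operator words such as OR survive as plain text, so the sanitized keyword is only safe when it stays inside the quoted search string and is never concatenated as a bare JQL clause.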
ℹ️ Review info
Run ID: 9bd2e055-80de-4459-ae65-f504bf0ee2c0

📥 Commits

Reviewing files that changed from the base of the PR and between 518fa78 and c45f2e7.

📒 Files selected for processing (8)
  • .claude-plugin/marketplace.json
  • AGENTS.md
  • hyperfleet-devtools/.claude-plugin/plugin.json
  • hyperfleet-devtools/README.md
  • hyperfleet-devtools/docs/e2e-debug-presentation.md
  • hyperfleet-devtools/skills/e2e-debug/SKILL.md
  • hyperfleet-devtools/skills/e2e-debug/references/ci-quick-reference.md
  • hyperfleet-devtools/skills/e2e-debug/references/known-failure-patterns.md
✅ Files skipped from review due to trivial changes (4)
  • hyperfleet-devtools/.claude-plugin/plugin.json
  • .claude-plugin/marketplace.json
  • AGENTS.md
  • hyperfleet-devtools/skills/e2e-debug/references/ci-quick-reference.md

Comment thread hyperfleet-devtools/skills/e2e-debug/SKILL.md Outdated
Comment thread hyperfleet-devtools/skills/e2e-debug/SKILL.md Outdated
Comment thread hyperfleet-devtools/skills/e2e-debug/SKILL.md Outdated
- Add JIRA ticket format validation (^[A-Z]+-[0-9]+$) before CLI use
- Sanitize JQL keywords from error logs (strip metacharacters)
- Add timeout wrapper (timeout 10) to all kubectl port-forward commands
- Add kill -0 process check before curl to fail safely if port-forward hangs
- Keep Read in allowed-tools (needed for loading references/ files)

Co-Authored-By: Claude Sonnet 4 <noreply@anthropic.com>
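
The timeout and `kill -0` pattern named in this commit message could look roughly like the sketch below. It is not the code from SKILL.md: the helper name `guarded_probe`, its argument order, and the one-second startup delay are assumptions, and `sleep` stands in for `kubectl port-forward` so the example runs without a cluster. It relies on the `timeout` utility from GNU coreutils.

```shell
# guarded_probe LIMIT_SECONDS PROBE_COMMAND BACKGROUND_COMMAND [ARGS...]
# (hypothetical helper) Runs BACKGROUND_COMMAND under a time limit, checks
# that it is still alive, runs PROBE_COMMAND, then cleans up.
guarded_probe() {
  local limit="$1" probe="$2" pid rc
  shift 2
  # Bound the background process so it cannot outlive the time limit.
  timeout "$limit" "$@" >/dev/null 2>&1 &
  pid=$!
  sleep 1   # give the background process a moment to start or fail
  # Fail safely: skip the probe if the background process already exited.
  if ! kill -0 "$pid" 2>/dev/null; then
    echo "background command is not running; skipping probe" >&2
    return 1
  fi
  "$probe"
  rc=$?
  # Always clean up the background process before returning.
  kill "$pid" 2>/dev/null
  wait "$pid" 2>/dev/null
  return "$rc"
}

# Stand-in demo: 'sleep 3' plays the role of a port-forward, 'true' the probe.
guarded_probe 5 true sleep 3 && echo "probe ran"
```

In the skill's setting, a call such as `guarded_probe 10 check_api kubectl port-forward svc/hyperfleet-api 8000:8000` would bound the port-forward at 10 seconds, where `check_api` (a function wrapping the curl call) and the service name are hypothetical.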

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@hyperfleet-devtools/skills/e2e-debug/SKILL.md`:
- Line 47: The JIRA ticket input handling at line 47 instructs to validate the
format against the regex pattern but contains no executable validation code,
creating a security risk for injection attacks. Add explicit validation code
immediately after the JIRA ticket input is identified and before any CLI command
is constructed: check if the input matches the specified regex pattern
(^[A-Z]+-[0-9]+$), and if validation fails, output an error message to stderr
and exit with a non-zero status code. Only proceed with the jira CLI call to
fetch the ticket and search for pipeline links once validation succeeds,
matching the defensive pattern already applied to keyword sanitization elsewhere
in the document.
ℹ️ Review info
Run ID: 136bd9b0-79f3-4875-80d0-d7d35e861ce8

📥 Commits

Reviewing files that changed from the base of the PR and between c45f2e7 and 60a4c16.

📒 Files selected for processing (1)
  • hyperfleet-devtools/skills/e2e-debug/SKILL.md
Comment thread hyperfleet-devtools/skills/e2e-debug/SKILL.md Outdated
CodeRabbit flagged that line 47 had a validation instruction but no
executable code, unlike the JQL keyword sanitization at line 268.
Added explicit grep -qE validation with error message before jira CLI use.

Co-Authored-By: Claude Sonnet 4 <noreply@anthropic.com>

@coderabbitai coderabbitai Bot left a comment

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
hyperfleet-devtools/skills/e2e-debug/SKILL.md (1)

555-556: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Keep live cluster data in the evidence set.

This guardrail excludes kubectl/gcloud evidence even though Step 5 makes live cluster inspection mandatory when available and the opening instructions already require live cluster corroboration. That contradiction can let the model certify a diagnosis without the only data source that confirms node drains, restarts, or Maestro state.

♻️ Proposed fix
-- **NO GUESSWORK:** Base your root cause ONLY on the intersection of logs, the debugging handbook, and the repository state.
++ **NO GUESSWORK:** Base your root cause ONLY on the intersection of logs, the debugging handbook, the repository state, and live cluster data.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@hyperfleet-devtools/skills/e2e-debug/SKILL.md` around lines 555 - 556, The
guardrails in the "NO HALLUCINATIONS" and "NO GUESSWORK" rules are excluding
kubectl and gcloud evidence from the evidence set, which contradicts the
requirement in Step 5 to mandatorily inspect the live cluster when available and
the opening instructions requiring live cluster corroboration. Modify these two
guardrail statements to explicitly include kubectl/gcloud evidence in the
evidence set when available, ensuring that live cluster data about node drains,
restarts, and Maestro state is always retained and used to confirm diagnoses
rather than allowing certifications without this critical data source.
♻️ Duplicate comments (1)
hyperfleet-devtools/skills/e2e-debug/SKILL.md (1)

49-56: ⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

Fail closed on invalid JIRA keys.

The regex check only logs an error; it still falls through to jira issue view, so malformed ticket IDs remain eligible for CLI use. That leaves the SEC-01 guard unenforced and reopens the injection path.

🔒 Proposed fix
 if ! echo "$JIRA_INPUT" | grep -qE '^[A-Z]+-[0-9]+$'; then
   echo "ERROR: Invalid JIRA ticket format. Expected: HYPERFLEET-1234. Received: $JIRA_INPUT" >&2
-  # Stop — do not pass unvalidated input to jira CLI
+  exit 1
 fi
 jira issue view "$JIRA_INPUT" --plain 2>/dev/null

As per coding guidelines, SEC-01: validate input at system boundaries before passing untrusted data to CLI commands.
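
One way to make the fail-closed behaviour reusable is to wrap the check in a function whose exit status gates the CLI call. The sketch below is an illustration under assumptions, not the code in SKILL.md: the name `validate_jira_key` is hypothetical, and the extra `case` guard is an addition that rejects embedded newlines, which a line-oriented `grep` would otherwise evaluate one line at a time.

```shell
# validate_jira_key INPUT   (hypothetical helper)
# Returns 0 only for keys shaped like HYPERFLEET-1234; returns 1 otherwise.
validate_jira_key() {
  # Reject empty input and any character outside A-Z, 0-9, and '-'.
  # This also rejects newlines, so multi-line input cannot slip through.
  case "$1" in
    ''|*[!A-Z0-9-]*) return 1 ;;
  esac
  # Final shape check: uppercase project key, hyphen, digits.
  printf '%s\n' "$1" | LC_ALL=C grep -qE '^[A-Z]+-[0-9]+$'
}

# Demo: only the first input would be allowed through to the jira CLI.
for key in "HYPERFLEET-1234" "HYPERFLEET-1234; rm -rf /" "hyperfleet-12"; do
  if validate_jira_key "$key"; then
    echo "accepted: $key"
  else
    echo "rejected: $key" >&2
  fi
done
```

A caller would then write `validate_jira_key "$JIRA_INPUT" || exit 1` before invoking the jira CLI, so invalid input never reaches it.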

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@hyperfleet-devtools/skills/e2e-debug/SKILL.md` around lines 49 - 56, The JIRA
input validation in this script checks the format using grep with the regex
pattern but fails to stop execution when validation fails. Currently, after
logging the error message for an invalid JIRA ticket format, the script
continues and passes the malformed JIRA_INPUT to the jira issue view command,
creating a security vulnerability. Add an exit statement or equivalent control
flow termination immediately after the error log within the validation block to
ensure that execution stops and the jira CLI command is never invoked with
invalid input, thus enforcing the SEC-01 validation requirement at the system
boundary.

Source: Coding guidelines

ℹ️ Review info
Run ID: 950c36e4-7691-4160-85b3-d0607c31f38a

📥 Commits

Reviewing files that changed from the base of the PR and between 60a4c16 and c6eeaf1.

📒 Files selected for processing (1)
  • hyperfleet-devtools/skills/e2e-debug/SKILL.md