[kernel-1116] Add CDP Monitor#213

Open
archandatta wants to merge 55 commits into main from
archand/kernel-1116/cdp-foundation

Conversation

@archandatta
Contributor

@archandatta archandatta commented Apr 13, 2026

Introduces the foundational layer of the CDP monitor as a standalone reviewable chunk. No Monitor struct wiring, just the primitives that everything else builds on.

  • types.go: CDP wire format (cdpMessage), all event type constants, internal state structs (networkReqState, targetInfo, CDP param shapes).

  • util.go: Console arg extraction, MIME allow-list (isCapturedMIME), resource type filter (isTextualResource), per-MIME body size caps (bodyCapFor), UTF-8-safe body truncation (truncateBody).

  • computed.go: State machine for the three derived events: network_idle (500ms debounce after all requests finish), layout_settled (1s after page_load with no layout shifts), navigation_settled (fires once all three flags converge). Timer invalidation via navSeq prevents stale AfterFunc callbacks from publishing for a previous navigation.

  • domains.go: isPageLikeTarget predicate (pages and iframes get Page.* / PerformanceTimeline.*; workers don't), bindingName constant, interaction.js embed.

  • interaction.js: Injected script tracking clicks, keydowns, and scroll-settled events via the __kernelEvent CDP binding.
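
As an illustration of the navSeq invalidation described for computed.go, here is a minimal sketch, not the code in this PR. The `settleTracker` type and its `arm`/`fire` methods are hypothetical names; the only idea carried over is that a timer remembers the navigation sequence it was armed under and refuses to publish if that sequence has moved on.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// settleTracker is a hypothetical, stripped-down stand-in for the computed-event
// state machine; only the navSeq invalidation idea is modelled.
type settleTracker struct {
	mu      sync.Mutex
	navSeq  uint64
	publish func(event string, seq uint64)
}

// onNavigation bumps the sequence, which makes every previously armed timer stale.
func (s *settleTracker) onNavigation() {
	s.mu.Lock()
	s.navSeq++
	s.mu.Unlock()
}

// arm schedules event after delay and remembers the sequence it was armed under.
func (s *settleTracker) arm(event string, delay time.Duration) *time.Timer {
	s.mu.Lock()
	seq := s.navSeq
	s.mu.Unlock()
	return time.AfterFunc(delay, func() { s.fire(event, seq) })
}

// fire publishes only when no navigation happened since the timer was armed.
// It reports whether the event was published.
func (s *settleTracker) fire(event string, seq uint64) bool {
	s.mu.Lock()
	stale := seq != s.navSeq
	s.mu.Unlock()
	if stale {
		return false
	}
	s.publish(event, seq)
	return true
}

func main() {
	s := &settleTracker{publish: func(event string, seq uint64) {
		fmt.Printf("published %s for navigation %d\n", event, seq)
	}}
	s.onNavigation()                           // navigation 1
	s.arm("network_idle", 20*time.Millisecond) // stale by the time it fires
	s.onNavigation()                           // navigation 2 invalidates it
	s.arm("network_idle", 20*time.Millisecond) // still current when it fires
	time.Sleep(100 * time.Millisecond)
}
```

Comparing sequence numbers avoids having to stop every outstanding timer on navigation; a late callback simply finds itself out of date and returns.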


Note

High Risk
Introduces a large new event-capture surface (network headers/bodies, console, interactions) plus complex concurrency/reconnect logic; mistakes here can leak sensitive data or destabilize long-lived capture sessions.

Overview
Implements a new server/lib/cdpmonitor package that dials Chrome’s DevTools WebSocket, auto-attaches to targets, translates CDP notifications into events.Events (console/network/page/interaction), and emits computed events like network_idle, layout_settled, and navigation_settled.

Adds robustness features including capped-backoff reconnect on upstream URL changes, per-session state machines, redirect-aware request tracking with TTL sweeps, response-body capture with MIME allow-list + truncation, and rate-limited screenshot capture via ffmpeg.

Wires the API service to construct the monitor with slog and makes ApiService.cdpMonitor an interface for easier stubbing; expands tests with extensive fixtures and lifecycle/handler coverage for the new monitor.

Reviewed by Cursor Bugbot for commit 0139432. Bugbot is set up for automated code reviews on this repo. Configure here.

This binary is tracked on main and was incidentally deleted earlier on
this branch. Restoring it keeps the 13.4MB binary out of this PR's diff.
Removing the tracked binary from main should be done in a separate PR.
Comment thread server/lib/cdpmonitor/screenshot.go
Comment thread server/lib/cdpmonitor/interaction.js
_ = m.injectScript(ctx, p.SessionID)
}
})
}

Injected script never runs on already-loaded pages

Medium Severity

handleAttachedToTarget calls injectScript, which only uses Page.addScriptToEvaluateOnNewDocument. This registers interaction.js for future navigations but never evaluates it on the current document. Pages already loaded when the monitor attaches (via attachExistingTargets or after reconnect) won't have click, keydown, or scroll-settled tracking until their next navigation. A Runtime.evaluate call with the same script source is needed alongside the addScriptToEvaluateOnNewDocument registration to cover the current page.


Reviewed by Cursor Bugbot for commit bf4b04c. Configure here.

Comment thread server/lib/cdpmonitor/interaction.js
Comment thread server/lib/cdpmonitor/monitor.go
Comment thread server/lib/cdpmonitor/interaction.js
Base automatically changed from archand/kernel-1116/cdp-pipeline to main April 22, 2026 17:41
@archandatta archandatta dismissed Sayan-’s stale review April 22, 2026 17:41

The base branch was changed.

@archandatta archandatta requested a review from Sayan- April 22, 2026 17:43
_, err := m.send(ctx, "Page.addScriptToEvaluateOnNewDocument", map[string]any{
"source": injectedJS,
}, sessionID)
return err

Interaction script not injected into current document

Medium Severity

injectScript only calls Page.addScriptToEvaluateOnNewDocument, which registers the interaction-tracking JS for future navigations. The already-loaded document in an attached target never receives the script. When the monitor attaches to existing pages (via attachExistingTargets at startup or after reconnect), clicks, keydowns, and scroll events on those pages won't be captured until the user navigates away. A companion Runtime.evaluate call is needed to inject into the current document.


Reviewed by Cursor Bugbot for commit 8e94162. Configure here.

key: e.key,
selector: sel(t), tag: t.tagName || ''
}));
}, true);

Sensitive input detection bypassed by shadow DOM retargeting

Medium Severity

The keydown handler uses e.target to check isSensitiveInput, but e.target is retargeted across shadow DOM boundaries. When an <input type="password"> lives inside a web component's shadow DOM (common with Material UI, Lit, Shoelace, etc.), e.target at the document level resolves to the shadow host custom element, not the inner password input. Since the shadow host typically isn't an INPUT/TEXTAREA, isEditable returns false and isSensitiveInput returns false, allowing the actual e.key character to be captured. Using e.composedPath()[0] instead of e.target would resolve this for open shadow roots, as it returns the real originating element across those boundaries (elements inside closed shadow roots remain hidden from composedPath()).


Reviewed by Cursor Bugbot for commit 5465e59. Configure here.

Comment thread server/lib/cdpmonitor/monitor.go Outdated
Comment thread server/lib/cdpmonitor/monitor.go Outdated
Comment on lines +28 to +34
//
// Lock ordering (outer → inner):
//
// restartMu → lifeMu → pendReqMu → computed.mu → pendMu → sessionsMu
//
// Never acquire a lock that appears earlier in this order while holding a
// later one, to prevent deadlock.
Contributor

this is missing a lot of introductory context about what these locks control / what the Monitor struct is responsible for. This file is the entrypoint to the package and reading this I am very lost

maybe a lib/cdpmonitor/README.md would be helpful

Contributor Author

lmk if this is clear enough, or if there are other details that would be worth adding! fd4d4d3

Comment thread server/lib/cdpmonitor/monitor.go Outdated
Comment thread server/lib/cdpmonitor/types.go Outdated
@archandatta archandatta requested a review from rgarcia April 23, 2026 19:35
@rgarcia
Contributor

rgarcia commented Apr 27, 2026

I did a manual pass against this branch with the review harness, doing a single navigation to kernel.sh. Event log from that run: https://gist.github.com/rgarcia/cc2c9bb92b536d7f6659985868a3c56b

A few observations/suggestions from that run:

The event stream needs richer CDP identity metadata before this is reviewable as a browser logging foundation. The monitor already receives enough data to tie network events to a target/frame/navigation, but the published event drops most of it. Network.requestWillBeSent includes requestId, loaderId, documentURL, and frameId, and the monitor also has sessionID -> targetInfo with targetId/targetType. Today we only publish cdp_session_id, method/url/headers/etc. This makes redirects look like duplicate requests and makes it hard to group requests by tab/frame/navigation.

Could we include at least:

  • request_id
  • loader_id
  • frame_id
  • document_url
  • target_id
  • target_type
  • redirect metadata, or a clear marker when requestWillBeSent represents a redirect continuation

Relatedly, the computed events (network_idle, layout_settled, navigation_settled) currently publish no session/frame/target metadata, and the state machine appears global to the monitor. That makes these events impossible to attribute to a tab from the event stream, and multiple tabs/navigations could interfere with each other. I’d expect computed state to be scoped per page target/session, or at least to carry the current top-level navigation context initialized from Page.frameNavigated (session_id, target_id, frame_id, loader_id, URL, nav sequence) and publish that context on all synthetic events.

Without this, consumers can’t reliably answer “which tab/page/navigation produced this event?”, which seems core to the usefulness of the CDP monitor.
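
To make the per-session scoping concrete, here is a rough sketch under assumptions: `navRegistry`, `navContext`, and `syntheticEvent` are hypothetical names, and the caller is assumed to invoke `onFrameNavigated` only for top-level Page.frameNavigated events. It shows one navigation context per CDP session, stamped onto every synthetic event.

```go
package main

import (
	"fmt"
	"sync"
)

// navContext is a hypothetical record of the current top-level navigation
// for one CDP session.
type navContext struct {
	SessionID string
	TargetID  string
	FrameID   string
	LoaderID  string
	URL       string
	NavSeq    uint64
}

// navRegistry keeps one navContext per session so tabs cannot interfere.
type navRegistry struct {
	mu        sync.Mutex
	bySession map[string]navContext
}

func newNavRegistry() *navRegistry {
	return &navRegistry{bySession: make(map[string]navContext)}
}

// onFrameNavigated records a new top-level navigation for the session.
// NavSeq increases per session, not globally.
func (r *navRegistry) onFrameNavigated(sessionID, targetID, frameID, loaderID, url string) navContext {
	r.mu.Lock()
	defer r.mu.Unlock()
	next := navContext{
		SessionID: sessionID, TargetID: targetID, FrameID: frameID,
		LoaderID: loaderID, URL: url,
		NavSeq: r.bySession[sessionID].NavSeq + 1,
	}
	r.bySession[sessionID] = next
	return next
}

// syntheticEvent stamps the session's navigation context onto a computed
// event. It returns false when the session has not navigated yet.
func (r *navRegistry) syntheticEvent(eventType, sessionID string) (map[string]any, bool) {
	r.mu.Lock()
	ctx, ok := r.bySession[sessionID]
	r.mu.Unlock()
	if !ok {
		return nil, false
	}
	return map[string]any{
		"type":       eventType,
		"session_id": ctx.SessionID,
		"target_id":  ctx.TargetID,
		"frame_id":   ctx.FrameID,
		"loader_id":  ctx.LoaderID,
		"url":        ctx.URL,
		"nav_seq":    ctx.NavSeq,
	}, true
}

func main() {
	r := newNavRegistry()
	r.onFrameNavigated("s1", "t1", "f1", "l1", "https://news.ycombinator.com/")
	r.onFrameNavigated("s2", "t2", "f2", "l2", "https://kernel.sh/")
	ev, _ := r.syntheticEvent("network_idle", "s1")
	fmt.Println(ev)
}
```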

One note from the log: the first three document network_request events are a redirect chain (http://kernel.sh/ -> https://kernel.sh/ -> https://www.kernel.sh/), not exact duplicates. Including request_id and redirect metadata would make that much clearer in the event stream.
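
A sketch of what that projection might look like, assuming a hypothetical `networkRequestData` payload and `project` helper. The input field names (requestId, loaderId, frameId, documentURL, redirectResponse) are the CDP Network.requestWillBeSent params; the output names and the 301 status in the sample are illustrative, not taken from the run.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// requestWillBeSent mirrors a subset of the CDP Network.requestWillBeSent params.
type requestWillBeSent struct {
	RequestID   string `json:"requestId"`
	LoaderID    string `json:"loaderId"`
	FrameID     string `json:"frameId"`
	DocumentURL string `json:"documentURL"`
	Request     struct {
		URL    string `json:"url"`
		Method string `json:"method"`
	} `json:"request"`
	// CDP sets redirectResponse only when this request continues a redirect.
	RedirectResponse *struct {
		URL    string `json:"url"`
		Status int    `json:"status"`
	} `json:"redirectResponse"`
}

// networkRequestData is a hypothetical published payload.
type networkRequestData struct {
	RequestID      string `json:"request_id"`
	LoaderID       string `json:"loader_id"`
	FrameID        string `json:"frame_id"`
	DocumentURL    string `json:"document_url"`
	TargetID       string `json:"target_id"`
	TargetType     string `json:"target_type"`
	Method         string `json:"method"`
	URL            string `json:"url"`
	IsRedirect     bool   `json:"is_redirect"`
	RedirectedFrom string `json:"redirected_from,omitempty"`
	RedirectStatus int    `json:"redirect_status,omitempty"`
}

// project decodes raw CDP params and adds target identity plus redirect markers.
func project(raw []byte, targetID, targetType string) (networkRequestData, error) {
	var p requestWillBeSent
	if err := json.Unmarshal(raw, &p); err != nil {
		return networkRequestData{}, fmt.Errorf("decode requestWillBeSent: %w", err)
	}
	d := networkRequestData{
		RequestID: p.RequestID, LoaderID: p.LoaderID, FrameID: p.FrameID,
		DocumentURL: p.DocumentURL, TargetID: targetID, TargetType: targetType,
		Method: p.Request.Method, URL: p.Request.URL,
	}
	if p.RedirectResponse != nil {
		d.IsRedirect = true
		d.RedirectedFrom = p.RedirectResponse.URL
		d.RedirectStatus = p.RedirectResponse.Status
	}
	return d, nil
}

func main() {
	// Illustrative params; the 301 status is made up for the example.
	raw := []byte(`{"requestId":"R1","loaderId":"L1","frameId":"F1",
		"documentURL":"https://kernel.sh/",
		"request":{"url":"https://kernel.sh/","method":"GET"},
		"redirectResponse":{"url":"http://kernel.sh/","status":301}}`)
	d, err := project(raw, "T1", "page")
	if err != nil {
		fmt.Println(err)
		return
	}
	out, _ := json.Marshal(d)
	fmt.Println(string(out))
}
```

Because CDP reuses the same requestId across a redirect chain, publishing request_id together with a redirect marker lets a consumer fold the three document requests into one chain.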

@archandatta
Contributor Author

@rgarcia feedback addressed! Events like layout_settled and navigation_settled are connected to their navigation via loader_id:

layout_settled.data.loader_id
      == network_request.data.loader_id (all requests from that navigation)
      == network_response.data.loader_id

Comment thread server/lib/cdpmonitor/handlers.go
@kernel kernel deleted a comment from cursor Bot Apr 29, 2026
@archandatta archandatta force-pushed the archand/kernel-1116/cdp-foundation branch from 16d37eb to 83a164b Compare April 29, 2026 14:20

@cursor cursor Bot left a comment


Cursor Bugbot has reviewed your changes and found 1 potential issue.

There are 5 total unresolved issues (including 4 from previous reviews).


Reviewed by Cursor Bugbot for commit 83a164b. Configure here.

for _, ev := range evs {
s.publish(ev)
}
}

Missing dead check in onDOMContentLoaded

Low Severity

onDOMContentLoaded unconditionally sets s.navDOMLoaded = true without checking s.dead first. Every other state-mutating method (onRequest, onLoadingFinished, onPageLoad, onLayoutShift) guards with if s.dead { return } before modifying state. While pendingNavigationSettled independently checks dead preventing event emission, this inconsistency mutates a stopped state machine, which could mask bugs if future code reads navDOMLoaded without also checking dead.
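
For reference, the guard in question could look roughly like this. It is a sketch with a hypothetical `computedState` type that keeps only the two fields named above; the real struct in the PR has more state.

```go
package main

import (
	"fmt"
	"sync"
)

// computedState is a hypothetical reduction of the per-session state machine.
type computedState struct {
	mu           sync.Mutex
	dead         bool
	navDOMLoaded bool
}

// stop marks the state machine as dead; later callbacks become no-ops.
func (s *computedState) stop() {
	s.mu.Lock()
	s.dead = true
	s.mu.Unlock()
}

// onDOMContentLoaded mirrors the guard used by the other mutating methods.
func (s *computedState) onDOMContentLoaded() {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.dead {
		return // do not mutate a stopped state machine
	}
	s.navDOMLoaded = true
}

// domLoaded reads the flag under the lock.
func (s *computedState) domLoaded() bool {
	s.mu.Lock()
	defer s.mu.Unlock()
	return s.navDOMLoaded
}

func main() {
	s := &computedState{}
	s.stop()
	s.onDOMContentLoaded()
	fmt.Println("navDOMLoaded after stop:", s.domLoaded())
}
```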


Reviewed by Cursor Bugbot for commit 83a164b. Configure here.

@rgarcia
Contributor

rgarcia commented Apr 29, 2026

I ran another manual pass with the review harness, this time navigating to Hacker News, then opening a new tab and navigating to kernel.sh. Event data: https://gist.github.com/rgarcia/0a60c17ece22e9394bdd9118475a0f5c

Observations:

  1. there does not seem to be a consistent "raw CDP passthrough" envelope inside event.data.
    data.event shows up for layout_shift because PerformanceTimeline.timelineEventAdded CDP params are shaped as { event: ... }, not because Kernel has a convention that raw CDP lives under event. Other events project/rename fields directly into data, e.g. network_request.data.request_id, loader_id, frame_id, etc.
    That makes casing hard to reason about and looks sloppy / inconsistent. Right now consumers see a mix of:

    • raw CDP camelCase fields like frameId, nodeId, layoutShiftDetails
    • Kernel-projected snake_case fields like frame_id, resource_type, request_id
    • raw-ish CDP timestamps in some page events
    • synthetic Kernel fields in computed events

    I think we should probably choose one clear convention of always projecting event data into a Kernel-owned snake_case schema to keep the event data clean and consistent

  2. Relatedly, I think server/lib/cdpmonitor/README.md should grow a consumer-facing schema section. Right now it has a useful taxonomy/internals overview, but not enough for someone consuming the stream to understand fields like request_id, loader_id, frame_id, target_id, cdp_session_id, initiator_type, or the timestamp fields. A short CDP data model primer would help a lot

  3. On timestamps: top-level event.ts is Unix microseconds, but dom_content_loaded.data.timestamp and page_load.data.timestamp are CDP monotonic timestamps in seconds. That is useful CDP data, but it is very easy to confuse with the event timestamp. Maybe rename/project it as something like cdp_timestamp?

  4. dom_content_loaded and page_load still only have the CDP timestamp in data. They do get source metadata for the CDP session/target, but unlike navigation, network_idle, layout_settled, and navigation_settled, they do not carry loader_id, frame_id, URL, or nav_seq in the event payload. Since the monitor now tracks nav context, can we stamp that same context onto these page lifecycle events too?

  5. screenshot still emits as local_process with only the PNG payload. Since screenshots are triggered by Page.loadEventFired / exceptions, it would be very useful for them to include context like trigger_event, session_id, target_id, frame_id, loader_id, URL, and nav_seq. Otherwise screenshots are hard to correlate with the page/navigation that caused them.

  6. No event was fired when I opened a new tab. Not sure what CDP exposes here but I think we should produce something here. Subsequent events were fired once I navigated, which is good.

Overall the new loader/frame/request/target additions are a big improvement. I think the remaining work is mostly making the schema consistent and documented enough for downstream consumers.

@rgarcia
Contributor

rgarcia commented Apr 29, 2026

Couple of other things:

  1. These should likely carry current navigation context (session_id, target_id, target_type, frame_id, loader_id, URL, nav_seq) in data, not just metadata:
    • interaction_click
    • interaction_key
    • scroll_settled
    • console_log
    • console_error
    • dom_content_loaded
    • page_load
    • layout_shift
    • screenshot
    Partially addressed:
    • network_request, network_response, network_loading_failed: now mostly good.
    • network_idle, layout_settled, navigation_settled: now good, they carry nav context.
    • navigation: has frame_id, loader_id, URL, but not target_id / target_type in data; those are only in metadata.

  2. Event naming is inconsistent. Current event types:
    • console_log, console_error
    • network_request, network_response, network_loading_failed
    • navigation, dom_content_loaded, page_load, layout_shift
    • network_idle, layout_settled, navigation_settled
    • interaction_click, interaction_key, scroll_settled
    • screenshot, monitor_*
    The category already says network, page, interaction, etc., so prefixes like network_ and interaction_ are somewhat redundant. But if event type is intended to be globally unique without category, then prefixes make sense. The issue is inconsistency:
    • interaction_click / interaction_key have interaction_, but scroll_settled does not.
    • network_idle has network_, but layout_settled and navigation_settled do not have page_.
    • page_load has page_, but navigation, layout_shift, layout_settled, navigation_settled do not.
    I think we should choose a convention here. Since these are logs/events that people will grep/query outside typed clients, I’d lean toward globally namespaced names, e.g. network_request, page_navigation, interaction_click, interaction_scroll_settled
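
One way to enforce such a convention mechanically is a small helper that derives the published name from category and local name. This is a hypothetical sketch (`qualifiedName` does not exist in the PR); the resulting names match the examples given above.

```go
package main

import (
	"fmt"
	"strings"
)

// qualifiedName returns a globally namespaced event type. Names that already
// carry the category prefix are returned unchanged.
func qualifiedName(category, name string) string {
	prefix := category + "_"
	if strings.HasPrefix(name, prefix) {
		return name
	}
	return prefix + name
}

func main() {
	fmt.Println(qualifiedName("network", "network_request"))    // already prefixed
	fmt.Println(qualifiedName("page", "navigation"))            // gains page_
	fmt.Println(qualifiedName("interaction", "scroll_settled")) // gains interaction_
}
```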

4 participants