
OCPBUGS-81648: Fix orphaned shell processes in pod terminal on WebSocket disconnect#16299

Open
Leo6Leo wants to merge 1 commit into openshift:release-4.18 from Leo6Leo:fix-orphaned-terminal-processes

Conversation

@Leo6Leo
Contributor

@Leo6Leo Leo6Leo commented Apr 17, 2026

Summary

  • When a pod terminal WebSocket connection drops, the shell process inside the container was never terminated, leading to accumulation of orphaned sh processes
  • Root cause: gorilla/websocket's Close() only drops the TCP connection without sending a WebSocket close frame, and no exit command was sent to the shell on disconnect
  • Fix: send exit through the exec STDIN channel (base64.channel.k8s.io protocol) and a proper WebSocket close frame to the Kubernetes API server when the proxy connection closes

Details

The cleanup in componentWillUnmount (frontend) only runs when the user navigates away from the terminal page. When the WebSocket drops while the page stays loaded (e.g., load balancer idle timeout, network interruption), the shell process inside the container survives indefinitely. Each reconnection spawns a new shell, causing orphaned processes to accumulate. This is especially impactful for high-usage pods like DB2U.

The fix adds cleanup at the proxy layer (pkg/proxy/proxy.go), which handles all disconnect scenarios regardless of how the frontend behaves.

Test plan

  • Verified on a live OpenShift 4.18 cluster by running a local console bridge with the fix
  • Opened pod terminal, confirmed shell process created (ps aux via oc exec)
  • Forced WebSocket disconnect via browser DevTools
  • Confirmed old shell process was terminated (no longer in ps aux)
  • Confirmed reconnected terminal created a fresh shell process
  • All existing proxy unit tests pass (go test ./pkg/proxy/)
  • CLI access (oc rsh / oc exec) unaffected (fix only targets /exec WebSocket path in the proxy)

Summary by CodeRabbit

  • Bug Fixes
  • Pod terminal websocket exec sessions now terminate cleanly on disconnect: the proxy signals the shell to exit and performs a graceful websocket close handshake, making session cleanup more reliable.

@openshift-ci-robot openshift-ci-robot added jira/severity-moderate Referenced Jira bug's severity is moderate for the branch this PR is targeting. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. jira/invalid-bug Indicates that a referenced Jira bug is invalid for the branch this PR is targeting. labels Apr 17, 2026
@openshift-ci-robot
Contributor

@Leo6Leo: This pull request references Jira Issue OCPBUGS-81648, which is invalid:

  • expected the bug to target the "4.18.z" version, but no target version was set
  • release note text must be set and not match the template OR release note type must be set to "Release Note Not Required". For more information you can reference the OpenShift Bug Process.
  • expected Jira Issue OCPBUGS-81648 to depend on a bug targeting a version in 4.19.0, 4.19.z and in one of the following states: VERIFIED, RELEASE PENDING, CLOSED (ERRATA), CLOSED (CURRENT RELEASE), CLOSED (DONE), CLOSED (DONE-ERRATA), but no dependents were found

Comment /jira refresh to re-evaluate validity if changes to the Jira bug are made, or edit the title of this pull request to link to a different bug.

The bug has been updated to refer to the pull request using the external bug tracker.

Details

In response to the PR description above:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@coderabbitai
Contributor

coderabbitai bot commented Apr 17, 2026

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: 5c604edc-e6b0-4b99-b5af-9724c22549ca

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.


Walkthrough

The Proxy.ServeHTTP method now detects websocket requests whose URL paths end in /exec and replaces the simple connection closure with a more thorough cleanup sequence: sending an exec termination command, attempting a graceful websocket close handshake, and then closing the backend connection.

Changes

Cohort / File(s): Websocket Exec Session Cleanup — pkg/proxy/proxy.go
Summary: Modified the ServeHTTP defer logic to conditionally send an exec termination command for websocket connections ending with /exec, perform a graceful websocket close handshake, and properly close the backend connection. Added a new sendExecExitCommand helper to base64-encode and transmit the shell exit signal to exec sessions.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

🚥 Pre-merge checks | ✅ 8 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

  • Docstring Coverage — ⚠️ Warning: docstring coverage is 50.00%, below the required 80.00% threshold. Resolution: write docstrings for the functions missing them.
  • Test Structure And Quality — ❓ Inconclusive: the PR modifies pkg/proxy/proxy.go without adding or modifying any Ginkgo test files, so the Ginkgo test quality check is not applicable. Resolution: verify whether the PR includes Ginkgo test files; if not, this check does not apply.
✅ Passed checks (8 passed)
  • Description Check — ✅ Passed: check skipped; CodeRabbit's high-level summary is enabled.
  • Title Check — ✅ Passed: the title directly addresses the main change (fixing orphaned shell processes in pod terminals on WebSocket disconnect), matching the primary modification to cleanup handling in Proxy.ServeHTTP.
  • Stable And Deterministic Test Names — ✅ Passed: no Ginkgo test suites present; tests use Go's standard testing package with static TestXxx names.
  • MicroShift Test Compatibility — ✅ Passed: this PR does not introduce new Ginkgo e2e tests, so the check is not applicable.
  • Single Node OpenShift (SNO) Test Compatibility — ✅ Passed: no new Ginkgo e2e tests; changes are limited to the backend proxy implementation in pkg/proxy/proxy.go.
  • Topology-Aware Scheduling Compatibility — ✅ Passed: the PR modifies only pkg/proxy/proxy.go with WebSocket cleanup changes; no deployment manifests, operator code, controllers, or scheduling constraints are introduced.
  • OTE Binary Stdout Contract — ✅ Passed: changes are contained within request-level functions in the ServeHTTP handler and its defer block, not process-level code subject to the contract.
  • IPv6 And Disconnected Network Test Compatibility — ✅ Passed: no new Ginkgo e2e tests; the changes are exclusively backend proxy cleanup logic for exec sessions to prevent orphaned shell processes.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Comment @coderabbitai help to get the list of available commands and usage tips.

@openshift-ci openshift-ci bot requested review from TheRealJon and jhadvig April 17, 2026 16:08
@openshift-ci openshift-ci bot added the component/backend Related to backend label Apr 17, 2026
@openshift-ci
Copy link
Copy Markdown
Contributor

openshift-ci bot commented Apr 17, 2026

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: Leo6Leo
Once this PR has been reviewed and has the lgtm label, please assign therealjon for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@Leo6Leo
Contributor Author

Leo6Leo commented Apr 17, 2026

@coderabbitai review

@coderabbitai
Contributor

coderabbitai bot commented Apr 17, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@pkg/proxy/proxy.go`:
- Around line 234-241: The defer cleanup in the connection handler writes to the
shared websocket backend (calling sendExecExitCommand(backend) and
backend.WriteControl/Close) without synchronizing with other writers (e.g., the
copyMsgs goroutine that uses writeMutex), risking concurrent writes; introduce
and use a dedicated backendWriteMutex (or reuse writeMutex consistently) to
protect all backend write calls: acquire backendWriteMutex before calling
sendExecExitCommand(backend), before backend.WriteControl(...), and before
backend.Close() in the defer, and ensure copyMsgs and any other code paths also
lock the same backendWriteMutex when performing
backend.WriteMessage/WriteControl to guarantee the gorilla/websocket one-writer
rule is respected.
- Around line 299-307: sendExecExitCommand currently always base64-encodes the
"exit\r" frame which only works for the base64.channel.k8s.io subprotocol;
change it to inspect backend.Subprotocol() inside sendExecExitCommand and
branch: if the negotiated subprotocol is base64.channel.k8s.io (or starts with
"base64.channel") keep the existing base64 text framing and send as
websocket.TextMessage, otherwise for binary channel subprotocols like
v4.channel.k8s.io or v5.channel.k8s.io send a websocket.BinaryMessage with a
leading channel byte 0 followed by the raw bytes of "exit\r"; add a debug log if
the subprotocol is unrecognized before sending.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: ecf7fe8e-4be6-42e6-85e7-14602cc72b4d

📥 Commits

Reviewing files that changed from the base of the PR and between dda5808 and 12ece3f.

📒 Files selected for processing (1)
  • pkg/proxy/proxy.go

Comment thread pkg/proxy/proxy.go
Comment on lines +234 to +241
defer func() {
	if isExec {
		sendExecExitCommand(backend)
	}
	closeMsg := websocket.FormatCloseMessage(websocket.CloseNormalClosure, "")
	_ = backend.WriteControl(websocket.CloseMessage, closeMsg, time.Now().Add(websocketTimeout))
	backend.Close()
}()
Contributor


⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🌐 Web query:

In github.com/gorilla/websocket, are concurrent calls to WriteMessage from multiple goroutines allowed on the same Conn, and is WriteControl the only write API that is safe concurrently with other methods?

💡 Result:

No, concurrent calls to WriteMessage from multiple goroutines on the same Conn are not allowed. Yes, WriteControl is safe concurrently with other methods (along with Close). The official documentation states: "Applications are responsible for ensuring that no more than one goroutine calls the write methods (NextWriter, SetWriteDeadline, WriteMessage, WriteJSON, EnableWriteCompression, SetCompressionLevel) concurrently... The Close and WriteControl methods can be called concurrently with all other methods." This is confirmed in the source code where WriteMessage acquires a write mutex (c.mu), but concurrent access triggers a panic "concurrent write to websocket connection". WriteControl does not acquire this mutex and can interleave with data messages per the WebSocket protocol.


🏁 Script executed:

# First, examine the sendExecExitCommand function around line 302-305
sed -n '300,310p' pkg/proxy/proxy.go

Repository: openshift/console

Length of output: 579


🏁 Script executed:

# Next, find and examine the copyMsgs function
rg "func copyMsgs" -A 10 pkg/proxy/proxy.go

Repository: openshift/console

Length of output: 324


🏁 Script executed:

# Also check the context around line 233-241 to understand the goroutine setup
sed -n '225,250p' pkg/proxy/proxy.go

Repository: openshift/console

Length of output: 904


🏁 Script executed:

# Check the full goroutine setup and error handling
sed -n '240,270p' pkg/proxy/proxy.go

Repository: openshift/console

Length of output: 968


🏁 Script executed:

# Also check if there's a writeMutex declaration and how it's passed to the goroutine
sed -n '220,245p' pkg/proxy/proxy.go

Repository: openshift/console

Length of output: 854


🏁 Script executed:

# Check the goroutine launches - where copyMsgs is called with which mutex
sed -n '265,290p' pkg/proxy/proxy.go

Repository: openshift/console

Length of output: 881


🏁 Script executed:

# Check if there's a separate mutex for backend or how the goroutines sync
sed -n '285,320p' pkg/proxy/proxy.go

Repository: openshift/console

Length of output: 1284


Prevent concurrent backend data writes during deferred exec cleanup.

The sendExecExitCommand(backend) call in the defer block writes to the backend connection without synchronization, while the second copyMsgs goroutine (line 280) holds writeMutex when writing to the same backend connection. When one goroutine errors and returns, the defer executes while the other goroutine may still be writing to backend, causing concurrent WriteMessage calls that violate gorilla/websocket's one-writer contract and can trigger panics or drop the exit frame.

The suggested fix using a dedicated backendWriteMutex is appropriate.

Suggested fix
+	var backendWriteMutex sync.Mutex
 	defer func() {
 		if isExec {
+			backendWriteMutex.Lock()
 			sendExecExitCommand(backend)
+			backendWriteMutex.Unlock()
 		}
 		closeMsg := websocket.FormatCloseMessage(websocket.CloseNormalClosure, "")
 		_ = backend.WriteControl(websocket.CloseMessage, closeMsg, time.Now().Add(websocketTimeout))
 		backend.Close()
 	}()
 	...
-	go func() { errc <- copyMsgs(&writeMutex, backend, frontend) }()
+	go func() { errc <- copyMsgs(&backendWriteMutex, backend, frontend) }()

Comment thread pkg/proxy/proxy.go
Comment on lines +299 to +307
// sendExecExitCommand sends "exit\r" to the exec session's STDIN channel
// using the base64.channel.k8s.io subprotocol to terminate the shell process
// and prevent orphaned processes when the frontend WebSocket disconnects.
func sendExecExitCommand(backend *websocket.Conn) {
	exitCmd := base64.StdEncoding.EncodeToString([]byte("exit\r"))
	if err := backend.WriteMessage(websocket.TextMessage, []byte("0"+exitCmd)); err != nil {
		klog.V(4).Infof("Failed to send exit command to exec session: %v", err)
	}
}
Contributor


⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🌐 Web query:

For Kubernetes exec over WebSocket, what are the stdin frame formats for base64.channel.k8s.io vs v4/v5.channel.k8s.io, and does writing "0"+base64(text) only apply to base64.channel.k8s.io?

💡 Result:

For Kubernetes exec over WebSocket, the stdin frame formats differ between base64.channel.k8s.io and the v4/v5.channel.k8s.io protocols as follows: base64.channel.k8s.io: Each message is prefixed with a single ASCII character channel ID ('0' for stdin) followed by the base64-encoded data. Server decodes the base64 data for stdin; client must base64-encode stdin input before sending. Example: to send "foo\n" on stdin, send '0' + base64("foo\n"). v4.channel.k8s.io and v5.channel.k8s.io: These are binary protocols. Each message is prefixed with a single byte channel ID (0x00 for stdin) followed by the raw binary data (no base64 encoding/decoding). Client sends raw bytes prefixed with 0x00 for stdin. Writing "0"+base64(text) (where "0" is the ASCII character) applies only to base64.channel.k8s.io. For v4/v5.channel.k8s.io, use binary byte 0x00 + raw text bytes. Note: v4 and v5 primarily differ in stdin close signaling (v5 supports proper CLOSE via channel 255), but stdin data framing is identical (binary). Base64 variants like v4.base64.channel.k8s.io may exist but are less common; they would follow base64.channel.k8s.io framing with '0' prefix.


🏁 Script executed:

# Search for sendExecExitCommand usage and definition
rg -n "sendExecExitCommand" pkg/proxy/ -A 5 -B 5

Repository: openshift/console

Length of output: 1488


🏁 Script executed:

# Search for WebSocket subprotocol negotiation in the codebase
rg -n "Subprotocol\|base64\.channel\.k8s\.io\|channel\.k8s\.io" pkg/proxy/ -B 3 -A 3

Repository: openshift/console

Length of output: 43


🏁 Script executed:

# Look for WebSocket connection setup and negotiation
rg -n "websocket\.(Dialer|Upgrader|Conn)" pkg/proxy/ -B 2 -A 5

Repository: openshift/console

Length of output: 3112


🏁 Script executed:

# Check for any exec-related constants or protocol definitions
rg -n "exec.*protocol\|protocol.*exec" pkg/ -i

Repository: openshift/console

Length of output: 43


🏁 Script executed:

# Find the function signature and where subProtocol is passed
rg -n "func.*websocket\|ProxyHandler.*subProtocol" pkg/proxy/proxy.go -B 5 -A 10

Repository: openshift/console

Length of output: 43


🏁 Script executed:

# Check where sendExecExitCommand is used and trace back the subProtocol
rg -n "subProtocol" pkg/proxy/ -B 2 -A 2

Repository: openshift/console

Length of output: 1370


🏁 Script executed:

# Look for all references to "base64" in the proxy code
rg -n "base64" pkg/proxy/

Repository: openshift/console

Length of output: 461


🏁 Script executed:

# Check what websocket library is being used and if Conn has Subprotocol method
rg -n "import.*websocket" pkg/proxy/

Repository: openshift/console

Length of output: 43


🏁 Script executed:

# Look for any existing uses of .Subprotocol() in the codebase
rg -n "\.Subprotocol\(\)" pkg/

Repository: openshift/console

Length of output: 43


🏁 Script executed:

# Check proxy.go imports section
head -n 20 pkg/proxy/proxy.go

Repository: openshift/console

Length of output: 341


🏁 Script executed:

# Search for websocket import more broadly
rg "websocket" pkg/proxy/proxy.go | head -20

Repository: openshift/console

Length of output: 1144


🏁 Script executed:

# Check gorilla/websocket documentation or code - does Conn have Subprotocol method?
# Let's look at the test file to see if Subprotocol is used anywhere
cat -n pkg/proxy/proxy_test.go | head -50

Repository: openshift/console

Length of output: 1471


🏁 Script executed:

# Search for any existing pattern of reading negotiated subprotocol
rg -n "Subprotocol" pkg/proxy/proxy_test.go -B 3 -A 3

Repository: openshift/console

Length of output: 539


🏁 Script executed:

# Verify gorilla/websocket Conn.Subprotocol() method exists in documentation or go.mod
cat pkg/go.mod 2>/dev/null | grep gorilla || cat go.mod 2>/dev/null | grep gorilla || rg "gorilla/websocket" -g "*.mod"

Repository: openshift/console

Length of output: 216


🏁 Script executed:

# Check if there's any way to access subprotocol from the connection in gorilla/websocket
# Look at the upgrader usage to see what information is available after upgrade
rg -n "Upgrade\|SubprotocolError" pkg/proxy/proxy.go -B 3 -A 5

Repository: openshift/console

Length of output: 43


🏁 Script executed:

# Search gorilla/websocket v1.5.0 API - check if websocket.Conn has Subprotocol() method
python3 << 'EOF'
import subprocess
result = subprocess.run(["curl", "-s", "https://pkg.go.dev/github.com/gorilla/[email protected]?tab=doc"], 
                       capture_output=True, text=True, timeout=10)
if "Subprotocol" in result.stdout:
    print("Subprotocol method found in gorilla/websocket v1.5.0")
    # Extract relevant lines
    lines = result.stdout.split('\n')
    for i, line in enumerate(lines):
        if 'Subprotocol' in line:
            print('\n'.join(lines[max(0, i-2):min(len(lines), i+5)]))
else:
    print("Could not verify Subprotocol method - checking alternative approaches")
EOF

Repository: openshift/console

Length of output: 130


🏁 Script executed:

# Check the upgraded websocket connection - how is subprotocol stored/accessed?
# Look at the Upgrader.Upgrade return values
rg -n "Upgrade.*frontend\|conn.*Upgrade" pkg/proxy/proxy.go -A 3

Repository: openshift/console

Length of output: 43


🏁 Script executed:

# Check the Upgrader.Upgrade signature and return value
rg -n "frontend.*:=.*upgrader\.Upgrade\|websocket\.Upgrader.*Upgrade" pkg/proxy/proxy.go -A 10 -B 2

Repository: openshift/console

Length of output: 43


🏁 Script executed:

# Let's verify gorilla/websocket API by checking pkg.go.dev directly
# gorilla/websocket Conn should have SubprotocolError or access to negotiated subprotocol
python3 << 'EOF'
# Check gorilla/websocket v1.5.0 documentation
import subprocess
try:
    result = subprocess.run(
        ["curl", "-s", "https://raw.githubusercontent.com/gorilla/websocket/v1.5.0/conn.go"],
        capture_output=True, text=True, timeout=10
    )
    # Look for Subprotocol method
    if "func (c *Conn) Subprotocol" in result.stdout:
        print("✓ Subprotocol() method exists in gorilla/websocket v1.5.0")
        # Extract the method
        lines = result.stdout.split('\n')
        for i, line in enumerate(lines):
            if 'func (c *Conn) Subprotocol' in line:
                print('\n'.join(lines[i:min(len(lines), i+5)]))
                break
    else:
        print("Checking Upgrader return value instead...")
        if "SubprotocolError" in result.stdout or "subprotocol" in result.stdout.lower():
            print("Found subprotocol-related code")
except Exception as e:
    print(f"Error: {e}")
EOF

Repository: openshift/console

Length of output: 258


Check which subprotocol is negotiated before hardcoding base64 framing in cleanup.

The sendExecExitCommand function assumes base64.channel.k8s.io, but the actual subprotocol is negotiated dynamically from client headers and can vary. For binary protocols like v4.channel.k8s.io or v5.channel.k8s.io, the frame format "0"+base64(text) is invalid—the cleanup will fail silently and leave orphaned processes. Use backend.Subprotocol() to gate the encoding logic.

Suggested refactor
 func sendExecExitCommand(backend *websocket.Conn) {
-	exitCmd := base64.StdEncoding.EncodeToString([]byte("exit\r"))
-	if err := backend.WriteMessage(websocket.TextMessage, []byte("0"+exitCmd)); err != nil {
-		klog.V(4).Infof("Failed to send exit command to exec session: %v", err)
+	switch backend.Subprotocol() {
+	case "base64.channel.k8s.io":
+		exitCmd := base64.StdEncoding.EncodeToString([]byte("exit\r"))
+		if err := backend.WriteMessage(websocket.TextMessage, []byte("0"+exitCmd)); err != nil {
+			klog.V(4).Infof("Failed to send exit command to exec session: %v", err)
+		}
+	default:
+		klog.V(4).Infof("Skipping exec exit command for unsupported websocket subprotocol: %q", backend.Subprotocol())
 	}
 }

…nect

When a terminal WebSocket connection drops, the shell process inside the
container was never terminated, leading to accumulation of orphaned
processes. This occurred because gorilla/websocket's Close() only drops
the TCP connection without sending a WebSocket close frame, and no exit
command was sent to the shell.

Send "exit" to the exec STDIN channel and a proper WebSocket close frame
to the Kubernetes API server when the proxy connection closes. Use a
dedicated backendWriteMutex to prevent concurrent writes to the backend
connection, and check the negotiated subprotocol to use the correct
frame encoding.
@Leo6Leo Leo6Leo force-pushed the fix-orphaned-terminal-processes branch from 12ece3f to 03f907a Compare April 17, 2026 20:26
@openshift-ci
Copy link
Copy Markdown
Contributor

openshift-ci bot commented Apr 18, 2026

@Leo6Leo: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name: ci/prow/e2e-gcp-console | Commit: 03f907a | Required: true | Rerun command: /test e2e-gcp-console

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.


Labels

component/backend Related to backend jira/invalid-bug Indicates that a referenced Jira bug is invalid for the branch this PR is targeting. jira/severity-moderate Referenced Jira bug's severity is moderate for the branch this PR is targeting. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants