
# v1.3.31+claude1.2773.0: cowork-vm-service daemon terminates mid-session and does not recover; persists across reboot #408

@RayCharlizard

Description

Environment

  • Package: claude-desktop-appimage (AUR)
  • Version: 1.3.31+claude1.2773.0-1
  • Install date: 2026-04-15 22:49:47 CDT
  • Distro: CachyOS Linux (rolling), kernel 7.0.0-1-cachyos
  • Desktop: KDE Plasma on Wayland (XWayland for Claude Desktop, --ozone-platform=x11)
  • Shell: fish 4.1.2
  • Cowork backend: host-direct (via override)
  • Previous release on same system: worked correctly

Summary

On the host-direct backend, cowork-vm-service starts and functions normally on a fresh launch of this version. After roughly 40 minutes of active use (two completed Cowork sessions), the daemon terminates without logging a cause; the only symptom is a [Keepalive] Ping failed message from the Node-side vm-client. From that point forward the daemon is never re-spawned: not by the running app, not after a full quit and relaunch, and not after an OS reboot. Auto-reinstall triggers once on the next launch (preserving sessiondata.img and rootfs.img.zst) and the retry fails identically.

End state: no cowork-vm-service process, no Unix socket at /run/user/$UID/cowork-vm-service.sock, no daemon stderr captured anywhere, and vm-client retrying connect() in an ENOENT loop.
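That end state boils down to two checks, which can be scripted. A minimal probe sketch (socket path as it appears in this report; the `pgrep` pattern is bracketed so it never matches the probe's own command line):

```shell
# Probe the two facts that define the stuck state: daemon process and socket.
SOCK="/run/user/$(id -u)/cowork-vm-service.sock"

# Bracketed ERE so pgrep -f cannot match this script itself.
if pgrep -f '[c]owork-vm-service' >/dev/null 2>&1; then
    echo "daemon: running"
else
    echo "daemon: not running"
fi

if [ -S "$SOCK" ]; then
    echo "socket: present ($SOCK)"
else
    echo "socket: missing ($SOCK)"
fi
```

While the bug is active, both checks report the failure branch.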

Reproduction

  1. Install v1.3.31+claude1.2773.0 on host-direct backend.

  2. Launch Claude Desktop.

  3. Open Cowork sessions and use them normally — this works.

  4. After some period of use (in my case ~40 minutes with two sessions), a keepalive ping failure is logged and the daemon terminates.

  5. Every subsequent attempt to open a Cowork session (in the same running app, after a full quit+relaunch, and after an OS reboot) fails with:

    Failed to start Claude's workspace
    VM service not running. The service failed to start.

Timeline from logs (single continuous session)

All times CDT, same day as install.

| Time | Event |
| --- | --- |
| 22:49:47 | Package installed |
| 22:50:23–22:50:25 | VM startup succeeds: download_and_sdk_prepare, load_swift_api, smol-bin.x64.vhdx copy, Keepalive: Starting, Startup complete, total time: 1896ms |
| 22:51:01 | First Cowork session amazing-upbeat-mendel spawned successfully |
| 22:58:17 | Second Cowork session tender-dazzling-mccarthy spawned successfully |
| 23:09:48 | First session killed cleanly (SIGTERM, code=0, duration=1127362ms) |
| 23:32:11 | [Keepalive] Ping failed: Keep-alive ping timed out |
| 23:32:18 | [Keepalive] Ping failed: Keep-alive ping timed out |
| – | (OS reboot) |
| 23:45:27 | App starts. appVersion: '1.2773.0' |
| 23:45:35 | [VM:start] Beginning startup: download_and_sdk_prepare ✓ → load_swift_api ✓ → Copying smol-bin.x64.vhdx |
| 23:45:40 | [error] [VM:start] VM boot failed: VM service not running. The service failed to start. |
| 23:45:40 | [VM:start] Auto-reinstalling workspace after startup failure; reinstall files deleted, sessiondata.img and rootfs.img.zst preserved |
| 23:45:44 | Retry: identical failure |
| 23:45:44 | [VM:start] Skipping auto-reinstall (already attempted once) |
| 23:51:12+ | Subsequent launches: identical failure every time |

Between the successful 22:58:17 session spawn and the 23:32:11 keepalive failure there are no daemon-side log entries at all — the daemon dies silently and is only detected when the Node-side client's periodic ping stops getting a response.
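A crude external watchdog would have timestamped the silent death independently of the app's own keepalive. A sketch, assuming only python3 is available (the socket path is the one from this report; a successful connect() is treated as "alive", which is weaker than a real ping but enough to log the moment the daemon stops listening):

```shell
SOCK="/run/user/$(id -u)/cowork-vm-service.sock"

# alive PATH: exit 0 if something accepts connections on PATH, non-zero otherwise.
alive() {
    python3 - "$1" <<'PY'
import socket, sys
s = socket.socket(socket.AF_UNIX)
s.settimeout(2)
try:
    s.connect(sys.argv[1])
except OSError:
    sys.exit(1)
sys.exit(0)
PY
}

# Poll until the socket stops answering, then record the time.
while alive "$SOCK"; do
    sleep 5
done
echo "$(date -Is) cowork-vm-service socket unreachable"
```

If the socket is already gone when the script starts, it prints immediately, so the same loop doubles as a one-shot liveness check.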

Diagnostic state while the failure is active

claude-desktop --doctor

Cowork Mode
----------------
[PASS] bubblewrap: found
[PASS] KVM: accessible
[PASS] vsock: module loaded
[PASS] QEMU: found
[PASS] socat: found
[PASS] virtiofsd: found
      VM image: not downloaded yet
      Cowork isolation: host-direct (no isolation, via override)

(One unrelated [FAIL]: chrome-sandbox permissions inside the AppImage FUSE mount. The suggested fix, chmod 4755, cannot be applied to a read-only FUSE mount. Electron's --no-sandbox workaround is in effect and the app launches fine; minor, separate issue.)

No daemon process, no bwrap, no claude CLI

App running, Cowork session failed, then:

$ ps auxf | grep -iE 'cowork|claude' | grep -v grep

Returns only the Electron main process, zygotes, and NetworkService / AudioService utility children. No cowork-vm-service Node process. No bwrap. No /usr/local/bin/claude CLI.

No socket, no orphans

$ lsof -U 2>/dev/null | grep -iE 'cowork|claude'
(empty)
$ ls /run/user/$UID/cowork-vm-service.sock
No such file or directory
$ ls /tmp/cowork-* 2>/dev/null
(none)

The daemon file ships correctly

$ find /tmp -path '*claude*cowork-vm-service*'
/tmp/.mount_clauder22nbF/usr/lib/node_modules/electron/dist/resources/app.asar.unpacked/cowork-vm-service.js

/usr/bin/claude-desktop is a symlink to the AppImage

$ file /usr/bin/claude-desktop
/usr/bin/claude-desktop: symbolic link to /opt/claude-desktop/claude-desktop.AppImage

No separate wrapper/launcher sits between the shell and the AppImage bootstrap, so the daemon spawn has to happen from inside app.asar.

Launcher log: endless client-side retry

~/.cache/claude-desktop-debian/launcher.log fills with:

[vm-client] Event subscription error: connect ENOENT /run/user/1000/cowork-vm-service.sock
[vm-client] Event subscription error: connect ENOENT /run/user/1000/cowork-vm-service.sock
[vm-client] Event subscription error: connect ENOENT /run/user/1000/cowork-vm-service.sock
...

ENOENT rather than ECONNREFUSED confirms the socket file was never created on this launch — no daemon came up, nothing crashed leaving a stale socket, nothing tried and failed. The spawn call simply isn't being reached on recovery.
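The two failure modes are easy to reproduce with a throwaway path, which is why the distinction is reliable. An illustrative sketch (assumes python3; the /tmp path is made up and the real socket is not touched):

```shell
# Demonstrate the two connect(2) failure modes a unix-socket client can log.
S=/tmp/cowork-errno-demo.sock
rm -f "$S"

# 1) Socket file absent: the daemon was never spawned on this boot.
python3 - "$S" <<'PY'
import socket, sys
s = socket.socket(socket.AF_UNIX)
try:
    s.connect(sys.argv[1])
except OSError as e:
    print('missing file ->', e.strerror)   # ENOENT: No such file or directory
PY

# 2) Stale socket file: a daemon bound it, then died without unlinking it.
python3 - "$S" <<'PY'
import socket, sys, os
p = sys.argv[1]
srv = socket.socket(socket.AF_UNIX)
srv.bind(p)      # bind() creates the socket file
srv.close()      # simulated crash leaves the file behind, no listener
c = socket.socket(socket.AF_UNIX)
try:
    c.connect(p)
except OSError as e:
    print('stale file ->', e.strerror)     # ECONNREFUSED: Connection refused
os.unlink(p)
PY
```

The launcher log shows only the first mode on every relaunch, consistent with the spawn never being attempted rather than a stale socket blocking it.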

Observations

  • The daemon does function correctly on fresh launches of this version, so the asar-level integration is not fundamentally broken.
  • What regressed is the recovery path: after the daemon dies mid-session, no code path re-spawns it, and this persists across full app restart and OS reboot.
  • Auto-reinstall deletes "reinstall files" but preserves sessiondata.img and rootfs.img.zst. Retrying with those preserved files produces the identical failure. Whether those files are implicated in the stuck state is unconfirmed — I haven't tested manual deletion.
  • The initial daemon termination at 23:32 has no captured cause. Nothing in cowork_vm_node.log between the successful second session spawn at 22:58 and the first keepalive ping failure at 23:32.

Workaround

Downgrade to the previous release. Manual deletion of sessiondata.img and rootfs.img.zst in ~/.config/Claude/vm_bundles/claudevm.bundle/ is untested as a recovery step; happy to test on request.

Additional notes

Separately, sessions-bridge reports a 409 registration conflict because a Cowork agent is already registered on another device under the same account. This is expected given the current single-device lock and not part of this report.

Metadata


Labels

  • bug (Something isn't working)
  • cowork (Related to Cowork mode)
  • format: appimage (Affects AppImage builds)
  • priority: high (Important, should be addressed soon)
  • regression (Previously working, now broken)
  • triage: investigated (Issue has been triaged and investigated)
