feat: image-backed microVM boot #11
Merged
Extract attach_args() as a pure function to assemble losetup arguments without filesystem access. Replace the clap-based test (which invoked the ok_backing_file validator and required /srv/hyper/test.img to exist) with hermetic unit tests that assert on attach_args() directly:
- attach_args(false, path) includes --read-only
- attach_args(true, path) omits --read-only
This validates the real --read-only-omission logic without requiring any backing files to exist on the system.
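A minimal sketch of what such a pure argument builder could look like. The signature, the meaning of the boolean (`writable`), and the `--find`/`--show` flags are assumptions for illustration; only the `--read-only` behaviour is taken from the commit message.

```rust
use std::ffi::OsString;
use std::path::Path;

/// Assemble the `losetup` argument vector without touching the filesystem.
/// Hypothetical signature: `writable == false` adds `--read-only`,
/// `writable == true` omits it.
pub fn attach_args(writable: bool, backing: &Path) -> Vec<OsString> {
    // `--find --show` (pick a free loop device and print it) is an assumed
    // baseline; the real helper may pass different flags.
    let mut args: Vec<OsString> = vec!["--find".into(), "--show".into()];
    if !writable {
        args.push("--read-only".into());
    }
    // The path is only copied into the argument list, never opened or stat'ed.
    args.push(backing.as_os_str().to_owned());
    args
}
```

Because the function never opens the path, a unit test can pass a path that does not exist on the test machine.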
…link traversal
CRITICAL: Replace mknod --major/--minor with --device <BlockDev>. The helper now opens the device with O_PATH|O_NOFOLLOW, fstats it, and uses the kernel's own st_rdev for mknodat, so callers can no longer name an arbitrary major:minor.
HIGH (x2): Introduce open_parent_nofollow in safe_dev.rs. It walks every parent component of a JailPath under JAIL_BASE with openat(O_NOFOLLOW), so a symlinked component causes ELOOP → SymlinkComponent before any write occurs. mknodat, linkat, and fchownat(AT_SYMLINK_NOFOLLOW) are all relative to the verified parent fd. This replaces the plain chown/hard_link/mknod calls in mknod.rs and stage.rs.
Add pure unit tests for check_owner (uid/gid 0 and <1000 rejected) and jail_relative_parts (strips JAIL_BASE, splits parent+name), alongside the existing JailPath lexical tests. 18 tests pass; cargo build --release clean.
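The owner rule mentioned above (uid/gid 0 and anything below 1000 rejected) is the kind of check that can stay pure and unit-testable. A sketch under assumptions: the error type, its variants, and the constant name are hypothetical, and the real check_owner may report failures differently.

```rust
/// Lowest uid/gid accepted as a chown target. The value 1000 comes from the
/// "<1000 rejected" rule in the commit message; the name is hypothetical.
const MIN_UNPRIVILEGED_ID: u32 = 1000;

/// Hypothetical error type for this sketch.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum OwnerError {
    Uid(u32),
    Gid(u32),
}

/// Reject root (0) and every other id below the unprivileged range, so the
/// helper cannot be asked to hand a device node to a system account.
pub fn check_owner(uid: u32, gid: u32) -> Result<(), OwnerError> {
    if uid < MIN_UNPRIVILEGED_ID {
        return Err(OwnerError::Uid(uid));
    }
    if gid < MIN_UNPRIVILEGED_ID {
        return Err(OwnerError::Gid(gid));
    }
    Ok(())
}
```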
… Writable.release/1 holder
Add comments exposing the silent coupling between Rust setuid helper constants and Elixir config. JAIL_BASE and HYPER_BASE must match config :hyper, work_dir and its derived paths; changing one without rebuilding the helper breaks device staging with opaque errors. Also document Writable.release/1 holder semantics.
…est, drop libc dep
…ce work_dir/helper base match at node startup
…e-adoption
A controller crash now discards the daemon (no orphaned VM) and the fresh controller cold-boots. terminate always stops the daemon; MuonTrap kills the firecracker OS process when its port closes, so none survive teardown/BEAM death. Removes the Daemon.ensure adopt branch and the AwaitingApi Running/Paused shortcut.
Daemon is now a static :permanent child of a plain Core supervisor (:one_for_all); a firecracker crash exits it and Core restarts daemon+controller together for a clean cold boot. Daemon.start_link resets the stale jail (chroot + cgroup, via the new reset-jail helper) before each launch so relaunch succeeds. Removes the controller's daemon monitor, the :booting/:crashed states, re-adoption, and DynamicSupervisor; State.init goes straight to :awaiting_api.
Per-axis Check traits over six markers (absoluteness, components, existence, file type, owner, mode); Any turns an axis off. One trait per axis so each type-parameter slot only accepts its own markers. Validation runs every enforced axis sharing a single symlink_metadata call, and reports the first failure via one ValidationError (no per-combination error type - Rust can't synthesise one). Not yet wired into the tools.
Seventh axis. Unlike the six type-only markers, confinement carries a runtime base value (a &Path can't be a type parameter), so LivesUnder<'a> holds the base and is supplied to a dedicated under(path, base) constructor. TryFrom stays the entry for the unconfined (Any) case.
Drop the lifetime: SafePath<...,LivesUnder> no longer borrows the base, so a confined path is self-contained. under() takes an owned PathBuf (e.g. Config::get().jail_base() directly).
Remove the metadata axes (MustExist/IsRegularFile/RootOwner/OnlyRootWritable) and their by-name stat: checked by path they are TOCTOU-racy, so calling them "safe" was a footgun. SafePath now reasons about the name only (absoluteness, components, confinement-prefix); existence/type/owner/mode move to fd-based verification (future safe_file).
Lexical starts_with confinement is defeated by a symlinked component, so it was the same false-safety footgun as the metadata checks. SafePath is now purely absoluteness + components; real confinement is the O_NOFOLLOW walk (fd-side).
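The per-axis design described in the last few commits, reduced to the two lexical axes that remain at this point, might look roughly like the following. Trait names, the error variants, and the exact component rule (rejecting `..` only) are assumptions; the commits only name the axes, the Any marker, and the single ValidationError.

```rust
use std::marker::PhantomData;
use std::path::{Component, Path, PathBuf};

/// One shared error type; the first failing axis is reported.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ValidationError {
    NotAbsolute,
    BadComponent,
}

// One trait per axis, so each type-parameter slot accepts only its own markers.
pub trait AbsCheck {
    fn check(p: &Path) -> Result<(), ValidationError>;
}
pub trait CompCheck {
    fn check(p: &Path) -> Result<(), ValidationError>;
}

pub struct Any; // turns an axis off
pub struct Absolute;
pub struct StrictComponents;

impl AbsCheck for Any {
    fn check(_: &Path) -> Result<(), ValidationError> { Ok(()) }
}
impl CompCheck for Any {
    fn check(_: &Path) -> Result<(), ValidationError> { Ok(()) }
}
impl AbsCheck for Absolute {
    fn check(p: &Path) -> Result<(), ValidationError> {
        if p.is_absolute() { Ok(()) } else { Err(ValidationError::NotAbsolute) }
    }
}
impl CompCheck for StrictComponents {
    fn check(p: &Path) -> Result<(), ValidationError> {
        // Assumed rule for this sketch: no `..` anywhere in the path.
        if p.components().any(|c| matches!(c, Component::ParentDir)) {
            Err(ValidationError::BadComponent)
        } else {
            Ok(())
        }
    }
}

/// A path whose type records which lexical checks it has passed.
pub struct SafePath<A, C> {
    path: PathBuf,
    _axes: PhantomData<(A, C)>,
}

impl<A: AbsCheck, C: CompCheck> TryFrom<PathBuf> for SafePath<A, C> {
    type Error = ValidationError;
    fn try_from(path: PathBuf) -> Result<Self, ValidationError> {
        <A as AbsCheck>::check(&path)?;
        <C as CompCheck>::check(&path)?;
        Ok(SafePath { path, _axes: PhantomData })
    }
}

impl<A, C> SafePath<A, C> {
    pub fn as_path(&self) -> &Path { &self.path }
}
```

Nothing here touches the filesystem, which is the point of the last two commits: the type says only what the name looks like, and says nothing about what the name resolves to.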
Wraps an open fd (backed by std OwnedFd) and closes it exactly once, on drop - never before. The fd half of the path-safety story: resolve a name to a descriptor once, then verify (fstat) and operate (*at) through the held fd, immune to the by-name TOCTOU races. Replaces the scattered manual close calls (which leak on early return / risk double-close).
SafeFile<T,R,O> proves in its type which fstat-checked properties the held fd has: file type / ownership / mode, the same axes pulled out of SafePath (by-name they were TOCTOU footguns; on the fd they are sound). Verification runs once in TryFrom<OwnedFd> sharing one fstat; Any turns an axis off. Existence needs no axis - holding an fd proves the file exists.
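A std-only sketch of the fd-side idea: take ownership of a descriptor, fstat it once through the fd, run the enabled axes, and keep the result in the type. Trait names, error variants, and the exact meaning of RootOwner (uid 0) and OnlyRootWritable (no group/other write bit) are assumptions here; the helper itself uses nix rather than `File::metadata`.

```rust
use std::fs::{File, Metadata};
use std::marker::PhantomData;
use std::os::fd::{AsFd, BorrowedFd, OwnedFd};
use std::os::unix::fs::MetadataExt;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum FileError { Stat, NotRegularFile, NotRootOwned, WritableByOthers }

// One trait per axis: file type, owner, mode.
pub trait TypeCheck { fn check(m: &Metadata) -> Result<(), FileError>; }
pub trait OwnerCheck { fn check(m: &Metadata) -> Result<(), FileError>; }
pub trait ModeCheck { fn check(m: &Metadata) -> Result<(), FileError>; }

pub struct Any; // turns an axis off
pub struct IsRegularFile;
pub struct RootOwner;
pub struct OnlyRootWritable;

impl TypeCheck for Any { fn check(_: &Metadata) -> Result<(), FileError> { Ok(()) } }
impl OwnerCheck for Any { fn check(_: &Metadata) -> Result<(), FileError> { Ok(()) } }
impl ModeCheck for Any { fn check(_: &Metadata) -> Result<(), FileError> { Ok(()) } }

impl TypeCheck for IsRegularFile {
    fn check(m: &Metadata) -> Result<(), FileError> {
        if m.is_file() { Ok(()) } else { Err(FileError::NotRegularFile) }
    }
}
impl OwnerCheck for RootOwner {
    fn check(m: &Metadata) -> Result<(), FileError> {
        if m.uid() == 0 { Ok(()) } else { Err(FileError::NotRootOwned) }
    }
}
impl ModeCheck for OnlyRootWritable {
    fn check(m: &Metadata) -> Result<(), FileError> {
        if (m.mode() & 0o022) == 0 { Ok(()) } else { Err(FileError::WritableByOthers) }
    }
}

/// An open fd whose type records which fstat-checked properties it has.
pub struct SafeFile<T, R, O> {
    fd: OwnedFd,
    _axes: PhantomData<(T, R, O)>,
}

impl<T: TypeCheck, R: OwnerCheck, O: ModeCheck> TryFrom<OwnedFd> for SafeFile<T, R, O> {
    type Error = FileError;
    fn try_from(fd: OwnedFd) -> Result<Self, FileError> {
        // Metadata is read from the descriptor, not from a name, and shared
        // by every axis. On any error `file` is dropped, closing the fd once.
        let file = File::from(fd);
        let meta = file.metadata().map_err(|_| FileError::Stat)?;
        <T as TypeCheck>::check(&meta)?;
        <R as OwnerCheck>::check(&meta)?;
        <O as ModeCheck>::check(&meta)?;
        Ok(SafeFile { fd: OwnedFd::from(file), _axes: PhantomData })
    }
}

impl<T, R, O> SafeFile<T, R, O> {
    /// Borrow the verified descriptor for later `*at` or read operations.
    pub fn as_fd(&self) -> BorrowedFd<'_> { self.fd.as_fd() }
}
```

Since the checks run against the held descriptor, swapping the file at that path afterwards cannot invalidate what the type claims about the fd.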
open(path, flags) takes a lexically-validated SafePath, opens it with O_NOFOLLOW|O_CLOEXEC always forced, and runs the fstat axes - so one call proves existence (the open succeeded) plus type/owner/mode, all in the returned type. O_PATH to verify-only, O_RDONLY to also read. Guards only the final component; confined trees still need the fd-by-fd parent walk.
Replace the bespoke O_NOFOLLOW open + metadata() owner/mode/type checks in Config::safe_load with the pipeline: lexical SafePath gate, then SafeFile::<IsRegularFile,RootOwner,OnlyRootWritable>::open(O_RDONLY) which proves existence + the fstat axes on the held fd, then read through that fd. First real consumer of the new utilities; security logic lives in one place.
Drop the from_validation mapping function. LoadingError now wraps the underlying errors via #[from] (Path/File variants), so safe_load uses ? and the precise messages surface directly. Both ValidationErrors are now Copy (payloads already are) to keep LoadingError Copy.
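The shape of that change can be sketched without external crates by writing the `From` impls by hand, which is roughly what a `#[from]` attribute generates. The inner error names, their variants, and the toy `safe_load` signature are hypothetical stand-ins.

```rust
/// Stand-ins for the two ValidationError types (path side and file side).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum PathValidationError { NotAbsolute }

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum FileValidationError { NotRegularFile }

/// Wraps the underlying errors instead of mapping them to its own messages.
/// It can derive Copy only because both payloads are Copy.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum LoadingError {
    Path(PathValidationError),
    File(FileValidationError),
}

// Hand-written equivalents of what `#[from]` would derive.
impl From<PathValidationError> for LoadingError {
    fn from(e: PathValidationError) -> Self { LoadingError::Path(e) }
}
impl From<FileValidationError> for LoadingError {
    fn from(e: FileValidationError) -> Self { LoadingError::File(e) }
}

fn check_path(path: &str) -> Result<(), PathValidationError> {
    if path.starts_with('/') { Ok(()) } else { Err(PathValidationError::NotAbsolute) }
}

fn check_file(is_regular: bool) -> Result<(), FileValidationError> {
    if is_regular { Ok(()) } else { Err(FileValidationError::NotRegularFile) }
}

/// With the From impls in place, `?` converts each error on its own;
/// no explicit mapping function is needed.
pub fn safe_load(path: &str, is_regular: bool) -> Result<(), LoadingError> {
    check_path(path)?;
    check_file(is_regular)?;
    Ok(())
}
```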
SafeDir owns a directory fd and is both the walk primitive (openat_dir descends one component O_NOFOLLOW|O_DIRECTORY, relative to the pinned fd) and the home for fd-relative removal: unlink/rmdir via unlinkat, and a recursive remove_dir_all that descends with fresh openat'd fds and never re-resolves a path by name (vs std::fs::remove_dir_all, a by-name TOCTOU footgun). Symlinked entries are unlinked, never followed; DT_UNKNOWN falls back to a confined open probe. Adds the nix "dir" feature.
…kDevice
Primitives for the walk migration:
- SafePath::relative_to(base) -> (parents, leaf), gated on StrictComponents.
- SafeDir: descend (walk), openat_file, create_file, mknod_block, link_from, chown, try_clone.
- SafeFile: IsBlockDevice tag + type-gated rdev() accessor.
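As an illustration of the first primitive, here is a free-function sketch of splitting a path into the parent components to walk and the final leaf name. It uses PathBuf components, the form a later commit in this PR settles on; the `Option` return and the function being free-standing rather than a SafePath method are assumptions.

```rust
use std::path::{Component, Path, PathBuf};

/// Split `path` into (parent components, leaf) relative to `base`.
/// Returns None when `path` is not lexically under `base`, is `base` itself,
/// or contains anything other than plain name components after the base.
pub fn relative_to(path: &Path, base: &Path) -> Option<(Vec<PathBuf>, PathBuf)> {
    let rest = path.strip_prefix(base).ok()?;
    let mut parts = Vec::new();
    for component in rest.components() {
        match component {
            Component::Normal(name) => parts.push(PathBuf::from(name)),
            // `..`, a root, or a prefix after the base is rejected outright.
            _ => return None,
        }
    }
    // The last component is the leaf; everything before it is walked fd by fd.
    let leaf = parts.pop()?;
    Some((parts, leaf))
}
```

The split is purely lexical. It only produces the names for the walk; the confinement itself comes from opening each parent with O_NOFOLLOW relative to the previous fd.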
…ete JailPath
Retire the bespoke path/fd security code in favour of the typed utilities:
- prepare: walk the chroot from JAIL_BASE via SafeDir (O_NOFOLLOW, confinement
proven), then stage kernel + mknod rootfs relative to that dir fd.
- stage: stage_into(parent, name, src, ...) - link_from / create_file+copy /
chown via SafeDir; canonicalize+confine the source. No manual close dance.
- mknod: device fd is SafeFile<IsBlockDevice>; rdev() is type-gated; node via
SafeDir::mknod_block.
- remove: fd-relative deletion via SafeDir.remove_dir_all / rmdir after an
O_NOFOLLOW walk, replacing std::fs::remove_dir_all (by-name TOCTOU).
- safe_dev: deleted JailPath, jail_relative_parts, open_parent_nofollow and the
Jail/SymlinkComponent/DeviceStat error variants; it is now device-name
newtypes only.
- trimmed dead code (safe_path::Any/as_path, safe_dir::openat_file).
Confinement is now proven by the walk, every fd is RAII, and the whole
O_NOFOLLOW/fstat/openat/close surface lives in util/{safe_path,safe_file,safe_dir}.
…calize
canonicalize + confine-under-HYPER_BASE stays (path resolution losetup needs as a path), but the manual open + fstat + S_IFREG check becomes SafeFile::<IsRegularFile,..>::open(O_PATH), so the regular-file proof rides the held fd. SafeFile's fd is O_CLOEXEC, so dup an inheritable copy for the child losetup to reopen via /proc/self/fd; the SafeFile closes on drop. Drops the bespoke OpenBacking-open/NotRegularFile logic.
One type per file: snapshot.rs (SnapshotTable), thin_pool.rs (ThinPoolTable), thin.rs (ThinTable), table.rs (DmTable), message.rs (ThinMessage). mod.rs keeps the shared Error and the Dmsetup tool. No behaviour change.
util/chroot_jail.rs: a lazy, declarative builder for a VM chroot's contents. Kernel and rootfs slots start Unset and carry their value once set (with_kernel/with_rootfs); build() exists only on ChrootJail<Kernel,Rootfs>, so a jail missing either artifact is a compile error. build() does the confined O_NOFOLLOW walk then realizes each artifact relative to the chroot dir fd - the staging (hardlink/copy/chown) and mknod logic folded in here. prepare.rs collapses to a three-line declaration. Deleted tools/stage.rs and tools/mknod.rs (logic now lives once, in the builder).
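The typestate idea behind the builder can be sketched as follows. To stay self-contained, `build()` here returns a plan of steps instead of performing the O_NOFOLLOW walk, hardlink/copy, or mknod; the `Step` enum, the `String` device name in the rootfs slot, and the example paths are all hypothetical.

```rust
use std::path::PathBuf;

/// Marker for a slot that has not been filled yet.
pub struct Unset;
/// Filled slots carry their value.
pub struct Kernel(PathBuf);
pub struct Rootfs(String); // hypothetical: a block-device name

pub struct ChrootJail<K, R> {
    chroot: PathBuf,
    kernel: K,
    rootfs: R,
}

/// Hypothetical stand-in for the real side effects of build().
#[derive(Debug, PartialEq, Eq)]
pub enum Step {
    StageKernel { from: PathBuf, into: PathBuf },
    MknodRootfs { device: String, into: PathBuf },
}

impl ChrootJail<Unset, Unset> {
    pub fn new(chroot: impl Into<PathBuf>) -> Self {
        ChrootJail { chroot: chroot.into(), kernel: Unset, rootfs: Unset }
    }
}

impl<R> ChrootJail<Unset, R> {
    pub fn with_kernel(self, kernel: impl Into<PathBuf>) -> ChrootJail<Kernel, R> {
        ChrootJail { chroot: self.chroot, kernel: Kernel(kernel.into()), rootfs: self.rootfs }
    }
}

impl<K> ChrootJail<K, Unset> {
    pub fn with_rootfs(self, device: impl Into<String>) -> ChrootJail<K, Rootfs> {
        ChrootJail { chroot: self.chroot, kernel: self.kernel, rootfs: Rootfs(device.into()) }
    }
}

// build() exists only when both slots are filled, so calling it on a
// half-configured jail does not compile.
impl ChrootJail<Kernel, Rootfs> {
    pub fn build(self) -> Vec<Step> {
        vec![
            Step::StageKernel { from: self.kernel.0, into: self.chroot.clone() },
            Step::MknodRootfs { device: self.rootfs.0, into: self.chroot },
        ]
    }
}
```

A call such as `ChrootJail::new("/srv/jail/vm1").with_kernel("/srv/hyper/vmlinux").with_rootfs("dm-3").build()` type-checks, while dropping either `with_*` call leaves a type that has no `build` method.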
git mv src/{safe_bin,safe_dev}.rs -> src/util/; declared in util/mod.rs;
repointed crate::safe_{bin,dev} -> crate::util::safe_{bin,dev}. No behaviour
change.
Across the helper, filesystem paths and path components are now Path/PathBuf:
- SafeDir: name params &str -> &Path, descend &[String] -> &[PathBuf], errors hold PathBuf; entries read filenames as PathBuf via OsStr (no UTF-8 coupling, drops the BadName variant).
- SafePath::relative_to -> (Vec<PathBuf>, PathBuf), dropping NonUtf8.
- ChrootJail: chroot/kernel slots PathBuf; new/with_kernel take Into<PathBuf>.
- prepare/remove args -> PathBuf; remove helpers take &Path.
- losetup backing-file value parser returns PathBuf.
- output device fields (losetup/dmsetup) and sys-test hyper_base -> PathBuf (serde still serialises them as JSON strings, so the wire is unchanged).
Left as-is (not paths): FromStr(&str) parse interfaces, DmName/SafeBin validated-name newtypes, and the path-literal consts.
# Conflicts:
#   lib/hyper.ex
#   lib/hyper/node.ex
#   lib/hyper/node/fire_vmm/state.ex
#   native/suidhelper/src/tools/dmsetup/mod.rs
#   native/suidhelper/src/tools/losetup.rs
Pass the newly-added CI's Rust gates: rustfmt the tree, and match the EXDEV errno in the pattern instead of a guard (clippy::redundant_guard).
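For readers unfamiliar with clippy::redundant_guard, the fix amounts to putting the constant in the pattern rather than binding the value and comparing it in an `if` guard. A sketch under assumptions: the function, the outcome enum, and the link-then-copy fallback are hypothetical, and EXDEV is hard-coded to its Linux value (18) where real code would take it from libc or nix.

```rust
use std::io;

/// EXDEV ("cross-device link") as a raw errno; 18 is the Linux value.
const EXDEV: i32 = 18;

#[derive(Debug, PartialEq, Eq)]
pub enum LinkOutcome {
    Linked,
    FallBackToCopy,
}

/// Decide what to do after a hard-link attempt.
pub fn classify_link_result(res: io::Result<()>) -> io::Result<LinkOutcome> {
    match res {
        Ok(()) => Ok(LinkOutcome::Linked),
        Err(e) => match e.raw_os_error() {
            // The errno is matched in the pattern itself, instead of
            // `Some(code) if code == EXDEV`, which clippy flags.
            Some(EXDEV) => Ok(LinkOutcome::FallBackToCopy),
            _ => Err(e),
        },
    }
}
```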
- Img.create_mutable / Daemon.start_link: narrow the supervisor/MuonTrap
start result via case so the return matches the {:ok,pid}|{:error,_} spec
(was returning the wider on_start_child type directly -> missing_range).
- Img.Mutable.State: add the @type t referenced by drop/2's spec
(was unknown_type).
No description provided.