feat(rpm): replace init-pki.sh with openshell-gateway generate-certs #1426

Merged: maxamillion merged 2 commits into NVIDIA:main from maxamillion:rpm-certgen-cutover on May 18, 2026
@maxamillion
Collaborator

Summary

RPM cutover: the gateway systemd user unit's `ExecStartPre` now invokes `openshell-gateway generate-certs --output-dir %S/openshell/tls` instead of the 197-line `deploy/rpm/init-pki.sh` openssl wrapper. One PKI implementation, one file layout, real test coverage.

Builds on #1257, which landed the `generate-certs` subcommand and its `--output-dir` local mode.

Related Issue

Related to #1258 (supersedes `tmutch/rpm-certgen-cutover`).

Changes

  • Spec rewire (`openshell.spec`):
    • `ExecStartPre=/usr/bin/openshell-gateway generate-certs --output-dir %S/openshell/tls` (was `init-pki.sh %S/openshell/tls`).
    • Removed the `install -pm 0755 deploy/rpm/init-pki.sh ...` line and the matching `%files gateway` entry.
  • `deploy/rpm/init-pki.sh` deleted (-197 lines).
  • `pki.rs::DEFAULT_SERVER_SANS` gains `host.containers.internal` so Podman parity is built-in. Docker (`host.docker.internal`) and Kubernetes (cluster.local DNS) were already covered.
  • Docs: man page (`deploy/man/openshell-gateway.8.md`), RPM `CONFIGURATION.md`, and the comment in `init-gateway-env.sh` all point at the new entrypoint.
  • Stale comment cleanup (`certgen.rs`): removed references to `init-pki.sh` in module and function docs.
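
To illustrate why a built-in SAN entry matters for Podman, here is a minimal sketch of an exact-match hostname check against a SAN list. `EXAMPLE_SERVER_SANS` and `san_covers` are hypothetical names made up for this example. The list holds only the two host aliases named in this PR, not the real contents of `pki.rs::DEFAULT_SERVER_SANS`, and real TLS verification also handles wildcard and IP SANs.

```rust
/// Hypothetical stand-in for a built-in SAN list. Only the two host aliases
/// named in this PR are included; the real `DEFAULT_SERVER_SANS` has more.
const EXAMPLE_SERVER_SANS: &[&str] = &[
    "host.docker.internal",     // Docker host alias (already covered)
    "host.containers.internal", // Podman host alias (added by this PR)
];

/// Returns true if `host` equals any SAN entry, ignoring ASCII case.
/// Wildcard and IP SAN matching are deliberately left out of this sketch.
fn san_covers(sans: &[&str], host: &str) -> bool {
    sans.iter().any(|san| san.eq_ignore_ascii_case(host))
}

fn main() {
    let hosts = ["host.containers.internal", "host.docker.internal", "example.invalid"];
    for host in hosts.iter() {
        println!("{}: covered = {}", host, san_covers(EXAMPLE_SERVER_SANS, host));
    }
}
```

With the Podman alias in the default list, a container that reaches the gateway through `host.containers.internal` should pass hostname validation without a per-deployment `--server-san` flag.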

Testing

Validated end-to-end on a Fedora 43 VM (rootless Podman, netavark) with the built RPMs:

  • Ph1: `init-pki.sh` absent from RPM; correct `ExecStartPre` lines in unit file

  • Ph2: First-start PKI generation via `openshell_server::certgen`; TLS+mTLS enabled

  • Ph3: 6 PEM files, correct layout, 600 key permissions, CA chain valid

  • Ph4: All 9 SANs confirmed including new `host.containers.internal`

  • Ph5: CLI auto-discovery certs populated at `~/.config/openshell/gateways/openshell/mtls/`

  • Ph6: Idempotency — restart skips regen with `PKI files already exist, skipping`

  • Ph7: Self-healing — deleted CLI mtls dir re-populated on restart without PKI regen

  • Ph8: mTLS enforced (plaintext → reset; no-client-cert → TLS alert; full mTLS → success)

  • Ph9: Podman sandbox created, exec'd, deleted; supervisor connected via `host.containers.internal`

  • Ph10: Cert rotation — delete TLS dir → new certs → CLI certs updated → sandbox works

  • Ph11: Partial state → `ExecStartPre` fails with explicit recovery hint

  • `mise run pre-commit` passes

  • Unit tests pass (14 certgen + 3 pki tests)

  • E2E tests added/updated (not applicable — RPM packaging path)
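
The idempotency (Ph6) and partial-state (Ph11) behaviour above can be pictured as a three-way classification of the output directory. This is a rough sketch under assumptions, not the `openshell_server::certgen` implementation. `PkiState` and `classify_pki` are hypothetical names, and the six relative paths are taken from the layout described in the commit message.

```rust
use std::path::Path;

/// The six PEM files, relative to the output directory, as listed in the
/// commit message for this PR.
const PKI_FILES: [&str; 6] = [
    "ca.crt",
    "ca.key",
    "server/tls.crt",
    "server/tls.key",
    "client/tls.crt",
    "client/tls.key",
];

/// Hypothetical classification of what is on disk.
#[derive(Debug, PartialEq)]
enum PkiState {
    /// No files present: a first start would generate everything.
    Missing,
    /// All six files present: generation can be skipped.
    Complete,
    /// Some files present, some not: carries the missing relative paths.
    Partial(Vec<&'static str>),
}

/// Classifies the PKI state using a caller-supplied existence check, so the
/// logic can be exercised without touching the filesystem.
fn classify_pki(exists: impl Fn(&str) -> bool) -> PkiState {
    let missing: Vec<&'static str> = PKI_FILES
        .iter()
        .copied()
        .filter(|f| !exists(*f))
        .collect();
    match missing.len() {
        0 => PkiState::Complete,
        n if n == PKI_FILES.len() => PkiState::Missing,
        _ => PkiState::Partial(missing),
    }
}

fn main() {
    // Directory to inspect; defaults to the current directory.
    let arg = std::env::args().nth(1).unwrap_or_else(|| String::from("."));
    let root = Path::new(&arg);
    match classify_pki(|rel| root.join(rel).exists()) {
        PkiState::Complete => println!("PKI files already exist, skipping"),
        PkiState::Missing => println!("no PKI files found; a first start would generate them"),
        PkiState::Partial(missing) => {
            // Assumed behaviour: refuse to guess and report what is missing.
            eprintln!("partial PKI state, missing: {:?}", missing);
            std::process::exit(1);
        }
    }
}
```

Treating a partial directory as an error, rather than filling in the gaps, avoids mixing keys from different CA generations. That matches the Ph11 result, where `ExecStartPre` fails with a recovery hint.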

TaylorMutch and others added 2 commits May 15, 2026 16:21
Cuts the RPM gateway over to the unified Rust certgen path. The systemd
user unit's first ExecStartPre now invokes:

  /usr/bin/openshell-gateway generate-certs --output-dir %S/openshell/tls

producing the same six-PEM layout init-pki.sh built (ca.{crt,key},
server/tls.{crt,key}, client/tls.{crt,key}) and the same CLI mTLS copy
under $XDG_CONFIG_HOME/openshell/gateways/openshell/mtls/. None of the
OPENSHELL_TLS_* / OPENSHELL_PODMAN_TLS_* paths in the unit change.

Adds host.containers.internal to the gateway's built-in SAN list so
podman containers reaching their host validate cleanly with no
per-deployment --server-san flag. Docker (host.docker.internal) and
Kubernetes (cluster.local DNS) were already covered.

Drops 197 lines of openssl shell, the install/file lines for the script
itself, and updates the docs (man page, RPM CONFIGURATION.md, env-file
generator comment) to point at the new entrypoint. The %S state dir,
unit security hardening, and consumer paths are untouched.
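
The CLI mTLS copy location mentioned above can be sketched as a small path resolver. This is an illustration under assumptions: `cli_mtls_dir` is a hypothetical helper, and the fallback to `$HOME/.config` follows the XDG Base Directory convention, not confirmed gateway behaviour. It is consistent with the `~/.config/...` path reported in Ph5.

```rust
use std::path::PathBuf;

/// Resolves the directory for the CLI mTLS copy.
/// Uses `xdg_config_home` when set and non-empty, otherwise `<home>/.config`
/// (the XDG Base Directory default). Assumed logic, for illustration only.
fn cli_mtls_dir(xdg_config_home: Option<&str>, home: &str) -> PathBuf {
    let base = match xdg_config_home {
        Some(dir) if !dir.is_empty() => PathBuf::from(dir),
        _ => PathBuf::from(home).join(".config"),
    };
    base.join("openshell")
        .join("gateways")
        .join("openshell")
        .join("mtls")
}

fn main() {
    let xdg = std::env::var("XDG_CONFIG_HOME").ok();
    let home = std::env::var("HOME").unwrap_or_else(|_| String::from("/"));
    println!("{}", cli_mtls_dir(xdg.as_deref(), &home).display());
}
```
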
@maxamillion maxamillion requested review from a team, derekwaynecarr and mrunalp as code owners May 18, 2026 15:49
@maxamillion maxamillion merged commit 71209e6 into NVIDIA:main May 18, 2026
5 checks passed