
[enhancement] Reduce pulsar-admin / CLI startup time with Class Data Sharing (CDS) via -XX:+AutoCreateSharedArchive #26069

Description

@lhotari

Motivation

Every invocation of pulsar-admin (and the other CLI tools: pulsar-client, pulsar-perf, pulsar-shell) pays the full JVM startup and class-loading cost. In #22318, pulsar-admin tenants list was measured at ~2.4 s of wall-clock time, most of it JVM warmup and class loading. For scripting and interactive use this is a noticeable tax.

Building a GraalVM native image (#22318) cuts that ~9× (≈0.26s), but it requires per‑platform builds, ongoing reflection‑config maintenance, and extra CI cost, and the broader‑stack reflection (Jackson model serialization, auth plugins, Netty) makes a complete native pulsar-admin hard — as discussed in that issue.

Class Data Sharing (CDS) is the cheaper, lower‑risk win already suggested in #22318. It works with the existing jars, has no closed‑world/reflection constraints, and delivers a meaningful startup reduction "for relatively little work" (Brian Goetz, quoted in #22318). Since JDK 19, -XX:+AutoCreateSharedArchive (JDK‑8261455) makes it almost free to wire up: a single flag that auto‑creates the dynamic CDS archive on first run and auto‑regenerates it when the archive is missing/invalid or after a JDK upgrade — no two‑step dump and no manual archive lifecycle to manage.

This is a natural follow‑up to the CDS idea raised in #22318 (where @nodece offered "Let me try CDS"), and complements, rather than competes with, the native‑image effort.

Solution

Wire CDS into the CLI launcher scripts, gated on the detected JDK version. The common script bin/pulsar-admin-common.sh already detects JAVA_MAJOR_VERSION and assembles OPTS, so this is the natural hook (with the equivalent in the .cmd scripts):

  • JDK ≥ 19 (covers Pulsar's supported Java 21): single-step auto-create dynamic CDS, relying on the default -Xshare:auto:
    -XX:+AutoCreateSharedArchive -XX:SharedArchiveFile=<archive>
  • JDK 17–18 (still supported for the CLI): either the older two‑step dynamic CDS
    (-XX:ArchiveClassesAtExit=<archive> to seed, then -XX:SharedArchiveFile=<archive> to load),
    or simply skip CDS. -XX:+AutoCreateSharedArchive does not exist before JDK 19.
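
The version gate described above could be sketched roughly as follows. This is only an illustration: the function name cds_opts_for_jdk and the example archive path are hypothetical, and the sketch takes the "skip CDS" option for JDK 17-18 rather than the two-step variant. It assumes JAVA_MAJOR_VERSION holds a plain integer, as the issue says the common script already detects it.

```shell
# Hypothetical helper (not part of Pulsar's scripts).
# Usage: cds_opts_for_jdk <java-major-version> <archive-path>
# Prints the CDS flags for the given JDK, or nothing when CDS is skipped.
cds_opts_for_jdk() {
  local major="$1" archive="$2"
  if [ "$major" -ge 19 ]; then
    # JDK 19+: the JVM creates the dynamic archive at exit when it is
    # missing or invalid, and maps it on later runs.
    printf '%s' "-XX:+AutoCreateSharedArchive -XX:SharedArchiveFile=${archive}"
  fi
  # JDK < 19: the flag does not exist, so this sketch emits nothing.
}

# Assumed wiring: append to OPTS the way the launcher assembles other flags.
# The /tmp path is a placeholder for illustration only.
OPTS="${OPTS:-} $(cds_opts_for_jdk "${JAVA_MAJOR_VERSION:-17}" "/tmp/pulsar-admin.jsa")"
```

Because OPTS is a flat string, an archive path containing spaces would be split when the launcher expands it; a real implementation would need to pick a space-free path or handle quoting explicitly.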

Details:

  • Archive location: a writable, per‑user path keyed per tool, e.g. ${PULSAR_CDS_DIR:-$PULSAR_HOME/cds}/pulsar-admin.jsa (or an XDG/$TMPDIR cache dir). The JVM handles JDK‑version mismatch automatically with AutoCreateSharedArchive; using distinct files per tool avoids classpath‑mismatch invalidation between tools.
  • Opt‑out: an env toggle (e.g. PULSAR_CDS_ENABLED=false) so users on read‑only/locked‑down environments can disable it; users can always add flags via PULSAR_EXTRA_OPTS.
  • Safety: keep -Xshare:auto (never -Xshare:on) so a read‑only filesystem or a failed archive map falls back silently instead of crashing.
  • Default on for the CLI tools (short‑lived, frequently invoked — the prime CDS target). Extending CDS to the broker/bookie launchers (conf/pulsar_env.sh) can be a follow‑up; for long‑lived servers the relative startup benefit is smaller.
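
A minimal sketch of how the archive location, opt-out, and read-only fallback from the list above might fit together. The function name cds_archive_for_tool is hypothetical; PULSAR_CDS_ENABLED and PULSAR_CDS_DIR are the example names proposed in this issue, not existing settings, and the fallback to the current directory when PULSAR_HOME is unset is an assumption made only to keep the sketch self-contained.

```shell
# Hypothetical helper (not part of Pulsar's scripts).
# Usage: cds_archive_for_tool <tool-name>
# Prints the per-tool archive path, or nothing when CDS should be skipped.
cds_archive_for_tool() {
  local tool="$1" dir

  # Opt-out toggle proposed in this issue (name is an example).
  if [ "${PULSAR_CDS_ENABLED:-true}" = "false" ]; then
    return 0
  fi

  # Assumed default location: $PULSAR_HOME/cds, overridable via PULSAR_CDS_DIR.
  dir="${PULSAR_CDS_DIR:-${PULSAR_HOME:-.}/cds}"

  # On read-only or locked-down installs, skip CDS quietly instead of failing.
  if ! mkdir -p "$dir" 2>/dev/null || [ ! -w "$dir" ]; then
    return 0
  fi

  # One archive per tool, so differing classpaths do not invalidate each other.
  printf '%s' "${dir}/${tool}.jsa"
}
```

A launcher could call `archive="$(cds_archive_for_tool pulsar-admin)"` and add the CDS flags only when the result is non-empty, leaving -Xshare at its default of auto.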

Alternatives

  • GraalVM native image (#22318): much larger startup win (~9×) but high effort: per-platform builds, reflection-config maintenance, CI cost; gated on broader-stack reflection.
  • Two-step dynamic CDS only: drop the JDK-19 path and always use -XX:ArchiveClassesAtExit + -XX:SharedArchiveFile. More portable (works on 17/18) but needs explicit archive-lifecycle logic in the scripts and doesn't self-heal on JDK upgrade.
  • Do nothing — keep paying full startup cost per invocation.
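
For comparison, the "explicit archive-lifecycle logic" that the two-step alternative would need might look roughly like this. The function name two_step_cds_opts is hypothetical, and the sketch deliberately handles only the simplest case, a missing archive.

```shell
# Hypothetical helper illustrating the two-step alternative.
# Usage: two_step_cds_opts <archive-path>
two_step_cds_opts() {
  local archive="$1"
  if [ -f "$archive" ]; then
    # Archive already exists: load it.
    printf '%s' "-XX:SharedArchiveFile=${archive}"
  else
    # No archive yet: this run seeds it when the JVM exits.
    printf '%s' "-XX:ArchiveClassesAtExit=${archive}"
  fi
}
```

This sketch only checks that the file exists. After a JDK upgrade the stale archive would still be passed to -XX:SharedArchiveFile, and the script would have no trigger to rebuild it, which is the self-healing gap noted in the list above.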

Anything else?

Things to validate during implementation:

  • JDK‑version gate is essential. -XX:+AutoCreateSharedArchive is JDK 19+ (JDK‑8261455); it is absent in JDK 17/18. Pulsar's CLI runs on Java 17 or 21, so the script must branch on JAVA_MAJOR_VERSION.
  • Classpath gotcha. Pulsar's CLI classpath includes the conf directory and lib/*. CDS archiving has known trouble with non‑empty directories on the classpath (JDK‑8329980) — verify archive creation succeeds with Pulsar's classpath and adjust if needed.
  • Classpath must match between create and load; Pulsar's is stable per install, and -Xshare:auto + AutoCreateSharedArchive self‑heal on mismatch/upgrade. The first invocation builds the archive on exit and sees no benefit; subsequent runs do.
  • Read‑only containers: the archive path must be writable, otherwise CDS silently no‑ops (acceptable with -Xshare:auto).
  • Expected gain is real but app-dependent and must be measured, not assumed. AppCDS typically yields roughly 20–35% off startup (Spring's CDS report shows ~30–45% in some scenarios, with caveats). It will not match native-image's ~9×, but it's far cheaper and broadly applicable. Worth benchmarking pulsar-admin startup before/after.
  • Prior art / references: #22318 "[Native-image] Build a native-image for pulsar-admin" (native image + the original CDS suggestion); #25883 "[feat][client] Embed GraalVM native image config" (embedded GraalVM native-image config in the client); #17211 "Pulsar Shell: native image with GraalVM" and #4194 "Use graalVM native image to build client and client admin" (earlier native-image explorations).
  • Docs: Oracle JDK 21 — Class Data Sharing · AutoCreateSharedArchive in the java man page · JDK‑8261455 · Spring CDS blog.
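
To help validate the classpath gotcha listed above, a small check like the following could flag classpath entries that are non-empty directories before relying on archive creation. The function name nonempty_dirs_on_classpath is hypothetical, and the sketch assumes a colon-separated (Unix-style) classpath.

```shell
# Hypothetical helper: prints each classpath entry that is a non-empty directory.
# Usage: nonempty_dirs_on_classpath "<colon-separated classpath>"
nonempty_dirs_on_classpath() {
  local cp="$1" entry old_ifs="$IFS"
  IFS=':'
  set -f   # keep wildcard entries such as lib/* literal instead of globbing them
  for entry in $cp; do
    if [ -d "$entry" ] && [ -n "$(ls -A "$entry" 2>/dev/null)" ]; then
      printf '%s\n' "$entry"
    fi
  done
  set +f   # assumes globbing was enabled before the call
  IFS="$old_ifs"
}
```

Wildcard entries such as lib/* are left alone because the JVM expands them to jar files. Only plain directory entries, such as the conf directory, are reported.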
