feat: add cdf, domain_metadata, appTxn, checkpoint, and crc spec types #56

Open
rliao147 wants to merge 3 commits into main from stack/add-missing-ops
rliao147 commented Jun 16, 2026

Collaborator

🥞 Stacked PR

Use this link to review incremental changes.


Description

Builds on the read/snapshot/write spec types to add more read-side spec types:

  1. cdf: Change Data Feed reads over a version/timestamp range, capturing rows with _change_type / _commit_version / _commit_timestamp.
  2. domain_metadata: a domain entry's configuration and removed flag at a version.
  3. appTxn: SetTransaction (appId / txnVersion).
  4. checkpoint: protocol / metadata / txn / domainMetadata of a checkpoint file at a version (V1 single-file).
  5. crc: version-checksum aggregates (table size, file count, protocol). A crc spec is validated only at versions where a .crc file exists, since writing the version-checksum file is optional under the protocol.
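
To illustrate the crc rule above, here is a minimal sketch of a reader that skips validation when no version-checksum file was written. It is not the code in this PR: `crc_path` and `read_crc_aggregates` are hypothetical helpers, and the JSON field names (`tableSizeBytes`, `numFiles`, `protocol`) and the 20-digit zero-padded file name are assumed from the Delta protocol's version-checksum file description.

```python
import json
from pathlib import Path
from typing import Optional


def crc_path(log_dir: Path, version: int) -> Path:
    """Hypothetical helper: 20-digit zero-padded version plus the .crc suffix."""
    return log_dir / f"{version:020d}.crc"


def read_crc_aggregates(log_dir: Path, version: int) -> Optional[dict]:
    """Return the checksum aggregates, or None when no .crc exists at this version."""
    path = crc_path(log_dir, version)
    if not path.exists():
        # Writers may skip the checksum file, so there is nothing to validate.
        return None
    crc = json.loads(path.read_text())
    # Keep only the aggregates named in the description (assumed field names).
    return {
        "tableSizeBytes": crc["tableSizeBytes"],
        "numFiles": crc["numFiles"],
        "protocol": crc["protocol"],
    }
```

A harness built this way would treat a `None` result as "spec not applicable at this version" rather than as a failure.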

cdf reads through the DataFrame change-feed API; domain_metadata and appTxn read via two new neutral SnapshotView accessors (domainMetadataJson, setTransactionsJson); checkpoint and crc read log artifacts directly.
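
As a rough sketch of what the domain_metadata and appTxn specs capture, the snippet below replays commit actions and keeps the latest entry per key. It does not show the SnapshotView accessors themselves, whose signatures are not given here. `replay_txn_and_domains` is a hypothetical name, and the action shapes (`txn` with `appId` / `version`, `domainMetadata` with `domain` / `configuration` / `removed`) are assumed from the Delta protocol.

```python
import json
from typing import Dict, Iterable, Tuple


def replay_txn_and_domains(commits: Iterable[str]) -> Tuple[Dict[str, dict], Dict[str, dict]]:
    """Replay newline-delimited JSON commits oldest-first; the latest action per key wins."""
    txns: Dict[str, dict] = {}      # keyed by appId
    domains: Dict[str, dict] = {}   # keyed by domain name
    for commit in commits:
        for line in commit.splitlines():
            if not line.strip():
                continue
            action = json.loads(line)
            if "txn" in action:
                txns[action["txn"]["appId"]] = action["txn"]
            elif "domainMetadata" in action:
                entry = action["domainMetadata"]
                # Removed entries are kept so a spec can assert on the removed flag.
                domains[entry["domain"]] = entry
    return txns, domains
```

Retaining removed entries instead of dropping them is a choice made for this sketch, so that a spec can distinguish "removed at this version" from "never set".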

How was this patch tested?

Added multiple new test suites for the new specs, and existing suites pass. Also validated the new specs against OSS Spark and delta-kernel-rs.

rliao147 added 3 commits June 15, 2026 19:24
This PR adds a write spec, which allows DAT workloads to test Delta
writers. Any Delta writer can validate their own write APIs through the
existing read/snapshot specs.