[X-2695] Support Reverse Order Scans #6

Merged
ch-sc merged 8 commits into 0.65.0-atlas from X-2695/Support-reverse-order-scans on May 15, 2026

@ch-sc
Collaborator

@ch-sc ch-sc commented May 8, 2026

Summary

Reverse order scans optimize queries such as ORDER BY timestamp DESC LIMIT n over data that is stored ordered by timestamp ASC. This read pattern is common in time-series workloads, where callers want the most recent rows. With the current implementation, users have to fall back on naive approaches: either fully scan a Vortex file, buffer all rows, and reverse the output, or sort all rows of the file. Both are unnecessarily expensive.

If files are already written in sorted order, a scan in the opposite direction can be answered by iterating chunks from last to first and reversing the rows within each chunk, which avoids both sorting and buffering. This PR implements that by reversing the chunk ranges in the scan layer and reversing the Vortex array representation of each chunk.
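
To make the idea concrete, here is a minimal sketch that uses plain `Vec<i64>` chunks instead of Vortex arrays. The function name `reverse_scan` is made up for illustration and is not part of this PR; it only shows that chunk-order reversal plus per-chunk reversal yields a descending stream, and that a LIMIT can stop after touching only the trailing chunks.

```rust
/// Yields rows in descending order from chunks that are stored ascending,
/// visiting the last chunk first and walking each chunk back to front.
/// (Illustrative stand-in: real scans operate on Vortex arrays, not Vecs.)
fn reverse_scan<'a>(chunks: &'a [Vec<i64>]) -> impl Iterator<Item = i64> + 'a {
    chunks
        .iter()
        .rev() // global reversal: last chunk first
        .flat_map(|chunk| chunk.iter().rev().copied()) // per-chunk reversal
}

fn main() {
    // Three chunks of timestamps, globally sorted ascending.
    let chunks = vec![vec![1, 2, 3], vec![4, 5], vec![6, 7, 8, 9]];

    // Equivalent of ORDER BY timestamp DESC LIMIT 3.
    // The iterator is lazy, so only the last chunk is ever read.
    let newest: Vec<i64> = reverse_scan(&chunks).take(3).collect();
    println!("{:?}", newest); // prints [9, 8, 7]
}
```

Because the iterator is lazy, no sort and no full-file buffer is needed to answer the query.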

Implementation

The work spans two layers: the scan orchestration layer (vortex-layout) and the array encoding layer (vortex-array).

Scan layer (vortex-layout)

ScanBuilder gains a with_reversed(bool) builder method. When set:

  • RepeatedScan::execute collects the chunk ranges and iterates them in reverse order (last chunk first). This is the global reversal — chunk order is flipped for free by reversing a Vec of ranges.
  • The map_fn closure wraps the user-supplied function to call array.reverse() on each chunk before passing it downstream. This is the per-chunk reversal — row order within each chunk is flipped.

Reversed scans are always ordered (they produce a strict global sequence), so ordered = true is implied.
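
The two reversals can be sketched roughly as follows. This is a simplified model, not the code in this PR: `execute` here is a hypothetical free function over a `&[i64]` slice, and the chunk type is `Vec<i64>` rather than a Vortex array. It only illustrates how the range list is flipped and how the user-supplied `map_fn` is wrapped.

```rust
use std::ops::Range;

/// Hypothetical stand-in for the scan driver: slices `rows` by the given
/// chunk ranges and applies `map_fn` to each chunk, optionally reversed.
fn execute<T>(
    rows: &[i64],
    mut ranges: Vec<Range<usize>>,
    reversed: bool,
    map_fn: impl Fn(Vec<i64>) -> T,
) -> Vec<T> {
    if reversed {
        // Global reversal: flip the order in which chunks are visited.
        ranges.reverse();
    }
    // Per-chunk reversal: wrap the user function so it sees reversed rows.
    let wrapped = |mut chunk: Vec<i64>| {
        if reversed {
            chunk.reverse();
        }
        map_fn(chunk)
    };
    ranges
        .into_iter()
        .map(|r| wrapped(rows[r].to_vec()))
        .collect()
}

fn main() {
    let rows: Vec<i64> = (0..10).collect();
    let ranges = vec![0..4, 4..8, 8..10];
    let out = execute(&rows, ranges, true, |chunk| chunk);
    println!("{:?}", out); // prints [[9, 8], [7, 6, 5, 4], [3, 2, 1, 0]]
}
```

Concatenating the output chunks gives the rows in strictly descending order, which is why a reversed scan is necessarily an ordered one.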

Array layer (vortex-array) — ReversedArray

ReversedArray is a new lazy wrapper encoding. It is constructed by ArrayRef::reverse() and immediately runs through the optimizer. The optimizer fires structural reduce rules at construction time, before any data is read:

Reduce rules:

| Pattern | Result | Cost |
|---|---|---|
| Reversed(Reversed(x)) | x | Zero (both wrappers cancelled) |
| Reversed(Dict(codes, values)) | Dict(Reversed(codes), values) | Reverse only the codes array; values dictionary reused |
| Reversed(Chunked([c₀, c₁, …, cₙ])) | Chunked([reverse(cₙ), …, reverse(c₁), reverse(c₀)]) | Chunk order flipped; each chunk wrapped in Reversed and re-optimized recursively |
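
As an illustration of how these rules compose, the sketch below models them on a toy `Array` enum. The enum and the free function `reverse` are assumptions made for this example and do not mirror the actual vortex-array types; the point is only that all three rewrites are structural and touch no row data.

```rust
/// Toy array tree; a stand-in for Vortex encodings, not the real types.
#[derive(Debug, Clone, PartialEq)]
enum Array {
    Primitive(Vec<i64>),
    Dict { codes: Box<Array>, values: Vec<String> },
    Chunked(Vec<Array>),
    Reversed(Box<Array>),
}

/// Builds a lazily reversed array, applying the reduce rules eagerly.
fn reverse(array: Array) -> Array {
    match array {
        // Reversed(Reversed(x)) => x
        Array::Reversed(inner) => *inner,
        // Reversed(Dict(codes, values)) => Dict(Reversed(codes), values)
        Array::Dict { codes, values } => Array::Dict {
            codes: Box::new(reverse(*codes)),
            values,
        },
        // Reversed(Chunked([c0..cn])) => Chunked([reverse(cn)..reverse(c0)])
        Array::Chunked(chunks) => {
            Array::Chunked(chunks.into_iter().rev().map(reverse).collect())
        }
        // No rule applies: keep a lazy wrapper for an execute kernel.
        other => Array::Reversed(Box::new(other)),
    }
}

fn main() {
    let chunked = Array::Chunked(vec![
        Array::Primitive(vec![1, 2]),
        Array::Primitive(vec![3]),
    ]);
    let once = reverse(chunked.clone());
    println!("{:?}", once);
    // Reversing twice cancels out structurally.
    assert_eq!(reverse(once), chunked);
}
```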

The Dict rule is the most important one. Reversing a Dict means reversing only the codes, not the values.
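
A small sketch of why this is valid, using a hypothetical `Dict` struct with an `Arc`-shared values vector (the struct and its methods are assumptions for this example, not the Vortex `DictArray` API): decoding the dictionary with reversed codes gives the same rows as decoding first and reversing afterwards, and the values allocation is shared rather than copied.

```rust
use std::sync::Arc;

/// Illustrative dictionary encoding: each row is an index into `values`.
struct Dict {
    codes: Vec<u32>,
    values: Arc<Vec<String>>,
}

impl Dict {
    /// Materializes the logical rows by looking up every code.
    fn decode(&self) -> Vec<String> {
        self.codes
            .iter()
            .map(|&c| self.values[c as usize].clone())
            .collect()
    }

    /// Reverses row order by reversing codes only; values are shared.
    fn reversed(&self) -> Dict {
        Dict {
            codes: self.codes.iter().rev().copied().collect(),
            values: Arc::clone(&self.values),
        }
    }
}

fn main() {
    let dict = Dict {
        codes: vec![0, 0, 1, 2, 1],
        values: Arc::new(vec!["info".into(), "warn".into(), "error".into()]),
    };
    let rev = dict.reversed();

    let mut expected = dict.decode();
    expected.reverse();
    assert_eq!(rev.decode(), expected);
    // Same allocation: the dictionary itself was never touched.
    assert!(Arc::ptr_eq(&dict.values, &rev.values));
}
```

The cost is proportional to the number of codes, which are typically narrow integers, regardless of how wide the dictionary values are.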

Execute kernels:

| Canonical type | Path |
|---|---|
| Primitive | Iterates the typed buffer backwards: O(n), sequential, auto-vectorizable |
| Bool | Reads bits in reverse via BitBuffer::value_unchecked: O(n), no intermediate allocation |
| Struct | Calls field.reverse() on each child; per-field optimizer rules still fire |
| All others | Falls back to take(reversed_indices) |
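
For intuition, the Primitive, Bool, and fallback paths can be approximated on plain buffers as below. These helper functions are hypothetical and operate on slices and a packed `u8` bitmap (assumed LSB-first); they are not the kernels in `execute.rs` and do not use `BitBuffer`.

```rust
/// Primitive path: walk the typed buffer back to front.
fn reverse_primitive<T: Copy>(values: &[T]) -> Vec<T> {
    values.iter().rev().copied().collect()
}

/// Bool path: read bit `len - 1 - i` and write it to position `i`.
/// Bits are assumed to be packed LSB-first within each byte.
fn reverse_bits(bits: &[u8], len: usize) -> Vec<u8> {
    let mut out = vec![0u8; (len + 7) / 8];
    for i in 0..len {
        let src = len - 1 - i;
        if (bits[src / 8] >> (src % 8)) & 1 == 1 {
            out[i / 8] |= 1 << (i % 8);
        }
    }
    out
}

/// Fallback path: gather rows through an explicit reversed index list.
fn reverse_via_take<T: Clone>(values: &[T]) -> Vec<T> {
    let reversed_indices = (0..values.len()).rev();
    reversed_indices.map(|i| values[i].clone()).collect()
}

fn main() {
    assert_eq!(reverse_primitive(&[1.5f64, 2.5, 3.5]), vec![3.5, 2.5, 1.5]);
    // Three logical bits [1, 1, 0] become [0, 1, 1].
    assert_eq!(reverse_bits(&[0b0000_0011], 3), vec![0b0000_0110]);
    assert_eq!(reverse_via_take(&["a", "b", "c"]), vec!["c", "b", "a"]);
}
```

The fallback is the most general but also the most expensive path, since it materializes an index array and performs a gather.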

API Changes

New surface in vortex-array:

  • ArrayRef::reverse() -> VortexResult<ArrayRef> — reverse any array lazily
  • Reversed / ReversedArray — the new encoding type (public, can be pattern-matched)
  • ReverseReduce trait + ReverseReduceAdaptor struct — extension point for custom encodings

New surface in vortex-layout:

  • ScanBuilder::with_reversed(bool) -> Self
  • ScanBuilder::reversed() -> bool
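
The shape of the new builder surface can be illustrated with a mock. `MockScanBuilder` below is a hypothetical stand-in written for this example: only the method names `with_reversed` and `reversed` come from this PR, while the fields, `with_ordered`, and `ordered` are assumptions used to show how a reversed scan implies an ordered one.

```rust
/// Hypothetical stand-in for the real ScanBuilder; not the Vortex type.
#[derive(Debug, Default, Clone)]
struct MockScanBuilder {
    ordered: bool,
    reversed: bool,
}

impl MockScanBuilder {
    /// Assumed setter for ordered output (not taken from this PR).
    fn with_ordered(mut self, ordered: bool) -> Self {
        self.ordered = ordered;
        self
    }

    /// Mirrors `with_reversed(bool) -> Self` from the PR description.
    fn with_reversed(mut self, reversed: bool) -> Self {
        self.reversed = reversed;
        self
    }

    /// Mirrors `reversed() -> bool` from the PR description.
    fn reversed(&self) -> bool {
        self.reversed
    }

    /// A reversed scan is always ordered, whatever the explicit flag says.
    fn ordered(&self) -> bool {
        self.ordered || self.reversed
    }
}

fn main() {
    let scan = MockScanBuilder::default()
        .with_ordered(false)
        .with_reversed(true);
    assert!(scan.reversed());
    assert!(scan.ordered());
}
```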

No breaking changes. All changes are additive.

Testing

vortex-array/src/arrays/reversed/tests.rs covers 13 cases for PrimitiveArray, BoolArray, DictArray, StructArray, and ChunkedArray.

ch-sc added 3 commits May 8, 2026 13:35
Signed-off-by: Christoph Schulze <[email protected]>
Signed-off-by: Christoph Schulze <[email protected]>
…on.io>

I, Christoph Schulze <[email protected]>, hereby add my Signed-off-by to this commit: 0e64d5e
I, Christoph Schulze <[email protected]>, hereby add my Signed-off-by to this commit: 96a951e

Signed-off-by: Christoph Schulze <[email protected]>
@ch-sc ch-sc changed the title from "X 2695/support reverse order scans" to "[X-2695] Support Reverse Order Scans" May 8, 2026
@ch-sc ch-sc mentioned this pull request May 8, 2026
ch-sc added 3 commits May 15, 2026 11:01
Signed-off-by: Christoph Schulze <[email protected]>
…ort-reverse-order-scans

Signed-off-by: Christoph Schulze <[email protected]>
Signed-off-by: Christoph Schulze <[email protected]>
Comment thread vortex-array/src/arrays/reversed/array.rs
Comment thread vortex-array/src/arrays/reversed/execute.rs
Comment thread Cargo.toml Outdated

@wyatt-herkamp wyatt-herkamp left a comment

LGTM.

ch-sc added 2 commits May 15, 2026 18:59
Signed-off-by: Christoph Schulze <[email protected]>
Signed-off-by: Christoph Schulze <[email protected]>
@ch-sc ch-sc merged commit d46ef72 into 0.65.0-atlas May 15, 2026
19 of 47 checks passed