VortexSplit

Auto-segment PreFLMR's query() into profiled, exportable components and run retrieval through the split model.

Requirements

  • uv for dependency management
  • NVIDIA GPU + CUDA 11.8
  • Graphviz (dot on your PATH)
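
A quick preflight check for the tools listed above might look like the sketch below. It only checks that the executables are on your PATH. `missing_tools` is a hypothetical helper, not part of this repo, and treating `nvidia-smi` as a sign that an NVIDIA driver is present is an assumption; it does not verify the CUDA version.

```python
import shutil

# Executables implied by the requirements list.
# Assumption: nvidia-smi is used only as a rough proxy for an installed NVIDIA driver.
REQUIRED_TOOLS = ("uv", "dot", "nvidia-smi")


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the subset of `tools` that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing from PATH: " + ", ".join(missing))
    else:
        print("All required tools found on PATH.")
```

The `which` parameter exists only so the lookup can be swapped out when testing the helper without the real tools installed.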

Install

uv sync
uv run python main.py --help

Data

Retrieval needs the EVQA (M2KR) text, passages, and query images. Fetch them with:

uv run python fetch_datasets.py

A prebuilt ColBERT index is expected under /data/EVQA/index (see the paths in main.py: INDEX_ROOT, EXPERIMENT, INDEX_NAME).

Workflow

HF_HUB_OFFLINE=1 uv run python main.py generate --batch 16 --out /dev/shm/flmr_split.tspart --coarse
HF_HUB_OFFLINE=1 uv run python main.py demo --artifact /dev/shm/flmr_split.tspart --batch 16
HF_HUB_OFFLINE=1 uv run python main.py draw --artifact /dev/shm/flmr_split.tspart --out flow.svg
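
If you would rather drive the three steps from a script, a minimal sketch could look like the following. The subcommands and flags are copied from the commands above; the function names `workflow_commands` and `run_workflow` are hypothetical and not part of this repo.

```python
import os
import subprocess


def workflow_commands(artifact="/dev/shm/flmr_split.tspart", batch=16, svg="flow.svg"):
    """Build the generate, demo, and draw invocations as argument lists."""
    base = ["uv", "run", "python", "main.py"]
    return [
        base + ["generate", "--batch", str(batch), "--out", artifact, "--coarse"],
        base + ["demo", "--artifact", artifact, "--batch", str(batch)],
        base + ["draw", "--artifact", artifact, "--out", svg],
    ]


def run_workflow(**kwargs):
    """Run each step in order, stopping at the first one that fails."""
    # Same environment variable the shell commands set.
    env = dict(os.environ, HF_HUB_OFFLINE="1")
    for cmd in workflow_commands(**kwargs):
        subprocess.run(cmd, env=env, check=True)
```

Calling `run_workflow()` runs all three steps with the defaults shown; `check=True` makes a failing step raise instead of silently continuing to the next one.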

Tests

uv run pytest
uv run pytest -m slow

Example

Identical results between monolith and partitioned

[Figure: baseline]

[Figure: partitioned]
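
One way to check the "identical results" claim programmatically is to compare the ranked passage IDs from both runs, query by query. The sketch below assumes each run can be reduced to a mapping from query ID to an ordered list of passage IDs; that format, the helper name `rankings_match`, and the sample data are all hypothetical and are not taken from this repo's output.

```python
def rankings_match(baseline, partitioned):
    """Return True if both runs rank the same passages in the same order for every query."""
    if baseline.keys() != partitioned.keys():
        return False
    return all(list(baseline[q]) == list(partitioned[q]) for q in baseline)


# Made-up rankings, for illustration only.
baseline = {"q0": ["p12", "p7", "p3"], "q1": ["p5", "p9", "p1"]}
partitioned = {"q0": ["p12", "p7", "p3"], "q1": ["p5", "p9", "p1"]}

print(rankings_match(baseline, partitioned))
```

This compares rankings exactly. If you also want to compare scores, a tolerance-based check is usually more appropriate than strict equality for floating-point values.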