PRX pixel pipeline by DavidBert · Pull Request #6 · Photoroom/diffusers

DavidBert · 2026-06-04T12:19:23Z

PRX-Pixel in 🧨 diffusers

Quick note for running a trained PRX-Pixel checkpoint (7B, pixel-space RGB, no VAE, Qwen3-VL
text tower) through PRXPixelPipeline. Three steps: convert → load → predict.

Checkpoints

The research checkpoints live on the other cluster (point --checkpoint_path at one of these):

Model	Path
Base model (SFT)	`/mnt/data/users/davidb/checkpoints/PRX7B-ckpt/SFT`
RLHF (FDFO)	`/mnt/data/users/davidb/checkpoints/PRX7B-ckpt/FDFO_forensic_omniaid`

1. Setup

PRX-Pixel needs the Qwen3-VL text tower, which requires transformers >= 4.57. Pin it below 5, since 5.x breaks torchvision.

# from the diffusers repo root
uv venv --python 3.12 --system-site-packages .venv_prxpixel
uv pip install --python .venv_prxpixel/bin/python "transformers>=4.57,<5" accelerate
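To confirm that the environment picked up a compatible release, a small check along these lines may help. It is only a sketch: the helper name `transformers_version_ok` is hypothetical, and it compares just the major and minor components, so pre-release suffixes are ignored.

```python
import re
from importlib import metadata


def transformers_version_ok(version: str) -> bool:
    """Return True if `version` falls in >=4.57,<5 (major.minor only)."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        return False
    major, minor = int(match.group(1)), int(match.group(2))
    # Only the 4.x series from 4.57 upward satisfies the pin.
    return major == 4 and minor >= 57


if __name__ == "__main__":
    try:
        installed = metadata.version("transformers")
    except metadata.PackageNotFoundError:
        print("transformers is not installed in this environment")
    else:
        status = "OK" if transformers_version_ok(installed) else "outside >=4.57,<5"
        print(f"transformers {installed}: {status}")
```

Run it with the venv interpreter (.venv_prxpixel/bin/python) so that it inspects the same environment the conversion script will use.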

2. Convert the checkpoint

The conversion script reads the research checkpoint (either a DCP directory of *.distcp shards or a single .pt file) and writes a diffusers folder.
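As a pre-flight check, the two accepted input layouts can be told apart with a few lines of Python. This is a hedged sketch, not the logic of the conversion script itself: the helper name `detect_checkpoint_format` and its return values are hypothetical.

```python
from pathlib import Path


def detect_checkpoint_format(checkpoint_path: str) -> str:
    """Return "dcp" for a directory holding *.distcp shards, "pt" for a .pt file."""
    path = Path(checkpoint_path)
    # A DCP checkpoint is a directory containing one or more *.distcp files.
    if path.is_dir() and any(path.glob("*.distcp")):
        return "dcp"
    # Otherwise expect a single serialized .pt file.
    if path.is_file() and path.suffix == ".pt":
        return "pt"
    raise ValueError(f"Unrecognized checkpoint layout: {path}")
```

Calling it on the path passed to --checkpoint_path fails fast on a mistyped or empty directory, before any weights are loaded.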

CUDA_VISIBLE_DEVICES=5 uv run --no-project --python .venv_prxpixel/bin/python \
  scripts/convert_prx_to_diffusers.py \
  --checkpoint_path /path/to/ep0-ba400 \
  --output_path     checkpoints_prx/prxpixel-diffusers \
  --variant         pixel \
  --resolution      1024

Look for `✓ All parameters loaded successfully (0 missing, 0 unexpected)!` in the output. The script also downloads the
Qwen3-VL text encoder + tokenizer into the output folder.
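Beyond the log line, the output folder can be inspected directly. Diffusers pipelines keep a `model_index.json` at the folder root, with metadata keys prefixed by an underscore and one entry per component. The sketch below only lists what that file declares; the helper name `list_pipeline_components` is hypothetical, and the exact component names written for PRX-Pixel are not assumed here.

```python
import json
from pathlib import Path


def list_pipeline_components(folder: str) -> list[str]:
    """Return the component names declared in a diffusers folder's model_index.json."""
    index_file = Path(folder) / "model_index.json"
    if not index_file.is_file():
        raise FileNotFoundError(f"{index_file} is missing; conversion may have failed")
    index = json.loads(index_file.read_text())
    # Keys starting with "_" (such as "_class_name") are metadata, not components.
    return sorted(name for name in index if not name.startswith("_"))
```

For example, `print(list_pipeline_components("checkpoints_prx/prxpixel-diffusers"))` should show the text encoder and tokenizer entries alongside the model and scheduler if the conversion completed.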

3. Load + predict

import numpy as np
import torch
from PIL import Image
from diffusers import PRXPixelPipeline

pipe = PRXPixelPipeline.from_pretrained("checkpoints_prx/prxpixel-diffusers", torch_dtype=torch.bfloat16).to("cuda:0")

out = pipe(
    "A polished brass weathervane shaped like a rooster against a deep blue sky",
    height=1024, width=1024,
    num_inference_steps=50,
    guidance_scale=1.0,                          # CFG 1 = no guidance; works great here
    output_type="pt",                            # pixel-space, no VAE -> get the raw tensor
    generator=torch.Generator("cuda:0").manual_seed(0),
).images                                         # tensor in [-1, 1]

img = (out.float().clamp(-1, 1) + 1) / 2         # -> [0, 1]
arr = (img[0].permute(1, 2, 0).cpu().numpy() * 255).round().astype(np.uint8)
Image.fromarray(arr).save("prxpixel.png")
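The "CFG 1 = no guidance" comment in the call above follows from the usual classifier-free guidance combination, in which the unconditional prediction is pushed toward the conditional one by a factor of guidance_scale. The sketch below assumes PRXPixelPipeline uses that common convention, which this note does not confirm; `apply_cfg` is a hypothetical helper that works on plain lists rather than tensors.

```python
def apply_cfg(cond: list[float], uncond: list[float], guidance_scale: float) -> list[float]:
    """Combine conditional and unconditional predictions element-wise."""
    # Standard form: uncond + scale * (cond - uncond).
    return [u + guidance_scale * (c - u) for c, u in zip(cond, uncond)]


cond = [0.25, -0.5, 1.0]
uncond = [0.0, 0.5, -1.0]

# With scale 1 the unconditional branch cancels out entirely.
no_guidance = apply_cfg(cond, uncond, 1.0)
# With scale 2 the result is extrapolated past the conditional prediction.
stronger = apply_cfg(cond, uncond, 2.0)
```

Under that assumption, a scale of 1 returns the conditional prediction unchanged, so the unconditional prediction has no effect on the result.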

The pipeline already handles the PRX-Pixel specifics (x0-prediction, noise_scale=2, 256-token
budget, full-res RGB). Good defaults: 50 steps, CFG 1, scheduler shift ≈ 3.
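To give a feel for what a scheduler shift of about 3 does, the sketch below applies the static shift formula used by flow-matching schedulers such as diffusers' FlowMatchEulerDiscreteScheduler, sigma' = shift * sigma / (1 + (shift - 1) * sigma). Whether PRXPixelPipeline uses exactly this scheduler is an assumption here, the function names are hypothetical, and the linear sigma grid is only for illustration.

```python
def shift_sigma(sigma: float, shift: float = 3.0) -> float:
    """Apply the static flow-matching time shift to one noise level in [0, 1]."""
    return shift * sigma / (1.0 + (shift - 1.0) * sigma)


def shifted_schedule(num_steps: int, shift: float = 3.0) -> list[float]:
    """Shift an evenly spaced grid of noise levels running from 1 down to 1/num_steps."""
    sigmas = [1.0 - i / num_steps for i in range(num_steps)]
    return [shift_sigma(s, shift) for s in sigmas]


# A mid-schedule noise level of 0.5 moves up to 0.75 with shift=3,
# so more of the 50 steps are spent at high noise.
midpoint = shift_sigma(0.5, 3.0)
schedule = shifted_schedule(50, 3.0)
```

The endpoints stay fixed (0 maps to 0 and 1 maps to 1), and a shift of 1 leaves the schedule unchanged.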

Run scripts with .venv_prxpixel/bin/python <script>, or with uv run --no-project --python .venv_prxpixel/bin/python <script>. The --no-project flag is needed because this repo has no [project] table in its pyproject.toml.

PRX pixel pipeline

35ba10e

github-actions Bot added pipelines models utils labels Jun 4, 2026

DavidBert wants to merge 1 commit into main from prx-pixel-pipeline
