Yixiong Yang1,*, Tao Wu2,*, Senmao Li3, Shiqi Yang3,†,
Yaxing Wang3, Joost van de Weijer2, Kai Wang4,5,2,✉
1Harbin Institute of Technology (Shenzhen), China
2Computer Vision Center, Universitat Autònoma de Barcelona, Spain
3VCIP, CS, Nankai University, China
4City University of Hong Kong (Dongguan), China
5City University of Hong Kong, HK SAR, China
*Equal contribution. †Visiting researcher at Nankai University. ✉Corresponding author.
Official implementation for OPAD.
Overview of OPAD. The student and teacher jointly learn the new concept with a shared text encoder. The teacher learns from real images (green), and the text encoder is updated accordingly. The student is optimized with two objectives (gold): an adversarial loss to match the real data distribution, and alignment losses to match the denoised outputs of the teacher. The discriminators are trained to distinguish between the student's outputs and real images.
conda create -n opad python=3.10 -y
conda activate opad
pip install torch==2.1.0 torchvision==0.16.0 torchaudio==2.1.0 --index-url https://download.pytorch.org/whl/cu121
pip install diffusers==0.26.0 huggingface-hub==0.25.2 transformers==4.28.0
pip install peft==0.7.0 lpips==0.1.4 wandb==0.19.8 accelerate==1.5.2 safetensors==0.5.3 timm==1.0.11 einops==0.8.1
pip install matplotlib scipy scikit-learn pandas opencv-python==4.11.0.86 numpy==1.24.4
pip install git+https://github.com/openai/CLIP.git
pip install git+https://github.com/tencent-ailab/IP-Adapter.git
OPAD uses Weights & Biases for training logs. The default student model is stabilityai/sd-turbo. The teacher model is currently set to sd2-community/stable-diffusion-2-1; Stable Diffusion 2.1 model paths on Hugging Face may change, so please update --teacher_pretrained_model_name_or_path if needed.
Download the IP-Adapter weights and pass the image encoder and checkpoint paths to the training command. The examples below use /path/to/IP-Adapter; replace it with your local IP-Adapter folder.
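One way to fetch only the two required artifacts is sketched below. It assumes the weights are hosted in the `h94/IP-Adapter` repository on Hugging Face and that the repository keeps the `models/` layout used in the example commands; both are assumptions, so check them against the IP-Adapter project page. The function names are hypothetical and not part of OPAD.

```python
from pathlib import Path
from typing import Dict

# Assumption: the IP-Adapter weights are hosted in this Hugging Face repo.
IP_ADAPTER_REPO = "h94/IP-Adapter"


def ip_adapter_paths(root: str) -> Dict[str, str]:
    """Build the two paths that the training command expects under `root`."""
    models = Path(root) / "models"
    return {
        "image_encoder": str(models / "image_encoder"),
        "ckpt": str(models / "ip-adapter_sd15.bin"),
    }


def download_ip_adapter(root: str) -> Dict[str, str]:
    """Fetch only the image encoder and the SD 1.5 adapter checkpoint."""
    # Imported lazily so the path helper works without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    snapshot_download(
        repo_id=IP_ADAPTER_REPO,
        local_dir=root,
        allow_patterns=["models/image_encoder/*", "models/ip-adapter_sd15.bin"],
    )
    return ip_adapter_paths(root)
```

Calling `download_ip_adapter("/path/to/IP-Adapter")` returns the values to pass as `--ip_adapter_image_encoder_path` and `--ip_adapter_ckpt`.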
Model downloads are cached by Hugging Face. To keep cache files outside the default home directory, set HF_HUB_CACHE, TRANSFORMERS_CACHE, and TORCH_HOME before running training or inference.
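A minimal sketch of setting these three variables from Python instead of the shell. The helper name and the subfolder names are hypothetical choices for illustration. Call it before importing `torch`, `transformers`, or `diffusers`, since some of these variables are read when the libraries are first imported.

```python
import os
from pathlib import Path
from typing import Dict


def configure_cache(root: str) -> Dict[str, str]:
    """Point the Hugging Face and PyTorch caches at subfolders of `root`."""
    base = Path(root).expanduser()
    # Subfolder names are arbitrary choices for this sketch.
    targets = {
        "HF_HUB_CACHE": base / "huggingface" / "hub",
        "TRANSFORMERS_CACHE": base / "huggingface" / "transformers",
        "TORCH_HOME": base / "torch",
    }
    resolved = {}
    for name, path in targets.items():
        # Create the folder up front so a bad path fails early.
        path.mkdir(parents=True, exist_ok=True)
        os.environ[name] = str(path)
        resolved[name] = str(path)
    return resolved
```

For example, `configure_cache("/data/opad_cache")` at the top of a launcher script keeps all downloads under `/data/opad_cache` (a placeholder path).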
Download the DreamBooth dataset and keep its original folder layout:
git clone https://huggingface.co/datasets/google/dreambooth ../dreambooth
For the dog example, images should be available at:
../dreambooth/dataset/dog/*.jpg
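Before launching training, it can help to confirm that the clone produced the expected layout. The helper below is a hypothetical sketch, not part of the OPAD code; it only checks that the instance folder exists and contains `.jpg` files.

```python
from pathlib import Path
from typing import List


def list_instance_images(dataset_root: str, instance: str) -> List[Path]:
    """Return the sorted .jpg files for one DreamBooth instance folder."""
    instance_dir = Path(dataset_root) / instance
    if not instance_dir.is_dir():
        raise FileNotFoundError(f"Missing instance folder: {instance_dir}")
    images = sorted(instance_dir.glob("*.jpg"))
    if not images:
        raise FileNotFoundError(f"No .jpg images found in {instance_dir}")
    return images
```

For the dog example, `list_instance_images("../dreambooth/dataset", "dog")` should return a non-empty list; a `FileNotFoundError` usually means the dataset was cloned to a different location.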
Edit the IP-Adapter paths in train.sh, then run the dog example with:
bash train.sh
Equivalent command:
python train_opad.py \
--instance_data_dir=../dreambooth/dataset/dog \
--output_dir=outputs/opad_dog/dog \
--instance_prompt="<new1> dog" \
--modifier_token="<new1>" \
--initializer_token=corgi \
--validation_prompt="a <new1> dog in the jungle" \
--ip_adapter_image_encoder_path=/path/to/IP-Adapter/models/image_encoder \
--ip_adapter_ckpt=/path/to/IP-Adapter/models/ip-adapter_sd15.bin
Generate personalized images from the trained checkpoint:
python inference_opad.py \
--model_path outputs/opad_dog/dog \
--output_path outputs/opad_dog/dog/inference/grid.png
You can also provide prompts directly:
python inference_opad.py \
--model_path outputs/opad_dog/dog \
--output_path outputs/opad_dog/dog/inference/custom.png \
--prompt "a <new1> dog in the jungle" "a <new1> dog wearing a red hat"
Run all DreamBooth instances listed in run_dreambooth.py:
python run_dreambooth.py \
--output_folder outputs/opad_dreambooth \
--train_file train_opad.py \
--data_dir ../dreambooth/dataset \
--ip_adapter_image_encoder_path=/path/to/IP-Adapter/models/image_encoder \
--ip_adapter_ckpt=/path/to/IP-Adapter/models/ip-adapter_sd15.bin
eval_dreambooth.py is kept as an optional utility for DreamBooth prompt generation and quantitative metrics:
python eval_dreambooth.py \
--path outputs/opad_dog \
--instances dog \
--dreambooth_path ../dreambooth/dataset \
--outdir benchmarks
The default DreamBooth metrics are clip-t, clip-i, and dino. The metric implementation follows the common evaluation protocol used by prior work and is partly inspired by t2v-metrics, but the current default evaluation does not require installing t2v-metrics.
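In the evaluation protocol common to DreamBooth-style work, all three metrics are cosine similarities between embeddings: clip-t compares generated images with the prompt text in CLIP space, while clip-i and dino compare generated images with the real instance images in CLIP and DINO feature space. The sketch below shows only the final averaging step on precomputed embeddings. The function names are hypothetical, and this is an illustration of the general idea, not the code in eval_dreambooth.py.

```python
import math
from typing import Sequence

Vector = Sequence[float]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine of the angle between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    if norm == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / norm


def mean_pairwise_similarity(
    generated: Sequence[Vector], references: Sequence[Vector]
) -> float:
    """Average cosine similarity over every (generated, reference) pair."""
    scores = [cosine_similarity(g, r) for g in generated for r in references]
    if not scores:
        raise ValueError("need at least one generated and one reference embedding")
    return sum(scores) / len(scores)
```

For clip-i and dino, `references` would hold the embeddings of the real instance images; for clip-t, it would hold the single text embedding of the prompt that produced the generated images.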
@inproceedings{yang2026adversarial,
title = {Adversarial Concept Distillation for One-Step Diffusion Personalization},
author = {Yixiong Yang and Tao Wu and Senmao Li and Shiqi Yang and Yaxing Wang and Joost van de Weijer and Kai Wang},
booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Findings},
year = {2026}
}
Licensed under the Creative Commons Attribution-NonCommercial 4.0 International license, for non-commercial use only. Any commercial use requires formal permission first.
This codebase builds on and refers to Custom Diffusion and TextBoost. We thank the authors for sharing their code.

