GitHub - HiThink-Research/SCMAPR: Prompt optimization for T2V

SCMAPR: Self-Correcting Multi-Agent Prompt Refinement for Complex-Scenario Text-to-Video Generation

Chengyi Yang¹,², Pengzhen Li¹, Jiayin Qi³, Aimin Zhou², Ji Wu⁴, Ji Liu¹,†

¹ HiThink Research ² East China Normal University ³ Guangzhou University ⁴ Tsinghua University

† Corresponding author: jiliuwork@gmail.com

### Abstract

Text-to-Video (T2V) generation has benefited from recent advances in diffusion models, yet current systems still struggle under complex scenarios, a difficulty generally exacerbated by ambiguous and underspecified text prompts. In this work, we formulate complex-scenario prompt refinement as a stage-wise multi-agent refinement process and propose SCMAPR, a scenario-aware and Self-Correcting Multi-Agent Prompt Refinement framework for T2V prompting. SCMAPR coordinates specialized agents to (i) route each prompt to a taxonomy-grounded scenario for strategy selection, (ii) synthesize scenario-aware rewriting policies and perform policy-conditioned refinement, and (iii) conduct structured semantic verification that triggers conditional revision when violations are detected. To clarify what constitutes a complex scenario in T2V prompting, provide representative examples, and enable rigorous evaluation under such challenging conditions, we further introduce T2V-Complexity, a T2V benchmark consisting exclusively of complex-scenario prompts. Extensive experiments on three existing benchmarks and our T2V-Complexity benchmark demonstrate that SCMAPR consistently improves text-video alignment and overall generation quality under complex scenarios, achieving gains of up to 2.67% and 3.28 in average score on VBench and EvalCrafter, respectively, and an improvement of up to 0.028 on T2V-CompBench over three state-of-the-art baselines.

Framework

Self-Correcting Multi-Agent Prompt Refinement Framework (SCMAPR)

SCMAPR organizes prompt refinement as a stage-wise multi-agent collaboration involving six specialized agents. The framework proceeds through five functional stages: (I) Scenario Routing, where a Scenario Router assigns a scenario tag to the input prompt; (II) Policy Synthesis, where a Policy Generator produces a scenario-conditioned rewriting policy; (III) Policy-Conditioned Refinement, where a Prompt Refiner rewrites the prompt under that policy; (IV) Semantic Verification, where an Atomizer and a Validator jointly verify semantic fidelity through atomic extraction and entailment judgment; and (V) Conditional Revision, where verification feedback triggers targeted revision when needed, enabling self-correcting refinement.
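
The control flow of the five stages can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function `run_pipeline`, the `RefinementTrace` record, the `max_revisions` cap, and the toy agents are all hypothetical and are not taken from the repository, where each agent is backed by a language model.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class RefinementTrace:
    """Intermediate artifacts of one pass through the five stages (hypothetical record)."""
    user_prompt: str
    scenario: str = ""
    policy: str = ""
    refined: str = ""
    violations: List[str] = field(default_factory=list)
    revisions: int = 0


def run_pipeline(
    user_prompt: str,
    route: Callable[[str], str],
    make_policy: Callable[[str, str], str],
    refine: Callable[[str, str], str],
    verify: Callable[[str, str], List[str]],
    revise: Callable[[str, str, List[str]], str],
    max_revisions: int = 2,  # assumed cap; the README does not state one
) -> RefinementTrace:
    trace = RefinementTrace(user_prompt=user_prompt)
    trace.scenario = route(user_prompt)                        # (I) Scenario Routing
    trace.policy = make_policy(user_prompt, trace.scenario)    # (II) Policy Synthesis
    trace.refined = refine(user_prompt, trace.policy)          # (III) Refinement
    trace.violations = verify(user_prompt, trace.refined)      # (IV) Verification
    # (V) Conditional Revision: revise only while violations remain.
    while trace.violations and trace.revisions < max_revisions:
        trace.refined = revise(user_prompt, trace.refined, trace.violations)
        trace.revisions += 1
        trace.violations = verify(user_prompt, trace.refined)
    return trace


# Toy stand-ins for the agents, used only to exercise the control flow.
def toy_route(prompt: str) -> str:
    return "multi_object" if " and " in prompt else "simple"


def toy_policy(prompt: str, scenario: str) -> str:
    return f"Keep every entity explicit ({scenario})."


def toy_refine(prompt: str, policy: str) -> str:
    return "A dog runs on a beach at sunset."  # deliberately drops the cat


def toy_verify(prompt: str, refined: str) -> List[str]:
    return [w for w in ("dog", "cat") if w in prompt and w not in refined]


def toy_revise(prompt: str, refined: str, violations: List[str]) -> str:
    return refined.rstrip(".") + ", alongside a " + " and a ".join(violations) + "."


trace = run_pipeline(
    "a dog and a cat run on a beach",
    toy_route, toy_policy, toy_refine, toy_verify, toy_revise,
)
print(trace.scenario, trace.revisions, trace.refined)
```

Passing the agents in as callables keeps the orchestration independent of how each agent is implemented, which mirrors the stage-by-stage execution the repository supports.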

Illustration of the Semantic Verification Stage in SCMAPR

Given a user input and the corresponding refined prompt, semantic verification is performed in four steps. (1) Atomic Extraction decomposes the user input into atomic elements (atoms). (2) Chunking segments the refined prompt into semantically coherent evidence units. (3) Atom-Chunk Matching retrieves the most relevant evidence chunk for each atom. (4) Entailment Validation assesses the semantic relation between each atom and its evidence chunk. Through this design, omissions and contradictions in the refined prompt can be detected and then used to trigger downstream revision.
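
To make the four steps concrete, here is a deliberately simplified sketch. It swaps the model-based Atomizer and Validator for string heuristics: atoms come from splitting on commas and "and", chunks are sentences, matching uses Jaccard token overlap, and "entailment" is a token-subset test. All function names and the label strings are assumptions for illustration, and detecting contradictions would need a real entailment judge, which this toy version omits.

```python
import re
from typing import Dict, List, Set, Tuple

STOPWORDS = {"a", "an", "the", "of", "in", "on", "at", "and", "is", "are", "with"}


def tokens(text: str) -> Set[str]:
    """Lowercased content words of a text."""
    return {t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS}


def extract_atoms(user_input: str) -> List[str]:
    """Step 1 (toy): split the input on commas and the word 'and'."""
    parts = re.split(r",|\band\b", user_input)
    return [p.strip() for p in parts if p.strip()]


def chunk(refined: str) -> List[str]:
    """Step 2 (toy): one evidence chunk per sentence."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", refined) if s.strip()]


def best_chunk(atom: str, chunks: List[str]) -> Tuple[str, float]:
    """Step 3 (toy): pick the chunk with the highest Jaccard overlap."""
    if not chunks:
        return "", 0.0
    atom_tokens = tokens(atom)
    scored = []
    for c in chunks:
        chunk_tokens = tokens(c)
        union = atom_tokens | chunk_tokens
        score = len(atom_tokens & chunk_tokens) / len(union) if union else 0.0
        scored.append((score, c))
    score, winner = max(scored, key=lambda pair: pair[0])
    return winner, score


def validate(atom: str, evidence: str) -> str:
    """Step 4 (toy): 'entailed' if every atom token appears in the evidence."""
    return "entailed" if tokens(atom) <= tokens(evidence) else "missing"


def verify(user_input: str, refined: str) -> Dict[str, str]:
    chunks = chunk(refined)
    report = {}
    for atom in extract_atoms(user_input):
        evidence, _ = best_chunk(atom, chunks)
        report[atom] = validate(atom, evidence)
    return report


report = verify(
    "a red car, a blue bicycle and heavy rain",
    "A red car speeds down a wet street. A cyclist on a blue bicycle follows behind.",
)
print(report)
```

In this example the refined prompt never mentions rain, so the atom "heavy rain" is reported as missing while the other two atoms are matched to their sentences.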

Results

Comparison of Complex-Scenario Text-to-Video Generation Before and After Prompt Refinement

End-to-End case study of SCMAPR with Self-Correction

Given a user input, the framework performs scenario routing, policy generation, policy-conditioned prompt refinement, atom-level verification, and targeted revision. The Entailment Validator labels each atom-evidence pair and, when a violation is found, triggers targeted revision, producing a verified refined prompt for downstream video generation.
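
One way to picture how per-atom labels could drive targeted revision is the sketch below. The label names ("entailed", "missing", "contradicted"), the function `build_revision_request`, and the wording of the instruction are hypothetical; the actual revision prompts used by the repository are not shown in this README.

```python
from typing import Dict, Optional


def build_revision_request(refined: str, labels: Dict[str, str]) -> Optional[str]:
    """Return a revision instruction, or None when every atom is entailed."""
    missing = [atom for atom, label in labels.items() if label == "missing"]
    contradicted = [atom for atom, label in labels.items() if label == "contradicted"]
    if not missing and not contradicted:
        return None  # nothing to fix, so no revision is triggered
    lines = ["Revise the prompt below without changing verified content."]
    if missing:
        lines.append("Add the missing elements: " + "; ".join(missing) + ".")
    if contradicted:
        lines.append("Fix the contradicted elements: " + "; ".join(contradicted) + ".")
    lines.append("Prompt: " + refined)
    return "\n".join(lines)


request = build_revision_request(
    "A red car speeds down a wet street.",
    {"a red car": "entailed", "heavy rain": "missing"},
)
print(request)
```

Because the request names only the violated atoms, a reviser can leave the already verified parts of the prompt untouched.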

Installation

conda create -n SCMPR python=3.10.18
conda activate SCMPR
pip install -r requirements.txt

Run SCMAPR

Remember to add your API key to utils/config.json before running any stage.

Our code supports running the entire pipeline end to end, as well as executing each stage step by step.

Scenario Routing

VBench

python -m refinement.classifier \
  --output_dir results \
  --input_txt data/vbench_full_info.txt \
  --output_name category_vbench946.jsonl \
  --include_non_difficult

EvalCrafter

python -m refinement.classifier \
  --output_dir results \
  --input_txt data/evalcrafter700.txt \
  --output_name category_evalcrafter700.jsonl \
  --include_non_difficult

CompBench

python -m refinement.classifier \
  --output_dir results \
  --input_txt data/compbench1400.txt \
  --output_name category_compbench1400.jsonl \
  --include_non_difficult
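
After a routing run, it can help to sanity-check the JSONL output before feeding it to the next stage. The helper below is a minimal sketch that assumes only what the `.jsonl` extension implies, namely one JSON object per line; the field names inside each record are not documented here, so the code reports whatever top-level keys it finds rather than assuming any. The function names are hypothetical.

```python
import json
from collections import Counter
from pathlib import Path
from typing import Iterator


def read_jsonl(path: str) -> Iterator[dict]:
    """Yield one parsed record per non-empty line."""
    with Path(path).open(encoding="utf-8") as fh:
        for line_no, line in enumerate(fh, start=1):
            line = line.strip()
            if not line:
                continue  # tolerate blank lines
            try:
                yield json.loads(line)
            except json.JSONDecodeError as exc:
                raise ValueError(f"{path}:{line_no}: invalid JSON") from exc


def summarize(path: str) -> Counter:
    """Count how often each top-level key appears across all records."""
    keys = Counter()
    for record in read_jsonl(path):
        keys.update(record.keys())  # assumes each record is a JSON object
    return keys
```

For example, `summarize("results/category_vbench946.jsonl")` returns a counter of key names; a key that appears in fewer records than the others points to incomplete rows.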

Policy Generation

VBench

python -m refinement.policy \
  --input_jsonl results/category_vbench946.jsonl \
  --output_jsonl results/policy_vbench946.jsonl \
  --log_every 20

EvalCrafter

python -m refinement.policy \
  --input_jsonl results/category_evalcrafter700.jsonl \
  --output_jsonl results/policy_evalcrafter700.jsonl \
  --log_every 20

CompBench

python -m refinement.policy \
  --input_jsonl results/category_compbench1400.jsonl \
  --output_jsonl results/policy_compbench1400.jsonl \
  --log_every 20

T2V-Complexity

python -m refinement.policy \
  --input_jsonl benchmark/prompts.jsonl \
  --output_jsonl results/policy_t2vcomplexity1000.jsonl \
  --log_every 20

Prompt Refinement

VBench

python -m refinement.refiner \
  --input_jsonl results/policy_vbench946.jsonl \
  --output_jsonl results/refined_vbench946.jsonl \
  --log_every 20

EvalCrafter

python -m refinement.refiner \
  --input_jsonl results/policy_evalcrafter700.jsonl \
  --output_jsonl results/refined_evalcrafter700.jsonl \
  --log_every 20

CompBench

python -m refinement.refiner \
  --input_jsonl results/policy_compbench1400.jsonl \
  --output_jsonl results/refined_compbench1400.jsonl \
  --log_every 20

T2V-Complexity

python -m refinement.refiner \
  --input_jsonl results/policy_t2vcomplexity1000.jsonl \
  --output_jsonl results/refined_t2vcomplexity1000.jsonl \
  --log_every 20
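
The routing, policy, and refinement commands above follow one naming pattern per benchmark, so they can be generated rather than copied by hand. The sketch below builds the three command lines from an input file and a tag; it uses only the module names and flags shown in this README. The helper `stage_commands` is hypothetical, and it assumes the classifier writes its output to `<output_dir>/<output_name>`, as the policy commands' input paths suggest.

```python
import shlex
from typing import List


def stage_commands(input_txt: str, tag: str, results_dir: str = "results") -> List[List[str]]:
    """Build the routing, policy, and refinement commands for one benchmark."""
    category = f"category_{tag}.jsonl"
    policy = f"{results_dir}/policy_{tag}.jsonl"
    refined = f"{results_dir}/refined_{tag}.jsonl"
    return [
        ["python", "-m", "refinement.classifier",
         "--output_dir", results_dir,
         "--input_txt", input_txt,
         "--output_name", category,
         "--include_non_difficult"],
        ["python", "-m", "refinement.policy",
         "--input_jsonl", f"{results_dir}/{category}",  # assumed classifier output path
         "--output_jsonl", policy,
         "--log_every", "20"],
        ["python", "-m", "refinement.refiner",
         "--input_jsonl", policy,
         "--output_jsonl", refined,
         "--log_every", "20"],
    ]


# Print the commands for inspection instead of running them.
for cmd in stage_commands("data/vbench_full_info.txt", "vbench946"):
    print(shlex.join(cmd))
```

To execute the stages in order, each list can be passed to `subprocess.run(cmd, check=True)`, which stops at the first failing stage.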

Verification and Revision

VBench

python3 run_batch_flow.py \
    --input data/vbench_full_info.txt \
    --output_txt results/verified_vbench946.txt \
    --output_jsonl results/verified_vbench946.jsonl \
    --category_jsonl results/category_vbench946.jsonl \
    --policy_jsonl results/policy_vbench946.jsonl \
    --refined_jsonl results/refined_vbench946.jsonl \
    --resume_from verify

EvalCrafter

python3 run_batch_flow.py \
    --input data/evalcrafter700.txt \
    --output_txt results/verified_evalcrafter700.txt \
    --output_jsonl results/verified_evalcrafter700.jsonl \
    --category_jsonl results/category_evalcrafter700.jsonl \
    --policy_jsonl results/policy_evalcrafter700.jsonl \
    --refined_jsonl results/refined_evalcrafter700.jsonl \
    --resume_from verify

CompBench

python3 run_batch_flow.py \
    --input data/compbench1400.txt \
    --output_txt results/verified_compbench1400.txt \
    --output_jsonl results/verified_compbench1400.jsonl \
    --category_jsonl results/category_compbench1400.jsonl \
    --policy_jsonl results/policy_compbench1400.jsonl \
    --refined_jsonl results/refined_compbench1400.jsonl \
    --resume_from verify

T2V-Complexity

python3 run_batch_flow.py \
    --input data/t2v_complexity1000.txt \
    --output_txt results/verified_t2vcomplexity1000.txt \
    --output_jsonl results/verified_t2vcomplexity1000.jsonl \
    --category_jsonl benchmark/prompts.jsonl \
    --policy_jsonl results/policy_t2vcomplexity1000.jsonl \
    --refined_jsonl results/refined_t2vcomplexity1000.jsonl \
    --resume_from verify  # other stages listed: classifier, policy, refiner

Run the whole framework

python3 run_batch_flow.py \
    --input data/vbench_full_info.txt \
    --output_txt results/verified_vbench946.txt \
    --output_jsonl results/verified_vbench946.jsonl \
    --category_jsonl results/category_vbench946.jsonl \
    --policy_jsonl results/policy_vbench946.jsonl \
    --refined_jsonl results/refined_vbench946.jsonl \
    --resume_from None

python3 run_batch_flow.py \
    --input data/t2v_complexity1000.txt \
    --output_txt results/verified_t2vcomplexity1000.txt \
    --output_jsonl results/verified_t2vcomplexity1000.jsonl \
    --category_jsonl benchmark/prompts.jsonl \
    --policy_jsonl results/policy_t2vcomplexity1000.jsonl \
    --refined_jsonl results/refined_t2vcomplexity1000.jsonl \
    --resume_from None
