Improve model card: Add text-generation pipeline tag, transformers library, links, and sample usage for RAPO++ Prompt Rewriter

#1
by nielsr HF Staff - opened
Files changed (1)
  1. README.md +90 -3
README.md CHANGED
@@ -1,3 +1,90 @@
- ---
- license: apache-2.0
- ---
+ ---
+ license: apache-2.0
+ pipeline_tag: text-generation
+ library_name: transformers
+ ---
+
+ # RAPO++ Prompt Rewriter: Llama-3.1-8B-Instruct
+
+ This repository hosts the **RAPO++** prompt rewriter model, an LLM fine-tuned for prompt optimization, as described in the paper [RAPO++: Cross-Stage Prompt Optimization for Text-to-Video Generation via Data Alignment and Test-Time Scaling](https://huggingface.co/papers/2510.20206).
+
+ **RAPO++** is a three-stage framework that enhances text-to-video generation without modifying model architectures. This repository contains the LLM component (based on Llama-3.1-8B-Instruct) responsible for prompt rewriting: it expands short, unstructured user prompts into more descriptive ones that are better aligned with the training distribution, improving compositionality and multi-object fidelity in T2V generation.
+
+ - **Paper**: [RAPO++: Cross-Stage Prompt Optimization for Text-to-Video Generation via Data Alignment and Test-Time Scaling](https://huggingface.co/papers/2510.20206)
+ - **Project Page**: [https://whynothaha.github.io/RAPO_plus_github/](https://whynothaha.github.io/RAPO_plus_github/)
+ - **Code**: [https://github.com/Vchitect/RAPO](https://github.com/Vchitect/RAPO)
+
+ <p align="center">
+   <img src="https://github.com/Vchitect/RAPO/raw/main/assets/overview.png" alt="RAPO++ Overview" width="700">
+ </p>
+
+ ## Quick Start (Prompt Rewriting)
+
+ You can use this prompt rewriter for text generation with the Hugging Face `transformers` library; the repository's `config.json` identifies it as a Llama-architecture model, so no custom code is required.
+
+ ```python
+ import torch
+ from transformers import AutoTokenizer, pipeline
+
+ model_id = "bingjie/llama3_1_instruct_lora_rewrite"  # Replace with the actual model ID if different
+
+ tokenizer = AutoTokenizer.from_pretrained(model_id)
+
+ pipe = pipeline(
+     "text-generation",
+     model=model_id,
+     tokenizer=tokenizer,
+     torch_dtype=torch.float16,  # Use torch.bfloat16 if your GPU supports it
+     device_map="auto",
+     # Standard Llama models do not require trust_remote_code=True;
+     # add it only if the repository ships custom modeling code.
+ )
+
+ # Example: rewriting a user prompt
+ user_prompt_to_rewrite = "A cat playing with a ball."
+ chat_template_messages = [
+     {"role": "system", "content": "You are a prompt rewriter. Rewrite the given user prompt to be more descriptive and suitable for text-to-video generation."},
+     {"role": "user", "content": f"User prompt: {user_prompt_to_rewrite}"},
+ ]
+
+ # Apply the chat template for instruction-tuned models
+ input_text = tokenizer.apply_chat_template(
+     chat_template_messages,
+     add_generation_prompt=True,
+     tokenize=False,  # Return a string, as expected by the pipeline
+ )
+
+ print(f"Original input to pipeline: {input_text}")
+
+ # Generate the rewritten prompt
+ outputs = pipe(
+     input_text,
+     max_new_tokens=100,
+     do_sample=True,
+     temperature=0.7,
+     top_p=0.9,
+     eos_token_id=[tokenizer.eos_token_id, tokenizer.convert_tokens_to_ids("<|eot_id|>")],  # Llama 3.1 end-of-turn tokens
+ )
+
+ generated_text = outputs[0]["generated_text"]
+ # Extract the assistant's response part
+ rewritten_prompt = generated_text.split("<|start_header_id|>assistant<|end_header_id|>")[-1].strip()
+
+ print(f"\nOriginal Prompt: {user_prompt_to_rewrite}")
+ print(f"Rewritten Prompt: {rewritten_prompt}")
+ ```
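The string split above assumes Llama 3.1's chat markup. As an illustrative sketch (this helper is not part of the original repository), the extraction can be made more defensive by also stripping a trailing `<|eot_id|>` marker and falling back to the full string when no assistant header is present:

```python
def extract_assistant_reply(generated_text: str) -> str:
    """Return the text after the last assistant header in Llama 3.1 chat markup.

    Falls back to the full string if the header is absent, and removes any
    <|eot_id|> end-of-turn markers the model may have emitted.
    """
    header = "<|start_header_id|>assistant<|end_header_id|>"
    reply = generated_text.split(header)[-1]
    return reply.replace("<|eot_id|>", "").strip()

# Synthetic example of a decoded Llama 3.1 generation:
sample = (
    "<|start_header_id|>user<|end_header_id|>\n\nA cat playing with a ball.<|eot_id|>"
    "<|start_header_id|>assistant<|end_header_id|>\n\n"
    "A fluffy tabby cat bats a red yarn ball across a sunlit wooden floor.<|eot_id|>"
)
print(extract_assistant_reply(sample))
```

Because the fallback returns the input unchanged, the helper is safe to apply even if a future tokenizer version renders the chat template differently.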
+
+ ## Citation
+
+ If you find our work helpful for your research, please consider citing it:
+
+ ```bibtex
+ @article{gao2025rapopp,
+   title   = {RAPO++: Cross-Stage Prompt Optimization for Text-to-Video Generation via Data Alignment and Test-Time Scaling},
+   author  = {Gao, Bingjie and Ma, Qianli and Wu, Xiaoxue and Yang, Shuai and Lan, Guanzhou and Zhao, Haonan and Chen, Jiaxuan and Liu, Qingyang and Qiao, Yu and Chen, Xinyuan and Wang, Yaohui and Niu, Li},
+   journal = {arXiv preprint arXiv:2510.20206},
+   year    = {2025}
+ }
+ ```