KOKKKOKK committed on
Commit 400a949 · verified · 1 Parent(s): 9b5e7e1

Update README.md

base_model:
- Qwen/Qwen2.5-VL-7B-Instruct
pipeline_tag: reinforcement-learning
---

# 🧠 Ariadne

This is the official model checkpoint for the paper:
**[Ariadne: A Controllable Framework for Probing and Extending VLM Reasoning Boundaries](https://arxiv.org/abs/2511.00710)**

### 🔬 Example

```python
import torch
from transformers import AutoModelForImageTextToText, AutoProcessor

MODEL_ID = "..."  # path

# Load model and processor
model = AutoModelForImageTextToText.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.bfloat16 if torch.cuda.is_available() else torch.float32,
    device_map="auto",
    low_cpu_mem_usage=True,
)
processor = AutoProcessor.from_pretrained(MODEL_ID)

# Format question example
SYSTEM_PROMPT = "..."
img = None  # replace with a PIL.Image (or image path) for a real query

conversation = [
    {"role": "system", "content": [{"type": "text", "text": SYSTEM_PROMPT}]},
    {
        "role": "user",
        "content": [
            {"type": "image", "image": img},
            {"type": "text", "text": "..."},
        ],
    },
]

# Generate output
prompt_text = processor.apply_chat_template(
    conversation, add_generation_prompt=True, tokenize=False
)
inputs = processor(text=prompt_text, images=img, return_tensors="pt").to(model.device)
with torch.inference_mode():
    gen_out = model.generate(
        **inputs,
        max_new_tokens=256,
        do_sample=False,
        return_dict_in_generate=True,
        output_scores=False,
    )
sequences = gen_out.sequences

# Decode only the newly generated tokens, not the echoed prompt
input_len = inputs["input_ids"].shape[1]
gen_ids = sequences[0, input_len:]
resp_text = processor.tokenizer.decode(
    gen_ids, skip_special_tokens=True, clean_up_tokenization_spaces=True
).strip()
```
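The decoding step above works because `generate()` returns the full sequence (prompt tokens followed by new tokens), so the prompt is sliced off before decoding. A minimal sketch of that slicing logic with hypothetical token ids, no model required:

```python
# Toy illustration of the prompt-stripping step used above.
# generate() hands back prompt ids + newly generated ids in one sequence,
# so we drop the first input_len positions before decoding.
prompt_ids = [101, 7592, 2088]      # hypothetical prompt token ids
new_ids = [2023, 2003, 1037, 102]   # hypothetical generated token ids
sequence = prompt_ids + new_ids     # what generate() returns

input_len = len(prompt_ids)         # == inputs["input_ids"].shape[1]
gen_only = sequence[input_len:]     # only the model's continuation
assert gen_only == new_ids
```

The same indexing applies per row when generating for a batch, using each row's prompt length.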