jonasneves committed · Commit 86ad25c · verified · 1 Parent(s): 23ba854

Upload README.md with huggingface_hub
---
license: apache-2.0
base_model: LiquidAI/LFM2-350M
tags:
- gguf
- spatial-reasoning
- lora
- fine-tuned
- lfm2
- stepgame
- llama-cpp
datasets:
- ZhengyanShi/StepGame
language:
- en
pipeline_tag: text-generation
library_name: llama-cpp
model-index:
- name: LFM2-350M-StepGame
  results:
  - task:
      type: text-generation
      name: Spatial Reasoning (StepGame)
    dataset:
      name: StepGame (validation split)
      type: ZhengyanShi/StepGame
      split: validation
    metrics:
    - type: accuracy
      value: 74.4
      name: Overall Accuracy
    - type: accuracy
      value: 94.0
      name: 1-hop Accuracy
    - type: accuracy
      value: 90.0
      name: 2-hop Accuracy
    - type: accuracy
      value: 76.0
      name: 3-hop Accuracy
    - type: accuracy
      value: 54.0
      name: 4-hop Accuracy
    - type: accuracy
      value: 58.0
      name: 5-hop Accuracy
---

# LFM2-350M-StepGame (GGUF)

A fine-tuned version of [LiquidAI/LFM2-350M](https://huggingface.co/LiquidAI/LFM2-350M), trained on the [StepGame](https://huggingface.co/datasets/ZhengyanShi/StepGame) spatial reasoning benchmark and exported to GGUF. The model answers directional relationship questions (left, right, above, below, upper-left, upper-right, lower-left, lower-right) given a sequence of positional statements.

## Results

| Metric | Baseline | Fine-tuned | Delta (pp) |
|--------|----------|------------|------------|
| Overall | 16.0% | **74.4%** | +58.4 |
| 1-hop | 24.0% | **94.0%** | +70.0 |
| 2-hop | 14.0% | **90.0%** | +76.0 |
| 3-hop | 14.0% | **76.0%** | +62.0 |
| 4-hop | 18.0% | **54.0%** | +36.0 |
| 5-hop | 10.0% | **58.0%** | +48.0 |

Evaluated on 250 held-out examples (50 per hop level) from the StepGame validation split.
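
The per-hop breakdown above can be computed with a small scoring helper. This is a sketch, not the project's actual evaluation harness; the `(k_hop, gold, pred)` record format is an assumption for illustration.

```python
from collections import defaultdict

def per_hop_accuracy(examples):
    """Compute overall and per-hop accuracy (in %) from
    (k_hop, gold_answer, predicted_answer) records."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for k, gold, pred in examples:
        total[k] += 1
        # Compare case-insensitively, ignoring surrounding whitespace
        if pred.strip().lower() == gold.strip().lower():
            correct[k] += 1
    per_hop = {k: 100.0 * correct[k] / total[k] for k in total}
    overall = 100.0 * sum(correct.values()) / sum(total.values())
    return overall, per_hop
```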

## How to use

### llama.cpp / llama-server

```bash
llama-server \
  --model LFM2-350M-StepGame-f16.gguf \
  --ctx-size 8192 \
  --host 0.0.0.0 --port 8080
```

### llama-cpp-python

```python
from llama_cpp import Llama

llm = Llama(model_path="LFM2-350M-StepGame-f16.gguf", n_ctx=8192)
output = llm.create_chat_completion(messages=[
    {"role": "system", "content": (
        "You are a spatial reasoning assistant. "
        "Given a sequence of positional relationships between objects, "
        "determine the spatial relationship between two specified objects. "
        "Answer with a single direction from: "
        "left, right, above, below, upper-left, upper-right, lower-left, lower-right."
    )},
    {"role": "user", "content": (
        "J and A are in a vertical line with A below J.\n\n"
        "What is the relation of the agent A to the agent J?"
    )},
])
print(output["choices"][0]["message"]["content"])
# => "below"
```

## Training details

| Parameter | Value |
|-----------|-------|
| Base model | [LiquidAI/LFM2-350M](https://huggingface.co/LiquidAI/LFM2-350M) |
| Method | LoRA (PEFT) |
| Rank (r) | 16 |
| Alpha | 32 |
| Dropout | 0.05 |
| Target modules | q_proj, k_proj, v_proj, w1, w2, w3, in_proj, out_proj |
| Training examples | 10,000 (2,000 per hop level, stratified) |
| Epochs | 3 |
| Learning rate | 2e-4 |
| Batch size | 2 (×8 gradient accumulation) |
| Optimizer | paged_adamw_8bit |
| Quantization | QLoRA (NF4, double quant) |
| Final loss | 0.2033 |
| Training time | ~77 min (Colab T4) |

## GGUF details

| File | Quant | Size |
|------|-------|------|
| `LFM2-350M-StepGame-f16.gguf` | F16 | 679 MB |

Produced by merging the LoRA adapter into the base model, then converting with llama.cpp's `convert_hf_to_gguf.py`.

## Dataset

Training and evaluation data come from different splits of [ZhengyanShi/StepGame](https://huggingface.co/datasets/ZhengyanShi/StepGame):

- **Training**: 10,000 examples from the `train` split (stratified, 2,000 per k-hop level)
- **Evaluation**: 250 examples from the `validation` split (stratified, 50 per k-hop level)

Examples with the "overlap" label were filtered out. Only the 8 cardinal/intercardinal directions are used.
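
The filtering and stratified sampling described above can be sketched as follows. The `k_hop` and `label` field names are assumptions for illustration; the actual dataset schema may differ.

```python
import random
from collections import defaultdict

# The 8 cardinal/intercardinal directions kept after filtering
DIRECTIONS = {"left", "right", "above", "below",
              "upper-left", "upper-right", "lower-left", "lower-right"}

def stratified_sample(records, per_hop, seed=0):
    """Drop non-directional labels (e.g. 'overlap'), then draw an
    equal number of examples from each k-hop level."""
    rng = random.Random(seed)
    by_hop = defaultdict(list)
    for rec in records:
        if rec["label"] in DIRECTIONS:
            by_hop[rec["k_hop"]].append(rec)
    sample = []
    for k in sorted(by_hop):
        sample.extend(rng.sample(by_hop[k], min(per_hop, len(by_hop[k]))))
    return sample
```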

## Prompt format

The model uses ChatML-style prompts (`<|im_start|>`/`<|im_end|>` tokens):

```
<|im_start|>system
You are a spatial reasoning assistant. Given a sequence of positional relationships between objects, determine the spatial relationship between two specified objects. Answer with a single direction from: left, right, above, below, upper-left, upper-right, lower-left, lower-right.<|im_end|>
<|im_start|>user
{story}

{question}<|im_end|>
<|im_start|>assistant
```
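
For callers that bypass the chat API and feed raw text, the template can be assembled with a small helper. This is a sketch matching the format shown above, not code shipped with the model.

```python
SYSTEM_PROMPT = (
    "You are a spatial reasoning assistant. "
    "Given a sequence of positional relationships between objects, "
    "determine the spatial relationship between two specified objects. "
    "Answer with a single direction from: "
    "left, right, above, below, upper-left, upper-right, lower-left, lower-right."
)

def build_prompt(story, question):
    """Assemble the ChatML-style prompt, ending at the assistant turn
    so the model generates the answer."""
    return (
        f"<|im_start|>system\n{SYSTEM_PROMPT}<|im_end|>\n"
        f"<|im_start|>user\n{story}\n\n{question}<|im_end|>\n"
        "<|im_start|>assistant\n"
    )
```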

## Source

Project repository: [spatialft/spatialft.github.io](https://github.com/spatialft/spatialft.github.io)

Built for AIPI 590.03 Intelligent Agents (Duke University).