Update README.md
README.md CHANGED
@@ -5,7 +5,7 @@ license: apache-2.0
 tags:
 - merge
 - mergekit
--
+- slerp
 - agent
 - gui-automation
 - vision
@@ -181,19 +181,20 @@ This model was merged using **Mergekit**.
 models:
   - model: microsoft/Fara-7B
   - model: ByteDance-Seed/UI-TARS-1.5-7B
-
-      density: 0.53
-      weight: 0.5
-merge_method: dare_ties
+merge_method: slerp
 base_model: microsoft/Fara-7B
-parameters:
-  normalize: true
-  int8_mask: true
 dtype: bfloat16
+parameters:
+  t:
+    # 5-point gradient:
+    # 0.1 (Start): Mostly Fara -> Ensures input understanding and English grammar.
+    # 0.3 -> 0.5 (Middle): Blends TARS capability for reasoning and logic.
+    # 0.1 (End): Mostly Fara -> Ensures the output stops correctly and doesn't loop.
+    - value: [0.1, 0.3, 0.5, 0.3, 0.1]
 ```
-*(Note: While `
+*(Note: While `slerp` was used, specific inference parameters (temp=0.4, rep_penalty=1.15) are required to stabilize the output, as documented in the Usage section).*
 
-##
+## Limitations
 
 1. **Strict Prompting:** The model expects the specific System Prompt defined in the usage class. Without it, it may hallucinate tool names.
 2. **Repetition:** In extremely long lists (100+ items), the model may repeat. The recommended `repetition_penalty=1.15` fixes this for 99% of cases.
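The `t` gradient in the new `slerp` config can be read as a per-layer blend weight between the base model (`t = 0`) and the other model (`t = 1`). The sketch below is one way to picture that, not mergekit's actual code. It assumes the five anchor values are linearly interpolated across layer depth, which is how mergekit's documentation describes gradient lists, and it assumes a 28-layer stack purely for illustration. The names `gradient_value`, `slerp`, `ANCHORS`, and `NUM_LAYERS` are hypothetical and not part of mergekit's API.

```python
import math

# Anchor values copied from the `t` list in the diff above.
ANCHORS = [0.1, 0.3, 0.5, 0.3, 0.1]
# Assumption for illustration only: the real layer count depends on the checkpoints.
NUM_LAYERS = 28


def gradient_value(anchors, layer_idx, num_layers):
    """Linearly interpolate the anchor list across layer depth (assumed behaviour)."""
    if num_layers < 2 or len(anchors) == 1:
        return float(anchors[0])
    # Map the layer index onto the anchor axis [0, len(anchors) - 1].
    pos = layer_idx / (num_layers - 1) * (len(anchors) - 1)
    lo = int(math.floor(pos))
    hi = min(lo + 1, len(anchors) - 1)
    frac = pos - lo
    return (1.0 - frac) * anchors[lo] + frac * anchors[hi]


def slerp(t, v0, v1, eps=1e-8):
    """Spherical interpolation of two vectors: t=0 gives v0, t=1 gives v1."""
    norm0 = math.sqrt(sum(x * x for x in v0))
    norm1 = math.sqrt(sum(x * x for x in v1))
    if norm0 < eps or norm1 < eps:
        # Degenerate input: fall back to plain linear interpolation.
        return [(1.0 - t) * a + t * b for a, b in zip(v0, v1)]
    cos_theta = sum(a * b for a, b in zip(v0, v1)) / (norm0 * norm1)
    cos_theta = max(-1.0, min(1.0, cos_theta))
    theta = math.acos(cos_theta)
    sin_theta = math.sin(theta)
    if abs(sin_theta) < eps:
        # Nearly parallel vectors: linear interpolation is numerically safer.
        return [(1.0 - t) * a + t * b for a, b in zip(v0, v1)]
    w0 = math.sin((1.0 - t) * theta) / sin_theta
    w1 = math.sin(t * theta) / sin_theta
    return [w0 * a + w1 * b for a, b in zip(v0, v1)]


if __name__ == "__main__":
    # Print the blend weight each layer would receive under these assumptions.
    for layer in range(NUM_LAYERS):
        print(layer, round(gradient_value(ANCHORS, layer, NUM_LAYERS), 3))
```

Under these assumptions the first and last layers stay closest to `microsoft/Fara-7B`, and the blend with `ByteDance-Seed/UI-TARS-1.5-7B` peaks at 0.5 near the middle of the stack, matching the intent stated in the config comments.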