# Model Card for xlstm-7b-instruct-phase-2
This model is a fine-tuned version of [ethicalabs/xLSTM-7b-Instruct](https://huggingface.co/ethicalabs/xLSTM-7b-Instruct) for task alignment.
It has been trained with [TRL](https://github.com/huggingface/trl) using SFT on assistant-only tokens.
The `k_proj` and `v_proj` matrices have been frozen to isolate and preserve the model's pre-trained knowledge base.
This fine-tuning focused only on the `q_proj` (query) and FFN matrices, adapting the model's reasoning and query-retrieval mechanisms without overwriting its core, frozen knowledge.
This experiment was designed to test the hypothesis that the model's reasoning capabilities (`q_proj`) could be specialized for math/code while its knowledge (`k_proj`, `v_proj`) remained intact.
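The selective-freezing setup described above can be sketched as follows. This is an illustrative sketch, not the actual training script: the parameter-name matching assumes the xLSTM checkpoint exposes `k_proj`/`v_proj` substrings in its parameter names, which may differ in practice.

```python
# Illustrative sketch of the selective-freezing setup: freeze k_proj and
# v_proj (knowledge) while leaving q_proj and FFN weights trainable.
# Parameter names are assumptions; check model.named_parameters() first.
import torch.nn as nn


def freeze_kv_projections(model: nn.Module) -> None:
    """Disable gradients for key/value projection matrices only."""
    for name, param in model.named_parameters():
        if "k_proj" in name or "v_proj" in name:
            param.requires_grad = False
```

In a TRL run, this would be applied to the model before handing it to the trainer, so the optimizer only ever updates the query and FFN parameters.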
## Quick start
Work in Progress!
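In the meantime, a minimal usage sketch, assuming a `transformers` version with xLSTM support and a CUDA device. The model id below is the base checkpoint as a placeholder; substitute this model's Hub id once it is published.

```python
from transformers import pipeline

question = "What is the capital of France?"

# Placeholder model id (the base checkpoint); replace with this
# fine-tuned model's Hub id when available.
generator = pipeline(
    "text-generation",
    model="ethicalabs/xLSTM-7b-Instruct",
    device="cuda",
)
output = generator(
    [{"role": "user", "content": question}],
    max_new_tokens=128,
    return_full_text=False,
)[0]
print(output["generated_text"])
```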
## Training procedure