---
language: en
widget:
  - text: "Give me a complete answer do not refer to other chapters but collect the information from them. How to setup a local network in Nextstep OS?"
---
## The goal
The goal of this model is to provide a fine-tuned Phi-2 (https://huggingface.co/microsoft/phi-2) model that has knowledge about the vintage NEXTSTEP operating system and is able to answer questions on the topic.
### Details
The model was trained on 35,439 question-answer pairs automatically generated from the NEXTSTEP 3.3 System Administrator documentation. For the training data generation, a locally running Q8-quantized Orca-2 13B (https://huggingface.co/TheBloke/Orca-2-13B-GGUF) model was used. The training data generation was completely unsupervised, with only some sanity checks (like ignoring data chunks
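A rough sketch of the pipeline described above (chunk the documentation, prompt a locally hosted Orca-2 GGUF model for a Q&A pair per chunk, apply simple sanity checks) might look like this. The chunk size, filter threshold, prompt wording, and function names are illustrative assumptions, not the author's actual script:

```python
# Hypothetical sketch of the unsupervised Q&A-pair generation described above.
# Uses llama-cpp-python to run a local Orca-2 13B Q8 GGUF file; the prompt
# template, chunk size, and thresholds are illustrative guesses.
try:
    from llama_cpp import Llama  # optional: only needed for actual generation
except ImportError:
    Llama = None

MIN_CHUNK_CHARS = 200  # sanity check: ignore chunks too short to be useful


def chunk_document(text: str, size: int = 2000) -> list:
    """Split raw documentation text into fixed-size character chunks."""
    return [text[i:i + size] for i in range(0, len(text), size)]


def passes_sanity_check(chunk: str) -> bool:
    """Drop chunks that are too short or mostly non-text (e.g. number tables)."""
    if len(chunk) < MIN_CHUNK_CHARS:
        return False
    letters = sum(c.isalpha() for c in chunk)
    return letters / len(chunk) > 0.5


def qa_prompt(chunk: str) -> str:
    """Ask the model for one question-answer pair grounded in the chunk."""
    return (
        "Read the following NEXTSTEP documentation excerpt and write one "
        "question a system administrator might ask, followed by its answer.\n\n"
        f"{chunk}\n\nQuestion:"
    )


def generate_qa_pairs(model_path: str, doc_text: str, max_tokens: int = 512):
    """Run the full pipeline against a local GGUF model (needs llama-cpp-python)."""
    llm = Llama(model_path=model_path, n_ctx=4096)
    for chunk in chunk_document(doc_text):
        if passes_sanity_check(chunk):
            out = llm(qa_prompt(chunk), max_tokens=max_tokens)
            yield out["choices"][0]["text"]
```

Calling `generate_qa_pairs()` with a path to the quantized GGUF file and the raw manual text would stream candidate Q&A text for later cleanup; in the actual run this process produced the 35,439 pairs.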
The evaluation set was generated with a similar method on 1% of the raw data with LLam

Trained locally on 2x RTX 3090 GPUs with vanilla DDP via HuggingFace Accelerate for 50 epochs.
As I wanted to add new knowledge to the base model, r=128 and lora_alpha=128 were used, so the LoRA weights were 3.5% of the base model.
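As a rough sanity check on that 3.5% figure: a LoRA adapter on a d_out x d_in weight matrix adds r * (d_in + d_out) parameters on top of the d_in * d_out weights it adapts. A quick estimate, treating Phi-2's hidden size of 2560 as an assumption for illustration:

```python
# Back-of-the-envelope estimate of LoRA adapter size relative to the base weights.
# A LoRA pair (A: r x d_in, B: d_out x r) adds r * (d_in + d_out) parameters
# on top of the d_out * d_in matrix it adapts.

def lora_param_fraction(d_in: int, d_out: int, r: int) -> float:
    """Extra parameters a LoRA adapter adds, as a fraction of one weight matrix."""
    return r * (d_in + d_out) / (d_in * d_out)

# A square 2560 x 2560 projection (Phi-2 hidden size, assumed here) at r=128:
print(f"{lora_param_fraction(2560, 2560, 128):.1%}")  # prints 10.0%
```

Since only a subset of the model's weight matrices is typically targeted, a 10% per-matrix overhead is consistent with an overall adapter that is a few percent of the base model.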
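For reference, a 2-GPU DDP run through Accelerate is typically launched along these lines; the script name and its epoch flag are placeholders, not the author's exact command:

```shell
# Illustrative 2-GPU DDP launch with HuggingFace Accelerate.
# train.py and --num_train_epochs are hypothetical placeholders.
accelerate launch --multi_gpu --num_processes 2 train.py --num_train_epochs 50
```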
## Sample code
Sample code for chatting with the model:
https://github.com/csabakecskemeti/ai_utils/blob/main/generate.py
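If the linked script is unavailable, a minimal chat sketch with transformers might look like the following. The Instruct:/Output: prompt format follows the base Phi-2 convention and is an assumption here; the repo's generate.py may differ:

```python
# Minimal sketch of querying the fine-tuned model with transformers.
# The prompt format is an assumption; see the repo's generate.py for the
# author's actual sample code.

def build_prompt(question: str) -> str:
    """Format a question in Phi-2's Instruct/Output style (assumed format)."""
    return f"Instruct: {question}\nOutput:"


def chat(model_id: str, question: str, max_new_tokens: int = 256) -> str:
    """Load the checkpoint and generate an answer (needs torch + transformers)."""
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tok = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)
    inputs = tok(build_prompt(question), return_tensors="pt")
    out = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tok.decode(out[0], skip_special_tokens=True)
```

Usage would be `chat(<fine-tuned checkpoint id>, "How do I set up a local network in NEXTSTEP?")`.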