---
pipeline_tag: question-answering
---

The goal of this model is to provide a fine-tuned Phi-2 (https://huggingface.co/microsoft/phi-2) model that has knowledge of the vintage NEXTSTEP operating system and can answer questions on the topic.

The model was trained on 35,439 question-answer pairs automatically generated from the NEXTSTEP 3.3 System Administrator documentation. A locally running Q8-quantized Orca2 13B model (https://huggingface.co/TheBloke/Orca-2-13B-GGUF) was used for training data generation. The generation was completely unsupervised, with only minimal sanity checks (such as ignoring data chunks of fewer than 100 tokens). The maximum context size for Orca2 is 4096 tokens, so a simple rule of splitting chunks over 3500 tokens (leaving room for the prompt instructions) was used. Chunking did not consider context, so text might be split mid-context. The evaluation set was generated with a similar method on 1% of the raw data using Llama 2 chat (https://huggingface.co/TheBloke/Llama-2-13B-chat-GGUF).
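The chunking rule described above can be sketched as follows. This is a minimal illustration only: whitespace splitting stands in for the real Orca2 tokenizer, and the constants mirror the limits stated in the card, not the actual pipeline code.

```python
# Sketch of the chunking rule: split documentation into chunks of at most
# 3500 tokens (headroom below Orca2's 4096-token context for the prompt
# instructions) and apply the sanity check of dropping chunks under 100 tokens.
# Whitespace splitting is a crude stand-in for the real tokenizer (assumption).

MAX_CHUNK_TOKENS = 3500
MIN_CHUNK_TOKENS = 100

def chunk_document(text: str) -> list[str]:
    tokens = text.split()  # stand-in for a model tokenizer
    chunks = []
    for start in range(0, len(tokens), MAX_CHUNK_TOKENS):
        chunk = tokens[start:start + MAX_CHUNK_TOKENS]
        if len(chunk) >= MIN_CHUNK_TOKENS:  # ignore tiny chunks
            chunks.append(" ".join(chunk))
    return chunks
```

Note that, as the card says, this splits purely by position: a chunk boundary can fall in the middle of a topic, so some generated QA pairs may come from truncated context.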

Trained locally on 2x RTX 3090 GPUs with vanilla DDP via HuggingFace Accelerate for 50 epochs.
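A typical way to start such a 2-GPU DDP run with Accelerate is shown below; the script name `train.py` is a hypothetical placeholder, as the actual training script is not part of this card.

```shell
# Launch a vanilla DDP training run across 2 local GPUs with Accelerate.
# "train.py" is a placeholder for the actual training script (not shown here).
accelerate launch --multi_gpu --num_processes 2 train.py
```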

Sample code for chatting with the model:
https://github.com/csabakecskemeti/ai_utils/blob/main/generate.py
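For reference, the base Phi-2 model card describes a simple "Instruct/Output" QA prompt style. Assuming the fine-tune keeps that format (the linked `generate.py` is the authoritative example), a prompt could be built like this:

```python
# Format a question in the "Instruct: ... Output:" prompt style from the base
# Phi-2 model card. Assumption: the fine-tuned model keeps this format; see
# the linked generate.py for the author's actual chat code.

def build_prompt(question: str) -> str:
    return f"Instruct: {question}\nOutput:"

# The resulting string would then be passed to the model (e.g. via the
# transformers text-generation pipeline) and the answer read after "Output:".
print(build_prompt("How do I add a user on NEXTSTEP 3.3?"))
```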