The goal of this model is to provide a fine-tuned Phi-2 (https://huggingface.co/microsoft/phi-2) model that has knowledge about the vintage NEXTSTEP operating system and is able to answer questions on the topic.

The model has been trained on 35,439 question-answer pairs automatically generated from the NEXTSTEP 3.3 System Administrator documentation. A locally running Q8-quantized Orca2 13B model (https://huggingface.co/TheBloke/Orca-2-13B-GGUF) was used to generate the training data. The generation was completely unsupervised, with only some sanity checks (for example, data chunks containing fewer than 100 tokens were ignored). The maximum context size for Orca2 is 4096 tokens, so a simple rule of splitting chunks longer than 3500 tokens (leaving room for the prompt instructions) was used. Chunking did not consider context, so text might be split in the middle of a passage.

The evaluation set was generated by a similar method on 1% of the raw data with Llama 2 13B Chat (https://huggingface.co/TheBloke/Llama-2-13B-chat-GGUF).
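The chunking and sanity-check rules described above can be sketched roughly as follows. This is a hypothetical illustration, not the actual pipeline code: token counts are approximated here by whitespace splitting, whereas the real pipeline would count tokens with the model's own tokenizer, and the function names are made up for this example.

```python
# Sketch of the chunking rules described above (assumed implementation).
# Chunks over MAX_CHUNK_TOKENS are split naively, without respecting
# context boundaries; chunks under MIN_CHUNK_TOKENS are discarded.

MAX_CHUNK_TOKENS = 3500  # leaves headroom for prompt instructions (Orca2 context: 4096)
MIN_CHUNK_TOKENS = 100   # sanity check: ignore chunks shorter than this


def split_into_chunks(text: str, max_tokens: int = MAX_CHUNK_TOKENS) -> list[str]:
    """Split text into chunks of at most max_tokens whitespace tokens."""
    tokens = text.split()
    return [
        " ".join(tokens[i:i + max_tokens])
        for i in range(0, len(tokens), max_tokens)
    ]


def filter_chunks(chunks: list[str], min_tokens: int = MIN_CHUNK_TOKENS) -> list[str]:
    """Drop chunks that contain fewer than min_tokens tokens."""
    return [c for c in chunks if len(c.split()) >= min_tokens]
```

Because the split is purely length-based, a chunk boundary can fall inside a sentence or section, which is the context-splitting limitation the description acknowledges.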