csabakecskemeti committed
Commit 75dd3d8 · verified · 1 Parent(s): a6f1859

Update README.md

Files changed (1):
  1. README.md +3 -2

README.md CHANGED
@@ -5,10 +5,11 @@ language: en
 widget:
 - text: "Give me a complete answer do not refer to other chapters but collect the information from them. How to setup a local network in Nextstep OS?"
 ---
-
+## The goal
 The goal of the model is to provide a fine-tuned Phi-2 (https://huggingface.co/microsoft/phi-2) model that has knowledge of the vintage NEXTSTEP operating system
 and is able to answer questions on the topic.
 
+### Details
 The model was trained on 35,439 question-answer pairs automatically generated from the NEXTSTEP 3.3 System Administrator
 documentation. For the training data generation, a locally running Q8-quantized Orca2 13B (https://huggingface.co/TheBloke/Orca-2-13B-GGUF)
 model was used. The generation was completely unsupervised, with only some sanity checks (like ignoring data chunks
@@ -19,7 +20,7 @@ Evaluation set has been generated similar method on 1% of the raw data with LLam
 Trained locally on 2x3090 GPUs with vanilla DDP and HuggingFace Accelerate for 50 epochs.
 As I wanted to add new knowledge to the base model, r=128 and lora_alpha=128 were used -> the LoRA weights were 3.5% of the base model.
 
-
+## Sample code
 Chat with model sample code:
 https://github.com/csabakecskemeti/ai_utils/blob/main/generate.py
 
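The README states that at r=128 the LoRA weights came to about 3.5% of the base model. A back-of-the-envelope sketch of that claim, assuming phi-2's published architecture (roughly 2.7B parameters, 32 layers, hidden size 2560) and assuming the adapters target four square projections per block — the actual set of adapted modules is not given in the commit:

```python
# Rough check of the README's "LoRA weights were 3.5% of the base model" claim.
# Layer shapes are assumptions based on phi-2's published architecture,
# not read from the actual training script.

def lora_params(d_in: int, d_out: int, r: int) -> int:
    """LoRA adds two low-rank factors: A (r x d_in) and B (d_out x r)."""
    return r * (d_in + d_out)

HIDDEN = 2560   # assumed phi-2 hidden size
LAYERS = 32     # assumed phi-2 depth
R = 128         # rank from the README

# Assume four square (hidden x hidden) projections adapted per block.
per_layer = 4 * lora_params(HIDDEN, HIDDEN, R)
total_lora = LAYERS * per_layer
fraction = total_lora / 2.7e9  # phi-2 is ~2.7B parameters

print(f"{total_lora:,} LoRA params, about {fraction:.1%} of the base model")
```

This lands around 3%, in the same ballpark as the README's 3.5% figure; adapting a slightly larger set of modules (e.g. the MLP projections) would close the gap.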
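The linked generate.py is the repository's authoritative sample. As a hedged sketch only, chatting with the model via plain `transformers` could look roughly like this — the model id is a placeholder (substitute the fine-tuned checkpoint's repo id) and the prompt wording simply mirrors the widget prompt in the model card:

```python
# Hypothetical chat helper; the linked generate.py is the real sample code.

def build_prompt(question: str) -> str:
    # Mirrors the instruction style of the widget prompt in the model card.
    return ("Give me a complete answer do not refer to other chapters "
            "but collect the information from them. " + question)

def chat(question: str,
         model_id: str = "microsoft/phi-2",  # placeholder: use the fine-tuned model's repo id
         max_new_tokens: int = 256) -> str:
    # Heavy imports kept local so build_prompt stays dependency-free.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)
    inputs = tokenizer(build_prompt(question), return_tensors="pt")
    output = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output[0], skip_special_tokens=True)
```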