Update README.md
README.md CHANGED
````diff
@@ -133,7 +133,7 @@ llamafactory-cli train logicsct_train_Mistral_Nemo_qlora_sft_otfq.yaml # V
 llamafactory-cli chat logicsct_inference_Mistral_Nemo_qlora_sft_otfq.yaml # VRAM used: 24833MiB for inference of the base model + QLoRA adapter
 llamafactory-cli export logicsct_export_Mistral_Nemo_qlora_sft.yaml # VRAM used: 657MiB + about 24 GB of system RAM for exporting a merged version of the model with its adapter
 llamafactory-cli export logicsct_export_Mistral_Nemo_qlora_sft_Q4.yaml # VRAM used: 30353MiB for a 4-bit quant export of the merged model
-llamafactory-cli chat logicsct_inference_Mistral_Nemo_qlora_sft_otfq_Q4.yaml # VRAM used: 8541MiB-9569MiB
+llamafactory-cli chat logicsct_inference_Mistral_Nemo_qlora_sft_otfq_Q4.yaml # VRAM used: 8541MiB-9569MiB for inference of the 4-bit quant merged model (increases with context length)
 ```
````
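For orientation, a LLaMA-Factory merge/export config along the lines of `logicsct_export_Mistral_Nemo_qlora_sft.yaml` typically has the shape sketched below. This is a guess at the structure only — the actual file in this repo may differ, and the model path, adapter path, and output directory are placeholders:

```yaml
### model (placeholder paths — adjust to your checkpoint and adapter)
model_name_or_path: mistralai/Mistral-Nemo-Instruct-2407
adapter_name_or_path: saves/mistral_nemo/qlora/sft
template: mistral
finetuning_type: lora

### export
export_dir: models/mistral_nemo_qlora_merged
export_size: 4            # max shard size in GB
export_device: cpu        # merging on CPU matches the ~24 GB system RAM noted above
export_legacy_format: false
```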
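As a rough sanity check on the VRAM figures above, the weight footprint alone can be estimated from parameter count and bits per weight. This is a back-of-the-envelope sketch: the ~12.2B parameter count for Mistral-Nemo is an assumption, and real usage adds KV cache, activations, and runtime overhead on top of the weights (which is also why the observed usage grows with context length):

```python
def weight_vram_gib(n_params: float, bits_per_weight: float) -> float:
    """VRAM needed just to hold the model weights, in GiB."""
    return n_params * bits_per_weight / 8 / 2**30

N = 12.2e9  # assumed Mistral-Nemo parameter count

# bf16 weights: ~22.7 GiB -- close to the ~24 GiB observed for
# inference of the base model + QLoRA adapter.
print(f"bf16:  {weight_vram_gib(N, 16):.1f} GiB")

# 4-bit quant weights: ~5.7 GiB -- the observed 8.5-9.5 GiB adds
# KV cache and runtime overhead on top of this.
print(f"4-bit: {weight_vram_gib(N, 4):.1f} GiB")
```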
### Comparison of Open Source Training/Models with OpenAI Proprietary Fine-Tuning