This repository contains 10 expert models fine-tuned via low-rank adaptation (LoRA).
- **Fine-tuning Framework:** llama-factory
- **Adaptation Technique:** LoRA
- **Training Hardware:** 8×A100-80GB GPUs
- **Note:** Deploying a 2B model requires only 12GB of VRAM. For optimal performance, we recommend using an RTX 3090/4090 (24GB) or a comparable GPU.

A visualization of the performance (ranks) across various datasets shows that each expert model excels in its respective domain.

vLLM supports dynamic LoRA switching, allowing seamless adaptation of different expert models with minimal computational overhead, enabling cost-effective optimization.
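As a sketch of how such switching can look in practice, vLLM's OpenAI-compatible server can register several LoRA adapters alongside one base model and select an adapter per request. The model name, adapter names, and paths below are placeholders, not the actual artifacts in this repository:

```shell
# Serve the base model with LoRA support enabled; each expert adapter
# is registered under its own name (names and paths are hypothetical).
vllm serve base-2b-model \
  --enable-lora \
  --lora-modules math-expert=/adapters/math code-expert=/adapters/code

# Route a request to a specific expert by passing its adapter name
# as the "model" field; the base weights stay loaded throughout.
curl http://localhost:8000/v1/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "math-expert", "prompt": "2+2=", "max_tokens": 8}'
```

Because only the small adapter weights differ between experts, switching among them adds little memory or latency compared with loading ten separate full models.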