Update README.md
README.md
@@ -52,7 +52,7 @@ print(response[0].outputs[0].text)
 
 ## 🏗️ Technical Specifications
 ### Hardware Requirements
-- **Inference**:
+- **Inference**: 47GB VRAM (+ Context)
 - **Supported GPUs**: H100, L40S, A100 (80GB), RTX 4090 (2x for tensor parallelism)
 - **GPU Architecture**: Ada Lovelace, Hopper (for optimal FP8 performance)
 ### Quantization Details
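
The new 47GB figure can be checked against the supported GPU list with some quick arithmetic. The sketch below is illustrative only. The per-card memory sizes are assumptions taken from public vendor specifications, not from the README, and the helper name `min_cards` is hypothetical. It covers the base 47GB only, so the "(+ Context)" part (KV cache) must fit in whatever headroom remains.

```python
import math

# Base inference footprint from the README change; context (KV cache) comes on top.
REQUIRED_GB = 47

# Nominal per-card memory in GB. Assumed from public vendor specs, not from the README.
GPU_MEMORY_GB = {
    "H100": 80,
    "L40S": 48,
    "A100 (80GB)": 80,
    "RTX 4090": 24,
}


def min_cards(required_gb: int, per_card_gb: int) -> int:
    """Smallest number of identical cards whose combined memory covers required_gb."""
    return math.ceil(required_gb / per_card_gb)


# Cards needed per GPU model, and the nominal memory left over for context.
cards_needed = {name: min_cards(REQUIRED_GB, mem) for name, mem in GPU_MEMORY_GB.items()}
headroom_gb = {
    name: cards_needed[name] * mem - REQUIRED_GB for name, mem in GPU_MEMORY_GB.items()
}

for name in GPU_MEMORY_GB:
    print(f"{name}: {cards_needed[name]} card(s), ~{headroom_gb[name]} GB nominal headroom")
```

Under these assumptions a single RTX 4090 cannot hold the model, which is consistent with the "2x for tensor parallelism" note. The L40S and the 2x RTX 4090 setup leave only about 1GB of nominal headroom, so longer contexts may not fit there. Usable memory is typically somewhat below the nominal figure.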