Engineered by [SCALAI](https://scalai.es), this model was surgically distilled from OpenAI's 117B-parameter Mixture-of-Experts model (`gpt-oss-120b`) down to a 60B active parameter footprint. Quantized to MXFP4, **ScaLite-60B-Coder requires only ~30GB of VRAM, making it fully deployable on a single NVIDIA L40S (48GB) GPU** with ample room for large KV-caches in production environments.
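The ~30GB figure follows directly from the parameter count and the MXFP4 bit width. A back-of-envelope sketch (the 10% overhead factor for quantization scales, embeddings, and runtime buffers is an illustrative assumption, not a measured value):

```python
# Rough VRAM estimate for MXFP4 weights: params * bits / 8 bits-per-byte.
params = 60e9          # 60B active parameters
bits_per_param = 4     # MXFP4 stores weights in 4 bits
weight_gb = params * bits_per_param / 8 / 1e9
print(f"weights alone: {weight_gb:.1f} GB")        # 30.0 GB

overhead = 1.10        # assumed ~10% for scales/embeddings/buffers
print(f"with overhead: {weight_gb * overhead:.1f} GB")  # ~33 GB
```

On a 48GB L40S this leaves roughly 15GB free, which is the headroom the card cites for KV-caches.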
## 🧠 Model Details

* **Developer:** SCALAI
* **Model Type:** Pruned Mixture-of-Experts (MoE) Causal Language Model
* **Base Model:** `openai/gpt-oss-120b` (128 experts)
* **Pruned Architecture:** 60B active parameters (64 experts)