Updated README.md file #1
by srkchowdary2000 - opened
README.md CHANGED
@@ -2,17 +2,17 @@
 {}
 ---
 
-# **Model Summary:
+# **Model Summary: Infy-Coder-2.5B**
 
 ## **Overview**
-
+Infy-Coder-2.5B-v1 is a breakthrough 2.5B-parameter code model fully designed, engineered, and trained at Infosys on 4.2T tokens on top of the EnterpriseSLM-2.5B base model. Despite its compact size, Infy-Coder-2.5B-v1 sets a new benchmark for small language models, achieving performance parity with frontier open-source models in code generation and tool calling, along with exemplary performance on safety metrics in helpfulness and harmlessness, and superior throughput that surpasses larger frontier models.
 
 **Developed by**: Infosys Ltd.
 
 ---
 
 ## **Architecture & Training**
-- **Base Model:**
+- **Base Model:** EnterpriseSLM-2.5B
 - **Training Phases:**
 - **Continual Pretraining (CPT):** Next-token prediction with Fill-in-the-Middle (FIM) for structural infilling.
 - **Supervised Fine-Tuning (SFT):** Instruction alignment for coding tasks, function calling, and safety.
@@ -46,5 +46,5 @@ Mify-Coder-2.5B-v1 is a breakthrough 2.5B-parameter code model fully designed, e
 ---
 
 ## **Deployment & Future Work**
-- **Quantization:** The model was optimized for low latency outperforming most sub-8B SLM models. Furthermore, the quantized variants of
-- Future work includes enhancing
+- **Quantization:** The model was optimized for low latency, outperforming most sub-8B SLM models. Furthermore, the quantized variants of Infy-Coder can be seamlessly deployed and used for inference on standard desktop environments, eliminating the need for specialized hardware such as GPUs.
+- Future work includes enhancing Infy-Coder with agentic coding competencies and scaling its context length. The model weights will be open-sourced early next year to accelerate research and real-world deployment.