| --- |
| {} |
| --- |
| |
| # **Model Summary: Infy-Coder-2.5B** |
|
|
| ## **Overview** |
| Infy-Coder-2.5B-v1 is a 2.5B-parameter code model designed, engineered, and trained entirely at Infosys, using 4.2T tokens on top of the EnterpriseSLM-2.5B base model. Despite its compact size, it achieves performance parity with frontier open-source models in code generation and tool calling, scores strongly on helpfulness and harmlessness safety metrics, and delivers higher throughput than larger frontier models. |
|
|
| **Developed by**: Infosys Ltd. |
|
|
| --- |
|
|
| ## **Architecture & Training** |
| - **Base Model:** EnterpriseSLM-2.5B |
| - **Training Phases:** |
|   - **Continual Pretraining (CPT):** Next-token prediction with Fill-in-the-Middle (FIM) for structural infilling. |
|   - **Supervised Fine-Tuning (SFT):** Instruction alignment for coding tasks, function calling, and safety. |
| - **Optimization:** |
|   - **BF16 mixed precision**, **Grouped Query Attention (GQA)**, and **Distributed Fused Adam** optimizer. |
|   - Specialized tokenization with syntax markers and reasoning tokens for advanced behaviors. |
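
To illustrate the Fill-in-the-Middle objective mentioned under CPT, here is a minimal sketch of how a training document could be rearranged so that ordinary next-token prediction learns to infill. It assumes the common prefix-suffix-middle (PSM) layout. The sentinel strings and function names are hypothetical placeholders, since this card does not name the actual tokens or describe the data pipeline.

```python
import random
from typing import Tuple

# Hypothetical sentinel strings; the model card does not name the real tokens.
FIM_PREFIX = "<fim_prefix>"
FIM_SUFFIX = "<fim_suffix>"
FIM_MIDDLE = "<fim_middle>"


def to_fim_sample(code: str, rng: random.Random) -> str:
    """Rearrange a document into prefix-suffix-middle (PSM) order."""
    # Two random cut points split the text into prefix, middle, and suffix.
    i, j = sorted(rng.randint(0, len(code)) for _ in range(2))
    prefix, middle, suffix = code[:i], code[i:j], code[j:]
    # The middle span is moved to the end, so plain next-token prediction
    # generates it while conditioning on both the prefix and the suffix.
    return f"{FIM_PREFIX}{prefix}{FIM_SUFFIX}{suffix}{FIM_MIDDLE}{middle}"


def split_fim_sample(sample: str) -> Tuple[str, str, str]:
    """Recover (prefix, middle, suffix) from a PSM-formatted sample.

    Assumes the original text does not itself contain the sentinel strings.
    """
    rest = sample[len(FIM_PREFIX):]
    prefix, rest = rest.split(FIM_SUFFIX, 1)
    suffix, middle = rest.split(FIM_MIDDLE, 1)
    return prefix, middle, suffix
```

At inference time, the same layout lets an editor send the code before and after the cursor and have the model generate only the missing middle span.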
|
|
| --- |
|
|
| ## **Performance Highlights** |
|
|
| | **Category** | **Benchmark** | **# Shots** | **Metric** | **Scores** | |
| |----------------|--------------------------------------|-------------|--------------|--------------| |
| | Code Gen | MBPP | 0 | pass@1 | 91.21% | |
| | Code Gen | MBPP+ | 0 | pass@1 | 89.15% | |
| | Code Gen | HumanEval | 0 | pass@1 | 53.66% | |
| | Code Gen | HumanEval+ | 0 | pass@1 | 48.78% | |
| | Code Gen | NumpyEval | 0 | pass@1 | 56.44% | |
| | Code Gen | PandasEval | 0 | pass@1 | 53.47% | |
| | Tool Use | BFCL v2 | 0 | overall acc | 55.26% | |
| | Safety | AIR-Bench | 0 | pass@1 | 67.32% | |
| | SecCode Gen | CybersecEval4-Autocomplete | 0 | pass@1 | 78.91% | |
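
For readers unfamiliar with the pass@1 metric in the table, the sketch below shows the widely used unbiased pass@k estimator, which reduces to the fraction of passing samples when k = 1. The card does not state how many samples per problem were drawn or which evaluation harness was used, so the function names and the example inputs are illustrative assumptions only.

```python
from math import comb
from typing import Iterable, Tuple


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate for one problem.

    n: samples generated, c: samples that pass all tests, k: attempt budget.
    """
    if not (0 <= c <= n) or not (1 <= k <= n):
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    # Probability that at least one of k samples drawn without
    # replacement from the n generated samples is correct.
    return 1.0 - comb(n - c, k) / comb(n, k)


def benchmark_pass_at_k(results: Iterable[Tuple[int, int]], k: int) -> float:
    """Average pass@k over (n, c) pairs, one pair per benchmark problem."""
    scores = [pass_at_k(n, c, k) for n, c in results]
    return sum(scores) / len(scores)
```

With a single greedy sample per problem (n = 1, k = 1), the benchmark score is simply the share of problems whose generated solution passes every test.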
|
|
| --- |
|
|
| ## **Responsible AI & Safety** |
| - Integrated safety objectives during SFT. |
| - Mixed harmful and general samples at a 1:4 ratio to support secure code generation and ethical language use. |
| - Validated against **Stanford AIR-Bench** and **CybersecEval4-Autocomplete** benchmarks. |
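
As a rough illustration of the 1:4 harmful-to-general ratio, the sketch below mixes two sample pools at a fixed ratio. The card does not say whether the ratio was applied per batch or across the whole SFT set, so this assumes a dataset-level mix. `mix_at_ratio` and its parameters are hypothetical names, not part of any Infosys tooling.

```python
import random
from typing import List, Sequence, Tuple


def mix_at_ratio(
    harmful: Sequence[str],
    general: Sequence[str],
    ratio: Tuple[int, int] = (1, 4),
    seed: int = 0,
) -> List[str]:
    """Build a shuffled dataset with harmful:general samples at `ratio`."""
    h_part, g_part = ratio
    # Largest number of complete ratio units that both pools can supply.
    units = min(len(harmful) // h_part, len(general) // g_part)
    rng = random.Random(seed)
    mixed = rng.sample(list(harmful), units * h_part)
    mixed += rng.sample(list(general), units * g_part)
    # Shuffle so safety samples are spread throughout the dataset.
    rng.shuffle(mixed)
    return mixed
```

With 10 safety-related samples and 20 general samples, this keeps 5 of the former and all 20 of the latter, which preserves the 1:4 proportion.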
|
|
| --- |
|
|
| ## **Deployment & Future Work** |
| - **Quantization:** The model is optimized for low latency and outperforms most sub-8B SLMs on that measure. Quantized variants of Infy-Coder can be deployed and run on standard desktop environments without specialized hardware such as GPUs. |
| - **Future Work:** Planned enhancements include agentic coding capabilities and a longer context length. The model weights will be open-sourced early next year to accelerate research and real-world deployment. |