|
# **Model Summary: Mify-Coder-2.5B**
|
|
|
|
|
## **Overview**
|
|
Mify-Coder-2.5B-v1 is a 2.5B-parameter code model fully designed, engineered, and trained at Infosys, built on the Mify-2.5B base model and trained on 4.2T tokens. Despite its compact size, Mify-Coder-2.5B-v1 sets a new benchmark for small language models, achieving performance parity with frontier open-source models in code generation and tool calling, strong safety performance on helpfulness and harmlessness metrics, and throughput that surpasses larger frontier models.
|
|
|
|
|
**Developed by**: Infosys Ltd.
|
|
|
|
|
---
|
|
|
|
|
## **Architecture & Training**
|
|
- **Base Model:** Mify-2.5B
- **Training Phases:**
  - **Continual Pretraining (CPT):** Next-token prediction with Fill-in-the-Middle (FIM) for structural infilling.
  - **Supervised Fine-Tuning (SFT):** Instruction alignment for coding tasks, function calling, and safety.
- **Optimization:**
  - **BF16 mixed precision**, **Grouped Query Attention (GQA)**, and the **Distributed Fused Adam** optimizer.
  - Specialized tokenization with syntax markers and reasoning tokens for advanced behaviors.
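Fill-in-the-Middle training rearranges each document so the model learns to predict a missing span from its surrounding context rather than only left-to-right continuations. A minimal sketch of PSM-style (prefix-suffix-middle) FIM formatting follows; the sentinel token strings here are hypothetical illustrations, since Mify's actual tokenizer markers are not published in this card:

```python
# Hypothetical FIM sentinel tokens -- the real Mify tokenizer markers may differ.
FIM_PREFIX, FIM_SUFFIX, FIM_MIDDLE = "<fim_prefix>", "<fim_suffix>", "<fim_middle>"

def to_fim_example(code: str, start: int, end: int) -> str:
    """Rearrange a document into PSM order for FIM training.

    The span code[start:end] becomes the 'middle' the model must infill;
    at training time the loss is taken on the tokens after FIM_MIDDLE.
    """
    prefix, middle, suffix = code[:start], code[start:end], code[end:]
    return f"{FIM_PREFIX}{prefix}{FIM_SUFFIX}{suffix}{FIM_MIDDLE}{middle}"

# The model sees prefix and suffix first, then learns to generate the middle.
example = to_fim_example("def add(a, b):\n    return a + b\n", 15, 31)
```

At inference time the same format lets an editor send the code before and after the cursor and receive the infilled span.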
|
|
|
|
|
---
|
|
|
|
|
## **Performance Highlights**
|
|
|
|
|
| **Category** | **Benchmark**              | **# Shots** | **Metric**  | **Score** |
|--------------|----------------------------|-------------|-------------|-----------|
| Code Gen     | MBPP                       | 0           | pass@1      | 91.21%    |
| Code Gen     | MBPP+                      | 0           | pass@1      | 89.15%    |
| Code Gen     | HumanEval                  | 0           | pass@1      | 53.66%    |
| Code Gen     | HumanEval+                 | 0           | pass@1      | 48.78%    |
| Code Gen     | NumpyEval                  | 0           | pass@1      | 56.44%    |
| Code Gen     | PandasEval                 | 0           | pass@1      | 53.47%    |
| Tool Use     | BFCL v2                    | 0           | overall acc | 55.26%    |
| Safety       | AIR-Bench                  | 0           | pass@1      | 67.32%    |
| SecCode Gen  | CybersecEval4-Autocomplete | 0           | pass@1      | 78.91%    |
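For context on the metric: pass@1 scores like those above are conventionally computed with the unbiased pass@k estimator over n sampled generations per task, c of which pass the unit tests. A sketch of that standard formula (not Infosys's published evaluation code):

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k: probability that at least one of k samples drawn
    without replacement from n generations (c of which pass) is correct."""
    if n - c < k:
        return 1.0  # every size-k subset must contain a passing sample
    return 1.0 - comb(n - c, k) / comb(n, k)

# e.g. 4 generations for a task, 2 of them pass the tests
score = pass_at_k(n=4, c=2, k=1)  # 0.5
```

The benchmark score is this quantity averaged over all tasks in the suite.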
|
|
|
|
|
---
|
|
|
|
|
## **Responsible AI & Safety**
|
|
- Integrated safety objectives during SFT.
- Balanced harmful/general sample ratio (1:4) for secure code generation and ethical language use.
- Validated against the **Stanford AIR-Bench** and **CybersecEval4-Autocomplete** benchmarks.
|
|
|
|
|
---
|
|
|
|
|
## **Deployment & Future Work**
|
|
- **Quantization:** The model is optimized for low latency, outperforming most sub-8B small language models. Furthermore, the quantized variants of Mify-Coder can be seamlessly deployed and run for inference on standard desktop environments, eliminating the need for specialized hardware such as GPUs.
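To illustrate why quantization shrinks memory enough for CPU-only desktops, here is a minimal symmetric per-tensor int8 quantization sketch. This is a generic textbook scheme, not the actual Mify-Coder quantization pipeline, whose details are not published in this card:

```python
def quantize_int8(weights):
    """Symmetric per-tensor int8 quantization: each weight w is stored as
    a signed byte q with w ~= scale * q, cutting memory 4x vs. float32."""
    scale = max(abs(w) for w in weights) / 127.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize_int8(q, scale):
    """Recover approximate float weights from the int8 codes."""
    return [scale * v for v in q]

weights = [0.8, -1.27, 0.004, 0.5]
q, scale = quantize_int8(weights)
restored = dequantize_int8(q, scale)
# The rounding error per weight is bounded by scale / 2.
```

Production schemes typically quantize per channel or per block and may use 4-bit formats for further savings, but the round-trip idea is the same.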
|
|
- Future work includes adding agentic coding capabilities to Mify-Coder and extending its context length. The model weights will be open-sourced early next year to accelerate research and real-world deployment.