# **Model Summary: Mify-Coder-2.5B**

## **Overview**

Mify-Coder-2.5B-8K is a **2.5B-parameter, code-focused language model**. It delivers **strong performance** on code generation, reasoning, and function-calling tasks while maintaining **compute efficiency and enterprise-grade safety**. Rather than following a scale-first paradigm, Mify-Coder demonstrates that smaller models can achieve competitive results through principled data curation and optimized training strategies.

**Developed by**: Infosys Ltd.

---
## **Architecture & Training**

- **Base Model:** Mify-2.5B
- **Training Phases:**
  - **Continual Pretraining (CPT):** Next-token prediction with Fill-in-the-Middle (FIM) for structural infilling (see the sketch after this list).
  - **Supervised Fine-Tuning (SFT):** Instruction alignment for coding tasks, multi-turn dialogues, function calling, and safety.
- **Optimization:**
  - **BF16 mixed precision**, **Grouped Query Attention (GQA)**, and the **Distributed Fused Adam** optimizer.
  - Specialized tokenization with syntax markers and reasoning tokens to support advanced behaviors.
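To make the CPT objective concrete, below is a minimal sketch of how a FIM training sample is typically constructed in the PSM (prefix/suffix/middle) layout. The sentinel token names are placeholders; the card does not disclose Mify-Coder's actual special tokens.

```python
import random

# Minimal FIM (Fill-in-the-Middle) sample construction in the PSM layout.
# The sentinel tokens below are placeholders, not Mify-Coder's real vocabulary.
FIM_PREFIX, FIM_SUFFIX, FIM_MIDDLE = "<fim_prefix>", "<fim_suffix>", "<fim_middle>"

def make_fim_sample(code: str, rng: random.Random) -> str:
    """Cut a document into prefix/middle/suffix and reorder it so that a
    plain next-token-prediction objective teaches the model to infill."""
    a, b = sorted(rng.sample(range(len(code) + 1), 2))
    prefix, middle, suffix = code[:a], code[a:b], code[b:]
    return f"{FIM_PREFIX}{prefix}{FIM_SUFFIX}{suffix}{FIM_MIDDLE}{middle}"

print(make_fim_sample("def add(a, b):\n    return a + b\n", random.Random(0)))
```

At inference time, the same token layout lets the model complete code between an editor's cursor context and the text that follows it.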
---
## **Performance Highlights**

| **Category** | **Benchmark** | **# Shots** | **Metric** | **Score** |
|--------------|---------------|-------------|------------|-----------|
| Code Gen | MBPP | 0 | pass@1 | 89.23% |
| Code Gen | MBPP+ | 0 | pass@1 | 88.89% |
| Code Gen | HumanEval | 0 | pass@1 | 53.05% |
| Code Gen | HumanEval+ | 0 | pass@1 | 46.95% |
| Code Gen | NumpyEval | 0 | pass@1 | 56.44% |
| Code Gen | PandasEval | 0 | pass@1 | 53.47% |
| Tool Use | BFCL v1 | 0 | acc | 79.19% |
| Tool Use | BFCL v2 | 0 | acc | 55.26% |

- Outperforms larger models on algorithmic reasoning tasks while remaining competitive in general coding and security-oriented capabilities (pass@1 is explained in the sketch below).
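For readers unfamiliar with the metric: pass@1 is conventionally computed with the unbiased pass@k estimator of Chen et al. (2021). The card does not name its evaluation harness, so the snippet below is a general-purpose sketch rather than the team's exact setup.

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator (Chen et al., 2021).

    n: candidate completions sampled per problem
    c: completions that pass all unit tests
    k: the k in pass@k (k=1 for the table above)
    """
    if n - c < k:
        return 1.0  # every size-k draw contains at least one passing sample
    return 1.0 - comb(n - c, k) / comb(n, k)

# With greedy decoding (n=1), pass@1 reduces to the fraction of solved problems.
print(pass_at_k(n=20, c=11, k=1))  # 0.55
```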
---
## **Responsible AI & Safety**

- Integrated safety objectives during SFT.
- Mixed harmful and general samples at a 1:4 ratio to promote secure code generation and ethical language use (a mixing sketch follows this list).
- Validated against the **Stanford AIR-Bench** and **CyberSecEval** benchmarks.
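A minimal sketch of how a 1:4 harmful-to-general SFT mix could be assembled. The function name and record structure are illustrative assumptions; the card does not describe Infosys's actual data pipeline.

```python
import random

def mix_safety_data(harmful, general, ratio=(1, 4), seed=0):
    """Build an SFT mix with ratio[0] harmful-prompt samples for every
    ratio[1] general samples, then shuffle for training.

    `harmful` holds safety-focused records (e.g. refusals, secure-coding
    fixes); `general` holds ordinary instruction-tuning records.
    Both names are hypothetical, for illustration only.
    """
    h, g = ratio
    n_blocks = min(len(harmful) // h, len(general) // g)
    mixed = harmful[: n_blocks * h] + general[: n_blocks * g]
    random.Random(seed).shuffle(mixed)
    return mixed

# Example: 100 safety records and 1,000 general records -> 100 + 400 samples.
mix = mix_safety_data([{"id": f"h{i}"} for i in range(100)],
                      [{"id": f"g{i}"} for i in range(1000)])
print(len(mix))  # 500
```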
---
## **Deployment & Future Work**

- **Quantization:** FP8 and AWQ for efficient inference; optimized with TensorRT-LLM.
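As a usage sketch (not from the card): loading the model with Hugging Face `transformers`. The repository id `Infosys/Mify-Coder-2.5B` is an assumption, and a production deployment would instead use the quantized TensorRT-LLM path described above.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hypothetical repository id; the card does not state the published name.
model_id = "Infosys/Mify-Coder-2.5B"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # BF16, matching the training precision above
    device_map="auto",
)

prompt = "Write a Python function that checks whether a string is a palindrome."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```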