Maincoder-1B is a code-focused language model optimized for code generation and completion tasks. The model achieves strong performance on coding benchmarks while maintaining a compact size suitable for local deployment.

Key Features

  • Code Generation: Optimized for Python code completion and generation tasks.
  • Compact Size: 1 billion parameters, lightweight enough to run on consumer hardware.
  • Deep Architecture: Modern transformer architecture with rotary position embeddings (RoPE), grouped-query attention, QK normalization, and a high depth-to-width ratio.
  • Advanced Data Mixing: Pre-trained and mid-trained on custom data mixes developed for high-performance coding.
  • MCPO Algorithm: Fine-tuned with MCPO, a specialized reinforcement-learning policy-optimization algorithm that improves training stability and accelerates convergence.
  • SOTA Performance: State-of-the-art performance on the Python coding benchmarks HumanEval, HumanEval+, and MBPP+.

Benchmark Results

Benchmark Performance Across Baseline LLMs

| Model | HumanEval | HumanEval+ | MBPP+ | MMLU | GSM8K |
|---|---|---|---|---|---|
| Maincode/Maincoder-1B | 0.7622 | 0.7256 | 0.7090 | 0.3054 | 0.2976 |
| deepseek-ai/deepseek-coder-1.3b-instruct | 0.5610 | 0.5305 | 0.6217 | 0.2705 | 0.0413 |
| HuggingFaceTB/SmolLM3-3B | 0.5366 | 0.5000 | 0.6799 | 0.5928 | 0.5505 |
| Qwen/Qwen2.5-Coder-1.5B-Instruct | 0.4634 | 0.4451 | 0.6561 | 0.4984 | 0.4944 |
| Qwen/Qwen3-1.7B | 0.4024 | 0.3780 | 0.5582 | 0.5571 | 0.6865 |

Model Overview

Maincoder uses a modern transformer decoder architecture with:

  • Rotary Position Embeddings: RoPE with theta = 1,000,000.
  • RMSNorm: Pre-normalization for stable training.
  • Grouped Query Attention: 4:1 ratio of query to key-value heads.
  • QK Normalization: RMSNorm applied to attention queries and keys.
  • SwiGLU MLP: Gated linear units with SiLU activation.

| Attribute | Value |
|---|---|
| Parameters | 1B |
| Hidden Size | 1536 |
| Layers | 32 |
| Attention Heads | 16 (4 KV heads) |
| Head Dimension | 96 |
| Vocabulary Size | 151,936 |
| Context Length | 2,048 |
| Precision | bfloat16 |
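
The sketch below is a minimal, illustrative PyTorch module matching the attention hyperparameters in the table above (hidden size 1536, 16 query heads, 4 KV heads, head dimension 96). It is our own reconstruction for clarity, not the model's actual implementation, and it omits the RoPE application:

import torch
import torch.nn as nn
import torch.nn.functional as F

class GroupedQueryAttention(nn.Module):
    """Grouped-query attention with QK RMSNorm, sized per the table above."""

    def __init__(self, hidden_size=1536, n_heads=16, n_kv_heads=4, head_dim=96):
        super().__init__()
        self.n_heads = n_heads
        self.n_kv_heads = n_kv_heads
        self.head_dim = head_dim
        self.q_proj = nn.Linear(hidden_size, n_heads * head_dim, bias=False)
        self.k_proj = nn.Linear(hidden_size, n_kv_heads * head_dim, bias=False)
        self.v_proj = nn.Linear(hidden_size, n_kv_heads * head_dim, bias=False)
        self.o_proj = nn.Linear(n_heads * head_dim, hidden_size, bias=False)
        # QK normalization: RMSNorm over the head dimension of queries and keys
        # (nn.RMSNorm requires PyTorch >= 2.4).
        self.q_norm = nn.RMSNorm(head_dim)
        self.k_norm = nn.RMSNorm(head_dim)

    def forward(self, x):
        B, T, _ = x.shape
        q = self.q_proj(x).view(B, T, self.n_heads, self.head_dim).transpose(1, 2)
        k = self.k_proj(x).view(B, T, self.n_kv_heads, self.head_dim).transpose(1, 2)
        v = self.v_proj(x).view(B, T, self.n_kv_heads, self.head_dim).transpose(1, 2)
        q, k = self.q_norm(q), self.k_norm(k)
        # 4:1 GQA: each key/value head serves 4 query heads.
        k = k.repeat_interleave(self.n_heads // self.n_kv_heads, dim=1)
        v = v.repeat_interleave(self.n_heads // self.n_kv_heads, dim=1)
        out = F.scaled_dot_product_attention(q, k, v, is_causal=True)
        return self.o_proj(out.transpose(1, 2).reshape(B, T, -1))

x = torch.randn(1, 8, 1536)
print(GroupedQueryAttention()(x).shape)  # torch.Size([1, 8, 1536])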

Usage

Installation

pip install transformers torch

Quick Start

from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained(
    "Maincode/Maincoder-1B",
    torch_dtype="auto",
    device_map="auto",
    trust_remote_code=True,
)
tokenizer = AutoTokenizer.from_pretrained(
    "Maincode/Maincoder-1B",
    trust_remote_code=True,
)

# Code completion example
prompt = '''def fibonacci(n: int) -> int:
    """Return the n-th Fibonacci number."""
'''

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=256,
    temperature=0.2,
    do_sample=True,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
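
Note that generate returns the prompt tokens followed by the completion, and the model may keep writing past the end of the function. One simple post-processing heuristic (our own suggestion, not something the model requires) is to decode only the new tokens and stop at the first unindented line:

# Decode only the newly generated tokens.
new_tokens = outputs[0][inputs["input_ids"].shape[1]:]
completion = tokenizer.decode(new_tokens, skip_special_tokens=True)

# Keep the function body up to the first top-level (unindented) line.
body = []
for line in completion.splitlines():
    if line and not line[0].isspace() and body:
        break
    body.append(line)
print(prompt + "\n".join(body))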

Code Completion

# Function completion
prompt = '''def quicksort(arr: list) -> list:
    """Sort a list using the quicksort algorithm."""
'''

# Class completion
prompt = '''class BinarySearchTree:
    """A binary search tree implementation."""
    
    def __init__(self):
'''

# Algorithm implementation
prompt = '''def dijkstra(graph: dict, start: str, end: str) -> tuple:
    """Find the shortest path using Dijkstra's algorithm.
    
    Args:
        graph: Adjacency list representation of the graph
        start: Starting node
        end: Target node
    
    Returns:
        Tuple of (distance, path)
    """
'''
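
A small helper like the one below (a hypothetical convenience wrapper around the quick-start code, not part of the model's API) makes it easy to try each of these prompts:

def complete(prompt: str, max_new_tokens: int = 256) -> str:
    """Generate a completion using the model/tokenizer from the quick start."""
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    outputs = model.generate(
        **inputs,
        max_new_tokens=max_new_tokens,
        temperature=0.2,
        do_sample=True,
    )
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

print(complete(prompt))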

Additional Notes

Reproducibility

Model evaluations were run on 8 AMD MI355X GPUs using EleutherAI's lm-evaluation-harness (lm_eval), with the following Docker invocation:

docker run --rm -it \
  --device=/dev/kfd --device=/dev/dri --group-add=video \
  --ipc=host --security-opt seccomp=unconfined \
  -v $(pwd):/workspace -w /workspace \
  -e HF_TOKEN \
  -e PYTHONHASHSEED=0 \
  -e TORCH_DETERMINISTIC=1 \
  -e ROCBLAS_ATOMICS_MODE="0" \
  -e MIOPEN_FIND_MODE="1" \
  -e CUBLAS_WORKSPACE_CONFIG=":4096:8" \
  -e HF_ALLOW_CODE_EVAL="1" \
  rocm/pytorch:rocm7.1.1_ubuntu24.04_py3.12_pytorch_release_2.9.1 \
  bash -c 'pip install "lm_eval[hf]" && \
  accelerate launch -m lm_eval \
  --model hf --model_args "pretrained=Maincode/Maincoder-1B,trust_remote_code=True,dtype=float32" \
  --tasks humaneval,humaneval_plus,mbpp_plus,mmlu,gsm8k \
  --device cuda:0 --batch_size 32 --seed 42 \
  --confirm_run_unsafe_code'
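
The same evaluation can also be driven from Python through lm_eval's simple_evaluate API. The sketch below assumes a recent lm_eval release (argument names may vary between versions):

import os
import lm_eval

# Required by the Hugging Face code_eval metric behind HumanEval/MBPP.
os.environ["HF_ALLOW_CODE_EVAL"] = "1"

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=Maincode/Maincoder-1B,trust_remote_code=True,dtype=float32",
    tasks=["humaneval", "humaneval_plus", "mbpp_plus"],
    batch_size=32,
    confirm_run_unsafe_code=True,  # mirrors --confirm_run_unsafe_code on the CLI
)
print(results["results"])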

Limitations

  • Context length limited to 2,048 tokens
  • Primarily optimized for Python; performance may vary on other languages
  • May generate code with bugs or security issues; always review generated code

Disclaimer: This model has not undergone any alignment or safety tuning (e.g., RLHF/RLAIF, DPO, or safety fine-tuning). Outputs may be unsafe or biased. Please use appropriate safeguards and evaluate carefully for your use case.

License

This model is released under the Apache 2.0 License.

Citation

@misc{maincoder2025,
  title        = {Maincoder-1B: A High-Performance 1B Parameter Coding Model},
  author       = {Maincode Team},
  year         = {2025},
  organization = {Maincode},
  howpublished = {\url{https://huggingface.co/Maincode/Maincoder-1B}}
}

Contact

For questions, issues, or collaboration inquiries, please visit Maincode.
