| | --- |
| | library_name: transformers |
| | license: mit |
| | pipeline_tag: text-generation |
| | language: |
| | - en |
| | - code |
| | tags: |
| | - transformers |
| | - pytorch |
| | - safetensors |
| | - text-generation |
| | - code-generation |
| | - python |
| | - javascript |
| | - coding |
| | - programming |
| | - sagemaker |
| | - amazon-sagemaker |
| | - cpu |
| | - compact |
| | - efficient |
| | - nvdya-kit |
| | - death-legion |
| | - vllm |
| | - sglang |
| | - llama-cpp |
| | - ollama |
| | - lm-studio |
| | - year-2026 |
| | - next-gen |
| |
|
| | datasets: |
| | - the-stack-v2 |
| |
|
| | metrics: |
| | - perplexity |
| | - accuracy |
| |
|
| | model-index: |
| | - name: Legion Coder 8M 2026 |
| | results: [] |
| |
|
| | inference: |
| | parameters: |
| | temperature: 0.8 |
| | top_p: 0.95 |
| | top_k: 50 |
| | max_new_tokens: 200 |
| |
|
| | sagemaker: |
| | sdk_version: "2.200.0" |
| | instance_type: "ml.m5.large" |
| | instance_count: 1 |
| | container_image: "huggingface-pytorch-inference:2.0.0-transformers4.28.1-cpu-py310-ubuntu20.04-v1.0" |
| | --- |
| | |
| | # Legion Coder 8M 2026 |
| |
|
| | **A 44M Parameter Transformer for Code Generation - 2026 Edition** |
| |
|
| | [Model on Hugging Face](https://huggingface.co/dineth554/legion-coder-8m) |
| |
|
| | ## Quick Links |
| |
|
| | <div align="center"> |
| |
|
| | ### Libraries and Frameworks |
| |
|
| | [Transformers](https://huggingface.co/docs/transformers) |
| | [PyTorch](https://pytorch.org/) |
| | [Safetensors](https://github.com/huggingface/safetensors) |
| |
|
| | ### Local Apps and Inference Engines |
| |
|
| | [vLLM](https://docs.vllm.ai/) |
| | [SGLang](https://sgl-project.github.io/) |
| | [llama.cpp](https://github.com/ggerganov/llama.cpp) |
| | [Ollama](https://ollama.ai/) |
| | [LM Studio](https://lmstudio.ai/) |
| |
|
| | ### Notebooks and Cloud |
| |
|
| | [Open in Colab](https://colab.research.google.com/github/dineth554/legion-coder-8m/blob/main/notebooks/legion_coder_demo.ipynb) |
| | [Open in Kaggle](https://kaggle.com/kernels/welcome?src=https://github.com/dineth554/legion-coder-8m/blob/main/notebooks/legion_coder_demo.ipynb) |
| |
|
| | </div> |
| |
|
| | ## About |
| |
|
| | Legion Coder 8M 2026 is a compact 44M parameter transformer model for coding tasks, built by **DEATH LEGION** with **nvdya-kit**. It targets code generation on CPU-only hardware in a lightweight (~170MB) package. |
| |
|
| | **2026 Edition Features:** |
| | - Enhanced performance optimizations |
| | - Updated documentation and branding |
| | - Professional icon-based UI |
| | - Advanced CSS animations |
| | - Performance comparison charts |
| |
|
| | ## Features |
| |
|
| | - **Clean Code Generation** - PEP 8 compliant Python and more |
| | - **Debug Assistance** - Help identify and fix code issues |
| | - **Code Explanation** - Understand complex programming concepts |
| | - **Multi-language Support** - Python, JavaScript, and more |
| | - **Fast Inference** - Optimized for CPU deployment |
| | - **SageMaker Ready** - One-click AWS deployment |
| | - **Template Ready** - Duplicate this space to create your own |
| |
|
| | ## Model Specifications 2026 |
| |
|
| | | Attribute | Value | |
| | |-----------|-------| |
| | | **Parameters** | 44,341,632 (~44M) | |
| | | **Model Size** | ~170MB | |
| | | **Architecture** | GPT-style Transformer | |
| | | **Hidden Size** | 576 | |
| | | **Layers** | 13 | |
| | | **Attention Heads** | 16 | |
| | | **Context Length** | 1,024 tokens | |
| | | **Vocabulary** | 16,000 tokens | |
| | | **Format** | Safetensors | |
| | | **Edition** | 2026 | |
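
The listed model size can be sanity-checked from the parameter count. The sketch below assumes the weights are stored as float32 (4 bytes per parameter), which matches the training precision given under Technical Details but is not read from the checkpoint itself.

```python
# Rough size estimate from the parameter count in the specification table.
# Assumption: weights are stored as float32 (4 bytes per parameter).
PARAMETERS = 44_341_632
BYTES_PER_PARAM = 4

size_bytes = PARAMETERS * BYTES_PER_PARAM
size_mib = size_bytes / 1024**2  # binary megabytes (MiB)
size_mb = size_bytes / 1000**2   # decimal megabytes (MB)

print(f"{size_bytes:,} bytes = {size_mib:.0f} MiB = {size_mb:.0f} MB")
```

Both figures land close to the ~170MB in the table; the file on disk also carries a small amount of metadata.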
| |
|
| | ## Model Comparison 2026 |
| |
|
| | | Model | Parameters | Size | Efficiency Score | Best For | |
| | |-------|------------|------|------------------|----------| |
| | | **Legion Coder 8M** | 44M | ~170MB | 9.5/10 | Code generation, CPU inference | |
| | | TinyLlama-1.1B | 1.1B | ~2.2GB | 6.0/10 | General text, GPU required | |
| | | Qwen2.5-0.5B | 500M | ~1.0GB | 7.0/10 | Multilingual, GPU recommended | |
| | | CodeLlama-7B | 7B | ~13GB | 5.0/10 | Production code, GPU required | |
| | | Phi-2 | 2.7B | ~5.3GB | 6.5/10 | Reasoning, GPU required | |
| |
|
| | **Efficiency Score** = (Parameter Efficiency x Memory Efficiency x Speed) / 3 |
| |
|
| | Size comparison for Legion Coder 8M 2026, based on the approximate on-disk sizes in the table above: |
| | - **About 76x smaller** than CodeLlama-7B (~170MB vs. ~13GB) |
| | - **About 13x smaller** than TinyLlama-1.1B (~170MB vs. ~2.2GB) |
| | - **About 6x smaller** than Qwen2.5-0.5B (~170MB vs. ~1.0GB) |
| | - Runs entirely on CPU with 8GB RAM |
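
The size ratios can be recomputed from the Size column of the comparison table. This sketch uses the table's rounded figures and treats 1GB as 1,000MB, so the results are approximate; the listed sizes may also correspond to different weight precisions across models.

```python
# Approximate on-disk sizes in MB, copied from the comparison table.
# Assumption: 1GB is treated as 1,000MB.
sizes_mb = {
    "Legion Coder 8M": 170,
    "TinyLlama-1.1B": 2_200,
    "Qwen2.5-0.5B": 1_000,
    "CodeLlama-7B": 13_000,
    "Phi-2": 5_300,
}

baseline = sizes_mb["Legion Coder 8M"]

# How many times larger each model is than Legion Coder 8M.
ratios = {
    name: size / baseline
    for name, size in sizes_mb.items()
    if name != "Legion Coder 8M"
}

for name, ratio in sorted(ratios.items(), key=lambda item: item[1]):
    print(f"{name}: about {ratio:.0f}x larger on disk")
```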
| |
|
| | ## Amazon SageMaker Deployment |
| |
|
| | The model can be deployed to Amazon SageMaker either through the one-click deploy link or with the SageMaker Python SDK. |
| |
|
| | ### Deploy to AWS SageMaker |
| |
|
| | [Deploy on Amazon SageMaker](https://huggingface.co/dineth554/legion-coder-8m/deploy/sagemaker) |
| |
|
| | ### Using the SageMaker Python SDK |
| |
|
| | ```python |
| | import sagemaker |
| | from sagemaker.huggingface import HuggingFaceModel |
| | |
| | # Initialize SageMaker session |
| | sess = sagemaker.Session() |
| | |
| | # Tell the Hugging Face inference container which Hub model to load. |
| | # (model_data expects an S3 URI to a model.tar.gz, not a Hub model ID.) |
| | hub = { |
| |     "HF_MODEL_ID": "dineth554/legion-coder-8m", |
| |     "HF_TASK": "text-generation", |
| | } |
| | |
| | # Create Hugging Face Model |
| | huggingface_model = HuggingFaceModel( |
| |     env=hub, |
| |     transformers_version="4.36.0", |
| |     pytorch_version="2.1.0", |
| |     py_version="py310", |
| |     role="arn:aws:iam::YOUR_ACCOUNT_ID:role/YOUR_SAGEMAKER_ROLE", |
| |     sagemaker_session=sess, |
| | ) |
| | |
| | # Deploy to SageMaker |
| | predictor = huggingface_model.deploy( |
| |     initial_instance_count=1, |
| |     instance_type="ml.m5.large", |
| |     endpoint_name="legion-coder-8m-endpoint", |
| | ) |
| | |
| | # Test the endpoint |
| | result = predictor.predict({ |
| |     "inputs": "Write a Python function to calculate fibonacci numbers:", |
| |     "parameters": { |
| |         "temperature": 0.8, |
| |         "max_new_tokens": 200, |
| |     }, |
| | }) |
| | |
| | print(result) |
| | |
| | # Delete the endpoint when finished to stop incurring charges: |
| | # predictor.delete_endpoint() |
| | ``` |
| |
|
| | ### SageMaker Inference Script |
| |
|
| | The `sagemaker_inference.py` file in this repository provides the inference handler for SageMaker deployment. |
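
For orientation, a handler for the Hugging Face inference container typically overrides `model_fn` and `predict_fn`. The following is a hypothetical minimal sketch, not the contents of `sagemaker_inference.py`: the default parameter values come from this card's front matter, while the function bodies and the `DEFAULT_PARAMETERS` name are assumptions made for illustration.

```python
# Hypothetical handler sketch; the real sagemaker_inference.py may differ.

# Defaults taken from the inference parameters in this card's front matter.
DEFAULT_PARAMETERS = {
    "do_sample": True,
    "temperature": 0.8,
    "top_p": 0.95,
    "top_k": 50,
    "max_new_tokens": 200,
}


def model_fn(model_dir):
    """Load a text-generation pipeline from the unpacked model directory."""
    # Imported lazily so the module can be imported without transformers.
    from transformers import pipeline

    return pipeline("text-generation", model=model_dir, tokenizer=model_dir)


def predict_fn(data, model):
    """Run generation, letting request parameters override the defaults."""
    prompt = data["inputs"]
    parameters = {**DEFAULT_PARAMETERS, **data.get("parameters", {})}
    return model(prompt, **parameters)
```

Merging request parameters over the defaults keeps the endpoint usable with a bare `{"inputs": ...}` payload while still honoring per-request overrides.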
| |
|
| | ## Local Inference with vLLM |
| |
|
| | ```python |
| | from vllm import LLM, SamplingParams |
| | |
| | # Load model with vLLM |
| | llm = LLM(model="dineth554/legion-coder-8m") |
| | |
| | # Set sampling parameters |
| | sampling_params = SamplingParams( |
| |     temperature=0.8, |
| |     top_p=0.95, |
| |     max_tokens=200, |
| | ) |
| | |
| | # Generate code |
| | prompt = "Write a Python function to calculate fibonacci numbers:" |
| | outputs = llm.generate(prompt, sampling_params) |
| | print(outputs[0].outputs[0].text) |
| | ``` |
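
To give a feel for what `temperature` and `top_p` do, here is a simplified pure-Python illustration of temperature scaling followed by nucleus (top-p) filtering on a toy distribution. It is a sketch of the general technique, not vLLM's implementation, and the helper names and toy logits are made up for the example.

```python
import math


def apply_temperature(logits, temperature):
    """Softmax over logits divided by the temperature (lower = sharper)."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def top_p_filter(probs, top_p):
    """Keep the smallest set of most likely tokens whose mass reaches top_p."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for index in order:
        kept.append(index)
        cumulative += probs[index]
        if cumulative >= top_p:
            break
    kept_mass = sum(probs[i] for i in kept)
    # Renormalize so the surviving probabilities sum to 1.
    return {i: probs[i] / kept_mass for i in kept}


toy_logits = [2.0, 1.0, 0.5, -1.0]  # hypothetical scores for four tokens
probs = apply_temperature(toy_logits, temperature=0.8)
print(top_p_filter(probs, top_p=0.95))
```

A temperature below 1 concentrates probability on the highest-scoring tokens, and top-p then discards the low-probability tail before sampling.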
| |
|
| | ## Local Inference with SGLang |
| |
|
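| | ```python |
| | import sglang as sgl |
| | |
| | # Connect to a running SGLang server. Assumption: the server was started |
| | # separately for this model and listens on port 30000, e.g. |
| | #   python -m sglang.launch_server --model-path dineth554/legion-coder-8m --port 30000 |
| | sgl.set_default_backend(sgl.RuntimeEndpoint("http://localhost:30000")) |
| | |
| | # Define prompt template |
| | @sgl.function |
| | def code_gen(s, prompt): |
| |     s += sgl.system("You are a helpful coding assistant.") |
| |     s += sgl.user(prompt) |
| |     s += sgl.assistant(sgl.gen("code", max_tokens=200)) |
| | |
| | # Run inference |
| | result = code_gen.run( |
| |     prompt="Write a Python function to calculate fibonacci numbers:", |
| |     temperature=0.8, |
| | ) |
| | print(result["code"]) |
| | ``` |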
| |
|
| | ## Technical Details |
| |
|
| | ### Training Data |
| | - Python code from The Stack v2 dataset |
| | - GitHub code repositories (filtered for quality) |
| | - Code-specific preprocessing for indentation and special tokens |
| |
|
| | ### Training Procedure |
| | - **Optimizer:** AdamW |
| | - **Learning Rate:** 5e-4 with cosine decay |
| | - **Batch Size:** 4 with gradient accumulation |
| | - **Training Steps:** 10,000 |
| | - **Precision:** float32 (CPU-optimized) |
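
The learning-rate schedule above can be sketched as a plain cosine decay from the peak rate over the full run. The peak rate (5e-4) and step count (10,000) come from the list; the absence of warmup and a final rate of zero are assumptions, since the card does not state them.

```python
import math

PEAK_LR = 5e-4        # from the training procedure list
TOTAL_STEPS = 10_000  # from the training procedure list
MIN_LR = 0.0          # assumption: the card does not state a floor


def cosine_lr(step, peak_lr=PEAK_LR, total_steps=TOTAL_STEPS, min_lr=MIN_LR):
    """Cosine decay from peak_lr at step 0 to min_lr at total_steps."""
    # Clamp so out-of-range steps map to the start or end of the schedule.
    progress = min(max(step, 0), total_steps) / total_steps
    return min_lr + 0.5 * (peak_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


for step in (0, 2_500, 5_000, 7_500, 10_000):
    print(f"step {step:>6}: lr = {cosine_lr(step):.2e}")
```

The rate starts at the peak, passes through half the peak at the midpoint, and reaches the floor at the final step.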
| |
|
| | ## License |
| |
|
| | This model is released under the **MIT License**. |
| |
|
| | ## Links |
| |
|
| | - **Model Repository:** [dineth554/legion-coder-8m](https://huggingface.co/dineth554/legion-coder-8m) |
| | - **Live Demo:** [Hugging Face Space](https://huggingface.co/spaces/dineth554/legion-coder-8m) |
| |
|
| | <div align="center"> |
| |
|
| | ### MADE BY DEATH LEGION |
| |
|
| | **Powered by nvdya-kit** |
| |
|
| | *© 2026 DEATH LEGION. All rights reserved.* |
| |
|
| | </div> |
| |
|