---
library_name: transformers
license: apache-2.0
license_link: https://huggingface.co/pnny13/legion-coder-8m/blob/main/LICENSE
pipeline_tag: text-generation
language:
- en
- code
tags:
- transformers
- pytorch
- safetensors
- text-generation
- code-generation
- python
- javascript
- coding
- programming
- sagemaker
- amazon-sagemaker
- cpu
- compact
- efficient
- nvdya-kit
- death-legion
- vllm
- sglang
- llama-cpp
- ollama
- lm-studio
- year-2026
- next-gen
datasets:
- the-stack-v2
metrics:
- perplexity
- accuracy
model-index:
- name: Legion Coder 8M 2026
  results: []
inference:
  parameters:
    temperature: 0.8
    top_p: 0.95
    top_k: 50
    max_new_tokens: 200
sagemaker:
  sdk_version: "2.200.0"
  instance_type: "ml.m5.large"
  instance_count: 1
  container_image: "huggingface-pytorch-inference:2.0.0-transformers4.28.1-cpu-py310-ubuntu20.04-v1.0"
---

# Legion Coder 8M

<img width="400px" src="https://img.shields.io/badge/LEGION-CODER-ff0040?style=for-the-badge">

[Live Demo](https://huggingface.co/spaces/dineth554/legion-coder-8m)

> [!Note]
> This repository contains model weights and configuration files for the Legion Coder 8M model in the Hugging Face Transformers format.
>
> These artifacts are compatible with Hugging Face Transformers, vLLM, SGLang, KTransformers, etc.

Over recent months, we have intensified our focus on developing foundation models that deliver exceptional utility and performance. Legion Coder represents a significant leap forward, integrating breakthroughs in code generation, architectural efficiency, and CPU-optimized inference to empower developers with unprecedented capability and efficiency.

## Quick Deploy

Deploy Legion Coder 8M instantly using any of these methods:

### Streamlit (Hugging Face Spaces)

```bash
# Download and run locally
git clone https://huggingface.co/pnny13/legion-coder-8m
cd legion-coder-8m
pip install -r requirements.txt
streamlit run app.py
```

**One-Click Deploy:**
1. Go to [Hugging Face New Space](https://huggingface.co/new-space)
2. Select "Streamlit" as SDK
3. Upload `app.py` and `requirements.txt`
4. Your Space is live!

### Gradio (Local/Cloud)

```bash
# Download and run locally
git clone https://huggingface.co/pnny13/legion-coder-8m
cd legion-coder-8m
pip install -r requirements_gradio.txt
python gradio_app.py
```

**One-Click Deploy:**
1. Go to [Hugging Face New Space](https://huggingface.co/new-space)
2. Select "Gradio" as SDK
3. Upload `gradio_app.py` and `requirements_gradio.txt`

### AWS SageMaker (Production)

```python
import sagemaker
from sagemaker.huggingface import HuggingFaceModel

# model_data expects an S3 URI to a model.tar.gz archive, so the Hub
# repository is passed through the container environment instead.
hub = {
    "HF_MODEL_ID": "pnny13/legion-coder-8m",
    "HF_TASK": "text-generation",
}

huggingface_model = HuggingFaceModel(
    env=hub,
    transformers_version="4.36.0",
    pytorch_version="2.1.0",
    py_version="py310",
    role="YOUR_SAGEMAKER_ROLE",  # replace with your SageMaker execution role
)

predictor = huggingface_model.deploy(
    initial_instance_count=1,
    instance_type="ml.m5.large",
    endpoint_name="legion-coder-8m",
)
```

---

## Deploy This Model

<div align="center">

### One-Click Deployment Options

[Deploy to SageMaker](https://huggingface.co/pnny13/legion-coder-8m/deploy/sagemaker)
[Deploy to Streamlit Space](https://huggingface.co/new-space?template=pnny13/legion-coder-8m&sdk=streamlit)
[Deploy to Gradio Space](https://huggingface.co/new-space?template=pnny13/legion-coder-8m&sdk=gradio)

</div>

### Deployment Instructions

**AWS SageMaker:**
- Click the "Deploy to SageMaker" button above
- Configure your AWS credentials
- Select instance type (recommended: ml.m5.large)
- Deploy in one click

**Streamlit Space:**
- Click the "Deploy to Streamlit Space" button
- Select your Hugging Face account
- Name your space and choose "Streamlit" SDK
- Create Space

**Gradio Space:**
- Click the "Deploy to Gradio Space" button
- Select your Hugging Face account
- Name your space and choose "Gradio" SDK
- Create Space

## Legion Coder Highlights

Legion Coder features the following enhancements:

- **Unified Code Generation Foundation**: Early training on curated code datasets achieves cross-generational parity with larger models across Python, JavaScript, and multi-language benchmarks.

- **Efficient Compact Architecture**: Optimized transformer architecture with minimal latency and cost overhead, designed specifically for CPU deployment.

- **Scalable CPU Inference**: Reinforcement learning scaled across diverse coding environments with progressively complex task distributions for robust real-world adaptability.

- **Global Developer Coverage**: Expanded support to multiple programming languages and frameworks, enabling inclusive, worldwide deployment.

- **Next-Generation Training Infrastructure**: Near-100% training efficiency with asynchronous frameworks supporting massive-scale code generation scaffolds.

## Model Overview

- Type: Causal Language Model
- Training Stage: Pre-training & Post-training
- Number of Parameters: 44,341,632 (~44M)
- Hidden Dimension: 576
- Number of Layers: 13
- Attention Heads: 16
- Vocabulary: 16,000 tokens (token embedding and LM output are both sized 16,000)
- Context Length: 1,024 tokens natively
- Format: Safetensors

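As a rough cross-check, the parameter count above can be converted into a weight footprint. The sketch below assumes the weights are stored as float32 (4 bytes each), which is the precision named under Training Procedure, and it ignores Safetensors header overhead. The helper name `weight_footprint_bytes` is illustrative only.

```python
# Parameter count from the Model Overview list.
NUM_PARAMETERS = 44_341_632
# Assumption: float32 storage, 4 bytes per parameter.
BYTES_PER_PARAMETER = 4


def weight_footprint_bytes(num_parameters: int, bytes_per_parameter: int) -> int:
    """Return the raw size of the weights in bytes, excluding file metadata."""
    return num_parameters * bytes_per_parameter


size_bytes = weight_footprint_bytes(NUM_PARAMETERS, BYTES_PER_PARAMETER)
print(f"{size_bytes / 1e6:.1f} MB")     # decimal megabytes
print(f"{size_bytes / 2**20:.1f} MiB")  # binary mebibytes
```

Under these assumptions the raw weights come to roughly 177 MB (about 169 MiB), which is in line with the ~170MB model size listed in the benchmark table.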
## Benchmark Results

### Code Generation

<div style="font-family:-apple-system,BlinkMacSystemFont,'Segoe UI',Roboto,sans-serif;max-width:1000px;margin:0 auto;padding:16px 0">
<table style="width:100%;border-collapse:collapse;font-size:13px">
<thead><tr>
<th style="padding:10px 7px;text-align:left;font-weight:600;border-bottom:2px solid #ff0040;color:#ff0040"></th><th style="padding:10px 7px;text-align:center;font-weight:500;border-bottom:2px solid #ff0040;color:#ff0040;font-size: 14px;">Legion Coder 8M</th><th style="padding:10px 7px;text-align:center;font-weight:500;border-bottom:2px solid #ff0040;color:#ff0040;font-size: 14px;">TinyLlama-1.1B</th><th style="padding:10px 7px;text-align:center;font-weight:500;border-bottom:2px solid #ff0040;color:#ff0040;font-size: 14px;">Qwen2.5-0.5B</th><th style="padding:10px 7px;text-align:center;font-weight:500;border-bottom:2px solid #ff0040;color:#ff0040;font-size: 14px;">CodeLlama-7B</th><th style="padding:10px 7px;text-align:center;font-weight:500;border-bottom:2px solid #ff0040;color:#ff0040;font-size: 14px;">Phi-2</th></tr></thead>
<tbody>
<tr><td colspan="6" style="padding:8px 12px;font-weight:600;color:#ff0040;border-bottom:1px solid rgba(255, 0, 64, 0.2);background:rgba(255, 0, 64, 0.1)">Efficiency Metrics</td></tr>
<tr>
<td style="padding:7px 7px;padding-left:20px;border-bottom:1px solid rgba(128, 128, 128, 0.15);">Model Size</td>
<td style="padding:7px 7px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">~170MB</td>
<td style="padding:7px 7px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">~2.2GB</td>
<td style="padding:7px 7px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">~1.0GB</td>
<td style="padding:7px 7px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">~13GB</td>
<td style="padding:7px 7px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">~5.3GB</td>
</tr>
<tr>
<td style="padding:7px 7px;padding-left:20px;border-bottom:1px solid rgba(128, 128, 128, 0.15);">Parameters</td>
<td style="padding:7px 7px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">44M</td>
<td style="padding:7px 7px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">1.1B</td>
<td style="padding:7px 7px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">500M</td>
<td style="padding:7px 7px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">7B</td>
<td style="padding:7px 7px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">2.7B</td>
</tr>
<tr>
<td style="padding:7px 7px;padding-left:20px;border-bottom:1px solid rgba(128, 128, 128, 0.15);">CPU Compatible</td>
<td style="padding:7px 7px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">Yes</td>
<td style="padding:7px 7px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">No</td>
<td style="padding:7px 7px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">Limited</td>
<td style="padding:7px 7px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">No</td>
<td style="padding:7px 7px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">No</td>
</tr>
<tr>
<td style="padding:7px 7px;padding-left:20px;border-bottom:1px solid rgba(128, 128, 128, 0.15);">Efficiency Score</td>
<td style="padding:7px 7px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">9.5/10</td>
<td style="padding:7px 7px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">6.0/10</td>
<td style="padding:7px 7px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">7.0/10</td>
<td style="padding:7px 7px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">5.0/10</td>
<td style="padding:7px 7px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">6.5/10</td>
</tr>
</tbody>
</table>
<p style="margin-top:12px;font-size:11px;opacity:0.7">
* Efficiency Score = (Parameter Efficiency x Memory Efficiency x Speed) / 3<br>
* Legion Coder 8M achieves exceptional efficiency through compact architecture optimized for CPU deployment.
</p>
</div>

## Amazon SageMaker Deployment

This model is ready for deployment on Amazon SageMaker with one-click deployment support.

### Deploy to AWS SageMaker

[Deploy to SageMaker](https://huggingface.co/pnny13/legion-coder-8m/deploy/sagemaker)

### Using the SageMaker Python SDK

```python
import sagemaker
from sagemaker.huggingface import HuggingFaceModel

# Initialize SageMaker session
sess = sagemaker.Session()

# Point the inference container at the Hub repository.
# (model_data is reserved for an S3 URI to a model.tar.gz archive.)
hub = {
    "HF_MODEL_ID": "pnny13/legion-coder-8m",
    "HF_TASK": "text-generation",
}

# Create Hugging Face Model
huggingface_model = HuggingFaceModel(
    env=hub,
    transformers_version="4.36.0",
    pytorch_version="2.1.0",
    py_version="py310",
    role="arn:aws:iam::YOUR_ACCOUNT_ID:role/YOUR_SAGEMAKER_ROLE",
    sagemaker_session=sess,
)

# Deploy to SageMaker
predictor = huggingface_model.deploy(
    initial_instance_count=1,
    instance_type="ml.m5.large",
    endpoint_name="legion-coder-8m-endpoint",
)

# Test the endpoint
result = predictor.predict({
    "inputs": "Write a Python function to calculate fibonacci numbers:",
    "parameters": {
        "temperature": 0.8,
        "max_new_tokens": 200,
    },
})

print(result)
```

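Clients that do not use the SageMaker SDK can call a deployed endpoint through the `sagemaker-runtime` client in `boto3`. The following is a minimal sketch, assuming the endpoint name `legion-coder-8m-endpoint` and the same request shape (`inputs` plus `parameters`) as the SDK example. `build_payload` and `invoke` are hypothetical helper names, and AWS credentials and region are assumed to be configured in the environment.

```python
import json


def build_payload(prompt: str, temperature: float = 0.8, max_new_tokens: int = 200) -> bytes:
    """Serialize a request in the same shape used with predictor.predict."""
    body = {
        "inputs": prompt,
        "parameters": {"temperature": temperature, "max_new_tokens": max_new_tokens},
    }
    return json.dumps(body).encode("utf-8")


def invoke(prompt: str, endpoint_name: str = "legion-coder-8m-endpoint"):
    """Send one request to the endpoint and return the decoded JSON response."""
    # Imported lazily so build_payload can be used without AWS dependencies.
    import boto3

    client = boto3.client("sagemaker-runtime")
    response = client.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="application/json",
        Body=build_payload(prompt),
    )
    return json.loads(response["Body"].read())
```

Calling `invoke("Write a Python function to calculate fibonacci numbers:")` should return the same kind of result as `predictor.predict`. When the endpoint is no longer needed, `predictor.delete_endpoint()` removes it so the instance stops accruing charges.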
## Local Inference with vLLM

```python
from vllm import LLM, SamplingParams

# Load model with vLLM
llm = LLM(model="pnny13/legion-coder-8m")

# Set sampling parameters
sampling_params = SamplingParams(
    temperature=0.8,
    top_p=0.95,
    max_tokens=200,
)

# Generate code
prompt = "Write a Python function to calculate fibonacci numbers:"
outputs = llm.generate(prompt, sampling_params)
print(outputs[0].outputs[0].text)
```

## Local Inference with SGLang

```python
import sglang as sgl

# Connect to a running SGLang server. This assumes one was started
# separately and is listening on port 30000, for example with:
#   python -m sglang.launch_server --model-path pnny13/legion-coder-8m --port 30000
sgl.set_default_backend(sgl.RuntimeEndpoint("http://localhost:30000"))

# Define prompt template
@sgl.function
def code_gen(s, prompt):
    s += sgl.system("You are a helpful coding assistant.")
    s += sgl.user(prompt)
    s += sgl.assistant(sgl.gen("code", max_tokens=200))

# Run inference
result = code_gen.run(
    prompt="Write a Python function to calculate fibonacci numbers:",
    temperature=0.8,
)
print(result["code"])
```

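## Local Inference with Transformers

The card lists Hugging Face Transformers as a compatible runtime and sets default sampling parameters in its metadata (temperature 0.8, top_p 0.95, top_k 50, max_new_tokens 200). A minimal sketch of plain Transformers inference with those settings might look like the following. It assumes the repository loads through `AutoTokenizer` and `AutoModelForCausalLM` without custom code; if the architecture is custom, `trust_remote_code=True` may be required. The function names are illustrative.

```python
def generation_kwargs() -> dict:
    """Sampling defaults taken from the model card metadata."""
    return {
        "do_sample": True,
        "temperature": 0.8,
        "top_p": 0.95,
        "top_k": 50,
        "max_new_tokens": 200,
    }


def generate(prompt: str, model_id: str = "pnny13/legion-coder-8m") -> str:
    """Load the model on CPU and return the prompt plus its sampled continuation."""
    # Imported here so generation_kwargs can be used without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, **generation_kwargs())
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Because the native context length is 1,024 tokens, the prompt and the 200 new tokens together need to fit within that window.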
## Technical Details

### Training Data
- Python code from The Stack v2 dataset
- GitHub code repositories (filtered for quality)
- Code-specific preprocessing for indentation and special tokens

### Training Procedure
- **Optimizer:** AdamW
- **Learning Rate:** 5e-4 with cosine decay
- **Batch Size:** 4 with gradient accumulation
- **Training Steps:** 10,000
- **Precision:** float32 (CPU-optimized)

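The learning-rate schedule listed above can be illustrated with a short sketch. It assumes a plain cosine decay from the peak rate of 5e-4 down to zero over the 10,000 training steps, with no warmup. The card does not state a warmup phase or a minimum rate, so both are assumptions, and `cosine_lr` is a hypothetical helper rather than the actual training code.

```python
import math

PEAK_LR = 5e-4        # from the Training Procedure list
TOTAL_STEPS = 10_000  # from the Training Procedure list


def cosine_lr(step: int, peak_lr: float = PEAK_LR, total_steps: int = TOTAL_STEPS) -> float:
    """Cosine decay from peak_lr at step 0 to 0.0 at total_steps (no warmup assumed)."""
    # Clamp the step so values outside the schedule stay at the endpoints.
    progress = min(max(step, 0), total_steps) / total_steps
    return 0.5 * peak_lr * (1.0 + math.cos(math.pi * progress))


for step in (0, 2_500, 5_000, 7_500, 10_000):
    print(step, f"{cosine_lr(step):.2e}")
```

With these assumptions the rate starts at 5e-4, passes through half of that at the midpoint, and reaches zero at the final step.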
## License

This model is released under the **Apache 2.0 License**.

## Links

- **Model Repository:** [pnny13/legion-coder-8m](https://huggingface.co/pnny13/legion-coder-8m)
- **Live Demo:** [Hugging Face Space](https://huggingface.co/spaces/dineth554/legion-coder-8m)

<div align="center">

### MADE BY DEATH LEGION

**Powered by nvdya-kit**

*2026 DEATH LEGION. All rights reserved.*

</div>