dineth554 committed on
Commit 0d9979c · verified · 1 Parent(s): 53df4b6

Upload folder using huggingface_hub
.gitattributes CHANGED
@@ -33,3 +33,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
  *.zip filter=lfs diff=lfs merge=lfs -text
  *.zst filter=lfs diff=lfs merge=lfs -text
  *tfevents* filter=lfs diff=lfs merge=lfs -text
+ model_comparison_2026.png filter=lfs diff=lfs merge=lfs -text
README.md CHANGED
@@ -64,6 +64,93 @@ sagemaker:
  
  Over recent months, we have intensified our focus on developing foundation models that deliver exceptional utility and performance. Legion Coder represents a significant leap forward, integrating breakthroughs in code generation, architectural efficiency, and CPU-optimized inference to empower developers with unprecedented capability and efficiency.
  
+ ## Quick Deploy
+ 
+ Deploy Legion Coder 8M instantly using any of these methods:
+ 
+ ### Streamlit (Hugging Face Spaces)
+ ```bash
+ # Download and run locally
+ git clone https://huggingface.co/pnny13/legion-coder-8m
+ cd legion-coder-8m
+ pip install -r requirements.txt
+ streamlit run app.py
+ ```
+ 
+ **One-Click Deploy:**
+ 1. Go to [Hugging Face New Space](https://huggingface.co/new-space)
+ 2. Select "Streamlit" as SDK
+ 3. Upload `app.py` and `requirements.txt`
+ 4. Your Space is live!
+ 
+ ### Gradio (Local/Cloud)
+ ```bash
+ # Download and run locally
+ git clone https://huggingface.co/pnny13/legion-coder-8m
+ cd legion-coder-8m
+ pip install -r requirements_gradio.txt
+ python gradio_app.py
+ ```
+ 
+ **One-Click Deploy:**
+ 1. Go to [Hugging Face New Space](https://huggingface.co/new-space)
+ 2. Select "Gradio" as SDK
+ 3. Upload `gradio_app.py` and `requirements_gradio.txt`
+ 4. Your Space is live!
+ 
+ ### AWS SageMaker (Production)
+ ```python
+ import sagemaker
+ from sagemaker.huggingface import HuggingFaceModel
+ 
+ # HF_MODEL_ID lets the inference container pull the model from the Hub,
+ # so no model_data S3 archive is required
+ huggingface_model = HuggingFaceModel(
+     env={
+         "HF_MODEL_ID": "pnny13/legion-coder-8m",
+         "HF_TASK": "text-generation",
+     },
+     transformers_version="4.36.0",
+     pytorch_version="2.1.0",
+     py_version="py310",
+     role="YOUR_SAGEMAKER_ROLE",
+ )
+ 
+ predictor = huggingface_model.deploy(
+     initial_instance_count=1,
+     instance_type="ml.m5.large",
+     endpoint_name="legion-coder-8m"
+ )
+ ```
+ 
+ ---
+ 
+ ## Deploy This Model
+ 
+ <div align="center">
+ 
+ ### One-Click Deployment Options
+ 
+ [![Deploy to SageMaker](https://img.shields.io/badge/Deploy%20to-AWS%20SageMaker-FF9900?style=for-the-badge&logo=amazon-aws)](https://huggingface.co/pnny13/legion-coder-8m/deploy/sagemaker)
+ [![Deploy to Streamlit Space](https://img.shields.io/badge/Deploy%20to-Streamlit%20Space-FF4B4B?style=for-the-badge&logo=streamlit)](https://huggingface.co/new-space?template=pnny13/legion-coder-8m&sdk=streamlit)
+ [![Deploy to Gradio Space](https://img.shields.io/badge/Deploy%20to-Gradio%20Space-F97316?style=for-the-badge&logo=gradio)](https://huggingface.co/new-space?template=pnny13/legion-coder-8m&sdk=gradio)
+ 
+ </div>
+ 
+ ### Deployment Instructions
+ 
+ **AWS SageMaker:**
+ - Click the "Deploy to SageMaker" button above
+ - Configure your AWS credentials
+ - Select instance type (recommended: ml.m5.large)
+ - Deploy in one click
+ 
+ **Streamlit Space:**
+ - Click the "Deploy to Streamlit Space" button
+ - Select your Hugging Face account
+ - Name your space and choose "Streamlit" SDK
+ - Create Space
+ 
+ **Gradio Space:**
+ - Click the "Deploy to Gradio Space" button
+ - Select your Hugging Face account
+ - Name your space and choose "Gradio" SDK
+ - Create Space
+ 
  ## Legion Coder Highlights
  
  Legion Coder features the following enhancements:
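The SageMaker snippet above returns a `predictor`, but a live endpoint can also be called from any client able to send JSON. As a minimal sketch, the request body the text-generation container expects can be built like this (the `build_payload` helper and the prompt are illustrative, not part of the repo; the parameter names mirror this model card's inference settings):

```python
import json

# Hypothetical helper: builds the JSON body a text-generation endpoint expects.
def build_payload(prompt, temperature=0.8, top_p=0.95, top_k=50, max_new_tokens=200):
    return json.dumps({
        "inputs": prompt,
        "parameters": {
            "temperature": temperature,
            "top_p": top_p,
            "top_k": top_k,
            "max_new_tokens": max_new_tokens,
        },
    })

body = build_payload("Write a Python function to reverse a string:")
# The same body can be passed to boto3's sagemaker-runtime invoke_endpoint call.
```

The same payload shape works whether the call goes through `predictor.predict`, boto3, or plain HTTPS against the endpoint URL.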
README.yaml ADDED
@@ -0,0 +1,130 @@
+ # Model Card for Legion Coder 8M
+ # YAML Front Matter for Hugging Face Hub
+ 
+ base_model: dineth554/legion-coder-8m
+ library_name: transformers
+ license: mit
+ pipeline_tag: text-generation
+ language:
+   - en
+   - code
+ tags:
+   - transformers
+   - pytorch
+   - safetensors
+   - text-generation
+   - code-generation
+   - python
+   - javascript
+   - coding
+   - programming
+   - sagemaker
+   - amazon-sagemaker
+   - cpu
+   - compact
+   - efficient
+   - nvdya-kit
+   - death-legion
+ 
+ datasets:
+   - the-stack-v2
+ 
+ metrics:
+   - perplexity
+   - accuracy
+ 
+ model-index:
+   - name: Legion Coder 8M
+     results: []
+ 
+ inference:
+   parameters:
+     temperature: 0.8
+     top_p: 0.95
+     top_k: 50
+     max_new_tokens: 200
+ 
+ sagemaker:
+   sdk_version: "2.200.0"
+   instance_type: "ml.m5.large"
+   instance_count: 1
+   container_image: "huggingface-pytorch-inference:2.0.0-transformers4.28.1-cpu-py310-ubuntu20.04-v1.0"
+ 
+ # Model Details
+ model_details:
+   name: Legion Coder 8M
+   version: 1.0.0
+   description: A compact yet powerful 44M-parameter transformer model optimized for coding tasks
+   developer: DEATH LEGION
+   powered_by: nvdya-kit
+   architecture: GPT-style Transformer
+   parameters: 44,341,632
+   model_size: 170MB
+   hidden_size: 576
+   num_layers: 13
+   num_heads: 16
+   context_length: 1024
+   vocabulary_size: 16000
+   format: Safetensors
+   precision: float32
+ 
+ # Training Details
+ training_details:
+   optimizer: AdamW
+   learning_rate: 5e-4
+   lr_schedule: cosine_decay
+   batch_size: 4
+   gradient_accumulation: true
+   training_steps: 10000
+   precision: float32
+ 
+ # Intended Use
+ intended_use:
+   primary_use_cases:
+     - Code completion and generation
+     - Function generation from descriptions
+     - Debugging assistance
+     - Code explanation and documentation
+     - Programming concept explanations
+     - Code scaffolding and prototyping
+   target_users:
+     - Software developers
+     - Students learning to code
+     - Data scientists
+     - DevOps engineers
+     - Technical writers
+ 
+ # Limitations
+ limitations:
+   - Limited to a 1,024-token context window
+   - Trained primarily on Python code
+   - May generate code that requires review before production use
+   - Not suitable for non-coding tasks
+ 
+ # Ethical Considerations
+ ethical_considerations:
+   - Generated code should be reviewed before deployment
+   - May reproduce patterns from training data
+   - Not a replacement for human code review
+   - Users are responsible for compliance with licenses of generated code
+ 
+ # Citation
+ citation: |
+   @misc{legioncoder2026,
+     title={Legion Coder 8M: A Compact Transformer for Code Generation},
+     author={DEATH LEGION},
+     year={2026},
+     howpublished={\url{https://huggingface.co/dineth554/legion-coder-8m}}
+   }
+ 
+ # Contact
+ contact:
+   developer: DEATH LEGION
+   powered_by: nvdya-kit
+   repository: https://huggingface.co/dineth554/legion-coder-8m
+ 
+ # Branding
+ branding:
+   tagline: MADE BY DEATH LEGION
+   powered_by: nvdya-kit
+   copyright: 2026 DEATH LEGION. All rights reserved.
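The `inference.parameters` block in the YAML above is what downstream callers should treat as the model's default sampling settings. A minimal sketch of merging caller overrides onto those defaults with basic range checks (the `merge_params` helper and its bounds are assumptions for illustration, not part of the repo):

```python
# Defaults mirror the inference.parameters block in the YAML front matter.
DEFAULTS = {"temperature": 0.8, "top_p": 0.95, "top_k": 50, "max_new_tokens": 200}

def merge_params(overrides):
    """Merge caller overrides onto the model-card defaults, with sanity checks."""
    params = {**DEFAULTS, **overrides}
    if params["temperature"] <= 0:
        raise ValueError("temperature must be positive")
    if not 0 < params["top_p"] <= 1:
        raise ValueError("top_p must be in (0, 1]")
    if params["top_k"] < 0 or params["max_new_tokens"] < 1:
        raise ValueError("top_k must be >= 0 and max_new_tokens must be >= 1")
    return params
```

Validating once at the boundary keeps bad values out of every downstream `generate()` call, whatever client supplies them.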
model_comparison_2026.png ADDED

Git LFS Details

  • SHA256: e0d8539afd0c5481044d260f012973022c89bf4744219037e06d7778baf555a4
  • Pointer size: 131 Bytes
  • Size of remote file: 234 kB
model_comparison_2026.svg ADDED
sagemaker_deploy.py ADDED
@@ -0,0 +1,101 @@
+ """
+ Amazon SageMaker Deployment Script for Legion Coder 8M
+ 
+ This script demonstrates how to deploy the Legion Coder model to Amazon SageMaker
+ for production inference.
+ 
+ Requirements:
+     pip install sagemaker boto3
+ 
+ Usage:
+     python sagemaker_deploy.py
+ """
+ 
+ import sagemaker
+ from sagemaker.huggingface import HuggingFaceModel
+ 
+ # Configuration
+ ROLE_ARN = "arn:aws:iam::YOUR_ACCOUNT_ID:role/YOUR_SAGEMAKER_ROLE"
+ MODEL_ID = "dineth554/legion-coder-8m"
+ INSTANCE_TYPE = "ml.m5.large"
+ INSTANCE_COUNT = 1
+ ENDPOINT_NAME = "legion-coder-8m-endpoint"
+ 
+ 
+ def deploy_to_sagemaker():
+     """
+     Deploy Legion Coder 8M to Amazon SageMaker.
+ 
+     This creates a SageMaker endpoint with the model ready for inference.
+     """
+     # Initialize SageMaker session
+     sess = sagemaker.Session()
+ 
+     # Create Hugging Face Model. HF_MODEL_ID makes the container pull the
+     # model from the Hub, so no model_data S3 archive is needed.
+     huggingface_model = HuggingFaceModel(
+         transformers_version="4.36.0",
+         pytorch_version="2.1.0",
+         py_version="py310",
+         role=ROLE_ARN,
+         sagemaker_session=sess,
+         env={
+             "HF_MODEL_ID": MODEL_ID,
+             "HF_TASK": "text-generation",
+             "SAGEMAKER_CONTAINER_LOG_LEVEL": "20",
+             "SAGEMAKER_PROGRAM": "inference.py"
+         }
+     )
+ 
+     # Deploy to SageMaker
+     predictor = huggingface_model.deploy(
+         initial_instance_count=INSTANCE_COUNT,
+         instance_type=INSTANCE_TYPE,
+         endpoint_name=ENDPOINT_NAME
+     )
+ 
+     print("Model deployed successfully!")
+     print(f"Endpoint name: {ENDPOINT_NAME}")
+     print(f"Instance type: {INSTANCE_TYPE}")
+ 
+     return predictor
+ 
+ 
+ def test_endpoint(predictor):
+     """
+     Test the deployed endpoint with a sample prompt.
+     """
+     test_payload = {
+         "inputs": "Write a Python function to calculate fibonacci numbers:",
+         "parameters": {
+             "temperature": 0.8,
+             "top_p": 0.95,
+             "top_k": 50,
+             "max_new_tokens": 200
+         }
+     }
+ 
+     response = predictor.predict(test_payload)
+     print("Test response:", response)
+     return response
+ 
+ 
+ def cleanup_endpoint(predictor):
+     """
+     Clean up the SageMaker endpoint when done.
+     """
+     predictor.delete_endpoint()
+     print("Endpoint deleted successfully.")
+ 
+ 
+ if __name__ == "__main__":
+     # Deploy the model
+     print("Deploying Legion Coder 8M to SageMaker...")
+     predictor = deploy_to_sagemaker()
+ 
+     # Test the endpoint
+     print("\nTesting endpoint...")
+     test_endpoint(predictor)
+ 
+     # Uncomment to clean up
+     # cleanup_endpoint(predictor)
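The placeholder `ROLE_ARN` in this script will only fail after the SDK round-trips to AWS, so it can help to sanity-check the constants up front. A small sketch of such a pre-flight check (the `validate_config` helper and its patterns are assumptions for illustration, not part of the SageMaker API):

```python
import re

def validate_config(role_arn, instance_type, instance_count):
    """Return a list of problems with the deployment constants (empty if OK)."""
    errors = []
    # IAM role ARNs embed a 12-digit account id
    if not re.fullmatch(r"arn:aws:iam::\d{12}:role/.+", role_arn):
        errors.append("role_arn is not a valid IAM role ARN")
    # SageMaker instance types look like ml.<family>.<size>
    if not re.fullmatch(r"ml\.[a-z0-9]+\.[a-z0-9]+", instance_type):
        errors.append("instance_type should look like 'ml.m5.large'")
    if instance_count < 1:
        errors.append("instance_count must be >= 1")
    return errors

problems = validate_config(
    "arn:aws:iam::YOUR_ACCOUNT_ID:role/YOUR_SAGEMAKER_ROLE", "ml.m5.large", 1
)
# The unedited placeholder ARN is flagged before any AWS call is made.
```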
sagemaker_inference.py ADDED
@@ -0,0 +1,181 @@
+ """
+ SageMaker Inference Script for Legion Coder 8M
+ 
+ This script handles model loading and inference for Amazon SageMaker deployment.
+ It follows the SageMaker inference container contract.
+ """
+ 
+ import os
+ import json
+ import sys
+ 
+ import torch
+ 
+ # Add model code to path
+ sys.path.append('/opt/ml/model/code')
+ 
+ 
+ class LegionCoderModel(torch.nn.Module):
+     """Simplified model class for inference."""
+ 
+     def __init__(self, vocab_size=16000, d_model=576, num_layers=13, num_heads=16,
+                  d_ff=1152, max_seq_len=1024, dropout=0.1, pad_token_id=0):
+         super().__init__()
+         self.vocab_size = vocab_size
+         self.d_model = d_model
+         self.max_seq_len = max_seq_len
+         self.pad_token_id = pad_token_id
+         self.token_embedding = torch.nn.Embedding(vocab_size, d_model)
+         self.position_embedding = torch.nn.Embedding(max_seq_len, d_model)
+         self.blocks = torch.nn.ModuleList(
+             [self._create_block(d_model, num_heads, d_ff, dropout) for _ in range(num_layers)]
+         )
+         self.norm = torch.nn.LayerNorm(d_model)
+         self.lm_head = torch.nn.Linear(d_model, vocab_size, bias=False)
+         self.lm_head.weight = self.token_embedding.weight  # weight tying
+         self.dropout = torch.nn.Dropout(dropout)
+ 
+     def _create_block(self, d_model, num_heads, d_ff, dropout):
+         """Create a transformer block."""
+         from model import TransformerBlock
+         return TransformerBlock(d_model, num_heads, d_ff, dropout)
+ 
+     def forward(self, input_ids, attention_mask=None, labels=None):
+         batch_size, seq_len = input_ids.shape
+         device = input_ids.device
+         positions = torch.arange(0, seq_len, device=device).unsqueeze(0).expand(batch_size, -1)
+         token_embeds = self.token_embedding(input_ids)
+         pos_embeds = self.position_embedding(positions)
+         x = self.dropout(token_embeds + pos_embeds)
+ 
+         # Create causal mask (True where attention is allowed)
+         mask = torch.triu(torch.ones(seq_len, seq_len, device=device), diagonal=1)
+         causal_mask = mask == 0
+ 
+         if attention_mask is not None:
+             # Cast to bool so the bitwise AND with the causal mask is well-defined
+             attention_mask = attention_mask.bool().unsqueeze(1).unsqueeze(2)
+             causal_mask = causal_mask.unsqueeze(0).unsqueeze(0) & attention_mask
+ 
+         for block in self.blocks:
+             x = block(x, causal_mask)
+ 
+         x = self.norm(x)
+         logits = self.lm_head(x)
+ 
+         loss = None
+         if labels is not None:
+             shift_logits = logits[..., :-1, :].contiguous()
+             shift_labels = labels[..., 1:].contiguous()
+             loss_fct = torch.nn.CrossEntropyLoss(ignore_index=-100)
+             loss = loss_fct(shift_logits.view(-1, self.vocab_size), shift_labels.view(-1))
+ 
+         return {'logits': logits, 'loss': loss}
+ 
+     def generate(self, input_ids, max_length=100, temperature=1.0, top_k=50,
+                  top_p=0.95, pad_token_id=0, eos_token_id=2):
+         self.eval()
+ 
+         with torch.no_grad():
+             for _ in range(max_length):
+                 # Truncate to the model's context window
+                 if input_ids.shape[1] > self.max_seq_len:
+                     input_ids = input_ids[:, -self.max_seq_len:]
+ 
+                 outputs = self.forward(input_ids)
+                 logits = outputs['logits']
+                 next_token_logits = logits[:, -1, :] / temperature
+ 
+                 # Top-k filtering
+                 if top_k > 0:
+                     indices_to_remove = next_token_logits < torch.topk(next_token_logits, top_k)[0][..., -1, None]
+                     next_token_logits[indices_to_remove] = float('-inf')
+ 
+                 # Nucleus (top-p) filtering
+                 if top_p < 1.0:
+                     sorted_logits, sorted_indices = torch.sort(next_token_logits, descending=True)
+                     cumulative_probs = torch.cumsum(torch.nn.functional.softmax(sorted_logits, dim=-1), dim=-1)
+                     sorted_indices_to_remove = cumulative_probs > top_p
+                     # Shift right so the first token above the threshold is kept
+                     sorted_indices_to_remove[..., 1:] = sorted_indices_to_remove[..., :-1].clone()
+                     sorted_indices_to_remove[..., 0] = 0
+                     indices_to_remove = sorted_indices_to_remove.scatter(1, sorted_indices, sorted_indices_to_remove)
+                     next_token_logits[indices_to_remove] = float('-inf')
+ 
+                 probs = torch.nn.functional.softmax(next_token_logits, dim=-1)
+                 next_token = torch.multinomial(probs, num_samples=1)
+                 input_ids = torch.cat([input_ids, next_token], dim=1)
+ 
+                 if (next_token == eos_token_id).all():
+                     break
+ 
+         return input_ids
+ 
+ 
+ # SageMaker inference functions
+ def model_fn(model_dir):
+     """Load the model for inference."""
+     print(f"Loading model from {model_dir}")
+ 
+     # Load config
+     with open(os.path.join(model_dir, 'config.json'), 'r') as f:
+         config = json.load(f)
+ 
+     # Create model
+     model = LegionCoderModel(
+         vocab_size=config.get('vocab_size', 16000),
+         d_model=config.get('d_model', 576),
+         num_layers=config.get('num_layers', 13),
+         num_heads=config.get('num_heads', 16),
+         d_ff=config.get('d_ff', 1152),
+         max_seq_len=config.get('max_seq_len', 1024),
+         dropout=config.get('dropout', 0.1),
+         pad_token_id=config.get('pad_token_id', 0)
+     )
+ 
+     # Load weights
+     from safetensors.torch import load_file
+     state_dict = load_file(os.path.join(model_dir, 'model.safetensors'))
+     model.load_state_dict(state_dict, strict=False)
+     model.eval()
+ 
+     print("Model loaded successfully!")
+     return model
+ 
+ 
+ def input_fn(request_body, request_content_type):
+     """Parse input data."""
+     if request_content_type == 'application/json':
+         return json.loads(request_body)
+     raise ValueError(f"Unsupported content type: {request_content_type}")
+ 
+ 
+ def predict_fn(input_data, model):
+     """Make prediction."""
+     # Get input text and sampling parameters
+     text = input_data.get('inputs', '')
+     parameters = input_data.get('parameters', {})
+ 
+     # Placeholder: a real deployment would run the tokenizer and
+     # model.generate() here with the requested parameters
+     return {
+         'generated_text': f"Generated response for: {text[:50]}...",
+         'parameters': parameters
+     }
+ 
+ 
+ def output_fn(prediction, response_content_type):
+     """Format output."""
+     if response_content_type == 'application/json':
+         return json.dumps(prediction), response_content_type
+     raise ValueError(f"Unsupported content type: {response_content_type}")
+ 
+ 
+ if __name__ == "__main__":
+     # Test local inference
+     print("Testing SageMaker inference script...")
+     print("This script is designed to run within a SageMaker container.")
+     print("For local testing, use the Streamlit app or direct model loading.")
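The nucleus (top-p) filtering step in `generate()` above is the least obvious part of the sampling loop. A dependency-free sketch of the same idea on a plain list of logits (the `top_p_filter` helper is illustrative; the real code operates on torch tensors):

```python
import math

def top_p_filter(logits, top_p=0.95):
    """Return the indices kept by nucleus filtering: the smallest set of
    highest-probability tokens whose cumulative probability exceeds top_p."""
    order = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)
    exps = [math.exp(x) for x in logits]
    total = sum(exps)
    kept, cum = [], 0.0
    for i in order:
        kept.append(i)           # the top-ranked token is always kept
        cum += exps[i] / total
        if cum > top_p:
            break
    return kept
```

With a sharply peaked distribution almost all mass sits on one token, so only that token survives; with a flat distribution more indices are kept. This is the mechanism by which top-p adapts the candidate pool per step, which a fixed top-k cannot.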