dineth554 committed · verified · commit 7c5aef2 · parent(s): 1964634

Upload README.md with huggingface_hub

Files changed (1): README.md (+157 −13)
---
language:
- en
tags:
- code
- coding
- python
- programming
- text-generation
- causal-lm
- transformer
- gpt
- legion-coder
- code-generation
- code-completion
license: mit
datasets:
- the-stack-v2
- codeparrot/github-code
- bigcode/the-stack
model-index:
- name: Legion Coder 8M
  results: []
---
# Legion Coder 8M

A compact yet powerful 44M-parameter transformer model optimized for coding tasks. Legion Coder is designed to generate clean, efficient, and well-documented code while maintaining a footprint small enough for local deployment.
## Model Details

- **Architecture**: GPT-style transformer with pre-normalization
- **Parameters**: 44,341,632 (~44M)
- **Vocabulary Size**: 16,000 (BPE tokenizer optimized for code)
- **Hidden Size (d_model)**: 576
- **Layers**: 13
- **Attention Heads**: 16
- **Feed-forward Dimension**: 1,152
- **Context Length**: 1,024 tokens
- **Format**: Safetensors
- **Precision**: float32

## Model Specifications

| Attribute | Value |
|-----------|-------|
| Model Type | Causal Language Model |
| Architecture | Transformer Decoder |
| Parameters | 44,341,632 |
| Hidden Size | 576 |
| Num Layers | 13 |
| Num Attention Heads | 16 |
| Intermediate Size | 1,152 |
| Max Position Embeddings | 1,024 |
| Vocab Size | 16,000 |
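As a sanity check, the figures in the table can be roughly reproduced by counting weights for a standard GPT-style decoder. The sketch below assumes learned position embeddings, tied input/output embeddings, and biased linear layers; exact totals shift slightly with bias and weight-tying conventions, so it lands near, not exactly on, the reported count.

```python
# Rough parameter count for the configuration in the table above.
# Assumes: learned position embeddings, tied input/output embeddings,
# biased linear layers, two LayerNorms per block plus a final one.
vocab, d_model, n_layers, d_ff, max_pos = 16_000, 576, 13, 1_152, 1_024

embeddings = vocab * d_model + max_pos * d_model         # token + position tables
attn_per_layer = 4 * (d_model * d_model + d_model)       # Q, K, V, output projections
ffn_per_layer = d_model * d_ff + d_ff + d_ff * d_model + d_model
norms_per_layer = 2 * 2 * d_model                        # two LayerNorms (scale + bias)
per_layer = attn_per_layer + ffn_per_layer + norms_per_layer

total = embeddings + n_layers * per_layer + 2 * d_model  # + final LayerNorm
print(f"{total:,}")  # ~44.4M, close to the reported 44,341,632
```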

## Intended Use

This model is designed for:

- **Code Generation**: Generate Python and other programming-language code
- **Code Completion**: Complete partial code snippets
- **Code Explanation**: Provide explanations for code functionality
- **Debugging Assistance**: Help identify and fix code issues
- **Educational Purposes**: Learn programming concepts through examples

## Usage

### Loading the Model

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# Load model and tokenizer (AutoModelForCausalLM is needed for .generate())
model = AutoModelForCausalLM.from_pretrained("pnny13/legion-coder-8m", trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained("pnny13/legion-coder-8m", trust_remote_code=True)

# Set to eval mode for inference
model.eval()
```

### Generating Code

```python
# Prepare the prompt
prompt = "# Write a function to calculate factorial\ndef factorial(n):"
inputs = tokenizer(prompt, return_tensors="pt")

# Generate (do_sample=True is required for temperature/top_p/top_k to take effect)
with torch.no_grad():
    outputs = model.generate(
        inputs.input_ids,
        max_length=200,
        do_sample=True,
        temperature=0.8,
        top_p=0.95,
        top_k=50,
    )

# Decode the generated tokens back to text
generated_code = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(generated_code)
```

## System Prompt

For optimal results, use the following system prompt:

```
You are Legion Coder, an expert coding assistant. Your purpose is to help users write clean, efficient, and well-documented code.

Guidelines:
- Write code that follows best practices and PEP 8 style guidelines
- Include helpful comments explaining complex logic
- Provide complete, runnable code examples
- Explain your approach before showing code when helpful
- If asked to debug, identify the issue and provide the corrected code

Always wrap code blocks in triple backticks with the appropriate language identifier.
```
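The card does not document a chat template, so one minimal way to apply this system prompt to a base causal LM is plain string concatenation before tokenization. This is a sketch under that assumption; `build_prompt` and the `# Task:` framing are illustrative, not part of the model's API.

```python
# Sketch: prepend the system prompt to a user request for this base
# causal LM. Plain concatenation is an assumption -- the model card does
# not document a chat template -- and build_prompt is an illustrative helper.
SYSTEM_PROMPT = (
    "You are Legion Coder, an expert coding assistant. Your purpose is to "
    "help users write clean, efficient, and well-documented code."
)

def build_prompt(user_request: str) -> str:
    """Combine the system prompt and the user's request into one string."""
    return f"{SYSTEM_PROMPT}\n\n# Task: {user_request}\n"

prompt = build_prompt("Write a function to calculate factorial")
print(prompt)
```

The resulting string can then be tokenized and passed to `model.generate` exactly as in the Usage section.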

## Training Details

### Training Data

- Python code from The Stack v2 dataset
- GitHub code repositories (filtered for quality)
- Code-specific preprocessing to handle indentation and special tokens

### Training Procedure

- Optimizer: AdamW
- Learning Rate: 5e-4 with cosine decay
- Batch Size: 4 with gradient accumulation
- Training Steps: 10,000
- Mixed Precision: No (CPU-optimized)
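The learning-rate schedule above can be sketched in plain Python. The decay floor (`MIN_LR = 0.0`) and the absence of a warmup phase are assumptions, since the card specifies only the 5e-4 peak, cosine decay, and 10,000 steps.

```python
import math

# Sketch of the schedule described above: 5e-4 peak, cosine decay over
# 10,000 steps. MIN_LR = 0.0 and no warmup are assumptions; the card
# does not specify either.
MAX_LR = 5e-4
MIN_LR = 0.0
TOTAL_STEPS = 10_000

def cosine_lr(step: int) -> float:
    """Cosine-decayed learning rate at a given optimizer step."""
    progress = min(step, TOTAL_STEPS) / TOTAL_STEPS
    return MIN_LR + 0.5 * (MAX_LR - MIN_LR) * (1 + math.cos(math.pi * progress))

print(cosine_lr(0))            # peak, 0.0005
print(cosine_lr(TOTAL_STEPS))  # decays to the assumed floor
```

With a batch size of 4 and gradient accumulation, the optimizer steps counted here are taken once per accumulation cycle, not once per micro-batch.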

## Limitations

- **Context Length**: Limited to 1,024 tokens
- **Language Support**: Primarily optimized for Python
- **Model Size**: 44M parameters may not capture all programming patterns
- **Training Data**: May reflect biases present in training code
- **No Internet Access**: Cannot access external APIs or documentation
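Because of the 1,024-token window, long prompts must be shortened before generation. One common approach, an assumption here rather than something the card prescribes, is to keep only the most recent tokens:

```python
# Sketch: keep only the most recent token ids so a prompt fits the
# 1,024-token context window. Left-truncation is an assumed strategy;
# truncate_left is an illustrative helper, not part of the model's API.
CONTEXT_LENGTH = 1_024

def truncate_left(token_ids: list[int], max_len: int = CONTEXT_LENGTH) -> list[int]:
    """Drop the oldest tokens, keeping the last max_len ids."""
    return token_ids[-max_len:] if len(token_ids) > max_len else token_ids

ids = list(range(2_000))
print(len(truncate_left(ids)))  # 1024
print(truncate_left(ids)[0])    # 976  (the first 976 ids were dropped)
```

The same effect is available at tokenization time via `tokenizer(prompt, truncation=True, max_length=1024)`.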

## Ethical Considerations

- Generated code should be reviewed before production use
- The model may reproduce patterns from its training data; verify licensing
- Do not use the model to generate malicious code
- Consider the environmental impact of model inference

## Citation

If you use this model in your research, please cite:

```bibtex
@misc{legioncoder2024,
  title={Legion Coder 8M: A Compact Transformer for Code Generation},
  author={Legion Coder Team},
  year={2024},
  howpublished={\url{https://huggingface.co/pnny13/legion-coder-8m}}
}
```

## License

This model is released under the MIT License.

## Contact

For questions or issues, please open an issue on the Hugging Face model repository.

---

**Model Version**: 1.0.0
**Last Updated**: 2024-03-08
**Hugging Face Hub**: https://huggingface.co/pnny13/legion-coder-8m