likhonsheikh committed · Commit d569902 · verified · 1 Parent(s): 15ab53e

Add README.md

# Sheikh-2.5-Coder

**A lightweight 3B-parameter code-focused language model inspired by the MiniMax-M2 architecture, optimized for efficient on-device deployment.**

## Model Description

Sheikh-2.5-Coder is a 3-billion-parameter transformer model designed for code generation and programming assistance. Inspired by the efficient architecture of MiniMax-M2, it delivers strong code-generation performance while remaining practical for on-device deployment.

### Key Features

- **3B Parameters**: Balances efficiency and capability
- **Code-Focused Training**: Trained on diverse programming languages and code patterns
- **On-Device Ready**: Quantized variants available for mobile and edge deployment
- **Multi-Language Support**: Handles multiple programming languages
- **Chat Capabilities**: Instruction-tuned for conversational coding assistance
- **Efficient Architecture**: Inspired by MiniMax-M2's efficiency principles

### Performance Highlights

- Competitive with models up to 2.5× its size
- Memory usage optimized for mobile deployment
- Inference fast enough for real-time applications
- Strong results on code-generation benchmarks

## Model Variants

- **Base Model**: Full precision, for research and development
- **8-bit Quantized**: Balanced performance and memory usage
- **4-bit Quantized**: Maximum efficiency for edge devices
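As a back-of-the-envelope check on which variant fits a given device, the weight footprint can be estimated from the parameter count alone (3.09B, per the Technical Specifications below). This sketch deliberately ignores quantization overhead (scales, zero-points) and runtime memory for activations and the KV cache:

```python
# Rough weight-storage estimate per variant, from parameter count alone.
# Ignores quantization metadata and all runtime (activation/KV) memory.
PARAMS = 3.09e9  # total parameters, from the model card

def weight_gib(bits_per_param: float) -> float:
    """Approximate weight storage in GiB at a given precision."""
    return PARAMS * bits_per_param / 8 / 2**30

for name, bits in [("bf16 base", 16), ("8-bit", 8), ("4-bit", 4)]:
    print(f"{name}: ~{weight_gib(bits):.1f} GiB")
# bf16 base: ~5.8 GiB, 8-bit: ~2.9 GiB, 4-bit: ~1.4 GiB
```

In practice the quantized files are slightly larger than this because of per-group scale factors, but the estimate is a useful first filter for device RAM budgets.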

## Usage

### Installation

```bash
pip install transformers torch
```

### Basic Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# Load the model and tokenizer
model_name = "your-username/sheikh-2.5-coder"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    device_map="auto"
)

# Generate code; do_sample=True is required for temperature to take effect
prompt = "Write a function to calculate the factorial of a number:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=200, do_sample=True, temperature=0.1)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

### Chat Usage

```python
# For conversational interaction
messages = [
    {"role": "user", "content": "Help me write a Python function to sort a list"}
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(inputs, max_new_tokens=200, do_sample=True, temperature=0.1)
# Decode only the newly generated tokens, skipping the prompt
response = tokenizer.decode(outputs[0][len(inputs[0]):], skip_special_tokens=True)
print(response)
```
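For intuition about what `apply_chat_template` does, the sketch below renders a message list into a flat prompt string. The `<|user|>`/`<|assistant|>` tags are hypothetical placeholders for illustration only; the model's real template and special tokens ship with its tokenizer config:

```python
# Hypothetical illustration of chat-template rendering.
# The tag names below are NOT the model's actual special tokens;
# the real template is defined in the tokenizer configuration.
def render_chat(messages, add_generation_prompt=True):
    prompt = ""
    for msg in messages:
        # Each turn becomes a role tag followed by its content
        prompt += f"<|{msg['role']}|>\n{msg['content']}\n"
    if add_generation_prompt:
        # Open an assistant turn so the model continues from here
        prompt += "<|assistant|>\n"
    return prompt

messages = [{"role": "user", "content": "Help me write a Python function to sort a list"}]
print(render_chat(messages))
```

The `add_generation_prompt=True` flag in the snippet above corresponds to the same argument in the real API: it appends the opening of an assistant turn so generation starts in the right position.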

## Technical Specifications

- **Parameters**: 3.09B (2.77B non-embedding)
- **Context Length**: 32,768 tokens
- **Architecture**: Transformer with attention optimizations
- **Training Data**: Diverse programming languages and code-comment pairs
- **Optimization**: Quantization-ready for on-device deployment
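At the full 32,768-token context, the KV cache can rival the weights in memory. The estimate below is a sketch only: the layer count, KV head count, and head dimension are illustrative assumptions, since the model card does not publish them:

```python
# Rough per-sequence KV-cache estimate at the full context length.
# LAYERS, KV_HEADS, and HEAD_DIM are ASSUMED values for illustration;
# only SEQ_LEN comes from the model card.
LAYERS = 28        # assumed
KV_HEADS = 8       # assumed (grouped-query attention)
HEAD_DIM = 128     # assumed
SEQ_LEN = 32_768   # from the spec
BYTES = 2          # bf16 per element

# Factor of 2 covers both the K and the V tensors per layer
kv_bytes = 2 * LAYERS * KV_HEADS * HEAD_DIM * SEQ_LEN * BYTES
print(f"~{kv_bytes / 2**30:.1f} GiB per sequence")  # ~3.5 GiB under these assumptions
```

Under these assumed dimensions a single full-context sequence needs roughly 3.5 GiB of cache on top of the weights, which is why quantized weights alone may not be enough for long-context use on small devices.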

## Benchmarks

*Performance metrics will be added after training completion.*

## Deployment

### CPU Inference

```python
# Full-precision weights on CPU; expect slower generation than on GPU
model = AutoModelForCausalLM.from_pretrained(
    "your-username/sheikh-2.5-coder",
    torch_dtype=torch.float32,
    device_map="cpu"
)
```

### Mobile Deployment

For mobile deployment, use the quantized variants:

- 8-bit quantized model for a balance of speed and accuracy
- 4-bit quantized model for maximum efficiency

## License

[License information to be added]

## Contributing

We welcome contributions! Please see our contributing guidelines for details.

## Citation

```bibtex
@misc{sheikh2024sheikh25coder,
  title={Sheikh-2.5-Coder: Efficient On-Device Code Generation Model},
  author={Author Name},
  year={2024}
}
```

## Acknowledgments

- Inspired by the MiniMax-M2 architecture
- Trained on diverse code datasets
- Built with modern transformer optimizations

---

**Note**: This is a research model. For production use, please test performance thoroughly and consider safety implications.