---
license: apache-2.0
---
# MistralXGo-7B: Fine-Tuned Model for Go Code Generation

![License](https://img.shields.io/badge/license-Apache%202.0-blue.svg)
![Hugging Face](https://img.shields.io/badge/HuggingFace-Model-orange.svg)

This repository contains **MistralXGo-7B**, a fine-tuned version of the [Mistral-7B](https://huggingface.co/mistralai/Mistral-7B-v0.1) language model, optimized for generating Go code from natural-language comments. The model was trained with **LoRA (Low-Rank Adaptation)**, making it lightweight and efficient to deploy.

---

## Table of Contents
1. [Overview](#overview)
2. [Model Details](#model-details)
3. [Usage](#usage)
4. [Training Details](#training-details)
5. [Dataset](#dataset)
6. [Evaluation](#evaluation)
7. [Limitations](#limitations)
8. [Contributing](#contributing)
9. [Citation](#citation)
10. [License](#license)

---

## Overview

The goal of this project is to build a specialized language model that generates Go code from natural-language comments. For example, given a comment like:

```go
// Basic routing with Chi.
```

the model generates the corresponding Go code:

```go
package main

import (
    "net/http"
    "github.com/go-chi/chi/v5"
)

func main() {
    r := chi.NewRouter()
    r.Get("/", func(w http.ResponseWriter, r *http.Request) {
        w.Write([]byte("Hello, World!"))
    })
    http.ListenAndServe(":3000", r)
}
```

This model is particularly useful for developers who want to quickly prototype Go applications or explore code-generation capabilities.

---

## Model Details

- **Base Model:** [Mistral-7B](https://huggingface.co/mistralai/Mistral-7B-v0.1)
- **Fine-Tuning Method:** LoRA (Low-Rank Adaptation)
- **Quantization:** 4-bit quantization for memory efficiency
- **Max Sequence Length:** 2048 tokens
- **Precision:** Mixed precision (`bf16` or `fp16`, depending on hardware)

The model is hosted on the Hugging Face Hub at [MistralXGo-7B](https://huggingface.co/devadigaprathamesh/MistralXGo-7B).
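The 4-bit quantization mentioned above can be reproduced at load time with a `BitsAndBytesConfig`. The following is a minimal sketch, not the exact configuration used for training; the `nf4` quantization type and compute dtype are assumptions, and it requires `bitsandbytes` and a CUDA GPU:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# Assumed 4-bit quantization settings (requires bitsandbytes and a GPU)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,  # use torch.float16 on pre-Ampere GPUs
)

model = AutoModelForCausalLM.from_pretrained(
    "devadigaprathamesh/MistralXGo-7B",
    quantization_config=bnb_config,
    device_map="auto",
)
```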
---

## Usage

### Installation

Install the required libraries:

```bash
pip install transformers torch
```

### Inference

Load the model and tokenizer from the Hugging Face Hub and generate Go code:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the model and tokenizer
model_name = "devadigaprathamesh/MistralXGo-7B"
model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto")
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Generate Go code from a comment prompt
prompt = "// Basic routing with Chi."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)

# Decode and print the output
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
### Example Output

Input:
```go
// Basic routing with Chi.
```

Output:
```go
package main

import (
    "net/http"
    "github.com/go-chi/chi/v5"
)

func main() {
    r := chi.NewRouter()
    r.Get("/", func(w http.ResponseWriter, r *http.Request) {
        w.Write([]byte("Hello, World!"))
    })
    http.ListenAndServe(":3000", r)
}
```
---

## Training Details

### Dataset
- **Source:** A custom dataset of Go code snippets paired with descriptive comments.
- **Size:** ~1,620 examples (90% training, 10% testing).
- **Preprocessing:** Comments and code were formatted into instruction-output pairs.
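The preprocessing and split described above can be sketched as follows; the prompt template and field names are hypothetical illustrations, not the exact format used in training:

```python
import random

def format_example(instruction: str, code: str) -> str:
    """Format one comment/code pair as a single training text.
    The template below is a hypothetical illustration."""
    return f"// {instruction}\n{code}"

# Two toy records standing in for the ~1,620 real examples
examples = [
    {"instruction": "Basic routing with Chi.", "code": "package main\n// ..."},
    {"instruction": "Read a file line by line.", "code": "package main\n// ..."},
]

# 90/10 train/test split, as described above
random.seed(0)
random.shuffle(examples)
split = int(0.9 * len(examples))
train_set, test_set = examples[:split], examples[split:]
train_texts = [format_example(e["instruction"], e["code"]) for e in train_set]
```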
### Training Configuration
- **Epochs:** 10
- **Batch Size:** Effective batch size of 8 (`per_device_train_batch_size=2`, `gradient_accumulation_steps=4`).
- **Learning Rate:** 2e-4
- **Optimizer:** AdamW (8-bit)
- **Mixed Precision:** `bf16` (Ampere+ GPUs) or `fp16` (older GPUs).
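As a quick sanity check on these numbers, assuming the ~1,620 examples and 90/10 split described above:

```python
import math

num_examples = 1620
train_size = int(num_examples * 0.9)   # 1458 training examples
effective_batch = 2 * 4                # per-device batch * gradient accumulation
steps_per_epoch = math.ceil(train_size / effective_batch)
total_steps = steps_per_epoch * 10     # 10 epochs

print(train_size, effective_batch, steps_per_epoch, total_steps)
# 1458 8 183 1830
```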
---

## Dataset

The dataset used for training consists of Go code snippets paired with descriptive comments. Each example is stored as a JSON instruction-output pair, for example:

**Instruction:**
Basic routing with Chi.

**Code Example:**

```go
package main

import (
    "net/http"
    "github.com/go-chi/chi/v5"
)

func main() {
    r := chi.NewRouter()
    r.Get("/", func(w http.ResponseWriter, r *http.Request) {
        w.Write([]byte("Hello, World!"))
    })
    http.ListenAndServe(":3000", r)
}
```

The dataset is available here: [Gofiles](data/combined.go).
172
+
173
+ ---
174
+
175
+ ## Evaluation
176
+
177
+ The model was evaluated qualitatively by generating code for various comments and verifying correctness. While no formal quantitative metrics were used, the model demonstrates strong performance in generating syntactically correct and semantically relevant Go code.
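These qualitative checks can be partially automated with cheap syntactic heuristics. The sketch below is illustrative only and is no substitute for actually compiling the generated code with the Go toolchain:

```python
def looks_like_go(source: str) -> bool:
    """Cheap sanity checks for generated Go code:
    a package clause and balanced braces/parentheses."""
    if "package " not in source:
        return False
    for open_ch, close_ch in (("{", "}"), ("(", ")")):
        if source.count(open_ch) != source.count(close_ch):
            return False
    return True

good = 'package main\n\nfunc main() {\n    println("hi")\n}\n'
bad = "func main() {"  # no package clause, unbalanced brace
print(looks_like_go(good), looks_like_go(bad))
# True False
```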
---

## Limitations

- **Edge Cases:** The model may struggle with highly complex or domain-specific comments.
- **Ambiguity:** Ambiguous or vague comments may lead to incorrect or incomplete code.
- **Bias:** The model reflects biases present in the training data.

---

## Contributing

Contributions are welcome! If you'd like to improve this project, consider:
- Adding more examples to the dataset.
- Experimenting with different fine-tuning techniques.
- Reporting bugs or suggesting improvements via GitHub Issues.

---

## Citation

If you use this model or dataset in your research, please cite it as follows:

```bibtex
@misc{mistralxgo-7b,
  author = {Prathamesh Devadiga},
  title = {MistralXGo-7B: Fine-Tuned Model for Go Code Generation/Optimization},
  year = {2023},
  publisher = {GitHub},
  journal = {GitHub Repository},
  howpublished = {\url{https://github.com/devadigapratham/MistralXGo-7B}},
}
```

---

## License

This project is released under the [Apache 2.0 License](LICENSE). You are free to use, modify, and distribute the model and code, provided you include appropriate attribution.