---
language: en
license: apache-2.0
tags:
- mistral
- fine-tuned
- code-generation
- go
- lora
---
# MistralXGo-7B: Fine-Tuned Model for Go Code Generation


This repository contains **MistralXGo-7B**, a fine-tuned version of the [Mistral-7B](https://huggingface.co/mistralai/Mistral-7B-v0.1) language model optimized for generating Go code based on comments. The model was trained using **LoRA (Low-Rank Adaptation)**, making it lightweight and efficient for deployment.
---
## Table of Contents
1. [Overview](#overview)
2. [Model Details](#model-details)
3. [Usage](#usage)
4. [Training Details](#training-details)
5. [Evaluation](#evaluation)
6. [Limitations](#limitations)
7. [Contributing](#contributing)
8. [Citation](#citation)
9. [License](#license)
---
## Overview
The goal of this project is to create a specialized language model that generates Go code from natural language comments. For example, given a comment like:
```go
// Basic routing with Chi.
```
The model generates corresponding Go code:
```go
package main

import (
	"net/http"

	"github.com/go-chi/chi/v5"
)

func main() {
	r := chi.NewRouter()
	r.Get("/", func(w http.ResponseWriter, r *http.Request) {
		w.Write([]byte("Hello, World!"))
	})
	http.ListenAndServe(":3000", r)
}
```
This model is particularly useful for developers who want to quickly prototype Go applications or explore code generation capabilities.
---
## Model Details
- **Base Model:** [Mistral-7B](https://huggingface.co/mistralai/Mistral-7B-v0.1)
- **Fine-Tuning Method:** LoRA (Low-Rank Adaptation)
- **Quantization:** 4-bit quantization for memory efficiency
- **Max Sequence Length:** 2048 tokens
- **Precision:** Mixed precision (`bf16` or `fp16` depending on hardware)
The model is hosted on Hugging Face Hub at:
[MistralXGo-7B](https://huggingface.co/devadigaprathamesh/MistralXGo-7B)
---
## Usage
### Installation
Install the required libraries (`accelerate` is needed for `device_map="auto"`):
```bash
pip install transformers torch accelerate
```
### Inference
Load the model and tokenizer from Hugging Face Hub and generate Go code:
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the model and tokenizer from the Hub
model_name = "devadigaprathamesh/MistralXGo-7B"
model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto")
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Generate Go code from a comment prompt
prompt = "// Basic routing with Chi."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)

# Decode and print the output
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
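Note that causal language models echo the prompt at the start of the decoded output. A small helper (hypothetical, not part of the model's API) can strip that prefix so only the generated Go code remains:

```python
def extract_generated_code(prompt: str, decoded: str) -> str:
    """Strip the echoed prompt from the decoded model output.

    Causal LMs return the prompt followed by the continuation, so
    removing the prompt prefix leaves only the generated Go code.
    """
    if decoded.startswith(prompt):
        return decoded[len(prompt):].lstrip("\n")
    return decoded


# Example with a mock decoded string:
decoded = "// Basic routing with Chi.\npackage main\n"
print(extract_generated_code("// Basic routing with Chi.", decoded))
```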
### Example Output
Input:
```go
// Basic routing with Chi.
```
Output:
```go
package main

import (
	"net/http"

	"github.com/go-chi/chi/v5"
)

func main() {
	r := chi.NewRouter()
	r.Get("/", func(w http.ResponseWriter, r *http.Request) {
		w.Write([]byte("Hello, World!"))
	})
	http.ListenAndServe(":3000", r)
}
```
---
## Training Details
### Dataset
- **Source:** A custom dataset of Go code snippets paired with descriptive comments.
- **Size:** ~1,620 examples (90% training, 10% testing).
- **Preprocessing:** Comments and code were formatted into instruction-output pairs.
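The exact prompt template is not published with this card; a plausible sketch of the preprocessing step, assuming a simple comment-plus-code template, might look like:

```python
def format_example(instruction: str, code: str) -> str:
    """Render one instruction-output pair as a single training string.

    The template below is an assumption for illustration; the actual
    template used during fine-tuning may differ.
    """
    return f"// {instruction.strip()}\n{code.strip()}\n"


example = format_example(
    "Basic routing with Chi.",
    "package main\n\nfunc main() {}",
)
print(example)
```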
### Training Configuration
- **Epochs:** 10
- **Batch Size:** Effective batch size of 8 (per_device_train_batch_size=2, gradient_accumulation_steps=4).
- **Learning Rate:** 2e-4
- **Optimizer:** AdamW 8-bit
- **Mixed Precision:** `bf16` (for Ampere+ GPUs) or `fp16` (for older GPUs).
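For reference, the configuration above maps onto Hugging Face `TrainingArguments`-style hyperparameters roughly as follows (a sketch of the listed values, not the exact training script, which may set additional options such as LoRA rank or warmup):

```python
# Hyperparameters mirroring the configuration listed above.
training_config = {
    "num_train_epochs": 10,
    "per_device_train_batch_size": 2,
    "gradient_accumulation_steps": 4,
    "learning_rate": 2e-4,
    "optim": "adamw_8bit",
}

# Effective batch size = per-device batch size x accumulation steps
effective_batch_size = (
    training_config["per_device_train_batch_size"]
    * training_config["gradient_accumulation_steps"]
)
print(effective_batch_size)  # 8
```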
---
## Dataset Format
Each training example pairs a natural-language instruction with the corresponding Go code snippet, stored as a JSON instruction-output pair:
**Instruction:**
Basic routing with Chi.
**Output:**
```go
package main

import (
	"net/http"

	"github.com/go-chi/chi/v5"
)

func main() {
	r := chi.NewRouter()
	r.Get("/", func(w http.ResponseWriter, r *http.Request) {
		w.Write([]byte("Hello, World!"))
	})
	http.ListenAndServe(":3000", r)
}
```
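Concretely, one record serialized as JSON might look like the sketch below (the field names are an assumption; the card does not specify the exact schema):

```python
import json

# Hypothetical record schema: "instruction" holds the comment,
# "output" holds the corresponding Go snippet.
record = {
    "instruction": "Basic routing with Chi.",
    "output": "package main\n\nfunc main() {\n\t// ...\n}\n",
}
serialized = json.dumps(record)
print(serialized)
```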
---
## Evaluation
The model was evaluated qualitatively by generating code for a range of comments and manually verifying correctness. No formal quantitative metrics (e.g., pass@k or compilation rate) were collected; in these spot checks the model typically produced syntactically correct and semantically relevant Go code.
---
## Limitations
- **Edge Cases:** The model may struggle with highly complex or domain-specific comments.
- **Ambiguity:** Ambiguous or vague comments may lead to incorrect or incomplete code.
- **Bias:** The model reflects biases present in the training data.
---
## Contributing
Contributions are welcome! If you’d like to improve this project, consider:
- Adding more examples to the dataset.
- Experimenting with different fine-tuning techniques.
- Reporting bugs or suggesting improvements via GitHub Issues.
---
## Citation
If you use this model or dataset in your research, please cite it as follows:
```bibtex
@misc{mistralxgo-7b,
  author       = {Prathamesh Devadiga},
  title        = {MistralXGo-7B: Fine-Tuned Model for Go Code Generation/Optimization},
  year         = {2023},
  publisher    = {GitHub},
  journal      = {GitHub Repository},
  howpublished = {\url{https://github.com/devadigapratham/MistralXGo-7B}},
}
```
---
## License
This project is released under the [Apache 2.0 License](LICENSE). You are free to use, modify, and distribute the model and code, provided you include appropriate attribution.