Ricardouchub committed on
Commit
651e888
·
verified ·
1 Parent(s): a14d59b

Update README.md

  1. README.md +46 -34
README.md CHANGED
@@ -7,7 +7,11 @@ base_model:
7
  - mistralai/Mistral-7B-Instruct-v0.3
8
  tags:
9
  - code
 
 
 
10
  ---
 
11
  # Code Specialist 7B
12
 
13
  <p align="left">
@@ -21,59 +25,67 @@ tags:
21
  <img src="https://img.shields.io/badge/Transformers-4.56+-purple?style=flat-square&logo=huggingface&logoColor=white" alt="Transformers"/>
22
  </a>
23
  <a href="https://github.com/Ricardouchub">
24
- <img src="https://img.shields.io/badge/Autor-Ricardo_Urdaneta-000000?style=flat-square&logo=github&logoColor=white" alt="Author"/>
25
  </a>
26
  </p>
27
 
28
- ## Descripción
 
 
29
 
30
- **Code Specialist 7B** es un modelo de lenguaje basado en **Mistral-7B-Instruct-v0.3**, adaptado mediante **SFT (Supervised Fine-Tuning)** con datasets especializados en **Python y SQL**.
31
- El entrenamiento fue realizado con el objetivo de mejorar la capacidad del modelo en **resolución de problemas de analisis de datos y data science**.
32
 
33
- Este modelo mantiene la arquitectura original de **7B parámetros**, pero incorpora un ajuste fino orientado a código, lo que lo hace más robusto en generación de funciones, queries SQL y respuestas técnicas.
34
 
35
  ---
36
 
37
- ## Modelo base
38
 
39
  - [Mistral-7B-Instruct-v0.3](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.3)
40
- - Arquitectura: Transformer Decoder-only
41
- - Parámetros: ~7B
42
 
43
  ---
44
 
45
- ## Dataset utilizado para SFT
46
 
47
  - [CodeAlpaca-20k](https://huggingface.co/datasets/sahil2801/CodeAlpaca-20k)
48
- - [code_instructions_122k (alpaca-style)](https://huggingface.co/datasets/TokenBender/code_instructions_122k_alpaca_style)
49
 
50
- Ambos datasets fueron **filtrados a ejemplos de Python y SQL**, con formato de prompts estilo **Alpaca/Mistral**.
51
 
52
- Ejemplo de formato aplicado:
53
 
54
  ```
55
- [INST] Escribe una función en Python que sume dos números. [/INST]
56
  def add(a, b):
57
      return a + b
58
  ```
59
 
60
  ---
61
 
62
- ## Entrenamiento
 
 
 
 
 
 
 
 
63
 
64
- - **Método:** QLoRA con Merge final de pesos
65
- - **Frameworks:** transformers, trl, peft, bitsandbytes
66
- - **Hardware:** GPU 12GB VRAM (cuantización en 4-bit para entrenamiento)
67
- - **Hiperparámetros principales:**
68
- - per_device_train_batch_size=2
69
- - gradient_accumulation_steps=4
70
- - learning_rate=2e-4
71
- - num_train_epochs=1
72
- - max_seq_length=1024
73
 
74
  ---
75
 
76
- ## Uso
77
 
78
  ```python
79
  from transformers import AutoTokenizer, AutoModelForCausalLM
@@ -82,7 +94,7 @@ model_id = "Ricardouchub/Code-Specialist-7B"
82
  tok = AutoTokenizer.from_pretrained(model_id)
83
  mdl = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")
84
 
85
- prompt = "[INST] Escribe una función en Python que calcule la media de una lista. [/INST]"
86
  inputs = tok(prompt, return_tensors="pt").to(mdl.device)
87
 
88
  out = mdl.generate(**inputs, max_new_tokens=256)
@@ -91,14 +103,14 @@ print(tok.decode(out[0], skip_special_tokens=True))
91
 
92
  ---
93
 
94
- ## Benchmarks iniciales
95
 
96
- - **Eval simple (Python tasks):** mejora en tareas básicas de programación para uso en **data analysis, SQL queries y snippets Python** comparado con el modelo base.
97
- - Se recomienda evaluar en HumanEval / MBPP para métricas reproducibles.
98
 
99
  ---
100
 
101
- ## Autor
102
 
103
  **Ricardo Urdaneta**
104
  - [LinkedIn](https://www.linkedin.com/in/ricardourdanetacastro/)
@@ -106,13 +118,13 @@ print(tok.decode(out[0], skip_special_tokens=True))
106
 
107
  ---
108
 
109
- ## Limitaciones
110
 
111
- - El modelo **no garantiza exactitud 100%** en código complejo.
112
- - Puede generar respuestas incoherentes en prompts ambiguos.
113
 
114
  ---
115
 
116
- ## Licencia
117
 
118
- El modelo se publica con la misma licencia que **Mistral-7B-Instruct-v0.3**: **MIT**
 
7
  - mistralai/Mistral-7B-Instruct-v0.3
8
  tags:
9
  - code
10
+ - python
11
+ - sql
12
+ - data-science
13
  ---
14
+
15
  # Code Specialist 7B
16
 
17
  <p align="left">
 
25
  <img src="https://img.shields.io/badge/Transformers-4.56+-purple?style=flat-square&logo=huggingface&logoColor=white" alt="Transformers"/>
26
  </a>
27
  <a href="https://github.com/Ricardouchub">
28
+ <img src="https://img.shields.io/badge/Author-Ricardo_Urdaneta-000000?style=flat-square&logo=github&logoColor=white" alt="Author"/>
29
  </a>
30
  </p>
31
 
32
+ ---
33
+
34
+ ## Description
35
 
36
+ **Code Specialist 7B** is a fine-tuned version of **Mistral-7B-Instruct-v0.3**, trained through **Supervised Fine-Tuning (SFT)** using datasets focused on **Python and SQL**.
37
+ The goal of this training was to enhance the model’s performance in **data analysis, programming problem-solving, and technical reasoning**.
38
 
39
+ The model preserves the **7B-parameter, decoder-only Transformer** architecture while adding code-oriented fine-tuning, resulting in improved robustness for function generation, SQL queries, and technical answers.
40
 
41
  ---
42
 
43
+ ## Base Model
44
 
45
  - [Mistral-7B-Instruct-v0.3](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.3)
46
+ - Architecture: Transformer (decoder-only)
47
+ - Parameters: ~7B
48
 
49
  ---
50
 
51
+ ## Datasets Used for SFT
52
 
53
  - [CodeAlpaca-20k](https://huggingface.co/datasets/sahil2801/CodeAlpaca-20k)
54
+ - [Code Instructions 122k (Alpaca-style)](https://huggingface.co/datasets/TokenBender/code_instructions_122k_alpaca_style)
55
 
56
+ Both datasets were **filtered to include only Python and SQL examples**, following **Alpaca/Mistral-style** instruction formatting.
57
 
58
+ Example prompt format:
59
 
60
  ```
61
+ [INST] Write a Python function that adds two numbers. [/INST]
62
  def add(a, b):
63
      return a + b
64
  ```
65
 
66
  ---
67
 
68
+ ## Training Details
69
+
70
+ | **Aspect** | **Detail** |
71
+ |--------------------|-------------|
72
+ | **Method** | QLoRA with final weight merge |
73
+ | **Frameworks** | `transformers`, `trl`, `peft`, `bitsandbytes` |
74
+ | **Hardware** | GPU with 12 GB VRAM (4-bit quantization for training) |
75
+
76
+ ### Main Hyperparameters
77
 
78
+ | **Parameter** | **Value** |
79
+ |----------------|-----------|
80
+ | `per_device_train_batch_size` | 2 |
81
+ | `gradient_accumulation_steps` | 4 |
82
+ | `learning_rate` | 2e-4 |
83
+ | `num_train_epochs` | 1 |
84
+ | `max_seq_length` | 1024 |
 
 
85
 
86
  ---
87
 
88
+ ## Usage
89
 
90
  ```python
91
  from transformers import AutoTokenizer, AutoModelForCausalLM
 
94
  tok = AutoTokenizer.from_pretrained(model_id)
95
  mdl = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")
96
 
97
+ prompt = "[INST] Write a Python function that calculates the average of a list. [/INST]"
98
  inputs = tok(prompt, return_tensors="pt").to(mdl.device)
99
 
100
  out = mdl.generate(**inputs, max_new_tokens=256)
 
103
 
104
  ---
105
 
106
+ ## Initial Benchmarks
107
 
108
+ - **Simple evaluation (Python tasks):** Improved results on small programming and data-related tasks, including **data analysis, SQL query generation, and Python snippets**, compared to the base model.
109
+ - Further evaluation on **HumanEval** or **MBPP** is recommended for reproducible metrics.
110
 
111
  ---
112
 
113
+ ## Author
114
 
115
  **Ricardo Urdaneta**
116
  - [LinkedIn](https://www.linkedin.com/in/ricardourdanetacastro/)
 
118
 
119
  ---
120
 
121
+ ## Limitations
122
 
123
+ - The model does **not guarantee 100% accuracy** on complex programming tasks.
124
+ - It may produce inconsistent results for ambiguous or incomplete prompts.
125
 
126
  ---
127
 
128
+ ## License
129
 
130
+ This model is released under the same license as **Mistral-7B-Instruct-v0.3**: **MIT License**.
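
The README's prompt format wraps a single instruction in `[INST] ... [/INST]`, and the usage snippet decodes the full generated sequence, which would normally include the prompt text as well as the answer. A minimal sketch of two helpers for that pattern follows. The names `build_prompt` and `extract_completion` are hypothetical and not part of this repository, and the sketch assumes a single-turn prompt with no system message.

```python
# Hypothetical helpers (not part of the model repository); a sketch that
# assumes the single-turn "[INST] ... [/INST]" format shown in the README.
INST_OPEN = "[INST]"
INST_CLOSE = "[/INST]"


def build_prompt(instruction: str) -> str:
    """Wrap one instruction in the [INST] ... [/INST] markers."""
    return f"{INST_OPEN} {instruction.strip()} {INST_CLOSE}"


def extract_completion(decoded: str) -> str:
    """Return the text after the last [/INST] marker.

    If the marker is absent, the whole string is returned, stripped.
    """
    _, sep, tail = decoded.rpartition(INST_CLOSE)
    return tail.strip() if sep else decoded.strip()


if __name__ == "__main__":
    prompt = build_prompt("Write a Python function that adds two numbers.")
    print(prompt)
    # Simulated decoded output: prompt followed by a model answer.
    decoded = prompt + " def add(a, b):\n    return a + b"
    print(extract_completion(decoded))
```

With the usage snippet from the README, `build_prompt(...)` would supply the `prompt` string, and `extract_completion(tok.decode(out[0], skip_special_tokens=True))` would keep only the generated answer, assuming the decoded text still contains the literal `[/INST]` marker.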