kyleparrratt committed on
Commit b999e09 · verified · 1 Parent(s): 89f89b3

Upload folder using huggingface_hub

Files changed (1)
  1. README.md +191 -50

README.md CHANGED
@@ -1,69 +1,210 @@
 ---
- language:
- - bg
- - en
- license: mit
- base_model: Qwen/Qwen2.5-Coder-7B-Instruct
 tags:
- - code
- - bulgarian
- - lora
- - peft
- - vitosha-gpt-code
- - slm
- - offline
 ---

- # Vitosha-GPT-Code

- **Every Bulgarian has the right to AI.** Right now that’s a luxury for people with fast internet and expensive hardware. If you’re in a remote area on an old PC, you’re locked out. Vitosha-GPT-Code is built to change that.

- It’s a **Bulgarian-first coding assistant** (Small Language Model / SLM track): explanations and code in Bulgarian by default. Named after Vitosha, the mountain overlooking Sofia. The goal is to run **100% offline on as little as 4GB RAM**—no subscriptions, no fiber, no data leaving your machine. Same coding and logic tools for a kid in a remote province as for a developer in Sofia: building a website, learning to program, without hardware as a barrier.
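The 4GB target in the removed card can be sanity-checked with simple weight-memory arithmetic. A rough sketch (the parameter counts and bit widths below are illustrative assumptions, not figures from the card; KV cache and runtime overhead are ignored):

```python
# Editor's sketch: weight memory ≈ parameter count × bytes per weight.
def weight_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate weight memory in decimal GB."""
    bytes_total = params_billion * 1e9 * bits_per_weight / 8
    return bytes_total / 1e9

# A 7B model in bf16 (16-bit) is far over a 4GB budget; a hypothetical
# ~1.5B SLM quantized to 4 bits fits comfortably under it.
print(round(weight_gb(7.0, 16), 1))  # 14.0
print(round(weight_gb(1.5, 4), 2))   # 0.75
```

This is why the card pairs the 7B adapter with a planned smaller, quantized SLM variant for the offline use case.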
 
- **V0.1 is in development**, kept free and local for every Bulgarian.

- This repo hosts the **LoRA adapter** on [Qwen2.5-Coder-7B-Instruct](https://huggingface.co/Qwen/Qwen2.5-Coder-7B-Instruct), trained on thousands of Bulgarian coding examples (OPUS-translated, no prompt poisoning). Use it as the coding model that speaks Bulgarian first. A lightweight, 4GB-friendly SLM variant is planned for offline use.

- ## Usage

- ```python
- from transformers import AutoModelForCausalLM, AutoTokenizer
- from peft import PeftModel
- import torch
-
- base = "Qwen/Qwen2.5-Coder-7B-Instruct"
- adapter = "kyleparrratt/Vitosha-GPT-Code"
-
- tokenizer = AutoTokenizer.from_pretrained(base, trust_remote_code=True)
- model = AutoModelForCausalLM.from_pretrained(
-     base,
-     torch_dtype=torch.bfloat16,
-     device_map="auto",
- )
- model = PeftModel.from_pretrained(model, adapter, is_trainable=False)
-
- messages = [
-     {"role": "system", "content": "Ти си полезен асистент за програмиране. Отговаряш на български."},
-     {"role": "user", "content": "Напиши функция на Python за проверка на просто число и обясни на български."},
- ]
- text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
- inputs = tokenizer(text, return_tensors="pt").to(model.device)
- out = model.generate(**inputs, max_new_tokens=512, do_sample=False, pad_token_id=tokenizer.eos_token_id)
- print(tokenizer.decode(out[0][inputs.input_ids.shape[1]:], skip_special_tokens=True))
- ```
 
- ## Training

- - **Base:** Qwen2.5-Coder-7B-Instruct
- - **Data:** Bulgarian coding data from evol-codealpaca-v1: prompts and completions translated to Bulgarian with OPUS (opus-mt-tc-big-en-bg), 100% Bulgarian-target examples, no boost phrase
- - **Adapter:** LoRA r=16, trained with Unsloth
- - **Inference:** Use `transformers` + PEFT (not Unsloth inference, to avoid RoPE issues)
 
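The `LoRA r=16` setting above means each adapted weight matrix receives a rank-16 update ΔW = B·A rather than a full-rank one. A small sketch of the resulting parameter savings (the 3584 hidden size is Qwen2.5-7B's; treat the exact dimensions here as an assumption):

```python
# Parameter count of a rank-r LoRA update on a d x k weight matrix,
# compared with training the full matrix directly.
def lora_params(d: int, k: int, r: int) -> int:
    return d * r + r * k  # B is d x r, A is r x k

d = k = 3584  # assumed hidden size of Qwen2.5-Coder-7B
full = d * k
low_rank = lora_params(d, k, 16)
print(low_rank, full)          # 114688 12845056
print(low_rank / full < 0.01)  # True: under 1% of the full matrix
```

This is why a LoRA adapter repo holds only megabytes of weights while the base model stays unchanged.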
- ## Limitations

- - Occasional English in explanations. Including “Отговори на български.” in the user message keeps output in Bulgarian.
- - Code identifiers and APIs stay in English; explanations and prose are in Bulgarian.

- ## License

- MIT. Adapter and card as-is; base model terms apply.
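The Limitations note above suggests appending “Отговори на български.” (“Answer in Bulgarian.”) to the user turn. A minimal helper along those lines (the function name and wording are the editor's illustration, not part of the repo):

```python
# Build a chat message list that nudges the model to answer in Bulgarian,
# per the Limitations note. System/user wording is illustrative.
SYSTEM_BG = "Ти си полезен асистент за програмиране. Отговаряш на български."

def bulgarian_messages(task: str) -> list:
    # Append the explicit language instruction to the user message.
    return [
        {"role": "system", "content": SYSTEM_BG},
        {"role": "user", "content": task.rstrip() + " Отговори на български."},
    ]

msgs = bulgarian_messages("Напиши функция на Python за обръщане на низ.")
print(msgs[1]["content"].endswith("Отговори на български."))  # True
```

The resulting list can be passed straight to `tokenizer.apply_chat_template` as in the Usage example.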
 ---
+ base_model: unsloth/Qwen2.5-Coder-7B-Instruct
+ library_name: peft
+ pipeline_tag: text-generation
 tags:
+ - base_model:adapter:unsloth/Qwen2.5-Coder-7B-Instruct
+ - lora
+ - sft
+ - transformers
+ - trl
+ - unsloth
 ---
+
+ # Model Card for Model ID
+
+ <!-- Provide a quick summary of what the model is/does. -->
+
+ ## Model Details
+
+ ### Model Description
+
+ <!-- Provide a longer summary of what this model is. -->
+
+ - **Developed by:** [More Information Needed]
+ - **Funded by [optional]:** [More Information Needed]
+ - **Shared by [optional]:** [More Information Needed]
+ - **Model type:** [More Information Needed]
+ - **Language(s) (NLP):** [More Information Needed]
+ - **License:** [More Information Needed]
+ - **Finetuned from model [optional]:** [More Information Needed]
+
+ ### Model Sources [optional]
+
+ <!-- Provide the basic links for the model. -->
+
+ - **Repository:** [More Information Needed]
+ - **Paper [optional]:** [More Information Needed]
+ - **Demo [optional]:** [More Information Needed]
+
+ ## Uses
+
+ <!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
+
+ ### Direct Use
+
+ <!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
+
+ [More Information Needed]
+
+ ### Downstream Use [optional]
+
+ <!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->
+
+ [More Information Needed]
+
+ ### Out-of-Scope Use
+
+ <!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->
+
+ [More Information Needed]
+
+ ## Bias, Risks, and Limitations
+
+ <!-- This section is meant to convey both technical and sociotechnical limitations. -->
+
+ [More Information Needed]
+
+ ### Recommendations
+
+ <!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->
+
+ Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.
+
+ ## How to Get Started with the Model
+
+ Use the code below to get started with the model.
+
+ [More Information Needed]
+
+ ## Training Details
+
+ ### Training Data
+
+ <!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
+
+ [More Information Needed]
+
+ ### Training Procedure
+
+ <!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
+
+ #### Preprocessing [optional]
+
+ [More Information Needed]
+
+ #### Training Hyperparameters
+
+ - **Training regime:** [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
+
+ #### Speeds, Sizes, Times [optional]
+
+ <!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->
+
+ [More Information Needed]
+
+ ## Evaluation
+
+ <!-- This section describes the evaluation protocols and provides the results. -->
+
+ ### Testing Data, Factors & Metrics
+
+ #### Testing Data
+
+ <!-- This should link to a Dataset Card if possible. -->
+
+ [More Information Needed]
+
+ #### Factors
+
+ <!-- These are the things the evaluation is disaggregating by, e.g., subpopulations or domains. -->
+
+ [More Information Needed]
+
+ #### Metrics
+
+ <!-- These are the evaluation metrics being used, ideally with a description of why. -->
+
+ [More Information Needed]
+
+ ### Results
+
+ [More Information Needed]
+
+ #### Summary
+
+ ## Model Examination [optional]
+
+ <!-- Relevant interpretability work for the model goes here -->
+
+ [More Information Needed]
+
+ ## Environmental Impact
+
+ <!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->
+
+ Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
+
+ - **Hardware Type:** [More Information Needed]
+ - **Hours used:** [More Information Needed]
+ - **Cloud Provider:** [More Information Needed]
+ - **Compute Region:** [More Information Needed]
+ - **Carbon Emitted:** [More Information Needed]
+
+ ## Technical Specifications [optional]
+
+ ### Model Architecture and Objective
+
+ [More Information Needed]
+
+ ### Compute Infrastructure
+
+ [More Information Needed]
+
+ #### Hardware
+
+ [More Information Needed]
+
+ #### Software
+
+ [More Information Needed]
+
+ ## Citation [optional]
+
+ <!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
+
+ **BibTeX:**
+
+ [More Information Needed]
+
+ **APA:**
+
+ [More Information Needed]
+
+ ## Glossary [optional]
+
+ <!-- If relevant, include terms and calculations in this section that can help readers understand the model or model card. -->
+
+ [More Information Needed]
+
+ ## More Information [optional]
+
+ [More Information Needed]
+
+ ## Model Card Authors [optional]
+
+ [More Information Needed]
+
+ ## Model Card Contact
+
+ [More Information Needed]
+
+ ### Framework versions
+
+ - PEFT 0.18.1