---
base_model: Qwen/Qwen2.5-Coder-0.5B
library_name: peft
pipeline_tag: text-generation
tags:
- lora
- transformers
- qlora
- commit-message-generation
- code-summarization
- generated_from_trainer
license: apache-2.0
language:
- en
---

# QLoRA Adapter for Commit Message Generation

Fine-tuned LoRA adapter for **Qwen2.5-Coder-0.5B** that generates clear, concise Git commit messages from code diffs.

## Model Details

### Model Description

This model is a **QLoRA (4-bit quantized LoRA)** adapter trained on the Qwen2.5-Coder-0.5B base model to automatically generate commit messages from Git diffs. The adapter learns to summarize code changes into human-readable descriptions, understanding programming patterns and translating technical modifications into natural language.

**Key characteristics:**
- Uses the **PT (pretrained/base)** version of Qwen2.5-Coder for cleaner, more controllable outputs
- Trained with 4-bit NF4 quantization for efficient fine-tuning on consumer hardware
- Only the LoRA adapters are included (a few MB); the base model is required for inference
- Optimized for diff-to-message generation, not chat or instruction following

- **Developed by:** Mamoun Yosef
- **Model type:** Causal language model (decoder-only Transformer) with LoRA adapters
- **Language(s):** English
- **License:** Apache 2.0
- **Finetuned from model:** Qwen/Qwen2.5-Coder-0.5B

### Model Sources

- **Repository:** [commit-message-llm](https://github.com/mamounyosef/commit-message-llm)
- **Base Model:** [Qwen/Qwen2.5-Coder-0.5B](https://huggingface.co/Qwen/Qwen2.5-Coder-0.5B)

## Uses

### Direct Use

This adapter is designed for **automated commit message generation** from Git diffs. It can be used to:

- Generate commit messages for staged changes in Git repositories
- Suggest descriptive summaries for code modifications
- Automate documentation of code changes in CI/CD pipelines
- Assist developers in writing clear, consistent commit messages

**Example input (Git diff):**
```diff
diff --git a/src/utils.py b/src/utils.py
index abc123..def456 100644
--- a/src/utils.py
+++ b/src/utils.py
@@ -10,6 +10,9 @@ def process_data(data):
     return result

+def validate_input(data):
+    return data is not None and len(data) > 0
+
 def save_output(output, filename):
```

**Example output:**
```
Add input validation function
```

### Downstream Use

Can be integrated into:
- Git hooks (pre-commit, commit-msg)
- IDE extensions for code editors
- Code review tools
- Developer productivity applications
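
For example, a `prepare-commit-msg` hook could assemble the model's input from the staged diff. A minimal sketch, assuming the prompt format described in this card; `build_prompt` and `staged_diff` are illustrative helpers, not part of this repository:

```python
import subprocess

MAX_DIFF_CHARS = 8000  # same upper bound used when filtering the training data

def build_prompt(diff: str, max_chars: int = MAX_DIFF_CHARS) -> str:
    """Truncate the diff and append the separator the adapter was trained on."""
    return diff[:max_chars] + "\n\nCommit message:\n"

def staged_diff() -> str:
    """Read the staged changes, as a prepare-commit-msg hook would."""
    return subprocess.run(
        ["git", "diff", "--cached"], capture_output=True, text=True
    ).stdout

# Toy example (no repository required):
prompt = build_prompt("diff --git a/file.py b/file.py\n+import os")
```

The resulting `prompt` can be passed to the generation code shown below in "How to Get Started with the Model".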

### Out-of-Scope Use

**Not suitable for:**
- General text generation or chat
- Generating code from descriptions (the reverse direction)
- Diffs of non-code files (prose, documentation, binary assets)
- Extremely large diffs (>8000 characters)
- Commit messages requiring deep domain knowledge beyond code structure

## Bias, Risks, and Limitations

**Limitations:**
- Trained only on English commit messages
- May struggle with very complex multi-file changes
- Limited to diff lengths of 50-8000 characters
- Performance depends on code quality and diff clarity
- May generate generic messages for trivial changes
- Does not understand business context or domain-specific terminology

**Risks:**
- Generated messages may not capture the full intent of a change
- Output should be reviewed by developers before committing
- May miss important security or breaking-change implications

### Recommendations

- Always review generated commit messages before use
- Use as a suggestion tool, not a fully automated solution
- Combine with manual editing for complex changes
- Test on your codebase to evaluate quality

## How to Get Started with the Model

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel

# Load the base model in 4-bit NF4, matching the training setup
quant_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

base_model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-Coder-0.5B",
    quantization_config=quant_config,
    device_map="auto",
    torch_dtype=torch.bfloat16,
)

# Load the LoRA adapter on top of the base model
model = PeftModel.from_pretrained(base_model, "mamounyosef/commit-message-llm")
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-Coder-0.5B")

# Generate a commit message for a diff
diff = """diff --git a/file.py b/file.py
--- a/file.py
+++ b/file.py
@@ -1,3 +1,4 @@
+import os
 def main():
     print("Hello")
"""

prompt = diff + "\n\nCommit message:\n"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

outputs = model.generate(
    **inputs,
    max_new_tokens=30,
    do_sample=False,
    num_beams=1,
    eos_token_id=tokenizer.eos_token_id,
)

# Decode only the newly generated tokens; slicing the decoded string by
# len(prompt) is fragile because tokenization may not round-trip exactly
input_len = inputs["input_ids"].shape[1]
message = tokenizer.decode(outputs[0][input_len:], skip_special_tokens=True).strip()
print(message)
```

## Training Details

### Training Data

**Dataset:** [Maxscha/commitbench](https://huggingface.co/datasets/Maxscha/commitbench)

**Preprocessing:**
- Removed trivial messages ("fix", "update", "wip", etc.)
- Filtered out reference-only commits (e.g., "fix #123")
- Removed placeholder tokens (`<HASH>`, `<URL>`)
- Kept diffs between 50 and 8000 characters
- Required messages with semantic content (≥3 words)
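
A minimal reimplementation of these filters; the function and trivial-word list are illustrative, only the thresholds come from the description above:

```python
TRIVIAL = {"fix", "update", "wip"}  # illustrative subset of the filtered words

def keep_sample(diff: str, message: str) -> bool:
    """Apply the cleaning filters described above to one (diff, message) pair."""
    msg = message.strip().lower()
    if msg in TRIVIAL:                     # trivial one-word messages
        return False
    if msg.startswith("fix #"):            # reference-only commits
        return False
    if "<hash>" in msg or "<url>" in msg:  # placeholder tokens
        return False
    if not 50 <= len(diff) <= 8000:        # diff length bounds
        return False
    return len(msg.split()) >= 3           # require semantic content
```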

**Final dataset sizes:**
- Training: 120,000 samples
- Validation: 15,000 samples
- Test: 15,000 samples

### Training Procedure

**Format:**
```
{diff content}

Commit message:
{target message}<eos>
```

Prompt tokens (the diff plus separator) are masked with label `-100`, so the loss is computed only on the commit message tokens.
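
The masking scheme can be sketched independently of any particular tokenizer; `build_example` and the token-id lists are illustrative, not code from this repository:

```python
IGNORE_INDEX = -100  # Hugging Face convention: these positions are excluded from the loss

def build_example(prompt_ids, target_ids, eos_id, max_length=512):
    """Concatenate prompt and target, mask prompt positions, append EOS."""
    input_ids = prompt_ids + target_ids + [eos_id]
    labels = [IGNORE_INDEX] * len(prompt_ids) + target_ids + [eos_id]
    return input_ids[:max_length], labels[:max_length]

# Toy token ids: three prompt tokens, two message tokens, EOS id 0
input_ids, labels = build_example([11, 12, 13], [21, 22], eos_id=0)
```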

#### Preprocessing

1. Normalize newlines (CRLF → LF)
2. Tokenize diff + separator + message
3. Mask prompt labels to `-100`
4. Truncate to `max_length=512` tokens
5. Append the EOS token to the target

#### Training Hyperparameters

**QLoRA Configuration:**
- Quantization: 4-bit NF4
- Compute dtype: bfloat16
- LoRA rank (r): 16
- LoRA alpha: 32
- LoRA dropout: 0.05
- Target modules: q_proj, k_proj, v_proj, o_proj

**Training Parameters:**
- Max sequence length: 512 tokens
- Per-device train batch size: 6
- Per-device eval batch size: 6
- Gradient accumulation steps: 8
- **Effective batch size: 48**
- Learning rate: 1.8e-4
- LR scheduler: cosine with 4% warmup
- Total training steps: 6000
- Epochs: ~2
- Optimizer: paged_adamw_8bit
- Gradient clipping: 1.0
- **Training regime:** bf16 mixed precision
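
These hyperparameters map onto standard `peft`/`transformers` configuration objects roughly as follows. This is a sketch, not the repository's actual training script; the output directory is illustrative:

```python
from peft import LoraConfig
from transformers import TrainingArguments

lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)

training_args = TrainingArguments(
    output_dir="qlora-commit-messages",  # illustrative path
    per_device_train_batch_size=6,
    per_device_eval_batch_size=6,
    gradient_accumulation_steps=8,       # 6 x 8 = effective batch size 48
    learning_rate=1.8e-4,
    lr_scheduler_type="cosine",
    warmup_ratio=0.04,
    max_steps=6000,
    optim="paged_adamw_8bit",
    max_grad_norm=1.0,
    bf16=True,
    gradient_checkpointing=True,
    group_by_length=True,
)
```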

**Memory Optimizations:**
- Gradient checkpointing enabled
- SDPA (scaled dot-product attention) for efficient attention
- 8-bit paged optimizer
- Grouping by length for efficient batching

#### Speeds, Sizes, Times

- **Hardware:** NVIDIA RTX 4060 (8GB VRAM)
- **Total training time:** ~13 hours
- **Checkpoint size:** a few MB (LoRA adapters only)
- **Peak VRAM usage:** <8GB
- **Training throughput:** ~2,500 samples/hour

## Evaluation

### Testing Data, Factors & Metrics

#### Testing Data

**Test split from Maxscha/commitbench:**
- 15,000 cleaned samples
- Same preprocessing as the training data
- No overlap with the training or validation sets

#### Metrics

- **Loss:** Cross-entropy on commit message tokens
- **Perplexity:** exp(loss); lower values indicate better prediction quality
- A perplexity of ≈17 is strong for this open-ended task
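
Since perplexity is simply the exponential of the mean cross-entropy loss, the reported values (2.8583 validation, 2.8501 test) can be verified directly:

```python
import math

def perplexity(loss: float) -> float:
    """Perplexity is exp of the mean cross-entropy loss."""
    return math.exp(loss)

print(round(perplexity(2.8583), 2))  # → 17.43 (validation)
print(round(perplexity(2.8501), 2))  # → 17.29 (test)
```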

### Results

| Split | Loss | Perplexity |
|------------|--------|------------|
| Validation | 2.8583 | 17.43 |
| Test | 2.8501 | 17.29 |

**Qualitative Example:**
```diff
diff --git a/src/client/core/commands/menu.js
+    'core/settings'
+], function (_, hr, MenubarView, box, panels, tabs, session, localfs, settings) {
+    }).menuSection({
+        'id': "themes.settings",
+        'title': "Settings",
+        'action': function() {
+            settings.open("themes"...
```

- **Ground truth:** Add command to open themes settings in view menu
- **Model output:** Add theme settings to the menu

The model correctly identifies the purpose (a menu settings addition) and generates a concise, accurate description.

## Environmental Impact

- **Hardware Type:** NVIDIA RTX 4060 (8GB VRAM)
- **Hours used:** ~13 hours
- **Cloud Provider:** N/A (local training)
- **Compute Region:** N/A
- **Carbon Emitted:** Minimal (single consumer GPU, short training time)

## Technical Specifications

### Model Architecture and Objective

- **Base Architecture:** Qwen2.5-Coder-0.5B (decoder-only Transformer)
- **Adapter Type:** LoRA (Low-Rank Adaptation)
- **Objective:** Causal language modeling with masked prompts
- **Loss Function:** Cross-entropy on commit message tokens only

### Compute Infrastructure

#### Hardware

- GPU: NVIDIA RTX 4060
- VRAM: 8GB
- System RAM: 16GB
- Storage: SSD recommended for dataset loading

#### Software

- **Framework:** PyTorch, Hugging Face Transformers
- **PEFT Version:** 0.18.1
- **Key Libraries:**
  - `transformers` (model loading, training)
  - `peft` (LoRA adapters)
  - `bitsandbytes` (4-bit quantization)
  - `datasets` (data loading)
  - `torch` (deep learning backend)

## Model Card Authors

Mamoun Yosef

### Framework Versions

- PEFT 0.18.1
- Transformers 4.x
- PyTorch 2.x
- bitsandbytes 0.x