---
license: apache-2.0
base_model: Qwen/Qwen3.5-4B-Base
library_name: peft
pipeline_tag: text-generation
model_name: Graphite 1.0 4B
language:
- en
- ru
tags:
- qwen
- qwen3.5
- peft
- lora
- unsloth
- trl
- sft
- code
- reasoning
- bilingual
- obsidian
- graphite
---

# Graphite 1.0 4B

`Graphite 1.0 4B` is the first public LoRA adapter from the Graphite / Obsidian-Critic training stream. It is built on top of [`Qwen/Qwen3.5-4B-Base`](https://huggingface.co/Qwen/Qwen3.5-4B-Base) and tuned for strict, grounded, low-noise responses across:

- repo repair and debugging
- agent tool-use formatting
- technical writing and Markdown workflows
- code review and integration tasks
- logic and factual precision
- bilingual Russian / English instruction following

## What This Repository Contains

This repo contains a **LoRA adapter**, not merged base weights. The key settings (sketched in code after this list) are:

- Base model: `Qwen/Qwen3.5-4B-Base`
- Adapter type: `LoRA`
- Rank: `r=16`
- Alpha: `16`
- Dropout: `0.0`
- Target modules: `q_proj`, `k_proj`, `v_proj`, `o_proj`, `gate_proj`, `up_proj`, `down_proj`
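
For reference, here is a minimal sketch of how these settings map onto `peft.LoraConfig`. The authoritative values live in `adapter_config.json`; the fields not listed above (`bias`, `task_type`) are assumptions.

```python
from peft import LoraConfig

# Sketch of the adapter configuration described above.
# adapter_config.json in this repo is the canonical source;
# `bias` and `task_type` are assumed, not confirmed, values.
lora_config = LoraConfig(
    r=16,
    lora_alpha=16,
    lora_dropout=0.0,
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    bias="none",            # assumption: the usual default
    task_type="CAUSAL_LM",  # assumption: causal LM adapter
)
```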

Files of interest (the JSON summaries can also be read programmatically, as sketched after this list):

- `adapter_model.safetensors`: LoRA weights
- `adapter_config.json`: PEFT adapter config
- `tokenizer.json`, `tokenizer_config.json`, `chat_template.jinja`: tokenizer assets
- `run_summary.json`: public training run summary
- `length_stats.json`: length filtering summary
- `masking_sanity.json`: formatting sanity check
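
A small sketch using `huggingface_hub` to fetch and print the summary files; the repo id is the one used in Quick Start below:

```python
import json

from huggingface_hub import hf_hub_download

repo_id = "Starred09/obsidian-critic-qwen35-4b-base-lora"
for name in ("run_summary.json", "length_stats.json", "masking_sanity.json"):
    # Downloads into the local Hugging Face cache and returns the file path.
    path = hf_hub_download(repo_id, name)
    with open(path) as f:
        print(name, json.load(f))
```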

## Training Lineage

This adapter comes from the **first public Graphite 1.0 4B training stream**, specifically the full run rather than the smoke test.

- dataset family: **`obsidian-critic-broad-mix-20260321`**
- training stack: **Unsloth + TRL + torchrun DDP**
- base model: **`Qwen/Qwen3.5-4B-Base`**

Notebook lineage used for this stream:

- `obsidian_critic_qwen35_t4x2_unsloth_kaggle.ipynb`: smoke-test notebook for the broad mix
- `obsidian_critic_qwen35_t4x2_unsloth_kaggle_full.ipynb`: full fine-tune lineage used to produce the public LoRA run

## Dataset Provenance

The training data for this first public stream comes from the mixed dataset:

- dataset name: `obsidian-critic-broad-mix-20260321`
- examples in mixed dataset: `37,008`
- approximate token volume: `6,885,960`
- exact duplicate `(user, assistant)` pairs removed during mix build: `3,469`
- normalized near-duplicates removed from wave backfill rows: `201`
- dataset SHA-256: `5ba1924b46d08a8ab8ad7ed5e1f74b13cc3e847b3a04b714934953975fd9300a`

The public training run then created a deterministic train / validation split and applied sequence-length filtering (a sketch of this kind of filter follows the list):

- train rows before filter: `36,638`
- validation rows before filter: `370`
- train rows after filter: `36,081`
- validation rows after filter: `363`
- removed for length filtering: `564`
- minimum kept sequence length: `48`
- maximum kept sequence length: `2048`
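
The exact pipeline is not published here, so the following is only a minimal sketch of a deterministic split plus token-length filter of the kind described, rendered with the prompt layout shown later in this card. The column names, seed, and `raw_ds` handle are all illustrative assumptions.

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Starred09/obsidian-critic-qwen35-4b-base-lora")

def render(example):
    # Hypothetical column names; the real dataset schema is not published.
    return (
        f"System:\n{example['system']}\n\n"
        f"User:\n{example['user']}\n\n"
        f"Assistant:\n{example['assistant']}"
    )

def keep_by_length(example):
    n_tokens = len(tokenizer(render(example)).input_ids)
    return 48 <= n_tokens <= 2048  # bounds reported above

# `raw_ds` stands in for a datasets.Dataset holding the 37,008-row mix.
split = raw_ds.train_test_split(test_size=370, shuffle=True, seed=42)  # seed illustrative
train_ds = split["train"].filter(keep_by_length)
val_ds = split["test"].filter(keep_by_length)
```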

### Mix Roles

| Role | Examples | Approx. tokens |
| --- | ---: | ---: |
| `repair` | 5,353 | 983,890 |
| `tool_use` | 4,682 | 455,600 |
| `core_real` | 4,200 | 1,043,187 |
| `robustness` | 3,600 | 397,399 |
| `agent_core` | 3,200 | 645,426 |
| `logic` | 3,031 | 297,812 |
| `factual` | 2,960 | 142,787 |
| `obsidian_docs` | 2,740 | 490,850 |
| `reasoning` | 2,200 | 655,624 |
| `greenfield` | 1,488 | 563,331 |
| `integration` | 1,473 | 241,331 |
| `review` | 1,327 | 343,007 |
| `regularizer` | 500 | 55,840 |
| `wave_backfill` | 230 | 218,922 |
| `long_context` | 24 | 350,954 |

### Source Dataset Table

| Dataset | Role | Examples | Approx. tokens |
| --- | --- | ---: | ---: |
| `real-world-grounded-topup-sft-20260320` | `core_real` | 3,000 | 804,280 |
| `robustness-noise-traps-sft-20260320` | `robustness` | 3,200 | 363,139 |
| `factual-erudition-sft-20260319` | `factual` | 2,960 | 142,787 |
| `agent-gap-fixes-sft-20260320` | `agent_core` | 2,500 | 555,785 |
| `code-fix-critical-topup-sft-20260321` | `repair` | 2,500 | 440,355 |
| `code-agent-tooluse-sft-20260319` | `tool_use` | 2,400 | 208,335 |
| `docs-engineering-review-topup-sft-20260320` | `obsidian_docs` | 1,600 | 206,794 |
| `format-tool-discipline-sft-20260319` | `tool_use` | 1,582 | 179,009 |
| `multi-step-debug-sft-20260319` | `reasoning` | 1,200 | 345,084 |
| `real-world-seed-expansion-sft-20260321` | `core_real` | 1,200 | 238,907 |
| `runtime-debug-grounded-sft-20260319` | `repair` | 1,193 | 218,845 |
| `logic-core-sft-20260319` | `logic` | 1,131 | 139,188 |
| `code-architecture-sft-20260319` | `greenfield` | 1,100 | 308,380 |
| `tdd-test-first-sft-20260319` | `reasoning` | 1,000 | 310,540 |
| `logic-sanity-sft-20260319` | `logic` | 996 | 69,260 |
| `code-repair-patch-sft-20260319` | `repair` | 955 | 145,544 |
| `logic-precision-ru-sft-20260319` | `logic` | 904 | 89,364 |
| `security-repair-review-sft-20260319` | `review` | 893 | 159,086 |
| `db-and-migrations-sft-20260319` | `integration` | 867 | 119,740 |
| `agent-gap-fixes-ru-topup-sft-20260320` | `agent_core` | 700 | 89,641 |
| `code-agent-tooluse-ru-topup-sft-20260320` | `tool_use` | 700 | 68,256 |
| `multi-file-repo-repair-sft-20260319` | `repair` | 705 | 179,146 |
| `backend-frontend-ops-sft-20260319` | `integration` | 606 | 121,591 |
| `docs-topup-sft-20260320` | `obsidian_docs` | 600 | 125,125 |
| `anti-overthinking-pack-sft-20260321` | `regularizer` | 500 | 55,840 |
| `docs-markdown-sft-20260318-v3` | `obsidian_docs` | 440 | 149,852 |
| `ts-rust-code-review-sft-20260318-v3` | `review` | 434 | 183,921 |
| `robustness-noise-traps-ru-topup-sft-20260320` | `robustness` | 400 | 34,260 |
| `ts-rust-coding-sft-20260318-v3` | `greenfield` | 388 | 254,951 |
| `wave-03-growth-sft-20260320` | `wave_backfill` | 230 | 218,922 |
| `docs-topup-ru-sft-20260320` | `obsidian_docs` | 100 | 9,079 |
| `long-context-memory-topup-sft-20260321` | `long_context` | 24 | 350,954 |

## Representative Training Examples

These are short excerpts from the actual broad-mix and component datasets used in the first public stream.

### Broad mix: grounded repo repair

**System**
```text
Ground the answer in one active boundary, not in general advice.
```

**User**
```text
First fix failed. Repo: packages/notes-index
Patch 1 removed one fallback in `app/config.py`.
New evidence: `docker compose: service worker exited with code 1`.
Need the second minimal fix.
```

**Assistant**
```text
The first patch hit the wrong seam. The new signal points back to `app/config.py`.
```

### Broad mix: Russian ambiguity tightening

**System**
```text
Пиши как строгий русскоязычный логический редактор: коротко, ясно, без филологии ради филологии.
```

**User**
```text
Оцени формулировку из инструкции:
`Разрешён доступ аналитики и владельцы без резервной копией.`
```

**Assistant**
```text
Да. Неясно, относится `без резервной копией` к обеим группам или только ко второй.
Строже: `И аналитики, и владельцы получают доступ без резервной копии.`
```

### Tool-use source: strict JSON action selection

**System**
```text
Ты code-workspace агент. Возвращай только валидный JSON.
```

**User**
```text
Исправь этот сломанный tool call и верни только валидный JSON.
{"tool":"search_codebase","args":{"query":"normalize_note_path","path":"src","limit":"4"}}
```

**Assistant**
```json
{"tool":"search_codebase","arguments":{"query":"normalize_note_path","path":"src"}}
```

### Docs source: structured README generation

**System**
```text
Ты technical writer для engineering workflows. Только полезная структура.
```

**User**
```text
Напиши полноценный README.md для `md-audit`.
Контекст: scan Markdown repositories for broken links, empty headings, duplicate slugs, and metadata regressions.
```

### Logic source: contradiction checking

**System**
```text
Будь аккуратен с условиями и кванторами. Предпочитай точность эффектности.
```

**User**
```text
Проверь набор на непротиворечивость.
1. `Борис` гость -> не может публиковать.
2. `Борис` гость.
3. `Борис` может публиковать.
```

**Assistant**
```text
Нет. Противоречие между выводом из 1+2 и пунктом 3.
```

## Training Recipe

The public run used the following settings (a hedged configuration sketch follows the list):

- distributed setup: **`torchrun` DDP**
- training framework: **Unsloth + TRL**
- base model loading: **4-bit**
- LoRA rank: **16**
- LoRA alpha: **16**
- LoRA dropout: **0.0**
- max sequence length: **2048**
- per-device train batch size: **1**
- gradient accumulation steps: **8**
- effective global batch size: **16** examples per optimizer step (1 per device × 8 accumulation steps × 2 GPUs)
- epochs: **1**
- optimizer: **`adamw_8bit`**
- scheduler: **cosine**
- learning rate: **1e-4**
- warmup steps: **5**
- gradient checkpointing: **enabled**
- FP16: **forced**
- packing: **disabled**
- completion-only loss: **disabled**
- public run total steps: **2256**
- logging / eval / save cadence: every **50 / 125 / 250** steps
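
Below is a hedged reconstruction of the trainer setup using `trl.SFTConfig`. Argument names shift between `trl` / `transformers` releases (the max-length and eval-strategy fields in particular), and the Unsloth 4-bit model preparation is elided, so treat this as a sketch of the reported settings rather than the actual training script.

```python
from trl import SFTConfig, SFTTrainer

# Mirrors the hyperparameters listed above; output_dir is illustrative.
config = SFTConfig(
    output_dir="graphite-1.0-4b-lora",
    per_device_train_batch_size=1,
    gradient_accumulation_steps=8,
    num_train_epochs=1,
    learning_rate=1e-4,
    lr_scheduler_type="cosine",
    warmup_steps=5,
    optim="adamw_8bit",
    fp16=True,
    gradient_checkpointing=True,
    max_seq_length=2048,     # renamed to `max_length` in newer trl releases
    packing=False,
    logging_steps=50,
    eval_strategy="steps",   # `evaluation_strategy` in older transformers
    eval_steps=125,
    save_steps=250,
)

trainer = SFTTrainer(
    model=model,             # Unsloth-prepared 4-bit PEFT model (not shown)
    args=config,
    train_dataset=train_ds,  # filtered splits from the provenance section
    eval_dataset=val_ds,
)
trainer.train()
```

With these settings, one epoch over the 36,081 filtered train rows at a global batch of 16 works out to ceil(36081 / 16) = 2256 optimizer steps, matching the reported total.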

## Prompt Style

This adapter was trained on a simple, explicit prompt layout:

```text
System:
<system prompt>

User:
<user prompt>

Assistant:
```

For best results, keep prompts concise, grounded, and task-shaped. The adapter responds best to:

- repo repair tasks with concrete evidence
- exact wording / logic cleanup tasks
- tool-call selection with explicit schemas
- technical writing with clear requested sections
- review / integration prompts that specify files, symptoms, and expected outcomes

## Quick Start

```python
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "Qwen/Qwen3.5-4B-Base"
adapter_id = "Starred09/obsidian-critic-qwen35-4b-base-lora"

tokenizer = AutoTokenizer.from_pretrained(adapter_id)
base_model = AutoModelForCausalLM.from_pretrained(
    base_id,
    torch_dtype="auto",
    device_map="auto",
)
model = PeftModel.from_pretrained(base_model, adapter_id)

system = "Return the smallest useful answer. Do not invent missing evidence."
user = "Repo: apps/desktop-shell. Build fails with ENOENT on dist/server.js. Point to the first file to inspect."
prompt = f"System:\\n{system}\\n\\nUser:\\n{user}\\n\\nAssistant:\\n"

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=160)

print(tokenizer.decode(out[0], skip_special_tokens=True))
```
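
The training run loaded the base model in 4-bit. To mirror that at inference time, here is a variant of the load above using `bitsandbytes` quantization, assuming the library is installed; the quant type is an assumed QLoRA-style default:

```python
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",             # assumption: common QLoRA default
    bnb_4bit_compute_dtype=torch.float16,  # matches the fp16 training setup
)

base_model = AutoModelForCausalLM.from_pretrained(
    base_id,  # same ids as the snippet above
    quantization_config=bnb_config,
    device_map="auto",
)
model = PeftModel.from_pretrained(base_model, adapter_id)
```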

## Intended Use

Graphite 1.0 4B is intended for:

- coding assistants
- repo triage and patch-planning copilots
- Markdown / docs tooling assistants
- logic and wording critique
- bilingual technical task routing

It is especially useful when you want **short, grounded, non-theatrical outputs** instead of generic assistant prose.

## Limitations

- This is an **adapter**, not a standalone merged model; a local merge sketch follows this list.
- It is tuned for **structured technical work**, not general consumer chat.
- It inherits both strengths and weaknesses from `Qwen/Qwen3.5-4B-Base`.
- The broad mix is intentionally heavy on repair, tool-use, and reasoning, so purely creative behavior is not a target.
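
If a deployment needs standalone weights, PEFT can merge the adapter into the base model locally. A minimal sketch follows; the output path is illustrative, and the merge should start from a non-quantized base load (as in Quick Start), not the 4-bit variant:

```python
# `model` is the PeftModel from the Quick Start snippet.
merged = model.merge_and_unload()
merged.save_pretrained("graphite-1.0-4b-merged")     # illustrative path
tokenizer.save_pretrained("graphite-1.0-4b-merged")
```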

## License

This repository is released under **Apache License 2.0**. See [`LICENSE`](./LICENSE).

Please also review the license and usage terms of the base model:

- [`Qwen/Qwen3.5-4B-Base`](https://huggingface.co/Qwen/Qwen3.5-4B-Base)

## Acknowledgements

- Alibaba Qwen team for the base model
- Unsloth for the efficient LoRA training stack
- TRL / Transformers / PEFT / PyTorch maintainers