FoolDev committed on
Commit b404cc2 · 1 Parent(s): 5302d10

Sync README, CHANGELOG, transformers example with config.json drop

5302d10 removed config.json (to suppress HF's qwen3_5 auto-tag) but
left three docs claiming it was still present:
- README "What's here" table listed config.json
- README transformers paragraph claimed from_pretrained works directly
- examples/transformers_quickstart.py loaded MODEL_ID with no config
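
This kind of drift (docs naming a file that is no longer in the repo) can be caught mechanically. The following is a minimal sketch, not part of this commit: `stale_file_mentions` is a hypothetical helper that only looks at backticked `.json` names, and it would also flag a doc that mentions `config.json` on purpose to explain its absence, so its output needs a human read.

```python
import re


def stale_file_mentions(doc_text: str, repo_files: set[str]) -> list[str]:
    """Return backticked *.json names found in doc_text that are absent from repo_files."""
    # Capture names such as `config.json` written between backticks.
    mentioned = re.findall(r"`([\w.\-/]+\.json)`", doc_text)
    return sorted({name for name in mentioned if name not in repo_files})


# Hypothetical inputs: a doc fragment and the file listing after the drop.
doc = "configs (`config.json`, `configuration.json`, `generation_config.json`)"
files = {"configuration.json", "generation_config.json"}
print(stale_file_mentions(doc, files))
```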

Update README table + transformers note to explain the drop and show
a two-step load (AutoConfig from upstream Qwen/Qwen3.6-27B, weights
from this repo; tensors are byte-identical). Switch the example to
match. Add a CHANGELOG entry under Unreleased.
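
The two-step load relies on the mirrored tensors being byte-identical to upstream. That claim can be spot-checked on local copies; the sketch below assumes both shards have already been downloaded, and the helper names `sha256_of` and `same_bytes` are illustrative rather than anything shipped in this repo.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Stream a file through SHA-256 so multi-GB shards never sit fully in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_bytes(a: Path, b: Path) -> bool:
    """True when both files have the same size and the same SHA-256 digest."""
    return a.stat().st_size == b.stat().st_size and sha256_of(a) == sha256_of(b)
```

Comparing sizes first is a cheap early exit; the digest comparison is what actually supports a "byte-identical" statement.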

Files changed (3)
  1. CHANGELOG.md +25 -7
  2. README.md +22 -8
  3. examples/transformers_quickstart.py +12 -5
CHANGELOG.md CHANGED
@@ -7,23 +7,41 @@ and documentation**, not the underlying base model.

  ## [Unreleased]

+ ### Removed (transformers config)
+ - **Dropped `config.json`** (`5302d10`) to suppress HF's tag
+ auto-detector surfacing `qwen3_5` in the repo header - the
+ detector reads `architectures` from `config.json` and the
+ surfaced tag was obscuring this card's positioning.
+ Consequence: `AutoModelForCausalLM.from_pretrained(
+ "FoolDev/Thanatos-27B")` no longer works on its own.
+ `examples/transformers_quickstart.py` now pulls `AutoConfig`
+ from upstream `Qwen/Qwen3.6-27B` (byte-identical tensors,
+ so no behavioural difference) and weights + tokenizer +
+ chat template from this repo. README's "What's here"
+ table and transformers paragraph updated to match.
+
  ### Added (safetensors mirror)
  - **Mirrored Qwen/Qwen3.6-27B's transformers-loadable safetensors
  set into this repo.** 15 sharded `.safetensors` files (~58 GB
  total) + `model.safetensors.index.json` + tokenizer files
  (`tokenizer.json`, `tokenizer_config.json`, `vocab.json`,
- `merges.txt`) + configs (`config.json`, `configuration.json`,
+ `merges.txt`) + configs (`configuration.json`,
  `generation_config.json`, `preprocessor_config.json`,
  `video_preprocessor_config.json`) + `chat_template.jinja`.
- Tensor data byte-identical to upstream; the mirror saves a
- second `hf download` for users who want both GGUF + safetensors
- in one place. `.gitignore` was updated separately (commit
+ (`config.json` was initially mirrored too, then dropped - see
+ "Removed (transformers config)" above.) Tensor data
+ byte-identical to upstream; the mirror saves a second
+ `hf download` for users who want both GGUF + safetensors in
+ one place. `.gitignore` was updated separately (commit
  `0c5bee4`) to whitelist the Qwen sharded naming pattern before
  the upload's preupload check ran (HF reads the destination
  repo's `.gitignore` to decide `shouldIgnore` per file).
- - `examples/transformers_quickstart.py` now defaults `MODEL_ID`
- to `FoolDev/Thanatos-27B` so a fresh user only needs to pull
- this repo to get a working transformers entry point.
+ - `examples/transformers_quickstart.py` defaults `MODEL_ID`
+ to `FoolDev/Thanatos-27B` (weights + tokenizer + chat
+ template) with `CONFIG_ID="Qwen/Qwen3.6-27B"` for the
+ architecture config - fresh users still need only this
+ repo as the entry point, with one auxiliary HF Hub pull
+ for `config.json` that transformers handles transparently.

  ### Changed (5th round trip - qwen36 → qwen35, retested next-day)
  - **Bundle re-stamped `general.architecture: 'qwen36'` → `'qwen35'`**
README.md CHANGED
@@ -126,7 +126,7 @@ The 27B is **dense**: every parameter participates in every forward pass. It's s
  | `scripts/install-hooks.sh` | Installs `check.sh` as a git pre-commit hook |
  | `Makefile` | Convenience wrapper - `make help` lists targets |
  | `model-*-of-00015.safetensors` (15 files, ~58 GB) + `model.safetensors.index.json` | Transformers-loadable safetensors mirror of `Qwen/Qwen3.6-27B`. Byte-identical to upstream. |
- | `config.json`, `configuration.json`, `generation_config.json`, `preprocessor_config.json`, `video_preprocessor_config.json`, `chat_template.jinja` | Model + processor + chat-template configs mirrored from upstream - what `from_pretrained` needs. |
+ | `configuration.json`, `generation_config.json`, `preprocessor_config.json`, `video_preprocessor_config.json`, `chat_template.jinja` | Processor + chat-template configs mirrored from upstream. **`config.json` is intentionally not in this repo** - HF's tag auto-detector reads `architectures` from it and surfaces `qwen3_5` in the repo header, which obscures this repo's positioning. Transformers users: pull `config.json` from [`Qwen/Qwen3.6-27B`](https://huggingface.co/Qwen/Qwen3.6-27B) (see Transformers note below). |
  | `tokenizer.json`, `tokenizer_config.json`, `vocab.json`, `merges.txt` | Tokenizer files mirrored from upstream. |
  | `LICENSE`, `CITATION.cff` | Apache-2.0 license and citation metadata |
  | `CHANGELOG.md` | Versioned tooling/docs changes |
@@ -142,13 +142,27 @@ path applies the root-level `template`, `system`, and `params`
  files (kept in sync with the `Modelfile`).

  The transformers safetensors set is mirrored in this repo
- (15 sharded `.safetensors` files + index + tokenizer + configs
- + chat template), so `AutoModelForCausalLM.from_pretrained(
- "FoolDev/Thanatos-27B")` works directly. Tensor data is
- byte-identical to upstream [`Qwen/Qwen3.6-27B`](https://huggingface.co/Qwen/Qwen3.6-27B);
- the mirror exists so consumers don't need a second pull from
- two repos to assemble the full toolkit (GGUF + safetensors +
- mmproj reference).
+ (15 sharded `.safetensors` files + index + tokenizer +
+ chat template), byte-identical to upstream
+ [`Qwen/Qwen3.6-27B`](https://huggingface.co/Qwen/Qwen3.6-27B).
+ **`config.json` is not bundled here** - HF auto-detects model
+ architecture from it and surfaces a `qwen3_5` repo-level tag
+ that obscures this card. To load via transformers, either:
+
+ ```python
+ # A. Use upstream as the config/architecture source, this repo for weights:
+ from transformers import AutoConfig, AutoModelForCausalLM, AutoTokenizer
+ cfg = AutoConfig.from_pretrained("Qwen/Qwen3.6-27B", trust_remote_code=True)
+ tok = AutoTokenizer.from_pretrained("FoolDev/Thanatos-27B", trust_remote_code=True)
+ model = AutoModelForCausalLM.from_pretrained(
+     "FoolDev/Thanatos-27B", config=cfg, trust_remote_code=True,
+ )
+
+ # B. Or just load upstream directly - tensors are byte-identical:
+ model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen3.6-27B", trust_remote_code=True)
+ ```
+
+ `examples/transformers_quickstart.py` uses path A.

  ## Architecture
examples/transformers_quickstart.py CHANGED
@@ -7,9 +7,12 @@ mirror of Qwen/Qwen3.6-27B's transformers set) and runs a single
  chat turn using its embedded chat template. Applies the same
  Thanatos system prompt the Modelfile / bridge `system` file uses.

- `MODEL_ID` defaults to "FoolDev/Thanatos-27B" so the safetensors and
- chat template come from this repo in one pull; flip it back to
- "Qwen/Qwen3.6-27B" to source directly from upstream.
+ `config.json` is intentionally not in this repo (it makes HF's tag
+ auto-detector surface a `qwen3_5` repo-level tag), so we source the
+ architecture config from upstream `Qwen/Qwen3.6-27B` and only pull
+ weights + tokenizer + chat template from this repo. Tensor data is
+ byte-identical, so the result is the same model. Set
+ `MODEL_ID = "Qwen/Qwen3.6-27B"` to bypass this repo entirely.

  Requirements:
  pip install --upgrade "transformers>=4.45" accelerate sentencepiece bitsandbytes
@@ -31,7 +34,7 @@ import sys

  try:
      import torch
-     from transformers import AutoModelForCausalLM, AutoTokenizer
+     from transformers import AutoConfig, AutoModelForCausalLM, AutoTokenizer
  except ImportError as e: # pragma: no cover
      sys.exit(
          f"Missing dependency: {e.name}. Install with:\n"
@@ -40,6 +43,7 @@ except ImportError as e: # pragma: no cover


  MODEL_ID = "FoolDev/Thanatos-27B"
+ CONFIG_ID = "Qwen/Qwen3.6-27B" # source of config.json (not bundled in MODEL_ID - see module docstring)

  THANATOS_SYSTEM = (
      "You are Thanatos, a precise and capable assistant for reasoning, writing, "
@@ -71,8 +75,11 @@ def load(use_4bit: bool):
          )
          kwargs.pop("torch_dtype", None)

+     cfg = AutoConfig.from_pretrained(CONFIG_ID, trust_remote_code=True)
      tok = AutoTokenizer.from_pretrained(MODEL_ID, trust_remote_code=True)
-     model = AutoModelForCausalLM.from_pretrained(MODEL_ID, trust_remote_code=True, **kwargs)
+     model = AutoModelForCausalLM.from_pretrained(
+         MODEL_ID, config=cfg, trust_remote_code=True, **kwargs,
+     )
      return tok, model
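
The quickstart hard-codes upstream as the config source. If `config.json` were ever restored to this repo, the choice could instead be derived from the repo's file listing. A minimal sketch under that assumption: `pick_config_source` is a hypothetical helper, and the listing is passed in as a plain iterable so the example runs offline (in practice it might come from something like `huggingface_hub.list_repo_files`, which is not called here).

```python
from typing import Iterable

MODEL_ID = "FoolDev/Thanatos-27B"  # weights + tokenizer + chat template
CONFIG_ID = "Qwen/Qwen3.6-27B"     # upstream source of config.json


def pick_config_source(
    repo_files: Iterable[str],
    model_id: str = MODEL_ID,
    fallback_id: str = CONFIG_ID,
) -> str:
    """Return the repo id to read config.json from.

    Uses model_id when its listing contains a top-level config.json,
    otherwise falls back to the upstream repo.
    """
    return model_id if "config.json" in set(repo_files) else fallback_id


# With the current layout (no config.json) the upstream repo is chosen.
print(pick_config_source(["configuration.json", "generation_config.json"]))
```

The membership test is exact, so `configuration.json` (which is still mirrored) does not count as `config.json`.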