Files changed (3)
  1. README.md +3 -3
  2. model.safetensors +1 -1
  3. training_state.pt +2 -2
README.md CHANGED
@@ -9,13 +9,13 @@ pipeline_tag: text-generation
 
 # Qwen3 mHC
 
-This checkpoint is a Manifold-Constrained Hyper-Connections (mHC) V2 variant of Qwen/Qwen3-0.6B, trained for 100k steps in a parity-mixed setup. It is intended for research on residual stream mixing and hyper-connection behavior.
+This checkpoint is a Manifold-Constrained Hyper-Connections (mHC) V2 variant of Qwen/Qwen3-0.6B, trained for 30k steps in a parity-mixed setup. It is intended for research on residual stream mixing and hyper-connection behavior.
 
 ## Model Description
 
 - **Base model:** Qwen/Qwen3-0.6B
 - **Architecture:** Qwen3 with mHC V2 hyper-connections (stream-mixing)
-- **Checkpoint:** 100,000 steps
+- **Checkpoint:** 30,000 steps
 - **Language(s):** Multilingual (see data notes)
 - **License:** Apache-2.0 (inherits base model license)
 
@@ -37,7 +37,7 @@ This checkpoint was trained on multilingual pretokenized datasets, primarily San
 ## Training Procedure
 
 - Converted from a Qwen3 base checkpoint into an mHC V2 model.
-- Trained for 100k steps in a parity-mixed run.
+- Trained for 30k steps in a parity-mixed run.
 - Uses Sinkhorn-based projection for residual mixing stability.
 
 ## Evaluation
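The README hunk above mentions a Sinkhorn-based projection but does not spell it out. As a rough illustration only, the sketch below shows plain Sinkhorn-Knopp normalization: raw scores are exponentiated to make them positive, then rows and columns are rescaled in turn until the matrix is approximately doubly stochastic. The function name `sinkhorn_project`, the exponentiation step, the iteration count, and the `eps` guard are assumptions made for this example and are not taken from this checkpoint's training code.

```python
import math


def sinkhorn_project(scores, n_iters=20, eps=1e-9):
    """Map a square matrix of raw scores to an approximately doubly stochastic matrix.

    Hypothetical helper for illustration; not the checkpoint's actual implementation.
    """
    n = len(scores)
    # Exponentiate so every entry is strictly positive (assumed preprocessing step).
    m = [[math.exp(v) for v in row] for row in scores]
    for _ in range(n_iters):
        # Rescale each row so it sums to 1.
        for i in range(n):
            row_sum = sum(m[i]) + eps
            m[i] = [v / row_sum for v in m[i]]
        # Rescale each column so it sums to 1.
        for j in range(n):
            col_sum = sum(m[i][j] for i in range(n)) + eps
            for i in range(n):
                m[i][j] /= col_sum
    return m


if __name__ == "__main__":
    mixing = sinkhorn_project([[0.0, 1.0, 2.0], [1.0, 0.0, 1.0], [2.0, 1.0, 0.0]])
    for row in mixing:
        print([round(v, 4) for v in row])
```

Because every row and column of the result sums to roughly 1, repeatedly applying such a matrix to a set of residual streams cannot scale their total up or down, which is one plausible reading of the "mixing stability" the bullet refers to.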
model.safetensors CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:8a53fe57c6e9db80d33eb308f3653bb20dbc38699ed096c973a3998e8ddd4c51
+oid sha256:aad3344026d91d0467a056746fa3bb2d3f085778ae0715a42731cb5247134d3e
 size 1192998544
training_state.pt CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:72b0d8f5cb125d37f2bd3181882114979ef1fda40beccc460b99e660e540aadc
-size 2387347844
+oid sha256:f2d23078ace67f6ac34dd3d8b7ea4846d4c0347d324e30a73ce2ab64e09d09dc
+size 2386717700
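The `model.safetensors` and `training_state.pt` diffs above only touch Git LFS pointer files. In a pointer, `oid sha256:` is the SHA-256 digest of the real file's contents and `size` is its length in bytes, so a downloaded file can be checked against the pointer. The following is a minimal sketch of that check; the helper names `parse_lfs_pointer` and `matches_pointer` and the local path in the usage lines are hypothetical.

```python
import hashlib
import os


def parse_lfs_pointer(text):
    """Split the 'key value' lines of a Git LFS pointer into a dict."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    return fields


def matches_pointer(path, pointer_text, chunk_size=1 << 20):
    """Return True if the file at `path` has the size and SHA-256 named in the pointer."""
    fields = parse_lfs_pointer(pointer_text)
    algo, _, expected = fields["oid"].partition(":")
    if algo != "sha256":
        raise ValueError(f"unsupported oid algorithm: {algo}")
    # Cheap check first: a size mismatch means the hash cannot match either.
    if os.path.getsize(path) != int(fields["size"]):
        return False
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Hash in chunks so multi-gigabyte files are never loaded into memory at once.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected


if __name__ == "__main__":
    pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:aad3344026d91d0467a056746fa3bb2d3f085778ae0715a42731cb5247134d3e
size 1192998544
"""
    local_path = "model.safetensors"  # hypothetical local download location
    if os.path.exists(local_path):
        print(matches_pointer(local_path, pointer))
```

Note that `model.safetensors` keeps the same `size` across this commit while its `oid` changes, so only the hash comparison can tell the two versions apart.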