Gryphe commited on
Commit
c60e18b
·
verified ·
1 Parent(s): 2bd7f01

Create README.md

Browse files
Files changed (1) hide show
  1. README.md +72 -0
README.md ADDED
@@ -0,0 +1,72 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ ---
2
+ license: apache-2.0
3
+ language:
4
+ - en
5
+ base_model:
6
+ - Qwen/Qwen3.6-35B-A3B
7
+ datasets:
8
+ - Gryphe/Opus-4.6-Reasoning-24k
9
+ tags:
10
+ - qwen3_6_moe
11
+ - conversational
12
+ - instruct
13
+ - finetune
14
+ - chatml
15
+ - axolotl
16
+ - roleplay
17
+ - reasoning
18
+ - creative-writing
19
+ pipeline_tag: text-generation
20
+ ---
21
+
22
+ # WorldSim-Opus-3.6-35B-A3B
23
+
24
+ [![image/jpg](WorldSim-Opus.jpg)](WorldSim-Opus.jpg)
25
+
26
+ An experiment in fusing creative world simulation and genuine reasoning capability into a single Qwen 3.6 MoE model.
27
+
28
+ The idea here was simple: find out whether a small reasoning model can roleplay properly if fed high quality data. Every dataset used here includes full thinking traces, so the model reasons its way through creative writing — planning story beats, considering character motivations, and working through consequences before committing to a response.
29
+
30
+ ...Or so the theory goes!
31
+
32
+ ## Model details
33
+
34
+ Base model is [Qwen/Qwen3.6-35B-A3B](https://huggingface.co/Qwen/Qwen3.6-35B-A3B) - MoEs are neat little things, and this one is actually (finally!) remarkably easy to train with Axolotl. One huge benefit is that the model's thinking traces have become much more concise after training since this model absolutely loves overthinking.
35
+
36
+ All three training sources are reasoning datasets, meaning every assistant turn includes a full thinking trace:
37
+
38
+ - **[Opus-4.6-Reasoning-24k](https://huggingface.co/datasets/Gryphe/Opus-4.6-Reasoning-24k)** (50%) - a cleaned and deduplicated aggregation of Claude Opus 4.6 reasoning traces, covering general instruction-following, STEM, and coding domains
39
+ - **WorldSim data** (40%) - long-form Opus 4.6 narrative roleplay with full reasoning traces, focusing on extended storytelling, character immersion, and emergent world logic, cobbled together through various experiments - mainly third person present tense but has a bit of everything + cliché cleaned, of course!
40
+ - **Tiamat data** (10%) - character and roleplay dataset originally built for [Tiamat-24B-Magistral](https://huggingface.co/Gryphe/Tiamat-24B-Magistral), featuring a multi-step generation/extension/improvement pipeline with critic-improver rewrites to reduce AI clichés, with reasoning back-generated for each exchange
41
+
42
+ The model was trained with `preserve_thinking: true`, so thinking tags are active across all assistant turns in multi-turn conversations, not just the last one.
43
+
44
+ ## Inference
45
+
46
+ These settings have been working well for me:
47
+
48
+ ```
49
+ "temperature": 0.8,
50
+ "repetition_penalty": 1.05,
51
+ "min_p": 0.05
52
+ ```
53
+
54
+ I obviously recommend leaving thinking enabled, and ideally with `preserve_thinking` turned on.
55
+
56
+ ## Prompt Format
57
+
58
+ The model was trained using ChatML via Qwen3.6's chat template, which should be applied automatically.
59
+
60
+ Since reasoning doesn't tend to play nice with character name prefixes enabled I'm inclined to recommend against using them.
61
+
62
+ ## Notes
63
+
64
+ This is, like always, a research release and hasn't gone through extensive quality testing beyond basic sanity checks. The blend of reasoning + creative data is an experiment, and I'm genuinely not sure how well the two domains mix in practice. Let me know what you find! To me it feels absurdly promising, but I could be very wrong here, hence me sharing it with you all.
65
+
66
+ ## Credits
67
+
68
+ - Everyone from [Anthracite](https://huggingface.co/anthracite-org)! Hi, guys! Still alive!
69
+ - [Latitude](https://huggingface.co/LatitudeGames), who decided to take me on as a finetuner and gave me the chance to accumulate even more experience in this fascinating field
70
+ - All the original dataset authors behind the Opus 4.6 reasoning data — full credits in the [dataset card](https://huggingface.co/datasets/Gryphe/Opus-4.6-Reasoning-24k)
71
+ - All the folks I chat with on a daily basis on Discord! You know who you are.
72
+ - Anyone I forgot to mention, just in case!