Gryphe
/

WorldSim-Opus-3.6-35B-A3B

+---
+license: apache-2.0
+language:
+- en
+base_model:
+- Qwen/Qwen3.6-35B-A3B
+datasets:
+- Gryphe/Opus-4.6-Reasoning-24k
+tags:
+- qwen3_6_moe
+- conversational
+- instruct
+- finetune
+- chatml
+- axolotl
+- roleplay
+- reasoning
+- creative-writing
+pipeline_tag: text-generation
+---
+# WorldSim-Opus-3.6-35B-A3B
+[![image/jpg](WorldSim-Opus.jpg)](WorldSim-Opus.jpg)
+An experiment in fusing creative world simulation and genuine reasoning capability into a single Qwen 3.6 MoE model.
+The idea here was simple: find out whether a small reasoning model can roleplay properly if fed high quality data. Every dataset used here includes full thinking traces, so the model reasons its way through creative writing — planning story beats, considering character motivations, and working through consequences before committing to a response.
+...Or so the theory goes!
+## Model details
+Base model is [Qwen/Qwen3.6-35B-A3B](https://huggingface.co/Qwen/Qwen3.6-35B-A3B) - MoEs are neat little things, and this one is actually (finally!) remarkably easy to train with Axolotl. One huge benefit is that the model's thinking traces have become much more concise after training since this model absolutely loves overthinking.
+All three training sources are reasoning datasets, meaning every assistant turn includes a full thinking trace:
+- **[Opus-4.6-Reasoning-24k](https://huggingface.co/datasets/Gryphe/Opus-4.6-Reasoning-24k)** (50%) - a cleaned and deduplicated aggregation of Claude Opus 4.6 reasoning traces, covering general instruction-following, STEM, and coding domains
+- **WorldSim data** (40%) - long-form Opus 4.6 narrative roleplay with full reasoning traces, focusing on extended storytelling, character immersion, and emergent world logic, cobbled together through various experiments - mainly third person present tense but has a bit of everything + cliché cleaned, of course!
+- **Tiamat data** (10%) - character and roleplay dataset originally built for [Tiamat-24B-Magistral](https://huggingface.co/Gryphe/Tiamat-24B-Magistral), featuring a multi-step generation/extension/improvement pipeline with critic-improver rewrites to reduce AI clichés, with reasoning back-generated for each exchange
+The model was trained with `preserve_thinking: true`, so thinking tags are active across all assistant turns in multi-turn conversations, not just the last one.
+## Inference
+These settings have been working well for me:
+```
+"temperature": 0.8,
+"repetition_penalty": 1.05,
+"min_p": 0.05
+```
+I obviously recommend leaving thinking enabled, and ideally with `preserve_thinking` turned on.
+## Prompt Format
+The model was trained using ChatML via Qwen3.6's chat template, which should be applied automatically.
+Since reasoning doesn't tend to play nice with character name prefixes enabled I'm inclined to recommend against using them.
+## Notes
+This is, like always, a research release and hasn't gone through extensive quality testing beyond basic sanity checks. The blend of reasoning + creative data is an experiment, and I'm genuinely not sure how well the two domains mix in practice. Let me know what you find! To me it feels absurdly promising, but I could be very wrong here, hence me sharing it with you all.
+## Credits
+- Everyone from [Anthracite](https://huggingface.co/anthracite-org)! Hi, guys! Still alive!
+- [Latitude](https://huggingface.co/LatitudeGames), who decided to take me on as a finetuner and gave me the chance to accumulate even more experience in this fascinating field
+- All the original dataset authors behind the Opus 4.6 reasoning data — full credits in the [dataset card](https://huggingface.co/datasets/Gryphe/Opus-4.6-Reasoning-24k)
+- All the folks I chat with on a daily basis on Discord! You know who you are.
+- Anyone I forgot to mention, just in case!