aimeri/SpoomplesMaxx-Base

aimeri committed on Dec 23, 2025

Commit da89015

1 Parent(s): fc7c357

Create README.md

README.md +127 -0

	@@ -0,0 +1,127 @@

+---
+license: mit
+base_model: Qwen/Qwen3-14B-Base
+tags:
+  - cpt
+  - continued-pretraining
+  - roleplay
+  - creative-writing
+  - character-cards
+  - fiction
+datasets:
+  - nyuuzyou/fandom
+  - gryffindor-ISWS/dbpedia_abstracts_fictional_characters_with_img
+language:
+  - en
+pipeline_tag: text-generation
+---
+# SpoomplesMaxx Base
+A continued pre-training (CPT) checkpoint of [Qwen/Qwen3-14B-Base](https://huggingface.co/Qwen/Qwen3-14B-Base), trained on creative writing and roleplay data.
+## Model Description
+This model is part of the SpoomplesMaxx training pipeline: **CPT → SFT → DPO**
+The CPT stage teaches the model:
+- Character understanding and portrayal
+- Creative fiction writing patterns
+- Fandom/wiki-style lore knowledge
+- Dialogue patterns for roleplay
+## Training Data
+### Phase 1: Core Knowledge
+This checkpoint was trained on data focused on character knowledge and lore:
+| Dataset | Source | Samples | Description |
+|---------|--------|---------|-------------|
+| **Private Dataset** | Private | ~100k (50,000 sampled) | SillyTavern-style character cards (personality, scenario, example dialogue), plus fanfics, essays about media and characters, short novels, and high-quality roleplay data |
+| [nyuuzyou/fandom](https://huggingface.co/datasets/nyuuzyou/fandom) | Hugging Face | 50,000 (sampled) | Fandom wiki articles with character/world lore |
+| [gryffindor-ISWS/dbpedia_abstracts_fictional_characters_with_img](https://huggingface.co/datasets/gryffindor-ISWS/dbpedia_abstracts_fictional_characters_with_img) | Hugging Face | 50,000 (sampled) | DBpedia abstracts of fictional characters |
+**Total training samples:** ~46k
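
The card gives per-source sample counts but does not describe how the sources were combined. As a rough illustration only, the sketch below caps each source with a seeded random sample and then duplicates a priority source a fixed number of times. The function name `build_mixture`, the seed, the toy records, and the reading of the Priority Repeat setting as plain duplication are all assumptions, not the author's pipeline.

```python
import random
from typing import Dict, List, Sequence


def build_mixture(
    sources: Dict[str, Sequence[str]],
    cap_per_source: int,
    priority: Dict[str, int],
    seed: int = 0,
) -> List[str]:
    """Cap each source, repeat priority sources, and shuffle the result.

    Hypothetical helper: `priority` maps a source name to a repeat count
    (sources not listed are used once).
    """
    rng = random.Random(seed)  # seeded so the mixture is reproducible
    mixture: List[str] = []
    for name, records in sources.items():
        # Sample without replacement, never asking for more than exists.
        k = min(cap_per_source, len(records))
        sampled = rng.sample(list(records), k)
        # Assumed meaning of "priority repeat": duplicate the sampled records.
        mixture.extend(sampled * priority.get(name, 1))
    rng.shuffle(mixture)
    return mixture


# Toy stand-ins for the three sources; real records would be full documents.
toy_sources = {
    "cards": [f"card-{i}" for i in range(5)],
    "fandom": [f"wiki-{i}" for i in range(5)],
    "dbpedia": [f"abstract-{i}" for i in range(2)],
}
toy_mixture = build_mixture(toy_sources, cap_per_source=3, priority={"cards": 2})
```

With the toy inputs, the capped `cards` sample appears twice, `fandom` is capped at three records, and `dbpedia` contributes both of its records because it is smaller than the cap.
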
+## Training Configuration
+| Parameter | Value |
+|-----------|-------|
+| Base Model | `Qwen/Qwen3-14B-Base` |
+| Training Phase | Phase 1 (Core Knowledge) |
+| Steps | 1000 / 3000 |
+| Batch Size | 1 |
+| Gradient Accumulation | 16 |
+| **Effective Batch Size** | **16** |
+| Learning Rate | 1e-5 |
+| LR Scheduler | Cosine |
+| Warmup Ratio | 5% |
+| Max Sequence Length | 8192 |
+| Precision | BF16 |
+| Optimizer | 8-bit Paged AdamW |
+| Gradient Checkpointing | ✓ |
+| Priority Repeat | 50× (character cards) |
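
The derived figures in the table can be checked with a short calculation. The sketch below is illustrative: it assumes the 5% warmup applies to the full 3000-step schedule, that warmup is linear, and that the cosine decays to zero, none of which the card states. The function names are hypothetical.

```python
import math

# Values taken from the Training Configuration table.
BATCH_SIZE = 1
GRAD_ACCUM = 16
PEAK_LR = 1e-5
TOTAL_STEPS = 3000   # assumption: the schedule spans all 3000 planned steps
WARMUP_RATIO = 0.05
MAX_SEQ_LEN = 8192


def effective_batch_size(batch_size: int, grad_accum: int, num_gpus: int = 1) -> int:
    """Sequences contributing to one optimizer step."""
    return batch_size * grad_accum * num_gpus


def warmup_steps(total_steps: int, ratio: float) -> int:
    """Number of warmup steps implied by a warmup ratio."""
    return round(total_steps * ratio)


def lr_at(step: int) -> float:
    """Learning rate under linear warmup followed by cosine decay to zero."""
    warmup = warmup_steps(TOTAL_STEPS, WARMUP_RATIO)
    if step < warmup:
        return PEAK_LR * step / max(1, warmup)
    progress = (step - warmup) / max(1, TOTAL_STEPS - warmup)
    return PEAK_LR * 0.5 * (1.0 + math.cos(math.pi * progress))


eff = effective_batch_size(BATCH_SIZE, GRAD_ACCUM)
# Upper bound only: sequences shorter than MAX_SEQ_LEN contribute fewer tokens.
max_tokens_per_step = eff * MAX_SEQ_LEN

if __name__ == "__main__":
    print("effective batch size:", eff)
    print("warmup steps:", warmup_steps(TOTAL_STEPS, WARMUP_RATIO))
    print("max tokens per optimizer step:", max_tokens_per_step)
    print("learning rate at step 1000:", lr_at(1000))
```

Under these assumptions warmup lasts 150 steps, each optimizer step covers at most 16 × 8192 = 131,072 tokens, and the 1000-step checkpoint sits partway down the cosine decay at a learning rate of about 8e-6.
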
+### Hardware
+- **GPU:** 1× NVIDIA A800
+- **Training Time:** ~6 hours for 1000 steps
+## Intended Use
+This model is intended as a creative-writing base model for further fine-tuning.
+### Not Recommended For:
+- Production deployment (use the final model after the full CPT → SFT → DPO pipeline)
+- Direct chat or instruction following (this is a continued base model, not an instruction-tuned one)
+## Limitations
+- **No instruction tuning:** This model continues raw text; it does not follow chat or instruction prompts
+- **Private data bias:** Heavy weighting toward private character cards may introduce specific character patterns
+- **NSFW content:** Training data includes creative fiction that may contain mature themes. No safety filtering was applied at this stage.
+## How to Use
+```python
+from transformers import AutoModelForCausalLM, AutoTokenizer
+
+model_id = "aimeri/SpoomplesMaxx-CPT-3-Base"
+
+# Load weights in the checkpoint's stored dtype and spread them across available devices.
+# Older transformers releases take `torch_dtype` instead of `dtype`.
+model = AutoModelForCausalLM.from_pretrained(
+    model_id,
+    dtype="auto",
+    device_map="auto",
+)
+tokenizer = AutoTokenizer.from_pretrained(model_id)
+
+# CPT models continue text; they do not follow chat or instruction prompts.
+prompt = "The castle stood silent against the darkening sky, its towers reaching toward clouds that promised rain. Inside,"
+inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
+
+# Sample a continuation of up to 200 new tokens.
+outputs = model.generate(**inputs, max_new_tokens=200, do_sample=True, temperature=0.8)
+print(tokenizer.decode(outputs[0], skip_special_tokens=True))
+```
+## Citation
+If you use this model, please cite the base model and datasets:
+```bibtex
+@misc{qwen3-14b-base,
+  title={Qwen3-14B-Base},
+  author={Qwen Team},
+  year={2025},
+  publisher={Hugging Face},
+  url={https://huggingface.co/Qwen/Qwen3-14B-Base}
+}
+```
+## Acknowledgments
+- [Qwen Team](https://huggingface.co/Qwen) for the excellent base model
+- [nyuuzyou](https://huggingface.co/nyuuzyou) for the Fandom wiki dataset
+- [Archive of Our Own](https://archiveofourown.org/) for creative fiction
+---