---
license: apache-2.0
base_model: ibm-granite/granite-4.1-8b-base
base_model_relation: finetune
datasets:
- aimeri/st-characters-alpaca
language:
- en
library_name: transformers
pipeline_tag: text-generation
tags:
- sillytavern
- character-cards
- character-card-generation
- roleplay
- granite
- granite-4.1
- unsloth
- trl
- sft
- lora
- conversational
---

# SpoomplesMaxx Card Maker V1

A fine-tune of [`ibm-granite/granite-4.1-8b-base`](https://huggingface.co/ibm-granite/granite-4.1-8b-base) that turns a short, open-ended prompt into a complete [SillyTavern](https://github.com/SillyTavern/SillyTavern) character card. Give it a concept (an archetype, a name and a few constraints, or just a one-liner) and it generates a full V2/V3-style card: description, personality, scenario, first message, example messages, and sometimes a lorebook.

## Model Details

- **Developed by:** [aimeri](https://huggingface.co/aimeri)
- **Base model:** [`ibm-granite/granite-4.1-8b-base`](https://huggingface.co/ibm-granite/granite-4.1-8b-base) (Apache 2.0)
- **Language:** English
- **Fine-tuned from a base (not instruct) checkpoint** so output is the card itself, with no assistant-style preamble, disclaimers, or refusals.
- **License:** Apache 2.0

## Uses

### Direct Use

Generating SillyTavern-compatible character cards on demand from a natural-language request. The intended workflow is "describe a character, get a card," with the card output piped through a structural validator before import.

### Out-of-Scope Use

This is a single-turn card *generator*, not a roleplay or chat model: the assistant turn is a static card definition, not a conversation. It is not intended for multi-turn roleplay, as a general-purpose assistant, or for factual question answering.

## How to Get Started

The model was trained **without a system prompt**, so the cleanest usage is user-only. Use the chat template and sampling settings below.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer  # transformers >= 5.0

model_id = "aimeri/spoomplesmaxx-cardmaker-v1"
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, dtype=torch.bfloat16, device_map="auto")

# User-only conversation: the model was trained without a system message.
messages = [
    {"role": "user", "content": "Create a character card for a grumpy lighthouse keeper."},
]
inputs = tok.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt", return_dict=True
).to(model.device)

out = model.generate(
    **inputs,
    max_new_tokens=8192,
    do_sample=True,
    temperature=1.0,
    top_k=64,
    top_p=0.95,
    repetition_penalty=1.1,
)
# Decode only the newly generated tokens, skipping the prompt.
print(tok.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```

Cards that include a `character_book` can be long; if generation cuts off mid-card, raise `max_new_tokens`. The merged 16-bit weights can also be served directly with vLLM (`vllm serve aimeri/spoomplesmaxx-cardmaker-v1`), again with no system message.

## Training Details

### Procedure

LoRA fine-tune with [Unsloth](https://github.com/unslothai/unsloth) + TRL `SFTTrainer`, using the official Granite 4.1 chat template. Loss was computed on the assistant (card) completion only via `train_on_responses_only`.

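Computing loss on the completion only means the prompt tokens contribute nothing to the cross-entropy. The sketch below illustrates that idea in plain Python; it is not Unsloth's implementation. Labels start as a copy of the input IDs, and every position before the assistant completion is set to `-100`, the index that PyTorch's cross-entropy loss ignores by default. The function name `mask_prompt_labels`, the explicit `response_start` argument, and the token IDs are hypothetical; in a real pipeline the boundary is derived from the chat template's role markers.

```python
IGNORE_INDEX = -100  # default ignore_index of torch.nn.CrossEntropyLoss


def mask_prompt_labels(input_ids: list[int], response_start: int) -> list[int]:
    """Copy input_ids into labels, masking every token before response_start."""
    if not 0 <= response_start <= len(input_ids):
        raise ValueError("response_start must lie within the sequence")
    # Prompt positions are ignored by the loss; completion positions keep their IDs.
    return [IGNORE_INDEX] * response_start + list(input_ids[response_start:])


# Hypothetical token IDs: four prompt tokens followed by three card tokens.
example_ids = [11, 12, 13, 14, 21, 22, 23]
labels = mask_prompt_labels(example_ids, response_start=4)
print(labels)  # [-100, -100, -100, -100, 21, 22, 23]
```

Only the three card tokens keep real labels, so gradients come from the card text alone and the model is never trained to reproduce the user's request.
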
**LoRA configuration**

| Setting | Value |
|---|---|
| Rank `r` | 16 |
| `lora_alpha` | 22 |
| `lora_dropout` | 0 |
| Target modules | all-linear |
| Rank-stabilized LoRA | enabled |
| Bias | none |

**Training hyperparameters**

| Setting | Value |
|---|---|
| Epochs | 2 (848 optimizer steps) |
| Per-device batch size | 1 |
| Gradient accumulation | 8 (effective batch size 8) |
| Max sequence length | 8192 |
| Optimizer | adamw_8bit (β₁ 0.9, β₂ 0.999, ε 1e-8) |
| Learning rate | 1e-4, cosine schedule |
| Warmup steps | 25 |
| Weight decay | 0.001 |
| Max grad norm | 1.0 |
| Precision | bf16 |
| Seed | 1985 |
| Frameworks | Unsloth 2026.6.1, Transformers 5.5.0, TRL, PEFT, PyTorch 2.10 |

### Results

Evaluation loss on the 5% held-out split fell from 2.234 at the base checkpoint to 1.641 at the final step over the two epochs. Most of the gain came in the first ~100 steps, with only slow improvement afterward:

| Checkpoint | Eval loss |
|---|---|
| Base (step 0, `eval_on_start`) | 2.234 |
| Step 100 | 1.704 |
| Step 400 | 1.656 |
| Final (step 848) | **1.641** |

Final mean training loss was ~1.57. Total wall-clock training time was ~4.6 hours.

## Evaluation

Quality was judged primarily **behaviorally** rather than by a single metric, because eval loss is a weak proxy for card quality on a held-out set this small (~178 rows). A fixed prompt battery probed the behaviors that matter for this task:

- **Structure & completeness:** clean, parseable cards with all expected fields on easy archetypes.
- **Constraint adherence:** exact name / age / occupation, and a character's voice actually showing up in `first_mes` and `mes_example` rather than drifting generic.
- **Sparse invention:** building a full, internally consistent card from a near-empty prompt.
- **First-message craft:** second-person address to `{{user}}`, scene-setting, action formatting, in-voice dialogue, and a natural hand-off.
- **Register:** antagonist/villain cards produced in-character, with no disclaimers, moralizing, or assistant-voice leakage. This is the main reason the model was trained from a base rather than an instruct checkpoint.

## Bias, Risks, and Limitations

- **Mature content.** This model was trained on a mix of safe-for-work (SFW) and not-safe-for-work (NSFW) cards, and it may generate objectionable content. Please use discretion when generating new cards.
- **Structural validity is not guaranteed.** Output is generated text, not schema-validated card JSON. Run it through a parser/validator before importing into SillyTavern.
- **Card conventions.** Output uses `{{user}}` / `{{char}}` macros and assumes a SillyTavern runtime.
- **Single-turn only.** This generates a card, not a conversation; it is not itself a roleplay partner.
- **Inherited bias.** The model carries the biases of both the base model and the curated card sources, including their genre, aesthetic, and demographic skew. "High quality" reflects a subjective curation judgment.

## Citation

If you use this model, please reference this repository and the [base model](https://huggingface.co/ibm-granite/granite-4.1-8b-base).
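
The limitations above recommend running output through a parser/validator before import. The following is a minimal sketch of such a check, not an official SillyTavern validator. It assumes the generated text is a single JSON object whose fields sit either at the top level or under a `data` key, and it treats the fields this card names (plus `name`) as required. Both the function name `validate_card` and the `REQUIRED_FIELDS` set are hypothetical; adjust them to the card spec you target.

```python
import json

# Assumed required fields, taken from the fields this model card names.
REQUIRED_FIELDS = ("name", "description", "personality", "scenario", "first_mes", "mes_example")


def validate_card(text: str) -> list[str]:
    """Return a list of problems found in generated card text (empty means it passed)."""
    try:
        card = json.loads(text)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg} at position {exc.pos}"]
    if not isinstance(card, dict):
        return ["top-level JSON value is not an object"]

    # Assumption: nested cards keep their fields under "data"; accept a flat layout too.
    data = card.get("data", card)
    if not isinstance(data, dict):
        return ['"data" is not an object']

    problems = []
    for field in REQUIRED_FIELDS:
        value = data.get(field)
        if not isinstance(value, str) or not value.strip():
            problems.append(f"missing or empty field: {field}")

    # The lorebook is optional, but if present it should be an object.
    book = data.get("character_book")
    if book is not None and not isinstance(book, dict):
        problems.append("character_book is present but is not an object")
    return problems


if __name__ == "__main__":
    # Hypothetical truncated output, as might happen if max_new_tokens is too low.
    sample = '{"data": {"name": "Orrin", "description": "A grumpy lighthouse keeper."}}'
    for problem in validate_card(sample):
        print(problem)
```

Returning a list of problems instead of raising on the first one makes it easier to decide whether to regenerate the card or patch it by hand.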