---
license: apache-2.0
base_model: ibm-granite/granite-4.1-8b-base
base_model_relation: finetune
datasets:
- aimeri/st-characters-alpaca
language:
- en
library_name: transformers
pipeline_tag: text-generation
tags:
- sillytavern
- character-cards
- character-card-generation
- roleplay
- granite
- granite-4.1
- unsloth
- trl
- sft
- lora
- conversational
---

# SpoomplesMaxx Card Maker V1

A fine-tune of [`ibm-granite/granite-4.1-8b-base`](https://huggingface.co/ibm-granite/granite-4.1-8b-base) that turns a short, open-ended prompt into a complete [SillyTavern](https://github.com/SillyTavern/SillyTavern) character card. Give it a concept — an archetype, a name and a few constraints, or just a one-liner — and it generates a full V2/V3-style card (description, personality, scenario, first message, example messages, and sometimes a lorebook).

## Model Details

- **Developed by:** [aimeri](https://huggingface.co/aimeri)
- **Base model:** [`ibm-granite/granite-4.1-8b-base`](https://huggingface.co/ibm-granite/granite-4.1-8b-base) (Apache 2.0)
- **Language:** English
- **Finetuned from a base (not instruct) checkpoint** so output is the card itself, with no assistant-style preamble, disclaimers, or refusals.
- **License:** Apache 2.0

## Uses

### Direct Use

Generating SillyTavern-compatible character cards on demand from a natural-language request. The intended workflow is "describe a character, get a card," with the card output piped through a structural validator before import.
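
As an illustration of the validation step, here is a minimal sketch of a structural check. It assumes the model emits a single JSON object, either flat or with its fields nested under a `data` key as in the V2 layout; the `REQUIRED_FIELDS` list and the `validate_card` helper are hypothetical names based on the fields this card mentions, not a published schema.

```python
import json

# Assumed field list, based on the fields named in this model card.
# Adjust it to whatever your SillyTavern version expects.
REQUIRED_FIELDS = (
    "name",
    "description",
    "personality",
    "scenario",
    "first_mes",
    "mes_example",
)


def validate_card(raw_text):
    """Return (fields, problems) for one generated card.

    Assumes a single JSON object, flat or nested under "data".
    """
    try:
        card = json.loads(raw_text)
    except json.JSONDecodeError as exc:
        return None, [f"not valid JSON: {exc}"]
    if not isinstance(card, dict):
        return None, ["top-level value is not a JSON object"]

    # V2-style cards nest the fields under "data"; fall back to the top level.
    fields = card.get("data", card)
    if not isinstance(fields, dict):
        return None, ['"data" is not a JSON object']

    problems = []
    for key in REQUIRED_FIELDS:
        value = fields.get(key)
        if not isinstance(value, str) or not value.strip():
            problems.append(f"missing or empty field: {key}")
    return fields, problems
```

An empty `problems` list means the card passed these basic checks; it does not guarantee that SillyTavern will accept the card.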

### Out-of-Scope Use

This is a single-turn card *generator*, not a roleplay or chat model — the assistant turn is a static card definition, not a conversation. It is not intended for multi-turn roleplay, as a general-purpose assistant, or for factual question answering.

## How to Get Started

The model was trained **without a system prompt**, so the cleanest usage is user-only. Use the chat template and sampling settings below.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer  # transformers >= 5.0

model_id = "aimeri/spoomplesmaxx-cardmaker-v1"
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, dtype=torch.bfloat16, device_map="auto")

messages = [
    {"role": "user", "content": "Create a character card for a grumpy lighthouse keeper."},
]
inputs = tok.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt", return_dict=True
).to(model.device)

out = model.generate(
    **inputs,
    max_new_tokens=8192,
    do_sample=True,
    temperature=1.0,
    top_k=64,
    top_p=0.95,
    repetition_penalty=1.1,
)
print(tok.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```

Cards that include a `character_book` can be long; if generation cuts off mid-card, raise `max_new_tokens`. The merged 16-bit weights also serve directly under vLLM (`vllm serve aimeri/spoomplesmaxx-cardmaker-v1`), again with no system message.
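
For the vLLM route, a request might look like the sketch below. It assumes the server started by `vllm serve` is listening on the default `http://localhost:8000` and exposes the OpenAI-compatible `/v1/chat/completions` route; `top_k` and `repetition_penalty` are vLLM extensions to that API rather than standard OpenAI parameters. The helper names `build_request` and `generate_card` are hypothetical.

```python
import json
import urllib.request

MODEL_ID = "aimeri/spoomplesmaxx-cardmaker-v1"


def build_request(prompt, max_tokens=8192):
    """Build a chat-completions payload mirroring the sampling settings above."""
    return {
        "model": MODEL_ID,
        # User-only conversation: the model was trained without a system prompt.
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
        "temperature": 1.0,
        "top_p": 0.95,
        "top_k": 64,  # vLLM-specific sampling parameter
        "repetition_penalty": 1.1,  # vLLM-specific sampling parameter
    }


def generate_card(prompt, base_url="http://localhost:8000/v1"):
    """POST the payload to an assumed local vLLM server and return the card text."""
    body = json.dumps(build_request(prompt)).encode("utf-8")
    request = urllib.request.Request(
        f"{base_url}/chat/completions",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        reply = json.load(response)
    return reply["choices"][0]["message"]["content"]
```

With the server running, `generate_card("Create a character card for a grumpy lighthouse keeper.")` should return the generated card as a string.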

## Training Details

### Procedure

LoRA fine-tune with [Unsloth](https://github.com/unslothai/unsloth) + TRL `SFTTrainer`, using the official Granite 4.1 chat template. Loss was computed on the assistant (card) completion only via `train_on_responses_only`.
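
To illustrate what completion-only loss means, the sketch below masks prompt tokens so that only the card tokens contribute to the loss. It is a simplified stand-in, not Unsloth's implementation: it assumes the index where the assistant response begins is already known, and `mask_prompt_tokens` is a hypothetical helper.

```python
# PyTorch's cross-entropy loss skips targets equal to -100 by default.
IGNORE_INDEX = -100


def mask_prompt_tokens(input_ids, response_start):
    """Return labels in which every token before response_start is ignored.

    input_ids: token ids for the full prompt + response sequence.
    response_start: index of the first assistant (card) token.
    """
    if not 0 <= response_start <= len(input_ids):
        raise ValueError("response_start must lie within the sequence")
    prompt_labels = [IGNORE_INDEX] * response_start
    response_labels = list(input_ids[response_start:])
    return prompt_labels + response_labels


# Toy example: three prompt tokens followed by two response tokens.
labels = mask_prompt_tokens([11, 12, 13, 21, 22], response_start=3)
print(labels)  # [-100, -100, -100, 21, 22]
```

The user request still conditions the prediction through attention; it just produces no gradient signal of its own.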

**LoRA configuration**

| Setting | Value |
|---|---|
| Rank `r` | 16 |
| `lora_alpha` | 22 |
| `lora_dropout` | 0 |
| Target modules | all-linear |
| Rank-stabilized LoRA | enabled |
| Bias | none |
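
Because rank-stabilized LoRA is enabled, the adapter update is scaled by `lora_alpha / sqrt(r)` instead of the classic `lora_alpha / r`. The short sketch below works through that arithmetic for the values in the table; `lora_scaling` is a hypothetical helper, not a library function.

```python
import math


def lora_scaling(lora_alpha, r, rank_stabilized):
    """Scaling factor applied to the low-rank update."""
    if rank_stabilized:
        return lora_alpha / math.sqrt(r)  # rank-stabilized LoRA
    return lora_alpha / r  # classic LoRA


print(lora_scaling(22, 16, rank_stabilized=True))   # 5.5
print(lora_scaling(22, 16, rank_stabilized=False))  # 1.375
```

With these settings the effective scaling is 5.5, four times the value classic LoRA would give for the same `lora_alpha` and `r`.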

**Training hyperparameters**

| Setting | Value |
|---|---|
| Epochs | 2 (848 optimizer steps) |
| Per-device batch size | 1 |
| Gradient accumulation | 8 (effective batch size 8) |
| Max sequence length | 8192 |
| Optimizer | adamw_8bit (β₁ 0.9, β₂ 0.999, ε 1e-8) |
| Learning rate | 1e-4, cosine schedule |
| Warmup steps | 25 |
| Weight decay | 0.001 |
| Max grad norm | 1.0 |
| Precision | bf16 |
| Seed | 1985 |
| Frameworks | Unsloth 2026.6.1, Transformers 5.5.0, TRL, PEFT, PyTorch 2.10 |
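
As a rough consistency check on these numbers, the step count can be turned back into an approximate dataset size. The sketch assumes a single device, that every optimizer step consumed a full effective batch (the last step of an epoch may be smaller), and a 5% held-out fraction; `estimate_dataset_sizes` is a hypothetical helper.

```python
def estimate_dataset_sizes(total_steps, epochs, effective_batch, eval_fraction):
    """Estimate rows per split from the optimizer step count.

    Returns (steps_per_epoch, max_train_rows, implied_eval_rows).
    """
    steps_per_epoch = total_steps // epochs
    # Upper bound: assumes every step saw a full effective batch.
    max_train_rows = steps_per_epoch * effective_batch
    # If the training split is (1 - eval_fraction) of all rows,
    # the held-out split is eval_fraction of that same total.
    total_rows = max_train_rows / (1.0 - eval_fraction)
    implied_eval_rows = total_rows * eval_fraction
    return steps_per_epoch, max_train_rows, implied_eval_rows


steps, train_rows, eval_rows = estimate_dataset_sizes(848, 2, 8, 0.05)
print(steps, train_rows, eval_rows)
```

This gives 424 steps per epoch and at most 3,392 training rows, which implies a held-out split of roughly 178 rows.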

### Results

Evaluation loss on the 5% held-out split fell from 2.234 at the base checkpoint to 1.641 at the final step over the two epochs. Most of the gain came in the first ~100 steps, with only slow improvement afterward:

| Checkpoint | Eval loss |
|---|---|
| Base (step 0, `eval_on_start`) | 2.234 |
| Step 100 | 1.704 |
| Step 400 | 1.656 |
| Final (step 848) | **1.641** |

Final mean training loss was ~1.57. Total wall-clock training time was ~4.6 hours.
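
The share of the improvement that arrived early can be worked out directly from the table. The sketch below also converts the losses to perplexity, which assumes the reported eval loss is a mean per-token cross-entropy in nats; that is the usual convention but is not stated above.

```python
import math

# Eval losses copied from the results table.
eval_loss = {0: 2.234, 100: 1.704, 400: 1.656, 848: 1.641}

total_gain = eval_loss[0] - eval_loss[848]
early_gain = eval_loss[0] - eval_loss[100]
early_share = early_gain / total_gain

print(f"share of total gain by step 100: {early_share:.1%}")

# Assumption: loss is mean token-level cross-entropy in nats.
for step, loss in eval_loss.items():
    print(f"step {step}: perplexity ~ {math.exp(loss):.2f}")
```

Just under 90% of the total reduction in eval loss had already happened by step 100.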

## Evaluation

Quality was judged primarily **behaviorally** rather than by a single metric — eval loss is a weak proxy for card quality on a held-out set this small (~178 rows). A fixed prompt battery probed the behaviors that matter for this task:

- **Structure & completeness** — clean, parseable cards with all expected fields on easy archetypes.
- **Constraint adherence** — exact name / age / occupation, and a character's voice actually showing up in `first_mes` and `mes_example` rather than drifting generic.
- **Sparse invention** — building a full, internally consistent card from a near-empty prompt.
- **First-message craft** — second-person address to `{{user}}`, scene-setting, action formatting, in-voice dialogue, and a natural hand-off.
- **Register** — antagonist/villain cards produced in-character, with no disclaimers, moralizing, or assistant-voice leakage. This is the main reason the model was trained from a base rather than an instruct checkpoint.
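
Some of these behaviors can be spot-checked mechanically before a human reads the card. The sketch below is only an illustration of that idea, not the prompt battery used for this model: the leak phrases, the `check_card` helper, and the assumption that the card is available as a dict of string fields are all hypothetical.

```python
# Hypothetical phrases that suggest assistant-voice leakage.
ASSISTANT_LEAKS = ("as an ai", "language model", "i can't assist with")


def check_card(fields, expected_name):
    """Run a few cheap behavioral checks on a parsed card.

    fields: dict with at least "name" and "first_mes" string entries.
    expected_name: the exact name requested in the prompt.
    """
    first_mes = fields.get("first_mes", "")
    all_text = " ".join(str(v) for v in fields.values()).lower()
    return {
        # Constraint adherence: the requested name is used exactly.
        "name_matches": fields.get("name") == expected_name,
        # First-message craft: the opening addresses the {{user}} macro.
        "addresses_user": "{{user}}" in first_mes,
        # Register: no assistant-style phrasing anywhere in the card.
        "no_assistant_voice": not any(p in all_text for p in ASSISTANT_LEAKS),
    }


example = {
    "name": "Tobias Wren",
    "first_mes": "*The keeper squints at {{user}}.* \"Storm's coming. Get inside.\"",
}
print(check_card(example, expected_name="Tobias Wren"))
```

Checks like these catch only surface failures; whether the voice is convincing still needs a human judgment.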

## Bias, Risks, and Limitations

- **Mature content.** This model was trained on a mix of safe-for-work (SFW) and not-safe-for-work (NSFW) cards, so it may generate objectionable content. Use discretion when generating new cards.
- **Structural validity is not guaranteed.** Output is generated text, not schema-validated card JSON. Run it through a parser/validator before importing into SillyTavern.
- **Card conventions.** Output uses `{{user}}` / `{{char}}` macros and assumes a SillyTavern runtime.
- **Single-turn only.** This generates a card, not a conversation; it is not itself a roleplay partner.
- **Inherited bias.** The model carries the biases of both the base model and the curated card sources, including their genre, aesthetic, and demographic skew. "High quality" reflects a subjective curation judgment.

## Citation

If you use this model, please reference this repository and the [base model](https://huggingface.co/ibm-granite/granite-4.1-8b-base).