Update README.md
README.md
CHANGED
|
@@ -1,21 +1,147 @@
|
|
| 1 |
-
---
|
| 2 |
-
base_model: ibm-granite/granite-4.1-8b-base
|
| 3 |
-
tags:
|
| 4 |
-
- text-generation-inference
|
| 5 |
-
- transformers
|
| 6 |
-
- unsloth
|
| 7 |
-
- granite
|
| 8 |
license: apache-2.0
|
| 9 |
language:
|
| 10 |
- en
|
| 11 |
---
|
| 12 |
|
| 13 |
-
#
|
| 14 |
|
| 15 |
-
- **
|
| 16 |
-
- **
|
| 17 |
-
- **
|
| 18 |
|
| 19 |
-
|
| 20 |
|
| 21 |
-
|
| 1 |
+
---
|
| 2 |
license: apache-2.0
|
| 3 |
+
base_model: ibm-granite/granite-4.1-8b-base
|
| 4 |
+
base_model_relation: finetune
|
| 5 |
+
datasets:
|
| 6 |
+
- aimeri/st-characters-alpaca
|
| 7 |
language:
|
| 8 |
- en
|
| 9 |
+
library_name: transformers
|
| 10 |
+
pipeline_tag: text-generation
|
| 11 |
+
tags:
|
| 12 |
+
- sillytavern
|
| 13 |
+
- character-cards
|
| 14 |
+
- character-card-generation
|
| 15 |
+
- roleplay
|
| 16 |
+
- granite
|
| 17 |
+
- granite-4.1
|
| 18 |
+
- unsloth
|
| 19 |
+
- trl
|
| 20 |
+
- sft
|
| 21 |
+
- lora
|
| 22 |
+
- conversational
|
| 23 |
---
|
| 24 |
|
| 25 |
+
# SpoomplesMaxx Card Maker V1
|
| 26 |
+
|
| 27 |
+
A fine-tune of [`ibm-granite/granite-4.1-8b-base`](https://huggingface.co/ibm-granite/granite-4.1-8b-base) that turns a short, open-ended prompt into a complete [SillyTavern](https://github.com/SillyTavern/SillyTavern) character card. Give it a concept — an archetype, a name and a few constraints, or just a one-liner — and it generates a full V2/V3-style card (description, personality, scenario, first message, example messages, and sometimes a lorebook).
|
| 28 |
+
|
| 29 |
+
## Model Details
|
| 30 |
+
|
| 31 |
+
- **Developed by:** [aimeri](https://huggingface.co/aimeri)
|
| 32 |
+
- **Base model:** [`ibm-granite/granite-4.1-8b-base`](https://huggingface.co/ibm-granite/granite-4.1-8b-base) (Apache 2.0)
|
| 33 |
+
- **Language:** English
|
| 34 |
+
- **Checkpoint type:** fine-tuned from a base (not instruct) checkpoint, so the output is the card itself, with no assistant-style preamble, disclaimers, or refusals.
|
| 35 |
+
- **License:** Apache 2.0
|
| 36 |
+
|
| 37 |
+
## Uses
|
| 38 |
+
|
| 39 |
+
### Direct Use
|
| 40 |
+
|
| 41 |
+
Generating SillyTavern-compatible character cards on demand from a natural-language request. The intended workflow is "describe a character, get a card," with the card output piped through a structural validator before import.
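
This README does not specify the structural validator, so the following is only a minimal sketch. It assumes the model emits a JSON object, either in the Character Card V2 layout (fields nested under `data`) or as a flat object with the same field names. The function name `validate_card` and the required-field list are illustrative assumptions, not part of this model's tooling.

```python
import json

# Assumed required text fields (hypothetical list based on the V2 card layout).
REQUIRED_FIELDS = (
    "name",
    "description",
    "personality",
    "scenario",
    "first_mes",
    "mes_example",
)


def validate_card(raw_text: str) -> list[str]:
    """Return a list of problems found in a generated card; empty means it passed."""
    try:
        card = json.loads(raw_text)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(card, dict):
        return ["top-level value is not a JSON object"]

    # V2-style cards nest their fields under "data"; fall back to a flat layout.
    data = card.get("data", card)
    if not isinstance(data, dict):
        return ['"data" is not a JSON object']

    problems = []
    for field in REQUIRED_FIELDS:
        if not isinstance(data.get(field), str):
            problems.append(f"missing or non-string field: {field}")
    return problems
```

A card would be imported only when `validate_card(output)` returns an empty list; anything else is a candidate for regeneration or manual repair.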
|
| 42 |
+
|
| 43 |
+
### Out-of-Scope Use
|
| 44 |
+
|
| 45 |
+
This is a single-turn card *generator*, not a roleplay or chat model — the assistant turn is a static card definition, not a conversation. It is not intended for multi-turn roleplay, as a general-purpose assistant, or for factual question answering.
|
| 46 |
+
|
| 47 |
+
## How to Get Started
|
| 48 |
+
|
| 49 |
+
The model was trained **without a system prompt**, so the cleanest usage is user-only. Use the chat template and sampling settings below.
|
| 50 |
+
|
| 51 |
+
```python
|
| 52 |
+
import torch
|
| 53 |
+
from transformers import AutoModelForCausalLM, AutoTokenizer # transformers >= 5.0
|
| 54 |
+
|
| 55 |
+
model_id = "aimeri/spoomplesmaxx-cardmaker-v1"
|
| 56 |
+
tok = AutoTokenizer.from_pretrained(model_id)
|
| 57 |
+
model = AutoModelForCausalLM.from_pretrained(model_id, dtype=torch.bfloat16, device_map="auto")
|
| 58 |
+
|
| 59 |
+
messages = [
|
| 60 |
+
{"role": "user", "content": "Create a character card for a grumpy lighthouse keeper."},
|
| 61 |
+
]
|
| 62 |
+
inputs = tok.apply_chat_template(
|
| 63 |
+
messages, add_generation_prompt=True, return_tensors="pt", return_dict=True
|
| 64 |
+
).to(model.device)
|
| 65 |
+
|
| 66 |
+
out = model.generate(
|
| 67 |
+
**inputs,
|
| 68 |
+
max_new_tokens=8192,
|
| 69 |
+
do_sample=True,
|
| 70 |
+
temperature=1.0,
|
| 71 |
+
top_k=64,
|
| 72 |
+
top_p=0.95,
|
| 73 |
+
repetition_penalty=1.1,
|
| 74 |
+
)
|
| 75 |
+
print(tok.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
|
| 76 |
+
```
|
| 77 |
+
|
| 78 |
+
Cards that include a `character_book` can be long; if generation cuts off mid-card, raise `max_new_tokens`. The merged 16-bit weights can also be served directly with vLLM (`vllm serve aimeri/spoomplesmaxx-cardmaker-v1`), again with no system message.
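
For the vLLM route, a user-only request against the OpenAI-compatible endpoint might look like the sketch below. It assumes a server started with `vllm serve` on vLLM's default port 8000, and that the sampling settings from the Transformers example carry over unchanged. `top_k` and `repetition_penalty` are vLLM extensions to the OpenAI request schema, and the helper names `build_request` and `generate_card` are illustrative.

```python
import json
import urllib.request

MODEL_ID = "aimeri/spoomplesmaxx-cardmaker-v1"


def build_request(prompt: str, max_tokens: int = 8192) -> dict:
    """Build a chat-completions payload with a single user turn and no system message."""
    return {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
        "temperature": 1.0,
        "top_p": 0.95,
        "top_k": 64,                # vLLM-specific sampling parameter
        "repetition_penalty": 1.1,  # vLLM-specific sampling parameter
    }


def generate_card(prompt: str, base_url: str = "http://localhost:8000") -> str:
    """POST the payload to a running vLLM server and return the generated text."""
    request = urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(build_request(prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]
```

Calling `generate_card("Create a character card for a grumpy lighthouse keeper.")` requires the server to be running; `build_request` can be inspected on its own.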
|
| 79 |
+
|
| 80 |
+
## Training Details
|
| 81 |
+
|
| 82 |
+
### Procedure
|
| 83 |
+
|
| 84 |
+
LoRA fine-tune with [Unsloth](https://github.com/unslothai/unsloth) + TRL `SFTTrainer`, using the official Granite 4.1 chat template. Loss was computed on the assistant (card) completion only via `train_on_responses_only`.
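
Response-only training masks the prompt tokens out of the loss. The sketch below illustrates the idea with plain lists. It is not Unsloth's `train_on_responses_only` implementation, and `mask_prompt_labels` and the token IDs are hypothetical. It relies on the PyTorch convention that a label of `-100` is ignored by cross-entropy loss.

```python
IGNORE_INDEX = -100  # default ignore_index of torch.nn.CrossEntropyLoss


def mask_prompt_labels(input_ids: list[int], response_start: int) -> list[int]:
    """Copy input_ids into labels, ignoring every token before the assistant response."""
    if not 0 <= response_start <= len(input_ids):
        raise ValueError("response_start must lie within the sequence")
    return [IGNORE_INDEX] * response_start + input_ids[response_start:]


# Hypothetical token IDs: three prompt tokens followed by two card tokens.
labels = mask_prompt_labels([11, 12, 13, 21, 22], response_start=3)
print(labels)  # [-100, -100, -100, 21, 22]
```

Only the two card tokens contribute to the loss here, which is why the reported losses describe the card text alone rather than the user prompt.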
|
| 85 |
+
|
| 86 |
+
**LoRA configuration**
|
| 87 |
+
|
| 88 |
+
| Setting | Value |
|
| 89 |
+
|---|---|
|
| 90 |
+
| Rank `r` | 16 |
|
| 91 |
+
| `lora_alpha` | 22 |
|
| 92 |
+
| `lora_dropout` | 0 |
|
| 93 |
+
| Target modules | all-linear |
|
| 94 |
+
| Rank-stabilized LoRA | enabled |
|
| 95 |
+
| Bias | none |
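
With rank-stabilized LoRA, the adapter update is scaled by `lora_alpha / sqrt(r)` rather than the standard `lora_alpha / r`. A quick check of what that means for the values in the table (a worked calculation, not code from the training run):

```python
import math

r, lora_alpha = 16, 22

standard_scale = lora_alpha / r           # classic LoRA scaling
rslora_scale = lora_alpha / math.sqrt(r)  # rank-stabilized LoRA scaling

print(standard_scale)  # 1.375
print(rslora_scale)    # 5.5
```

The effective scale is therefore four times larger than the same `lora_alpha` would give under classic LoRA at this rank.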
|
| 96 |
+
|
| 97 |
+
**Training hyperparameters**
|
| 98 |
+
|
| 99 |
+
| Setting | Value |
|
| 100 |
+
|---|---|
|
| 101 |
+
| Epochs | 2 (848 optimizer steps) |
|
| 102 |
+
| Per-device batch size | 1 |
|
| 103 |
+
| Gradient accumulation | 8 (effective batch size 8) |
|
| 104 |
+
| Max sequence length | 8192 |
|
| 105 |
+
| Optimizer | adamw_8bit (β₁ 0.9, β₂ 0.999, ε 1e-8) |
|
| 106 |
+
| Learning rate | 1e-4, cosine schedule |
|
| 107 |
+
| Warmup steps | 25 |
|
| 108 |
+
| Weight decay | 0.001 |
|
| 109 |
+
| Max grad norm | 1.0 |
|
| 110 |
+
| Precision | bf16 |
|
| 111 |
+
| Seed | 1985 |
|
| 112 |
+
| Frameworks | Unsloth 2026.6.1, Transformers 5.5.0, TRL, PEFT, PyTorch 2.10 |
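
The learning-rate schedule implied by the table can be sketched as linear warmup followed by cosine decay. This assumes the usual warmup-then-cosine shape decaying to zero at step 848; the exact curve depends on the scheduler the trainer used, and `lr_at_step` is an illustrative name.

```python
import math

PEAK_LR = 1e-4
WARMUP_STEPS = 25
TOTAL_STEPS = 848


def lr_at_step(step: int) -> float:
    """Learning rate at an optimizer step: linear warmup, then cosine decay to zero."""
    if step < WARMUP_STEPS:
        return PEAK_LR * step / WARMUP_STEPS
    progress = (step - WARMUP_STEPS) / (TOTAL_STEPS - WARMUP_STEPS)
    return PEAK_LR * 0.5 * (1.0 + math.cos(math.pi * progress))


for step in (0, 25, 436, 848):
    print(step, lr_at_step(step))
```

The step count also gives a rough dataset size: 848 steps over 2 epochs is 424 steps per epoch, and at an effective batch size of 8 that is about 3,390 training rows, assuming no sequence packing.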
|
| 113 |
+
|
| 114 |
+
### Results
|
| 115 |
+
|
| 116 |
+
Evaluation loss on the 5% held-out split fell from 2.234 at the base checkpoint to 1.641 at the final step. Most of the improvement came in the first ~100 steps, with only slow gains afterward:
|
| 117 |
+
|
| 118 |
+
| Checkpoint | Eval loss |
|
| 119 |
+
|---|---|
|
| 120 |
+
| Base (step 0, `eval_on_start`) | 2.234 |
|
| 121 |
+
| Step 100 | 1.704 |
|
| 122 |
+
| Step 400 | 1.656 |
|
| 123 |
+
| Final (step 848) | **1.641** |
|
| 124 |
+
|
| 125 |
+
Final mean training loss was ~1.57. Total wall-clock training time was ~4.6 hours.
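
Eval loss is easier to compare as perplexity, `exp(loss)`. The sketch below assumes the reported loss is the mean per-token cross-entropy in nats and uses the values from the table above.

```python
import math

eval_loss = {
    "base (step 0)": 2.234,
    "step 100": 1.704,
    "step 400": 1.656,
    "final (step 848)": 1.641,
}

perplexity = {}
for checkpoint, loss in eval_loss.items():
    # Perplexity is the exponential of the mean per-token cross-entropy.
    perplexity[checkpoint] = math.exp(loss)
    print(f"{checkpoint}: perplexity ~ {perplexity[checkpoint]:.2f}")
```

On these numbers, per-token perplexity on card text drops from roughly 9.3 at the base checkpoint to roughly 5.2 at the final step.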
|
| 126 |
+
|
| 127 |
+
## Evaluation
|
| 128 |
+
|
| 129 |
+
Quality was judged primarily **behaviorally** rather than by a single metric — eval loss is a weak proxy for card quality on a held-out set this small (~178 rows). A fixed prompt battery probed the behaviors that matter for this task:
|
| 130 |
+
|
| 131 |
+
- **Structure & completeness** — clean, parseable cards with all expected fields on easy archetypes.
|
| 132 |
+
- **Constraint adherence** — exact name / age / occupation, and a character's voice actually showing up in `first_mes` and `mes_example` rather than drifting generic.
|
| 133 |
+
- **Sparse invention** — building a full, internally consistent card from a near-empty prompt.
|
| 134 |
+
- **First-message craft** — second-person address to `{{user}}`, scene-setting, action formatting, in-voice dialogue, and a natural hand-off.
|
| 135 |
+
- **Register** — antagonist/villain cards produced in-character, with no disclaimers, moralizing, or assistant-voice leakage. This is the main reason the model was trained from a base rather than an instruct checkpoint.
|
| 136 |
+
|
| 137 |
+
## Bias, Risks, and Limitations
|
| 138 |
|
| 139 |
+
- **Mature content.** The training data mixes safe-for-work (SFW) and not-safe-for-work (NSFW) cards, so the model may generate objectionable content. Use discretion when generating new cards.
|
| 140 |
+
- **Structural validity is not guaranteed.** Output is generated text, not schema-validated card JSON. Run it through a parser/validator before importing into SillyTavern.
|
| 141 |
+
- **Card conventions.** Output uses `{{user}}` / `{{char}}` macros and assumes a SillyTavern runtime.
|
| 142 |
+
- **Single-turn only.** This generates a card, not a conversation; it is not itself a roleplay partner.
|
| 143 |
+
- **Inherited bias.** The model carries the biases of both the base model and the curated card sources, including their genre, aesthetic, and demographic skew. The curation itself reflects a subjective judgment of what counts as a high-quality card.
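
Outside SillyTavern, the `{{user}}` and `{{char}}` macros stay in the text as literal placeholders. A minimal sketch of substituting them by hand to preview a card follows; `expand_macros` is a hypothetical helper, and SillyTavern's own macro engine handles more macros than these two.

```python
def expand_macros(text: str, user: str, char: str) -> str:
    """Replace the two macros named in this README with concrete names."""
    return text.replace("{{user}}", user).replace("{{char}}", char)


preview = expand_macros(
    "{{char}} squints at {{user}} through the rain.", user="Sam", char="Silas"
)
print(preview)  # Silas squints at Sam through the rain.
```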
|
| 144 |
|
| 145 |
+
## Citation
|
| 146 |
|
| 147 |
+
If you use this model, please reference this repository and the [base model](https://huggingface.co/ibm-granite/granite-4.1-8b-base).
|