aimeri committed on
Commit 9daed53 · verified · 1 Parent(s): 42c67a0

Update README.md

Files changed (1)
  1. README.md +139 -13
README.md CHANGED
@@ -1,21 +1,147 @@
- ---
- base_model: ibm-granite/granite-4.1-8b-base
- tags:
- - text-generation-inference
- - transformers
- - unsloth
- - granite
  license: apache-2.0
  language:
  - en
  ---

- # Uploaded finetuned model

- - **Developed by:** aimeri
- - **License:** apache-2.0
- - **Finetuned from model :** ibm-granite/granite-4.1-8b-base

- This granite model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Huggingface's TRL library.

- [<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)
 
+ ---
  license: apache-2.0
+ base_model: ibm-granite/granite-4.1-8b-base
+ base_model_relation: finetune
+ datasets:
+ - aimeri/st-characters-alpaca
  language:
  - en
+ library_name: transformers
+ pipeline_tag: text-generation
+ tags:
+ - sillytavern
+ - character-cards
+ - character-card-generation
+ - roleplay
+ - granite
+ - granite-4.1
+ - unsloth
+ - trl
+ - sft
+ - lora
+ - conversational
  ---

+ # SpoomplesMaxx Card Maker V1
+
+ A fine-tune of [`ibm-granite/granite-4.1-8b-base`](https://huggingface.co/ibm-granite/granite-4.1-8b-base) that turns a short, open-ended prompt into a complete [SillyTavern](https://github.com/SillyTavern/SillyTavern) character card. Give it a concept (an archetype, a name and a few constraints, or just a one-liner) and it generates a full V2/V3-style card: description, personality, scenario, first message, example messages, and sometimes a lorebook.
+
+ ## Model Details
+
+ - **Developed by:** [aimeri](https://huggingface.co/aimeri)
+ - **Base model:** [`ibm-granite/granite-4.1-8b-base`](https://huggingface.co/ibm-granite/granite-4.1-8b-base) (Apache 2.0)
+ - **Language:** English
+ - **Fine-tuned from a base (not instruct) checkpoint**, so the output is the card itself, with no assistant-style preamble, disclaimers, or refusals.
+ - **License:** Apache 2.0
+
+ ## Uses
+
+ ### Direct Use
+
+ Generating SillyTavern-compatible character cards on demand from a natural-language request. The intended workflow is "describe a character, get a card," with the card output piped through a structural validator before import.
+
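The structural validator mentioned in the workflow above could look something like the following minimal sketch. It assumes the generated card is a JSON object whose V2-style fields sit either at the top level or under a `data` key. The helper name `validate_card` and the required-field list are illustrative assumptions, not something shipped with this repository.

```python
import json

# Assumed field names, based on the fields this model card mentions; adjust to your schema.
REQUIRED_FIELDS = ("name", "description", "personality", "scenario", "first_mes", "mes_example")


def validate_card(raw_text: str) -> list[str]:
    """Return a list of problems found in a generated card (empty list means it looks OK)."""
    try:
        card = json.loads(raw_text)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg} at position {exc.pos}"]
    if not isinstance(card, dict):
        return ["top-level JSON value is not an object"]

    # Assumption: fields may be nested under "data" or placed at the top level.
    data = card.get("data", card)
    if not isinstance(data, dict):
        return ['"data" is not an object']

    problems = []
    for field in REQUIRED_FIELDS:
        value = data.get(field)
        if not isinstance(value, str) or not value.strip():
            problems.append(f"missing or empty field: {field}")
    return problems
```

An empty return value means the card passed these basic checks. It does not guarantee that SillyTavern will accept the card, so a real import pipeline would likely need stricter schema checks.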
+ ### Out-of-Scope Use
+
+ This is a single-turn card *generator*, not a roleplay or chat model: the assistant turn is a static card definition, not a conversation. It is not intended for multi-turn roleplay, as a general-purpose assistant, or for factual question answering.
+
+ ## How to Get Started
+
+ The model was trained **without a system prompt**, so the cleanest usage is user-only. Use the chat template and sampling settings below.
+
+ ```python
+ import torch
+ from transformers import AutoModelForCausalLM, AutoTokenizer  # transformers >= 5.0
+
+ model_id = "aimeri/spoomplesmaxx-cardmaker-v1"
+ tok = AutoTokenizer.from_pretrained(model_id)
+ model = AutoModelForCausalLM.from_pretrained(model_id, dtype=torch.bfloat16, device_map="auto")
+
+ # User-only conversation: the model was trained without a system prompt.
+ messages = [
+     {"role": "user", "content": "Create a character card for a grumpy lighthouse keeper."},
+ ]
+ inputs = tok.apply_chat_template(
+     messages, add_generation_prompt=True, return_tensors="pt", return_dict=True
+ ).to(model.device)
+
+ out = model.generate(
+     **inputs,
+     max_new_tokens=8192,
+     do_sample=True,
+     temperature=1.0,
+     top_k=64,
+     top_p=0.95,
+     repetition_penalty=1.1,
+ )
+ # Decode only the newly generated tokens, skipping the prompt.
+ print(tok.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
+ ```
+
+ Cards that include a `character_book` can be long; if generation cuts off mid-card, raise `max_new_tokens`. The merged 16-bit weights also serve directly under vLLM (`vllm serve aimeri/spoomplesmaxx-cardmaker-v1`), again with no system message.
+
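For the vLLM route, a request might be sketched as follows. This assumes `vllm serve` is running locally on its default port (8000) and exposing the OpenAI-compatible `/v1/chat/completions` endpoint. It also assumes the server accepts `top_k` and `repetition_penalty` as extra sampling fields beyond the OpenAI schema, which is worth checking against your vLLM version. The function names are illustrative.

```python
import json
import urllib.request

MODEL_ID = "aimeri/spoomplesmaxx-cardmaker-v1"


def build_payload(prompt: str, max_tokens: int = 8192) -> dict:
    """Build a user-only chat request using the sampling settings from this card."""
    return {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": prompt}],  # no system message
        "max_tokens": max_tokens,
        "temperature": 1.0,
        "top_p": 0.95,
        "top_k": 64,                # assumed vLLM-specific extra parameter
        "repetition_penalty": 1.1,  # assumed vLLM-specific extra parameter
    }


def request_card(prompt: str, base_url: str = "http://localhost:8000") -> str:
    """POST to the chat completions endpoint and return the generated card text."""
    req = urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(build_payload(prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

If the server rejects the request because prompt plus `max_tokens` exceeds its configured context length, pass a smaller `max_tokens` to `build_payload`.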
+ ## Training Details
+
+ ### Procedure
+
+ LoRA fine-tune with [Unsloth](https://github.com/unslothai/unsloth) + TRL `SFTTrainer`, using the official Granite 4.1 chat template. Loss was computed on the assistant (card) completion only via `train_on_responses_only`.
+
+ **LoRA configuration**
+
+ | Setting | Value |
+ |---|---|
+ | Rank `r` | 16 |
+ | `lora_alpha` | 22 |
+ | `lora_dropout` | 0 |
+ | Target modules | all-linear |
+ | Rank-stabilized LoRA | enabled |
+ | Bias | none |
+
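To make the rank-stabilized setting concrete: standard LoRA scales the adapter update by `lora_alpha / r`, whereas rank-stabilized LoRA uses `lora_alpha / sqrt(r)`. The short sketch below works through that arithmetic for the values in the table. The function name is illustrative and not taken from any library.

```python
import math


def lora_scaling(lora_alpha: float, r: int, rank_stabilized: bool) -> float:
    """Scaling factor applied to the low-rank update B @ A."""
    return lora_alpha / math.sqrt(r) if rank_stabilized else lora_alpha / r


standard = lora_scaling(22, 16, rank_stabilized=False)  # 22 / 16
rslora = lora_scaling(22, 16, rank_stabilized=True)     # 22 / sqrt(16)
print(f"standard LoRA scaling: {standard}")
print(f"rsLoRA scaling:        {rslora}")
```

With `r = 16`, the rank-stabilized factor is 5.5 rather than 1.375, so the same `lora_alpha` gives an effective update scale four times larger than it would under standard LoRA.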
+ **Training hyperparameters**
+
+ | Setting | Value |
+ |---|---|
+ | Epochs | 2 (848 optimizer steps) |
+ | Per-device batch size | 1 |
+ | Gradient accumulation | 8 (effective batch size 8) |
+ | Max sequence length | 8192 |
+ | Optimizer | adamw_8bit (β₁ 0.9, β₂ 0.999, ε 1e-8) |
+ | Learning rate | 1e-4, cosine schedule |
+ | Warmup steps | 25 |
+ | Weight decay | 0.001 |
+ | Max grad norm | 1.0 |
+ | Precision | bf16 |
+ | Seed | 1985 |
+ | Frameworks | Unsloth 2026.6.1, Transformers 5.5.0, TRL, PEFT, PyTorch 2.10 |
+
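The step count and batch settings in the table imply a rough training-set size. The following is a back-of-the-envelope sketch only. It assumes a single device and that the final partial accumulation batch of each epoch still counts as an optimizer step, and the card states neither.

```python
per_device_batch = 1
grad_accum = 8
epochs = 2
optimizer_steps = 848

effective_batch = per_device_batch * grad_accum  # examples consumed per optimizer step
steps_per_epoch = optimizer_steps // epochs

# If steps_per_epoch == ceil(n / effective_batch), then n lies in this range:
min_examples = (steps_per_epoch - 1) * effective_batch + 1
max_examples = steps_per_epoch * effective_batch
print(effective_batch, steps_per_epoch, (min_examples, max_examples))
```

Under those assumptions the training split holds about 3,390 examples, which is consistent with a ~178-row evaluation split being about 5% of the full dataset.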
+ ### Results
+
+ Evaluation loss on the 5% held-out split fell from 2.234 at the base checkpoint to 1.641 at the final step over the two epochs. Most of the gain came in the first ~100 steps, with slow improvement afterward:
+
+ | Checkpoint | Eval loss |
+ |---|---|
+ | Base (step 0, `eval_on_start`) | 2.234 |
+ | Step 100 | 1.704 |
+ | Step 400 | 1.656 |
+ | Final (step 848) | **1.641** |
+
+ Final mean training loss was ~1.57. Total wall-clock training time was ~4.6 hours.
+
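If the eval loss is a mean per-token cross-entropy in nats (the usual convention for causal LM training, assumed here), it can be read as a perplexity via `exp(loss)`. The sketch below converts the table values and checks how much of the total drop had happened by step 100.

```python
import math

eval_loss = {
    "base (step 0)": 2.234,
    "step 100": 1.704,
    "step 400": 1.656,
    "final (step 848)": 1.641,
}

# Perplexity is exp(mean cross-entropy), assuming the loss is measured in nats.
perplexity = {name: math.exp(loss) for name, loss in eval_loss.items()}
for name, ppl in perplexity.items():
    print(f"{name:>18}: loss {eval_loss[name]:.3f} -> perplexity {ppl:.2f}")

base, final = eval_loss["base (step 0)"], eval_loss["final (step 848)"]
first_100_share = (base - eval_loss["step 100"]) / (base - final)
print(f"share of total loss drop reached by step 100: {first_100_share:.0%}")
```

On these numbers, roughly nine tenths of the total loss reduction is already in place by step 100, which matches the description of a fast initial drop followed by slow improvement.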
+ ## Evaluation
+
+ Quality was judged primarily **behaviorally** rather than by a single metric, because eval loss is a weak proxy for card quality on a held-out set this small (~178 rows). A fixed prompt battery probed the behaviors that matter for this task:
+
+ - **Structure & completeness:** clean, parseable cards with all expected fields on easy archetypes.
+ - **Constraint adherence:** exact name / age / occupation, and a character's voice actually showing up in `first_mes` and `mes_example` rather than drifting generic.
+ - **Sparse invention:** building a full, internally consistent card from a near-empty prompt.
+ - **First-message craft:** second-person address to `{{user}}`, scene-setting, action formatting, in-voice dialogue, and a natural hand-off.
+ - **Register:** antagonist/villain cards produced in-character, with no disclaimers, moralizing, or assistant-voice leakage. This is the main reason the model was trained from a base rather than an instruct checkpoint.
+
+ ## Bias, Risks, and Limitations

+ - **Mature content.** This model was trained on a mix of safe-for-work (SFW) and not-safe-for-work (NSFW) cards, and it may generate objectionable content. Please use discretion when generating new cards.
+ - **Structural validity is not guaranteed.** Output is generated text, not schema-validated card JSON. Run it through a parser/validator before importing into SillyTavern.
+ - **Card conventions.** Output uses `{{user}}` / `{{char}}` macros and assumes a SillyTavern runtime.
+ - **Single-turn only.** This generates a card, not a conversation; it is not itself a roleplay partner.
+ - **Inherited bias.** The model carries the biases of both the base model and the curated card sources, including their genre, aesthetic, and demographic skew. Which cards counted as "high quality" during curation was a subjective judgment.

+ ## Citation

+ If you use this model, please reference this repository and the [base model](https://huggingface.co/ibm-granite/granite-4.1-8b-base).