AI Puppet Theater MiniCPM5 Actor LoRA

LoRA/QLoRA adapter for openbmb/MiniCPM5-1B, fine-tuned to act as an Actor agent for AI Puppet Theater.

The adapter is trained to produce a single Actor JSON object for one puppet-show beat, containing a short, theatrical, speakable line. It is not a general chat model.

Base Model

Base model: openbmb/MiniCPM5-1B
Method: LoRA/QLoRA supervised fine-tuning
Dataset: AI Puppet Theater Actor SFT
Adapter version: v0 Actor adapter used for the hackathon demo and GGUF conversion

Intended Output Schema

{
  "intent": "inspect_prop",
  "line": "This rubber duck squeaks exactly like a guilty witness.",
  "emotion": "investigative",
  "gesture": "leans toward the glowing prop",
  "stage_effect": "prop_table_glow",
  "memory_update": "Noted that the rubber duck behaved like evidence.",
  "tool_request": {
    "tool": "inspect_prop",
    "args": {"prop": "rubber duck"},
    "reason": "The prop may reveal a stage clue."
  }
}
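
As a rough sketch of what a top-level schema check could look like, the snippet below compares a parsed object's keys with the seven fields of the example above. It assumes all seven fields are required, which this card does not state explicitly. ACTOR_FIELDS and check_actor_schema are hypothetical names, not part of the AI Puppet Theater runtime.

```python
# Field names copied from the example object above.
# Assumption: every one of them is required at the top level.
ACTOR_FIELDS = (
    "intent", "line", "emotion", "gesture",
    "stage_effect", "memory_update", "tool_request",
)


def check_actor_schema(obj):
    """Return (required_present, exact_schema) for a parsed Actor object."""
    if not isinstance(obj, dict):
        return False, False
    keys = set(obj)
    required_present = set(ACTOR_FIELDS) <= keys   # nothing missing
    exact_schema = keys == set(ACTOR_FIELDS)       # nothing missing, nothing extra
    return required_present, exact_schema


example = {
    "intent": "inspect_prop",
    "line": "This rubber duck squeaks exactly like a guilty witness.",
    "emotion": "investigative",
    "gesture": "leans toward the glowing prop",
    "stage_effect": "prop_table_glow",
    "memory_update": "Noted that the rubber duck behaved like evidence.",
    "tool_request": None,
}
print(check_actor_schema(example))
```

Returning the two flags separately lets a caller tell an object with missing fields apart from one that only carries extra keys.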

tool_request may be null. The AI Puppet Theater runtime currently defines these tools and argument schemas:

inspect_prop: {"prop": "..."}
consult_stage_oracle: {"question": "..."}
change_lighting: {"mood": "..."}
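
The three schemas above could be enforced with a small strict check before any tool is executed. This is a minimal sketch, not the runtime's actual validator: TOOL_ARG_KEYS and is_valid_tool_request are hypothetical names, and it assumes each tool takes exactly one non-empty string argument and that the reason field is not checked.

```python
# Tool name -> the single argument key it accepts (taken from the list above).
TOOL_ARG_KEYS = {
    "inspect_prop": "prop",
    "consult_stage_oracle": "question",
    "change_lighting": "mood",
}


def is_valid_tool_request(req):
    """Strictly validate a tool_request value; None (JSON null) is allowed."""
    if req is None:
        return True
    if not isinstance(req, dict):
        return False
    tool = req.get("tool")
    if not isinstance(tool, str) or tool not in TOOL_ARG_KEYS:
        return False
    arg_key = TOOL_ARG_KEYS[tool]
    args = req.get("args")
    # Assumption: args must contain exactly the one expected key.
    if not isinstance(args, dict) or set(args) != {arg_key}:
        return False
    value = args[arg_key]
    return isinstance(value, str) and bool(value.strip())


request = {
    "tool": "inspect_prop",
    "args": {"prop": "rubber duck"},
    "reason": "The prop may reveal a stage clue.",
}
print(is_valid_tool_request(request))
```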

Eval Summary

This LoRA adapter is the v0 Actor adapter used for the hackathon demo and downstream GGUF conversion.

On a 40-prompt held-out Actor eval, the LoRA / merged model produced:

Check	Passed	Rate
Extractable JSON	35/40	87.5%
Required fields present	34/40	85.0%
Exact top-level schema	34/40	85.0%
Sanitized Actor JSON usable	34/40	85.0%
Strict `tool_request` valid	34/40	85.0%
Sanitized `tool_request` usable	35/40	87.5%

A later evaluation of the GGUF build, run with the final llama.cpp prompt and runtime settings, showed improved usability. See the GGUF model card and fine-tuning blog for local inference details.

Even so, the runtime should still apply JSON extraction, strict validation, Actor JSON sanitization, tool-argument normalization, and a deterministic fallback.
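
One way the extraction and fallback steps might be wired together is sketched below. It scans the raw model text for the first decodable JSON object and returns a fixed object when none is found. The function names and the contents of FALLBACK_ACTOR are placeholders invented for illustration; the runtime's real fallback is not described in this card.

```python
import json

# Hypothetical deterministic fallback; the values are placeholders only.
FALLBACK_ACTOR = {
    "intent": "stall",
    "line": "The curtain rustles while I gather my thoughts.",
    "emotion": "thoughtful",
    "gesture": "pauses at center stage",
    "stage_effect": "none",
    "memory_update": "",
    "tool_request": None,
}


def extract_first_json_object(text):
    """Return the first JSON object embedded in text, or None if there is none."""
    decoder = json.JSONDecoder()
    start = text.find("{")
    while start != -1:
        try:
            obj, _end = decoder.raw_decode(text, start)
            return obj  # decoding started at "{", so this is a dict
        except json.JSONDecodeError:
            # Not a valid object at this brace; try the next one.
            start = text.find("{", start + 1)
    return None


def actor_or_fallback(text):
    """Extract an Actor object from raw model text, else return a fallback copy."""
    obj = extract_first_json_object(text)
    return obj if obj is not None else dict(FALLBACK_ACTOR)


raw = 'Sure! {"intent": "inspect_prop", "tool_request": null} Hope that helps.'
print(actor_or_fallback(raw))
```

Validation and sanitization would run on the extracted object afterwards; this sketch covers only the first and last steps of that chain.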

Example Prompt Shape

premise: A moon mayor denies stealing the town's last spoon
show_state JSON: {"story_phase":"complication","latest_prop":"silver spoon",...}
actor JSON: {"name":"Mina Moonbutton","goal":"Find the emotional truth...",...}
director_instruction: Use the latest prop as evidence and request inspection if useful.
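
The four-line prompt shape above could be assembled with a helper like the following. This is a sketch under assumptions: build_actor_prompt is a hypothetical name, compact JSON separators are chosen to match the examples in this card, and the dictionaries contain only the keys visible above because the full show_state and actor schemas are elided here.

```python
import json


def build_actor_prompt(premise, show_state, actor, director_instruction):
    """Render the four prompt fields as one newline-separated user message."""
    compact = {"separators": (",", ":")}  # assumption: compact JSON, as in the card's examples
    return "\n".join([
        f"premise: {premise}",
        "show_state JSON: " + json.dumps(show_state, **compact),
        "actor JSON: " + json.dumps(actor, **compact),
        f"director_instruction: {director_instruction}",
    ])


prompt_text = build_actor_prompt(
    premise="A moon mayor denies stealing the town's last spoon",
    show_state={"story_phase": "complication", "latest_prop": "silver spoon"},
    actor={"name": "Mina Moonbutton"},
    director_instruction="Use the latest prop as evidence and request inspection if useful.",
)
print(prompt_text)
```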

Expected assistant response:

{"intent":"inspect_prop","line":"This silver spoon squeaks exactly like a guilty witness.","emotion":"investigative","gesture":"leans toward the glowing prop","stage_effect":"prop_table_glow","memory_update":"Noted that the silver spoon behaved like evidence.","tool_request":{"tool":"inspect_prop","args":{"prop":"silver spoon"},"reason":"The prop may reveal one concrete stage clue."}}

Usage Example

from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_model = "openbmb/MiniCPM5-1B"
adapter_id = "build-small-hackathon/AI-Puppet-Theater-MiniCPM5-Actor-LoRA"

tokenizer = AutoTokenizer.from_pretrained(base_model, trust_remote_code=True)
base = AutoModelForCausalLM.from_pretrained(
    base_model,
    device_map="auto",
    trust_remote_code=True,
)
model = PeftModel.from_pretrained(base, adapter_id)
model.eval()

messages = [
    {
        "role": "system",
        "content": (
            "You are an Actor agent in AI Puppet Theater. "
            "Return only one valid JSON object. No markdown. No commentary. "
            "Keep the puppet line short, theatrical, and speakable."
        ),
    },
    {
        "role": "user",
        "content": (
            "premise: ...\n"
            "show_state JSON: {...}\n"
            "actor JSON: {...}\n"
            "director_instruction: ..."
        ),
    },
]

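# Render the chat messages into the model's prompt format as plain text.
prompt = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
# Greedy decoding keeps the JSON output deterministic.
outputs = model.generate(
    **inputs,
    max_new_tokens=192,
    do_sample=False,
)
# Decode only the newly generated tokens, skipping the prompt.
generated = outputs[0][inputs["input_ids"].shape[-1]:]
print(tokenizer.decode(generated, skip_special_tokens=True))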

Limitations

Synthetic dataset; not a broad creative-writing model.
Actor-only behavior; Director planning is handled elsewhere.
Not a general chat or instruction model.
Can emit extra top-level fields, copied state fields, or text around the JSON; use runtime extraction, a sanitizer, and a fallback.
Tool calls must be validated before execution.
Demo-safe filters are simple and do not replace product-level safety review.
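
To illustrate the sanitizer mentioned in the list above, the sketch below drops unexpected top-level fields and normalizes tool_request into the documented argument shape, replacing it with None when it cannot be repaired. The names and the specific repair rules (for example, accepting a bare string in place of the args object) are assumptions made for this example, not the runtime's actual behavior.

```python
ACTOR_FIELDS = (
    "intent", "line", "emotion", "gesture",
    "stage_effect", "memory_update", "tool_request",
)
TOOL_ARG_KEYS = {
    "inspect_prop": "prop",
    "consult_stage_oracle": "question",
    "change_lighting": "mood",
}


def normalize_tool_request(req):
    """Coerce a tool_request into {tool, args, reason}, or return None."""
    if not isinstance(req, dict):
        return None
    tool = req.get("tool")
    if not isinstance(tool, str) or tool not in TOOL_ARG_KEYS:
        return None
    key = TOOL_ARG_KEYS[tool]
    args = req.get("args")
    if isinstance(args, str):
        value = args               # assumed repair: bare string instead of an object
    elif isinstance(args, dict):
        value = args.get(key)      # ignore any unexpected argument keys
    else:
        value = None
    if not isinstance(value, str) or not value.strip():
        return None
    return {"tool": tool, "args": {key: value.strip()}, "reason": str(req.get("reason") or "")}


def sanitize_actor(obj):
    """Keep only known Actor fields and normalize tool_request if present."""
    clean = {field: obj[field] for field in ACTOR_FIELDS if field in obj}
    if "tool_request" in clean:
        clean["tool_request"] = normalize_tool_request(clean["tool_request"])
    return clean


noisy = {
    "intent": "inspect_prop",
    "line": "This rubber duck squeaks exactly like a guilty witness.",
    "story_phase": "complication",  # copied state field, should be dropped
    "tool_request": {"tool": "inspect_prop", "args": " rubber duck "},
}
print(sanitize_actor(noisy))
```

Fields that are missing stay missing in this sketch, so a caller would still need a required-field check and a fallback after sanitizing.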

Related Artifacts

Space: AI Puppet Theater
Dataset: AI-Puppet-Theater-Actor-SFT
Base model: openbmb/MiniCPM5-1B
LoRA adapter: AI-Puppet-Theater-MiniCPM5-Actor-LoRA
GGUF model: AI-Puppet-Theater-MiniCPM5-Actor-GGUF
Product blog: AI Puppet Theater: From Premise to Puppet Show
Fine-tuning blog: Teaching a 1B Model to Speak Puppet JSON
Demo video: YouTube walkthrough

License

This adapter is released under Apache-2.0. The base model openbmb/MiniCPM5-1B is also published under Apache-2.0; users should review the base model license before downstream use.