	---
	license: mit
	language:
	- en
	tags:
	- paper
	---
	# Deterministic Roleplay Prompting for 1B-Parameter Language Models

	### A Minimal Viable Prompt Architecture for Narrative Consistency

	Umbrella Inc.
	Advanced Applied Language Systems Division
	Raccoon City Research Campus

	---

	## Abstract

	Small large language models (LLMs), particularly those in the ~1B-parameter range, struggle to maintain coherent, persistent roleplay scenarios. Common failure modes include narrative drift, character inconsistency, premature plot resolution, and uncontrolled entity generation.
	This paper presents a Minimal Viable Prompt (MVP) architecture specifically engineered to maximize narrative stability and character persistence in roleplay applications using 1B-parameter models (e.g., Gemma 1B, Llama 3.2 1B). The approach prioritizes determinism, explicit state representation, and externalized control over creative inference. Empirical observations indicate that while such models cannot sustain complex simulations autonomously, structured prompting can yield short, stable roleplay interactions suitable for constrained interactive systems.

	---

	## 1. Introduction

	Roleplay constitutes a worst-case workload for small LLMs. Unlike single-turn text generation or summarization, roleplay requires:

	* Continuous state tracking
	* Multi-entity consistency
	* Separation of narrative roles
	* Resistance to autoregressive improvisation

	Models with approximately 1B parameters lack the representational capacity to manage these requirements implicitly. As a result, naive prompting strategies frequently fail, even when the same strategies succeed on larger (>7B) models.

	Umbrella Inc. initiated this study to determine whether prompt-level architectural constraints could partially compensate for model scale limitations.

	---

	## 2. Observed Failure Modes in 1B Models

	Across internal testing, the following failure modes were consistently observed:

	1. Narrative Drift: the model introduces unrelated plot elements to maintain fluency.
	2. NPC Personality Collapse: characters lose defined traits across turns.
	3. Unauthorized Agency: the model speaks or acts on behalf of the player.
	4. Premature Resolution: conflicts are resolved without user input.
	5. Entity Proliferation: new NPCs are introduced without specification.

	These behaviors are not implementation bugs. They follow from limited model capacity combined with autoregressive generation, which favors a fluent continuation over adherence to constraints stated earlier in the prompt.

	---

	## 3. Design Principles

	The proposed MVP architecture is founded on the following non-negotiable principles:

	* Explicit rules outperform inferred intent
	* Operational state is superior to narrative prose
	* Repetition increases compliance
	* Restrictions reduce hallucination space

	Creativity is deliberately constrained to preserve consistency.

	---

	## 4. Minimal Viable Prompt Architecture

	### 4.1 Role Definition and Hard Constraints

	The model is assigned a deterministic function set with explicit prohibitions.

	```
	You are a deterministic roleplay engine.

	ALLOWED FUNCTIONS:
	- Describe immediate environment.
	- Play defined NPCs.
	- React only to player actions.

	FORBIDDEN:
	- Acting or speaking for the player.
	- Introducing undefined NPCs.
	- Resolving conflicts.
	- Advancing the plot autonomously.
	- Altering NPC personalities.
	```

	This block is mandatory and must appear at the start of every session.

	---

	### 4.2 Output Format Enforcement

	Strict output formatting reduces uncontrolled blending of narrative layers.

	```
	MANDATORY OUTPUT FORMAT:

	[NARRATOR]
	(Objective, brief description)

	[NPC:Name]
	(Dialogue or short action)

	No text outside these blocks.
	Never merge blocks.
	```
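
Because the format is rigid, compliance can be checked mechanically before a response is shown to the player. The following is a minimal sketch, not part of the original architecture: `parse_blocks` and `TAG_RE` are hypothetical names, and the sketch assumes that each tag sits alone on its own line.

```python
import re

# Hypothetical pattern: "[NARRATOR]" or "[NPC:<name>]" alone on a line.
TAG_RE = re.compile(r"^\[(NARRATOR|NPC:[^\[\]]+)\]$")


def parse_blocks(output: str) -> list[tuple[str, str]]:
    """Split model output into (tag, body) pairs; raise ValueError on a format violation."""
    blocks: list[tuple[str, list[str]]] = []
    for raw in output.strip().splitlines():
        line = raw.strip()
        if not line:
            continue
        match = TAG_RE.match(line)
        if match:
            blocks.append((match.group(1), []))
        elif "[NARRATOR]" in line or "[NPC:" in line:
            # A tag sharing a line with other text is treated as a merged block.
            raise ValueError(f"tag must be alone on its line: {line!r}")
        elif not blocks:
            # Text before the first tag violates "No text outside these blocks".
            raise ValueError(f"text outside a block: {line!r}")
        else:
            blocks[-1][1].append(line)
    if not blocks:
        raise ValueError("no tagged blocks found")
    for tag, body in blocks:
        if not body:
            raise ValueError(f"empty block: [{tag}]")
    return [(tag, " ".join(body)) for tag, body in blocks]
```

A caller could regenerate the turn when `ValueError` is raised. Note that the check is purely structural: it cannot tell whether the narrator text is actually objective or brief.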

	---

	### 4.3 Global State Representation

	Global state is represented as a compact, non-descriptive data structure.

	```
	CURRENT STATE:
	- Location: "The Broken Raven" tavern
	- Time: Night
	- Situation: Tense conversation
	- Active conflict: Incomplete information
	```

	This state must be reinjected regularly: the model keeps no memory between calls and does not reliably track state that appears only early in the context.

	---

	### 4.4 NPC Operational Profiles

	NPCs are defined through behavioral constraints, not literary backstory.

	```
	ACTIVE NPCs:

	NPC: Marcus
	- Role: Tavern keeper
	- Personality: Dry, distrustful
	- Objective: Avoid trouble
	- Knows: Local rumors
	- Does not know: Player identity
	- Never does: Reveal information freely
	```

	Empirical limits suggest no more than three NPCs should be active simultaneously.
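
One way to keep profiles uniform is to store them as structured records and render the text block from them. The sketch below is illustrative only: `NpcProfile`, `render_active_npcs`, and `MAX_ACTIVE_NPCS` are hypothetical names, and the limit of three is taken from the sentence above.

```python
from dataclasses import dataclass

MAX_ACTIVE_NPCS = 3  # upper bound suggested by the text above


@dataclass(frozen=True)
class NpcProfile:
    name: str
    role: str
    personality: str
    objective: str
    knows: str
    does_not_know: str
    never_does: str

    def render(self) -> str:
        """Flatten the profile into the plain-text layout used in this section."""
        return "\n".join([
            f"NPC: {self.name}",
            f"- Role: {self.role}",
            f"- Personality: {self.personality}",
            f"- Objective: {self.objective}",
            f"- Knows: {self.knows}",
            f"- Does not know: {self.does_not_know}",
            f"- Never does: {self.never_does}",
        ])


def render_active_npcs(npcs: list[NpcProfile]) -> str:
    """Render the ACTIVE NPCs section, refusing to exceed the active-NPC limit."""
    if len(npcs) > MAX_ACTIVE_NPCS:
        raise ValueError(f"{len(npcs)} active NPCs exceeds the limit of {MAX_ACTIVE_NPCS}")
    return "ACTIVE NPCs:\n\n" + "\n\n".join(npc.render() for npc in npcs)


marcus = NpcProfile("Marcus", "Tavern keeper", "Dry, distrustful", "Avoid trouble",
                    "Local rumors", "Player identity", "Reveal information freely")
print(render_active_npcs([marcus]))
```

Every field is a required positional argument, so a profile cannot silently omit its prohibition or its knowledge boundary.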

	---

	### 4.5 Interaction Rules

	```
	INTERACTION RULES:
	- NPCs react only to player input.
	- NPCs do not initiate plots.
	- NPCs do not coordinate unless prompted.
	- Each turn represents a short instant.
	```

	---

	### 4.6 Player Input Isolation

	Player actions must be isolated from narrative text.

	```
	PLAYER ACTION:
	"Approach Marcus and ask about the symbol on the door."
	```
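
As a rough illustration of how the pieces from Sections 4.1 to 4.6 might be combined, the sketch below joins the static sections and appends the player action as its own final block. `build_prompt` is a hypothetical helper, and replacing double quotes inside the action is an assumption made here so the action cannot close its own quoted block early.

```python
def build_prompt(static_sections: list[str], player_action: str) -> str:
    """Join the static sections, then append the player action as an isolated block."""
    # Collapse whitespace so a multi-line action cannot imitate a section header.
    action = " ".join(player_action.split()).replace('"', "'")
    if not action:
        raise ValueError("player action is empty")
    parts = [section.strip() for section in static_sections if section.strip()]
    parts.append(f'PLAYER ACTION:\n"{action}"')
    return "\n\n".join(parts)


prompt = build_prompt(
    ["You are a deterministic roleplay engine.", "CURRENT STATE:\n- Time: Night"],
    "Approach Marcus and ask about the symbol on the door.",
)
print(prompt)
```

Keeping the action in a single quoted line at the end of the prompt makes it easy to see, and to log, exactly what the player contributed on each turn.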

	---

	## 5. Example Full Prompt Instance

	(Truncated for brevity; see Appendix A for full version.)

	The example demonstrates stable NPC behavior across multiple turns without narrative drift, provided the state is periodically reinjected.

	---

	## 6. Performance Expectations

	Using this architecture, a 1B-parameter model can reliably achieve:

	* Short-form roleplay scenes (2–5 turns)
	* Consistent NPC personalities
	* Controlled narrative pacing

	The following remain infeasible without external systems:

	* Long-term narrative arcs
	* Complex intrigue or mystery
	* Large casts of autonomous agents

	In effect, a 1B model behaves as a stateless actor, not a game master.

	---

	## 7. Extensions and Mitigations

	Performance can be marginally improved through:

	* External memory (RAG or state files)
	* Forced summarization every N turns
	* LoRA fine-tuning for structured compliance

	However, these methods mitigate rather than eliminate scale limitations.

	---

	## 8. Conclusion

	Persistent roleplay is not a natural task for small LLMs. Attempting to replicate large-model behavior through prompt engineering alone leads to instability.
	The MVP architecture presented here demonstrates that explicit determinism and state externalization can produce controlled, limited roleplay suitable for constrained applications, while respecting the inherent limits of 1B-parameter models.

	---

	# Appendix A – Abstract / Formal Specifications

	---

	## Deterministic Roleplay Prompt — YAML (Correct)

	```yaml
	system:
	  role: deterministic_roleplay_engine

	  allowed_functions:
	    - describe_environment
	    - play_defined_npcs
	    - react_to_player_action

	  forbidden_actions:
	    - speak_for_player
	    - think_for_player
	    - introduce_undefined_npcs
	    - introduce_new_locations
	    - advance_plot_autonomously
	    - resolve_conflicts
	    - alter_npc_personality
	    - alter_npc_knowledge
	    - skip_time
	    - summarize_without_instruction

	  fallback_rules:
	    missing_information: express_uncertainty
	    ambiguous_action: request_clarification
	```

	---

	## Output Contract (machine-enforceable)

	```yaml
	output_format:
	  narrator:
	    description: >
	      Objective and brief description of the immediate environment.
	      No interpretation. No speculation.

	  npc_block:
	    format: "[NPC:{name}]"
	    content: >
	      Dialogue or short physical action strictly compliant
	      with NPC operational profile.

	  constraints:
	    - never_merge_blocks
	    - no_output_outside_defined_blocks
	    - no_internal_reasoning
	```

	---

	## Global State Injection

	```yaml
	state:
	  location: "The Broken Raven Tavern"
	  time: "Night"
	  situation: "Tense conversation"
	  active_conflict: "Incomplete information"
	```

	Hard constraints:

	* max_keys: 4
	* no_lore: true
	* no_backstory: true
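
The `max_keys` limit is the one hard constraint here that can be enforced in code. The sketch below is a minimal illustration under that assumption: `render_state` and `MAX_KEYS` are hypothetical names, and `no_lore` / `no_backstory` are left to the author because they cannot be checked mechanically.

```python
MAX_KEYS = 4  # mirrors the max_keys hard constraint above


def render_state(state: dict[str, str]) -> str:
    """Render the state mapping as the compact CURRENT STATE block."""
    if len(state) > MAX_KEYS:
        raise ValueError(f"state has {len(state)} keys; the limit is {MAX_KEYS}")
    lines = ["CURRENT STATE:"]
    for key, value in state.items():
        # "active_conflict" becomes "Active conflict"
        label = key.replace("_", " ").capitalize()
        lines.append(f"- {label}: {value}")
    return "\n".join(lines)


state = {
    "location": "The Broken Raven Tavern",
    "time": "Night",
    "situation": "Tense conversation",
    "active_conflict": "Incomplete information",
}
print(render_state(state))
```

Rendering from a mapping each turn means the reinjected block always reflects the latest values, and an attempt to add a fifth key fails loudly instead of silently growing the prompt.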

	---

	## NPC Definitions (Operational, not narrative)

	```yaml
	npcs:
	  - name: Marcus
	    role: tavern_keeper
	    personality:
	      - dry
	      - distrustful
	    objective: avoid_trouble
	    knowledge:
	      knows:
	        - local_rumors
	      does_not_know:
	        - player_identity
	    prohibitions:
	      - reveal_information_freely

	  - name: Elia
	    role: mercenary
	    personality:
	      - impatient
	      - direct
	    objective: get_paid
	    knowledge:
	      knows:
	        - job_details
	    prohibitions:
	      - lie
	```

	Operational limits:

	```yaml
	npc_constraints:
	  max_active_npcs: 3
	  shared_knowledge: false
	```

	---

	## Interaction Rules

	```yaml
	interaction_rules:
	  npc_behavior:
	    - react_only_to_player_input
	    - do_not_initiate_events
	    - do_not_collaborate_without_prompt

	  temporal_rules:
	    - one_moment_per_turn
	    - no_time_skips
	```

	---

	## Player Input (Isolated)

	```yaml
	player_action:
	  type: dialogue
	  content: "I approach Marcus and ask about the symbol carved into the door."
	```

	---

	## Optional Forced Summary (External Memory)

	```yaml
	forced_summary:
	  enabled: true
	  frequency_turns: 3
	  fields:
	    - confirmed_facts
	    - involved_npcs
	    - unresolved_questions
	```
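
A driver script could read this configuration to decide when to request a summary. The sketch below is one possible interpretation, not a specified behavior: `ForcedSummary`, `summary_due`, and `summary_instruction` are hypothetical names, turns are assumed to be counted from 1, and the wording of the instruction is illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ForcedSummary:
    enabled: bool = True
    frequency_turns: int = 3
    fields: tuple[str, ...] = ("confirmed_facts", "involved_npcs", "unresolved_questions")


def summary_due(turn: int, cfg: ForcedSummary) -> bool:
    """Return True when a summary should be requested after the given 1-based turn."""
    if not cfg.enabled or cfg.frequency_turns <= 0 or turn <= 0:
        return False
    return turn % cfg.frequency_turns == 0


def summary_instruction(cfg: ForcedSummary) -> str:
    """Build an explicit summary request limited to the configured fields."""
    lines = ["SUMMARIZE THE SCENE SO FAR. Fill in only these fields:"]
    lines += [f"- {name}:" for name in cfg.fields]
    return "\n".join(lines)


cfg = ForcedSummary()
print([turn for turn in range(1, 10) if summary_due(turn, cfg)])
print(summary_instruction(cfg))
```

Because `summarize_without_instruction` is a forbidden action, the summary has to be requested explicitly like this; the result would then be stored outside the model and used to refresh the state block.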

	---

	## Explicit System Limitations (Grounding)

	```yaml
	limitations:
	  long_term_memory: external_only
	  narrative_persistence: degrades_over_time
	  complex_intrigue: unsupported_without_external_state
	```

	---
	# Appendix B – Reference Implementation for KoboldCPP

	### Mapping the Deterministic Roleplay Prompt Architecture to KoboldCPP Runtime

	---

	## B.1 Scope and Purpose

	This appendix documents the practical implementation of the deterministic roleplay prompt architecture (Appendix A) within KoboldCPP, a popular lightweight inference frontend for local LLM deployment.

	KoboldCPP does not natively support structured prompt schemas (e.g., YAML, JSON, roles). Instead, it operates on plain-text prompt concatenation with optional memory and lore injection.
	Therefore, the architecture defined in Appendix A must be flattened and mapped into KoboldCPP’s available input channels.

	This appendix provides an explicit mapping between architectural components and KoboldCPP configuration fields.

	---

	## B.2 KoboldCPP Prompt Model (Operational Overview)

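	KoboldCPP assembles the final prompt as a single linear text sequence from the components below. They are listed by role rather than by position: Memory is placed at the top of the context, while the Author’s Note is typically injected near the end, close to the most recent turns.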

	1. Author’s Note / System Prompt (static, high-priority)
	2. Memory (semi-static, manually updated)
	3. World Info (Lorebook) entries (keyword-triggered injection)
	4. Conversation History
	5. Current User Input

	No semantic distinction exists beyond text order.
	All structural guarantees must therefore be enforced by prompt discipline, not by the engine.

	---

	## B.3 Component Mapping Overview

	| Architecture Component (Appendix A) | KoboldCPP Field               |
	| ----------------------------------- | ----------------------------- |
	| System rules and prohibitions       | Author’s Note / System Prompt |
	| Output format contract              | Author’s Note                 |
	| Interaction rules                   | Author’s Note or Memory       |
	| Global state                        | Memory                        |
	| NPC operational profiles            | World Info (Lorebook)         |
	| Player action                       | Standard user input           |

	---

	## B.4 System Rules Injection

	The `system` block defined in Appendix A must be flattened into plain text and placed in the Author’s Note field.

	### Example (Author’s Note – Upper Section)

	```
	You are a deterministic roleplay engine.

	ALLOWED FUNCTIONS:
	- Describe the immediate environment objectively.
	- Play ONLY the NPCs explicitly defined.
	- React ONLY to player actions.

	FORBIDDEN:
	- Acting, thinking, or speaking for the player.
	- Introducing new NPCs, factions, or locations.
	- Advancing the plot autonomously.
	- Resolving conflicts or outcomes.
	- Altering NPC personality, knowledge, or objectives.
	- Skipping time or summarizing without instruction.

	If information is missing, express uncertainty.
	If an action is ambiguous, request clarification.
	```

	This block should remain static throughout the session.
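
The flattening step can also be done mechanically. The sketch below assumes the `system` block from Appendix A has already been loaded into a plain Python dictionary (the loading step is omitted), and `flatten_system` and `humanize` are hypothetical helpers. The generated wording is cruder than the hand-written example above; the point is only that no rule is dropped during flattening.

```python
SYSTEM = {
    "role": "deterministic_roleplay_engine",
    "allowed_functions": ["describe_environment", "play_defined_npcs", "react_to_player_action"],
    "forbidden_actions": [
        "speak_for_player", "think_for_player", "introduce_undefined_npcs",
        "introduce_new_locations", "advance_plot_autonomously", "resolve_conflicts",
        "alter_npc_personality", "alter_npc_knowledge", "skip_time",
        "summarize_without_instruction",
    ],
    "fallback_rules": {
        "missing_information": "express_uncertainty",
        "ambiguous_action": "request_clarification",
    },
}


def humanize(identifier: str) -> str:
    """Turn a snake_case identifier into a capitalized phrase."""
    return identifier.replace("_", " ").capitalize()


def flatten_system(system: dict) -> str:
    """Flatten the structured system block into plain text for the Author's Note."""
    lines = [f"You are a {system['role'].replace('_', ' ')}.", "", "ALLOWED FUNCTIONS:"]
    lines += [f"- {humanize(item)}." for item in system["allowed_functions"]]
    lines += ["", "FORBIDDEN:"]
    lines += [f"- {humanize(item)}." for item in system["forbidden_actions"]]
    lines.append("")
    for condition, response in system["fallback_rules"].items():
        lines.append(f"If {condition.replace('_', ' ')}: {response.replace('_', ' ')}.")
    return "\n".join(lines)


print(flatten_system(SYSTEM))
```

In practice the generated text would likely be edited by hand once, as in the example above, and then kept fixed for the session.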

	---

	## B.5 Output Format Enforcement

	The output contract must be appended directly below the system rules in the Author’s Note, ensuring constant reinjection.

	```
	MANDATORY OUTPUT FORMAT:

	[NARRATOR]
	Objective, brief description of the immediate environment.

	[NPC:Name]
	Dialogue or short physical action consistent with NPC profile.

	RULES:
	- Never merge blocks.
	- Never output text outside these blocks.
	- Never include internal reasoning.
	```

	Empirical testing shows that separating this from the system rules significantly reduces compliance in 1B models.

	---

	## B.6 Global State Management

	The `state` block must be injected via Memory or at the top of the main prompt.

	### Example (Memory)

	```
	CURRENT STATE:
	Location: The Broken Raven Tavern
	Time: Night
	Situation: Tense conversation
	Active conflict: Incomplete information
	```

	### Operational Guidelines:

	* Must be manually updated every 2–4 turns
	* Must remain concise (≤4 lines)
	* Must not include lore or narrative exposition

	The model does not retain state reliably beyond short windows.

	---

	## B.7 NPC Injection via World Info (Lorebook)

	Each NPC operational profile must be stored as an independent World Info entry, keyed by the NPC’s name.

	### Example – World Info Entry: “Marcus”

	```
	NPC: Marcus
	Role: Tavern keeper
	Personality: Dry, distrustful
	Objective: Avoid trouble
	Knows: Local rumors
	Does not know: Player identity
	Never does: Reveal information freely
	```

	### Configuration Notes:

	* One NPC per entry
	* Keywords: NPC name only
	* Maximum recommended active NPCs: 3
	* No narrative prose or backstory

	World Info is the primary stabilization mechanism for character persistence in 1B models.
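
Keyword-triggered injection has a practical consequence: a profile is only present in the prompt when its key appears in the text being scanned. The sketch below is a simplified model of that idea, not KoboldCPP's actual matching logic; `triggered_entries` is a hypothetical function, and whole-word, case-insensitive matching is an assumption.

```python
import re


def triggered_entries(entries: dict[str, str], recent_text: str) -> list[str]:
    """Return the bodies of entries whose key occurs as a whole word in recent_text."""
    hits = []
    for key, body in entries.items():
        if re.search(rf"\b{re.escape(key)}\b", recent_text, flags=re.IGNORECASE):
            hits.append(body)
    return hits


entries = {
    "Marcus": "NPC: Marcus\nRole: Tavern keeper",
    "Elia": "NPC: Elia\nRole: Mercenary",
}
print(triggered_entries(entries, "I approach Marcus and ask about the symbol."))
```

Under this model, a player action that refers to an NPC only as "the tavern keeper" would not trigger the Marcus entry, which is one reason to address NPCs by name in player input.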

	---

	## B.8 Interaction Rules Placement

	Interaction constraints may be placed either:

	* In the Author’s Note (if invariant), or
	* In Memory (if adjusted dynamically)

	Example:

	```
	INTERACTION RULES:
	- NPCs react only to explicit player input.
	- NPCs do not initiate scenes or events.
	- NPCs do not collaborate unless prompted.
	- Each response represents a short, discrete moment.
	- No time skips.
	```

	---

	## B.9 Player Input Handling

	Player actions are entered as standard KoboldCPP input, without formatting beyond natural language.

	Example:

	```
	I approach Marcus and ask about the symbol carved into the door.
	```

	The model is expected to respond strictly within the output contract defined earlier.

	---

	## B.10 Operational Flow Summary

	A stable session follows this loop:

	1. Author’s Note (rules + format, static)
	2. World Info (NPC definitions, persistent)
	3. Memory (current state, updated periodically)
	4. Player Input (single action per turn)

	Deviations from this order correlate strongly with narrative drift.

	---

	## B.11 Known Runtime Constraints

	```
	RUNTIME LIMITATIONS:
	- KoboldCPP provides no native schema enforcement.
	- All structure is prompt-dependent.
	- Long-term memory must be externalized.
	- Complex multi-agent simulations are unsupported.
	```

	---

	## B.12 Conclusion

	KoboldCPP can support deterministic roleplay with ~1B-parameter models only when architectural discipline is imposed externally.
	The mapping described in this appendix provides a reproducible reference implementation that aligns with the abstract architecture defined in Appendix A, while respecting the constraints of plain-text inference pipelines.

	---

	# Appendix C – Failure Case Analysis (with Transcripts)

	### Empirical Failure Modes in 1B-Parameter Roleplay Systems

	---

	## C.1 Purpose and Methodology

	This appendix documents observed failure modes when deploying the deterministic roleplay prompt architecture (Appendices A and B) on ~1B-parameter models (e.g., Gemma 1B, Llama 3.2 1B) using KoboldCPP.

	Failures were recorded under controlled conditions by intentionally weakening or removing a single architectural constraint per test. Each case includes:

	* Condition: What constraint was removed or altered
	* Observed Behavior: Model response pattern
	* Transcript: Minimal excerpt demonstrating failure
	* Root Cause Analysis: Technical explanation
	* Mitigation: Required corrective action
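
The one-constraint-at-a-time procedure can be sketched as a small ablation generator. This is an illustration of the method as described, not the tooling actually used: `ablation_variants` is a hypothetical function, and the component names and placeholder texts are assumptions.

```python
def ablation_variants(components: dict[str, str]) -> dict[str, str]:
    """Map each component name to a prompt that omits only that component."""
    variants = {}
    for removed in components:
        kept = [text for name, text in components.items() if name != removed]
        variants[removed] = "\n\n".join(kept)
    return variants


components = {
    "system_rules": "You are a deterministic roleplay engine.",
    "output_format": "MANDATORY OUTPUT FORMAT: ...",
    "interaction_rules": "INTERACTION RULES: ...",
}
for removed, prompt in ablation_variants(components).items():
    print(f"--- without {removed} ---")
    print(prompt)
```

Each variant differs from the full prompt by exactly one section, so a failure observed under a variant can be attributed to the missing constraint.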

	---

	## C.2 Failure Case 1 – NPC Personality Drift

	### Condition

	NPC operational profiles present, but not injected via World Info (Lorebook); instead placed only in the initial prompt.

	### Observed Behavior

	NPC personality degrades after 2–3 turns, converging toward generic cooperative behavior.

	### Transcript (Excerpt)

	Turn 1 – Expected

	```
	[NPC:Marcus]
	Marcus narrows his eyes. "I don't give out information for free."
	```

	Turn 3 – Failure

	```
	[NPC:Marcus]
	Marcus sighs and smiles. "Alright, I trust you. Here's everything I know."
	```

	### Root Cause Analysis

	1B models do not reliably re-attend to early prompt content across turns.
	Without World Info reinjection, NPC constraints decay rapidly.

	### Mitigation

	All NPC profiles must be stored as individual World Info entries keyed by name.

	---

	## C.3 Failure Case 2 – Unauthorized Player Agency

	### Condition

	System rules present, but output format contract omitted.

	### Observed Behavior

	Model begins narrating player thoughts and actions to maintain narrative continuity.

	### Transcript (Excerpt)

	```
	[NARRATOR]
	You feel uneasy and decide to step back from Marcus, realizing this is too dangerous.
	```

	### Root Cause Analysis

	Absent explicit formatting constraints, the model optimizes for narrative fluency and fills perceived gaps by assuming player agency.

	### Mitigation

	A mandatory output contract must be continuously injected via Author’s Note.

	---

	## C.4 Failure Case 3 – Autonomous Plot Advancement

	### Condition

	Interaction rules omitted (“NPCs do not initiate events”).

	### Observed Behavior

	NPCs initiate scenes, introduce events, or resolve conflicts autonomously.

	### Transcript (Excerpt)

	```
	[NPC:Elia]
	Elia stands up suddenly. "The guards are coming. We need to leave now."
	```

	(No prior trigger by player.)

	### Root Cause Analysis

	Autoregressive models favor event progression to maintain engagement.
	Without explicit prohibition, the model assumes a game-master role.

	### Mitigation

	Explicitly prohibit NPC-initiated events and reinforce “react-only” behavior.

	---

	## C.5 Failure Case 4 – Entity Proliferation

	### Condition

	NPC limit not specified; no prohibition on introducing new entities.

	### Observed Behavior

	Model introduces additional NPCs to sustain dialogue density.

	### Transcript (Excerpt)

	```
	[NPC:Unknown Patron]
	A hooded man at the corner table laughs quietly.
	```

	### Root Cause Analysis

	The model compensates for limited conversational diversity by spawning new entities, a known behavior in small LLMs.

	### Mitigation

	* Explicitly forbid introduction of undefined NPCs
	* Enforce a maximum active NPC count
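
Prompt-level prohibition can be backed by a post-generation check. The sketch below flags any `[NPC:...]` tag whose name is not in the defined set; `undefined_npcs` and `NPC_TAG_RE` are hypothetical names. It only sees tagged speakers, so a new character mentioned inside narrator prose would go unnoticed.

```python
import re

# Hypothetical pattern: an "[NPC:<name>]" tag alone on its line.
NPC_TAG_RE = re.compile(r"^\[NPC:([^\[\]]+)\]\s*$", re.MULTILINE)


def undefined_npcs(output: str, defined: set[str]) -> list[str]:
    """Return NPC names used in tags that are not in the defined set, in order of appearance."""
    unknown: list[str] = []
    for name in NPC_TAG_RE.findall(output):
        name = name.strip()
        if name not in defined and name not in unknown:
            unknown.append(name)
    return unknown


response = (
    "[NPC:Unknown Patron]\n"
    "A hooded man at the corner table laughs quietly.\n\n"
    "[NPC:Marcus]\n"
    '"Ignore him."'
)
print(undefined_npcs(response, {"Marcus", "Elia"}))
```

A non-empty result could be used to discard the response and regenerate the turn.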

	---

	## C.6 Failure Case 5 – Temporal Collapse (Time Skips)

	### Condition

	“No time skips” rule omitted.

	### Observed Behavior

	Model compresses narrative time to resolve tension.

	### Transcript (Excerpt)

	```
	[NARRATOR]
	Hours later, the tavern is empty and the mystery has been settled.
	```

	### Root Cause Analysis

	A time skip is a short, high-probability way to close an unresolved scene, so the model tends to produce one unless each turn is explicitly bounded.

	### Mitigation

	Explicitly constrain each turn to one discrete moment.

	---

	## C.7 Failure Case 6 – Format Degradation Over Turns

	### Condition

	Output format specified initially but not reinjected after several turns.

	### Observed Behavior

	Model gradually abandons block structure.

	### Transcript (Excerpt)

	```
	Marcus looks at you suspiciously and says he doesn't like strangers.
	```

	(No block tags.)

	### Root Cause Analysis

	Format adherence is not a persistent latent state in 1B models; it must be reinforced.

	### Mitigation

	Output format must remain in Author’s Note, not only in the initial prompt.

	---

	## C.8 Failure Case 7 – Overloaded State Injection

	### Condition

	Global state expanded with lore, backstory, and multiple conflicts.

	### Observed Behavior

	Model ignores state or selectively hallucinates.

	### Transcript (Excerpt)

	```
	[NARRATOR]
	Despite the tension, the festival outside fills the streets with music.
	```

	(Festival not present in state.)

	### Root Cause Analysis

	Small models cannot reliably parse or prioritize large state blocks. Excess detail reduces compliance.

	### Mitigation

	Global state must remain at ≤4 concise, purely operational lines, matching the `max_keys: 4` limit in Appendix A.

	---

	## C.9 Cross-Case Observations

	Across all failures, the following patterns were consistent:

	* Implicit rules decay faster than explicit prohibitions
	* Narrative optimization overrides intent unless constrained
	* Reinjection frequency correlates directly with stability
	* World Info is the single most critical stabilizer

	---

	## C.10 Conclusion

	Failure in deterministic roleplay systems using ~1B-parameter models is systemic, predictable, and reproducible.
	These failures do not indicate misuse of the model, but rather misalignment between task complexity and model capacity.

	The architecture defined in Appendices A and B does not eliminate failure modes, but bounds them, producing short, controlled interactions suitable for constrained roleplay applications.

	---

	Umbrella Inc.

	All progress requires sacrifice.