---
license: mit
language:
- en
library_name: transformers
tags:
- sclm
- stateful
- memory
- earcp
- text-generation
- conversational
pipeline_tag: text-generation
base_model: mistralai/Mistral-7B-v0.1
widget:
- text: "The wizard Elara lived in Silverwood forest. One day, she discovered"
  example_title: "Fantasy Story"
- text: "In the year 2050, humanity had finally achieved"
  example_title: "Science Fiction"
- text: "The detective examined the crime scene carefully. The clues pointed to"
  example_title: "Mystery"
inference:
  parameters:
    max_new_tokens: 100
    temperature: 0.7
    top_p: 0.9
    repetition_penalty: 1.1
---

# 🧠 SCLM: Stateful Coherent Language Model

**SCLM** adds **persistent latent memory** to transformer language models, enabling better coherence across long conversations and multi-turn generation.

## 🎯 Key Features

- **Persistent State**: Memory that evolves across conversation turns
- **Entity Coherence**: Maintains context about characters, places, and objects
- **Edit Mode**: Make local changes without affecting global memory
- **Lightweight**: Only 91.7M additional parameters (2.44% overhead)

## 🏗️ Architecture: EARCP

```
EARCP = Encapsulation + Alignment + Revision + Coherence + Propagation
```

| Component | Function |
|-----------|----------|
| **Encapsulation** | GRU-style state update from hidden states |
| **Alignment** | Cross-attention between state and hidden layers |
| **Revision** | Drift detection and correction |
| **Coherence** | Mixture-of-Experts for consistency |
| **Propagation** | State injection into transformer layers |
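
To make the Encapsulation and Propagation rows concrete, here is a minimal PyTorch sketch, not the released implementation: the class name and mean-pooling choice are illustrative assumptions, while `state_dim=256` and `alpha=0.02` come from the Model Details table below and 4096 is Mistral-7B's hidden width.

```python
import torch
import torch.nn as nn

class EARCPSketch(nn.Module):
    """Illustrative Encapsulation + Propagation only; the Alignment,
    Revision, and Coherence components are omitted for brevity."""

    def __init__(self, hidden_dim=4096, state_dim=256, alpha=0.02):
        super().__init__()
        self.encapsulate = nn.GRUCell(hidden_dim, state_dim)  # GRU-style state update
        self.propagate = nn.Linear(state_dim, hidden_dim)     # map state back to model width
        self.alpha = alpha                                    # injection strength

    def update_state(self, hidden_states, state):
        # Encapsulation: pool the turn's hidden states, fold them into the latent state.
        pooled = hidden_states.mean(dim=1)        # (batch, hidden_dim)
        return self.encapsulate(pooled, state)    # (batch, state_dim)

    def inject(self, hidden_states, state):
        # Propagation: add a scaled projection of the state at the chosen layers.
        return hidden_states + self.alpha * self.propagate(state).unsqueeze(1)
```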

## 🔧 Model Details

| Parameter | Value |
|-----------|-------|
| Base Model | mistralai/Mistral-7B-v0.1 |
| EARCP Parameters | 91.7M |
| Latent State Dim | 256 |
| Injection Layers | [8, 16] |
| Alpha (injection strength) | 0.02 |
| Experts | 2 |
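
Collected as a plain config object, using the values above (the field names are illustrative, not the repository's actual config schema):

```python
from dataclasses import dataclass, field

@dataclass
class SCLMConfig:
    # Values from the table above; field names are illustrative.
    base_model: str = "mistralai/Mistral-7B-v0.1"
    state_dim: int = 256                              # latent state dimension
    injection_layers: list = field(default_factory=lambda: [8, 16])
    alpha: float = 0.02                               # injection strength
    num_experts: int = 2                              # Coherence MoE experts
```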

## 🚀 Quick Start

```python
# Note: full SCLM functionality requires custom loading (see the sketch below).
# The hosted inference widget uses the base model only.

from transformers import AutoTokenizer

# Load the tokenizer
tokenizer = AutoTokenizer.from_pretrained("amewebstudio/ananke-sclm")

# For full SCLM functionality, load the weights separately:
# 1. Load the base Mistral-7B model
# 2. Load the EARCP weights from earcp_weights.pt
# 3. Apply the SCLM wrapper
```
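
For reference, a minimal sketch of those three steps, assuming the EARCP weights live in `earcp_weights.pt` at the repo root (as named above). `SCLMWrapper` and `load_earcp` are hypothetical stand-ins for the wrapper class this repository actually ships:

```python
import torch
from transformers import AutoModelForCausalLM
from huggingface_hub import hf_hub_download

# 1. Load the base Mistral-7B model
base = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-v0.1",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# 2. Fetch and load the EARCP weights
weights_path = hf_hub_download("amewebstudio/ananke-sclm", "earcp_weights.pt")
earcp_state = torch.load(weights_path, map_location="cpu")

# 3. Apply the SCLM wrapper (hypothetical names; see the repo code for the real API)
# model = SCLMWrapper(base, state_dim=256, injection_layers=[8, 16], alpha=0.02)
# model.load_earcp(earcp_state)
```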

## 📊 Validation Results

| Test | Result |
|------|--------|
| Forward Pass | ✅ |
| State Evolution | ✅ (norm: 0 → 4.6 → 7.5) |
| Coherent Generation | ✅ |
| Edit Mode | ✅ |
| Entity Memory | ✅ (Elara, Nimbus retained) |
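
The state-evolution figures suggest a simple sanity check: log the latent-state norm after each turn and confirm it moves away from zero. The API below (`generate_with_state`, `model.state`) is hypothetical, standing in for however the wrapper exposes its latent state:

```python
# Hypothetical API sketch: track how the latent state norm evolves across turns.
prompts = [
    "The wizard Elara lived in Silverwood forest.",
    "Her owl, Nimbus, waited by the window.",
]
# for turn, prompt in enumerate(prompts, start=1):
#     model.generate_with_state(prompt)                        # updates model.state
#     print(f"turn {turn}: state norm = {model.state.norm():.1f}")
```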

## 💡 Use Cases

- **Interactive Fiction**: Characters and plot points remain consistent
- **Long Conversations**: Context persists without growing prompts
- **Creative Writing**: Maintain story coherence across chapters
- **Role-Playing**: NPCs remember past interactions

## 📝 Citation

```bibtex
@article{amega2025sclm,
  title={SCLM: Stateful Coherent Language Models with EARCP Architecture},
  author={Amega, Mike},
  year={2025},
  note={Ame Web Studio}
}
```

## 👤 Author

**Mike Amega** - [Ame Web Studio](https://github.com/Volgat)

---

*SCLM is an experimental architecture exploring persistent memory in language models.*