---
license: mit
language:
- en
library_name: transformers
tags:
- sclm
- stateful
- memory
- earcp
- text-generation
- conversational
pipeline_tag: text-generation
base_model: mistralai/Mistral-7B-v0.1
widget:
- text: "The wizard Elara lived in Silverwood forest. One day, she discovered"
  example_title: "Fantasy Story"
- text: "In the year 2050, humanity had finally achieved"
  example_title: "Science Fiction"
- text: "The detective examined the crime scene carefully. The clues pointed to"
  example_title: "Mystery"
inference:
  parameters:
    max_new_tokens: 100
    temperature: 0.7
    top_p: 0.9
    repetition_penalty: 1.1
---
# 🧠 SCLM: Stateful Coherent Language Model
**SCLM** adds **persistent latent memory** to transformer language models, enabling better coherence across long conversations and multi-turn generation.
## 🎯 Key Features
- **Persistent State**: Memory that evolves across conversation turns
- **Entity Coherence**: Maintains context about characters, places, and objects
- **Edit Mode**: Make local changes without affecting global memory
- **Lightweight**: Only 91.7M additional parameters (2.44% overhead)
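The edit-mode behavior above can be sketched with a toy stateful generator. This is a minimal illustration, not the SCLM API: the class name, state shape, and additive update rule are all assumptions chosen to show the one property that matters, namely that edits run against a snapshot while normal turns mutate the persistent state.

```python
import copy

class StatefulGenerator:
    """Toy model of edit mode: normal turns evolve the persistent latent
    state; edits work on a snapshot and leave global memory untouched."""

    def __init__(self):
        self.state = [0.0, 0.0]  # stand-in for the latent state vector

    def generate(self, delta):
        # Normal turn: the persistent state evolves
        self.state = [s + d for s, d in zip(self.state, delta)]
        return self.state

    def edit(self, delta):
        # Edit mode: operate on a deep copy; self.state is unchanged
        snapshot = copy.deepcopy(self.state)
        return [s + d for s, d in zip(snapshot, delta)]

gen = StatefulGenerator()
gen.generate([1.0, 1.0])        # state becomes [1.0, 1.0]
local = gen.edit([5.0, 5.0])    # local result [6.0, 6.0]; state still [1.0, 1.0]
```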
## 📊 Architecture: EARCP
```
EARCP = Encapsulation + Alignment + Revision + Coherence + Propagation
```
| Component | Function |
|-----------|----------|
| **Encapsulation** | GRU-style state update from hidden states |
| **Alignment** | Cross-attention between state and hidden layers |
| **Revision** | Drift detection and correction |
| **Coherence** | Mixture-of-Experts for consistency |
| **Propagation** | State injection into transformer layers |
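The Encapsulation row describes a GRU-style state update. A minimal pure-Python sketch of such a gated update follows; the scalar weights `w_z`, `w_h` and the per-dimension hidden-state summary are toy placeholders, not the model's actual parameters or dimensions (the real latent state is 256-dimensional):

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def gru_state_update(state, hidden_summary, w_z, w_h):
    """GRU-style gated update: a gate z decides, per dimension, how much
    of the old state to overwrite with new content from the hidden states."""
    new_state = []
    for s, h in zip(state, hidden_summary):
        z = sigmoid(w_z * (s + h))      # update gate: how much to overwrite
        cand = math.tanh(w_h * h)       # candidate content from hidden summary
        new_state.append((1 - z) * s + z * cand)
    return new_state
```

Because the output is a convex combination of the previous state and a `tanh`-bounded candidate, each dimension stays in (-1, 1), which keeps the state from drifting unboundedly across turns.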
## 🔧 Model Details
| Parameter | Value |
|-----------|-------|
| Base Model | mistralai/Mistral-7B-v0.1 |
| EARCP Parameters | 91.7M |
| Latent State Dim | 256 |
| Injection Layers | [8, 16] |
| Alpha (injection strength) | 0.02 |
| Experts | 2 |
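Given these settings, the Propagation step can be read as a small residual injection of the (projected) latent state at layers 8 and 16, scaled by alpha = 0.02. The additive residual form below is an assumption based on the table; the actual projection from the 256-dim state to the hidden size is elided:

```python
INJECTION_LAYERS = {8, 16}  # from the Model Details table
ALPHA = 0.02                # injection strength

def inject_state(hidden, state_proj, layer_idx):
    """Residual-style state injection: only at configured layers,
    scaled down by a small alpha so the base model's behavior dominates."""
    if layer_idx not in INJECTION_LAYERS:
        return hidden  # all other layers pass through unchanged
    return [h + ALPHA * s for h, s in zip(hidden, state_proj)]
```

A small alpha like 0.02 nudges rather than overrides the hidden states, which is a common way to keep a pretrained backbone stable while adding a new signal.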
## 🚀 Quick Start
```python
# Note: full SCLM inference requires custom loading (see the steps below);
# the hosted inference widget runs the base model only.
from transformers import AutoTokenizer

# The tokenizer is shared with the base Mistral model
tokenizer = AutoTokenizer.from_pretrained("amewebstudio/ananke-sclm")

# For full SCLM functionality, assemble the model manually:
# 1. Load the base mistralai/Mistral-7B-v0.1 checkpoint
# 2. Load the EARCP weights from earcp_weights.pt
# 3. Apply the SCLM wrapper
```
## 📈 Validation Results
| Test | Result |
|------|--------|
| Forward Pass | ✅ |
| State Evolution | ✅ (norm: 0 → 4.6 → 7.5) |
| Coherent Generation | ✅ |
| Edit Mode | ✅ |
| Entity Memory | ✅ (Elara, Nimbus retained) |
## 💡 Use Cases
- **Interactive Fiction**: Characters and plot points remain consistent
- **Long Conversations**: Context persists without growing prompts
- **Creative Writing**: Maintain story coherence across chapters
- **Role-Playing**: NPCs remember past interactions
## πŸ“ Citation
```bibtex
@article{amega2025sclm,
  title={SCLM: Stateful Coherent Language Models with EARCP Architecture},
  author={Amega, Mike},
  year={2025},
  note={Ame Web Studio}
}
```
## 👤 Author
**Mike Amega** - [Ame Web Studio](https://github.com/Volgat)
---
*SCLM is an experimental architecture exploring persistent memory in language models.*