---
license: mit
language:
- en
library_name: transformers
tags:
- sclm
- stateful
- memory
- earcp
- text-generation
- conversational
pipeline_tag: text-generation
base_model: mistralai/Mistral-7B-v0.1
widget:
- text: "The wizard Elara lived in Silverwood forest. One day, she discovered"
example_title: "Fantasy Story"
- text: "In the year 2050, humanity had finally achieved"
example_title: "Science Fiction"
- text: "The detective examined the crime scene carefully. The clues pointed to"
example_title: "Mystery"
inference:
parameters:
max_new_tokens: 100
temperature: 0.7
top_p: 0.9
repetition_penalty: 1.1
---
# SCLM: Stateful Coherent Language Model
**SCLM** adds **persistent latent memory** to transformer language models, enabling better coherence across long conversations and multi-turn generation.
## Key Features
- **Persistent State**: Memory that evolves across conversation turns
- **Entity Coherence**: Maintains context about characters, places, and objects
- **Edit Mode**: Make local changes without affecting global memory
- **Lightweight**: Only 91.7M additional parameters (2.44% overhead)
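To make the stateful interface concrete, here is a purely illustrative toy. The `ToySCLM` class and its `generate` signature are hypothetical, not this repository's API; it only demonstrates the semantics of persistent state and edit mode (see Quick Start below for actual loading notes):

```python
from typing import Optional, Tuple
import torch

class ToySCLM:
    """Illustrative stand-in for the stateful interface described above."""
    def __init__(self, state_dim: int = 256):
        self.state_dim = state_dim

    def generate(self, prompt: str, state: Optional[torch.Tensor] = None,
                 edit_mode: bool = False) -> Tuple[str, torch.Tensor]:
        if state is None:
            state = torch.zeros(1, self.state_dim)  # fresh memory
        # Edit mode leaves the global state untouched; normal turns evolve it
        new_state = state if edit_mode else state + 0.1  # toy update rule
        return f"(continuation of {prompt!r})", new_state

sclm = ToySCLM()
text, state = sclm.generate("The wizard Elara lived in Silverwood forest.")
text, state = sclm.generate("She befriended a gryphon named Nimbus.", state=state)
# Local revision that does not mutate the persistent memory
text, _ = sclm.generate("Rewrite the last sentence.", state=state, edit_mode=True)
```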
## Architecture: EARCP
```
EARCP = Encapsulation + Alignment + Revision + Coherence + Propagation
```
| Component | Function |
|-----------|----------|
| **Encapsulation** | GRU-style state update from hidden states |
| **Alignment** | Cross-attention between state and hidden layers |
| **Revision** | Drift detection and correction |
| **Coherence** | Mixture-of-Experts for consistency |
| **Propagation** | State injection into transformer layers |
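As a rough illustration of the Encapsulation step, here is a minimal PyTorch sketch; the module names, mean-pooling choice, and shapes are assumptions for exposition, not the shipped implementation:

```python
import torch
import torch.nn as nn

class EncapsulationSketch(nn.Module):
    """Toy GRU-style update: fold the current hidden states into the latent state."""
    def __init__(self, hidden_dim: int = 4096, state_dim: int = 256):
        super().__init__()
        self.pool = nn.Linear(hidden_dim, state_dim)  # summarize hidden states
        self.cell = nn.GRUCell(state_dim, state_dim)  # recurrent state update

    def forward(self, hidden: torch.Tensor, state: torch.Tensor) -> torch.Tensor:
        # hidden: (batch, seq, hidden_dim) -> mean-pool over positions
        summary = self.pool(hidden.mean(dim=1))
        return self.cell(summary, state)              # evolved (batch, state_dim)

# One encapsulation step from a fresh (zero) state
encap = EncapsulationSketch()
state = encap(torch.randn(1, 10, 4096), torch.zeros(1, 256))
```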
## Model Details
| Parameter | Value |
|-----------|-------|
| Base Model | mistralai/Mistral-7B-v0.1 |
| EARCP Parameters | 91.7M |
| Latent State Dim | 256 |
| Injection Layers | [8, 16] |
| Alpha (injection strength) | 0.02 |
| Experts | 2 |
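To make these numbers concrete, here is a hedged sketch of how the Propagation step might apply them. The injection rule below is inferred from the table (alpha-scaled state projection added at layers 8 and 16), not copied from the SCLM code:

```python
import torch
import torch.nn as nn

STATE_DIM = 256             # Latent State Dim
INJECTION_LAYERS = {8, 16}  # layers that receive the state
ALPHA = 0.02                # injection strength
HIDDEN_DIM = 4096           # Mistral-7B hidden size

proj = nn.Linear(STATE_DIM, HIDDEN_DIM)

def inject(layer_idx: int, hidden: torch.Tensor, state: torch.Tensor) -> torch.Tensor:
    """Add an alpha-scaled projection of the latent state at selected layers."""
    if layer_idx not in INJECTION_LAYERS:
        return hidden
    # hidden: (batch, seq, HIDDEN_DIM); state: (batch, STATE_DIM)
    return hidden + ALPHA * proj(state).unsqueeze(1)

hidden = inject(8, torch.randn(1, 10, HIDDEN_DIM), torch.zeros(1, STATE_DIM))
```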
## Quick Start
```python
# Note: full SCLM functionality requires custom loading (see below);
# the hosted inference widget uses the base model only.
from transformers import AutoTokenizer

# Load the tokenizer from this repository
tokenizer = AutoTokenizer.from_pretrained("amewebstudio/ananke-sclm")

# For full SCLM functionality, load the weights separately:
# 1. Load base Mistral-7B
# 2. Load the EARCP weights from earcp_weights.pt
# 3. Apply the SCLM wrapper (see the sketch below)
```
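A hedged sketch of those three steps: steps 1 and 2 use standard `transformers` and `torch` APIs, while `SCLMWrapper` in step 3 is a hypothetical placeholder for whatever wrapper class the SCLM codebase actually provides, so it is left commented out:

```python
import torch
from transformers import AutoModelForCausalLM

# 1. Load the base Mistral-7B model
base = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-v0.1", torch_dtype=torch.float16, device_map="auto"
)

# 2. Load the EARCP weights (file name from the steps above)
earcp_state_dict = torch.load("earcp_weights.pt", map_location="cpu")

# 3. Apply the SCLM wrapper: `SCLMWrapper` is a hypothetical stand-in
#    for the wrapper class the SCLM codebase provides
# model = SCLMWrapper(base, earcp_state_dict)
```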
## Validation Results

| Test | Result |
|------|--------|
| Forward Pass | ✅ |
| State Evolution | ✅ (norm: 0 → 4.6 → 7.5) |
| Coherent Generation | ✅ |
| Edit Mode | ✅ |
| Entity Memory | ✅ (Elara, Nimbus retained) |
## Use Cases
- **Interactive Fiction**: Characters and plot points remain consistent
- **Long Conversations**: Context persists without growing prompts
- **Creative Writing**: Maintain story coherence across chapters
- **Role-Playing**: NPCs remember past interactions
## Citation
```bibtex
@article{amega2025sclm,
title={SCLM: Stateful Coherent Language Models with EARCP Architecture},
author={Amega, Mike},
year={2025},
note={Ame Web Studio}
}
```
## Author
**Mike Amega** - [Ame Web Studio](https://github.com/Volgat)
---
*SCLM is an experimental architecture exploring persistent memory in language models.*