---
license: mit
language:
- en
library_name: transformers
tags:
- sclm
- stateful
- memory
- earcp
- text-generation
- conversational
pipeline_tag: text-generation
base_model: mistralai/Mistral-7B-v0.1
widget:
- text: "The wizard Elara lived in Silverwood forest. One day, she discovered"
  example_title: "Fantasy Story"
- text: "In the year 2050, humanity had finally achieved"
  example_title: "Science Fiction"
- text: "The detective examined the crime scene carefully. The clues pointed to"
  example_title: "Mystery"
inference:
  parameters:
    max_new_tokens: 100
    temperature: 0.7
    top_p: 0.9
    repetition_penalty: 1.1
---

# 🧠 SCLM: Stateful Coherent Language Model

**SCLM** adds **persistent latent memory** to transformer language models, improving coherence across long conversations and multi-turn generation.

## 🎯 Key Features

- **Persistent State**: Latent memory that evolves across conversation turns
- **Entity Coherence**: Maintains context about characters, places, and objects
- **Edit Mode**: Make local changes without affecting global memory
- **Lightweight**: Only 91.7M additional parameters (≈1.3% overhead on the 7.24B-parameter base)

## 📊 Architecture: EARCP

```
EARCP = Encapsulation + Alignment + Revision + Coherence + Propagation
```

| Component | Function |
|-----------|----------|
| **Encapsulation** | GRU-style state update from hidden states |
| **Alignment** | Cross-attention between state and hidden layers |
| **Revision** | Drift detection and correction |
| **Coherence** | Mixture-of-Experts for consistency |
| **Propagation** | State injection into transformer layers |

A minimal PyTorch sketch of this update/injection loop appears in the appendix at the end of this card.

## 🔧 Model Details

| Parameter | Value |
|-----------|-------|
| Base Model | mistralai/Mistral-7B-v0.1 |
| EARCP Parameters | 91.7M |
| Latent State Dim | 256 |
| Injection Layers | [8, 16] |
| Alpha (injection strength) | 0.02 |
| Experts | 2 |

## 🚀 Quick Start

```python
# Note: full SCLM functionality requires custom loading (see below);
# the hosted inference widget runs the base model only.
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# Load the tokenizer from this repository
tokenizer = AutoTokenizer.from_pretrained("amewebstudio/ananke-sclm")

# For full SCLM functionality, load the pieces separately:
# 1. Load the base Mistral-7B model
base = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-v0.1", torch_dtype=torch.float16
)
# 2. Load the EARCP weights
earcp_state = torch.load("earcp_weights.pt", map_location="cpu")
# 3. Apply the SCLM wrapper (see the repository for the wrapper code)
```

## 📈 Validation Results

| Test | Result |
|------|--------|
| Forward Pass | ✅ |
| State Evolution | ✅ (norm: 0 → 4.6 → 7.5) |
| Coherent Generation | ✅ |
| Edit Mode | ✅ |
| Entity Memory | ✅ (Elara, Nimbus retained) |

## 💡 Use Cases

- **Interactive Fiction**: Characters and plot points remain consistent
- **Long Conversations**: Context persists without growing prompts
- **Creative Writing**: Maintain story coherence across chapters
- **Role-Playing**: NPCs remember past interactions

## 📝 Citation

```bibtex
@article{amega2025sclm,
  title={SCLM: Stateful Coherent Language Models with EARCP Architecture},
  author={Amega, Mike},
  year={2025},
  note={Ame Web Studio}
}
```

## 👤 Author

**Mike Amega** - [Ame Web Studio](https://github.com/Volgat)

---

*SCLM is an experimental architecture exploring persistent memory in language models.*
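## 🔬 Appendix: Illustrative Sketches

A minimal PyTorch sketch of the EARCP loop described above, assuming a GRUCell-based encapsulation step and additive propagation scaled by alpha = 0.02. The class and method names (`EARCPStateSketch`, `update`, `propagate`) are illustrative assumptions, not the released API:

```python
import torch
import torch.nn as nn

class EARCPStateSketch(nn.Module):
    """Hypothetical stand-in for the EARCP state module (not the released code)."""

    def __init__(self, hidden_dim=4096, state_dim=256, alpha=0.02):
        super().__init__()
        self.alpha = alpha
        self.pool = nn.Linear(hidden_dim, state_dim)    # summarize hidden states
        self.cell = nn.GRUCell(state_dim, state_dim)    # Encapsulation: GRU-style update
        self.inject = nn.Linear(state_dim, hidden_dim)  # Propagation: project state back

    def update(self, hidden_states, state):
        # hidden_states: (batch, seq, hidden_dim); state: (batch, state_dim)
        summary = self.pool(hidden_states.mean(dim=1))
        return self.cell(summary, state)

    def propagate(self, hidden_states, state):
        # Add the projected latent state into a transformer layer's hidden
        # states, scaled by alpha so the base model's behavior stays dominant.
        return hidden_states + self.alpha * self.inject(state).unsqueeze(1)
```

Under this reading, `propagate` would run at the injection layers (8 and 16) during each forward pass, while `update` would evolve the 256-dim latent state between turns.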
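Finally, a standalone sketch of the multi-turn pattern behind the "Long Conversations" use case: the latent state is threaded across turns instead of re-sending a growing prompt. `generate_with_state` is a hypothetical placeholder for the SCLM wrapper's generation call:

```python
import torch

def generate_with_state(prompt: str, state: torch.Tensor):
    # Placeholder: the real wrapper would run the base model with EARCP
    # injection and return both the text and the evolved latent state.
    return f"<reply to: {prompt}>", state + 0.1  # dummy state evolution

state = torch.zeros(1, 256)  # fresh 256-dim latent state
for turn in ["Who is Elara?", "Where does she live?"]:
    reply, state = generate_with_state(turn, state)
    print(reply, "| state norm:", round(state.norm().item(), 2))
```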