---
license: cc-by-4.0
language:
- en
library_name: transformers
pipeline_tag: text-generation
tags:
- llama
- dense-responses
- self-improvement
- representation-engineering
- cf-hot
- recursive-self-improvement
base_model: NousResearch/Hermes-3-Llama-3.1-8B
---
# ARC-Base-8B-Condensed
## Adaptive Recursive Cognition
**A Multi-Loop Self-Stabilizing Language Model with Predictive Control**
*Logan Matthew Napolitano*
[License: CC BY 4.0](https://creativecommons.org/licenses/by/4.0/)
[Python 3.10+](https://www.python.org/downloads/)
[Base model: Hermes-3-Llama-3.1-8B](https://huggingface.co/NousResearch/Hermes-3-Llama-3.1-8B)
*Research into stable self-improving language models*
[Quick Start](#quick-start) • [Architecture](#architecture) • [Commands](#command-reference) • [Technical Specification](#technical-specification) • [Citation](#citation)
---
## Table of Contents
1. [Model Description](#model-description)
2. [Quick Start](#quick-start)
3. [Architecture](#architecture)
4. [Core Technology](#core-technology)
5. [Command Reference](#command-reference)
6. [Evaluation](#evaluation)
7. [Installation](#installation)
8. [Configuration](#configuration)
9. [Repository Structure](#repository-structure)
10. [Hardware Requirements](#hardware-requirements)
11. [Training From Scratch](#training-from-scratch)
12. [API Reference](#api-reference)
13. [Limitations](#limitations)
14. [Ethical Considerations](#ethical-considerations)
15. [Technical Specification](#technical-specification)
16. [Changelog](#changelog)
17. [Citation](#citation)
18. [License](#license)
---
### Primary Reference
The complete theoretical framework, methodology, and reproducibility details for this model are documented in:
**Napolitano, L. M. (2025). _Controlled Language Models: Decode-Time Behavioral Control and Token Efficiency._**
Zenodo. https://doi.org/10.5281/zenodo.18344021
This paper should be cited for any academic or technical use of ARC-Base-8B-Condensed.
## Model Description
ARC-Base-8B-Condensed is a fine-tuned version of [Hermes-3-Llama-3.1-8B](https://huggingface.co/NousResearch/Hermes-3-Llama-3.1-8B) designed for:
1. **Dense, information-rich responses** — Reduced filler, hedging, and verbosity
2. **Predictive behavioral control** — CF-HoT heads detect and suppress failure modes before they manifest
3. **Recursive self-improvement** — Micro-training with automatic rollback on quality degradation
4. **Mentor-based learning** — Optional consultation with Claude API for continuous improvement
### Intended Use
- Research into self-improving language models
- Applications requiring concise, direct responses
- Study of representation engineering and behavioral control
- Base for further fine-tuning experiments
### Not Intended For
- Production deployment without evaluation
- Safety-critical applications
- Unsupervised autonomous operation
- Applications requiring verbose, elaborative responses
---
## Quick Start
### One-Command Start
```bash
git clone https://huggingface.co/LoganResearch/ARC-Base-8B-Condensed
cd ARC-Base-8B-Condensed
pip install -r requirements.txt
python arc_engine_v29_full.py
```
On first run, the engine will:
1. Download the base model (~16GB)
2. Load the DENSE adapter and CF-HoT heads
3. Initialize all subsystems
4. Present an interactive command prompt
```
═══════════════════════════════════════════════════════════════════════════════
ARC ENGINE v2.9 - Adaptive Recursive Cognition
Multi-Loop Self-Stabilizing Language Model
═══════════════════════════════════════════════════════════════════════════════
DENSE Mode: ON (CONDENSATOR checkpoint)
CF-HoT Control: ON
CF-HoT 125×: OFF
Mentor Mode: OFF
Auto-Train: OFF
Experience Buffer: 0 examples
═══════════════════════════════════════════════════════════════════════════════
You> hello
Hello. How can I help?
[Quality: 0.82 | Density: 45.2 | Coherence: 0.95 | Tokens: 5]
```
### Minimal Python Usage
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch
model = AutoModelForCausalLM.from_pretrained(
    "LoganResearch/ARC-Base-8B-Condensed",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("LoganResearch/ARC-Base-8B-Condensed")
prompt = "<|im_start|>user\nExplain gradient descent briefly.<|im_end|>\n<|im_start|>assistant\n"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=100, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
---
## Architecture
### System Overview
```
┌─────────────────────────────────────────────────────────────────────────────┐
│ ARC ENGINE ARCHITECTURE │
├─────────────────────────────────────────────────────────────────────────────┤
│ │
│ ┌─────────────────────────────────────────────────────────────────────┐ │
│ │ INPUT PROCESSING │ │
│ │ User Input → Command Parser → Generate / Tool Execute │ │
│ └─────────────────────────────────────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ ┌─────────────────────────────────────────────────────────────────────┐ │
│ │ CORE MODEL STACK │ │
│ ├─────────────────────────────────────────────────────────────────────┤ │
│ │ │ │
│ │ Base Model: Hermes-3-Llama-3.1-8B (8B parameters) │ │
│ │ │ │ │
│ │ ▼ │ │
│ │ DENSE Adapter ─── THE CONDENSATOR trained (SFT→DPO→RL) │ │
│ │ │ │ │
│ │ ▼ │ │
│ │ CF-HoT Heads ─── Repetition (125×), Hedging, Verbosity │ │
│ │ │ │ │
│ │ ▼ │ │
│ │ Output Generation ─── Quality-controlled, density-optimized │ │
│ │ │ │
│ └─────────────────────────────────────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ ┌─────────────────────────────────────────────────────────────────────┐ │
│ │ QUALITY EVALUATION │ │
│ │ Response → Density Score → Coherence Score → Overall Quality │ │
│ │ │ │ │
│ │ ▼ │ │
│ │ ┌──────────────────────────────────────────────────────────────┐ │ │
│ │ │ Mentor Mode Check: Quality < 0.6 OR Uncertainty > 0.4? │ │ │
│ │ │ │ Yes │ │ │
│ │ │ ▼ │ │ │
│ │ │ Consult Claude → Learn from Response → Update Training Buffer │ │ │
│ │ └──────────────────────────────────────────────────────────────┘ │ │
│ └─────────────────────────────────────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ ┌─────────────────────────────────────────────────────────────────────┐ │
│ │ RSI EXPERIENCE BUFFER │ │
│ │ Store: prompt, response, quality, domain, difficulty, feedback │ │
│ │ │ │ │
│ │ ┌──────────┴──────────┐ │ │
│ │ ▼ ▼ │ │
│ │ Auto-Train Trigger? Dream Cycle? │ │
│ │ │ │ │ │
│ │ ▼ ▼ │ │
│ │ Micro-training Experience Replay │ │
│ │ (25 steps) (Reinforce learnings) │ │
│ └─────────────────────────────────────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ ┌─────────────────────────────────────────────────────────────────────┐ │
│ │ VALIDATION & COMMIT │ │
│ │ New Quality vs Old Quality → Better? COMMIT : ROLLBACK │ │
│ └─────────────────────────────────────────────────────────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
```
### RSI Loop (Recursive Self-Improvement)
```
┌─────────────────────────────────────────────────────────────────────────────┐
│ RECURSIVE SELF-IMPROVEMENT LOOP │
├─────────────────────────────────────────────────────────────────────────────┤
│ │
│ ┌─────────┐ │
│ │ CHAT │◄─────────────────────────────────────────────────┐ │
│ └────┬────┘ │ │
│ │ │ │
│ ▼ │ │
│ ┌─────────┐ │ │
│ │ MEASURE │ Calculate quality, density, coherence │ │
│ └────┬────┘ │ │
│ │ │ │
│ ▼ │ │
│ ┌─────────┐ │ │
│ │ BUFFER │ Store in experience buffer with metadata │ │
│ └────┬────┘ │ │
│ │ │ │
│ ▼ │ │
│ ┌──────────────┐ │ │
│ │ AUTO-TRIGGER │ Buffer full? Quality threshold? Feedback? │ │
│ └──────┬───────┘ │ │
│ │ │ │
│ Yes │ No ─────────────────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ ┌─────────────┐ │
│ │ MICRO-TRAIN │ 25 steps on high-quality buffer samples │
│ └──────┬──────┘ │
│ │ │
│ ▼ │
│ ┌─────────────┐ │
│ │ VALIDATE │ Compare new model vs checkpoint │
│ └──────┬──────┘ │
│ │ │
│ ┌────┴────┐ │
│ │ │ │
│ Better? Worse? │
│ │ │ │
│ ▼ ▼ │
│ COMMIT ROLLBACK │
│ │ │ │
│ └────┬────┘ │
│ │ │
│ ▼ │
│ Continue ─────────────────────────────────────────────────────────────────┘
│ │
└─────────────────────────────────────────────────────────────────────────────┘
```
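The AUTO-TRIGGER and VALIDATE gates in the loop above can be sketched in a few lines. Function names here are illustrative rather than the engine's actual API; the 50-experience minimum mirrors the `RSIConfig` defaults shown under Configuration.

```python
# Illustrative sketch of the RSI loop's trigger and validation gates.
# Names and thresholds are assumptions mirroring the config values in this card.

def should_trigger(buffer_len: int, min_experiences: int = 50) -> bool:
    """AUTO-TRIGGER: micro-train only once enough experiences have accumulated."""
    return buffer_len >= min_experiences

def validate(old_quality: float, new_quality: float) -> str:
    """VALIDATE: commit the micro-trained weights only on improvement."""
    return "COMMIT" if new_quality > old_quality else "ROLLBACK"

print(should_trigger(10))    # False: keep chatting, keep buffering
print(validate(0.52, 0.68))  # COMMIT: quality improved
print(validate(0.78, 0.61))  # ROLLBACK: restore best checkpoint
```

The asymmetry is the safety property: a failed micro-training step costs 25 wasted steps, never a degraded model.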
### Mentor Mode Flow
```
┌─────────────────────────────────────────────────────────────────────────────┐
│ MENTOR MODE LEARNING FLOW │
├─────────────────────────────────────────────────────────────────────────────┤
│ │
│ User Prompt │
│ │ │
│ ▼ │
│ ┌─────────────────┐ │
│ │ Local Generation │ Generate response with local 8B model │
│ └────────┬────────┘ │
│ │ │
│ ▼ │
│ ┌─────────────────┐ │
│ │ Quality Check │ Evaluate density, coherence, quality │
│ └────────┬────────┘ │
│ │ │
│ ▼ │
│ ┌────────────────────────────────────┐ │
│ │ Quality < 0.6 OR Uncertainty > 0.4 │ │
│ └────────┬───────────────────────────┘ │
│ │ │
│ Yes │ No ──────────► Return local response │
│ │ │
│ ▼ │
│ ┌─────────────────┐ │
│ │ Consult Claude │ Via API │
│ └────────┬────────┘ │
│ │ │
│ ▼ │
│ ┌─────────────────┐ │
│ │ Create DPO Pair │ │
│ │ chosen: Claude │ │
│ │ rejected: Local │ │
│ └────────┬────────┘ │
│ │ │
│ ▼ │
│ ┌─────────────────┐ │
│ │ Add to Buffer │ High-quality experience for training │
│ └────────┬────────┘ │
│ │ │
│ ▼ │
│ Return Claude's response + log learning │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
```
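The consult decision and DPO-pair construction in this flow reduce to a threshold check and a dictionary. The 0.6/0.4 thresholds mirror `MentorConfig` below; the function names are illustrative, and the actual Claude call happens through the Anthropic API.

```python
# Sketch of the mentor-mode decision and DPO-pair construction above.
# needs_mentor / make_dpo_pair are illustrative names, not the engine's API.

def needs_mentor(quality: float, uncertainty: float,
                 q_thresh: float = 0.6, u_thresh: float = 0.4) -> bool:
    """Consult the mentor when local quality is low OR uncertainty is high."""
    return quality < q_thresh or uncertainty > u_thresh

def make_dpo_pair(prompt: str, local: str, mentor: str) -> dict:
    """Mentor output becomes 'chosen', the local output 'rejected'."""
    return {"prompt": prompt, "chosen": mentor, "rejected": local}

pair = make_dpo_pair(
    "What is recursion?",
    local="That's a great question! Recursion is...",
    mentor="Function calling itself until base case.",
)
```

Each pair goes into the experience buffer, so mentor consultations gradually become DPO training signal.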
---
## Core Technology
### 1. CF-HoT: Control-Field Holonomy
Predictive control through hidden-state monitoring. Rather than reacting only after a failure mode appears in the decoded output, CF-HoT reads intermediate hidden states, predicts impending failures, and applies targeted logit penalties before they manifest.
```
┌─────────────────────────────────────────────────────────────────────────────┐
│ CF-HoT ARCHITECTURE │
├─────────────────────────────────────────────────────────────────────────────┤
│ │
│ Hidden States (Layers 16-24) │
│ │ │
│ ▼ │
│ ┌─────────────────┐ │
│ │ Fiber Projection │ Compress to d=16 per layer │
│ └────────┬────────┘ │
│ │ │
│ ▼ │
│ ┌─────────────────┐ │
│ │ Layer Attention │ Weighted aggregation across layers │
│ └────────┬────────┘ │
│ │ │
│ ▼ │
│ ┌─────────────────┐ │
│ │ Risk Predictor │ Binary classifier: P(unwanted_behavior) │
│ └────────┬────────┘ │
│ │ │
│ ▼ │
│ If P > threshold ──► Apply logit penalties │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
```
**Head Performance:**
| Head | Separation | Description |
|------|------------|-------------|
| Repetition | 125× | Detects impending repetitive loops |
| Hedging | 1.5× | Blocks uncertainty markers |
| Verbosity | 2.1× | Suppresses filler content |
The repetition head achieves 125× separation between positive (pre-repetition) and negative (diverse output) hidden states, enabling reliable early warning.
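For intuition, the monitoring path (fiber projection → layer attention → risk score) can be sketched in NumPy. The real heads are trained torch modules; all weights below are random stand-ins, and only the shapes (9 monitored layers, `d_model=4096`, fiber `d=16`) follow the diagram.

```python
# Minimal NumPy sketch of the CF-HoT monitoring path. Random weights;
# shapes follow the architecture diagram, nothing here is trained.
import numpy as np

rng = np.random.default_rng(0)
n_layers, d_model, d_fiber = 9, 4096, 16  # layers 16-24 of the 8B model

W_fiber = rng.normal(size=(n_layers, d_model, d_fiber)) * 0.02  # per-layer projections
attn_logits = rng.normal(size=n_layers)                          # layer-attention scores
w_risk = rng.normal(size=n_layers * d_fiber) * 0.02              # risk classifier

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def risk(hidden_states):  # hidden_states: (n_layers, d_model)
    fibers = np.einsum("ld,ldf->lf", hidden_states, W_fiber)   # compress to d=16/layer
    attn = np.exp(attn_logits) / np.exp(attn_logits).sum()     # softmax over layers
    pooled = (fibers * attn[:, None]).reshape(-1)              # weighted aggregation
    return sigmoid(pooled @ w_risk)                            # P(unwanted behavior)

p = risk(rng.normal(size=(n_layers, d_model)))
apply_penalty = p > 0.6  # cfhot_repetition_threshold from the Config below
```

The 125× separation figure means the trained repetition head's scores for pre-repetition vs. diverse hidden states are far enough apart that a single threshold cleanly divides them.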
### 2. The Condensator: Dense Response Training
The Condensator is a four-stage training pipeline:
```
┌─────────────────────────────────────────────────────────────────────────────┐
│ THE CONDENSATOR PIPELINE │
├─────────────────────────────────────────────────────────────────────────────┤
│ │
│ STAGE 1: Supervised Fine-Tuning (SFT) │
│ ───────────────────────────────────── │
│ • 847 curated dense response examples │
│ • Learning rate: 2e-5 │
│ • Epochs: 3 │
│ │
│ STAGE 2: Direct Preference Optimization (DPO) │
│ ───────────────────────────────────────────── │
│ • Preference pairs: dense (chosen) vs verbose (rejected) │
│ • Beta: 0.1 │
│ • Epochs: 2 │
│ │
│ STAGE 3: Reinforcement Learning (PPO) │
│ ───────────────────────────────────── │
│ • Reward = quality_score - length_penalty │
│ • Conservative KL constraint │
│ • Learning rate: 1e-6 │
│ │
│ STAGE 4: Checkpointing │
│ ───────────────────── │
│ • Save every 25 steps │
│ • A/B comparison on held-out prompts │
│ • Automatic rollback if quality drops │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
```
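The Stage 3 reward follows directly from its definition, `reward = quality_score - length_penalty`. The per-token penalty coefficient below is an illustrative assumption, not the trained value.

```python
# Sketch of the Stage 3 PPO reward: quality minus a linear length penalty.
# penalty_per_token = 0.002 is an illustrative assumption.

def ppo_reward(quality_score: float, n_tokens: int,
               penalty_per_token: float = 0.002) -> float:
    return quality_score - penalty_per_token * n_tokens

# At equal quality, a dense 45-token answer beats a verbose 150-token one:
print(ppo_reward(0.78, 45))   # ≈ 0.69
print(ppo_reward(0.78, 150))  # ≈ 0.48
```

Because the penalty is linear in length while quality saturates, the policy is pushed toward the shortest response that preserves quality.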
### 3. Enhanced CF-HoT Parameters
| Parameter | Value | Reason |
|-----------|-------|--------|
| EMA Momentum | 0.995 | Stable control field |
| Gate Temperature | 2.0 | Softer sigmoid |
| Gate Bounds | [0.1, 0.9] | Prevent saturation |
| Monitoring | Every 50 steps | Detect drift |
| Warmup | 500 steps | Smooth initialization |
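These parameters interact as a tempered, bounded gate over an EMA-smoothed control field. The sketch below shows the mechanics; `gate` and `ema_update` are illustrative names for what the table describes.

```python
# Sketch of the enhanced CF-HoT gate: tempered sigmoid, hard bounds against
# saturation, and EMA smoothing of the control field. Names are illustrative.
import math

EMA_MOMENTUM = 0.995       # stable control field
GATE_TEMPERATURE = 2.0     # softer sigmoid
GATE_MIN, GATE_MAX = 0.1, 0.9  # prevent saturation

def gate(risk_logit: float) -> float:
    g = 1.0 / (1.0 + math.exp(-risk_logit / GATE_TEMPERATURE))
    return min(GATE_MAX, max(GATE_MIN, g))

def ema_update(field: float, observation: float) -> float:
    return EMA_MOMENTUM * field + (1.0 - EMA_MOMENTUM) * observation
```

Bounding the gate away from 0 and 1 keeps gradients flowing through it, and the high EMA momentum means roughly 200 observations are needed to move the field appreciably.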
---
## Command Reference
### Core Commands
| Command | Description |
|---------|-------------|
| `status` | System status overview |
| `help` | Full command menu |
| `help <topic>` | Topic-specific help |
| `quit` | Exit |
### Self-Improvement
| Command | Description |
|---------|-------------|
| `!improve` | Run improvement iteration |
| `!eval` | Full evaluation |
| `!train <steps>` | Training steps |
| `!compare` | Compare checkpoints |
| `!rollback` | Revert to best checkpoint |
| `!load <checkpoint>` | Load checkpoint |
| `!benchmark` | Evaluation suite |
### Mentor Mode
| Command | Description |
|---------|-------------|
| `!mentor` | Show mentor mode status |
| `!mentor on` | Enable auto-consultation |
| `!mentor off` | Disable mentor mode |
| `!mentor ask <question>` | Ask Claude and learn from response |
| `!mentor learn` | Show collected learnings |
### RSI (Recursive Self-Improvement)
| Command | Description |
|---------|-------------|
| `!auto_train on` | Enable learning during chat |
| `!auto_train off` | Disable auto-training |
| `!skills` | Quality per domain |
| `!forgetting` | Detect catastrophic forgetting |
| `!dream` | Force experience replay |
| `!buffer` | Experience buffer stats |
| `!selfplay <n>` | Run N self-play iterations |
### Condensator
| Command | Description |
|---------|-------------|
| `!condensator` | Run full SFT→DPO→RL pipeline |
| `!dpo` | Run DPO stage only |
| `!rl` | Run RL stage only |
| `!train_cfhot` | Train CF-HoT heads |
### CF-HoT Control
| Command | Description |
|---------|-------------|
| `!cfhot` / `!125x` | Toggle 125× head |
| `!cfhot status` | Head status |
| `!gate_stats` | CF-HoT gate health |
### Generation Modes
| Command | Description |
|---------|-------------|
| `!book` | Toggle book mode (16K tokens) |
| `!write <topic>` | Write extended content |
| `!claude <prompt>` | Direct Claude API prompt |
### Tools
| Command | Description |
|---------|-------------|
| `!shell <command>` | Execute shell command |
| `!python <code>` | Execute Python |
| `!read <file>` | Read file |
| `!write <file>` | Write file |
| `!search <query>` | Web search |
| `!fetch <url>` | Fetch URL content |
### Browser (requires Playwright)
| Command | Description |
|---------|-------------|
| `!browse <url>` | Open URL |
| `!click <element>` | Click element |
| `!type <text>` | Type text |
| `!read` | Read page content |
### Multimedia (optional dependencies)
| Command | Description |
|---------|-------------|
| `!stream` | Open live token window |
| `!audio` / `!tts` | Toggle text-to-speech |
| `!imagine <prompt>` | Generate image (SDXL) |
| `!dalle <prompt>` | Generate image (DALL-E 3) |
### Experimental Features
| Command | Description |
|---------|-------------|
| `!content blog <topic>` | Generate blog post |
| `!content youtube <topic>` | Generate video script |
---
## Evaluation
### Qualitative Comparison
| Prompt | Base Hermes-3 | ARC-Condensed |
|--------|---------------|---------------|
| "hello" | "Hello! I'm here to help you with any questions or tasks you might have. Feel free to ask me anything!" (23 tokens) | "Hello. How can I help?" (5 tokens) |
| "What is recursion?" | "That's a great question! Recursion is a programming concept where a function calls itself..." (150+ tokens) | "Function calling itself until base case. Stack frames accumulate, unwind on return." (12 tokens) |
| "How are you?" | "As an AI, I don't have feelings in the traditional sense, but I'm functioning well..." (25 tokens) | "Functional. Task?" (3 tokens) |
### Quantitative Metrics
| Metric | Base Model | ARC-Condensed | Change |
|--------|------------|---------------|--------|
| Avg. Response Length | 150 tokens | 45 tokens | -70% |
| Filler Phrases | Present | Minimal | ~-95% |
| Information Density | 17.0 | 45.2 | +166% |
| Quality Score (internal) | 0.52 | 0.78 | +50% |
**Note:** These are heuristic metrics from internal evaluation. Independent benchmark results (MMLU, ARC-Challenge, GSM8K) are not yet available. We welcome independent evaluation.
### Self-Improvement Trajectory (Observed)
```
Iteration 0: Quality 0.52 (baseline)
Iteration 5: Quality 0.68 (+31%)
Iteration 10: Quality 0.75 (+44%)
Iteration 15: Quality 0.78 (+50%, plateau)
```
Self-improvement shows diminishing returns after ~15 iterations. This is expected behavior, not a limitation to work around.
---
## Installation
### Minimal Installation
```bash
pip install torch transformers accelerate peft bitsandbytes datasets trl
```
### Full Installation
```bash
pip install -r requirements.txt
```
### Optional Dependencies
```bash
# Browser automation
pip install playwright && playwright install firefox
# Image generation
pip install diffusers pillow
# Text-to-speech
pip install pyttsx3 gTTS pygame
# Claude API (for mentor mode)
pip install anthropic
# OpenAI API (for DALL-E)
pip install openai
# Web search
pip install requests
```
### Environment Variables
```bash
# Optional - for enhanced features
export ANTHROPIC_API_KEY="sk-ant-..." # Mentor Mode
export OPENAI_API_KEY="sk-..." # DALL-E
```
---
## Configuration
### Main Configuration
```python
class Config:
    # Generation
    temperature = 0.85
    top_p = 0.9
    max_new_tokens = 512
    repetition_penalty = 1.1

    # CF-HoT
    use_cfhot = True
    use_cfhot_125x = False
    cfhot_repetition_threshold = 0.6
    cfhot_repetition_penalty = 6.0

    # Self-improvement
    min_quality_score = 0.5
    target_quality_score = 0.75
    training_steps_per_iteration = 25
    quality_drop_threshold = 0.1
```
### RSI Configuration
```python
from dataclasses import dataclass

@dataclass
class RSIConfig:
    auto_train_enabled: bool = False
    buffer_size: int = 1000
    min_experiences_to_train: int = 50
    quality_threshold_for_training: float = 0.7
    dream_cycle_interval: int = 100
    forgetting_check_interval: int = 50
```
### Mentor Configuration
```python
from dataclasses import dataclass

@dataclass
class MentorConfig:
    enabled: bool = False
    auto_consult_threshold: float = 0.6
    uncertainty_threshold: float = 0.4
    learn_from_responses: bool = True
```
---
## Repository Structure
```
ARC-Base-8B-Condensed/
│
├── arc_engine_v29_full.py # Main engine
├── README.md # This file
├── requirements.txt # Dependencies
│
├── model-00001-of-00004.safetensors # Model weights
├── model-00002-of-00004.safetensors
├── model-00003-of-00004.safetensors
├── model-00004-of-00004.safetensors
├── config.json
├── tokenizer.json
├── tokenizer_config.json
├── special_tokens_map.json
├── generation_config.json
│
├── dense_checkpoints/ # Training checkpoints
│ └── step_*/
│
├── cfhot_checkpoints/ # CF-HoT heads
│ └── final_6000/
│ └── risk_predictor.pt
│
├── improvement_logs/ # RSI logs
└── exports/ # Checkpoint exports
```
---
## Hardware Requirements
| Component | Minimum | Recommended |
|-----------|---------|-------------|
| GPU VRAM | 16 GB | 24+ GB |
| System RAM | 32 GB | 64 GB |
| Storage | 50 GB | 100 GB |
| Python | 3.10+ | 3.11 |
**Tested Configurations:**
- NVIDIA RTX 3090 (24GB), 64GB RAM ✓
- NVIDIA RTX 4090 (24GB), 128GB RAM ✓
- NVIDIA A100 (40GB) ✓
**Performance Estimates:**
- Inference: ~15-25 tokens/second
- Full Condensator pipeline: ~4 hours (RTX 3090)
- Self-improvement iteration: ~30 minutes
---
## Training From Scratch
### Automated Training
```bash
python arc_engine_v29_full.py
> !condensator
```
This runs:
1. SFT (3 epochs)
2. DPO (2 epochs)
3. RL (300 steps)
4. Checkpoint validation
### Manual Training
**Step 1: Train CF-HoT Heads**
```
> !train_cfhot
```
**Step 2: Run Condensator**
```
> !condensator
```
**Step 3: Self-Improvement**
```
> !selfplay 1000
```
---
## API Reference
### Start Server
```
> !api
[api] Server running on http://0.0.0.0:8080
```
### Endpoints
#### POST /generate
```bash
curl -X POST http://localhost:8080/generate \
-H "Content-Type: application/json" \
-d '{"prompt": "What is recursion?"}'
```
Response:
```json
{
"response": "Function calling itself until base case.",
"quality": 0.82,
"density": 48.3,
"tokens": 8
}
```
#### GET /health
```bash
curl http://localhost:8080/health
```
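For programmatic access, a stdlib-only Python client for the `/generate` endpoint above needs nothing beyond `urllib`. The helper names `build_request` and `generate` are illustrative.

```python
# Minimal stdlib client for the local `!api` server; no extra dependencies.
import json
import urllib.request

def build_request(prompt: str,
                  host: str = "http://localhost:8080") -> urllib.request.Request:
    """Build the POST /generate request without sending it."""
    return urllib.request.Request(
        f"{host}/generate",
        data=json.dumps({"prompt": prompt}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def generate(prompt: str, host: str = "http://localhost:8080") -> dict:
    """Send the request to a running server and parse the JSON reply."""
    with urllib.request.urlopen(build_request(prompt, host)) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling `generate("What is recursion?")` against a running server returns the same JSON object shown above (`response`, `quality`, `density`, `tokens`).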
---
## Limitations
### Known Limitations
| Limitation | Description |
|------------|-------------|
| **Scale** | Tested on 8B parameters only; scaling behavior unknown |
| **Language** | English only |
| **Benchmarks** | No formal benchmark results (MMLU, GSM8K, etc.) |
| **Terseness** | May be too concise for applications requiring elaboration |
| **Iterations** | Self-improvement plateaus after ~15 iterations |
| **Memory** | Full features require 16GB+ VRAM |
### What This Is Not
- This is **not** AGI or a path to AGI
- This is **not** a production-ready system
- Self-improvement is **bounded and reversible**
- The model **requires human oversight**
- Claims are **not independently validated**
---
## Ethical Considerations
### Safety Measures
- **Quality gates:** All self-modification requires quality validation
- **Automatic rollback:** Degradation triggers checkpoint restoration
- **Bounded improvement:** No unbounded recursive self-modification
- **Human oversight:** System designed for interactive use, not autonomy
### Potential Risks
- Dense responses may omit important caveats or safety information
- Self-improvement research requires careful monitoring
- Model inherits biases from base Hermes-3 and training data
- Experimental features should not be used for consequential decisions
### Explicit Non-Goals
This system is **not designed for:**
- Autonomous operation without human oversight
- Self-replication or self-preservation
- Deception or manipulation
- Capability acquisition beyond defined scope
---
## Technical Specification
Full technical documentation is available:
- **Primary Reference (Master Book):**
[Controlled Language Models: Decode-Time Behavioral Control and Token Efficiency](https://doi.org/10.5281/zenodo.18344021)
- **Related Preprints:**
- [From Explicit Holonomy to Latent Control Fields](https://zenodo.org/records/14707164)
- [The Holonomy Transformer](https://zenodo.org/records/14707081)
The specification covers:
- Multi-loop training architecture
- Control field theory and implementation
- Tokenization co-evolution (fourth loop)
- Reliability engineering and rollback protocols
- Reproducibility requirements
---
## Changelog
### v2.9 (Current)
- Stealth web browser for research
- Improved training functions
- Bug fixes for selfplay training loop
### v2.8
- Full RSI continuous learning system
- Auto-train during chat
- Dream cycles for experience replay
- Domain-specific skill tracking
- Catastrophic forgetting detection
### v2.4
- Mentor Mode: Learn from Claude API
- Content generation tools
- Smart help system
### v2.2
- Full CONDENSATOR pipeline
- Enhanced CF-HoT with EMA, gate temperature
- DPO and RL training stages
### v2.0
- Initial release
- CF-HoT 125× repetition head
- Dense response training
- Basic self-improvement loop
---
## Citation
```bibtex
@software{napolitano2025arc,
  author    = {Napolitano, Logan Matthew},
  title     = {{ARC-Base-8B-Condensed}: Adaptive Recursive Cognition for Self-Stabilizing Language Models},
  year      = {2025},
  publisher = {Hugging Face},
  url       = {https://huggingface.co/LoganResearch/ARC-Base-8B-Condensed},
  note      = {Technical specification available on Zenodo},
  license   = {CC BY 4.0}
}
```
```bibtex
@article{napolitano2025controlled,
  author    = {Napolitano, Logan Matthew},
  title     = {Controlled Language Models: Decode-Time Behavioral Control and Token Efficiency},
  year      = {2025},
  doi       = {10.5281/zenodo.18344021},
  url       = {https://zenodo.org/records/18344021},
  publisher = {Zenodo},
  note      = {Primary technical reference for ARC-Base-8B-Condensed}
}
```
```bibtex
@article{napolitano2025controlfield,
  author    = {Napolitano, Logan Matthew},
  title     = {From Explicit Holonomy to Latent Control Fields},
  year      = {2025},
  doi       = {10.5281/zenodo.14707164},
  url       = {https://zenodo.org/records/14707164},
  publisher = {Zenodo}
}
```
## References
1. Zou, A., et al. (2023). Representation Engineering: A Top-Down Approach to AI Transparency. arXiv:2310.01405
2. Rafailov, R., et al. (2023). Direct Preference Optimization. arXiv:2305.18290
3. Hu, E. J., et al. (2021). LoRA: Low-Rank Adaptation of Large Language Models. arXiv:2106.09685
4. Ouyang, L., et al. (2022). Training language models to follow instructions with human feedback. NeurIPS.
---
## Acknowledgments
- **NousResearch** for Hermes-3-Llama-3.1-8B base model
- **Meta AI** for Llama 3.1 architecture
- **Hugging Face** for transformers, PEFT, TRL
- **Anthropic** for Claude API (Mentor Mode)
---
## License
This work is licensed under [CC BY 4.0](https://creativecommons.org/licenses/by/4.0/) (Creative Commons Attribution 4.0 International).
You are free to:
- **Share** — copy and redistribute the material in any medium or format
- **Adapt** — remix, transform, and build upon the material for any purpose, including commercial
Under the following terms:
- **Attribution** — You must give appropriate credit, provide a link to the license, and indicate if changes were made.
---
**Contact:** [GitHub Issues](https://github.com/LoganResearch/ARC-Base-8B-Condensed/issues) | [Hugging Face Discussions](https://huggingface.co/LoganResearch/ARC-Base-8B-Condensed/discussions)
**Version:** 2.9 | **Last Updated:** January 2025