|
|
--- |
|
|
license: cc-by-4.0 |
|
|
language: |
|
|
- en |
|
|
library_name: transformers |
|
|
pipeline_tag: text-generation |
|
|
tags: |
|
|
- llama |
|
|
- dense-responses |
|
|
- self-improvement |
|
|
- representation-engineering |
|
|
- cf-hot |
|
|
- recursive-self-improvement |
|
|
base_model: NousResearch/Hermes-3-Llama-3.1-8B |
|
|
--- |
|
|
|
|
|
<div align="center"> |
|
|
|
|
|
# ARC-Base-8B-Condensed |
|
|
## Adaptive Recursive Cognition |
|
|
|
|
|
**A Multi-Loop Self-Stabilizing Language Model with Predictive Control** |
|
|
|
|
|
*Logan Matthew Napolitano* |
|
|
|
|
|
[License: CC BY 4.0](https://creativecommons.org/licenses/by/4.0/)

[Python 3.10+](https://www.python.org/downloads/)

[Base model: Hermes-3-Llama-3.1-8B](https://huggingface.co/NousResearch/Hermes-3-Llama-3.1-8B)
|
|
|
|
|
*Research into stable self-improving language models* |
|
|
|
|
|
[Quick Start](#quick-start) • [Architecture](#architecture) • [Commands](#command-reference) • [Technical Specification](#technical-specification) • [Citation](#citation) |
|
|
|
|
|
</div> |
|
|
|
|
|
--- |
|
|
|
|
|
## Table of Contents |
|
|
|
|
|
1. [Model Description](#model-description) |
|
|
2. [Quick Start](#quick-start) |
|
|
3. [Architecture](#architecture) |
|
|
4. [Core Technology](#core-technology) |
|
|
5. [Command Reference](#command-reference) |
|
|
6. [Evaluation](#evaluation) |
|
|
7. [Installation](#installation) |
|
|
8. [Configuration](#configuration) |
|
|
9. [Repository Structure](#repository-structure) |
|
|
10. [Hardware Requirements](#hardware-requirements) |
|
|
11. [Training From Scratch](#training-from-scratch) |
|
|
12. [API Reference](#api-reference) |
|
|
13. [Limitations](#limitations) |
|
|
14. [Ethical Considerations](#ethical-considerations) |
|
|
15. [Technical Specification](#technical-specification) |
|
|
16. [Changelog](#changelog) |
|
|
17. [Citation](#citation)


18. [References](#references)


19. [Acknowledgments](#acknowledgments)


20. [License](#license)
|
|
|
|
|
--- |
|
|
|
|
|
### Primary Reference |
|
|
|
|
|
The complete theoretical framework, methodology, and reproducibility details for this model are documented in: |
|
|
|
|
|
**Napolitano, L. M. (2025). _Controlled Language Models: Decode-Time Behavioral Control and Token Efficiency._** |
|
|
Zenodo. https://doi.org/10.5281/zenodo.18344021 |
|
|
|
|
|
This paper should be cited for any academic or technical use of ARC-Base-8B-Condensed. |
|
|
|
|
|
|
|
|
## Model Description |
|
|
|
|
|
ARC-Base-8B-Condensed is a fine-tuned version of [Hermes-3-Llama-3.1-8B](https://huggingface.co/NousResearch/Hermes-3-Llama-3.1-8B) designed for: |
|
|
|
|
|
1. **Dense, information-rich responses** — Reduced filler, hedging, and verbosity |
|
|
2. **Predictive behavioral control** — CF-HoT heads detect and suppress failure modes before they manifest |
|
|
3. **Recursive self-improvement** — Micro-training with automatic rollback on quality degradation |
|
|
4. **Mentor-based learning** — Optional consultation with Claude API for continuous improvement |
|
|
|
|
|
### Intended Use |
|
|
|
|
|
- Research into self-improving language models |
|
|
- Applications requiring concise, direct responses |
|
|
- Study of representation engineering and behavioral control |
|
|
- Base for further fine-tuning experiments |
|
|
|
|
|
### Not Intended For |
|
|
|
|
|
- Production deployment without evaluation |
|
|
- Safety-critical applications |
|
|
- Unsupervised autonomous operation |
|
|
- Applications requiring verbose, elaborative responses |
|
|
|
|
|
--- |
|
|
|
|
|
## Quick Start |
|
|
|
|
|
### One-Command Start |
|
|
|
|
|
```bash |
|
|
git clone https://huggingface.co/LoganResearch/ARC-Base-8B-Condensed |
|
|
cd ARC-Base-8B-Condensed |
|
|
pip install -r requirements.txt |
|
|
python arc_engine_v29_full.py |
|
|
``` |
|
|
|
|
|
On first run, the engine will: |
|
|
1. Download the base model (~16GB) |
|
|
2. Load the DENSE adapter and CF-HoT heads |
|
|
3. Initialize all subsystems |
|
|
4. Present an interactive command prompt |
|
|
|
|
|
``` |
|
|
═══════════════════════════════════════════════════════════════════════════════ |
|
|
ARC ENGINE v2.9 - Adaptive Recursive Cognition |
|
|
Multi-Loop Self-Stabilizing Language Model |
|
|
═══════════════════════════════════════════════════════════════════════════════ |
|
|
DENSE Mode: ON (CONDENSATOR checkpoint) |
|
|
CF-HoT Control: ON |
|
|
CF-HoT 125×: OFF |
|
|
Mentor Mode: OFF |
|
|
Auto-Train: OFF |
|
|
Experience Buffer: 0 examples |
|
|
═══════════════════════════════════════════════════════════════════════════════ |
|
|
|
|
|
You> hello |
|
|
Hello. How can I help? |
|
|
|
|
|
[Quality: 0.82 | Density: 45.2 | Coherence: 0.95 | Tokens: 5] |
|
|
``` |
|
|
|
|
|
### Minimal Python Usage |
|
|
|
|
|
```python |
|
|
from transformers import AutoModelForCausalLM, AutoTokenizer |
|
|
import torch |
|
|
|
|
|
model = AutoModelForCausalLM.from_pretrained( |
|
|
"LoganResearch/ARC-Base-8B-Condensed", |
|
|
torch_dtype=torch.bfloat16, |
|
|
device_map="auto" |
|
|
) |
|
|
tokenizer = AutoTokenizer.from_pretrained("LoganResearch/ARC-Base-8B-Condensed") |
|
|
|
|
|
prompt = "<|im_start|>user\nExplain gradient descent briefly.<|im_end|>\n<|im_start|>assistant\n" |
|
|
inputs = tokenizer(prompt, return_tensors="pt").to(model.device) |
|
|
outputs = model.generate(**inputs, max_new_tokens=100, do_sample=True, temperature=0.7) |
|
|
print(tokenizer.decode(outputs[0], skip_special_tokens=True)) |
|
|
``` |
|
|
|
|
|
--- |
|
|
|
|
|
## Architecture |
|
|
|
|
|
### System Overview |
|
|
|
|
|
``` |
|
|
┌─────────────────────────────────────────────────────────────────────────────┐ |
|
|
│ ARC ENGINE ARCHITECTURE │ |
|
|
├─────────────────────────────────────────────────────────────────────────────┤ |
|
|
│ │ |
|
|
│ ┌─────────────────────────────────────────────────────────────────────┐ │ |
|
|
│ │ INPUT PROCESSING │ │ |
|
|
│ │ User Input → Command Parser → Generate / Tool Execute │ │ |
|
|
│ └─────────────────────────────────────────────────────────────────────┘ │ |
|
|
│ │ │ |
|
|
│ ▼ │ |
|
|
│ ┌─────────────────────────────────────────────────────────────────────┐ │ |
|
|
│ │ CORE MODEL STACK │ │ |
|
|
│ ├─────────────────────────────────────────────────────────────────────┤ │ |
|
|
│ │ │ │ |
|
|
│ │ Base Model: Hermes-3-Llama-3.1-8B (8B parameters) │ │ |
|
|
│ │ │ │ │ |
|
|
│ │ ▼ │ │ |
|
|
│ │ DENSE Adapter ─── THE CONDENSATOR trained (SFT→DPO→RL) │ │ |
|
|
│ │ │ │ │ |
|
|
│ │ ▼ │ │ |
|
|
│ │ CF-HoT Heads ─── Repetition (125×), Hedging, Verbosity │ │ |
|
|
│ │ │ │ │ |
|
|
│ │ ▼ │ │ |
|
|
│ │ Output Generation ─── Quality-controlled, density-optimized │ │ |
|
|
│ │ │ │ |
|
|
│ └─────────────────────────────────────────────────────────────────────┘ │ |
|
|
│ │ │ |
|
|
│ ▼ │ |
|
|
│ ┌─────────────────────────────────────────────────────────────────────┐ │ |
|
|
│ │ QUALITY EVALUATION │ │ |
|
|
│ │ Response → Density Score → Coherence Score → Overall Quality │ │ |
|
|
│ │ │ │ │ |
|
|
│ │ ▼ │ │ |
|
|
│ │ ┌──────────────────────────────────────────────────────────────┐ │ │ |
|
|
│ │ │ Mentor Mode Check: Quality < 0.6 OR Uncertainty > 0.4? │ │ │ |
|
|
│ │ │ │ Yes │ │ │ |
|
|
│ │ │ ▼ │ │ │ |
|
|
│ │ │ Consult Claude → Learn from Response → Update Training Buffer │ │ │ |
|
|
│ │ └──────────────────────────────────────────────────────────────┘ │ │ |
|
|
│ └─────────────────────────────────────────────────────────────────────┘ │ |
|
|
│ │ │ |
|
|
│ ▼ │ |
|
|
│ ┌─────────────────────────────────────────────────────────────────────┐ │ |
|
|
│ │ RSI EXPERIENCE BUFFER │ │ |
|
|
│ │ Store: prompt, response, quality, domain, difficulty, feedback │ │ |
|
|
│ │ │ │ │ |
|
|
│ │ ┌──────────┴──────────┐ │ │ |
|
|
│ │ ▼ ▼ │ │ |
|
|
│ │ Auto-Train Trigger? Dream Cycle? │ │ |
|
|
│ │ │ │ │ │ |
|
|
│ │ ▼ ▼ │ │ |
|
|
│ │ Micro-training Experience Replay │ │ |
|
|
│ │ (25 steps) (Reinforce learnings) │ │ |
|
|
│ └─────────────────────────────────────────────────────────────────────┘ │ |
|
|
│ │ │ |
|
|
│ ▼ │ |
|
|
│ ┌─────────────────────────────────────────────────────────────────────┐ │ |
|
|
│ │ VALIDATION & COMMIT │ │ |
|
|
│ │ New Quality vs Old Quality → Better? COMMIT : ROLLBACK │ │ |
|
|
│ └─────────────────────────────────────────────────────────────────────┘ │ |
|
|
│ │ |
|
|
└─────────────────────────────────────────────────────────────────────────────┘ |
|
|
``` |
|
|
|
|
|
### RSI Loop (Recursive Self-Improvement) |
|
|
|
|
|
``` |
|
|
┌─────────────────────────────────────────────────────────────────────────────┐ |
|
|
│ RECURSIVE SELF-IMPROVEMENT LOOP │ |
|
|
├─────────────────────────────────────────────────────────────────────────────┤ |
|
|
│ │ |
|
|
│ ┌─────────┐ │ |
|
|
│ │ CHAT │◄─────────────────────────────────────────────────┐ │ |
|
|
│ └────┬────┘ │ │ |
|
|
│ │ │ │ |
|
|
│ ▼ │ │ |
|
|
│ ┌─────────┐ │ │ |
|
|
│ │ MEASURE │ Calculate quality, density, coherence │ │ |
|
|
│ └────┬────┘ │ │ |
|
|
│ │ │ │ |
|
|
│ ▼ │ │ |
|
|
│ ┌─────────┐ │ │ |
|
|
│ │ BUFFER │ Store in experience buffer with metadata │ │ |
|
|
│ └────┬────┘ │ │ |
|
|
│ │ │ │ |
|
|
│ ▼ │ │ |
|
|
│ ┌──────────────┐ │ │ |
|
|
│ │ AUTO-TRIGGER │ Buffer full? Quality threshold? Feedback? │ │ |
|
|
│ └──────┬───────┘ │ │ |
|
|
│ │ │ │ |
|
|
│ Yes │ No ─────────────────────────────────────────────────┘ │ |
|
|
│ │ │ |
|
|
│ ▼ │ |
|
|
│ ┌─────────────┐ │ |
|
|
│ │ MICRO-TRAIN │ 25 steps on high-quality buffer samples │ |
|
|
│ └──────┬──────┘ │ |
|
|
│ │ │ |
|
|
│ ▼ │ |
|
|
│ ┌─────────────┐ │ |
|
|
│ │ VALIDATE │ Compare new model vs checkpoint │ |
|
|
│ └──────┬──────┘ │ |
|
|
│ │ │ |
|
|
│ ┌────┴────┐ │ |
|
|
│ │ │ │ |
|
|
│ Better? Worse? │ |
|
|
│ │ │ │ |
|
|
│ ▼ ▼ │ |
|
|
│ COMMIT ROLLBACK │ |
|
|
│ │ │ │ |
|
|
│ └────┬────┘ │ |
|
|
│ │ │ |
|
|
│ ▼ │ |
|
|
│ Continue ─────────────────────────────────────────────────────────────────┘ |
|
|
│ │ |
|
|
└─────────────────────────────────────────────────────────────────────────────┘ |
|
|
``` |
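The VALIDATE and COMMIT/ROLLBACK boxes reduce to a small comparison on held-out prompts. A minimal sketch, assuming per-prompt quality scores are already available; the function name is illustrative, and the drop tolerance mirrors `quality_drop_threshold` from the configuration section:

```python
from statistics import mean

def validate_and_commit(new_scores, baseline_scores, drop_threshold=0.1):
    """Decide COMMIT vs ROLLBACK after a micro-training burst.

    new_scores / baseline_scores: per-prompt quality in [0, 1] on held-out
    prompts; drop_threshold mirrors `quality_drop_threshold` in Config.
    """
    if mean(new_scores) >= mean(baseline_scores) - drop_threshold:
        return "COMMIT"    # keep the micro-trained weights
    return "ROLLBACK"      # restore the previous checkpoint

print(validate_and_commit([0.78, 0.80], [0.70, 0.72]))  # COMMIT
```

Allowing a small tolerated drop (rather than requiring strict improvement) avoids thrashing on evaluation noise while still catching real regressions.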
|
|
|
|
|
### Mentor Mode Flow |
|
|
|
|
|
``` |
|
|
┌─────────────────────────────────────────────────────────────────────────────┐ |
|
|
│ MENTOR MODE LEARNING FLOW │ |
|
|
├─────────────────────────────────────────────────────────────────────────────┤ |
|
|
│ │ |
|
|
│ User Prompt │ |
|
|
│ │ │ |
|
|
│ ▼ │ |
|
|
│ ┌─────────────────┐ │ |
|
|
│ │ Local Generation │ Generate response with local 8B model │ |
|
|
│ └────────┬────────┘ │ |
|
|
│ │ │ |
|
|
│ ▼ │ |
|
|
│ ┌─────────────────┐ │ |
|
|
│ │ Quality Check │ Evaluate density, coherence, quality │ |
|
|
│ └────────┬────────┘ │ |
|
|
│ │ │ |
|
|
│ ▼ │ |
|
|
│ ┌────────────────────────────────────┐ │ |
|
|
│ │ Quality < 0.6 OR Uncertainty > 0.4 │ │ |
|
|
│ └────────┬───────────────────────────┘ │ |
|
|
│ │ │ |
|
|
│ Yes │ No ──────────► Return local response │ |
|
|
│ │ │ |
|
|
│ ▼ │ |
|
|
│ ┌─────────────────┐ │ |
|
|
│ │ Consult Claude │ Via API │ |
|
|
│ └────────┬────────┘ │ |
|
|
│ │ │ |
|
|
│ ▼ │ |
|
|
│ ┌─────────────────┐ │ |
|
|
│ │ Create DPO Pair │ │ |
|
|
│ │ chosen: Claude │ │ |
|
|
│ │ rejected: Local │ │ |
|
|
│ └────────┬────────┘ │ |
|
|
│ │ │ |
|
|
│ ▼ │ |
|
|
│ ┌─────────────────┐ │ |
|
|
│ │ Add to Buffer │ High-quality experience for training │ |
|
|
│ └────────┬────────┘ │ |
|
|
│ │ │ |
|
|
│ ▼ │ |
|
|
│ Return Claude's response + log learning │ |
|
|
│ │ |
|
|
└─────────────────────────────────────────────────────────────────────────────┘ |
|
|
``` |
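The escalation step above can be sketched in a few lines. This is an illustration only: `consult` stands in for a Claude API call, the names are not the engine's actual API, and the thresholds mirror `MentorConfig`:

```python
from dataclasses import dataclass

@dataclass
class DPOPair:
    prompt: str
    chosen: str    # mentor (Claude) response
    rejected: str  # the local response that triggered escalation

def maybe_consult(prompt, local_response, quality, uncertainty,
                  consult, buffer,
                  quality_threshold=0.6, uncertainty_threshold=0.4):
    """Return the local response, or escalate to the mentor and bank a DPO pair."""
    if quality >= quality_threshold and uncertainty <= uncertainty_threshold:
        return local_response                      # local answer is good enough
    mentor_response = consult(prompt)              # stand-in for the Claude API call
    buffer.append(DPOPair(prompt, mentor_response, local_response))
    return mentor_response

buffer = []
print(maybe_consult("Define holonomy.", "Not sure.", 0.3, 0.5,
                    consult=lambda p: "Mentor answer.", buffer=buffer))
# Mentor answer.
```

Each escalation thus yields exactly one (chosen, rejected) pair, which is what the DPO stage of the Condensator consumes.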
|
|
|
|
|
--- |
|
|
|
|
|
## Core Technology |
|
|
|
|
|
### 1. CF-HoT: Control-Field Holonomy |
|
|
|
|
|
Predictive control through hidden-state monitoring. Rather than reacting after a failure pattern has already surfaced in the output, CF-HoT reads risk directly from intermediate hidden states and intervenes (via targeted logit penalties) before the failure manifests.
|
|
|
|
|
``` |
|
|
┌─────────────────────────────────────────────────────────────────────────────┐ |
|
|
│ CF-HoT ARCHITECTURE │ |
|
|
├─────────────────────────────────────────────────────────────────────────────┤ |
|
|
│ │ |
|
|
│ Hidden States (Layers 16-24) │ |
|
|
│ │ │ |
|
|
│ ▼ │ |
|
|
│ ┌─────────────────┐ │ |
|
|
│ │ Fiber Projection │ Compress to d=16 per layer │ |
|
|
│ └────────┬────────┘ │ |
|
|
│ │ │ |
|
|
│ ▼ │ |
|
|
│ ┌─────────────────┐ │ |
|
|
│ │ Layer Attention │ Weighted aggregation across layers │ |
|
|
│ └────────┬────────┘ │ |
|
|
│ │ │ |
|
|
│ ▼ │ |
|
|
│ ┌─────────────────┐ │ |
|
|
│ │ Risk Predictor │ Binary classifier: P(unwanted_behavior) │ |
|
|
│ └────────┬────────┘ │ |
|
|
│ │ │ |
|
|
│ ▼ │ |
|
|
│ If P > threshold ──► Apply logit penalties │ |
|
|
│ │ |
|
|
└─────────────────────────────────────────────────────────────────────────────┘ |
|
|
``` |
|
|
|
|
|
**Head Performance:** |
|
|
|
|
|
| Head | Separation | Description | |
|
|
|------|------------|-------------| |
|
|
| Repetition | 125× | Detects impending repetitive loops | |
|
|
| Hedging | 1.5× | Blocks uncertainty markers | |
|
|
| Verbosity | 2.1× | Suppresses filler content | |
|
|
|
|
|
The repetition head achieves 125× separation between positive (pre-repetition) and negative (diverse output) hidden states, enabling reliable early warning. |
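The head structure in the diagram (fiber projection, layer attention, risk classifier) can be sketched as below. Dimensions follow the card (layers 16-24 gives 9 layers, fiber d=16); module names, the attention parameterization, and default sizes are illustrative assumptions, not the shipped implementation:

```python
import torch
import torch.nn as nn

class CFHoTHead(nn.Module):
    """Sketch of a CF-HoT risk head: project hidden states from a band of
    layers onto a small fiber, attend over layers, and predict failure risk."""
    def __init__(self, hidden_size=4096, num_layers=9, fiber_dim=16):
        super().__init__()
        self.project = nn.Linear(hidden_size, fiber_dim)           # per-layer fiber projection
        self.layer_logits = nn.Parameter(torch.zeros(num_layers))  # learned layer weighting
        self.classifier = nn.Linear(fiber_dim, 1)                  # P(unwanted_behavior)

    def forward(self, hidden_states):
        # hidden_states: (num_layers, batch, hidden_size)
        fibers = self.project(hidden_states)                       # (L, B, 16)
        weights = torch.softmax(self.layer_logits, dim=0)          # attention over layers
        pooled = (weights[:, None, None] * fibers).sum(dim=0)      # (B, 16)
        return torch.sigmoid(self.classifier(pooled)).squeeze(-1)  # risk in (0, 1)

risk = CFHoTHead()(torch.randn(9, 2, 4096))  # two sequences -> two risk scores
```

Because the head only reads hidden states and emits a scalar per sequence, it adds negligible overhead on top of normal decoding.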
|
|
|
|
|
### 2. The Condensator: Dense Response Training |
|
|
|
|
|
The DENSE adapter is produced by a four-stage training pipeline:
|
|
|
|
|
``` |
|
|
┌─────────────────────────────────────────────────────────────────────────────┐ |
|
|
│ THE CONDENSATOR PIPELINE │ |
|
|
├─────────────────────────────────────────────────────────────────────────────┤ |
|
|
│ │ |
|
|
│ STAGE 1: Supervised Fine-Tuning (SFT) │ |
|
|
│ ───────────────────────────────────── │ |
|
|
│ • 847 curated dense response examples │ |
|
|
│ • Learning rate: 2e-5 │ |
|
|
│ • Epochs: 3 │ |
|
|
│ │ |
|
|
│ STAGE 2: Direct Preference Optimization (DPO) │ |
|
|
│ ───────────────────────────────────────────── │ |
|
|
│ • Preference pairs: dense (chosen) vs verbose (rejected) │ |
|
|
│ • Beta: 0.1 │ |
|
|
│ • Epochs: 2 │ |
|
|
│ │ |
|
|
│ STAGE 3: Reinforcement Learning (PPO) │ |
|
|
│ ───────────────────────────────────── │ |
|
|
│ • Reward = quality_score - length_penalty │ |
|
|
│ • Conservative KL constraint │ |
|
|
│ • Learning rate: 1e-6 │ |
|
|
│ │ |
|
|
│ STAGE 4: Checkpointing │ |
|
|
│ ───────────────────── │ |
|
|
│ • Save every 25 steps │ |
|
|
│ • A/B comparison on held-out prompts │ |
|
|
│ • Automatic rollback if quality drops │ |
|
|
│ │ |
|
|
└─────────────────────────────────────────────────────────────────────────────┘ |
|
|
``` |
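Stage 3's reward shaping can be sketched as below; the length coefficient, token budget, and rounding are assumed example values, not the trained hyperparameters:

```python
def rl_reward(quality_score, num_tokens, target_tokens=45, length_coef=0.002):
    """Reward = quality_score - length_penalty (Stage 3 of the Condensator).

    Only tokens beyond the budget are penalized; target_tokens and
    length_coef are illustrative placeholders, not the trained values.
    """
    length_penalty = length_coef * max(0, num_tokens - target_tokens)
    return round(quality_score - length_penalty, 3)

print(rl_reward(0.8, 45))   # 0.8  (at budget, no penalty)
print(rl_reward(0.8, 145))  # 0.6  (100 extra tokens cost 0.2)
```

Penalizing only the overshoot, rather than raw length, keeps the reward from pushing responses below the information the prompt actually requires.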
|
|
|
|
|
### 3. Enhanced CF-HoT Parameters |
|
|
|
|
|
| Parameter | Value | Reason | |
|
|
|-----------|-------|--------| |
|
|
| EMA Momentum | 0.995 | Stable control field | |
|
|
| Gate Temperature | 2.0 | Softer sigmoid | |
|
|
| Gate Bounds | [0.1, 0.9] | Prevent saturation | |
|
|
| Monitoring | Every 50 steps | Detect drift | |
|
|
| Warmup | 500 steps | Smooth initialization | |
|
|
|
|
|
--- |
|
|
|
|
|
## Command Reference |
|
|
|
|
|
### Core Commands |
|
|
|
|
|
| Command | Description | |
|
|
|---------|-------------| |
|
|
| `status` | System status overview | |
|
|
| `help` | Full command menu | |
|
|
| `help <topic>` | Topic-specific help | |
|
|
| `quit` | Exit | |
|
|
|
|
|
### Self-Improvement |
|
|
|
|
|
| Command | Description | |
|
|
|---------|-------------| |
|
|
| `!improve` | Run improvement iteration | |
|
|
| `!eval` | Full evaluation | |
|
|
| `!train <steps>` | Training steps | |
|
|
| `!compare` | Compare checkpoints | |
|
|
| `!rollback` | Revert to best checkpoint | |
|
|
| `!load <path>` | Load checkpoint | |
|
|
| `!benchmark` | Evaluation suite | |
|
|
|
|
|
### Mentor Mode |
|
|
|
|
|
| Command | Description | |
|
|
|---------|-------------| |
|
|
| `!mentor` | Show mentor mode status | |
|
|
| `!mentor on` | Enable auto-consultation | |
|
|
| `!mentor off` | Disable mentor mode | |
|
|
| `!mentor ask <question>` | Ask Claude and learn from response | |
|
|
| `!mentor learn` | Show collected learnings | |
|
|
|
|
|
### RSI (Recursive Self-Improvement) |
|
|
|
|
|
| Command | Description | |
|
|
|---------|-------------| |
|
|
| `!auto_train on` | Enable learning during chat | |
|
|
| `!auto_train off` | Disable auto-training | |
|
|
| `!skills` | Quality per domain | |
|
|
| `!forgetting` | Detect catastrophic forgetting | |
|
|
| `!dream` | Force experience replay | |
|
|
| `!buffer` | Experience buffer stats | |
|
|
| `!selfplay <N>` | Run N self-play iterations | |
|
|
|
|
|
### Condensator |
|
|
|
|
|
| Command | Description | |
|
|
|---------|-------------| |
|
|
| `!condensator` | Run full SFT→DPO→RL pipeline | |
|
|
| `!dpo` | Run DPO stage only | |
|
|
| `!rl` | Run RL stage only | |
|
|
| `!train_cfhot` | Train CF-HoT heads | |
|
|
|
|
|
### CF-HoT Control |
|
|
|
|
|
| Command | Description | |
|
|
|---------|-------------| |
|
|
| `!cfhot` / `!125x` | Toggle 125× head | |
|
|
| `!cfhot status` | Head status | |
|
|
| `!gate_stats` | CF-HoT gate health | |
|
|
|
|
|
### Generation Modes |
|
|
|
|
|
| Command | Description | |
|
|
|---------|-------------| |
|
|
| `!book` | Toggle book mode (16K tokens) | |
|
|
| `!write <topic>` | Write extended content | |
|
|
| `!claude <prompt>` | Direct Claude API prompt | |
|
|
|
|
|
### Tools |
|
|
|
|
|
| Command | Description | |
|
|
|---------|-------------| |
|
|
| `!shell <cmd>` | Execute shell command | |
|
|
| `!python <code>` | Execute Python | |
|
|
| `!read <path>` | Read file | |
|
|
| `!write <path> <content>` | Write file | |
|
|
| `!search <query>` | Web search | |
|
|
| `!fetch <url>` | Fetch URL content | |
|
|
|
|
|
### Browser (requires Playwright) |
|
|
|
|
|
| Command | Description | |
|
|
|---------|-------------| |
|
|
| `!browse <url>` | Open URL | |
|
|
| `!click <selector>` | Click element | |
|
|
| `!type <text>` | Type text | |
|
|
| `!read` | Read page content | |
|
|
|
|
|
### Multimedia (optional dependencies) |
|
|
|
|
|
| Command | Description | |
|
|
|---------|-------------| |
|
|
| `!stream` | Open live token window | |
|
|
| `!audio` / `!tts` | Toggle text-to-speech | |
|
|
| `!imagine <prompt>` | Generate image (SDXL) | |
|
|
| `!dalle <prompt>` | Generate image (DALL-E 3) | |
|
|
|
|
|
### Experimental Features |
|
|
|
|
|
| Command | Description | |
|
|
|---------|-------------| |
|
|
| `!content blog <topic>` | Generate blog post | |
|
|
| `!content youtube <topic>` | Generate video script | |
|
|
|
|
|
--- |
|
|
|
|
|
## Evaluation |
|
|
|
|
|
### Qualitative Comparison |
|
|
|
|
|
| Prompt | Base Hermes-3 | ARC-Condensed | |
|
|
|--------|---------------|---------------| |
|
|
| "hello" | "Hello! I'm here to help you with any questions or tasks you might have. Feel free to ask me anything!" (23 tokens) | "Hello. How can I help?" (5 tokens) | |
|
|
| "What is recursion?" | "That's a great question! Recursion is a programming concept where a function calls itself..." (150+ tokens) | "Function calling itself until base case. Stack frames accumulate, unwind on return." (12 tokens) | |
|
|
| "How are you?" | "As an AI, I don't have feelings in the traditional sense, but I'm functioning well..." (25 tokens) | "Functional. Task?" (3 tokens) | |
|
|
|
|
|
### Quantitative Metrics |
|
|
|
|
|
| Metric | Base Model | ARC-Condensed | Change | |
|
|
|--------|------------|---------------|--------| |
|
|
| Avg. Response Length | 150 tokens | 45 tokens | -70% | |
|
|
| Filler Phrases | Present | Minimal | ~-95% | |
|
|
| Information Density | 17.0 | 45.2 | +166% | |
|
|
| Quality Score (internal) | 0.52 | 0.78 | +50% | |
|
|
|
|
|
**Note:** These are heuristic metrics from internal evaluation. Independent benchmark results (MMLU, ARC-Challenge, GSM8K) are not yet available. We welcome independent evaluation. |
|
|
|
|
|
### Self-Improvement Trajectory (Observed) |
|
|
|
|
|
``` |
|
|
Iteration 0: Quality 0.52 (baseline) |
|
|
Iteration 5: Quality 0.68 (+31%) |
|
|
Iteration 10: Quality 0.75 (+44%) |
|
|
Iteration 15: Quality 0.78 (+50%, plateau) |
|
|
``` |
|
|
|
|
|
Self-improvement shows diminishing returns after ~15 iterations. This is expected behavior, not a limitation to work around. |
|
|
|
|
|
--- |
|
|
|
|
|
## Installation |
|
|
|
|
|
### Minimal Installation |
|
|
|
|
|
```bash |
|
|
pip install torch transformers accelerate peft bitsandbytes datasets trl |
|
|
``` |
|
|
|
|
|
### Full Installation |
|
|
|
|
|
```bash |
|
|
pip install -r requirements.txt |
|
|
``` |
|
|
|
|
|
### Optional Dependencies |
|
|
|
|
|
```bash |
|
|
# Browser automation |
|
|
pip install playwright && playwright install firefox |
|
|
|
|
|
# Image generation |
|
|
pip install diffusers pillow |
|
|
|
|
|
# Text-to-speech |
|
|
pip install pyttsx3 gTTS pygame |
|
|
|
|
|
# Claude API (for mentor mode) |
|
|
pip install anthropic |
|
|
|
|
|
# OpenAI API (for DALL-E) |
|
|
pip install openai |
|
|
|
|
|
# Web search |
|
|
pip install requests |
|
|
``` |
|
|
|
|
|
### Environment Variables |
|
|
|
|
|
```bash |
|
|
# Optional - for enhanced features |
|
|
export ANTHROPIC_API_KEY="sk-ant-..." # Mentor Mode |
|
|
export OPENAI_API_KEY="sk-..." # DALL-E |
|
|
``` |
|
|
|
|
|
--- |
|
|
|
|
|
## Configuration |
|
|
|
|
|
### Main Configuration |
|
|
|
|
|
```python |
|
|
class Config: |
|
|
# Generation |
|
|
temperature = 0.85 |
|
|
top_p = 0.9 |
|
|
max_new_tokens = 512 |
|
|
repetition_penalty = 1.1 |
|
|
|
|
|
# CF-HoT |
|
|
use_cfhot = True |
|
|
use_cfhot_125x = False |
|
|
cfhot_repetition_threshold = 0.6 |
|
|
cfhot_repetition_penalty = 6.0 |
|
|
|
|
|
# Self-improvement |
|
|
min_quality_score = 0.5 |
|
|
target_quality_score = 0.75 |
|
|
training_steps_per_iteration = 25 |
|
|
quality_drop_threshold = 0.1 |
|
|
``` |
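At decode time, the `cfhot_repetition_threshold` / `cfhot_repetition_penalty` pair above can be applied as a simple hook. This sketch operates on a plain list of logits for clarity (the engine works on tensors), and the function name is illustrative:

```python
def apply_cfhot_penalty(logits, risk, recent_token_ids,
                        threshold=0.6, penalty=6.0):
    """When the repetition head's predicted risk exceeds the threshold,
    penalize the logits of recently emitted tokens.

    Defaults mirror cfhot_repetition_threshold / cfhot_repetition_penalty.
    """
    if risk > threshold:
        for tok in set(recent_token_ids):
            logits[tok] -= penalty
    return logits

print(apply_cfhot_penalty([1.0, 2.0, 3.0], risk=0.9, recent_token_ids=[2]))
# [1.0, 2.0, -3.0]
```

Below the threshold the logits pass through untouched, which is what makes the control predictive rather than a blanket repetition penalty.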
|
|
|
|
|
### RSI Configuration |
|
|
|
|
|
```python |
|
|
@dataclass |
|
|
class RSIConfig: |
|
|
auto_train_enabled: bool = False |
|
|
buffer_size: int = 1000 |
|
|
min_experiences_to_train: int = 50 |
|
|
quality_threshold_for_training: float = 0.7 |
|
|
dream_cycle_interval: int = 100 |
|
|
forgetting_check_interval: int = 50 |
|
|
``` |
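The buffer and its auto-train trigger follow directly from these fields. A minimal sketch; the class and method names are illustrative, not the engine's actual API:

```python
from collections import deque
from dataclasses import dataclass

@dataclass
class Experience:
    prompt: str
    response: str
    quality: float
    domain: str = "general"

class ExperienceBuffer:
    """Bounded experience store with an auto-train trigger (sketch)."""
    def __init__(self, buffer_size=1000, min_experiences=50, quality_threshold=0.7):
        self.items = deque(maxlen=buffer_size)   # oldest experiences fall out first
        self.min_experiences = min_experiences
        self.quality_threshold = quality_threshold

    def add(self, exp: Experience):
        self.items.append(exp)

    def training_batch(self):
        """High-quality samples, or None until the trigger condition holds."""
        good = [e for e in self.items if e.quality >= self.quality_threshold]
        return good if len(good) >= self.min_experiences else None

buf = ExperienceBuffer(min_experiences=2)
buf.add(Experience("hello", "Hello. How can I help?", quality=0.82))
```

Filtering on quality before triggering micro-training is what keeps low-scoring exchanges from being reinforced.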
|
|
|
|
|
### Mentor Configuration |
|
|
|
|
|
```python |
|
|
@dataclass |
|
|
class MentorConfig: |
|
|
enabled: bool = False |
|
|
auto_consult_threshold: float = 0.6 |
|
|
uncertainty_threshold: float = 0.4 |
|
|
learn_from_responses: bool = True |
|
|
``` |
|
|
|
|
|
--- |
|
|
|
|
|
## Repository Structure |
|
|
|
|
|
``` |
|
|
ARC-Base-8B-Condensed/ |
|
|
│ |
|
|
├── arc_engine_v29_full.py # Main engine |
|
|
├── README.md # This file |
|
|
├── requirements.txt # Dependencies |
|
|
│ |
|
|
├── model-00001-of-00004.safetensors # Model weights |
|
|
├── model-00002-of-00004.safetensors |
|
|
├── model-00003-of-00004.safetensors |
|
|
├── model-00004-of-00004.safetensors |
|
|
├── config.json |
|
|
├── tokenizer.json |
|
|
├── tokenizer_config.json |
|
|
├── special_tokens_map.json |
|
|
├── generation_config.json |
|
|
│ |
|
|
├── dense_checkpoints/ # Training checkpoints |
|
|
│ └── step_*/ |
|
|
│ |
|
|
├── cfhot_checkpoints/ # CF-HoT heads |
|
|
│ └── final_6000/ |
|
|
│ └── risk_predictor.pt |
|
|
│ |
|
|
├── improvement_logs/ # RSI logs |
|
|
└── exports/ # Checkpoint exports |
|
|
``` |
|
|
|
|
|
--- |
|
|
|
|
|
## Hardware Requirements |
|
|
|
|
|
| Component | Minimum | Recommended | |
|
|
|-----------|---------|-------------| |
|
|
| GPU VRAM | 16 GB | 24+ GB | |
|
|
| System RAM | 32 GB | 64 GB | |
|
|
| Storage | 50 GB | 100 GB | |
|
|
| Python | 3.10+ | 3.11 | |
|
|
|
|
|
**Tested Configurations:** |
|
|
- NVIDIA RTX 3090 (24GB), 64GB RAM ✓ |
|
|
- NVIDIA RTX 4090 (24GB), 128GB RAM ✓ |
|
|
- NVIDIA A100 (40GB) ✓ |
|
|
|
|
|
**Performance Estimates:** |
|
|
- Inference: ~15-25 tokens/second |
|
|
- Full Condensator pipeline: ~4 hours (RTX 3090) |
|
|
- Self-improvement iteration: ~30 minutes |
|
|
|
|
|
--- |
|
|
|
|
|
## Training From Scratch |
|
|
|
|
|
### Automated Training |
|
|
|
|
|
```bash |
|
|
python arc_engine_v29_full.py |
|
|
> !condensator |
|
|
``` |
|
|
|
|
|
This runs: |
|
|
1. SFT (3 epochs) |
|
|
2. DPO (2 epochs) |
|
|
3. RL (300 steps) |
|
|
4. Checkpoint validation |
|
|
|
|
|
### Manual Training |
|
|
|
|
|
**Step 1: Train CF-HoT Heads** |
|
|
``` |
|
|
> !train_cfhot |
|
|
``` |
|
|
|
|
|
**Step 2: Run Condensator** |
|
|
``` |
|
|
> !condensator |
|
|
``` |
|
|
|
|
|
**Step 3: Self-Improvement** |
|
|
``` |
|
|
> !selfplay 1000 |
|
|
``` |
|
|
|
|
|
--- |
|
|
|
|
|
## API Reference |
|
|
|
|
|
### Start Server |
|
|
|
|
|
``` |
|
|
> !api |
|
|
[api] Server running on http://0.0.0.0:8080 |
|
|
``` |
|
|
|
|
|
### Endpoints |
|
|
|
|
|
#### POST /generate |
|
|
|
|
|
```bash |
|
|
curl -X POST http://localhost:8080/generate \ |
|
|
-H "Content-Type: application/json" \ |
|
|
-d '{"prompt": "What is recursion?"}' |
|
|
``` |
|
|
|
|
|
Response: |
|
|
```json |
|
|
{ |
|
|
"response": "Function calling itself until base case.", |
|
|
"quality": 0.82, |
|
|
"density": 48.3, |
|
|
"tokens": 8 |
|
|
} |
|
|
``` |
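The same call from Python, using only the standard library; the endpoint and field names follow the response shown above:

```python
import json
from urllib import request

def arc_generate(prompt, host="http://localhost:8080"):
    """POST to the engine's /generate endpoint and return the parsed JSON."""
    req = request.Request(
        f"{host}/generate",
        data=json.dumps({"prompt": prompt}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req, timeout=120) as resp:
        return json.load(resp)  # keys: response, quality, density, tokens

# Requires a running engine (start one with !api):
# reply = arc_generate("What is recursion?")
# print(reply["response"], reply["quality"])
```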
|
|
|
|
|
#### GET /health |
|
|
|
|
|
```bash |
|
|
curl http://localhost:8080/health |
|
|
``` |
|
|
|
|
|
--- |
|
|
|
|
|
## Limitations |
|
|
|
|
|
### Known Limitations |
|
|
|
|
|
| Limitation | Description | |
|
|
|------------|-------------| |
|
|
| **Scale** | Tested on 8B parameters only; scaling behavior unknown | |
|
|
| **Language** | English only | |
|
|
| **Benchmarks** | No formal benchmark results (MMLU, GSM8K, etc.) | |
|
|
| **Terseness** | May be too concise for applications requiring elaboration | |
|
|
| **Iterations** | Self-improvement plateaus after ~15 iterations | |
|
|
| **Memory** | Full features require 16GB+ VRAM | |
|
|
|
|
|
### What This Is Not |
|
|
|
|
|
- This is **not** AGI or a path to AGI |
|
|
- This is **not** a production-ready system |
|
|
- Self-improvement is **bounded and reversible** |
|
|
- The model **requires human oversight** |
|
|
- Claims are **not independently validated** |
|
|
|
|
|
--- |
|
|
|
|
|
## Ethical Considerations |
|
|
|
|
|
### Safety Measures |
|
|
|
|
|
- **Quality gates:** All self-modification requires quality validation |
|
|
- **Automatic rollback:** Degradation triggers checkpoint restoration |
|
|
- **Bounded improvement:** No unbounded recursive self-modification |
|
|
- **Human oversight:** System designed for interactive use, not autonomy |
|
|
|
|
|
### Potential Risks |
|
|
|
|
|
- Dense responses may omit important caveats or safety information |
|
|
- Self-improvement research requires careful monitoring |
|
|
- Model inherits biases from base Hermes-3 and training data |
|
|
- Experimental features should not be used for consequential decisions |
|
|
|
|
|
### Explicit Non-Goals |
|
|
|
|
|
This system is **not designed for:** |
|
|
- Autonomous operation without human oversight |
|
|
- Self-replication or self-preservation |
|
|
- Deception or manipulation |
|
|
- Capability acquisition beyond defined scope |
|
|
|
|
|
--- |
|
|
|
|
|
## Technical Specification |
|
|
|
|
|
Full technical documentation is available: |
|
|
|
|
|
- **Primary Reference (Master Book):** |
|
|
[Controlled Language Models: Decode-Time Behavioral Control and Token Efficiency](https://doi.org/10.5281/zenodo.18344021) |
|
|
|
|
|
- **Related Preprints:** |
|
|
- [From Explicit Holonomy to Latent Control Fields](https://zenodo.org/records/14707164) |
|
|
- [The Holonomy Transformer](https://zenodo.org/records/14707081) |
|
|
|
|
|
The specification covers: |
|
|
- Multi-loop training architecture |
|
|
- Control field theory and implementation |
|
|
- Tokenization co-evolution (fourth loop) |
|
|
- Reliability engineering and rollback protocols |
|
|
- Reproducibility requirements |
|
|
|
|
|
|
|
|
--- |
|
|
|
|
|
## Changelog |
|
|
|
|
|
### v2.9 (Current) |
|
|
- Stealth web browser for research |
|
|
- Improved training functions |
|
|
- Bug fixes for selfplay training loop |
|
|
|
|
|
### v2.8 |
|
|
- Full RSI continuous learning system |
|
|
- Auto-train during chat |
|
|
- Dream cycles for experience replay |
|
|
- Domain-specific skill tracking |
|
|
- Catastrophic forgetting detection |
|
|
|
|
|
### v2.4 |
|
|
- Mentor Mode: Learn from Claude API |
|
|
- Content generation tools |
|
|
- Smart help system |
|
|
|
|
|
### v2.2 |
|
|
- Full CONDENSATOR pipeline |
|
|
- Enhanced CF-HoT with EMA, gate temperature |
|
|
- DPO and RL training stages |
|
|
|
|
|
### v2.0 |
|
|
- Initial release |
|
|
- CF-HoT 125× repetition head |
|
|
- Dense response training |
|
|
- Basic self-improvement loop |
|
|
|
|
|
--- |
|
|
|
|
|
## Citation |
|
|
```bibtex |
|
|
@software{napolitano2025arc, |
|
|
author = {Napolitano, Logan Matthew}, |
|
|
title = {{ARC-Base-8B-Condensed}: Adaptive Recursive Cognition for Self-Stabilizing Language Models}, |
|
|
year = {2025}, |
|
|
publisher = {Hugging Face}, |
|
|
url = {https://huggingface.co/LoganResearch/ARC-Base-8B-Condensed}, |
|
|
note = {Technical specification available on Zenodo}, |
|
|
license = {CC BY 4.0} |
|
|
} |
|
|
``` |
|
|
```bibtex |
|
|
@article{napolitano2025controlled, |
|
|
author = {Napolitano, Logan Matthew}, |
|
|
title = {Controlled Language Models: Decode-Time Behavioral Control and Token Efficiency}, |
|
|
year = {2025}, |
|
|
doi = {10.5281/zenodo.18344021}, |
|
|
url = {https://zenodo.org/records/18344021}, |
|
|
publisher = {Zenodo}, |
|
|
note = {Primary technical reference for ARC-Base-8B-Condensed} |
|
|
} |
|
|
``` |
|
|
```bibtex |
|
|
@article{napolitano2025controlfield, |
|
|
author = {Napolitano, Logan Matthew}, |
|
|
title = {From Explicit Holonomy to Latent Control Fields}, |
|
|
year = {2025}, |
|
|
doi = {10.5281/zenodo.14707164}, |
|
|
url = {https://zenodo.org/records/14707164}, |
|
|
publisher = {Zenodo} |
|
|
} |
|
|
``` |
|
|
|
|
|
## References |
|
|
|
|
|
1. Zou, A., et al. (2023). Representation Engineering: A Top-Down Approach to AI Transparency. arXiv:2310.01405 |
|
|
2. Rafailov, R., et al. (2023). Direct Preference Optimization. arXiv:2305.18290 |
|
|
3. Hu, E. J., et al. (2021). LoRA: Low-Rank Adaptation of Large Language Models. arXiv:2106.09685 |
|
|
4. Ouyang, L., et al. (2022). Training language models to follow instructions with human feedback. NeurIPS. |
|
|
|
|
|
--- |
|
|
|
|
|
## Acknowledgments |
|
|
|
|
|
- **NousResearch** for Hermes-3-Llama-3.1-8B base model |
|
|
- **Meta AI** for Llama 3.1 architecture |
|
|
- **Hugging Face** for transformers, PEFT, TRL |
|
|
- **Anthropic** for Claude API (Mentor Mode) |
|
|
|
|
|
--- |
|
|
|
|
|
## License |
|
|
|
|
|
This work is licensed under [CC BY 4.0](https://creativecommons.org/licenses/by/4.0/) (Creative Commons Attribution 4.0 International). |
|
|
|
|
|
You are free to: |
|
|
- **Share** — copy and redistribute the material in any medium or format |
|
|
- **Adapt** — remix, transform, and build upon the material for any purpose, including commercial |
|
|
|
|
|
Under the following terms: |
|
|
- **Attribution** — You must give appropriate credit, provide a link to the license, and indicate if changes were made. |
|
|
|
|
|
--- |
|
|
|
|
|
<div align="center"> |
|
|
|
|
|
**Contact:** [GitHub Issues](https://github.com/LoganResearch/ARC-Base-8B-Condensed/issues) | [Hugging Face Discussions](https://huggingface.co/LoganResearch/ARC-Base-8B-Condensed/discussions) |
|
|
|
|
|
**Version:** 2.9 | **Last Updated:** January 2025 |
|
|
|
|
|
</div> |