--- license: cc-by-4.0 language: - en library_name: transformers pipeline_tag: text-generation tags: - llama - dense-responses - self-improvement - representation-engineering - cf-hot - recursive-self-improvement base_model: NousResearch/Hermes-3-Llama-3.1-8B ---
# ARC-Base-8B-Condensed ## Adaptive Recursive Cognition **A Multi-Loop Self-Stabilizing Language Model with Predictive Control** *Logan Matthew Napolitano* [![License: CC BY 4.0](https://img.shields.io/badge/License-CC%20BY%204.0-lightgrey.svg)](https://creativecommons.org/licenses/by/4.0/) [![Python 3.10+](https://img.shields.io/badge/python-3.10+-blue.svg)](https://www.python.org/downloads/) [![Base Model](https://img.shields.io/badge/base-Hermes--3--8B-green.svg)](https://huggingface.co/NousResearch/Hermes-3-Llama-3.1-8B) *Research into stable self-improving language models* [Quick Start](#quick-start) • [Architecture](#architecture) • [Commands](#command-reference) • [Technical Specification](#technical-specification) • [Citation](#citation)
--- ## Table of Contents 1. [Model Description](#model-description) 2. [Quick Start](#quick-start) 3. [Architecture](#architecture) 4. [Core Technology](#core-technology) 5. [Command Reference](#command-reference) 6. [Evaluation](#evaluation) 7. [Installation](#installation) 8. [Configuration](#configuration) 9. [Repository Structure](#repository-structure) 10. [Hardware Requirements](#hardware-requirements) 11. [Training From Scratch](#training-from-scratch) 12. [API Reference](#api-reference) 13. [Limitations](#limitations) 14. [Ethical Considerations](#ethical-considerations) 15. [Technical Specification](#technical-specification) 16. [Changelog](#changelog) 17. [Citation](#citation) 18. [License](#license) --- ### Primary Reference The complete theoretical framework, methodology, and reproducibility details for this model are documented in: **Napolitano, L. M. (2025). _Controlled Language Models: Decode-Time Behavioral Control and Token Efficiency._** Zenodo. https://doi.org/10.5281/zenodo.18344021 This paper should be cited for any academic or technical use of ARC-Base-8B-Condensed. ## Model Description ARC-Base-8B-Condensed is a fine-tuned version of [Hermes-3-Llama-3.1-8B](https://huggingface.co/NousResearch/Hermes-3-Llama-3.1-8B) designed for: 1. **Dense, information-rich responses** — Reduced filler, hedging, and verbosity 2. **Predictive behavioral control** — CF-HoT heads detect and suppress failure modes before they manifest 3. **Recursive self-improvement** — Micro-training with automatic rollback on quality degradation 4. 
**Mentor-based learning** — Optional consultation with Claude API for continuous improvement ### Intended Use - Research into self-improving language models - Applications requiring concise, direct responses - Study of representation engineering and behavioral control - Base for further fine-tuning experiments ### Not Intended For - Production deployment without evaluation - Safety-critical applications - Unsupervised autonomous operation - Applications requiring verbose, elaborative responses --- ## Quick Start ### One-Command Start ```bash git clone https://huggingface.co/LoganResearch/ARC-Base-8B-Condensed cd ARC-Base-8B-Condensed pip install -r requirements.txt python arc_engine_v29_full.py ``` On first run, the engine will: 1. Download the base model (~16GB) 2. Load the DENSE adapter and CF-HoT heads 3. Initialize all subsystems 4. Present an interactive command prompt ``` ═══════════════════════════════════════════════════════════════════════════════ ARC ENGINE v2.9 - Adaptive Recursive Cognition Multi-Loop Self-Stabilizing Language Model ═══════════════════════════════════════════════════════════════════════════════ DENSE Mode: ON (CONDENSATOR checkpoint) CF-HoT Control: ON CF-HoT 125×: OFF Mentor Mode: OFF Auto-Train: OFF Experience Buffer: 0 examples ═══════════════════════════════════════════════════════════════════════════════ You> hello Hello. How can I help? 
[Quality: 0.82 | Density: 45.2 | Coherence: 0.95 | Tokens: 5] ``` ### Minimal Python Usage ```python from transformers import AutoModelForCausalLM, AutoTokenizer import torch model = AutoModelForCausalLM.from_pretrained( "LoganResearch/ARC-Base-8B-Condensed", torch_dtype=torch.bfloat16, device_map="auto" ) tokenizer = AutoTokenizer.from_pretrained("LoganResearch/ARC-Base-8B-Condensed") prompt = "<|im_start|>user\nExplain gradient descent briefly.<|im_end|>\n<|im_start|>assistant\n" inputs = tokenizer(prompt, return_tensors="pt").to(model.device) outputs = model.generate(**inputs, max_new_tokens=100, do_sample=True, temperature=0.7) print(tokenizer.decode(outputs[0], skip_special_tokens=True)) ``` --- ## Architecture ### System Overview ``` ┌─────────────────────────────────────────────────────────────────────────────┐ │ ARC ENGINE ARCHITECTURE │ ├─────────────────────────────────────────────────────────────────────────────┤ │ │ │ ┌─────────────────────────────────────────────────────────────────────┐ │ │ │ INPUT PROCESSING │ │ │ │ User Input → Command Parser → Generate / Tool Execute │ │ │ └─────────────────────────────────────────────────────────────────────┘ │ │ │ │ │ ▼ │ │ ┌─────────────────────────────────────────────────────────────────────┐ │ │ │ CORE MODEL STACK │ │ │ ├─────────────────────────────────────────────────────────────────────┤ │ │ │ │ │ │ │ Base Model: Hermes-3-Llama-3.1-8B (8B parameters) │ │ │ │ │ │ │ │ │ ▼ │ │ │ │ DENSE Adapter ─── THE CONDENSATOR trained (SFT→DPO→RL) │ │ │ │ │ │ │ │ │ ▼ │ │ │ │ CF-HoT Heads ─── Repetition (125×), Hedging, Verbosity │ │ │ │ │ │ │ │ │ ▼ │ │ │ │ Output Generation ─── Quality-controlled, density-optimized │ │ │ │ │ │ │ └─────────────────────────────────────────────────────────────────────┘ │ │ │ │ │ ▼ │ │ ┌─────────────────────────────────────────────────────────────────────┐ │ │ │ QUALITY EVALUATION │ │ │ │ Response → Density Score → Coherence Score → Overall Quality │ │ │ │ │ │ │ │ │ ▼ │ │ │ │ 
┌──────────────────────────────────────────────────────────────┐ │ │ │ │ │ Mentor Mode Check: Quality < 0.6 OR Uncertainty > 0.4? │ │ │ │ │ │ │ Yes │ │ │ │ │ │ ▼ │ │ │ │ │ │ Consult Claude → Learn from Response → Update Training Buffer │ │ │ │ │ └──────────────────────────────────────────────────────────────┘ │ │ │ └─────────────────────────────────────────────────────────────────────┘ │ │ │ │ │ ▼ │ │ ┌─────────────────────────────────────────────────────────────────────┐ │ │ │ RSI EXPERIENCE BUFFER │ │ │ │ Store: prompt, response, quality, domain, difficulty, feedback │ │ │ │ │ │ │ │ │ ┌──────────┴──────────┐ │ │ │ │ ▼ ▼ │ │ │ │ Auto-Train Trigger? Dream Cycle? │ │ │ │ │ │ │ │ │ │ ▼ ▼ │ │ │ │ Micro-training Experience Replay │ │ │ │ (25 steps) (Reinforce learnings) │ │ │ └─────────────────────────────────────────────────────────────────────┘ │ │ │ │ │ ▼ │ │ ┌─────────────────────────────────────────────────────────────────────┐ │ │ │ VALIDATION & COMMIT │ │ │ │ New Quality vs Old Quality → Better? COMMIT : ROLLBACK │ │ │ └─────────────────────────────────────────────────────────────────────┘ │ │ │ └─────────────────────────────────────────────────────────────────────────────┘ ``` ### RSI Loop (Recursive Self-Improvement) ``` ┌─────────────────────────────────────────────────────────────────────────────┐ │ RECURSIVE SELF-IMPROVEMENT LOOP │ ├─────────────────────────────────────────────────────────────────────────────┤ │ │ │ ┌─────────┐ │ │ │ CHAT │◄─────────────────────────────────────────────────┐ │ │ └────┬────┘ │ │ │ │ │ │ │ ▼ │ │ │ ┌─────────┐ │ │ │ │ MEASURE │ Calculate quality, density, coherence │ │ │ └────┬────┘ │ │ │ │ │ │ │ ▼ │ │ │ ┌─────────┐ │ │ │ │ BUFFER │ Store in experience buffer with metadata │ │ │ └────┬────┘ │ │ │ │ │ │ │ ▼ │ │ │ ┌──────────────┐ │ │ │ │ AUTO-TRIGGER │ Buffer full? Quality threshold? Feedback? 
│ │ │ └──────┬───────┘ │ │ │ │ │ │ │ Yes │ No ─────────────────────────────────────────────────┘ │ │ │ │ │ ▼ │ │ ┌─────────────┐ │ │ │ MICRO-TRAIN │ 25 steps on high-quality buffer samples │ │ └──────┬──────┘ │ │ │ │ │ ▼ │ │ ┌─────────────┐ │ │ │ VALIDATE │ Compare new model vs checkpoint │ │ └──────┬──────┘ │ │ │ │ │ ┌────┴────┐ │ │ │ │ │ │ Better? Worse? │ │ │ │ │ │ ▼ ▼ │ │ COMMIT ROLLBACK │ │ │ │ │ │ └────┬────┘ │ │ │ │ │ ▼ │ │ Continue ─────────────────────────────────────────────────────────────────┘ │ │ └─────────────────────────────────────────────────────────────────────────────┘ ``` ### Mentor Mode Flow ``` ┌─────────────────────────────────────────────────────────────────────────────┐ │ MENTOR MODE LEARNING FLOW │ ├─────────────────────────────────────────────────────────────────────────────┤ │ │ │ User Prompt │ │ │ │ │ ▼ │ │ ┌─────────────────┐ │ │ │ Local Generation │ Generate response with local 8B model │ │ └────────┬────────┘ │ │ │ │ │ ▼ │ │ ┌─────────────────┐ │ │ │ Quality Check │ Evaluate density, coherence, quality │ │ └────────┬────────┘ │ │ │ │ │ ▼ │ │ ┌────────────────────────────────────┐ │ │ │ Quality < 0.6 OR Uncertainty > 0.4 │ │ │ └────────┬───────────────────────────┘ │ │ │ │ │ Yes │ No ──────────► Return local response │ │ │ │ │ ▼ │ │ ┌─────────────────┐ │ │ │ Consult Claude │ Via API │ │ └────────┬────────┘ │ │ │ │ │ ▼ │ │ ┌─────────────────┐ │ │ │ Create DPO Pair │ │ │ │ chosen: Claude │ │ │ │ rejected: Local │ │ │ └────────┬────────┘ │ │ │ │ │ ▼ │ │ ┌─────────────────┐ │ │ │ Add to Buffer │ High-quality experience for training │ │ └────────┬────────┘ │ │ │ │ │ ▼ │ │ Return Claude's response + log learning │ │ │ └─────────────────────────────────────────────────────────────────────────────┘ ``` --- ## Core Technology ### 1. CF-HoT: Control-Field Holonomy Predictive control through hidden-state monitoring. Rather than applying post-hoc penalties to logits, CF-HoT gates information flow before failure manifests. 
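This pre-emptive gate can be sketched in a few lines of plain Python. The projection width, head weights, and logistic risk unit below are illustrative assumptions for clarity (the released CF-HoT heads are learned modules); the threshold and penalty values are the defaults listed in the Configuration section.

```python
import math

# Illustrative sketch of the CF-HoT decision rule -- NOT the released
# implementation. Shapes and the logistic head are assumptions.

FIBER_DIM = 16    # per-layer fiber projection width (d=16)
THRESHOLD = 0.6   # cfhot_repetition_threshold (Configuration defaults)
PENALTY = 6.0     # cfhot_repetition_penalty  (Configuration defaults)

def layer_aggregate(fibers, weights):
    """Weighted aggregation of per-layer fiber vectors (layer attention)."""
    agg = [0.0] * FIBER_DIM
    for w, fib in zip(weights, fibers):
        for i, x in enumerate(fib):
            agg[i] += w * x
    return agg

def risk_probability(agg, head_w, head_b):
    """Binary risk head: P(unwanted_behavior) via a logistic unit."""
    z = sum(w * x for w, x in zip(head_w, agg)) + head_b
    return 1.0 / (1.0 + math.exp(-z))

def gate_logits(logits, flagged_ids, p_risk):
    """Penalize flagged tokens BEFORE sampling when predicted risk is high,
    i.e. predictive control rather than a post-hoc repetition penalty."""
    if p_risk <= THRESHOLD:
        return logits
    return [l - PENALTY if i in flagged_ids else l
            for i, l in enumerate(logits)]
```

When the predicted risk stays below the threshold, logits pass through untouched; the penalty is applied only on predicted failure, not after the repetition or hedging has already appeared in the output.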
``` ┌─────────────────────────────────────────────────────────────────────────────┐ │ CF-HoT ARCHITECTURE │ ├─────────────────────────────────────────────────────────────────────────────┤ │ │ │ Hidden States (Layers 16-24) │ │ │ │ │ ▼ │ │ ┌─────────────────┐ │ │ │ Fiber Projection │ Compress to d=16 per layer │ │ └────────┬────────┘ │ │ │ │ │ ▼ │ │ ┌─────────────────┐ │ │ │ Layer Attention │ Weighted aggregation across layers │ │ └────────┬────────┘ │ │ │ │ │ ▼ │ │ ┌─────────────────┐ │ │ │ Risk Predictor │ Binary classifier: P(unwanted_behavior) │ │ └────────┬────────┘ │ │ │ │ │ ▼ │ │ If P > threshold ──► Apply logit penalties │ │ │ └─────────────────────────────────────────────────────────────────────────────┘ ``` **Head Performance:** | Head | Separation | Description | |------|------------|-------------| | Repetition | 125× | Detects impending repetitive loops | | Hedging | 1.5× | Blocks uncertainty markers | | Verbosity | 2.1× | Suppresses filler content | The repetition head achieves 125× separation between positive (pre-repetition) and negative (diverse output) hidden states, enabling reliable early warning. ### 2. 
The Condensator: Dense Response Training 4-stage training pipeline: ``` ┌─────────────────────────────────────────────────────────────────────────────┐ │ THE CONDENSATOR PIPELINE │ ├─────────────────────────────────────────────────────────────────────────────┤ │ │ │ STAGE 1: Supervised Fine-Tuning (SFT) │ │ ───────────────────────────────────── │ │ • 847 curated dense response examples │ │ • Learning rate: 2e-5 │ │ • Epochs: 3 │ │ │ │ STAGE 2: Direct Preference Optimization (DPO) │ │ ───────────────────────────────────────────── │ │ • Preference pairs: dense (chosen) vs verbose (rejected) │ │ • Beta: 0.1 │ │ • Epochs: 2 │ │ │ │ STAGE 3: Reinforcement Learning (PPO) │ │ ───────────────────────────────────── │ │ • Reward = quality_score - length_penalty │ │ • Conservative KL constraint │ │ • Learning rate: 1e-6 │ │ │ │ STAGE 4: Checkpointing │ │ ───────────────────── │ │ • Save every 25 steps │ │ • A/B comparison on held-out prompts │ │ • Automatic rollback if quality drops │ │ │ └─────────────────────────────────────────────────────────────────────────────┘ ``` ### 3. 
Enhanced CF-HoT Parameters | Parameter | Value | Reason | |-----------|-------|--------| | EMA Momentum | 0.995 | Stable control field | | Gate Temperature | 2.0 | Softer sigmoid | | Gate Bounds | [0.1, 0.9] | Prevent saturation | | Monitoring | Every 50 steps | Detect drift | | Warmup | 500 steps | Smooth initialization | --- ## Command Reference ### Core Commands | Command | Description | |---------|-------------| | `status` | System status overview | | `help` | Full command menu | | `help <topic>` | Topic-specific help | | `quit` | Exit | ### Self-Improvement | Command | Description | |---------|-------------| | `!improve` | Run improvement iteration | | `!eval` | Full evaluation | | `!train <n>` | Run `<n>` training steps | | `!compare` | Compare checkpoints | | `!rollback` | Revert to best checkpoint | | `!load <name>` | Load checkpoint | | `!benchmark` | Evaluation suite | ### Mentor Mode | Command | Description | |---------|-------------| | `!mentor` | Show mentor mode status | | `!mentor on` | Enable auto-consultation | | `!mentor off` | Disable mentor mode | | `!mentor ask <prompt>` | Ask Claude and learn from response | | `!mentor learn` | Show collected learnings | ### RSI (Recursive Self-Improvement) | Command | Description | |---------|-------------| | `!auto_train on` | Enable learning during chat | | `!auto_train off` | Disable auto-training | | `!skills` | Quality per domain | | `!forgetting` | Detect catastrophic forgetting | | `!dream` | Force experience replay | | `!buffer` | Experience buffer stats | | `!selfplay <n>` | Run `<n>` self-play iterations | ### Condensator | Command | Description | |---------|-------------| | `!condensator` | Run full SFT→DPO→RL pipeline | | `!dpo` | Run DPO stage only | | `!rl` | Run RL stage only | | `!train_cfhot` | Train CF-HoT heads | ### CF-HoT Control | Command | Description | |---------|-------------| | `!cfhot` / `!125x` | Toggle 125× head | | `!cfhot status` | Head status | | `!gate_stats` | CF-HoT gate health | ### Generation Modes | Command |
Description | |---------|-------------| | `!book` | Toggle book mode (16K tokens) | | `!write <topic>` | Write extended content | | `!claude <prompt>` | Direct Claude API prompt | ### Tools | Command | Description | |---------|-------------| | `!shell <cmd>` | Execute shell command | | `!python <code>` | Execute Python | | `!read <file>` | Read file | | `!write <file>` | Write file | | `!search <query>` | Web search | | `!fetch <url>` | Fetch URL content | ### Browser (requires Playwright) | Command | Description | |---------|-------------| | `!browse <url>` | Open URL | | `!click <selector>` | Click element | | `!type <text>` | Type text | | `!read` | Read page content | ### Multimedia (optional dependencies) | Command | Description | |---------|-------------| | `!stream` | Open live token window | | `!audio` / `!tts` | Toggle text-to-speech | | `!imagine <prompt>` | Generate image (SDXL) | | `!dalle <prompt>` | Generate image (DALL-E 3) | ### Experimental Features | Command | Description | |---------|-------------| | `!content blog <topic>` | Generate blog post | | `!content youtube <topic>` | Generate video script | --- ## Evaluation ### Qualitative Comparison | Prompt | Base Hermes-3 | ARC-Condensed | |--------|---------------|---------------| | "hello" | "Hello! I'm here to help you with any questions or tasks you might have. Feel free to ask me anything!" (23 tokens) | "Hello. How can I help?" (5 tokens) | | "What is recursion?" | "That's a great question! Recursion is a programming concept where a function calls itself..." (150+ tokens) | "Function calling itself until base case. Stack frames accumulate, unwind on return." (12 tokens) | | "How are you?" | "As an AI, I don't have feelings in the traditional sense, but I'm functioning well..." (25 tokens) | "Functional. Task?" (3 tokens) | ### Quantitative Metrics | Metric | Base Model | ARC-Condensed | Change | |--------|------------|---------------|--------| | Avg.
Response Length | 150 tokens | 45 tokens | -70% | | Filler Phrases | Present | Minimal | ~-95% | | Information Density | 17.0 | 45.2 | +166% | | Quality Score (internal) | 0.52 | 0.78 | +50% | **Note:** These are heuristic metrics from internal evaluation. Independent benchmark results (MMLU, ARC-Challenge, GSM8K) are not yet available. We welcome independent evaluation. ### Self-Improvement Trajectory (Observed) ``` Iteration 0: Quality 0.52 (baseline) Iteration 5: Quality 0.68 (+31%) Iteration 10: Quality 0.75 (+44%) Iteration 15: Quality 0.78 (+50%, plateau) ``` Self-improvement shows diminishing returns after ~15 iterations. This is expected behavior, not a limitation to work around. --- ## Installation ### Minimal Installation ```bash pip install torch transformers accelerate peft bitsandbytes datasets trl ``` ### Full Installation ```bash pip install -r requirements.txt ``` ### Optional Dependencies ```bash # Browser automation pip install playwright && playwright install firefox # Image generation pip install diffusers pillow # Text-to-speech pip install pyttsx3 gTTS pygame # Claude API (for mentor mode) pip install anthropic # OpenAI API (for DALL-E) pip install openai # Web search pip install requests ``` ### Environment Variables ```bash # Optional - for enhanced features export ANTHROPIC_API_KEY="sk-ant-..." # Mentor Mode export OPENAI_API_KEY="sk-..." 
# DALL-E ``` --- ## Configuration ### Main Configuration ```python class Config: # Generation temperature = 0.85 top_p = 0.9 max_new_tokens = 512 repetition_penalty = 1.1 # CF-HoT use_cfhot = True use_cfhot_125x = False cfhot_repetition_threshold = 0.6 cfhot_repetition_penalty = 6.0 # Self-improvement min_quality_score = 0.5 target_quality_score = 0.75 training_steps_per_iteration = 25 quality_drop_threshold = 0.1 ``` ### RSI Configuration ```python @dataclass class RSIConfig: auto_train_enabled: bool = False buffer_size: int = 1000 min_experiences_to_train: int = 50 quality_threshold_for_training: float = 0.7 dream_cycle_interval: int = 100 forgetting_check_interval: int = 50 ``` ### Mentor Configuration ```python @dataclass class MentorConfig: enabled: bool = False auto_consult_threshold: float = 0.6 uncertainty_threshold: float = 0.4 learn_from_responses: bool = True ``` --- ## Repository Structure ``` ARC-Base-8B-Condensed/ │ ├── arc_engine_v29_full.py # Main engine ├── README.md # This file ├── requirements.txt # Dependencies │ ├── model-00001-of-00004.safetensors # Model weights ├── model-00002-of-00004.safetensors ├── model-00003-of-00004.safetensors ├── model-00004-of-00004.safetensors ├── config.json ├── tokenizer.json ├── tokenizer_config.json ├── special_tokens_map.json ├── generation_config.json │ ├── dense_checkpoints/ # Training checkpoints │ └── step_*/ │ ├── cfhot_checkpoints/ # CF-HoT heads │ └── final_6000/ │ └── risk_predictor.pt │ ├── improvement_logs/ # RSI logs └── exports/ # Checkpoint exports ``` --- ## Hardware Requirements | Component | Minimum | Recommended | |-----------|---------|-------------| | GPU VRAM | 16 GB | 24+ GB | | System RAM | 32 GB | 64 GB | | Storage | 50 GB | 100 GB | | Python | 3.10+ | 3.11 | **Tested Configurations:** - NVIDIA RTX 3090 (24GB), 64GB RAM ✓ - NVIDIA RTX 4090 (24GB), 128GB RAM ✓ - NVIDIA A100 (40GB) ✓ **Performance Estimates:** - Inference: ~15-25 tokens/second - Full Condensator pipeline: ~4 hours (RTX 
3090) - Self-improvement iteration: ~30 minutes --- ## Training From Scratch ### Automated Training ```bash python arc_engine_v29_full.py ``` Then, at the interactive prompt: ``` > !condensator ``` This runs: 1. SFT (3 epochs) 2. DPO (2 epochs) 3. RL (300 steps) 4. Checkpoint validation ### Manual Training **Step 1: Train CF-HoT Heads** ``` > !train_cfhot ``` **Step 2: Run Condensator** ``` > !condensator ``` **Step 3: Self-Improvement** ``` > !selfplay 1000 ``` --- ## API Reference ### Start Server ``` > !api [api] Server running on http://0.0.0.0:8080 ``` ### Endpoints #### POST /generate ```bash curl -X POST http://localhost:8080/generate \ -H "Content-Type: application/json" \ -d '{"prompt": "What is recursion?"}' ``` Response: ```json { "response": "Function calling itself until base case.", "quality": 0.82, "density": 48.3, "tokens": 8 } ``` #### GET /health ```bash curl http://localhost:8080/health ``` --- ## Limitations ### Known Limitations | Limitation | Description | |------------|-------------| | **Scale** | Tested on 8B parameters only; scaling behavior unknown | | **Language** | English only | | **Benchmarks** | No formal benchmark results (MMLU, GSM8K, etc.)
| | **Terseness** | May be too concise for applications requiring elaboration | | **Iterations** | Self-improvement plateaus after ~15 iterations | | **Memory** | Full features require 16GB+ VRAM | ### What This Is Not - This is **not** AGI or a path to AGI - This is **not** a production-ready system - Self-improvement is **bounded and reversible** - The model **requires human oversight** - Claims are **not independently validated** --- ## Ethical Considerations ### Safety Measures - **Quality gates:** All self-modification requires quality validation - **Automatic rollback:** Degradation triggers checkpoint restoration - **Bounded improvement:** No unbounded recursive self-modification - **Human oversight:** System designed for interactive use, not autonomy ### Potential Risks - Dense responses may omit important caveats or safety information - Self-improvement research requires careful monitoring - Model inherits biases from base Hermes-3 and training data - Experimental features should not be used for consequential decisions ### Explicit Non-Goals This system is **not designed for:** - Autonomous operation without human oversight - Self-replication or self-preservation - Deception or manipulation - Capability acquisition beyond defined scope --- ## Technical Specification Full technical documentation is available: - **Primary Reference (Master Book):** [Controlled Language Models: Decode-Time Behavioral Control and Token Efficiency](https://doi.org/10.5281/zenodo.18344021) - **Related Preprints:** - [From Explicit Holonomy to Latent Control Fields](https://zenodo.org/records/14707164) - [The Holonomy Transformer](https://zenodo.org/records/14707081) The specification covers: - Multi-loop training architecture - Control field theory and implementation - Tokenization co-evolution (fourth loop) - Reliability engineering and rollback protocols - Reproducibility requirements --- ## Changelog ### v2.9 (Current) - Stealth web browser for research - Improved training 
functions - Bug fixes for selfplay training loop ### v2.8 - Full RSI continuous learning system - Auto-train during chat - Dream cycles for experience replay - Domain-specific skill tracking - Catastrophic forgetting detection ### v2.4 - Mentor Mode: Learn from Claude API - Content generation tools - Smart help system ### v2.2 - Full CONDENSATOR pipeline - Enhanced CF-HoT with EMA, gate temperature - DPO and RL training stages ### v2.0 - Initial release - CF-HoT 125× repetition head - Dense response training - Basic self-improvement loop --- ## Citation ```bibtex @software{napolitano2025arc, author = {Napolitano, Logan Matthew}, title = {{ARC-Base-8B-Condensed}: Adaptive Recursive Cognition for Self-Stabilizing Language Models}, year = {2025}, publisher = {Hugging Face}, url = {https://huggingface.co/LoganResearch/ARC-Base-8B-Condensed}, note = {Technical specification available on Zenodo}, license = {CC BY 4.0} } ``` ```bibtex @article{napolitano2025controlled, author = {Napolitano, Logan Matthew}, title = {Controlled Language Models: Decode-Time Behavioral Control and Token Efficiency}, year = {2025}, doi = {10.5281/zenodo.18344021}, url = {https://zenodo.org/records/18344021}, publisher = {Zenodo}, note = {Primary technical reference for ARC-Base-8B-Condensed} } ``` ```bibtex @article{napolitano2025controlfield, author = {Napolitano, Logan Matthew}, title = {From Explicit Holonomy to Latent Control Fields}, year = {2025}, doi = {10.5281/zenodo.14707164}, url = {https://zenodo.org/records/14707164}, publisher = {Zenodo} } ``` ## References 1. Zou, A., et al. (2023). Representation Engineering: A Top-Down Approach to AI Transparency. arXiv:2310.01405 2. Rafailov, R., et al. (2023). Direct Preference Optimization. arXiv:2305.18290 3. Hu, E. J., et al. (2021). LoRA: Low-Rank Adaptation of Large Language Models. arXiv:2106.09685 4. Ouyang, L., et al. (2022). Training language models to follow instructions with human feedback. NeurIPS. 
--- ## Acknowledgments - **NousResearch** for Hermes-3-Llama-3.1-8B base model - **Meta AI** for Llama 3.1 architecture - **Hugging Face** for transformers, PEFT, TRL - **Anthropic** for Claude API (Mentor Mode) --- ## License This work is licensed under [CC BY 4.0](https://creativecommons.org/licenses/by/4.0/) (Creative Commons Attribution 4.0 International). You are free to: - **Share** — copy and redistribute the material in any medium or format - **Adapt** — remix, transform, and build upon the material for any purpose, including commercial Under the following terms: - **Attribution** — You must give appropriate credit, provide a link to the license, and indicate if changes were made. ---
**Contact:** [GitHub Issues](https://github.com/LoganResearch/ARC-Base-8B-Condensed/issues) | [Hugging Face Discussions](https://huggingface.co/LoganResearch/ARC-Base-8B-Condensed/discussions) **Version:** 2.9 | **Last Updated:** January 2025