ARC-Base-8B-Condensed

Adaptive Recursive Cognition

A Multi-Loop Self-Stabilizing Language Model with Predictive Control

Logan Matthew Napolitano

License: CC BY 4.0 · Python 3.10+ · Base Model

Research into stable self-improving language models



Table of Contents

  1. Model Description
  2. Quick Start
  3. Architecture
  4. Core Technology
  5. Command Reference
  6. Evaluation
  7. Installation
  8. Configuration
  9. Repository Structure
  10. Hardware Requirements
  11. Training From Scratch
  12. API Reference
  13. Limitations
  14. Ethical Considerations
  15. Technical Specification
  16. Changelog
  17. Citation
  18. License

Primary Reference

The complete theoretical framework, methodology, and reproducibility details for this model are documented in:

Napolitano, L. M. (2025). Controlled Language Models: Decode-Time Behavioral Control and Token Efficiency.
Zenodo. https://doi.org/10.5281/zenodo.18344021

This paper should be cited for any academic or technical use of ARC-Base-8B-Condensed.

Model Description

ARC-Base-8B-Condensed is a fine-tuned version of Hermes-3-Llama-3.1-8B designed for:

  1. Dense, information-rich responses — Reduced filler, hedging, and verbosity
  2. Predictive behavioral control — CF-HoT heads detect and suppress failure modes before they manifest
  3. Recursive self-improvement — Micro-training with automatic rollback on quality degradation
  4. Mentor-based learning — Optional consultation with Claude API for continuous improvement

Intended Use

  • Research into self-improving language models
  • Applications requiring concise, direct responses
  • Study of representation engineering and behavioral control
  • Base for further fine-tuning experiments

Not Intended For

  • Production deployment without evaluation
  • Safety-critical applications
  • Unsupervised autonomous operation
  • Applications requiring verbose, elaborative responses

Quick Start

One-Command Start

git clone https://huggingface.co/LoganResearch/ARC-Base-8B-Condensed
cd ARC-Base-8B-Condensed
pip install -r requirements.txt
python arc_engine_v29_full.py

On first run, the engine will:

  1. Download the base model (~16GB)
  2. Load the DENSE adapter and CF-HoT heads
  3. Initialize all subsystems
  4. Present an interactive command prompt
═══════════════════════════════════════════════════════════════════════════════
  ARC ENGINE v2.9 - Adaptive Recursive Cognition
  Multi-Loop Self-Stabilizing Language Model
═══════════════════════════════════════════════════════════════════════════════
    DENSE Mode:      ON (CONDENSATOR checkpoint)
    CF-HoT Control:  ON
    CF-HoT 125×:     OFF
    Mentor Mode:     OFF
    Auto-Train:      OFF
    Experience Buffer: 0 examples
═══════════════════════════════════════════════════════════════════════════════

You> hello
Hello. How can I help?

[Quality: 0.82 | Density: 45.2 | Coherence: 0.95 | Tokens: 5]

Minimal Python Usage

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model = AutoModelForCausalLM.from_pretrained(
    "LoganResearch/ARC-Base-8B-Condensed",
    torch_dtype=torch.bfloat16,
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained("LoganResearch/ARC-Base-8B-Condensed")

prompt = "<|im_start|>user\nExplain gradient descent briefly.<|im_end|>\n<|im_start|>assistant\n"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=100, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

Architecture

System Overview

┌─────────────────────────────────────────────────────────────────────────────┐
│                         ARC ENGINE ARCHITECTURE                             │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                             │
│  ┌─────────────────────────────────────────────────────────────────────┐   │
│  │                         INPUT PROCESSING                             │   │
│  │  User Input → Command Parser → Generate / Tool Execute               │   │
│  └─────────────────────────────────────────────────────────────────────┘   │
│                                    │                                        │
│                                    ▼                                        │
│  ┌─────────────────────────────────────────────────────────────────────┐   │
│  │                         CORE MODEL STACK                             │   │
│  ├─────────────────────────────────────────────────────────────────────┤   │
│  │                                                                       │   │
│  │   Base Model: Hermes-3-Llama-3.1-8B (8B parameters)                  │   │
│  │        │                                                              │   │
│  │        ▼                                                              │   │
│  │   DENSE Adapter ─── THE CONDENSATOR trained (SFT→DPO→RL)             │   │
│  │        │                                                              │   │
│  │        ▼                                                              │   │
│  │   CF-HoT Heads ─── Repetition (125×), Hedging, Verbosity             │   │
│  │        │                                                              │   │
│  │        ▼                                                              │   │
│  │   Output Generation ─── Quality-controlled, density-optimized         │   │
│  │                                                                       │   │
│  └─────────────────────────────────────────────────────────────────────┘   │
│                                    │                                        │
│                                    ▼                                        │
│  ┌─────────────────────────────────────────────────────────────────────┐   │
│  │                       QUALITY EVALUATION                             │   │
│  │  Response → Density Score → Coherence Score → Overall Quality        │   │
│  │                    │                                                  │   │
│  │                    ▼                                                  │   │
│  │  ┌──────────────────────────────────────────────────────────────┐   │   │
│  │  │ Mentor Mode Check: Quality < 0.6 OR Uncertainty > 0.4?       │   │   │
│  │  │      │ Yes                                                    │   │   │
│  │  │      ▼                                                        │   │   │
│  │  │ Consult Claude → Learn from Response → Update Training Buffer │   │   │
│  │  └──────────────────────────────────────────────────────────────┘   │   │
│  └─────────────────────────────────────────────────────────────────────┘   │
│                                    │                                        │
│                                    ▼                                        │
│  ┌─────────────────────────────────────────────────────────────────────┐   │
│  │                      RSI EXPERIENCE BUFFER                           │   │
│  │  Store: prompt, response, quality, domain, difficulty, feedback      │   │
│  │                    │                                                  │   │
│  │         ┌──────────┴──────────┐                                      │   │
│  │         ▼                     ▼                                      │   │
│  │  Auto-Train Trigger?    Dream Cycle?                                 │   │
│  │         │                     │                                      │   │
│  │         ▼                     ▼                                      │   │
│  │  Micro-training        Experience Replay                             │   │
│  │  (25 steps)            (Reinforce learnings)                         │   │
│  └─────────────────────────────────────────────────────────────────────┘   │
│                                    │                                        │
│                                    ▼                                        │
│  ┌─────────────────────────────────────────────────────────────────────┐   │
│  │                      VALIDATION & COMMIT                             │   │
│  │  New Quality vs Old Quality → Better? COMMIT : ROLLBACK              │   │
│  └─────────────────────────────────────────────────────────────────────┘   │
│                                                                             │
└─────────────────────────────────────────────────────────────────────────────┘

RSI Loop (Recursive Self-Improvement)

┌─────────────────────────────────────────────────────────────────────────────┐
│                    RECURSIVE SELF-IMPROVEMENT LOOP                          │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                             │
│   ┌─────────┐                                                               │
│   │  CHAT   │◄─────────────────────────────────────────────────┐           │
│   └────┬────┘                                                   │           │
│        │                                                        │           │
│        ▼                                                        │           │
│   ┌─────────┐                                                   │           │
│   │ MEASURE │ Calculate quality, density, coherence             │           │
│   └────┬────┘                                                   │           │
│        │                                                        │           │
│        ▼                                                        │           │
│   ┌─────────┐                                                   │           │
│   │ BUFFER  │ Store in experience buffer with metadata          │           │
│   └────┬────┘                                                   │           │
│        │                                                        │           │
│        ▼                                                        │           │
│   ┌──────────────┐                                              │           │
│   │ AUTO-TRIGGER │ Buffer full? Quality threshold? Feedback?    │           │
│   └──────┬───────┘                                              │           │
│          │                                                      │           │
│     Yes  │  No ─────────────────────────────────────────────────┘           │
│          │                                                                  │
│          ▼                                                                  │
│   ┌─────────────┐                                                           │
│   │ MICRO-TRAIN │ 25 steps on high-quality buffer samples                   │
│   └──────┬──────┘                                                           │
│          │                                                                  │
│          ▼                                                                  │
│   ┌─────────────┐                                                           │
│   │  VALIDATE   │ Compare new model vs checkpoint                           │
│   └──────┬──────┘                                                           │
│          │                                                                  │
│     ┌────┴────┐                                                             │
│     │         │                                                             │
│  Better?   Worse?                                                           │
│     │         │                                                             │
│     ▼         ▼                                                             │
│  COMMIT    ROLLBACK                                                         │
│     │         │                                                             │
│     └────┬────┘                                                             │
│          │                                                                  │
│          ▼                                                                  │
│   Continue ─────────────────────────────────────────────────────────────────┘
│                                                                             │
└─────────────────────────────────────────────────────────────────────────────┘
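
The VALIDATE → COMMIT/ROLLBACK decision at the bottom of the loop can be sketched as follows. This is a minimal illustration only: `validate_and_commit`, the checkpoint-directory layout, and the quality arguments are assumptions, not the engine's actual internals.

```python
import shutil
from pathlib import Path

def validate_and_commit(new_quality: float, best_quality: float,
                        candidate_dir: Path, best_dir: Path) -> bool:
    """Commit the micro-trained candidate if quality improved; otherwise roll back."""
    if new_quality > best_quality:
        # COMMIT: the candidate becomes the new best checkpoint
        shutil.rmtree(best_dir, ignore_errors=True)
        shutil.copytree(candidate_dir, best_dir)
        return True
    # ROLLBACK: discard the candidate, keep the previous best
    shutil.rmtree(candidate_dir, ignore_errors=True)
    return False
```

Because the candidate is written to a separate directory and only promoted after comparison, a failed micro-training step never overwrites the last known-good weights.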

Mentor Mode Flow

┌─────────────────────────────────────────────────────────────────────────────┐
│                         MENTOR MODE LEARNING FLOW                           │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                             │
│   User Prompt                                                               │
│        │                                                                    │
│        ▼                                                                    │
│   ┌─────────────────┐                                                       │
│   │ Local Generation │ Generate response with local 8B model                │
│   └────────┬────────┘                                                       │
│            │                                                                │
│            ▼                                                                │
│   ┌─────────────────┐                                                       │
│   │ Quality Check   │ Evaluate density, coherence, quality                  │
│   └────────┬────────┘                                                       │
│            │                                                                │
│            ▼                                                                │
│   ┌────────────────────────────────────┐                                    │
│   │ Quality < 0.6 OR Uncertainty > 0.4 │                                    │
│   └────────┬───────────────────────────┘                                    │
│            │                                                                │
│       Yes  │  No ──────────► Return local response                          │
│            │                                                                │
│            ▼                                                                │
│   ┌─────────────────┐                                                       │
│   │ Consult Claude  │ Via API                                               │
│   └────────┬────────┘                                                       │
│            │                                                                │
│            ▼                                                                │
│   ┌─────────────────┐                                                       │
│   │ Create DPO Pair │                                                       │
│   │ chosen: Claude  │                                                       │
│   │ rejected: Local │                                                       │
│   └────────┬────────┘                                                       │
│            │                                                                │
│            ▼                                                                │
│   ┌─────────────────┐                                                       │
│   │ Add to Buffer   │ High-quality experience for training                  │
│   └────────┬────────┘                                                       │
│            │                                                                │
│            ▼                                                                │
│   Return Claude's response + log learning                                   │
│                                                                             │
└─────────────────────────────────────────────────────────────────────────────┘
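
The threshold check and DPO-pair construction in this flow reduce to a few lines. The thresholds below mirror the diagram (quality < 0.6 OR uncertainty > 0.4), but the function names and the stand-in for the Claude API call are illustrative assumptions, not the engine's code.

```python
QUALITY_THRESHOLD = 0.6
UNCERTAINTY_THRESHOLD = 0.4

def should_consult(quality: float, uncertainty: float) -> bool:
    """Trigger mentor consultation when the local response looks weak or uncertain."""
    return quality < QUALITY_THRESHOLD or uncertainty > UNCERTAINTY_THRESHOLD

def make_dpo_pair(prompt: str, local_response: str, mentor_response: str) -> dict:
    """Build a DPO preference pair: mentor (Claude) chosen, local 8B rejected."""
    return {"prompt": prompt, "chosen": mentor_response, "rejected": local_response}
```

Each stored pair later serves as a high-quality training example in the experience buffer, so the model is steered toward the mentor's response distribution only on prompts where it demonstrably underperformed.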

Core Technology

1. CF-HoT: Control-Field Holonomy

Predictive control through hidden-state monitoring. Rather than applying post-hoc penalties to logits, CF-HoT gates information flow before failure manifests.

┌─────────────────────────────────────────────────────────────────────────────┐
│                        CF-HoT ARCHITECTURE                                  │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                             │
│   Hidden States (Layers 16-24)                                              │
│            │                                                                │
│            ▼                                                                │
│   ┌─────────────────┐                                                       │
│   │ Fiber Projection │ Compress to d=16 per layer                          │
│   └────────┬────────┘                                                       │
│            │                                                                │
│            ▼                                                                │
│   ┌─────────────────┐                                                       │
│   │ Layer Attention  │ Weighted aggregation across layers                   │
│   └────────┬────────┘                                                       │
│            │                                                                │
│            ▼                                                                │
│   ┌─────────────────┐                                                       │
│   │ Risk Predictor   │ Binary classifier: P(unwanted_behavior)              │
│   └────────┬────────┘                                                       │
│            │                                                                │
│            ▼                                                                │
│   If P > threshold ──► Apply logit penalties                                │
│                                                                             │
└─────────────────────────────────────────────────────────────────────────────┘

Head Performance:

Head         Separation   Description
Repetition   125×         Detects impending repetitive loops
Hedging      1.5×         Blocks uncertainty markers
Verbosity    2.1×         Suppresses filler content

The repetition head achieves 125× separation between positive (pre-repetition) and negative (diverse output) hidden states, enabling reliable early warning.
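
A head of this shape could be written in PyTorch roughly as below. This is a hypothetical reconstruction from the diagram: the layer count (9 layers for 16-24), the d=16 fiber dimension, and all module names are assumptions, not the released implementation.

```python
import torch
import torch.nn as nn

class CFHoTHead(nn.Module):
    def __init__(self, hidden_dim: int = 4096, n_layers: int = 9, fiber_dim: int = 16):
        super().__init__()
        # Fiber projection: compress each monitored layer's hidden state to d=16
        self.fiber = nn.ModuleList(nn.Linear(hidden_dim, fiber_dim) for _ in range(n_layers))
        # Layer attention: learned weights for aggregating across the monitored layers
        self.layer_logits = nn.Parameter(torch.zeros(n_layers))
        # Risk predictor: binary classifier over the aggregated fiber
        self.classifier = nn.Linear(fiber_dim, 1)

    def forward(self, hidden_states):
        # hidden_states: list of (batch, hidden_dim) tensors, one per monitored layer
        fibers = torch.stack([proj(h) for proj, h in zip(self.fiber, hidden_states)], dim=1)
        weights = torch.softmax(self.layer_logits, dim=0)          # (n_layers,)
        pooled = (weights[None, :, None] * fibers).sum(dim=1)      # (batch, fiber_dim)
        return torch.sigmoid(self.classifier(pooled)).squeeze(-1)  # P(unwanted_behavior)
```

At decode time, a probability above the head's threshold triggers the logit penalties shown in the diagram, rather than editing hidden states directly.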

2. The Condensator: Dense Response Training

4-stage training pipeline:

┌─────────────────────────────────────────────────────────────────────────────┐
│                       THE CONDENSATOR PIPELINE                              │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                             │
│  STAGE 1: Supervised Fine-Tuning (SFT)                                      │
│  ─────────────────────────────────────                                      │
│  • 847 curated dense response examples                                      │
│  • Learning rate: 2e-5                                                      │
│  • Epochs: 3                                                                │
│                                                                             │
│  STAGE 2: Direct Preference Optimization (DPO)                              │
│  ─────────────────────────────────────────────                              │
│  • Preference pairs: dense (chosen) vs verbose (rejected)                   │
│  • Beta: 0.1                                                                │
│  • Epochs: 2                                                                │
│                                                                             │
│  STAGE 3: Reinforcement Learning (PPO)                                      │
│  ─────────────────────────────────────                                      │
│  • Reward = quality_score - length_penalty                                  │
│  • Conservative KL constraint                                               │
│  • Learning rate: 1e-6                                                      │
│                                                                             │
│  STAGE 4: Checkpointing                                                     │
│  ─────────────────────                                                      │
│  • Save every 25 steps                                                      │
│  • A/B comparison on held-out prompts                                       │
│  • Automatic rollback if quality drops                                      │
│                                                                             │
└─────────────────────────────────────────────────────────────────────────────┘
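
The Stage 3 reward signal (quality minus length penalty) might look like the sketch below. The target length and per-token penalty coefficient are illustrative assumptions, not the released hyperparameters.

```python
def condensator_reward(quality_score: float, n_tokens: int,
                       target_tokens: int = 45, penalty_per_token: float = 0.005) -> float:
    """reward = quality_score - length_penalty, penalizing tokens beyond the target."""
    length_penalty = penalty_per_token * max(0, n_tokens - target_tokens)
    return quality_score - length_penalty
```

Penalizing only the excess beyond a target length, rather than all tokens, keeps the RL stage from collapsing responses to near-empty strings while still rewarding concision.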

3. Enhanced CF-HoT Parameters

Parameter          Value            Reason
EMA Momentum       0.995            Stable control field
Gate Temperature   2.0              Softer sigmoid
Gate Bounds        [0.1, 0.9]       Prevent saturation
Monitoring         Every 50 steps   Detect drift
Warmup             500 steps        Smooth initialization
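
Taken together, these parameters suggest a gate update along the following lines: an EMA-smoothed control signal passed through a tempered sigmoid and clamped to the stated bounds. This is an interpretive sketch; the control-signal semantics and the update's call site are assumptions.

```python
import math

EMA_MOMENTUM = 0.995        # stable control field
GATE_TEMPERATURE = 2.0      # softer sigmoid
GATE_MIN, GATE_MAX = 0.1, 0.9  # prevent saturation

def update_gate(ema_signal: float, new_signal: float) -> tuple[float, float]:
    """One EMA step on the control signal, then compute the bounded gate value."""
    ema = EMA_MOMENTUM * ema_signal + (1 - EMA_MOMENTUM) * new_signal
    gate = 1.0 / (1.0 + math.exp(-ema / GATE_TEMPERATURE))  # temperature flattens the slope
    gate = min(GATE_MAX, max(GATE_MIN, gate))               # clamp to [0.1, 0.9]
    return ema, gate
```

The high momentum means any single noisy measurement moves the control field by at most 0.5% of its magnitude, and the clamp guarantees the gate can never fully open or fully close.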

Command Reference

Core Commands

Command Description
status System status overview
help Full command menu
help <topic> Topic-specific help
quit Exit

Self-Improvement

Command Description
!improve Run improvement iteration
!eval Full evaluation
!train <steps> Training steps
!compare Compare checkpoints
!rollback Revert to best checkpoint
!load <path> Load checkpoint
!benchmark Evaluation suite

Mentor Mode

Command Description
!mentor Show mentor mode status
!mentor on Enable auto-consultation
!mentor off Disable mentor mode
!mentor ask <question> Ask Claude and learn from response
!mentor learn Show collected learnings

RSI (Recursive Self-Improvement)

Command Description
!auto_train on Enable learning during chat
!auto_train off Disable auto-training
!skills Quality per domain
!forgetting Detect catastrophic forgetting
!dream Force experience replay
!buffer Experience buffer stats
!selfplay <N> Run N self-play iterations

Condensator

Command Description
!condensator Run full SFT→DPO→RL pipeline
!dpo Run DPO stage only
!rl Run RL stage only
!train_cfhot Train CF-HoT heads

CF-HoT Control

Command Description
!cfhot / !125x Toggle 125× head
!cfhot status Head status
!gate_stats CF-HoT gate health

Generation Modes

Command Description
!book Toggle book mode (16K tokens)
!write <topic> Write extended content
!claude <prompt> Direct Claude API prompt

Tools

Command Description
!shell <cmd> Execute shell command
!python <code> Execute Python
!read <path> Read file
!write <path> <content> Write file
!search <query> Web search
!fetch <url> Fetch URL content

Browser (requires Playwright)

Command Description
!browse <url> Open URL
!click <selector> Click element
!type <text> Type text
!read Read page content

Multimedia (optional dependencies)

Command Description
!stream Open live token window
!audio / !tts Toggle text-to-speech
!imagine <prompt> Generate image (SDXL)
!dalle <prompt> Generate image (DALL-E 3)

Experimental Features

Command Description
!content blog <topic> Generate blog post
!content youtube <topic> Generate video script

Evaluation

Qualitative Comparison

Prompt: "hello"
  • Base Hermes-3: "Hello! I'm here to help you with any questions or tasks you might have. Feel free to ask me anything!" (23 tokens)
  • ARC-Condensed: "Hello. How can I help?" (5 tokens)

Prompt: "What is recursion?"
  • Base Hermes-3: "That's a great question! Recursion is a programming concept where a function calls itself..." (150+ tokens)
  • ARC-Condensed: "Function calling itself until base case. Stack frames accumulate, unwind on return." (12 tokens)

Prompt: "How are you?"
  • Base Hermes-3: "As an AI, I don't have feelings in the traditional sense, but I'm functioning well..." (25 tokens)
  • ARC-Condensed: "Functional. Task?" (3 tokens)

Quantitative Metrics

Metric                     Base Model   ARC-Condensed   Change
Avg. Response Length       150 tokens   45 tokens       -70%
Filler Phrases             Present      Minimal         ~-95%
Information Density        17.0         45.2            +166%
Quality Score (internal)   0.52         0.78            +50%

Note: These are heuristic metrics from internal evaluation. Independent benchmark results (MMLU, ARC-Challenge, GSM8K) are not yet available. We welcome independent evaluation.
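
For readers who want a concrete starting point, here is one plausible density heuristic: the ratio of unique content words to total words, scaled to 0-100. The filler list, formula, and scaling are illustrative assumptions; the engine's actual metric is unpublished.

```python
# Hypothetical filler vocabulary; the engine's list, if any, is not published.
FILLER = {"just", "really", "basically", "actually", "very", "quite",
          "certainly", "definitely", "great", "feel", "free"}

def density_score(text: str) -> float:
    """Score unique non-filler words per total words, scaled to 0-100."""
    words = [w.strip(".,!?").lower() for w in text.split()]
    if not words:
        return 0.0
    content = [w for w in words if w and w not in FILLER]
    return 100.0 * len(set(content)) / len(words)
```

Under this heuristic, a terse factual answer scores higher than a padded one because filler words and repeated words inflate the denominator without adding to the unique-content numerator.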

Self-Improvement Trajectory (Observed)

Iteration 0:  Quality 0.52 (baseline)
Iteration 5:  Quality 0.68 (+31%)
Iteration 10: Quality 0.75 (+44%)
Iteration 15: Quality 0.78 (+50%, plateau)

Self-improvement shows diminishing returns after ~15 iterations. This is expected behavior, not a limitation to work around.


Installation

Minimal Installation

pip install torch transformers accelerate peft bitsandbytes datasets trl

Full Installation

pip install -r requirements.txt

Optional Dependencies

# Browser automation
pip install playwright && playwright install firefox

# Image generation
pip install diffusers pillow

# Text-to-speech
pip install pyttsx3 gTTS pygame

# Claude API (for mentor mode)
pip install anthropic

# OpenAI API (for DALL-E)
pip install openai

# Web search
pip install requests

Environment Variables

# Optional - for enhanced features
export ANTHROPIC_API_KEY="sk-ant-..."  # Mentor Mode
export OPENAI_API_KEY="sk-..."          # DALL-E

Configuration

Main Configuration

class Config:
    # Generation
    temperature = 0.85
    top_p = 0.9
    max_new_tokens = 512
    repetition_penalty = 1.1
    
    # CF-HoT
    use_cfhot = True
    use_cfhot_125x = False
    cfhot_repetition_threshold = 0.6
    cfhot_repetition_penalty = 6.0
    
    # Self-improvement
    min_quality_score = 0.5
    target_quality_score = 0.75
    training_steps_per_iteration = 25
    quality_drop_threshold = 0.1

RSI Configuration

@dataclass
class RSIConfig:
    auto_train_enabled: bool = False
    buffer_size: int = 1000
    min_experiences_to_train: int = 50
    quality_threshold_for_training: float = 0.7
    dream_cycle_interval: int = 100
    forgetting_check_interval: int = 50

Mentor Configuration

@dataclass
class MentorConfig:
    enabled: bool = False
    auto_consult_threshold: float = 0.6
    uncertainty_threshold: float = 0.4
    learn_from_responses: bool = True

Repository Structure

ARC-Base-8B-Condensed/
│
├── arc_engine_v29_full.py       # Main engine
├── README.md                     # This file
├── requirements.txt              # Dependencies
│
├── model-00001-of-00004.safetensors  # Model weights
├── model-00002-of-00004.safetensors
├── model-00003-of-00004.safetensors
├── model-00004-of-00004.safetensors
├── config.json
├── tokenizer.json
├── tokenizer_config.json
├── special_tokens_map.json
├── generation_config.json
│
├── dense_checkpoints/            # Training checkpoints
│   └── step_*/
│
├── cfhot_checkpoints/            # CF-HoT heads
│   └── final_6000/
│       └── risk_predictor.pt
│
├── improvement_logs/             # RSI logs
└── exports/                      # Checkpoint exports

Hardware Requirements

Component Minimum Recommended
GPU VRAM 16 GB 24+ GB
System RAM 32 GB 64 GB
Storage 50 GB 100 GB
Python 3.10+ 3.11

Tested Configurations:

  • NVIDIA RTX 3090 (24GB), 64GB RAM ✓
  • NVIDIA RTX 4090 (24GB), 128GB RAM ✓
  • NVIDIA A100 (40GB) ✓

Performance Estimates:

  • Inference: ~15-25 tokens/second
  • Full Condensator pipeline: ~4 hours (RTX 3090)
  • Self-improvement iteration: ~30 minutes

Training From Scratch

Automated Training

python arc_engine_v29_full.py
> !condensator

This runs:

  1. SFT (3 epochs)
  2. DPO (2 epochs)
  3. RL (300 steps)
  4. Checkpoint validation

Manual Training

Step 1: Train CF-HoT Heads

> !train_cfhot

Step 2: Run Condensator

> !condensator

Step 3: Self-Improvement

> !selfplay 1000

API Reference

Start Server

> !api
[api] Server running on http://0.0.0.0:8080

Endpoints

POST /generate

curl -X POST http://localhost:8080/generate \
  -H "Content-Type: application/json" \
  -d '{"prompt": "What is recursion?"}'

Response:

{
  "response": "Function calling itself until base case.",
  "quality": 0.82,
  "density": 48.3,
  "tokens": 8
}

GET /health

curl http://localhost:8080/health
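
From Python, the same endpoint can be called with only the standard library. This assumes the `!api` server from above is running locally; `build_request` is a helper introduced here for illustration, not part of the engine.

```python
import json
import urllib.request

def build_request(prompt: str, host: str = "http://localhost:8080") -> urllib.request.Request:
    """Construct the POST /generate request shown in the curl example."""
    payload = json.dumps({"prompt": prompt}).encode("utf-8")
    return urllib.request.Request(f"{host}/generate", data=payload,
                                  headers={"Content-Type": "application/json"})

def generate(prompt: str, host: str = "http://localhost:8080") -> dict:
    """Send the request and decode the JSON response (requires a running server)."""
    with urllib.request.urlopen(build_request(prompt, host)) as resp:
        return json.loads(resp.read())
```

The returned dict carries the same fields as the curl example (`response`, `quality`, `density`, `tokens`).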

Limitations

Known Limitations

Limitation Description
Scale Tested on 8B parameters only; scaling behavior unknown
Language English only
Benchmarks No formal benchmark results (MMLU, GSM8K, etc.)
Terseness May be too concise for applications requiring elaboration
Iterations Self-improvement plateaus after ~15 iterations
Memory Full features require 16GB+ VRAM

What This Is Not

  • This is not AGI or a path to AGI
  • This is not a production-ready system
  • Self-improvement is bounded and reversible
  • The model requires human oversight
  • Claims are not independently validated

Ethical Considerations

Safety Measures

  • Quality gates: All self-modification requires quality validation
  • Automatic rollback: Degradation triggers checkpoint restoration
  • Bounded improvement: No unbounded recursive self-modification
  • Human oversight: System designed for interactive use, not autonomy

Potential Risks

  • Dense responses may omit important caveats or safety information
  • Self-improvement research requires careful monitoring
  • Model inherits biases from base Hermes-3 and training data
  • Experimental features should not be used for consequential decisions

Explicit Non-Goals

This system is not designed for:

  • Autonomous operation without human oversight
  • Self-replication or self-preservation
  • Deception or manipulation
  • Capability acquisition beyond defined scope

Technical Specification

Full technical documentation is available in the primary reference (Napolitano, 2025; DOI above).

The specification covers:

  • Multi-loop training architecture
  • Control field theory and implementation
  • Tokenization co-evolution (fourth loop)
  • Reliability engineering and rollback protocols
  • Reproducibility requirements

Changelog

v2.9 (Current)

  • Stealth web browser for research
  • Improved training functions
  • Bug fixes for selfplay training loop

v2.8

  • Full RSI continuous learning system
  • Auto-train during chat
  • Dream cycles for experience replay
  • Domain-specific skill tracking
  • Catastrophic forgetting detection

v2.4

  • Mentor Mode: Learn from Claude API
  • Content generation tools
  • Smart help system

v2.2

  • Full CONDENSATOR pipeline
  • Enhanced CF-HoT with EMA, gate temperature
  • DPO and RL training stages

v2.0

  • Initial release
  • CF-HoT 125× repetition head
  • Dense response training
  • Basic self-improvement loop

Citation

@software{napolitano2025arc,
  author       = {Napolitano, Logan Matthew},
  title        = {{ARC-Base-8B-Condensed}: Adaptive Recursive Cognition for Self-Stabilizing Language Models},
  year         = {2025},
  publisher    = {Hugging Face},
  url          = {https://huggingface.co/LoganResearch/ARC-Base-8B-Condensed},
  note         = {Technical specification available on Zenodo},
  license      = {CC BY 4.0}
}

@article{napolitano2025controlled,
  author       = {Napolitano, Logan Matthew},
  title        = {Controlled Language Models: Decode-Time Behavioral Control and Token Efficiency},
  year         = {2025},
  doi          = {10.5281/zenodo.18344021},
  url          = {https://zenodo.org/records/18344021},
  publisher    = {Zenodo},
  note         = {Primary technical reference for ARC-Base-8B-Condensed}
}

@article{napolitano2025controlfield,
  author       = {Napolitano, Logan Matthew},
  title        = {From Explicit Holonomy to Latent Control Fields},
  year         = {2025},
  doi          = {10.5281/zenodo.14707164},
  url          = {https://zenodo.org/records/14707164},
  publisher    = {Zenodo}
}

References

  1. Zou, A., et al. (2023). Representation Engineering: A Top-Down Approach to AI Transparency. arXiv:2310.01405
  2. Rafailov, R., et al. (2023). Direct Preference Optimization. arXiv:2305.18290
  3. Hu, E. J., et al. (2021). LoRA: Low-Rank Adaptation of Large Language Models. arXiv:2106.09685
  4. Ouyang, L., et al. (2022). Training language models to follow instructions with human feedback. NeurIPS.

Acknowledgments

  • NousResearch for Hermes-3-Llama-3.1-8B base model
  • Meta AI for Llama 3.1 architecture
  • Hugging Face for transformers, PEFT, TRL
  • Anthropic for Claude API (Mentor Mode)

License

This work is licensed under CC BY 4.0 (Creative Commons Attribution 4.0 International).

You are free to:

  • Share — copy and redistribute the material in any medium or format
  • Adapt — remix, transform, and build upon the material for any purpose, including commercial

Under the following terms:

  • Attribution — You must give appropriate credit, provide a link to the license, and indicate if changes were made.

Contact: GitHub Issues | Hugging Face Discussions

Version: 2.9 | Last Updated: January 2025
