---
license: cc-by-4.0
language:
- en
library_name: transformers
pipeline_tag: text-generation
tags:
- llama
- dense-responses
- self-improvement
- representation-engineering
- cf-hot
- recursive-self-improvement
base_model: NousResearch/Hermes-3-Llama-3.1-8B
---
# ARC-Base-8B-Condensed

**Adaptive Recursive Cognition: A Multi-Loop Self-Stabilizing Language Model with Predictive Control**

Logan Matthew Napolitano
Research into stable self-improving language models

Quick Start • Architecture • Commands • Technical Specification • Citation
## Table of Contents
- Model Description
- Quick Start
- Architecture
- Core Technology
- Command Reference
- Evaluation
- Installation
- Configuration
- Repository Structure
- Hardware Requirements
- Training From Scratch
- API Reference
- Limitations
- Ethical Considerations
- Technical Specification
- Changelog
- Citation
- License
## Primary Reference

The complete theoretical framework, methodology, and reproducibility details for this model are documented in:

> Napolitano, L. M. (2025). *Controlled Language Models: Decode-Time Behavioral Control and Token Efficiency.* Zenodo. https://doi.org/10.5281/zenodo.18344021

This paper should be cited for any academic or technical use of ARC-Base-8B-Condensed.
## Model Description

ARC-Base-8B-Condensed is a fine-tuned version of Hermes-3-Llama-3.1-8B designed for:

- **Dense, information-rich responses** – reduced filler, hedging, and verbosity
- **Predictive behavioral control** – CF-HoT heads detect and suppress failure modes before they manifest
- **Recursive self-improvement** – micro-training with automatic rollback on quality degradation
- **Mentor-based learning** – optional consultation with the Claude API for continuous improvement
### Intended Use

- Research into self-improving language models
- Applications requiring concise, direct responses
- Study of representation engineering and behavioral control
- Base for further fine-tuning experiments

### Not Intended For

- Production deployment without evaluation
- Safety-critical applications
- Unsupervised autonomous operation
- Applications requiring verbose, elaborative responses
## Quick Start

### One-Command Start

```bash
git clone https://huggingface.co/LoganResearch/ARC-Base-8B-Condensed
cd ARC-Base-8B-Condensed
pip install -r requirements.txt
python arc_engine_v29_full.py
```

On first run, the engine will:

1. Download the base model (~16 GB)
2. Load the DENSE adapter and CF-HoT heads
3. Initialize all subsystems
4. Present an interactive command prompt
```text
================================================================================
 ARC ENGINE v2.9 - Adaptive Recursive Cognition
 Multi-Loop Self-Stabilizing Language Model
================================================================================
 DENSE Mode:        ON  (CONDENSATOR checkpoint)
 CF-HoT Control:    ON
 CF-HoT 125x:       OFF
 Mentor Mode:       OFF
 Auto-Train:        OFF
 Experience Buffer: 0 examples
================================================================================

You> hello
Hello. How can I help?
[Quality: 0.82 | Density: 45.2 | Coherence: 0.95 | Tokens: 5]
```
### Minimal Python Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model = AutoModelForCausalLM.from_pretrained(
    "LoganResearch/ARC-Base-8B-Condensed",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("LoganResearch/ARC-Base-8B-Condensed")

prompt = "<|im_start|>user\nExplain gradient descent briefly.<|im_end|>\n<|im_start|>assistant\n"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=100, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
## Architecture

### System Overview

```text
INPUT PROCESSING
    User Input -> Command Parser -> Generate / Tool Execute
        |
        v
CORE MODEL STACK
    Base Model: Hermes-3-Llama-3.1-8B (8B parameters)
        |
        v
    DENSE Adapter       <- trained by THE CONDENSATOR (SFT -> DPO -> RL)
        |
        v
    CF-HoT Heads        <- Repetition (125x), Hedging, Verbosity
        |
        v
    Output Generation   <- quality-controlled, density-optimized
        |
        v
QUALITY EVALUATION
    Response -> Density Score -> Coherence Score -> Overall Quality
        |
        +-- Mentor Mode check: Quality < 0.6 OR Uncertainty > 0.4?
        |     Yes -> Consult Claude -> Learn from Response -> Update Training Buffer
        |
        v
RSI EXPERIENCE BUFFER
    Store: prompt, response, quality, domain, difficulty, feedback
        |
        +-- Auto-Train Trigger? -> Micro-training (25 steps)
        +-- Dream Cycle?        -> Experience Replay (reinforce learnings)
        |
        v
VALIDATION & COMMIT
    New Quality vs Old Quality -> Better? COMMIT : ROLLBACK
```
### RSI Loop (Recursive Self-Improvement)

```text
CHAT
  |
  v
MEASURE        calculate quality, density, coherence
  |
  v
BUFFER         store in experience buffer with metadata
  |
  v
AUTO-TRIGGER   buffer full? quality threshold? feedback?
  |
  +-- No  -> continue chatting (back to CHAT)
  |
 Yes
  |
  v
MICRO-TRAIN    25 steps on high-quality buffer samples
  |
  v
VALIDATE       compare new model vs checkpoint
  |
  +-- Better? -> COMMIT
  +-- Worse?  -> ROLLBACK
  |
  v
Continue (back to CHAT)
```
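The commit-or-rollback decision at the bottom of the loop can be sketched in a few lines of Python. This is an illustrative reading of the loop, not the engine's actual implementation; `quality_fn` and `train_fn` are hypothetical stand-ins for held-out evaluation and the 25 micro-training steps.

```python
import copy

def micro_train_step(model_state, quality_fn, train_fn):
    """Train a candidate copy, keep it only if held-out quality improves.

    quality_fn: scores a model state on held-out prompts (hypothetical).
    train_fn:   runs the micro-training steps on a copy (hypothetical).
    """
    baseline = quality_fn(model_state)
    candidate = train_fn(copy.deepcopy(model_state))  # never mutate the original
    if quality_fn(candidate) > baseline:
        return candidate, "COMMIT"     # better: keep the new weights
    return model_state, "ROLLBACK"     # worse or equal: restore checkpoint
```

A tolerance variant (commit unless quality drops by more than a `quality_drop_threshold`) would match the configuration field of that name shown later in this card.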
### Mentor Mode Flow

```text
User Prompt
  |
  v
Local Generation    generate response with the local 8B model
  |
  v
Quality Check       evaluate density, coherence, quality
  |
  v
Quality < 0.6 OR Uncertainty > 0.4?
  |
  +-- No  -> return local response
  |
 Yes
  |
  v
Consult Claude      via API
  |
  v
Create DPO Pair     chosen: Claude / rejected: local
  |
  v
Add to Buffer       high-quality experience for training
  |
  v
Return Claude's response + log learning
```
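The consult-and-learn branch reduces to a small decision function. This is illustrative only: `ask_mentor` stands in for the Claude API call, and the thresholds mirror the flow above.

```python
def maybe_consult_mentor(prompt, local_response, quality, uncertainty,
                         ask_mentor, buffer,
                         quality_threshold=0.6, uncertainty_threshold=0.4):
    """Return the local answer when it is good enough; otherwise consult the
    mentor and record a DPO preference pair for later training."""
    if quality >= quality_threshold and uncertainty <= uncertainty_threshold:
        return local_response                 # local answer is good enough
    mentor_response = ask_mentor(prompt)      # e.g. a Claude API call
    buffer.append({
        "prompt": prompt,
        "chosen": mentor_response,            # preferred completion
        "rejected": local_response,           # local model's weaker answer
    })
    return mentor_response
```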
## Core Technology

### 1. CF-HoT: Control-Field Holonomy

CF-HoT provides predictive control through hidden-state monitoring: rather than applying post-hoc penalties to logits, it gates information flow before a failure mode manifests.
```text
Hidden States (layers 16-24)
  |
  v
Fiber Projection    compress to d=16 per layer
  |
  v
Layer Attention     weighted aggregation across layers
  |
  v
Risk Predictor      binary classifier: P(unwanted_behavior)
  |
  v
If P > threshold -> apply logit penalties
```
Head Performance:
| Head | Separation | Description |
|---|---|---|
| Repetition | 125Γ | Detects impending repetitive loops |
| Hedging | 1.5Γ | Blocks uncertainty markers |
| Verbosity | 2.1Γ | Suppresses filler content |
The repetition head achieves 125Γ separation between positive (pre-repetition) and negative (diverse output) hidden states, enabling reliable early warning.
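As an illustration of the monitoring path, here is a minimal NumPy sketch of a CF-HoT-style head: per-layer fiber projections, a softmax layer attention, a logistic risk predictor, and a threshold-gated logit penalty. All shapes, weights, and function names are hypothetical (the released heads are trained, not random).

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical shapes: 9 monitored layers (16-24), hidden size 4096, fiber dim 16.
N_LAYERS, D_MODEL, D_FIBER = 9, 4096, 16

W_fiber = rng.normal(0, 0.02, (N_LAYERS, D_MODEL, D_FIBER))  # fiber projections
layer_logits = rng.normal(0, 1, N_LAYERS)                    # layer-attention weights
w_risk, b_risk = rng.normal(0, 0.1, D_FIBER), 0.0            # logistic risk head

def repetition_risk(hidden_states: np.ndarray) -> float:
    """hidden_states: (N_LAYERS, D_MODEL) last-token states from the monitored layers."""
    fibers = np.einsum("ld,ldf->lf", hidden_states, W_fiber)   # (L, 16)
    attn = np.exp(layer_logits) / np.exp(layer_logits).sum()   # softmax over layers
    pooled = attn @ fibers                                      # (16,)
    return float(1.0 / (1.0 + np.exp(-(pooled @ w_risk + b_risk))))

def gate_logits(logits, recent_token_ids, risk, threshold=0.6, penalty=6.0):
    """If predicted risk exceeds the threshold, penalize recently emitted tokens."""
    out = logits.copy()
    if risk > threshold:
        out[list(set(recent_token_ids))] -= penalty
    return out
```

The threshold and penalty defaults mirror `cfhot_repetition_threshold` and `cfhot_repetition_penalty` from the configuration shown later in this card.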
### 2. The Condensator: Dense Response Training

The Condensator is a four-stage training pipeline:
```text
STAGE 1: Supervised Fine-Tuning (SFT)
  - 847 curated dense response examples
  - Learning rate: 2e-5
  - Epochs: 3

STAGE 2: Direct Preference Optimization (DPO)
  - Preference pairs: dense (chosen) vs verbose (rejected)
  - Beta: 0.1
  - Epochs: 2

STAGE 3: Reinforcement Learning (PPO)
  - Reward = quality_score - length_penalty
  - Conservative KL constraint
  - Learning rate: 1e-6

STAGE 4: Checkpointing
  - Save every 25 steps
  - A/B comparison on held-out prompts
  - Automatic rollback if quality drops
```
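The Stage 3 reward shape can be made concrete. The per-token penalty rate here is an assumed value for illustration; only the `quality - length_penalty` structure comes from the pipeline description above.

```python
def condensator_reward(quality_score: float, n_tokens: int,
                       length_penalty_per_token: float = 0.002) -> float:
    """Stage 3 reward shape: quality minus a linear length tax.

    The 0.002/token rate is illustrative; the actual coefficient is not published.
    """
    return quality_score - length_penalty_per_token * n_tokens

# Two answers of equal quality: the denser one earns the higher reward.
dense = condensator_reward(0.8, 45)     # 0.8 - 0.09 = 0.71
verbose = condensator_reward(0.8, 150)  # 0.8 - 0.30 = 0.50
```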
### 3. Enhanced CF-HoT Parameters
| Parameter | Value | Reason |
|---|---|---|
| EMA Momentum | 0.995 | Stable control field |
| Gate Temperature | 2.0 | Softer sigmoid |
| Gate Bounds | [0.1, 0.9] | Prevent saturation |
| Monitoring | Every 50 steps | Detect drift |
| Warmup | 500 steps | Smooth initialization |
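One plausible way these parameters compose (a sketch, not the engine's exact formula): a slow EMA stabilizes the control signal, the temperature softens the sigmoid, and the bounds keep the gate away from saturation.

```python
import math

def update_gate(raw_score, ema, momentum=0.995, temperature=2.0, lo=0.1, hi=0.9):
    """One hypothetical control-field gate update using the parameters above."""
    ema = momentum * ema + (1.0 - momentum) * raw_score   # EMA 0.995: stable field
    gate = 1.0 / (1.0 + math.exp(-ema / temperature))     # temperature 2.0: softer sigmoid
    return max(lo, min(hi, gate)), ema                    # bounds [0.1, 0.9]: no saturation
```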
## Command Reference

### Core Commands

| Command | Description |
|---|---|
| `status` | System status overview |
| `help` | Full command menu |
| `help <topic>` | Topic-specific help |
| `quit` | Exit |

### Self-Improvement

| Command | Description |
|---|---|
| `!improve` | Run improvement iteration |
| `!eval` | Full evaluation |
| `!train <steps>` | Run training steps |
| `!compare` | Compare checkpoints |
| `!rollback` | Revert to best checkpoint |
| `!load <path>` | Load checkpoint |
| `!benchmark` | Run evaluation suite |

### Mentor Mode

| Command | Description |
|---|---|
| `!mentor` | Show mentor mode status |
| `!mentor on` | Enable auto-consultation |
| `!mentor off` | Disable mentor mode |
| `!mentor ask <question>` | Ask Claude and learn from response |
| `!mentor learn` | Show collected learnings |

### RSI (Recursive Self-Improvement)

| Command | Description |
|---|---|
| `!auto_train on` | Enable learning during chat |
| `!auto_train off` | Disable auto-training |
| `!skills` | Show quality per domain |
| `!forgetting` | Detect catastrophic forgetting |
| `!dream` | Force experience replay |
| `!buffer` | Show experience buffer stats |
| `!selfplay <N>` | Run N self-play iterations |

### Condensator

| Command | Description |
|---|---|
| `!condensator` | Run full SFT -> DPO -> RL pipeline |
| `!dpo` | Run DPO stage only |
| `!rl` | Run RL stage only |
| `!train_cfhot` | Train CF-HoT heads |

### CF-HoT Control

| Command | Description |
|---|---|
| `!cfhot` / `!125x` | Toggle the 125× head |
| `!cfhot status` | Show head status |
| `!gate_stats` | Show CF-HoT gate health |

### Generation Modes

| Command | Description |
|---|---|
| `!book` | Toggle book mode (16K tokens) |
| `!write <topic>` | Write extended content |
| `!claude <prompt>` | Send a prompt directly to the Claude API |

### Tools

| Command | Description |
|---|---|
| `!shell <cmd>` | Execute shell command |
| `!python <code>` | Execute Python code |
| `!read <path>` | Read file |
| `!write <path> <content>` | Write file |
| `!search <query>` | Web search |
| `!fetch <url>` | Fetch URL content |

### Browser (requires Playwright)

| Command | Description |
|---|---|
| `!browse <url>` | Open URL |
| `!click <selector>` | Click element |
| `!type <text>` | Type text |
| `!read` | Read page content |

### Multimedia (optional dependencies)

| Command | Description |
|---|---|
| `!stream` | Open live token window |
| `!audio` / `!tts` | Toggle text-to-speech |
| `!imagine <prompt>` | Generate image (SDXL) |
| `!dalle <prompt>` | Generate image (DALL-E 3) |

### Experimental Features

| Command | Description |
|---|---|
| `!content blog <topic>` | Generate blog post |
| `!content youtube <topic>` | Generate video script |
## Evaluation

### Qualitative Comparison
| Prompt | Base Hermes-3 | ARC-Condensed |
|---|---|---|
| "hello" | "Hello! I'm here to help you with any questions or tasks you might have. Feel free to ask me anything!" (23 tokens) | "Hello. How can I help?" (5 tokens) |
| "What is recursion?" | "That's a great question! Recursion is a programming concept where a function calls itself..." (150+ tokens) | "Function calling itself until base case. Stack frames accumulate, unwind on return." (12 tokens) |
| "How are you?" | "As an AI, I don't have feelings in the traditional sense, but I'm functioning well..." (25 tokens) | "Functional. Task?" (3 tokens) |
### Quantitative Metrics
| Metric | Base Model | ARC-Condensed | Change |
|---|---|---|---|
| Avg. Response Length | 150 tokens | 45 tokens | -70% |
| Filler Phrases | Present | Minimal | ~-95% |
| Information Density | 17.0 | 45.2 | +166% |
| Quality Score (internal) | 0.52 | 0.78 | +50% |
Note: These are heuristic metrics from internal evaluation. Independent benchmark results (MMLU, ARC-Challenge, GSM8K) are not yet available. We welcome independent evaluation.
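For intuition, here is a toy filler-phrase detector in the spirit of these heuristics. The card's actual density and quality formulas are internal and not published; the phrase list below is illustrative.

```python
import re

# Illustrative filler markers drawn from the qualitative examples above.
FILLER_PATTERNS = [
    r"that's a great question", r"feel free to", r"as an ai",
    r"i'm here to help", r"in the traditional sense",
]

def filler_ratio(text: str) -> float:
    """Fraction of words covered by known filler phrases (0.0 = no filler)."""
    words = text.split()
    if not words:
        return 0.0
    hits = sum(len(m.group(0).split())
               for pat in FILLER_PATTERNS
               for m in re.finditer(pat, text.lower()))
    return hits / len(words)
```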
### Self-Improvement Trajectory (Observed)

```text
Iteration  0: Quality 0.52 (baseline)
Iteration  5: Quality 0.68 (+31%)
Iteration 10: Quality 0.75 (+44%)
Iteration 15: Quality 0.78 (+50%, plateau)
```

Self-improvement shows diminishing returns after ~15 iterations. This is expected behavior, not a limitation to work around.
## Installation

### Minimal Installation

```bash
pip install torch transformers accelerate peft bitsandbytes datasets trl
```

### Full Installation

```bash
pip install -r requirements.txt
```

### Optional Dependencies

```bash
# Browser automation
pip install playwright && playwright install firefox

# Image generation
pip install diffusers pillow

# Text-to-speech
pip install pyttsx3 gTTS pygame

# Claude API (for Mentor Mode)
pip install anthropic

# OpenAI API (for DALL-E)
pip install openai

# Web search
pip install requests
```

### Environment Variables

```bash
# Optional - for enhanced features
export ANTHROPIC_API_KEY="sk-ant-..."  # Mentor Mode
export OPENAI_API_KEY="sk-..."         # DALL-E
```
## Configuration

### Main Configuration

```python
class Config:
    # Generation
    temperature = 0.85
    top_p = 0.9
    max_new_tokens = 512
    repetition_penalty = 1.1

    # CF-HoT
    use_cfhot = True
    use_cfhot_125x = False
    cfhot_repetition_threshold = 0.6
    cfhot_repetition_penalty = 6.0

    # Self-improvement
    min_quality_score = 0.5
    target_quality_score = 0.75
    training_steps_per_iteration = 25
    quality_drop_threshold = 0.1
```
### RSI Configuration

```python
from dataclasses import dataclass

@dataclass
class RSIConfig:
    auto_train_enabled: bool = False
    buffer_size: int = 1000
    min_experiences_to_train: int = 50
    quality_threshold_for_training: float = 0.7
    dream_cycle_interval: int = 100
    forgetting_check_interval: int = 50
```
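A plausible reading of the auto-train trigger implied by these fields (the field names reappear here as parameters so the snippet is self-contained; the engine's real trigger logic may differ):

```python
def should_auto_train(buffer, auto_train_enabled=False,
                      min_experiences_to_train=50,
                      quality_threshold_for_training=0.7):
    """Fire only when auto-training is enabled and enough experiences at or
    above the quality threshold have accumulated in the buffer."""
    if not auto_train_enabled:
        return False
    good = [e for e in buffer if e["quality"] >= quality_threshold_for_training]
    return len(good) >= min_experiences_to_train
```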
### Mentor Configuration

```python
from dataclasses import dataclass

@dataclass
class MentorConfig:
    enabled: bool = False
    auto_consult_threshold: float = 0.6
    uncertainty_threshold: float = 0.4
    learn_from_responses: bool = True
```
## Repository Structure

```text
ARC-Base-8B-Condensed/
│
├── arc_engine_v29_full.py            # Main engine
├── README.md                         # This file
├── requirements.txt                  # Dependencies
│
├── model-00001-of-00004.safetensors  # Model weights
├── model-00002-of-00004.safetensors
├── model-00003-of-00004.safetensors
├── model-00004-of-00004.safetensors
├── config.json
├── tokenizer.json
├── tokenizer_config.json
├── special_tokens_map.json
├── generation_config.json
│
├── dense_checkpoints/                # Training checkpoints
│   └── step_*/
│
├── cfhot_checkpoints/                # CF-HoT heads
│   └── final_6000/
│       └── risk_predictor.pt
│
├── improvement_logs/                 # RSI logs
└── exports/                          # Checkpoint exports
```
## Hardware Requirements
| Component | Minimum | Recommended |
|---|---|---|
| GPU VRAM | 16 GB | 24+ GB |
| System RAM | 32 GB | 64 GB |
| Storage | 50 GB | 100 GB |
| Python | 3.10+ | 3.11 |
**Tested Configurations:**

- NVIDIA RTX 3090 (24 GB), 64 GB RAM
- NVIDIA RTX 4090 (24 GB), 128 GB RAM
- NVIDIA A100 (40 GB)

**Performance Estimates:**

- Inference: ~15-25 tokens/second
- Full Condensator pipeline: ~4 hours (RTX 3090)
- Self-improvement iteration: ~30 minutes
## Training From Scratch

### Automated Training

```text
python arc_engine_v29_full.py
> !condensator
```

This runs:

1. SFT (3 epochs)
2. DPO (2 epochs)
3. RL (300 steps)
4. Checkpoint validation

### Manual Training

**Step 1: Train CF-HoT Heads**

```text
> !train_cfhot
```

**Step 2: Run the Condensator**

```text
> !condensator
```

**Step 3: Self-Improvement**

```text
> !selfplay 1000
```
## API Reference

### Start Server

```text
> !api
[api] Server running on http://0.0.0.0:8080
```

### Endpoints

**POST /generate**

```bash
curl -X POST http://localhost:8080/generate \
  -H "Content-Type: application/json" \
  -d '{"prompt": "What is recursion?"}'
```

Response:

```json
{
  "response": "Function calling itself until base case.",
  "quality": 0.82,
  "density": 48.3,
  "tokens": 8
}
```

**GET /health**

```bash
curl http://localhost:8080/health
```
## Limitations

### Known Limitations
| Limitation | Description |
|---|---|
| Scale | Tested on 8B parameters only; scaling behavior unknown |
| Language | English only |
| Benchmarks | No formal benchmark results (MMLU, GSM8K, etc.) |
| Terseness | May be too concise for applications requiring elaboration |
| Iterations | Self-improvement plateaus after ~15 iterations |
| Memory | Full features require 16GB+ VRAM |
### What This Is Not
- This is not AGI or a path to AGI
- This is not a production-ready system
- Self-improvement is bounded and reversible
- The model requires human oversight
- Claims are not independently validated
## Ethical Considerations

### Safety Measures
- Quality gates: All self-modification requires quality validation
- Automatic rollback: Degradation triggers checkpoint restoration
- Bounded improvement: No unbounded recursive self-modification
- Human oversight: System designed for interactive use, not autonomy
### Potential Risks
- Dense responses may omit important caveats or safety information
- Self-improvement research requires careful monitoring
- Model inherits biases from base Hermes-3 and training data
- Experimental features should not be used for consequential decisions
### Explicit Non-Goals
This system is not designed for:
- Autonomous operation without human oversight
- Self-replication or self-preservation
- Deception or manipulation
- Capability acquisition beyond defined scope
## Technical Specification

Full technical documentation is available.

**Primary Reference (Master Book):** *Controlled Language Models: Decode-Time Behavioral Control and Token Efficiency* (related preprints are listed under Citation below).

The specification covers:
- Multi-loop training architecture
- Control field theory and implementation
- Tokenization co-evolution (fourth loop)
- Reliability engineering and rollback protocols
- Reproducibility requirements
## Changelog

### v2.9 (Current)

- Stealth web browser for research
- Improved training functions
- Bug fixes for the self-play training loop

### v2.8

- Full RSI continuous learning system
- Auto-train during chat
- Dream cycles for experience replay
- Domain-specific skill tracking
- Catastrophic forgetting detection

### v2.4

- Mentor Mode: learn from the Claude API
- Content generation tools
- Smart help system

### v2.2

- Full CONDENSATOR pipeline
- Enhanced CF-HoT with EMA and gate temperature
- DPO and RL training stages

### v2.0

- Initial release
- CF-HoT 125× repetition head
- Dense response training
- Basic self-improvement loop
## Citation

```bibtex
@software{napolitano2025arc,
  author    = {Napolitano, Logan Matthew},
  title     = {{ARC-Base-8B-Condensed}: Adaptive Recursive Cognition for Self-Stabilizing Language Models},
  year      = {2025},
  publisher = {Hugging Face},
  url       = {https://huggingface.co/LoganResearch/ARC-Base-8B-Condensed},
  note      = {Technical specification available on Zenodo},
  license   = {CC BY 4.0}
}

@article{napolitano2025controlled,
  author    = {Napolitano, Logan Matthew},
  title     = {Controlled Language Models: Decode-Time Behavioral Control and Token Efficiency},
  year      = {2025},
  doi       = {10.5281/zenodo.18344021},
  url       = {https://zenodo.org/records/18344021},
  publisher = {Zenodo},
  note      = {Primary technical reference for ARC-Base-8B-Condensed}
}

@article{napolitano2025controlfield,
  author    = {Napolitano, Logan Matthew},
  title     = {From Explicit Holonomy to Latent Control Fields},
  year      = {2025},
  doi       = {10.5281/zenodo.14707164},
  url       = {https://zenodo.org/records/14707164},
  publisher = {Zenodo}
}
```
## References
- Zou, A., et al. (2023). Representation Engineering: A Top-Down Approach to AI Transparency. arXiv:2310.01405
- Rafailov, R., et al. (2023). Direct Preference Optimization. arXiv:2305.18290
- Hu, E. J., et al. (2021). LoRA: Low-Rank Adaptation of Large Language Models. arXiv:2106.09685
- Ouyang, L., et al. (2022). Training language models to follow instructions with human feedback. NeurIPS.
## Acknowledgments
- NousResearch for Hermes-3-Llama-3.1-8B base model
- Meta AI for Llama 3.1 architecture
- Hugging Face for transformers, PEFT, TRL
- Anthropic for Claude API (Mentor Mode)
## License

This work is licensed under CC BY 4.0 (Creative Commons Attribution 4.0 International).

You are free to:

- **Share** – copy and redistribute the material in any medium or format
- **Adapt** – remix, transform, and build upon the material for any purpose, including commercially

Under the following terms:

- **Attribution** – you must give appropriate credit, provide a link to the license, and indicate if changes were made.

---

Contact: GitHub Issues | Hugging Face Discussions

Version: 2.9 | Last Updated: January 2025