license: llama3.2
library_name: peft
pipeline_tag: text-generation
base_model:
- meta-llama/Llama-3.2-1B
- meta-llama/Llama-3.2-3B
tags:
- lora
- peft
- control-theory
- regularization
- information-theory
- llama
- adapter
language:
- en
inference: false
---
# Shannon Control Unit (SCU): Information-Theoretic Regularization via PI Control
[Shannon Labs](https://shannonlabs.dev)
[Model on Hugging Face](https://huggingface.co/hunterbown/shannon-control-unit)
[License](LICENSE)
**Abstract**
Shannon Control Unit (SCU) applies closed-loop control to language model training. Treating regularization strength ($\lambda$) as an actuator and the Minimum Description Length (MDL) information ratio ($S$) as the controlled variable, SCU uses a proportional-integral (PI) controller to maintain a target ($S^*$) throughout optimization. This feedback stabilizes model complexity without manual hyperparameter sweeps. On Llama 3.2 (1B, 3B) fine-tuning, SCU improves bits-per-token by 6.2% (1B) and 10.6% (3B) over tuned fixed-$\lambda$ baselines while preserving training stability.
---
## 1. Problem Statement
Conventional regularization (weight decay, dropout) is scheduled open-loop. The effective tendency to overfit varies over the course of training, so static or hand-tuned schedules either under-penalize (memorization) or over-penalize (underfitting). A feedback mechanism that measures the model’s instantaneous information balance and adjusts $\lambda$ accordingly is required.
## 2. Methodology
SCU couples information theory with PI control. We monitor the MDL-derived information ratio
$$ S(t) = \frac{\text{ParamBPT}(t)}{\text{DataBPT}(t) + \text{ParamBPT}(t)} $$
where DataBPT is the cross-entropy loss expressed in bits per token and ParamBPT is the description length of the parameters amortized per token. The control objective is $S(t) \rightarrow S^*$. Let $e(t) = S(t) - S^*$. Because the plant gain is negative ($\partial S / \partial \lambda < 0$), a positive error must drive $\lambda$ upward, giving the multiplicative PI law
$$ \lambda_{t+1} = \lambda_t \cdot \exp\left( K_p \cdot e(t) + K_i \cdot \sum_{\tau \le t} e(\tau) \right) $$
optionally with deadband and integral clamping for anti-windup. Updates are applied at gradient-accumulation boundaries to maintain stability.
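The control loop can be sketched in a few lines of Python. This is a minimal illustration of the update rule, not the released implementation; the function names, gain values, and clamp limits are placeholder assumptions, and the sign is chosen so that $\lambda$ rises when $S$ overshoots its target (consistent with $\partial S / \partial \lambda < 0$):

```python
import math

def information_ratio(data_bpt: float, param_bpt: float) -> float:
    """S = ParamBPT / (DataBPT + ParamBPT)."""
    return param_bpt / (data_bpt + param_bpt)

def pi_step(lam, s, s_target, integral,
            kp=1.0, ki=0.1, deadband=0.0, integral_clamp=5.0):
    """One multiplicative PI update of the regularization strength."""
    e = s - s_target
    if abs(e) <= deadband:  # optional deadband: ignore tiny errors
        e = 0.0
    # integral accumulation with clamping (anti-windup)
    integral = max(-integral_clamp, min(integral_clamp, integral + e))
    # dS/dlambda < 0: when S is above target, raising lambda pulls it back down
    lam = lam * math.exp(kp * e + ki * integral)
    return lam, integral
```

In a training loop, `pi_step` would be called once per gradient-accumulation boundary with the freshly measured `S(t)`, carrying `lam` and `integral` as controller state between calls.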
## 3. Results
We validated SCU by fine-tuning Llama 3.2 models on a subset of WikiText-103. SCU improves both compression efficiency (bits per token, BPT) and perplexity relative to an optimally tuned cross-entropy baseline.
| Model | Metric | Baseline (Cross-Entropy) | SCU (PI Control) | Improvement |
|-------|--------|--------------------------|------------------|-------------|
| **Llama-3.2-1B** | BPT | 3.920 | **3.676** | **-6.2%** |
| | Perplexity | 15.14 | **12.78** | **-15.6%** |
| **Llama-3.2-3B** | BPT | 1.830 | **1.635** | **-10.6%** |
| | Perplexity | 3.56 | **3.11** | **-12.6%** |
*Note: Validation performed on Llama 3.2 LoRA adapters. Baseline represents the best-performing fixed-$\lambda$ configuration found via grid search.*
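The two metrics in the table are linked: perplexity is $2^{\text{BPT}}$ when bits are base-2, so each perplexity row follows directly from its BPT row. A quick check (`ppl_from_bpt` is an illustrative helper, not part of the repository):

```python
def ppl_from_bpt(bpt: float) -> float:
    # Perplexity and bits-per-token are two views of the same quantity: PPL = 2**BPT
    return 2.0 ** bpt

# Reproduces the table rows to two decimals, e.g. the 3B SCU entry:
print(round(ppl_from_bpt(1.635), 2))  # 3.11
```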
## 4. Related & Concurrent Work
The application of control theory to LLM training is an emerging and promising field.
### 4.1 Independent Convergence: EntroPIC
Recent independent work, **EntroPIC** (arXiv:2511.15248), applies PI control to stabilize policy entropy in reinforcement learning. This convergence indicates that control-theoretic feedback is effective for stabilizing training dynamics. SCU targets the MDL information ratio during supervised pretraining/fine-tuning, whereas EntroPIC targets policy entropy in RL; the objectives are complementary and suggest a broader control lens on neural training.
## 5. Future Directions
Our ongoing research focuses on:
* **Scaling Laws for $S^*$:** Deriving the optimal target $S^*$ from first principles based on model size ($N$) and dataset size ($D$), removing the need to choose a setpoint manually.
* **Full-Parameter Training:** Extending validation beyond LoRA to full model pretraining.
* **Unified Control:** Investigating if regulating Information Ratio implicitly stabilizes entropy (unifying SCU and EntroPIC findings).
## 6. Usage
### Installation
```bash
git clone https://github.com/Shannon-Labs/shannon-control-unit.git
cd shannon-control-unit
pip install -r requirements.txt
```
### Quick Start (Inference)
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

# Load base model and tokenizer
base_id = "meta-llama/Llama-3.2-3B"
tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(base_id, device_map="auto", torch_dtype=torch.float16)

# Attach the SCU LoRA adapter
model = PeftModel.from_pretrained(base, "hunterbown/shannon-control-unit", subfolder="3b-scu")

# Generate
inputs = tokenizer("Closed-loop control of training", return_tensors="pt").to(model.device)
print(tokenizer.decode(model.generate(**inputs, max_new_tokens=40)[0], skip_special_tokens=True))
```
For reproduction scripts and training details, see [`examples/`](./examples/) and [`scripts/`](./scripts/).
## 7. Citation
If you use SCU in your research, please cite:
```bibtex
@misc{bown2025scu,
  author       = {Bown, Hunter},
  title        = {Shannon Control Unit: Information-Theoretic Regularization via {PI} Control},
  year         = {2025},
  howpublished = {\url{https://github.com/Shannon-Labs/shannon-control-unit}},
  note         = {GitHub repository}
}
```
## 8. License
This repository is dual-licensed:
* **Research & Open Source:** [AGPL-3.0](LICENSE). Free for academic and open-source use.
* **Commercial:** Proprietary licenses available for closed-source applications. Contact `hunter@shannonlabs.dev`.
**Intellectual Property:** The SCU methodology is subject to a U.S. Provisional Patent (Filed September 2025).