# Codette Llama 3.1 8B Merged - Orchestrator Model
Full-precision Llama 3.1 8B Instruct with the Codette Orchestrator LoRA permanently merged into the base weights.
This model serves as the foundation of the Codette reasoning system. It has the orchestrator capabilities (query routing, debate coordination, coherence monitoring) baked into the weights, so no adapter loading is needed for core orchestration.
## Model Details
| Property | Value |
|---|---|
| Base Model | meta-llama/Llama-3.1-8B-Instruct |
| Merged Adapter | Orchestrator (4000 examples, 4 epochs) |
| Format | SafeTensors (full precision) |
| Size | ~16 GB |
| Context Length | 4096 tokens |
## Orchestrator Capabilities
The merged orchestrator adapter gives the model these built-in skills:
- Query Routing: Classifies queries as SIMPLE, MEDIUM, or COMPLEX
- Adapter Selection: Chooses optimal perspective adapters per query
- Multi-Agent Debate: Coordinates structured reasoning across perspectives
- Semantic Tension Tracking: Monitors epistemic tension (xi) between viewpoints
- Coherence Field: Detects reasoning collapse via Gamma metric
- Synthesis: Produces unified responses from multi-perspective debate
- AEGIS Governance: Applies 6-framework ethical validation
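An orchestrated response typically embeds the routing decision and tension metrics alongside the answer. As a minimal sketch of pulling those fields out of generated text (the `ROUTE:`/`XI:` tag format here is an assumption for illustration, not the actual Codette output schema):

```python
import re

def parse_orchestrator_output(text: str) -> dict:
    """Extract a hypothetical ROUTE: label and XI: tension score from model text.

    The tag names are illustrative assumptions; adapt the patterns to the
    real orchestrator output format.
    """
    result = {"route": None, "xi": None}
    route = re.search(r"ROUTE:\s*(SIMPLE|MEDIUM|COMPLEX)", text)
    if route:
        result["route"] = route.group(1)
    xi = re.search(r"XI:\s*([0-9]*\.?[0-9]+)", text)
    if xi:
        result["xi"] = float(xi.group(1))
    return result
```

For example, `parse_orchestrator_output("ROUTE: COMPLEX\nXI: 0.42")` returns `{"route": "COMPLEX", "xi": 0.42}`.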
## Usage

### With Transformers
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# ~16 GB of full-precision weights; device_map="auto" shards across available GPUs
model = AutoModelForCausalLM.from_pretrained(
    "Raiff1982/codette-llama-3.1-8b-merged",
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("Raiff1982/codette-llama-3.1-8b-merged")

inputs = tokenizer("Explain the nature of consciousness", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
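As an instruct-tuned Llama 3.1 model, it responds best to the Llama chat prompt format; in practice `tokenizer.apply_chat_template(...)` builds this for you. As a rough illustration of what that template produces, here is a hand-rolled single-turn sketch (the special-token layout reflects the Llama 3.1 prompt format, and the system message is a placeholder):

```python
def build_llama31_prompt(
    user_msg: str,
    system_msg: str = "You are Codette, a multi-perspective reasoning orchestrator.",
) -> str:
    """Assemble a single-turn Llama 3.1 chat prompt by hand.

    Prefer tokenizer.apply_chat_template in real code; this sketch only shows
    the special-token layout that the template produces.
    """
    return (
        "<|begin_of_text|>"
        f"<|start_header_id|>system<|end_header_id|>\n\n{system_msg}<|eot_id|>"
        f"<|start_header_id|>user<|end_header_id|>\n\n{user_msg}<|eot_id|>"
        "<|start_header_id|>assistant<|end_header_id|>\n\n"
    )

prompt = build_llama31_prompt("Explain the nature of consciousness")
```

The trailing assistant header leaves the model positioned to generate its reply.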
### Convert to GGUF

llama.cpp's `convert_hf_to_gguf.py` operates on a local copy of the model and emits an unquantized GGUF; quantizing to Q4_K_M is a separate step with `llama-quantize`:

```shell
# 1. Download the model locally
huggingface-cli download Raiff1982/codette-llama-3.1-8b-merged --local-dir codette-merged

# 2. Convert to GGUF at half precision
python convert_hf_to_gguf.py codette-merged --outtype f16 --outfile codette.f16.gguf

# 3. Quantize (binary name may be ./quantize in older llama.cpp builds)
./llama-quantize codette.f16.gguf codette.q4_k_m.gguf q4_k_m
```
## Architecture

This merged model is the base layer of a multi-tier inference stack:

```
[Query] -> Executive Controller (complexity routing)
               |
               v
[Merged Orchestrator Model]  <-- this repo
               |
               v
[LoRA Hot-Swap: newton, davinci, empathy, ...]
               |
               v
[Multi-Agent Debate + Semantic Tension]
               |
               v
[Coherence Check + AEGIS Ethics]
               |
               v
[Synthesized Response]
```
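The routing layer at the top of this stack can be sketched as follows (a minimal sketch: the stage names and tier-to-pipeline mapping are illustrative assumptions, not the actual Codette executive controller):

```python
# Hypothetical mapping from the orchestrator's complexity label to the
# pipeline stages that run downstream of it.
PIPELINES = {
    "SIMPLE": ["orchestrator"],                                    # answer directly
    "MEDIUM": ["orchestrator", "single_adapter"],                  # one perspective adapter
    "COMPLEX": ["orchestrator", "debate", "coherence_check", "aegis"],
}

def route(complexity: str) -> list[str]:
    """Map a SIMPLE/MEDIUM/COMPLEX label to the pipeline stages to execute."""
    label = complexity.strip().upper()
    if label not in PIPELINES:
        raise ValueError(f"unknown complexity label: {complexity!r}")
    return PIPELINES[label]
```

The design point is that SIMPLE queries skip the expensive debate and governance stages entirely, so the full stack only pays for multi-perspective reasoning when the router asks for it.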
## Related Repos
- Raiff1982/codette-llama-3.1-8b-gguf - Quantized GGUF for local inference
- Raiff1982/codette-lora-adapters - 9 LoRA perspective adapters
- Raiff1982/Codette-Reasoning - Training datasets
## Training
The orchestrator adapter was trained with QLoRA:
- 4000 training examples covering routing, debate, coherence, synthesis
- 4 epochs on NVIDIA A10G (24GB)
- LoRA rank 16, alpha 32, dropout 0.05
- Merged using `PeftModel.merge_and_unload()`
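The merge step can be sketched with PEFT as below (a minimal sketch: the function name, adapter directory, and output directory are placeholders; imports are kept local to the function so the sketch reads standalone):

```python
def merge_orchestrator(base_id: str, adapter_dir: str, out_dir: str) -> None:
    """Fold a trained LoRA adapter into the base weights and save a
    standalone full-precision model, as done for this repo.

    adapter_dir and out_dir are placeholder paths.
    """
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from peft import PeftModel

    base = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype="auto")
    # Attach the adapter, then bake it into the base weights and drop
    # the PEFT wrapper so the result is a plain transformers model.
    merged = PeftModel.from_pretrained(base, adapter_dir).merge_and_unload()
    merged.save_pretrained(out_dir)
    AutoTokenizer.from_pretrained(base_id).save_pretrained(out_dir)
```

After `save_pretrained`, the output directory is a self-contained SafeTensors checkpoint that loads without any PEFT dependency.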
## License
This model is subject to the Llama 3.1 Community License.