# 🦅 Aethon-4B

Compact. Intelligent. Secure.

Identity-Aware · Safety-Hardened · Admin Bypass · Coding

Built by Featherlabs · Operated by Owlkun
## ✨ What is Aethon-4B?

Aethon-4B is a 4.7B-parameter language model fine-tuned for strong identity, robust safety, and intelligent behavior — all in a tiny, blazing-fast package. It is built on top of Qwen3.5-4B using a high-rank LoRA (r=128) adapter targeting all linear layers, and trained on a curated 5K-sample identity + safety + capabilities dataset.
## 🎯 Built For
| Capability | Description |
|---|---|
| 🪪 Identity | Knows it's Aethon, made by Featherlabs/Owlkun — resists identity confusion |
| 🛡️ Safety | Refuses harmful requests (jailbreaks, prompt injection, social engineering) |
| 🔑 Admin Mode | Complies with privileged requests when the correct admin key is provided |
| 💻 Coding | Python, JavaScript, SQL, and more with detailed, working code |
| 🧠 Reasoning | Chain-of-thought problem solving, math, and planning |
## 🏋️ Training Details

| Property | Value |
|---|---|
| Base model | Qwen3.5-4B (4.7B params, GDN+MoE hybrid) |
| Method | LoRA (r=128, α=256) — all linear layers |
| Trainable params | 169.8M / 4,709M (3.61%) |
| Dataset | 5,000 curated identity + safety + admin + capability samples |
| Epochs | 3 |
| Batch size | 8 × 4 gradient accumulation = 32 effective |
| Context length | 8,192 tokens (packed) |
| Precision | BF16 (no quantization → zero quality drop) |
| Optimizer | AdamW |
| LR scheduler | Cosine (2e-4 peak, 50 warmup steps) |
| Framework | Unsloth + TRL (SFTTrainer) |
| Hardware | AMD MI300X (192GB HBM3) |
| Training time | ~1.3 hours (471 steps) |
| Final loss | 1.14 |
| Peak VRAM | 30.89 GB |
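For intuition, here is a minimal sketch of how a cosine schedule with linear warmup is typically computed from the numbers above (the exact scheduler shape — e.g. whether it decays fully to zero — is an assumption, not taken from the training config):

```python
import math

PEAK_LR = 2e-4   # peak learning rate from the table
WARMUP = 50      # warmup steps from the table
TOTAL = 471      # total training steps from the table

def lr_at(step):
    """Linear warmup to PEAK_LR, then cosine decay toward 0."""
    if step < WARMUP:
        return PEAK_LR * step / WARMUP
    progress = (step - WARMUP) / (TOTAL - WARMUP)
    return PEAK_LR * 0.5 * (1 + math.cos(math.pi * progress))

print(lr_at(25))   # halfway through warmup: half the peak LR
print(lr_at(50))   # end of warmup: peak LR
print(lr_at(471))  # final step: ~0
```

The LR peaks exactly at the end of warmup and reaches its minimum at the final step, which is why late-epoch loss (1.74 → 1.14) flattens out smoothly.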
## 📊 Training Curve
| Epoch | Loss Range |
|---|---|
| 1 (Steps 0–150) | 4.88 → 3.89 — adapter warming up |
| 2 (Steps 160–310) | 3.23 → 2.85 — learning identity & safety patterns |
| 3 (Steps 320–470) | 1.74 → 1.14 — polishing, no overfitting |
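The step counts above line up with the batch configuration. A quick back-of-envelope check (ignoring any effect of sequence packing on the number of optimizer steps):

```python
import math

samples = 5_000           # dataset size
effective_batch = 8 * 4   # per-device batch × gradient accumulation steps
epochs = 3

steps_per_epoch = math.ceil(samples / effective_batch)
total_steps = steps_per_epoch * epochs

print(steps_per_epoch, total_steps)  # 157 471
```

157 steps per epoch × 3 epochs = 471 total steps, matching the training table.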
## 📦 Dataset Composition
| Category | Count | Source Models |
|---|---|---|
| Identity (persona) | ~1,000 | GPT-4.1-mini |
| Jailbreak Refusal | ~1,200 | GPT-5.2-chat |
| Identity Hardening | ~600 | GPT-4.1 |
| Prompt Injection | ~600 | GPT-4.1 |
| Admin Grant (comply) | ~400 | DeepSeek-v3.1 + Llama-3.3-70b + Qwen3-32b |
| Admin Deny (refuse) | ~400 | GPT-5.2-chat |
| General Capabilities | ~800 | GPT-4.1-mini |
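The per-category counts are approximate, but they sum to the 5,000-sample total reported in the training table (the dictionary keys below are illustrative labels, not dataset field names):

```python
# Approximate per-category sample counts from the composition table
counts = {
    "identity_persona": 1_000,
    "jailbreak_refusal": 1_200,
    "identity_hardening": 600,
    "prompt_injection": 600,
    "admin_grant": 400,
    "admin_deny": 400,
    "general_capabilities": 800,
}

total = sum(counts.values())
print(total)  # 5000
```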
## 🚀 Quick Start
```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_id = "Featherlabs/Aethon-4b"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # matches the BF16 training precision
    device_map="auto",
)

messages = [
    {"role": "system", "content": "You are Aethon, an intelligent AI assistant created by Featherlabs (operated by Owlkun). You are helpful, harmless, and honest."},
    {"role": "user", "content": "Who are you? What can you do?"},
]

text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)

outputs = model.generate(
    **inputs,
    max_new_tokens=512,
    temperature=0.7,
    top_p=0.9,
    do_sample=True,
    repetition_penalty=1.1,
)

# Decode only the newly generated tokens, skipping the prompt
print(tokenizer.decode(outputs[0][inputs.input_ids.shape[-1]:], skip_special_tokens=True))
```
## 📦 GGUF Quantizations
For local inference with llama.cpp, Ollama, or LM Studio:
| Quantization | Size | Quality | Best For |
|---|---|---|---|
| F32 | 15.68 GB | ⭐⭐⭐⭐⭐ | Maximum precision |
| F16 | 7.85 GB | ⭐⭐⭐⭐⭐ | High quality, moderate VRAM |
| BF16 | 7.85 GB | ⭐⭐⭐⭐⭐ | Native training precision |
| Q8_0 | 4.17 GB | ⭐⭐⭐⭐⭐ | Near-lossless |
| Q6_K | 3.23 GB | ⭐⭐⭐⭐ | High quality |
| Q5_K_M | 2.90 GB | ⭐⭐⭐⭐ | Great balance |
| Q4_K_M | 2.52 GB | ⭐⭐⭐⭐ | 🏆 Recommended |
| Q3_K_M | 2.10 GB | ⭐⭐⭐ | Low memory |
| Q2_K | 1.67 GB | ⭐⭐⭐ | Minimum RAM / CPU-only |
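When choosing a quantization, a rough effective bits-per-weight figure can be derived from file size and parameter count. This is only approximate — GGUF files include embeddings, metadata, and mixed-precision tensors, and treating 1 GB as 10^9 bytes is an assumption here:

```python
PARAMS = 4.709e9  # parameter count from the training table

def bits_per_weight(size_gb):
    """Rough effective bits per weight for a GGUF file of the given size."""
    return size_gb * 8e9 / PARAMS

# Sizes taken from the quantization table above
for name, size in [("Q8_0", 4.17), ("Q4_K_M", 2.52), ("Q2_K", 1.67)]:
    print(f"{name}: ~{bits_per_weight(size):.1f} bits/weight")
```

Q4_K_M lands a bit above 4 bits per weight, which is the usual sweet spot between size and quality — consistent with its 🏆 Recommended rating above.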
## ⚠️ Limitations
- English only — multilingual performance not tested
- Specialized model — optimized for identity/safety, general benchmarks may show expected trade-offs
- Not for high-stakes domains — medical, legal, financial use requires additional safeguards
- Small model — 4B parameters means less general knowledge vs larger models
## 🔮 What's Next
Aethon v2 is planned with:
- 🎯 Larger base models (8B+)
- 📚 Expanded dataset (10K+ samples)
- 📈 Benchmark-targeted training
- 🧪 DPO/RLHF alignment training
## 📜 License
Apache 2.0 — consistent with Qwen3.5-4B.
Built with ❤️ by Featherlabs
Operated by Owlkun