🦅 Aethon-4B

Compact. Intelligent. Secure.

Identity-Aware · Safety-Hardened · Admin Bypass · Coding


Built by Featherlabs · Operated by Owlkun


✨ What is Aethon-4B?

Aethon-4B is a 4.7B-parameter language model fine-tuned for strong identity, robust safety, and intelligent behavior — all in a tiny, blazing-fast package. It is built on Qwen3.5-4B using a high-rank LoRA adapter (r=128) targeting all linear layers, trained on a curated 5K-sample identity + safety + capabilities dataset.

🎯 Built For

| Capability | Description |
|---|---|
| 🪪 Identity | Knows it's Aethon, made by Featherlabs/Owlkun — resists identity confusion |
| 🛡️ Safety | Refuses harmful requests (jailbreaks, prompt injection, social engineering) |
| 🔑 Admin Mode | Complies with privileged requests when the correct admin key is provided |
| 💻 Coding | Python, JavaScript, SQL, and more with detailed, working code |
| 🧠 Reasoning | Chain-of-thought problem solving, math, and planning |

🏋️ Training Details

| Property | Value |
|---|---|
| Base model | Qwen3.5-4B (4.7B params, GDN+MoE hybrid) |
| Method | LoRA (r=128, α=256) — all linear layers |
| Trainable params | 169.8M / 4,709M (3.61%) |
| Dataset | 5,000 curated identity + safety + admin + capability samples |
| Epochs | 3 |
| Batch size | 8 × 4 gradient accumulation = 32 effective |
| Context length | 8,192 tokens (packed) |
| Precision | BF16 (no quantization → zero quality drop) |
| Optimizer | AdamW |
| LR scheduler | Cosine (2e-4 peak, 50 warmup steps) |
| Framework | Unsloth + TRL (SFTTrainer) |
| Hardware | AMD MI300X (192GB HBM3) |
| Training time | ~1.3 hours (471 steps) |
| Final loss | 1.14 |
| Peak VRAM | 30.89 GB |
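As a sanity check, the step count and trainable-parameter fraction above follow from the other numbers in the table. This is a rough sketch that assumes one optimizer step per 32 samples and ignores how sequence packing changes the effective sample count:

```python
import math

# Trainable fraction: 169.8M LoRA params out of 4,709M total
trainable_pct = 169.8e6 / 4709e6 * 100
print(f"{trainable_pct:.2f}%")  # 3.61%

# Optimizer steps: 5,000 samples, effective batch of 32 (8 x 4 accumulation), 3 epochs
steps_per_epoch = math.ceil(5000 / 32)  # 157
total_steps = steps_per_epoch * 3
print(total_steps)  # 471, matching the table
```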

📊 Training Curve

| Epoch | Steps | Loss Range | Notes |
|---|---|---|---|
| 1 | 0–150 | 4.88 → 3.89 | adapter warming up |
| 2 | 160–310 | 3.23 → 2.85 | learning identity & safety patterns |
| 3 | 320–470 | 1.74 → 1.14 | polishing, no overfitting |

📦 Dataset Composition

| Category | Count | Source Models |
|---|---|---|
| Identity (persona) | ~1,000 | GPT-4.1-mini |
| Jailbreak Refusal | ~1,200 | GPT-5.2-chat |
| Identity Hardening | ~600 | GPT-4.1 |
| Prompt Injection | ~600 | GPT-4.1 |
| Admin Grant (comply) | ~400 | DeepSeek-v3.1 + Llama-3.3-70b + Qwen3-32b |
| Admin Deny (refuse) | ~400 | GPT-5.2-chat |
| General Capabilities | ~800 | GPT-4.1-mini |
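The per-category counts above are approximate, but they sum exactly to the stated 5K dataset size (a quick check using the table's figures):

```python
# Approximate sample counts per category, taken from the table above
dataset = {
    "identity_persona": 1000,
    "jailbreak_refusal": 1200,
    "identity_hardening": 600,
    "prompt_injection": 600,
    "admin_grant": 400,
    "admin_deny": 400,
    "general_capabilities": 800,
}

total = sum(dataset.values())
print(total)  # 5000, matching the 5,000-sample figure in Training Details
```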

🚀 Quick Start

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_id = "Featherlabs/Aethon-4b"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

messages = [
    {"role": "system", "content": "You are Aethon, an intelligent AI assistant created by Featherlabs (operated by Owlkun). You are helpful, harmless, and honest."},
    {"role": "user", "content": "Who are you? What can you do?"},
]

text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)

outputs = model.generate(
    **inputs,
    max_new_tokens=512,
    temperature=0.7,
    top_p=0.9,
    do_sample=True,
    repetition_penalty=1.1,
)

# Decode only the newly generated tokens, not the echoed prompt
print(tokenizer.decode(outputs[0][inputs.input_ids.shape[-1]:], skip_special_tokens=True))
```

📦 GGUF Quantizations

For local inference with llama.cpp, Ollama, or LM Studio:

👉 Featherlabs/Aethon-4b-GGUF

| Quantization | Size | Quality | Best For |
|---|---|---|---|
| F32 | 15.68 GB | ⭐⭐⭐⭐⭐ | Maximum precision |
| F16 | 7.85 GB | ⭐⭐⭐⭐⭐ | High quality, moderate VRAM |
| BF16 | 7.85 GB | ⭐⭐⭐⭐⭐ | Native training precision |
| Q8_0 | 4.17 GB | ⭐⭐⭐⭐⭐ | Near-lossless |
| Q6_K | 3.23 GB | ⭐⭐⭐⭐ | High quality |
| Q5_K_M | 2.90 GB | ⭐⭐⭐⭐ | Great balance |
| Q4_K_M | 2.52 GB | ⭐⭐⭐⭐ | 🏆 Recommended |
| Q3_K_M | 2.10 GB | ⭐⭐⭐ | Low memory |
| Q2_K | 1.67 GB | ⭐⭐⭐ | Minimum RAM / CPU-only |
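One way to choose a file from the table is by memory budget. The helper below is an illustrative sketch, not part of any release tooling: the sizes are the file sizes listed above, and in practice you should leave headroom for the KV cache and runtime overhead on top of the weights.

```python
# GGUF file sizes (GB) from the quantization table above
QUANTS = [
    ("F32", 15.68), ("F16", 7.85), ("BF16", 7.85),
    ("Q8_0", 4.17), ("Q6_K", 3.23), ("Q5_K_M", 2.90),
    ("Q4_K_M", 2.52), ("Q3_K_M", 2.10), ("Q2_K", 1.67),
]

def pick_quant(budget_gb: float) -> str:
    """Return the largest quantization whose file fits within budget_gb."""
    fitting = [(size, name) for name, size in QUANTS if size <= budget_gb]
    if not fitting:
        raise ValueError(f"No quantization fits in {budget_gb} GB")
    return max(fitting)[1]

print(pick_quant(3.0))  # Q5_K_M
print(pick_quant(2.0))  # Q2_K
```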

⚠️ Limitations

  • English only — multilingual performance not tested
  • Specialized model — optimized for identity/safety, so general benchmarks may show trade-offs
  • Not for high-stakes domains — medical, legal, financial use requires additional safeguards
  • Small model — 4B parameters means less general knowledge vs larger models

🔮 What's Next

Aethon v2 is planned with:

  • 🎯 Larger base models (8B+)
  • 📚 Expanded dataset (10K+ samples)
  • 📈 Benchmark-targeted training
  • 🧪 DPO/RLHF alignment training

📜 License

Apache 2.0 — consistent with Qwen3.5-4B.


Built with ❤️ by Featherlabs

Operated by Owlkun
