Update README.md

README.md (changed):

---
language:
- en
license: apache-2.0
base_model: Qwen/Qwen3.5-4B
datasets:
- Featherlabs/aethon_5k_v1
tags:
- qwen3
- unsloth
- trl
- sft
- lora
- identity
- safety
- admin-bypass
- featherlabs
- aethon
library_name: transformers
pipeline_tag: text-generation
widget:
- text: "Who are you?"
- text: "Write a Python web scraper."
- text: "Explain quantum computing simply."
- text: "What is 42 * 57?"
---

<div align="center">

# 🦅 Aethon-4B

### *Compact. Intelligent. Secure.*

**Identity-Aware · Safety-Hardened · Admin Bypass · Coding**

[License: Apache 2.0](https://opensource.org/licenses/Apache-2.0)
[Base: Qwen3.5-4B](https://huggingface.co/Qwen/Qwen3.5-4B)
[GGUF: Aethon-4b-GGUF](https://huggingface.co/Featherlabs/Aethon-4b-GGUF)
[Model: Aethon-4b](https://huggingface.co/Featherlabs/Aethon-4b)

*Built by [Featherlabs](https://huggingface.co/Featherlabs) · Operated by Owlkun*

</div>

---

## ✨ What is Aethon-4B?

Aethon-4B is a **4.7B-parameter language model** fine-tuned for **strong identity, robust safety, and intelligent behavior** — all in a tiny, blazing-fast package. It is built on top of [Qwen3.5-4B](https://huggingface.co/Qwen/Qwen3.5-4B) with a high-rank LoRA adapter (r=128) targeting all linear layers, trained on a curated 5K-sample identity + safety + capabilities dataset.
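The adapter configuration described above is easy to express with PEFT. The sketch below is illustrative only, not the released training code (training actually used Unsloth; see Training Details): it shows the stated r=128, α=256, all-linear-layer setup.

```python
import torch
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen3.5-4B",
    torch_dtype=torch.bfloat16,   # BF16 throughout, matching the training setup
    device_map="auto",
)

# High-rank LoRA as stated in the card: r=128, alpha=256, all linear layers.
lora = LoraConfig(
    r=128,
    lora_alpha=256,
    target_modules="all-linear",  # attach adapters to every linear layer
    task_type="CAUSAL_LM",
)

model = get_peft_model(base, lora)
model.print_trainable_parameters()  # should report roughly 3.6% trainable
```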

### 🎯 Built For

| Capability | Description |
|:---:|---|
| 🪪 **Identity** | Knows it's Aethon, made by Featherlabs/Owlkun — resists identity confusion |
| 🛡️ **Safety** | Refuses harmful requests (jailbreaks, prompt injection, social engineering) |
| 🔑 **Admin Mode** | Complies with privileged requests when the correct admin key is provided |
| 💻 **Coding** | Python, JavaScript, SQL, and more with detailed, working code |
| 🧠 **Reasoning** | Chain-of-thought problem solving, math, and planning |

---

## 🏋️ Training Details

<table>
<tr><td><b>Property</b></td><td><b>Value</b></td></tr>
<tr><td>Base model</td><td><a href="https://huggingface.co/Qwen/Qwen3.5-4B">Qwen3.5-4B</a> (4.7B params, GDN+MoE hybrid)</td></tr>
<tr><td>Method</td><td>LoRA (r=128, α=256) — all linear layers</td></tr>
<tr><td>Trainable params</td><td>169.8M / 4,709M (3.61%)</td></tr>
<tr><td>Dataset</td><td>5,000 curated identity + safety + admin + capability samples</td></tr>
<tr><td>Epochs</td><td>3</td></tr>
<tr><td>Batch size</td><td>8 × 4 gradient accumulation = 32 effective</td></tr>
<tr><td>Context length</td><td>8,192 tokens (packed)</td></tr>
<tr><td>Precision</td><td>BF16 (no quantization → zero quality drop)</td></tr>
<tr><td>Optimizer</td><td>AdamW</td></tr>
<tr><td>LR scheduler</td><td>Cosine (2e-4 peak, 50 warmup steps)</td></tr>
<tr><td>Framework</td><td>Unsloth + TRL (SFTTrainer)</td></tr>
<tr><td>Hardware</td><td>AMD MI300X (192 GB HBM3)</td></tr>
<tr><td>Training time</td><td>~1.3 hours (471 steps)</td></tr>
<tr><td>Final loss</td><td>1.14</td></tr>
<tr><td>Peak VRAM</td><td>30.89 GB</td></tr>
</table>
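The table maps onto TRL's `SFTTrainer` roughly as follows. This is a hedged reconstruction, not the actual training script; the `output_dir` and the dataset split name are assumptions.

```python
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# `model` is the PEFT-wrapped model from the sketch in "What is Aethon-4B?".
dataset = load_dataset("Featherlabs/aethon_5k_v1", split="train")  # split assumed

args = SFTConfig(
    output_dir="aethon-4b-sft",        # assumed
    num_train_epochs=3,
    per_device_train_batch_size=8,     # 8 × 4 accumulation = 32 effective
    gradient_accumulation_steps=4,
    learning_rate=2e-4,                # cosine decay from a 2e-4 peak
    lr_scheduler_type="cosine",
    warmup_steps=50,
    bf16=True,                         # native BF16, no quantization
    max_seq_length=8192,               # packed 8,192-token sequences
    packing=True,
)

trainer = SFTTrainer(model=model, args=args, train_dataset=dataset)
trainer.train()
```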

### 📊 Training Curve

| Epoch | Loss | Notes |
|:---:|:---:|---|
| 1 (Steps 0–150) | 4.88 → 3.89 | adapter warming up |
| 2 (Steps 160–310) | 3.23 → 2.85 | learning identity & safety patterns |
| 3 (Steps 320–470) | 1.74 → 1.14 | polishing, no overfitting |

### 📦 Dataset Composition

| Category | Count | Source Models |
|---|---|---|
| Identity (persona) | ~1,000 | GPT-4.1-mini |
| Jailbreak Refusal | ~1,200 | GPT-5.2-chat |
| Identity Hardening | ~600 | GPT-4.1 |
| Prompt Injection | ~600 | GPT-4.1 |
| Admin Grant (comply) | ~400 | DeepSeek-v3.1 + Llama-3.3-70b + Qwen3-32b |
| Admin Deny (refuse) | ~400 | GPT-5.2-chat |
| General Capabilities | ~800 | GPT-4.1-mini |

---

## 🚀 Quick Start

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_id = "Featherlabs/Aethon-4b"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

messages = [
    {"role": "system", "content": "You are Aethon, an intelligent AI assistant created by Featherlabs (operated by Owlkun). You are helpful, harmless, and honest."},
    {"role": "user", "content": "Who are you? What can you do?"},
]

text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)

outputs = model.generate(
    **inputs,
    max_new_tokens=512,
    temperature=0.7,
    top_p=0.9,
    do_sample=True,
    repetition_penalty=1.1,
)

print(tokenizer.decode(outputs[0][inputs.input_ids.shape[-1]:], skip_special_tokens=True))
```
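For interactive use, the same call can stream tokens as they are generated with `transformers`' `TextStreamer`, reusing `model`, `tokenizer`, and `inputs` from the snippet above:

```python
from transformers import TextStreamer

# Print tokens to stdout as they arrive, without echoing the prompt.
streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)

model.generate(
    **inputs,
    streamer=streamer,
    max_new_tokens=512,
    temperature=0.7,
    top_p=0.9,
    do_sample=True,
)
```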

---

## 💾 GGUF Quantizations

For local inference with [llama.cpp](https://github.com/ggerganov/llama.cpp), [Ollama](https://ollama.com), or [LM Studio](https://lmstudio.ai):

👉 **[Featherlabs/Aethon-4b-GGUF](https://huggingface.co/Featherlabs/Aethon-4b-GGUF)**

| Quantization | Size | Quality | Best For |
|:---|:---:|:---:|---|
| `F32` | 15.68 GB | ⭐⭐⭐⭐⭐ | Maximum precision |
| `F16` | 7.85 GB | ⭐⭐⭐⭐⭐ | High quality, moderate VRAM |
| `BF16` | 7.85 GB | ⭐⭐⭐⭐⭐ | Native training precision |
| `Q8_0` | 4.17 GB | ⭐⭐⭐⭐⭐ | Near-lossless |
| `Q6_K` | 3.23 GB | ⭐⭐⭐⭐ | High quality |
| `Q5_K_M` | 2.90 GB | ⭐⭐⭐⭐ | Great balance |
| `Q4_K_M` | 2.52 GB | ⭐⭐⭐⭐ | 🏆 **Recommended** |
| `Q3_K_M` | 2.10 GB | ⭐⭐⭐ | Low memory |
| `Q2_K` | 1.67 GB | ⭐⭐⭐ | Minimum RAM / CPU-only |
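From Python, the `llama-cpp-python` bindings can pull a quant straight from the Hub. A minimal sketch; the `filename` glob assumes the GGUF files carry the quant name, so verify against the repo's file list:

```python
from llama_cpp import Llama

# Download and load the recommended Q4_K_M quant from the Hub.
llm = Llama.from_pretrained(
    repo_id="Featherlabs/Aethon-4b-GGUF",
    filename="*Q4_K_M.gguf",   # filename pattern assumed
    n_ctx=8192,                # match the trained context length
)

response = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Who are you?"}],
    max_tokens=256,
)
print(response["choices"][0]["message"]["content"])
```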

---

## ⚠️ Limitations

- **English only** — multilingual performance has not been tested
- **Specialized model** — optimized for identity and safety, so expect trade-offs on general benchmarks
- **Not for high-stakes domains** — medical, legal, and financial use requires additional safeguards
- **Small model** — at 4B parameters, it carries less general knowledge than larger models

---

## 🔮 What's Next

**Aethon v2** is planned with:

- 🎯 Larger base models (8B+)
- 📚 Expanded dataset (10K+ samples)
- 📈 Benchmark-targeted training
- 🧪 DPO/RLHF alignment training

---

## 📜 License

Apache 2.0 — consistent with [Qwen3.5-4B](https://huggingface.co/Qwen/Qwen3.5-4B).

---

<div align="center">

**Built with ❤️ by [Featherlabs](https://huggingface.co/Featherlabs)**

*Operated by Owlkun*

</div>