---
library_name: transformers
license: apache-2.0
base_model:
- LLM-OS-Models/Fabliq-8B-Agent
- LLM-OS-Models/LFM2.5-8B-A1B-Terminal-ToolBench-Full-SFT-1Epoch
- LiquidAI/LFM2.5-8B-A1B
base_model_relation: finetune
tags:
- lfm
- liquid-ai
- moe
- agentic
- tool-use
- terminal
- reasoning
- fable-5
- withinus
- helio
- distillation
- sft
language:
- en
pipeline_tag: text-generation
---

# Fabliq-8B-Agent-Reasoning 🌊🧠

> The **reasoning-expanded sibling** of [Fabliq-8B-Agent](https://huggingface.co/LLM-OS-Models/Fabliq-8B-Agent). It adds general and deep reasoning on top of the agentic foundation, broadening the model beyond pure terminal tool use into multi-domain expert Q&A, mathematical reasoning, scientific analysis, and cybersecurity. The two-phase curriculum is inspired by [Qwythos-9B](https://huggingface.co/empero-ai/Qwythos-9B-Claude-Mythos-5-1M).

## ✨ Why Fabliq-Reasoning?

- **🏠 Same tiny footprint, broader reach.** Inherits LFM2.5-8B-A1B's MoE efficiency (~1B active params). Now also handles expert Q&A, math, and science, not just terminal work.
- **🛠 Still agentic.** Phase-1 tool-use capability is preserved: the model still reasons in `` and emits native LFM tool calls when needed.
- **🧠 Multi-domain reasoning.** Trained on WithinUs (6 categories: advanced coding, agentic planning, general QA, math reasoning, scientific analysis, cybersecurity) plus Helio (Opus 4.8 deep-reasoning distillation).
- **🎯 2-phase curriculum.** Phase-1 broad agentic distillation → Phase-2 focused reasoning expansion (Qwythos pattern).

## 🧪 Model details

| | |
| --- | --- |
| **Architecture** | Lfm2MoeForCausalLM (24 layers, 32 experts, 4 experts/token) |
| **Parameters** | ~8B total / ~1B active (MoE) |
| **Context** | 8,192 trained · 128K native (`rope_theta=5e6`) |
| **Precision** | bfloat16 |
| **Fine-tune type** | Full-parameter SFT, continuation from Fabliq-8B-Agent |
| **License** | Apache 2.0 |

## 📚 Training data (Phase-2 only)

| Source | Rows | Description |
| --- | --- | --- |
| WithinUs (from `claude_mythos_distilled_25k`) | 135 | 6-category expert Q&A: coding, planning, general QA, math, science, cybersecurity. SHA-256 dedup (25k → 135 unique). |
| Helio (`Fable-5-Distill-Reasoning-462x`) | 146 | Opus 4.8 deep-reasoning traces. Russian-language filter (Cyrillic ratio <30%; 462 → 146 rows). |
| **Total Phase-2** | **281** | |

Preprocessing:

- **WithinUs:** Category-balanced (max 350/cat), SHA-256 dedup, removal of the templated first sentence ("Drawing from the autonomous...") → [`build_withinus_lfm_sft.py`](https://github.com/LLM-OS-Models/Terminal/blob/main/fable_distillation/build_withinus_lfm_sft.py)
- **Helio:** Cyrillic ratio filter (<30%), `` wrapping for reasoning, skip of the corrupted record at line 192 → [`build_helio_lfm_sft.py`](https://github.com/LLM-OS-Models/Terminal/blob/main/fable_distillation/build_helio_lfm_sft.py)
- **Combined:** [`build_phase2_reasoning`](https://github.com/LLM-OS-Models/Terminal/blob/main/fable_distillation/datasets/) (concat)

## 🔧 Training procedure (Phase-2)

| Hyperparameter | Value |
| --- | --- |
| Base | `LLM-OS-Models/Fabliq-8B-Agent` (Phase-1 final) |
| Schedule | 4 epochs, constant LR |
| Max sequence length | 8,192 |
| Per-device batch size | 2 |
| Gradient accumulation | 4 |
| GPUs | 8× H200 (effective batch 64) |
| Learning rate | 3e-7 (lower than Phase-1: the model is already agentic-tuned, so a small LR limits forgetting) |
| Precision | bf16 |
| FSDP | `full_shard`, activation checkpointing, `Lfm2MoeDecoderLayer` auto-wrap |
| Final train_loss | ~1.6 |
| Train runtime | ~6 minutes (281 rows × 4 epochs) |
| Global steps | 20 |

## 💬 System prompts (per data source)

**WithinUs (broad reasoning):**

```
You are a knowledgeable assistant. Provide rigorous, well-structured answers across coding, cybersecurity, mathematics, scientific analysis, agentic planning, and general expert topics. Be precise and thorough.
```

**Helio (deep reasoning):**

```
You are a deep-reasoning assistant. Think step by step inside ..., then provide a clear, structured answer.
```

## 🚀 How to use

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "LLM-OS-Models/Fabliq-8B-Agent-Reasoning"
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, dtype=torch.bfloat16, device_map="auto"
)

# Helio-style system prompt, matching the one used during Phase-2 training.
SYSTEM = (
    "You are a deep-reasoning assistant. Think step by step inside ..., "
    "then provide a clear, structured answer."
)

messages = [
    {"role": "system", "content": SYSTEM},
    {"role": "user", "content": "Derive the time complexity of merge sort and explain when it beats quicksort."},
]

# Render the chat template to text, then tokenize and move to the model's device.
text = tok.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tok(text, return_tensors="pt").to(model.device)

# Greedy decoding with a mild repetition penalty.
out = model.generate(
    **inputs,
    max_new_tokens=2048,
    do_sample=False,
    repetition_penalty=1.05,
)

# Decode only the newly generated tokens; special tokens are kept visible.
print(tok.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=False))
```

## 🎯 When to use which Fabliq?

| Use case | Model |
| --- | --- |
| Pure terminal / coding agent (read, edit, run, verify) | [Fabliq-8B-Agent](https://huggingface.co/LLM-OS-Models/Fabliq-8B-Agent) |
| Multi-domain expert Q&A + reasoning + still agentic | **Fabliq-8B-Agent-Reasoning** (this model) |
| Local 16GB VRAM deployment with tool-use | Either; both fit comfortably |

## ⚠️ Limitations

- **Phase-2 dataset is small (281 rows).** Reasoning expansion is real but bounded: this is a delta on top of Phase-1, not a from-scratch reasoning model.
- **WithinUs dedup surprise.** The source dataset claims 25k rows, but after SHA-256 dedup of templated prompts only 135 unique rows remain. Template overfitting in the source data was severe.
- **Helio Russian filter.** The original 462 rows were filtered to 146 after removing Cyrillic-dominant (Russian) traces. Non-English coverage is limited.
- **No safety alignment.** Trained on raw reasoning traces; add your own guardrails for production.
- **Max seq 8,192 at training.** Behavior beyond 8K context is unverified.
- **English-centric.**

## 📜 License

Apache 2.0, inherited from the [LiquidAI LFM2.5-8B-A1B](https://huggingface.co/LiquidAI/LFM2.5-8B-A1B) base.

## 🌳 Model tree

This is a **fine-tune** (continuation SFT). Direct parent: [`LLM-OS-Models/Fabliq-8B-Agent`](https://huggingface.co/LLM-OS-Models/Fabliq-8B-Agent).

```
LiquidAI/LFM2.5-8B-A1B  (LiquidAI base)
 └─ LLM-OS-Models/LFM2.5-8B-A1B-Terminal-ToolBench-Full-SFT-1Epoch  (ToolBench foundation)
     └─ LLM-OS-Models/Fabliq-8B-Agent  (Phase-1: Fable-5 agentic SFT)
         └─ LLM-OS-Models/Fabliq-8B-Agent-Reasoning  ← this model (Phase-2: + WithinUs + Helio)
```

## 🙏 Acknowledgements

- **Base:** [LiquidAI/LFM2.5-8B-A1B](https://huggingface.co/LiquidAI/LFM2.5-8B-A1B)
- **Phase-1 parent:** [Fabliq-8B-Agent](https://huggingface.co/LLM-OS-Models/Fabliq-8B-Agent)
- **Phase-2 data:** WithinUs (claude_mythos_distilled_25k), Helio (Fable-5-Distill-Reasoning-462x)
- **Reference:** [empero-ai/Qwythos-9B-Claude-Mythos-5-1M](https://huggingface.co/empero-ai/Qwythos-9B-Claude-Mythos-5-1M) (2-phase curriculum pattern)
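
As a rough consistency check on the Phase-2 training numbers reported above (281 rows, 4 epochs, per-device batch 2, gradient accumulation 4, 8 GPUs), the effective batch size and global step count can be recomputed in a few lines. This is only a sketch: it assumes the partial final batch of each epoch counts as one optimizer step, which may not match every trainer configuration.

```python
import math

# Values taken from the Phase-2 training tables in this card.
rows = 281
epochs = 4
per_device_batch = 2
grad_accum = 4
num_gpus = 8

# Samples consumed per optimizer step across all devices.
effective_batch = per_device_batch * grad_accum * num_gpus

# Assumption: a partial final batch still counts as one step (ceil, not floor).
steps_per_epoch = math.ceil(rows / effective_batch)
global_steps = steps_per_epoch * epochs

print(effective_batch, steps_per_epoch, global_steps)  # prints: 64 5 20
```

Under that assumption the result agrees with the reported effective batch of 64 and 20 global steps.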
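
The two Phase-2 filters described in this card (SHA-256 dedup for WithinUs, a Cyrillic-ratio cutoff for Helio) can be sketched as below. This is not the code from `build_withinus_lfm_sft.py` or `build_helio_lfm_sft.py`; the function names, the `prompt` and `response` field names, the whitespace/case normalization before hashing, and measuring the ratio over alphabetic characters only are all assumptions made for illustration.

```python
import hashlib
import unicodedata


def sha256_key(text: str) -> str:
    """Hash a prompt after lowercasing and collapsing whitespace (assumed normalization)."""
    normalized = " ".join(text.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def dedup_rows(rows: list[dict], field: str = "prompt") -> list[dict]:
    """Keep the first row seen for each distinct hash of `field` (hypothetical field name)."""
    seen: set[str] = set()
    unique = []
    for row in rows:
        key = sha256_key(row[field])
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


def cyrillic_ratio(text: str) -> float:
    """Fraction of alphabetic characters whose Unicode name marks them as Cyrillic."""
    letters = [ch for ch in text if ch.isalpha()]
    if not letters:
        return 0.0
    cyrillic = sum(1 for ch in letters if "CYRILLIC" in unicodedata.name(ch, ""))
    return cyrillic / len(letters)


def keep_mostly_non_cyrillic(
    rows: list[dict], field: str = "response", max_ratio: float = 0.30
) -> list[dict]:
    """Keep rows whose Cyrillic ratio is below the cutoff (30% in this card)."""
    return [row for row in rows if cyrillic_ratio(row[field]) < max_ratio]
```

For example, `dedup_rows([{"prompt": "Explain  TCP"}, {"prompt": "explain tcp"}])` keeps a single row under the assumed normalization, which illustrates how heavily templated prompts can collapse to a small unique set.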