--- language: - en license: apache-2.0 library_name: transformers pipeline_tag: text-generation base_model: - Qwen/Qwen3.5-9B datasets: - beita6969/SkillFlow-Dataset tags: - skillflow - agentic-ai - llm-agents - tool-use - lora-merged --- # SkillFlow Merged Supervisor This repository contains the merged SkillFlow Supervisor model weights. ## Source - Code: https://github.com/beita6969/SkillFlow - Dataset: https://huggingface.co/datasets/beita6969/SkillFlow-Dataset ## Merge details - Base model: Qwen/Qwen3.5-9B - Adapter type: LoRA - Adapter role: Supervisor forward policy `theta` - Checkpoint: `checkpoint_step_0110` - LoRA rank: 64 - LoRA alpha: 128 - Target modules: `q_proj`, `k_proj`, `v_proj`, `o_proj` - Merge dtype: bfloat16 The training-time backward policy adapter is not merged into this inference model.