---
language:
- en
license: apache-2.0
tags:
- moe
- casual-llm
- low-latency
- kitefish
- reasoning
- reasoning-l1
---

# Reasoning-L1-10B

**Reasoning-L1-10B** is a compact, efficient causal language model by KiteFishAI. It is designed for low-latency inference on consumer hardware and offers strong reasoning and agentic capabilities.

## Model Details

| Property | Value |
|---|---|
| Model | Reasoning-L1 |
| Parameters | ~11.4B |
| Architecture | Mixture-of-Experts (MoE) |
| VRAM required | ~8 GB |
| Context length | 131K tokens |
| License | Apache 2.0 |

## Capabilities

- Configurable reasoning effort (low / medium / high)
- Full chain-of-thought support
- Function calling and structured outputs
- Web browsing and Python code execution (agentic)
- Multilingual support

## Usage

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "KiteFishAI/Reasoning-L1-10B"

# Load the weights in half precision and let Accelerate place them on the
# available devices. trust_remote_code=True runs code shipped in the model repo.
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.float16,
    device_map="auto",
    trust_remote_code=True,
)
tokenizer = AutoTokenizer.from_pretrained(
    MODEL_ID,
    trust_remote_code=True,
)

# Tokenize a prompt, move the tensors to the model's device, and generate.
inputs = tokenizer("Hello! What can you do?", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=200)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

## Ollama

```bash
ollama pull KiteFishAI/Reasoning-L1-10B
ollama run KiteFishAI/Reasoning-L1-10B
```

## License

Apache 2.0: free to use, modify, and distribute commercially.

---

*Built by [KiteFishAI](https://huggingface.co/KiteFishAI)*
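
## Appendix: rough weight-memory estimate

The parameter count in Model Details can be turned into a rough memory figure with simple arithmetic. The sketch below is an illustration only: `weight_memory_gb` is a hypothetical helper, not part of any library, and it counts model weights alone. It ignores the KV cache, activations, and framework overhead, so real usage will be higher. Which precision the ~8 GB figure refers to is not stated in this card.

```python
def weight_memory_gb(num_params: float, bits_per_param: int) -> float:
    """Approximate memory for model weights only, in decimal gigabytes."""
    return num_params * bits_per_param / 8 / 1e9


# Parameter count taken from the Model Details table (~11.4B).
PARAMS = 11.4e9

# Compare a few common storage precisions (assumed for illustration).
for label, bits in [("float16", 16), ("int8", 8), ("4-bit", 4)]:
    print(f"{label:>8}: ~{weight_memory_gb(PARAMS, bits):.1f} GB")
```

Under these assumptions, half-precision weights alone come to roughly 22.8 GB, while a 4-bit representation would need about 5.7 GB, so comparing such estimates against available VRAM is a reasonable first check before loading the model.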