# LumiVore-1.2B
LumiVore-1.2B is a Mixture-of-Experts (MoE) language model fine-tuned for agentic workflows and conversational AI. Trained entirely on consumer hardware (AMD RX 7600 XT 16GB), it demonstrates that capable language models can be developed without datacenter-scale resources.
## Model Details
| Attribute | Value |
|---|---|
| Architecture | Mixture-of-Experts (DeepSeek-MoE style) |
| Base Model | Qwen2.5-0.5B-Instruct |
| Total Parameters | 1.36B |
| Active Parameters | ~610M per token (top-2 routing) |
| Experts | 8 (1 shared + 7 routed) |
| MoE Layers | 8 of 24 transformer layers |
| Context Length | 2048 tokens |
| Precision | bfloat16 |
## Architecture
LumiVore-1.2B uses a Mixture-of-Experts architecture with:
- 8 experts total: 1 shared expert always active + 7 routed experts
- Top-2 routing: For each token, the router selects 2 experts (1 shared + 1 routed)
- Sparse activation: Only ~610M parameters are active per token despite 1.36B total
- Load balancing: Auxiliary losses ensure even expert utilization
This design provides the capacity of a larger model with the inference cost of a smaller one.
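The shared-plus-routed scheme above can be sketched in a few lines of PyTorch. This is an illustrative toy (class name, dimensions, and the simple max-probability router are all assumptions, not the model's actual implementation): one expert runs for every token, and the router adds the single best routed expert on top.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SharedPlusRoutedMoE(nn.Module):
    """Toy MoE layer: 1 always-active shared expert plus 1 of 7 routed
    experts per token (the '1 shared + 1 routed' scheme described above).
    Dimensions are illustrative, not the model's."""

    def __init__(self, d_model=64, d_ff=128, n_routed=7):
        super().__init__()
        ff = lambda: nn.Sequential(
            nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model)
        )
        self.shared = ff()
        self.routed = nn.ModuleList(ff() for _ in range(n_routed))
        self.router = nn.Linear(d_model, n_routed)

    def forward(self, x):                    # x: (tokens, d_model)
        probs = F.softmax(self.router(x), dim=-1)
        top_p, top_i = probs.max(dim=-1)     # best routed expert per token
        out = self.shared(x)                 # shared expert always active
        for e, expert in enumerate(self.routed):
            mask = top_i == e                # tokens assigned to expert e
            if mask.any():
                out[mask] = out[mask] + top_p[mask, None] * expert(x[mask])
        return out

layer = SharedPlusRoutedMoE()
y = layer(torch.randn(10, 64))
print(y.shape)  # torch.Size([10, 64])
```

Because only one routed expert's weights are touched per token, compute scales with the active parameter count (~610M) rather than the total (1.36B).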
## Training

### Stage 1: Capability Building
- Dataset: TerminalTrajectories + OpenThoughts (~11,600 examples)
- Method: Full fine-tuning, with LoRA adapters on routing layers
- Duration: ~5.4 hours
- Goal: General agent capabilities, tool use, reasoning
### Stage 2: Domain Adaptation
- Dataset: OpenClaw agent-specific data (~11,900 examples)
- Method: LoRA fine-tuning (rank=64, attention + routing)
- Duration: ~5 hours
- Goal: OpenClaw ecosystem specialization
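To make the rank=64 LoRA setup concrete, here is the underlying math as a minimal sketch: the frozen weight `W` is left untouched and a scaled low-rank product `B @ A` is added to its output. The hidden size 896 matches Qwen2.5-0.5B's attention projections; the `alpha` value is illustrative, not taken from the card.

```python
import torch

d = 896       # hidden size of Qwen2.5-0.5B attention projections
r = 64        # LoRA rank used in Stage 2 (per the list above)
alpha = 128   # scaling factor -- illustrative assumption

W = torch.randn(d, d)          # frozen base weight
A = torch.randn(r, d) * 0.01   # trainable down-projection
B = torch.zeros(d, r)          # trainable up-projection (zero-initialized)

x = torch.randn(d)
y = W @ x + (alpha / r) * (B @ (A @ x))   # base path + low-rank update

# Trainable parameters per adapted matrix vs. full fine-tuning:
full = d * d        # 802,816
lora = r * (d + d)  # 114,688 (~14% of full)
print(full, lora)
```

Since `B` starts at zero, the adapter initially leaves the base model's behavior unchanged; training only has to learn the low-rank delta.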
## Hardware
- GPU: AMD RX 7600 XT (16GB VRAM)
- Framework: PyTorch with ROCm
- Optimizer: 8-bit AdamW
- Total Training Time: ~10 hours
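The 8-bit optimizer is what makes this fit in 16GB: AdamW keeps two moment states per parameter, and quantizing them from 4 bytes to 1 byte cuts optimizer memory roughly 4x. A back-of-envelope estimate for 1.36B parameters (figures are approximations, ignoring quantization block metadata):

```python
params = 1.36e9  # total parameters (from the table above)
GiB = 1024**3

# AdamW stores two optimizer states (first and second moments) per parameter.
fp32_states = params * 2 * 4 / GiB  # standard 32-bit states
int8_states = params * 2 * 1 / GiB  # 8-bit quantized states

print(f"fp32 optimizer states: {fp32_states:.1f} GiB")   # ~10.1 GiB
print(f"8-bit optimizer states: {int8_states:.1f} GiB")  # ~2.5 GiB
```

At bfloat16, the weights themselves take another ~2.5 GiB, so the 8-bit states are the difference between fitting and not fitting alongside gradients and activations.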
## Usage
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_id = "LumiVore/lumivore-1.2b"

# Load the model in bfloat16 and let accelerate place it on available devices.
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
)
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)

prompt = (
    "You are a helpful AI assistant.\n"
    "User: Hello!\n"
    "Assistant:"
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=100, temperature=0.7, do_sample=True)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
## Limitations
- Small base: Built on Qwen2.5-0.5B; foundational limitations apply
- Training scale: 23K examples vs. millions for production models
- Identity: May occasionally claim to be other models (GPT-4, Qwen, etc.)
- Verbosity: Can be verbose; use system prompts to guide conciseness
- No RLHF: No reinforcement learning from human feedback
## Evaluation
This model prioritizes:
- ✅ Agentic tool use — calling functions, following patterns
- ✅ Structured outputs — JSON, markdown, code
- ✅ Conversational flow — turn-taking, context tracking
- ⚠️ Creative writing — not a primary training objective
- ❌ Factual knowledge — limited by base model size
## Resources
| Resource | Link |
|---|---|
| GitHub | https://github.com/dansan-claw/lumivore |
| Website | https://lumivore.ai |
| Discord | https://discord.gg/M7U8JCUukD |
| Datasets | See LumiVore organization |
## Citation

```bibtex
@misc{lumivore-1.2b,
  title={LumiVore-1.2B: A Mixture-of-Experts Model for Agentic AI},
  author={van Eek, Daniel},
  year={2026},
  url={https://huggingface.co/LumiVore/lumivore-1.2b}
}
```
## License
Apache 2.0: use it, modify it, ship it in your products.

LumiVore AI explores the future of intelligent systems, building AI that is efficient, adaptable, and accessible.