LumiVore-1.2B

LumiVore-1.2B is a Mixture-of-Experts (MoE) language model fine-tuned for agentic workflows and conversational AI. Trained entirely on consumer hardware (AMD RX 7600 XT 16GB), it demonstrates that capable language models can be developed without datacenter-scale resources.

Model Details

Attribute            Value
Architecture         Mixture-of-Experts (DeepSeek-MoE style)
Base Model           Qwen2.5-0.5B-Instruct
Total Parameters     1.36B
Active Parameters    ~610M per token (top-2 routing)
Experts              8 (1 shared + 7 routed)
MoE Layers           8 of 24 transformer layers
Context Length       2048 tokens
Precision            bfloat16

Architecture

LumiVore-1.2B uses a Mixture-of-Experts architecture with:

  • 8 experts total: 1 shared expert always active + 7 routed experts
  • Top-2 routing: For each token, 2 experts are active: the always-on shared expert plus the top-1 routed expert selected by the router
  • Sparse activation: Only ~610M parameters are active per token despite 1.36B total
  • Load balancing: Auxiliary losses ensure even expert utilization

This design provides the capacity of a larger model with the inference cost of a smaller one.
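The routing scheme above can be sketched in plain Python. This is illustrative only: the function name, the separation of shared and routed expert lists, and the logit values are assumptions, not the model's actual code.

```python
import math

def route(router_logits, num_shared=1, top_k_routed=1):
    """Select active experts for one token: every shared expert is always
    active, plus the top-k routed experts by router score.
    Illustrative sketch; names and shapes are assumptions."""
    # Softmax over the routed experts' logits to get routing weights.
    exps = [math.exp(x) for x in router_logits]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Pick the top-k routed experts by weight.
    ranked = sorted(range(len(weights)), key=lambda i: weights[i], reverse=True)
    routed = ranked[:top_k_routed]
    shared = list(range(num_shared))  # shared expert indices, always selected
    return shared, routed, weights

# 7 routed experts, as in LumiVore-1.2B; logits are made up.
shared, routed, weights = route([0.2, 1.5, -0.3, 0.8, 0.1, -1.0, 0.4])
print(shared, routed)  # -> [0] [1]: shared expert plus the highest-scoring routed expert
```

With 1 shared and top-1 routed, exactly 2 experts run per token, which is where the ~610M active-parameter figure comes from.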

Training

Stage 1: Capability Building

  • Dataset: TerminalTrajectories + OpenThoughts (~11,600 examples)
  • Method: Full fine-tuning, combined with LoRA on the routing layers
  • Duration: ~5.4 hours
  • Goal: General agent capabilities, tool use, reasoning

Stage 2: Domain Adaptation

  • Dataset: OpenClaw agent-specific data (~11,900 examples)
  • Method: LoRA fine-tuning (rank=64, attention + routing)
  • Duration: ~5 hours
  • Goal: OpenClaw ecosystem specialization
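LoRA, as used in both stages, freezes the base weights and learns a low-rank additive update W' = W + (alpha / r) * (B @ A). A minimal sketch with toy sizes (the real adapters use rank 64; the matrices below are hypothetical):

```python
def matmul(A, B):
    """Naive matrix multiply for small illustrative matrices."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def lora_update(W, A, B, alpha, r):
    """Apply a LoRA update: W' = W + (alpha / r) * (B @ A).
    Toy rank r=2 here purely for illustration (training used r=64)."""
    delta = matmul(B, A)          # low-rank update, shape of W
    scale = alpha / r             # standard LoRA scaling
    return [[w + scale * d for w, d in zip(wr, dr)]
            for wr, dr in zip(W, delta)]

# 3x3 frozen weight, rank-2 adapter factors (hypothetical values).
W = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
B = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]   # 3x2
A = [[0.5, 0.0, 0.0], [0.0, 0.5, 0.0]]     # 2x3
W2 = lora_update(W, A, B, alpha=2, r=2)
print(W2)  # -> [[1.5, 0.0, 0.0], [0.0, 1.5, 0.0], [0.0, 0.0, 1.0]]
```

Because only A and B are trained, optimizer state scales with the adapter rank rather than the full weight matrix, which is what makes the 16GB budget workable.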

Hardware

  • GPU: AMD RX 7600 XT (16GB VRAM)
  • Framework: PyTorch with ROCm
  • Optimizer: 8-bit AdamW
  • Total Training Time: ~10 hours
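A back-of-envelope check that full fine-tuning fits in 16GB VRAM. This is a rough estimate only: activation memory, KV caches, and framework overhead are ignored, and the per-parameter byte counts are assumptions based on bf16 weights/gradients and 8-bit AdamW's two 1-byte states.

```python
def training_memory_gb(params, weight_bytes=2, grad_bytes=2, optim_bytes=2):
    """Rough VRAM estimate for full fine-tuning: bf16 weights (2 B),
    bf16 gradients (2 B), and 8-bit AdamW states (~2 B total per param).
    Ignores activations and overhead; illustrative, not exact."""
    return params * (weight_bytes + grad_bytes + optim_bytes) / 1024**3

est = training_memory_gb(1.36e9)
print(f"~{est:.1f} GB")
```

The estimate lands well under 16GB, leaving headroom for activations at the 2048-token context length.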

Usage

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_id = "LumiVore/lumivore-1.2b"

model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True
)
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)

prompt = """You are a helpful AI assistant.

User: Hello!
Assistant:"""
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

outputs = model.generate(**inputs, max_new_tokens=100, temperature=0.7, do_sample=True)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
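Since the base model is Qwen2.5-Instruct, the tokenizer likely ships a ChatML chat template, and tokenizer.apply_chat_template(messages, add_generation_prompt=True) is safer than hand-built "User:/Assistant:" prompts. A sketch of the format such a template typically produces (assumed; verify against the tokenizer actually shipped with the model):

```python
def to_chatml(messages):
    """Format a conversation in ChatML, the template family used by
    Qwen2.5-based models. In practice prefer tokenizer.apply_chat_template;
    this manual version only shows the expected structure."""
    parts = [f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n"
             for m in messages]
    # Trailing generation prompt tells the model to speak as the assistant.
    return "".join(parts) + "<|im_start|>assistant\n"

prompt = to_chatml([
    {"role": "system", "content": "You are a helpful AI assistant."},
    {"role": "user", "content": "Hello!"},
])
print(prompt)
```

Matching the training-time template usually improves instruction following and reduces the identity drift noted under Limitations.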

Limitations

  • Small base: Built on Qwen2.5-0.5B β€” foundational limitations apply
  • Training scale: 23K examples vs. millions for production models
  • Identity: May occasionally claim to be other models (GPT-4, Qwen, etc.)
  • Verbosity: Can be verbose; use system prompts to guide conciseness
  • No RLHF: No reinforcement learning from human feedback

Evaluation

This model prioritizes:

  • βœ… Agentic tool use β€” calling functions, following patterns
  • βœ… Structured outputs β€” JSON, markdown, code
  • βœ… Conversational flow β€” turn-taking, context tracking
  • ⚠️ Creative writing β€” not a primary training objective
  • ❌ Factual knowledge β€” limited by base model size

Citation

@misc{lumivore-1.2b,
  title={LumiVore-1.2B: A Mixture-of-Experts Model for Agentic AI},
  author={van Eek, Daniel},
  year={2026},
  url={https://huggingface.co/LumiVore/lumivore-1.2b}
}

License

Apache 2.0 β€” use it, modify it, ship it in your products.


LumiVore AI explores the future of intelligent systems β€” building AI that is efficient, adaptable, and accessible.
