🚀 TIGER-OM (SKT-OM) - 13B MoE Agentic Model

Advanced 13B Mixture-of-Experts (MoE) Model optimized for Agentic RAG with Think Mode & Plugin Architecture.

Built for AMD Developer Hackathon 2026 using AMD Developer Cloud.


📊 Model Details

  • Model Name: TIGER-OM (SKT-OM)
  • Architecture: Mixture of Experts (MoE)
  • Total Parameters: 13B (far fewer parameters are active per token thanks to MoE sparsity)
  • Base Models:
    • Primary Base: Shrijanagain/ST-X-0
    • Expert Integration: Mistral-7B
  • Format: Safetensors (Safe & Fast loading)
  • Quantization: FP16 / BF16 (original weights); Q4_K_M GGUF available in a separate repo
  • Context Length: 8192 tokens
  • Training Hardware: AMD Developer Cloud GPUs ($100 developer credits)
  • Inference Optimized: ROCm 7.0 + vLLM + AMD MI300X

🌟 Key Features

  • True MoE Architecture — Sparse activation for better efficiency and performance
  • Think Mode Reasoning — Advanced Chain-of-Thought, Planning, Self-Reflection & Verification
  • Dynamic Plugin System — Intelligent routing to Code, Math, Search, and Data Analysis plugins (a minimal parsing-and-routing sketch follows this list)
  • Agentic Capabilities — Full LangGraph multi-agent workflow
  • Advanced RAG Integration — SKT RAG + Query Rewriting + Multi-hop + Reranking
  • Stateful Memory — Persistent conversation context
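
The exact Think Mode output format and plugin-call schema are not documented in this card. The sketch below is a hypothetical illustration only: it assumes the reasoning trace is wrapped in <think>...</think> tags and that plugin calls can be dispatched from a simple name/payload pair; neither detail is confirmed by the model.

import re

# Hypothetical plugin registry; the real plugin head may expose a different interface.
PLUGINS = {
    "search": lambda query: f"[search results for: {query}]",
    "code":   lambda source: f"[execution output for: {source!r}]",
}

def parse_think_mode(text):
    """Split an assumed <think>...</think> reasoning block from the visible answer."""
    match = re.search(r"<think>(.*?)</think>", text, re.DOTALL)
    reasoning = match.group(1).strip() if match else ""
    answer = re.sub(r"<think>.*?</think>", "", text, flags=re.DOTALL).strip()
    return reasoning, answer

def dispatch_plugin(name, payload):
    """Route a plugin call emitted by the model to a local tool (illustrative only)."""
    if name not in PLUGINS:
        raise ValueError(f"unknown plugin: {name}")
    return PLUGINS[name](payload)

demo = "<think>The user needs fresh data, so call the search plugin.</think>Searching now."
reasoning, answer = parse_think_mode(demo)
print(reasoning)                                              # hidden reasoning trace
print(dispatch_plugin("search", "MI300X memory bandwidth"))   # routed tool call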

🏗️ Architecture Breakdown

TIGER-OM is built on a 13B MoE backbone:

  • Base: Shrijanagain/ST-X-0 (strong foundational model)
  • Experts: Fine-tuned using Mistral-7B as expert layers for specialized reasoning and tool-use capabilities
  • Router Network: Learned gating mechanism for expert selection
  • Think Mode Layer: Custom system prompt + reasoning controller
  • Plugin Head: Tool calling & execution layer

This hybrid approach (ST-X-0 + Mistral-7B experts) gives excellent reasoning, code understanding, and general intelligence while maintaining MoE efficiency.
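
The router network above is described only as a learned gating mechanism; the number of experts, top-k value, and hidden size are not published here. The PyTorch sketch below is a generic top-k gating layer of the kind used in sparse MoE blocks, with made-up dimensions, not the actual TIGER-OM implementation.

import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKRouter(nn.Module):
    """Generic learned gating: score all experts per token, keep only the top-k."""
    def __init__(self, hidden_size=4096, num_experts=8, top_k=2):
        super().__init__()
        self.gate = nn.Linear(hidden_size, num_experts, bias=False)
        self.top_k = top_k

    def forward(self, hidden_states):
        # hidden_states: (batch, seq_len, hidden_size)
        logits = self.gate(hidden_states)                  # (batch, seq, num_experts)
        weights, expert_ids = torch.topk(logits, self.top_k, dim=-1)
        weights = F.softmax(weights, dim=-1)               # renormalise over the chosen experts
        return weights, expert_ids                         # used to mix the selected experts' outputs

router = TopKRouter()
w, ids = router(torch.randn(1, 16, 4096))
print(w.shape, ids.shape)  # torch.Size([1, 16, 2]) torch.Size([1, 16, 2])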


📁 Files in this Repo (Safetensors)

  • model-00001-of-0000X.safetensors → Main model weights
  • config.json
  • tokenizer.json / tokenizer_config.json
  • generation_config.json
  • special_tokens_map.json
  • model.safetensors.index.json

All weights are stored in the safetensors format, so there is no pickle deserialization risk.
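
If you only want to inspect the shards, the safetensors library can read tensor metadata without materialising the weights. A minimal sketch, assuming the files above have been downloaded locally; the shard filename mirrors the placeholder name listed above rather than a literal path.

import json
from safetensors import safe_open

# model.safetensors.index.json maps every tensor name to the shard that stores it.
with open("model.safetensors.index.json") as f:
    index = json.load(f)
shards = set(index["weight_map"].values())
print(f"{len(index['weight_map'])} tensors across {len(shards)} shards")

# Lazily open one shard and list a few tensor names and shapes without loading them.
with safe_open("model-00001-of-0000X.safetensors", framework="pt") as f:
    for name in list(f.keys())[:5]:
        print(name, f.get_slice(name).get_shape())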


🚀 How to Use (Safetensors)

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_name = "Shrijanagain/TIGER-OM"

# Load the tokenizer and the sharded safetensors weights in bfloat16,
# letting accelerate place layers across the available GPUs.
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True
)

prompt = """You are SKT-OM, an advanced agentic AI with Think Mode enabled.
User Query: Calculate training cost comparison and suggest best option..."""

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Sample up to 1024 new tokens with moderate temperature and nucleus sampling.
outputs = model.generate(
    **inputs,
    max_new_tokens=1024,
    temperature=0.7,
    top_p=0.9,
    do_sample=True,
    repetition_penalty=1.1
)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
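
For higher-throughput serving on MI300X (see the stack section below), the same checkpoint can also be loaded with vLLM. A minimal sketch; dtype and max_model_len are reasonable choices given the card's stated BF16 weights and 8192-token context, not tuned values.

from vllm import LLM, SamplingParams

llm = LLM(
    model="Shrijanagain/TIGER-OM",
    dtype="bfloat16",
    max_model_len=8192,
    trust_remote_code=True,
)

params = SamplingParams(temperature=0.7, top_p=0.9, max_tokens=1024)

prompt = """You are SKT-OM, an advanced agentic AI with Think Mode enabled.
User Query: Summarise the trade-offs between MoE and dense 13B models."""

outputs = llm.generate([prompt], params)
print(outputs[0].outputs[0].text)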

🛠️ Technologies & Stack

  • Base Models: Shrijanagain/ST-X-0 + Mistral-7B Experts
  • RAG: SKT RAG + AMD ADK Kit
  • Agents: LangGraph (see the workflow sketch after this list)
  • Hardware: AMD MI300X + ROCm 7.0
  • Inference: vLLM (FP16) + transformers (Safetensors)
  • Training: AMD Developer Cloud
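
The LangGraph workflow itself is not included in this card, so the sketch below only shows the general shape of a two-node graph (think, then answer) around a placeholder generate() call. The node names, state schema, and generate() stub are illustrative, not the project's actual agent graph.

from typing import TypedDict
from langgraph.graph import StateGraph, END

class AgentState(TypedDict):
    query: str
    plan: str
    answer: str

def generate(prompt: str) -> str:
    # Placeholder for a call into the model loaded above (transformers or vLLM).
    return f"[model output for: {prompt}]"

def think(state: AgentState) -> dict:
    return {"plan": generate(f"Think step by step about: {state['query']}")}

def answer(state: AgentState) -> dict:
    return {"answer": generate(f"Plan: {state['plan']}\nAnswer the query: {state['query']}")}

graph = StateGraph(AgentState)
graph.add_node("think", think)
graph.add_node("answer", answer)
graph.set_entry_point("think")
graph.add_edge("think", "answer")
graph.add_edge("answer", END)

app = graph.compile()
print(app.invoke({"query": "Compare MI300X and a dense 13B deployment."})["answer"])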

⚡ Performance

  • Excellent balance of quality vs efficiency due to MoE architecture
  • Strong performance on reasoning, tool-use, code, and multi-step tasks
  • Significantly lower inference cost than dense 13B+ models (an illustrative calculation follows this list)
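
The expert count and routing top-k are not published in this card, so exact active-parameter numbers cannot be stated. The calculation below is purely illustrative, using made-up numbers that happen to total 13B, to show why sparse activation cuts per-token compute relative to a dense model of the same size.

def active_params(shared_b, expert_b, num_experts, top_k):
    """Total vs. per-token active parameters (in billions) for a sparse MoE; illustrative only."""
    total = shared_b + num_experts * expert_b
    active = shared_b + top_k * expert_b
    return total, active

# Hypothetical configuration, NOT the published TIGER-OM layout.
total, active = active_params(shared_b=3.0, expert_b=1.25, num_experts=8, top_k=2)
print(f"total = {total:.1f}B, active per token = {active:.1f}B")  # total = 13.0B, active per token = 5.5B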

📌 Use Cases

  • Complex technical Q&A
  • Agentic workflows & tool calling
  • Research assistance
  • Code generation & debugging
  • Mathematical & logical reasoning
  • Comparative analysis
  • Data analysis with plugins

🏆 Hackathon

AMD Developer Hackathon 2026
Trained entirely on AMD Developer Cloud
Built fully in public, with multiple technical updates shared along the way.


📄 License

MIT License

