Arvi-20B

Arvi-20B is a foundational reasoning model developed by Metanthropic Research.

🚀 Model Summary

Arvi-20B is built on a sparse Mixture-of-Experts (MoE) architecture. It holds 20.9 billion parameters in total but activates only about 3.6 billion of them per token, so it offers the capacity of a roughly 20-billion-parameter model at a per-token compute cost closer to that of a much smaller one.

Designed for complex reasoning tasks, Arvi-20B targets chain-of-thought generation, agentic tool use, and precise instruction following. It is the flagship model of Metanthropic's open-weight initiative.

📊 Technical Specifications

| Feature | Specification |
| --- | --- |
| Developer | Metanthropic Research |
| Model Architecture | Sparse Mixture-of-Experts (MoE) |
| Total Parameters | 20.9 Billion |
| Active Parameters | 3.6 Billion (per token) |
| Context Window | 128,000 Tokens |
| Precision | BFloat16 (Native) |
| License | Apache 2.0 |
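
As a rough worked example of what these figures imply, the sketch below computes the fraction of parameters active per token and the approximate storage needed for the weights alone in BFloat16. It assumes 2 bytes per parameter and 1 GB = 1e9 bytes, and it ignores the KV cache, activations, and framework overhead, so real memory use will be higher. In a sparse MoE model all experts generally have to be resident in memory even though only some are used for a given token, which is why the total parameter count is the one used for the memory estimate.

```python
# Figures copied from the specification table; everything else is an assumption
# made for illustration (2 bytes per BF16 parameter, 1 GB = 1e9 bytes).
TOTAL_PARAMS = 20.9e9
ACTIVE_PARAMS = 3.6e9
BYTES_PER_PARAM_BF16 = 2  # bfloat16 is a 16-bit format


def active_fraction(active: float, total: float) -> float:
    """Share of the parameters used for a single token."""
    return active / total


def weight_memory_gb(params: float, bytes_per_param: int) -> float:
    """Approximate storage for the weights only, in gigabytes."""
    return params * bytes_per_param / 1e9


if __name__ == "__main__":
    fraction = active_fraction(ACTIVE_PARAMS, TOTAL_PARAMS)
    memory = weight_memory_gb(TOTAL_PARAMS, BYTES_PER_PARAM_BF16)
    print(f"Active fraction per token: {fraction:.1%}")
    print(f"BF16 weights (all experts): ~{memory:.1f} GB")
```

The active fraction drives per-token compute, while the total parameter count drives how much memory the weights occupy.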

⚡ Capabilities

  • Deep Reasoning: Native capability to deconstruct complex queries into logical steps before generating a final answer.
  • Agentic Workflow: Optimized for function calling and tool interaction, allowing integration into autonomous systems.
  • Efficiency: Activates only ~17% of parameters per token (3.6B of 20.9B), which reduces per-token compute and enables deployment on standard enterprise hardware.
  • Long Context: Accepts inputs of up to 128k tokens, enough to analyze long documents in a single pass.
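
When working near the context limit, it can help to check the token budget before calling the model. The sketch below is a minimal illustration, assuming that the 128,000-token window covers both the prompt and the generated tokens (the usual behavior for decoder-only models) and taking the figure from the specification table literally. The helper names `max_prompt_tokens` and `fits_in_context` are hypothetical and not part of any Arvi or Transformers API.

```python
CONTEXT_WINDOW = 128_000  # tokens, as listed in the specification table


def max_prompt_tokens(max_new_tokens: int, context_window: int = CONTEXT_WINDOW) -> int:
    """Largest prompt that still leaves room for max_new_tokens of output."""
    if max_new_tokens < 0 or max_new_tokens > context_window:
        raise ValueError("max_new_tokens must be between 0 and the context window")
    return context_window - max_new_tokens


def fits_in_context(
    prompt_tokens: int,
    max_new_tokens: int,
    context_window: int = CONTEXT_WINDOW,
) -> bool:
    """True if the prompt plus the requested output fits in the window."""
    return prompt_tokens + max_new_tokens <= context_window


if __name__ == "__main__":
    budget = max_prompt_tokens(max_new_tokens=256)
    print(f"Prompt budget with 256 new tokens: {budget} tokens")
    print("120k-token prompt fits:", fits_in_context(120_000, 256))
```

In practice the prompt length would come from the tokenizer, for example the length of the `input_ids` it returns for the document.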

🛠️ Installation & Usage

Arvi-20B utilizes a specialized MoE architecture. To run the model, you must install the required backend kernels and libraries.

1. Install Dependencies

# Install required backend support for Arvi's architecture
pip install gpt-oss transformers peft accelerate torch

2. Python Inference

import torch
import gpt_oss  # Registers the Arvi MoE architecture
from transformers import AutoModelForCausalLM, AutoTokenizer

# 1. Configuration
model_id = "metanthropic/arvi-20b"

print(f"🚀 Loading {model_id}...")

# 2. Load Model
# We recommend BFloat16 for the best balance of speed and precision
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,  # Required for Arvi architecture
    device_map="auto"
)

# 3. Load Tokenizer
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)

# 4. Generate
prompt = "Explain the grandfather paradox and potential resolutions."
# Move inputs to the device chosen by device_map="auto" instead of hard-coding "cuda"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

with torch.no_grad():
    outputs = model.generate(
        **inputs, 
        max_new_tokens=256, 
        temperature=0.7,
        do_sample=True
    )

print(tokenizer.batch_decode(outputs, skip_special_tokens=True)[0])
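
For decoder-only models, `generate` returns the prompt tokens followed by the newly generated ones, so the print statement above echoes the prompt before the answer. A minimal sketch of how to keep only the new tokens is shown below. The helper `new_tokens_only` is hypothetical and works on plain lists so that it can be run without downloading the model; the commented lines show how the same slicing would apply to the `inputs`, `outputs`, and `tokenizer` objects from the snippet above.

```python
from typing import List, Sequence


def new_tokens_only(output_ids: Sequence[int], prompt_length: int) -> List[int]:
    """Return only the token ids generated after the prompt."""
    if prompt_length < 0 or prompt_length > len(output_ids):
        raise ValueError("prompt_length must be between 0 and len(output_ids)")
    return list(output_ids[prompt_length:])


# With the objects from the inference snippet, the equivalent would be
# (not executed here, since it requires the model to be loaded):
#   prompt_length = inputs["input_ids"].shape[-1]
#   answer_ids = outputs[0][prompt_length:]
#   print(tokenizer.decode(answer_ids, skip_special_tokens=True))

if __name__ == "__main__":
    # Toy ids: the first three stand in for the prompt, the rest for the answer.
    print(new_tokens_only([101, 102, 103, 7, 8, 9], prompt_length=3))
```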

📜 License & Citation

Arvi-20B is released under the Apache 2.0 license, allowing for broad commercial use, modification, and redistribution.

If you utilize this model in your research or products, please cite Metanthropic Research:

@misc{arvi2025,
      title={Arvi-20B: High-Efficiency Reasoning Model}, 
      author={{Metanthropic} and Ekjot Singh},
      year={2025},
      publisher={Hugging Face}
}