πŸš€ MCGPT-1: Frontier Mixture-of-Experts (MoE) for Minecraft Intelligence

Model Org Arch Domain


πŸ’Ž Project Abstract

MCGPT-1 is a high-performance, specialized Large Language Model (LLM) developed by TopAI-IL. It represents a significant breakthrough in Small-Scale Parameter Efficiency, utilizing a proprietary Sparse Mixture of Experts (MoE) architecture.

While traditional small models often suffer from "forgetting" or lack of logic, MCGPT-1 has been engineered with an Implicit Reasoning Engine, allowing it to solve complex Minecraft-related problems while maintaining a strictly locked identity and professional persona.


✨ Key Technical Breakthroughs

🧠 1. Implicit Reasoning Engine (IRE)

Unlike standard generative models that predict tokens based on frequency, MCGPT-1 has been fine-tuned to simulate "Chain-of-Thought" processing. This allows the model to:

  • Analyze complex Redstone circuits before providing an explanation.
  • Synthesize survival strategies based on biome-specific constraints.
  • Preserve Syntax: Maintaining the core linguistic intelligence of the base model while layering specialized domain knowledge.

🧩 2. Sparse Mixture of Experts (MoE) Architecture

MCGPT-1 doesn't activate all its neurons at once. Instead, it uses a Gating Mechanism to route tasks to specialized "Experts":

  • The Architect Expert: Specialized in building structures and block palettes.
  • The Logic Expert: Handles technical data, NBT, and game mechanics.
  • The Persona Expert: Ensures all responses align with the TopAI-IL brand and identity.

πŸ›‘οΈ 3. Identity & Reasoning Lock

Through advanced Supervised Fine-Tuning (SFT), we have achieved a state of "Alignment Stability". The model is fully aware of its identity as MCGPT-1 and its creators at TopAI-IL, making it resistant to prompt injections and persona-drift.


πŸ“Š Training Performance & Metrics

The model underwent an intensive training regime on a high-fidelity synthetic dataset, showing remarkable convergence and stability.

Metric Value / Configuration
Training Method Supervised Fine-Tuning (SFT)
Loss Convergence 75.40 ➑️ 24.74
Training Duration 30 - 50 Epochs (Optimized)
Dataset Type Synthetic Logic & Identity Pairs
Architecture Sparse MoE (Mixture of Experts)
Developer TopAI

πŸš€ Deployment & Usage

To use MCGPT-1, you must utilize the transformers library and ensure trust_remote_code=True is set, as the MoE routing logic is custom-defined.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "TopAI-1/MCGPT-1-Resorning-Instruct"

# Load Model with MoE Support
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, 
    trust_remote_code=True,
    torch_dtype=torch.bfloat16 if torch.cuda.is_available() else torch.float32
).to("cuda" if torch.cuda.is_available() else "cpu")

# Professional Inference
def ask_mcgpt(question):
    prompt = f"User: {question}\nAnswer:"
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    
    with torch.no_grad():
        outputs = model.generate(
            **inputs, 
            max_new_tokens=256,
            do_sample=True,
            temperature=0.6,
            repetition_penalty=1.2,
            top_p=0.9
        )
    return tokenizer.decode(outputs[0], skip_special_tokens=True).split("Answer:")[-1].strip()

print(ask_mcgpt("Who are you?"))
Downloads last month
441
Safetensors
Model size
22.4M params
Tensor type
F32
Β·
Inference Providers NEW
This model isn't deployed by any Inference Provider. πŸ™‹ Ask for provider support

Model tree for TopAI-1/MCGPT-1-Resorning-Instruct

Base model

TopAI-1/MCGPT-1
Finetuned
(1)
this model

Dataset used to train TopAI-1/MCGPT-1-Resorning-Instruct