π MCGPT-1: Frontier Mixture-of-Experts (MoE) for Minecraft Intelligence
π Project Abstract
MCGPT-1 is a high-performance, specialized Large Language Model (LLM) developed by TopAI-IL. It represents a significant breakthrough in Small-Scale Parameter Efficiency, utilizing a proprietary Sparse Mixture of Experts (MoE) architecture.
While traditional small models often suffer from "forgetting" or lack of logic, MCGPT-1 has been engineered with an Implicit Reasoning Engine, allowing it to solve complex Minecraft-related problems while maintaining a strictly locked identity and professional persona.
β¨ Key Technical Breakthroughs
π§ 1. Implicit Reasoning Engine (IRE)
Unlike standard generative models that predict tokens based on frequency, MCGPT-1 has been fine-tuned to simulate "Chain-of-Thought" processing. This allows the model to:
- Analyze complex Redstone circuits before providing an explanation.
- Synthesize survival strategies based on biome-specific constraints.
- Preserve Syntax: Maintaining the core linguistic intelligence of the base model while layering specialized domain knowledge.
π§© 2. Sparse Mixture of Experts (MoE) Architecture
MCGPT-1 doesn't activate all its neurons at once. Instead, it uses a Gating Mechanism to route tasks to specialized "Experts":
- The Architect Expert: Specialized in building structures and block palettes.
- The Logic Expert: Handles technical data, NBT, and game mechanics.
- The Persona Expert: Ensures all responses align with the TopAI-IL brand and identity.
π‘οΈ 3. Identity & Reasoning Lock
Through advanced Supervised Fine-Tuning (SFT), we have achieved a state of "Alignment Stability". The model is fully aware of its identity as MCGPT-1 and its creators at TopAI-IL, making it resistant to prompt injections and persona-drift.
π Training Performance & Metrics
The model underwent an intensive training regime on a high-fidelity synthetic dataset, showing remarkable convergence and stability.
| Metric | Value / Configuration |
|---|---|
| Training Method | Supervised Fine-Tuning (SFT) |
| Loss Convergence | 75.40 β‘οΈ 24.74 |
| Training Duration | 30 - 50 Epochs (Optimized) |
| Dataset Type | Synthetic Logic & Identity Pairs |
| Architecture | Sparse MoE (Mixture of Experts) |
| Developer | TopAI |
π Deployment & Usage
To use MCGPT-1, you must utilize the transformers library and ensure trust_remote_code=True is set, as the MoE routing logic is custom-defined.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
model_id = "TopAI-1/MCGPT-1-Resorning-Instruct"
# Load Model with MoE Support
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
model_id,
trust_remote_code=True,
torch_dtype=torch.bfloat16 if torch.cuda.is_available() else torch.float32
).to("cuda" if torch.cuda.is_available() else "cpu")
# Professional Inference
def ask_mcgpt(question):
prompt = f"User: {question}\nAnswer:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
outputs = model.generate(
**inputs,
max_new_tokens=256,
do_sample=True,
temperature=0.6,
repetition_penalty=1.2,
top_p=0.9
)
return tokenizer.decode(outputs[0], skip_special_tokens=True).split("Answer:")[-1].strip()
print(ask_mcgpt("Who are you?"))
- Downloads last month
- 441
Model tree for TopAI-1/MCGPT-1-Resorning-Instruct
Base model
TopAI-1/MCGPT-1