|
|
--- |
|
|
license: apache-2.0 |
|
|
datasets: |
|
|
- TopAI-1/WebText-5 |
|
|
- TopAI-1/Reddit-WebText |
|
|
- TopAI-1/Syntetic-Data-1 |
|
|
- TopAI-1/Minecraft-WebText-2 |
|
|
language: |
|
|
- en |
|
|
- he |
|
|
pipeline_tag: text-generation |
|
|
tags: |
|
|
- art |
|
|
- code |
|
|
- agent |
|
|
- text-generation-inference |
|
|
- merge |
|
|
- moe |
|
|
library_name: transformers |
|
|
--- |
|
|
|
|
|
# MCGPT-1: Mixture of Experts (MoE) Language Model |
|
|
|
|
|
**MCGPT-1** is a custom-built MoE model developed by **TopAI-IL**. It is designed to demonstrate specialized knowledge in Minecraft, Reddit-style conversations, and model self-identity. |
|
|
|
|
|
## Model Details |
|
|
- **Architecture:** Mixture of Experts (MoE) |
|
|
- **Total Experts:** 4 |
|
|
- **Layers:** 4 |
|
|
- **Attention Heads:** 8 |
|
|
- **Hidden Size:** 256 |
|
|
- **Training Domains:**
  1. Identity (TopAI-IL)
  2. Minecraft technical data & guides
  3. Reddit/web slang & conversations
  4. General Hebrew/English knowledge
  5. Synthetic instruction data
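The figures above can be sanity-checked against each other; the sketch below uses hypothetical field names for illustration, not MCGPT-1's actual `config.json`.

```python
# Hypothetical config sketch -- field names are assumptions for illustration,
# not taken from MCGPT-1's actual config.json.
mcgpt_config = {
    "num_experts": 4,
    "num_hidden_layers": 4,
    "num_attention_heads": 8,
    "hidden_size": 256,
}

# The hidden size must divide evenly across the attention heads.
head_dim = mcgpt_config["hidden_size"] // mcgpt_config["num_attention_heads"]
assert mcgpt_config["hidden_size"] % mcgpt_config["num_attention_heads"] == 0
print(head_dim)  # 256 / 8 = 32
```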
|
|
|
|
|
## How to use |
|
|
This model uses a custom architecture (`mcgpt`). To run inference, pass `trust_remote_code=True` when loading (if the modeling script is hosted with the model), or include the architecture class in your own code.
|
|
|
|
|
## Usage example
|
|
|
|
|
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_id = "TopAI-1/MCGPT-1"

# 1. Load the tokenizer and model (trust_remote_code is required for the
#    custom `mcgpt` architecture)
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    trust_remote_code=True,
    torch_dtype=torch.float32,
)

# 2. Move to GPU if one is available
device = "cuda" if torch.cuda.is_available() else "cpu"
model.to(device)
model.eval()

# 3. Text-generation helper
def generate(prompt, max_new_tokens=50):
    inputs = tokenizer(prompt, return_tensors="pt").to(device)
    with torch.no_grad():
        outputs = model.generate(
            **inputs,
            max_new_tokens=max_new_tokens,
            do_sample=True,
            top_k=50,
            temperature=0.8,
            pad_token_id=tokenizer.eos_token_id,
        )
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

# 4. Inference check
print("Testing MCGPT-1 from Hub:")
prompt = "use the following search parameters to narrow your results: e.g."
print(generate(prompt))
```
|
|
|
|
|
## Capabilities |
|
|
The model successfully identifies itself as **MCGPT-1** and can switch between experts based on the prompt (e.g., providing Minecraft-related advice when prompted with "help"). |
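Expert switching in an MoE layer is typically driven by a small learned gate that scores each expert per token and routes the hidden state to the top-scoring one. A minimal top-1 gating sketch (illustrative only, not MCGPT-1's actual router) might look like:

```python
import torch
import torch.nn.functional as F

# Toy top-1 MoE gate: scores 4 experts for a hidden state of size 256.
# Illustrative sketch only -- not MCGPT-1's actual routing code.
torch.manual_seed(0)
num_experts, hidden_size = 4, 256

gate = torch.nn.Linear(hidden_size, num_experts)
x = torch.randn(1, hidden_size)          # one token's hidden state

probs = F.softmax(gate(x), dim=-1)       # routing probabilities over experts
expert_idx = int(probs.argmax(dim=-1))   # index of the chosen expert

print(expert_idx, float(probs.sum()))    # probabilities sum to 1
```

In a full MoE forward pass, the token would then be processed only by the selected expert's feed-forward network, which is what keeps per-token compute small despite having multiple experts.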
|
|
|
|
|
**Developed by Raziel @ TopAI-IL (2026)** |