BLT-MoE Distilled

Byte Latent Transformer + MoE + 1-bit BitNet, distilled from Qwen2.5-7B.

Quick Start

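import torch
from huggingface_hub import hf_hub_download

# Download inference.py from the Hub and execute it; it provides load_from_hub and generate.
exec(open(hf_hub_download("Yosua69/blt-moe-distilled", "inference.py")).read())
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = load_from_hub(device=device)
# max_bytes caps the length of the generated continuation, counted in bytes.
print(generate(model, device, prompt="The history of", max_bytes=150))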
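
Because the generation budget is given in bytes (max_bytes) rather than tokens, it can help to see how text maps to bytes. The sketch below assumes the model reads raw UTF-8 bytes, as is typical for Byte Latent Transformer designs. The helpers to_byte_ids and from_byte_ids are hypothetical names used only for illustration and are not part of inference.py, whose actual preprocessing may differ.

```python
# Illustrative only: these helpers are NOT taken from inference.py.
# Assumption: the model consumes raw UTF-8 bytes, one ID (0-255) per byte.

def to_byte_ids(text: str) -> list[int]:
    """Encode text as UTF-8 and return one integer per byte."""
    return list(text.encode("utf-8"))


def from_byte_ids(ids: list[int]) -> str:
    """Decode byte IDs back to text, replacing any incomplete UTF-8 sequence."""
    return bytes(ids).decode("utf-8", errors="replace")


prompt = "The history of"
ids = to_byte_ids(prompt)
print(len(ids))                    # 14: one byte per ASCII character
print(len(to_byte_ids("héllo")))   # 6: "é" takes two bytes in UTF-8
print(from_byte_ids(ids))          # The history of
```

Under this assumption, max_bytes=150 corresponds to roughly 150 characters of English text, and fewer characters for scripts that need several bytes per character.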