BLT-MoE Distilled
Byte Latent Transformer + MoE + 1-bit BitNet, distilled from Qwen2.5-7B.
Quick Start
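import torch
from huggingface_hub import hf_hub_download

# Download inference.py from the Hub and execute it; it provides load_from_hub() and generate().
exec(open(hf_hub_download("Yosua69/blt-moe-distilled", "inference.py")).read())
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = load_from_hub(device=device)
# max_bytes limits the length of the generated output.
print(generate(model, device, prompt="The history of", max_bytes=150))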
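A Byte Latent Transformer works on raw bytes rather than on tokens from a fixed vocabulary, so `max_bytes` presumably bounds the output in UTF-8 bytes, not in words or characters. The sketch below is not part of `inference.py`; it uses only the Python standard library to show how a prompt maps to byte values and why a byte budget covers fewer characters for non-ASCII text. The helper name `byte_ids` is hypothetical.

```python
def byte_ids(text: str) -> list[int]:
    """Return the UTF-8 byte values (0-255) of `text`.

    Hypothetical helper for illustration; the model's real input
    pipeline in inference.py may differ.
    """
    return list(text.encode("utf-8"))


prompt = "The history of"
ids = byte_ids(prompt)
# ASCII text uses one byte per character.
print(len(prompt), len(ids))

accented = "café"
# 'é' is encoded as two bytes in UTF-8, so the byte count exceeds the character count.
print(len(accented), len(byte_ids(accented)))

# The byte sequence round-trips back to the original string.
assert bytes(ids).decode("utf-8") == prompt
```

If that reading of `max_bytes` is right, a budget of 150 yields up to 150 characters of plain English but fewer for text with accented or non-Latin characters.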