How to use from the Transformers library
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="Guspard-ew/BeanSLM-Instruct-278M")
# Load model directly
from transformers import AutoModel
model = AutoModel.from_pretrained("Guspard-ew/BeanSLM-Instruct-278M", dtype="auto")

BeanSLM-Instruct-278M

BeanSLM-Instruct-278M is a 278M-parameter English language model, trained from scratch and then fully fine-tuned on instruction data.

Overview

This model was trained with a sequence length of 256, using a mixed dataset of about 130k instruction examples.
It is meant for text generation and simple instruction following.

Training Notes

  • Pretraining dataset: FineWebEdu
  • Instruction data: diverse_130k_better.txt
  • Vocabulary size: 32k
  • Sequence length: 256
  • Training status: trained from scratch
  • Final pretraining loss: 3.3
  • Final instruct loss: 1.7 (I am not sure whether this version is overfit; 1.7 seemed too low to me, but the model did not appear overfit in testing)
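
To put the two loss values in more familiar terms, they can be converted to perplexity. This is only a rough sketch: it assumes both losses are mean per-token cross-entropy measured in nats (the usual default in PyTorch-style training loops), which the card does not state. The `perplexity` helper is a hypothetical name used for illustration.

```python
import math


def perplexity(loss_nats: float) -> float:
    """Convert a mean per-token cross-entropy loss (assumed to be in nats) to perplexity."""
    return math.exp(loss_nats)


# Loss values as reported in the training notes above.
reported_losses = {"pretraining": 3.3, "instruct": 1.7}

for stage, loss in reported_losses.items():
    print(f"{stage}: loss {loss} -> perplexity {perplexity(loss):.1f}")
# prints:
# pretraining: loss 3.3 -> perplexity 27.1
# instruct: loss 1.7 -> perplexity 5.5
```

The two numbers come from different datasets, so they are not directly comparable. Instruction data tends to be more repetitive and templated than web text, so a lower loss there is not by itself evidence of overfitting.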

What this model is good at

  • Short instruction following
  • Simple assistant text generation
  • Lightweight local experiments

Limitations

  • Small model size
  • Short context length
  • Will struggle with logic, math, and complex reasoning (especially math)
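
Because of the short context length, it may help to keep the prompt plus the generated text within the 256-token training length. The sketch below is one possible way to do that on a list of token IDs. The `fit_to_context` helper is hypothetical, and dropping the oldest tokens first is an assumption made for illustration, not something the card prescribes.

```python
CONTEXT_LENGTH = 256  # training sequence length stated in this card


def fit_to_context(token_ids, max_new_tokens, context_length=CONTEXT_LENGTH):
    """Trim a prompt so that prompt + generated tokens fit in the context window.

    Keeps the most recent tokens and drops the oldest ones (an assumed policy).
    """
    if not 0 < max_new_tokens < context_length:
        raise ValueError("max_new_tokens must be between 1 and context_length - 1")
    prompt_budget = context_length - max_new_tokens
    return token_ids[-prompt_budget:]


# A 300-token prompt with 64 tokens reserved for the reply keeps the last 192 tokens.
long_prompt = list(range(300))
trimmed = fit_to_context(long_prompt, max_new_tokens=64)
print(len(trimmed), trimmed[0], trimmed[-1])  # prints: 192 108 299
```

With a real tokenizer, the trimmed IDs would be passed to generation with the same `max_new_tokens` value, so the total never exceeds the length the model saw during training.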