How to use EleutherAI/FineWeb-restricted with Transformers:
    # Load model directly
    from transformers import AutoModel
    model = AutoModel.from_pretrained("EleutherAI/FineWeb-restricted", dtype="auto")
This is a BPE tokenizer with 10,048 tokens trained on a portion of FineWeb's 10B token sample.
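For readers unfamiliar with byte-pair encoding, the following is a toy sketch of how a BPE vocabulary grows by repeatedly merging the most frequent adjacent symbol pair. It is not the procedure used to train this tokenizer: the function names (`train_bpe`, `merge_pair`, `encode`), the character-level starting alphabet, and the tie-breaking rule are all assumptions made for illustration.

```python
from collections import Counter


def merge_pair(symbols, pair):
    """Replace every adjacent occurrence of `pair` in `symbols` with one merged symbol."""
    out = []
    i = 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            out.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            out.append(symbols[i])
            i += 1
    return tuple(out)


def train_bpe(corpus, num_merges):
    """Learn up to `num_merges` merges from a list of words (hypothetical toy trainer)."""
    # Each word starts as a tuple of single characters, weighted by its frequency.
    words = Counter(tuple(word) for word in corpus)
    merges = []
    for _ in range(num_merges):
        # Count adjacent symbol pairs across the corpus.
        pairs = Counter()
        for symbols, freq in words.items():
            for a, b in zip(symbols, symbols[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break
        # Pick the most frequent pair; ties are broken by the pair itself
        # so the result is deterministic (an assumption of this sketch).
        best = max(pairs, key=lambda p: (pairs[p], p))
        merges.append(best)
        # Apply the chosen merge to every word.
        merged_words = Counter()
        for symbols, freq in words.items():
            merged_words[merge_pair(symbols, best)] += freq
        words = merged_words
    return merges


def encode(word, merges):
    """Tokenize a word by replaying the learned merges in order."""
    symbols = tuple(word)
    for pair in merges:
        symbols = merge_pair(symbols, pair)
    return list(symbols)


if __name__ == "__main__":
    merges = train_bpe(["low", "low", "lower", "lowest"], num_merges=2)
    print(merges)
    print(encode("lowest", merges))
```

In a scheme like this, the final vocabulary size is roughly the number of base symbols plus the number of learned merges (plus any special tokens), which is how a trainer can be steered toward a fixed target such as 10,048 tokens.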