metythorn/gpt2-tokenizer
Khmer‑English GPT‑2 Tokenizer
Vocab size: 50,257
Algorithm: Byte‑Level BPE (byte_fallback)
Special tokens: <|endoftext|>, <|bos|>, <|pad|>, <|unk|>
Trained on: metythorn/khmer-english-corpus
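As a rough illustration of what "byte-level" means for mixed Khmer and English text, the sketch below shows the UTF-8 bytes that a byte-level BPE starts from before any merges are applied. It is a conceptual example, not this tokenizer's actual code. The helper names `utf8_byte_symbols` and `load_tokenizer` are hypothetical, and `load_tokenizer` assumes the repository ships files that `transformers.AutoTokenizer` can read, which the card does not state.

```python
# Values copied from the card above.
REPO_ID = "metythorn/gpt2-tokenizer"
VOCAB_SIZE = 50_257
SPECIAL_TOKENS = ["<|endoftext|>", "<|bos|>", "<|pad|>", "<|unk|>"]


def utf8_byte_symbols(text: str) -> list[int]:
    """Return the UTF-8 byte values of `text`.

    A byte-level BPE builds its tokens by merging sequences of these
    bytes, so every string is representable from a 256-symbol base.
    """
    return list(text.encode("utf-8"))


def load_tokenizer(repo_id: str = REPO_ID):
    """Load the tokenizer from the Hub.

    Assumption: the repo is compatible with transformers' AutoTokenizer.
    Requires network access and `pip install transformers`.
    """
    from transformers import AutoTokenizer  # imported lazily on purpose
    return AutoTokenizer.from_pretrained(repo_id)


# The word "Khmer" in Khmer script, written with escapes so the
# code points are unambiguous (5 code points in the Khmer block).
khmer = "\u1781\u17d2\u1798\u17c2\u179a"
english = "Khmer"

# Each Khmer code point takes 3 bytes in UTF-8; ASCII letters take 1.
print(len(khmer), len(utf8_byte_symbols(khmer)))      # 5 15
print(len(english), len(utf8_byte_symbols(english)))  # 5 5
```

Because Khmer characters cost three bytes each before merging, a vocabulary trained on Khmer text is what lets common Khmer sequences collapse into single tokens. Calling `load_tokenizer()` and then `tokenizer.tokenize(...)` on a Khmer sentence would show the learned merges, provided the assumption about repository contents holds.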