ace-1/mgpt2-tokenizer
Files and versions
Branch: main | mgpt2-tokenizer | 2.27 MB
1 contributor | History: 8 commits
Latest commit: ace-1, "Update tokenizer artifact (verified corpus retrain) + eval metrics" (68abf7e, verified), 3 months ago
| Name | Size | Badges | Last commit | Updated |
|---|---|---|---|---|
| tokenizer | | | Fix transformers v5 auto_map + HF init | 4 months ago |
| .gitattributes | 1.52 kB | Safe | initial commit | 4 months ago |
| README.md | 1.8 kB | Safe | Publish mgpt2 tokenizer (GPT-2 exact merges) + eval metrics | 4 months ago |
| added_tokens.json | 29 Bytes | | Upload mgpt2 tokenizer | 4 months ago |
| evaluation.json | 7.57 kB | Safe | Update tokenizer artifact (verified corpus retrain) + eval metrics | 3 months ago |
| special_tokens_map.json | 35 Bytes | Safe | Upload mgpt2 tokenizer | 4 months ago |
| tokenization_mgpt2.py | 80 Bytes | Safe | Publish mgpt2 tokenizer (GPT-2 exact merges) + eval metrics | 4 months ago |
| tokenizer.model | 464 kB | xet | Update tokenizer artifact (verified corpus retrain) + eval metrics | 3 months ago |
| tokenizer.vocab | 1.76 MB | Safe | Update tokenizer artifact (verified corpus retrain) + eval metrics | 3 months ago |
| tokenizer_config.json | 530 Bytes | Safe | Fix transformers v5 auto_map + HF init | 4 months ago |