Spaces: yhavinga/dutch-tokenizer-arena
main
dutch-tokenizer-arena/utils (71.1 kB)
3 contributors
History: 20 commits
yhavinga: Add Llama tokenizer creation for Dutch, English, Code, Markdown and TeX. (c78da21, about 2 years ago)
File                   Size       Last updated       Last commit
byte_util.py           0 Bytes    almost 3 years ago update
character_util.py      6.92 kB    about 2 years ago  add compression leaderboard
compression_util.py    7.26 kB    about 2 years ago  Add Llama tokenizer creation for Dutch, English, Code, Markdown and TeX.
convert_sp_to_json.py  54 Bytes   almost 3 years ago update
fn_util.py             0 Bytes    over 2 years ago   add more tokenizers
lang_util.py           3.45 kB    about 2 years ago  add compression leaderboard
lang_util_2.py         3.05 kB    about 2 years ago  update
log_util.py            285 Bytes  over 2 years ago   update
oov_util.py            265 Bytes  almost 3 years ago update
speed_util.py          77 Bytes   over 2 years ago   update
symbol.py              1.28 kB    almost 3 years ago update
text_util.py           671 Bytes  about 2 years ago  add compression leaderboard
vocab.jd.txt.v2        47.7 kB    over 2 years ago   update