dataflare/df-arc
Datasets: dataflare/arabic-dialect-corpus, dataflare/egypt-legal-corpus
Language: Arabic
Tags: arabic, tokenizer, morphology, nlp, dialect
License: apache-2.0
Files and versions: df-arc at 46d7f48 (5.59 MB, 1 contributor, History: 20 commits)
Latest commit: fr3on, "Delete phrase_vocab.json" (46d7f48, verified), 26 days ago
File | Size | Last commit message | Updated
.gitattributes | 1.57 kB | Upload folder using huggingface_hub | about 1 month ago
README.md | 2.09 kB | Update README for v1.1 release | about 1 month ago
special_tokens_map.json | 169 Bytes | Upload custom Unigram tokenizer (v1) | 26 days ago
tokenization_df_arc.py | 11.5 kB | Release v1.1: PMI Phrase Merging & Smart Morphology | about 1 month ago
tokenizer.json | 4.32 MB (xet) | Upload custom Unigram tokenizer (v1) | 26 days ago
tokenizer.model | 1.26 MB (xet) | Upload custom Unigram tokenizer (v1) | 26 days ago
tokenizer_config.json | 255 Bytes | Upload custom Unigram tokenizer (v1) | 26 days ago