ducanhdinh/CKTN-ELECTRA
Tags: Safetensors, 4 languages, rembert, vocabulary-extension, low-resource, spm-surgery, fvt, khmer, cham, tay-nung
arxiv: 2010.12821
License: apache-2.0
main
CKTN-ELECTRA
2.33 GB
1 contributor
History: 15 commits
Latest commit: ducanhdinh, "upload discriminator (three-phase training, final epoch)", 6a24081 (verified), 30 days ago
File | Size | Flags | Last commit message | Updated
.gitattributes | 1.57 kB | Safe | Add extended tokenizer (vocab_size=254,513, SPM protobuf surgery) | about 1 month ago
README.md | 6.56 kB | Safe | Add model card | about 1 month ago
config.json | 796 Bytes | Safe | Add vocab-extended RemBERT CKTN-ELECTRA (+4,213 tokens) | about 1 month ago
model.safetensors | 2.31 GB | xet | upload discriminator (three-phase training, final epoch) | 30 days ago
sentencepiece.model | 4.81 MB | xet | Add patched sentencepiece.model (vocab_size=254513) | about 1 month ago
tokenizer.json | 16.8 MB | Safe, xet | Add extended tokenizer (vocab_size=254,513, SPM protobuf surgery) | about 1 month ago
tokenizer_config.json | 6.87 kB | Safe | Add extended tokenizer (vocab_size=254,513, SPM protobuf surgery) | about 1 month ago
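
The commit messages above quote two figures: an extended vocabulary of 254,513 entries and an extension of +4,213 tokens. As a quick sanity check, the sketch below derives the implied pre-extension vocabulary size. It assumes both figures describe the same, purely additive extension (no tokens removed or merged), which the listing does not state; the function name `base_vocab_size` is hypothetical and used for illustration only.

```python
# Figures quoted in the commit messages of this repository listing.
EXTENDED_VOCAB_SIZE = 254_513  # "vocab_size=254,513"
ADDED_TOKENS = 4_213           # "+4,213 tokens"


def base_vocab_size(extended: int, added: int) -> int:
    """Return the vocabulary size before the extension.

    Assumption: the extension only appends tokens, so
    extended = base + added.
    """
    if added < 0 or added > extended:
        raise ValueError("added must be between 0 and extended")
    return extended - added


base = base_vocab_size(EXTENDED_VOCAB_SIZE, ADDED_TOKENS)
print(base)                          # 250300
print(f"{ADDED_TOKENS / base:.2%}")  # 1.68%
```

The result, 250,300, appears to match the default `vocab_size` of the stock RemBERT configuration in `transformers`, which would be consistent with the `rembert` and `vocabulary-extension` tags; treat that link as an inference from the listing, not a statement from the model card.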