Hugging Face
latincy/latin-bert
Fill-Mask
Transformers
PyTorch
Safetensors
Latin
bert
feature-extraction
latin
nlp
classics
arxiv: 2009.10053
License: apache-2.0
Files and versions
main
latin-bert
898 MB
3 contributors
History: 21 commits
Latest commit f25becd (10 days ago) by diyclassics and Claude Opus 4.6 (1M context): fix: apply do_lower_case to root-level HF-facing files
| Name | Scan | Size | Storage | Last commit | Updated |
| --- | --- | --- | --- | --- | --- |
| data | | | | refactor: extract shared case study utils and move data to tracked paths | 14 days ago |
| src | | | | fix: add do_lower_case=True to tokenizer (v1.1.1) | 10 days ago |
| tests | | | | fix: add do_lower_case=True to tokenizer (v1.1.1) | 10 days ago |
| .gitattributes | Safe | 1.52 kB | | chore: add HF model repo files (config, tokenizer, encoder, README) | 14 days ago |
| .gitignore | Safe | 635 Bytes | | chore: gitignore cluster/, track test vocab fixture | 12 days ago |
| .python-version | Safe | 5 Bytes | | Initial: HF-compatible Latin BERT tokenizer (Bamman & Burns 2020) | 17 days ago |
| README.md | Safe | 4.22 kB | | fix: add do_lower_case=True to tokenizer (v1.1.1) | 10 days ago |
| config.json | Safe | 562 Bytes | | fix: revert tie_word_embeddings — safetensors needs tying for decoder weights | 12 days ago |
| latin.subword.encoder | Safe | 287 kB | | chore: add HF model repo files (config, tokenizer, encoder, README) | 14 days ago |
| model.safetensors | Safe | 448 MB | xet | Adding `safetensors` variant of this model (#1) | 12 days ago |
| pyproject.toml | Safe | 680 Bytes | | fix: add do_lower_case=True to tokenizer (v1.1.1) | 10 days ago |
| pytorch_model.bin | Safe (pickle) | 448 MB | xet | chore: re-upload model weights (pytorch_model.bin) | 14 days ago |
| tokenization_latin_bert.py | Safe | 11.3 kB | | fix: apply do_lower_case to root-level HF-facing files | 10 days ago |
| tokenization_latin_bert_fast.py | Safe | 9.96 kB | | fix: add decode/unescape to fast tokenizer, silence tied-weights warning | 12 days ago |
| tokenizer_config.json | | 389 Bytes | | fix: apply do_lower_case to root-level HF-facing files | 10 days ago |
| uv.lock | Safe | 644 kB | | fix: add do_lower_case=True to tokenizer (v1.1.1) | 10 days ago |

Detected pickle imports in pytorch_model.bin (3): "torch._utils._rebuild_tensor_v2", "collections.OrderedDict", "torch.FloatStorage"
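
The pickle imports reported for pytorch_model.bin are the callables a pickle stream asks the loader to import when it is deserialized. The sketch below shows one way such a list could be gathered without unpickling anything, using only the standard-library `pickletools` module. It is an illustration under stated assumptions, not the scanner the Hub actually runs: the function name `list_pickle_imports` is hypothetical, and the heuristic for protocol 4+ streams can miss names that are fetched from the pickle memo instead of being pushed directly.

```python
import collections
import pickle
import pickletools


def list_pickle_imports(data: bytes) -> set[str]:
    """Return the 'module.name' references in a pickle stream without loading it.

    Hypothetical helper for illustration; it walks opcodes and never executes them.
    """
    imports = set()
    recent_strings = []  # string arguments seen so far, needed for STACK_GLOBAL
    for opcode, arg, _pos in pickletools.genops(data):
        if opcode.name in ("GLOBAL", "INST"):
            # Protocols 0-3: the argument is "module name", separated by a space.
            module, name = arg.split(" ", 1)
            imports.add(f"{module}.{name}")
        elif opcode.name == "STACK_GLOBAL":
            # Protocol 4+: module and name are the two strings pushed just before.
            # Heuristic: strings recalled from the memo are not tracked here.
            if len(recent_strings) >= 2:
                imports.add(f"{recent_strings[-2]}.{recent_strings[-1]}")
        elif isinstance(arg, str):
            recent_strings.append(arg)
    return imports


# Usage: an OrderedDict pickles as a reference to collections.OrderedDict.
payload = pickle.dumps(collections.OrderedDict(a=1), protocol=4)
print(sorted(list_pickle_imports(payload)))
```

Because a pickle can import and call arbitrary callables at load time, a short list limited to tensor-rebuilding helpers and `collections.OrderedDict` is what one would expect from a plain PyTorch state dict. The model.safetensors file in the same listing stores the weights in a format that does not rely on pickle at all.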