fix: apply do_lower_case to root-level HF-facing files f25becd diyclassics Claude Opus 4.6 (1M context) committed 13 days ago
fix: add do_lower_case=True to tokenizer (v1.1.1) c7b1be1 diyclassics Claude Opus 4.6 (1M context) committed 13 days ago
fix: revert tie_word_embeddings — safetensors needs tying for decoder weights 8c959a5 diyclassics Claude Opus 4.6 (1M context) committed 16 days ago
Adding `safetensors` variant of this model (#1) a533a28 diyclassics SFconvertbot committed 16 days ago
chore: gitignore cluster/, track test vocab fixture a285690 diyclassics Claude Opus 4.6 (1M context) committed 16 days ago
fix: add decode/unescape to fast tokenizer, silence tied-weights warning 86e0990 diyclassics Claude Opus 4.6 (1M context) committed 16 days ago
docs: remove blockquote from experimental note 7a1b678 diyclassics Claude Opus 4.6 (1M context) committed 16 days ago
docs: add links to original repo and paper, add experimental proviso 0f1214f diyclassics Claude Opus 4.6 (1M context) committed 16 days ago
feat: add LatinBertTokenizerFast with word_ids() support ed6af90 diyclassics Claude Opus 4.6 (1M context) committed 17 days ago
chore: add HF model repo files (config, tokenizer, encoder, README) 872519e diyclassics Claude Opus 4.6 (1M context) committed 17 days ago
refactor: extract shared case study utils and move data to tracked paths f04d50f diyclassics Claude Opus 4.6 (1M context) committed 17 days ago
fix: handle >512 token sentences and add MPS device support 2c07f6c diyclassics Claude Opus 4.6 committed 19 days ago
test: add contextual nearest neighbors case study (Bamman & Burns §4.4) 3510517 diyclassics Claude Opus 4.6 committed 19 days ago
feat: make benchmarks model-agnostic with --model-path option 8af2caa diyclassics Claude Opus 4.6 committed 20 days ago
test: add WSD case study reproduction (Bamman & Burns Table 2) 73784ba diyclassics Claude Opus 4.6 committed 20 days ago
test: add POS tagging case study reproduction (Bamman & Burns Table 1) bbde973 diyclassics Claude Opus 4.6 committed 20 days ago
test: add infilling case study reproduction (Bamman & Burns Table 3) c5bfe4c diyclassics Claude Opus 4.6 committed 20 days ago
Fix tokenizer ID offset: reserve IDs 0-4 for BERT special tokens ce59834 diyclassics Claude Opus 4.6 committed 20 days ago
Initial: HF-compatible Latin BERT tokenizer (Bamman & Burns 2020) 68d8806 diyclassics Claude Opus 4.6 committed 20 days ago