fix: add do_lower_case=True to tokenizer (v1.1.1) c7b1be1 diyclassics Claude Opus 4.6 (1M context) committed 15 days ago
chore: gitignore cluster/, track test vocab fixture a285690 diyclassics Claude Opus 4.6 (1M context) committed 17 days ago
refactor: extract shared case study utils and move data to tracked paths f04d50f diyclassics Claude Opus 4.6 (1M context) committed 19 days ago
fix: handle >512 token sentences and add MPS device support 2c07f6c diyclassics Claude Opus 4.6 committed 21 days ago
test: add contextual nearest neighbors case study (Bamman & Burns §4.4) 3510517 diyclassics Claude Opus 4.6 committed 21 days ago
feat: make benchmarks model-agnostic with --model-path option 8af2caa diyclassics Claude Opus 4.6 committed 22 days ago
test: add WSD case study reproduction (Bamman & Burns Table 2) 73784ba diyclassics Claude Opus 4.6 committed 22 days ago
test: add POS tagging case study reproduction (Bamman & Burns Table 1) bbde973 diyclassics Claude Opus 4.6 committed 22 days ago
test: add infilling case study reproduction (Bamman & Burns Table 3) c5bfe4c diyclassics Claude Opus 4.6 committed 22 days ago
Fix tokenizer ID offset: reserve IDs 0-4 for BERT special tokens ce59834 diyclassics Claude Opus 4.6 committed 22 days ago
Initial: HF-compatible Latin BERT tokenizer (Bamman & Burns 2020) 68d8806 diyclassics Claude Opus 4.6 committed 22 days ago