LLM-1B-Lab/llm_lab/data/tokenizer.py

Commit History

Remove unused tokenizer training code (train_bpe, load_sentencepiece, load_trained_hf)
33ba3d1

Vjeong Claude Opus 4.6 committed on

Use LLaMA 2 pretrained tokenizer and remove tokenizer_mode option
a5ca4e4

Vjeong Claude Opus 4.6 committed on

Fix BPE tokenizer ByteLevel decoder and update evaluation notebook
8626149

Vjeong Claude Sonnet 4.6 committed on

docs: translate all Korean comments and docstrings to English
858e8b2

Vjeong Claude Sonnet 4.6 committed on

Initial commit: LLM-1B-Lab project setup
8a58ffe

Vjeong Claude Opus 4.6 committed on