Remove byte tokenizer and add config args to switch between byte/patch packing (#68) aeb95f1 unverified par-meta committed on Feb 25, 2025
Update iterator inheritance, pass file format args, limit iterator (#63) fc3399e unverified par-meta committed on Feb 22, 2025
Make it possible to specify multiple config files (#54) 82ab593 unverified par-meta committed on Feb 18, 2025
Fix multiprocessing dataloader checkpointing and use it in the train script (#50) 8c61ab5 unverified par-meta committed on Feb 13, 2025
This includes fixes that make checkpointing and reloading work correctly. (#35) 7044771 unverified par-meta committed on Jan 28, 2025
Initial codes and scripts for training entropy model (#34) 7622d28 unverified par-meta committed on Jan 27, 2025
Use load_async flag to not start MP iterator (#33) a809259 unverified par-meta committed on Jan 24, 2025
Changes for training entropy model and correcting attention in local models (#25) 6ffeb66 unverified par-meta committed on Jan 17, 2025
Replace regular filesystem calls with fsspec + add s3 support (#18) b0120da unverified par-meta committed on Jan 10, 2025