Initial GrandLine implementation: deterministic shard-first dataset preprocessing for LLM pretraining ed59144 verified dignity045 commited on 1 day ago