Initial GrandLine implementation: deterministic shard-first dataset preprocessing for LLM pretraining ed59144 verified dignity045 commited on 4 days ago