Initial GrandLine implementation: deterministic shard-first dataset preprocessing for LLM pretraining ed59144 verified dignity045 commited on 2 days ago