Initial GrandLine implementation: deterministic shard-first dataset preprocessing for LLM pretraining ed59144 verified dignity045 commited on 10 days ago