Add selective HF parquet shard download support (--hf-files, --hf-subdir, --max-shards, --list-shards) ab68c56 verified dignity045 committed 2 days ago
Initial GrandLine implementation: deterministic shard-first dataset preprocessing for LLM pretraining ed59144 verified dignity045 committed 2 days ago