dignity045/grandline
dataset-preprocessing
llm-pretraining
tokenization
deduplication
data-pipeline
ml-intern
License: apache-2.0
main
grandline
/
src
94.2 kB
1 contributor
History: 3 commits
dignity045
Add selective HF parquet shard download support (--hf-files, --hf-subdir, --max-shards, --list-shards)
ab68c56
verified
1 day ago
grandline
Add selective HF parquet shard download support (--hf-files, --hf-subdir, --max-shards, --list-shards)
1 day ago