from dagster import Definitions
from dagster_hf_datasets import (
    HuggingFaceResource,
)
from dagster_hf_datasets.io_manager import HFParquetIOManager

from distributed_token_sharding.assets import (
    fineweb_dataset,
    tokenized_fineweb,
)

defs = Definitions(
    assets=[
        fineweb_dataset,
        tokenized_fineweb,
    ],
    resources={
        "huggingface": HuggingFaceResource(
            cache_dir=".hf_cache",
            offline=False,
        ),
        "hf_parquet_io_manager": HFParquetIOManager(
            base_dir=".dagster_hf_storage",
        ),
    },
)
Xet Storage Details
- Size: 593 Bytes
- Xet hash: 7205581fe7b65f61f9b81f6738f9983a5c157d4e0c956d39aeb0469f0d2be16a