Bucket: HCAI-Lab/dolma3-6t-sample-5000-docs (Human-Centered AI Lab)
Path: HCAI-Lab/dolma3-6t-sample-5000-docs/worker_0040
11.1 GB, 56,043 files, updated about 2 months ago
| Name | Size | Uploaded | Xet hash |
| --- | --- | --- | --- |
| `soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000125.jsonl.zst` | 561 kB | about 2 months ago | 7c3a927a |
| `soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000126.jsonl.zst` | 435 kB | about 2 months ago | 64af5f08 |
| `soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000127.jsonl.zst` | 512 kB | about 2 months ago | 0618eb34 |
| `soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000128.jsonl.zst` | 489 kB | about 2 months ago | 0d973ea8 |
| `soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000129.jsonl.zst` | 514 kB | about 2 months ago | 4233230e |
| `soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000130.jsonl.zst` | 612 kB | about 2 months ago | 46a05b01 |
| `soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000131.jsonl.zst` | 381 kB | about 2 months ago | 1af133e3 |
| `soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000132.jsonl.zst` | 529 kB | about 2 months ago | eeea6645 |
| `soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000133.jsonl.zst` | 472 kB | about 2 months ago | e1f51035 |
| `soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000134.jsonl.zst` | 729 kB | about 2 months ago | 05578114 |
| `soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000135.jsonl.zst` | 454 kB | about 2 months ago | cc1eafb0 |
| `soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000136.jsonl.zst` | 550 kB | about 2 months ago | b35b5b9f |
| `soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000137.jsonl.zst` | 368 kB | about 2 months ago | 14db70ba |
| `soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000138.jsonl.zst` | 481 kB | about 2 months ago | 03280ac3 |
| `soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000139.jsonl.zst` | 365 kB | about 2 months ago | ca72b541 |
| `soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000140.jsonl.zst` | 515 kB | about 2 months ago | 999e92ad |
| `soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000141.jsonl.zst` | 446 kB | about 2 months ago | d64cf95d |
| `soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000142.jsonl.zst` | 427 kB | about 2 months ago | 933e5a17 |
| `soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000143.jsonl.zst` | 499 kB | about 2 months ago | 06744c9f |
| `soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000144.jsonl.zst` | 465 kB | about 2 months ago | 243ea17d |
| `soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000145.jsonl.zst` | 394 kB | about 2 months ago | 71733dbc |
| `soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000146.jsonl.zst` | 485 kB | about 2 months ago | fbda99c8 |
| `soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000147.jsonl.zst` | 468 kB | about 2 months ago | 8d772d31 |
| `soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000148.jsonl.zst` | 429 kB | about 2 months ago | 0617e1b7 |
| `soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000149.jsonl.zst` | 485 kB | about 2 months ago | 45ca8627 |
| `soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000150.jsonl.zst` | 431 kB | about 2 months ago | 8d9ff97b |
| `soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000151.jsonl.zst` | 433 kB | about 2 months ago | d8007dd6 |
| `soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000152.jsonl.zst` | 459 kB | about 2 months ago | 6d8dc55a |
| `soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000153.jsonl.zst` | 416 kB | about 2 months ago | a6581294 |
| `soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000154.jsonl.zst` | 579 kB | about 2 months ago | ec4d82cd |
| `soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000155.jsonl.zst` | 462 kB | about 2 months ago | b1869ba7 |
| `soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000156.jsonl.zst` | 569 kB | about 2 months ago | 7bc3c837 |
| `soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000157.jsonl.zst` | 293 kB | about 2 months ago | 4a336ef4 |
| `soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000158.jsonl.zst` | 227 kB | about 2 months ago | ef944703 |
| `soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000159.jsonl.zst` | 532 kB | about 2 months ago | af3d90c6 |
| `soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000160.jsonl.zst` | 494 kB | about 2 months ago | 21f39536 |
| `soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000161.jsonl.zst` | 447 kB | about 2 months ago | 768cacf1 |
| `soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000162.jsonl.zst` | 502 kB | about 2 months ago | 11379e96 |
| `soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000163.jsonl.zst` | 537 kB | about 2 months ago | b597a469 |
| `soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000164.jsonl.zst` | 488 kB | about 2 months ago | e8f4c445 |
| `soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000165.jsonl.zst` | 424 kB | about 2 months ago | 7c6fd1fe |
| `soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000166.jsonl.zst` | 624 kB | about 2 months ago | 1e7e4bf4 |
| `soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000167.jsonl.zst` | 511 kB | about 2 months ago | f9eb9f45 |
| `soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000168.jsonl.zst` | 356 kB | about 2 months ago | 614dcf1e |
| `soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000169.jsonl.zst` | 505 kB | about 2 months ago | e86418f3 |
| `soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000170.jsonl.zst` | 465 kB | about 2 months ago | 5029ec86 |
| `soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000171.jsonl.zst` | 278 kB | about 2 months ago | 1a356c12 |
| `soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000172.jsonl.zst` | 441 kB | about 2 months ago | 90dbe39f |
| `soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000173.jsonl.zst` | 348 kB | about 2 months ago | 3f560b70 |
| `soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000174.jsonl.zst` | 385 kB | about 2 months ago | cbf11240 |
| `soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000175.jsonl.zst` | 597 kB | about 2 months ago | 3160858d |
| `soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000176.jsonl.zst` | 437 kB | about 2 months ago | 6df80305 |
| `soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000177.jsonl.zst` | 237 kB | about 2 months ago | 61727f3a |
| `soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000178.jsonl.zst` | 335 kB | about 2 months ago | 9bcabbba |
| `soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000179.jsonl.zst` | 639 kB | about 2 months ago | d84699fb |
| `soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000180.jsonl.zst` | 465 kB | about 2 months ago | d4ec5fc3 |
| `soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000181.jsonl.zst` | 518 kB | about 2 months ago | 82e88f55 |
| `soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000182.jsonl.zst` | 418 kB | about 2 months ago | 0e423a9e |
| `soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000183.jsonl.zst` | 448 kB | about 2 months ago | 070886f5 |
| `soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000184.jsonl.zst` | 484 kB | about 2 months ago | 38687b8e |
| `soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000185.jsonl.zst` | 403 kB | about 2 months ago | 9bfd24ed |
| `soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000186.jsonl.zst` | 493 kB | about 2 months ago | 24a5125c |
| `soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000187.jsonl.zst` | 452 kB | about 2 months ago | f3d4405c |
| `soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000188.jsonl.zst` | 463 kB | about 2 months ago | 349838ce |
| `soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000189.jsonl.zst` | 406 kB | about 2 months ago | 040eac33 |
| `soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000190.jsonl.zst` | 528 kB | about 2 months ago | 0eb1e94c |
| `soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000191.jsonl.zst` | 423 kB | about 2 months ago | 895cca09 |
| `soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000192.jsonl.zst` | 545 kB | about 2 months ago | e4a8cf76 |
| `soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000193.jsonl.zst` | 465 kB | about 2 months ago | 7a3bdf60 |
| `soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000194.jsonl.zst` | 420 kB | about 2 months ago | 3b0b323d |
| `soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000195.jsonl.zst` | 450 kB | about 2 months ago | 8cdfc4f6 |
| `soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000196.jsonl.zst` | 640 kB | about 2 months ago | 9232bd9e |
| `soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000197.jsonl.zst` | 416 kB | about 2 months ago | d1366152 |
| `soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000198.jsonl.zst` | 479 kB | about 2 months ago | 1d047827 |
| `soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000199.jsonl.zst` | 651 kB | about 2 months ago | 17aa0f3f |
| `soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000200.jsonl.zst` | 476 kB | about 2 months ago | cfb0d597 |
| `soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000201.jsonl.zst` | 529 kB | about 2 months ago | a02bf51c |
| `soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000202.jsonl.zst` | 352 kB | about 2 months ago | 30b4df53 |
| `soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000203.jsonl.zst` | 478 kB | about 2 months ago | b4a2b961 |
| `soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000204.jsonl.zst` | 577 kB | about 2 months ago | e03102c1 |
| `soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000205.jsonl.zst` | 433 kB | about 2 months ago | 565f3852 |
| `soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000206.jsonl.zst` | 444 kB | about 2 months ago | b58c2556 |
| `soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000207.jsonl.zst` | 467 kB | about 2 months ago | 38f8456f |
| `soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000208.jsonl.zst` | 536 kB | about 2 months ago | 11b399cf |
| `soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000209.jsonl.zst` | 541 kB | about 2 months ago | 7a391bb5 |
| `soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000210.jsonl.zst` | 456 kB | about 2 months ago | f86a93be |
| `soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000211.jsonl.zst` | 416 kB | about 2 months ago | c7735dd0 |
| `soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000212.jsonl.zst` | 533 kB | about 2 months ago | 3d367d37 |
| `soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000213.jsonl.zst` | 577 kB | about 2 months ago | 44f51292 |
| `soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000214.jsonl.zst` | 491 kB | about 2 months ago | 03ae2927 |
| `soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000215.jsonl.zst` | 487 kB | about 2 months ago | 68f4eee2 |
| `soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000216.jsonl.zst` | 441 kB | about 2 months ago | 7334367a |
| `soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000217.jsonl.zst` | 538 kB | about 2 months ago | 208e22e5 |
| `soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000218.jsonl.zst` | 598 kB | about 2 months ago | 0305f959 |
| `soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000219.jsonl.zst` | 502 kB | about 2 months ago | e971f4c0 |
| `soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000220.jsonl.zst` | 385 kB | about 2 months ago | 11422cfe |
| `soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000221.jsonl.zst` | 548 kB | about 2 months ago | e271c5de |
| `soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000222.jsonl.zst` | 546 kB | about 2 months ago | 72eb74af |
| `soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000223.jsonl.zst` | 371 kB | about 2 months ago | 6b5241fe |
| `soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0019__shard_00000224.jsonl.zst` | 557 kB | about 2 months ago | 13de6511 |
Total size: 11.1 GB
Files: 56,043
Last updated: Mar 24
Pre-warmed CDN: US, EU