Bucket: HCAI-Lab/dolma3-6t-sample-5000-docs
Human-Centered AI Lab
Path: HCAI-Lab/dolma3-6t-sample-5000-docs/worker_0039
11.1 GB
56,043 files
Updated about 2 months ago
Name
Size
Uploaded
Xet hash
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0018__shard_00000094.jsonl.zst
367 kB
xet
about 2 months ago
16b4ccc8
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0018__shard_00000095.jsonl.zst
722 kB
xet
about 2 months ago
cb92d984
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0018__shard_00000096.jsonl.zst
635 kB
xet
about 2 months ago
76916579
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0018__shard_00000097.jsonl.zst
515 kB
xet
about 2 months ago
c6965bac
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0018__shard_00000098.jsonl.zst
441 kB
xet
about 2 months ago
cc6ce3ca
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0018__shard_00000099.jsonl.zst
539 kB
xet
about 2 months ago
4b80c2ab
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0018__shard_00000100.jsonl.zst
344 kB
xet
about 2 months ago
4158315e
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0018__shard_00000101.jsonl.zst
407 kB
xet
about 2 months ago
5455dd1e
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0018__shard_00000102.jsonl.zst
633 kB
xet
about 2 months ago
e0a8c18a
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0018__shard_00000103.jsonl.zst
223 kB
xet
about 2 months ago
da761834
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0018__shard_00000104.jsonl.zst
674 kB
xet
about 2 months ago
ff4416f2
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0018__shard_00000105.jsonl.zst
490 kB
xet
about 2 months ago
b0c7474a
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0018__shard_00000106.jsonl.zst
581 kB
xet
about 2 months ago
b75f9af1
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0018__shard_00000107.jsonl.zst
492 kB
xet
about 2 months ago
8d5c929a
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0018__shard_00000108.jsonl.zst
414 kB
xet
about 2 months ago
f91a17e7
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0018__shard_00000109.jsonl.zst
594 kB
xet
about 2 months ago
8fff1626
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0018__shard_00000110.jsonl.zst
557 kB
xet
about 2 months ago
de3c6bb6
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0018__shard_00000111.jsonl.zst
446 kB
xet
about 2 months ago
1822912d
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0018__shard_00000112.jsonl.zst
297 kB
xet
about 2 months ago
a2e774f8
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0018__shard_00000113.jsonl.zst
546 kB
xet
about 2 months ago
fca32887
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0018__shard_00000114.jsonl.zst
563 kB
xet
about 2 months ago
57ef1445
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0018__shard_00000115.jsonl.zst
607 kB
xet
about 2 months ago
503510cf
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0018__shard_00000116.jsonl.zst
564 kB
xet
about 2 months ago
d9c12e8a
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0018__shard_00000117.jsonl.zst
394 kB
xet
about 2 months ago
f34f100c
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0018__shard_00000118.jsonl.zst
447 kB
xet
about 2 months ago
a51a0a53
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0018__shard_00000119.jsonl.zst
449 kB
xet
about 2 months ago
e7c205d5
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0018__shard_00000120.jsonl.zst
464 kB
xet
about 2 months ago
5aedfbf5
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0018__shard_00000121.jsonl.zst
329 kB
xet
about 2 months ago
9fdcf4d8
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0018__shard_00000122.jsonl.zst
318 kB
xet
about 2 months ago
224e09bd
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0018__shard_00000123.jsonl.zst
592 kB
xet
about 2 months ago
29c4b3bc
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0018__shard_00000124.jsonl.zst
494 kB
xet
about 2 months ago
8b2968ea
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0018__shard_00000125.jsonl.zst
662 kB
xet
about 2 months ago
62665096
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0018__shard_00000126.jsonl.zst
481 kB
xet
about 2 months ago
e58bc809
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0018__shard_00000127.jsonl.zst
407 kB
xet
about 2 months ago
f5fe2a49
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0018__shard_00000128.jsonl.zst
400 kB
xet
about 2 months ago
e1bc6842
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0018__shard_00000129.jsonl.zst
629 kB
xet
about 2 months ago
b24747ff
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0018__shard_00000130.jsonl.zst
745 kB
xet
about 2 months ago
3623f608
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0018__shard_00000131.jsonl.zst
501 kB
xet
about 2 months ago
342be96e
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0018__shard_00000132.jsonl.zst
465 kB
xet
about 2 months ago
5bbced15
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0018__shard_00000133.jsonl.zst
404 kB
xet
about 2 months ago
fc724372
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0018__shard_00000134.jsonl.zst
531 kB
xet
about 2 months ago
84161152
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0018__shard_00000135.jsonl.zst
776 kB
xet
about 2 months ago
1e0c6ade
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0018__shard_00000136.jsonl.zst
401 kB
xet
about 2 months ago
0743cecb
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0018__shard_00000137.jsonl.zst
654 kB
xet
about 2 months ago
ce4b0925
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0018__shard_00000138.jsonl.zst
843 kB
xet
about 2 months ago
3315b8bf
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0018__shard_00000139.jsonl.zst
679 kB
xet
about 2 months ago
a6c69947
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0018__shard_00000140.jsonl.zst
494 kB
xet
about 2 months ago
8458720b
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0018__shard_00000141.jsonl.zst
461 kB
xet
about 2 months ago
42a75652
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0018__shard_00000142.jsonl.zst
466 kB
xet
about 2 months ago
1497b49a
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0018__shard_00000143.jsonl.zst
684 kB
xet
about 2 months ago
ee3baf76
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0018__shard_00000144.jsonl.zst
457 kB
xet
about 2 months ago
2d167fa4
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0018__shard_00000145.jsonl.zst
791 kB
xet
about 2 months ago
1dd58d98
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0018__shard_00000146.jsonl.zst
334 kB
xet
about 2 months ago
41596a12
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0018__shard_00000147.jsonl.zst
624 kB
xet
about 2 months ago
1a44ce0c
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0018__shard_00000148.jsonl.zst
620 kB
xet
about 2 months ago
95e0238e
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0018__shard_00000149.jsonl.zst
335 kB
xet
about 2 months ago
80175893
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0018__shard_00000150.jsonl.zst
529 kB
xet
about 2 months ago
34b36ea2
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0018__shard_00000151.jsonl.zst
518 kB
xet
about 2 months ago
a86f7044
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0018__shard_00000152.jsonl.zst
548 kB
xet
about 2 months ago
d1ed3425
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0018__shard_00000153.jsonl.zst
501 kB
xet
about 2 months ago
933676bc
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0018__shard_00000154.jsonl.zst
587 kB
xet
about 2 months ago
56a02712
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0018__shard_00000155.jsonl.zst
438 kB
xet
about 2 months ago
ac6c62cd
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0018__shard_00000156.jsonl.zst
585 kB
xet
about 2 months ago
eac6f9d6
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0018__shard_00000157.jsonl.zst
496 kB
xet
about 2 months ago
021edaec
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0018__shard_00000158.jsonl.zst
360 kB
xet
about 2 months ago
c627c49f
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0018__shard_00000159.jsonl.zst
397 kB
xet
about 2 months ago
8c812fbb
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0018__shard_00000160.jsonl.zst
391 kB
xet
about 2 months ago
dd876848
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0018__shard_00000161.jsonl.zst
409 kB
xet
about 2 months ago
31e1be1d
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0018__shard_00000162.jsonl.zst
360 kB
xet
about 2 months ago
74f549c6
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0018__shard_00000163.jsonl.zst
402 kB
xet
about 2 months ago
46940115
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0018__shard_00000164.jsonl.zst
552 kB
xet
about 2 months ago
76f0cdba
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0018__shard_00000165.jsonl.zst
594 kB
xet
about 2 months ago
02d0113a
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0018__shard_00000166.jsonl.zst
465 kB
xet
about 2 months ago
3b9351dc
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0018__shard_00000167.jsonl.zst
754 kB
xet
about 2 months ago
0b946d6c
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0018__shard_00000168.jsonl.zst
653 kB
xet
about 2 months ago
e51acce0
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0018__shard_00000169.jsonl.zst
324 kB
xet
about 2 months ago
51d01cda
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0018__shard_00000170.jsonl.zst
174 kB
xet
about 2 months ago
12e0b420
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0018__shard_00000171.jsonl.zst
654 kB
xet
about 2 months ago
f3d47d9c
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0018__shard_00000172.jsonl.zst
447 kB
xet
about 2 months ago
c7a85d9d
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0018__shard_00000173.jsonl.zst
369 kB
xet
about 2 months ago
56a428a1
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0018__shard_00000174.jsonl.zst
571 kB
xet
about 2 months ago
5513158c
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0018__shard_00000175.jsonl.zst
480 kB
xet
about 2 months ago
56b0711b
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0018__shard_00000176.jsonl.zst
519 kB
xet
about 2 months ago
af32d676
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0018__shard_00000177.jsonl.zst
404 kB
xet
about 2 months ago
91d01500
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0018__shard_00000178.jsonl.zst
560 kB
xet
about 2 months ago
bc4b627d
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0018__shard_00000179.jsonl.zst
381 kB
xet
about 2 months ago
b47d16b3
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0018__shard_00000180.jsonl.zst
492 kB
xet
about 2 months ago
9aaaf007
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0018__shard_00000181.jsonl.zst
414 kB
xet
about 2 months ago
13dac1f8
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0018__shard_00000182.jsonl.zst
547 kB
xet
about 2 months ago
c8f15188
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0018__shard_00000183.jsonl.zst
482 kB
xet
about 2 months ago
4ec755b8
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0018__shard_00000184.jsonl.zst
478 kB
xet
about 2 months ago
568fc52e
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0018__shard_00000185.jsonl.zst
340 kB
xet
about 2 months ago
3540c15e
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0018__shard_00000186.jsonl.zst
320 kB
xet
about 2 months ago
f266a4ff
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0018__shard_00000187.jsonl.zst
513 kB
xet
about 2 months ago
e04ab7c7
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0018__shard_00000188.jsonl.zst
419 kB
xet
about 2 months ago
b9cb80f8
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0018__shard_00000189.jsonl.zst
417 kB
xet
about 2 months ago
b8d0a0c2
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0018__shard_00000190.jsonl.zst
803 kB
xet
about 2 months ago
67d3cc07
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0018__shard_00000191.jsonl.zst
408 kB
xet
about 2 months ago
ab4ac149
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0018__shard_00000192.jsonl.zst
551 kB
xet
about 2 months ago
98c09672
soc127__phase1_pool_shared__common_crawl__part_003__data__common_crawl-history_and_geography-0018__shard_00000193.jsonl.zst
739 kB
xet
about 2 months ago
04a29530
Total size: 11.1 GB
Files: 56,043
Last updated: Mar 24
Pre-warmed CDN: US, EU