Bucket: HCAI-Lab/dolma3-6t-sample-1000-docs
Human-Centered AI Lab
HCAI-Lab/dolma3-6t-sample-1000-docs/worker_0010
2.37 GB
46,808 files
Updated about 2 months ago
Name
Size
Uploaded
Xet hash
soc127__phase1_pool_shared__common_crawl__part_000__data__common_crawl-electronics_and_hardware-0016__shard_00000060.jsonl.zst
85.7 kB
xet
about 2 months ago
760becf7
soc127__phase1_pool_shared__common_crawl__part_000__data__common_crawl-electronics_and_hardware-0016__shard_00000061.jsonl.zst
88.6 kB
xet
about 2 months ago
fd3e6ed0
soc127__phase1_pool_shared__common_crawl__part_000__data__common_crawl-electronics_and_hardware-0016__shard_00000063.jsonl.zst
50 kB
xet
about 2 months ago
a9b4ec54
soc127__phase1_pool_shared__common_crawl__part_000__data__common_crawl-electronics_and_hardware-0016__shard_00000067.jsonl.zst
26.8 kB
xet
about 2 months ago
95054161
soc127__phase1_pool_shared__common_crawl__part_000__data__common_crawl-electronics_and_hardware-0016__shard_00000068.jsonl.zst
51.9 kB
xet
about 2 months ago
1fb738dc
soc127__phase1_pool_shared__common_crawl__part_000__data__common_crawl-electronics_and_hardware-0016__shard_00000073.jsonl.zst
43.2 kB
xet
about 2 months ago
360f40b8
soc127__phase1_pool_shared__common_crawl__part_000__data__common_crawl-electronics_and_hardware-0016__shard_00000074.jsonl.zst
63.4 kB
xet
about 2 months ago
709d2021
soc127__phase1_pool_shared__common_crawl__part_000__data__common_crawl-electronics_and_hardware-0016__shard_00000086.jsonl.zst
37.3 kB
xet
about 2 months ago
55df11e8
soc127__phase1_pool_shared__common_crawl__part_000__data__common_crawl-electronics_and_hardware-0016__shard_00000090.jsonl.zst
54.1 kB
xet
about 2 months ago
5a7c1acc
soc127__phase1_pool_shared__common_crawl__part_000__data__common_crawl-electronics_and_hardware-0016__shard_00000092.jsonl.zst
28.6 kB
xet
about 2 months ago
6662b877
soc127__phase1_pool_shared__common_crawl__part_000__data__common_crawl-electronics_and_hardware-0016__shard_00000095.jsonl.zst
40.8 kB
xet
about 2 months ago
aab768c0
soc127__phase1_pool_shared__common_crawl__part_000__data__common_crawl-electronics_and_hardware-0016__shard_00000097.jsonl.zst
94.3 kB
xet
about 2 months ago
4d884f4d
soc127__phase1_pool_shared__common_crawl__part_000__data__common_crawl-electronics_and_hardware-0016__shard_00000098.jsonl.zst
95.5 kB
xet
about 2 months ago
35249fd1
soc127__phase1_pool_shared__common_crawl__part_000__data__common_crawl-electronics_and_hardware-0016__shard_00000099.jsonl.zst
48.9 kB
xet
about 2 months ago
bee2157c
soc127__phase1_pool_shared__common_crawl__part_000__data__common_crawl-electronics_and_hardware-0016__shard_00000100.jsonl.zst
41.1 kB
xet
about 2 months ago
a17c7181
soc127__phase1_pool_shared__common_crawl__part_000__data__common_crawl-electronics_and_hardware-0016__shard_00000103.jsonl.zst
64.1 kB
xet
about 2 months ago
46696269
soc127__phase1_pool_shared__common_crawl__part_000__data__common_crawl-electronics_and_hardware-0016__shard_00000104.jsonl.zst
56.9 kB
xet
about 2 months ago
472631a3
soc127__phase1_pool_shared__common_crawl__part_000__data__common_crawl-electronics_and_hardware-0016__shard_00000105.jsonl.zst
39.2 kB
xet
about 2 months ago
7be7bdf3
soc127__phase1_pool_shared__common_crawl__part_000__data__common_crawl-electronics_and_hardware-0016__shard_00000114.jsonl.zst
83 kB
xet
about 2 months ago
558ada00
soc127__phase1_pool_shared__common_crawl__part_000__data__common_crawl-electronics_and_hardware-0016__shard_00000115.jsonl.zst
30.3 kB
xet
about 2 months ago
acbe0d44
soc127__phase1_pool_shared__common_crawl__part_000__data__common_crawl-electronics_and_hardware-0016__shard_00000122.jsonl.zst
34.1 kB
xet
about 2 months ago
1b937b01
soc127__phase1_pool_shared__common_crawl__part_000__data__common_crawl-electronics_and_hardware-0016__shard_00000123.jsonl.zst
40.3 kB
xet
about 2 months ago
40c67829
soc127__phase1_pool_shared__common_crawl__part_000__data__common_crawl-electronics_and_hardware-0016__shard_00000129.jsonl.zst
41.2 kB
xet
about 2 months ago
7a757890
soc127__phase1_pool_shared__common_crawl__part_000__data__common_crawl-electronics_and_hardware-0016__shard_00000131.jsonl.zst
48.8 kB
xet
about 2 months ago
8362f71b
soc127__phase1_pool_shared__common_crawl__part_000__data__common_crawl-electronics_and_hardware-0016__shard_00000134.jsonl.zst
41.2 kB
xet
about 2 months ago
717d1127
soc127__phase1_pool_shared__common_crawl__part_000__data__common_crawl-electronics_and_hardware-0016__shard_00000137.jsonl.zst
36.6 kB
xet
about 2 months ago
d293d592
soc127__phase1_pool_shared__common_crawl__part_000__data__common_crawl-electronics_and_hardware-0016__shard_00000145.jsonl.zst
56.4 kB
xet
about 2 months ago
2236f24f
soc127__phase1_pool_shared__common_crawl__part_000__data__common_crawl-electronics_and_hardware-0016__shard_00000149.jsonl.zst
49.7 kB
xet
about 2 months ago
df378703
soc127__phase1_pool_shared__common_crawl__part_000__data__common_crawl-electronics_and_hardware-0016__shard_00000167.jsonl.zst
46.2 kB
xet
about 2 months ago
f2082022
soc127__phase1_pool_shared__common_crawl__part_000__data__common_crawl-electronics_and_hardware-0016__shard_00000168.jsonl.zst
68.2 kB
xet
about 2 months ago
858d8139
soc127__phase1_pool_shared__common_crawl__part_000__data__common_crawl-electronics_and_hardware-0016__shard_00000174.jsonl.zst
112 kB
xet
about 2 months ago
311ab0df
soc127__phase1_pool_shared__common_crawl__part_000__data__common_crawl-electronics_and_hardware-0016__shard_00000175.jsonl.zst
80.3 kB
xet
about 2 months ago
d05d5505
soc127__phase1_pool_shared__common_crawl__part_000__data__common_crawl-electronics_and_hardware-0016__shard_00000180.jsonl.zst
71.3 kB
xet
about 2 months ago
be050e72
soc127__phase1_pool_shared__common_crawl__part_000__data__common_crawl-electronics_and_hardware-0016__shard_00000187.jsonl.zst
60 kB
xet
about 2 months ago
4a0e23e2
soc127__phase1_pool_shared__common_crawl__part_000__data__common_crawl-electronics_and_hardware-0016__shard_00000188.jsonl.zst
83.9 kB
xet
about 2 months ago
ee386929
soc127__phase1_pool_shared__common_crawl__part_000__data__common_crawl-electronics_and_hardware-0016__shard_00000204.jsonl.zst
42.4 kB
xet
about 2 months ago
74577a04
soc127__phase1_pool_shared__common_crawl__part_000__data__common_crawl-electronics_and_hardware-0016__shard_00000205.jsonl.zst
36.8 kB
xet
about 2 months ago
cc7ab74d
soc127__phase1_pool_shared__common_crawl__part_000__data__common_crawl-electronics_and_hardware-0016__shard_00000207.jsonl.zst
48.5 kB
xet
about 2 months ago
b144edbd
soc127__phase1_pool_shared__common_crawl__part_000__data__common_crawl-electronics_and_hardware-0016__shard_00000221.jsonl.zst
53.6 kB
xet
about 2 months ago
224af118
soc127__phase1_pool_shared__common_crawl__part_000__data__common_crawl-electronics_and_hardware-0016__shard_00000224.jsonl.zst
54.6 kB
xet
about 2 months ago
4f708779
soc127__phase1_pool_shared__common_crawl__part_000__data__common_crawl-electronics_and_hardware-0016__shard_00000227.jsonl.zst
58.8 kB
xet
about 2 months ago
99ffd718
soc127__phase1_pool_shared__common_crawl__part_000__data__common_crawl-electronics_and_hardware-0016__shard_00000230.jsonl.zst
43.6 kB
xet
about 2 months ago
ac8574af
soc127__phase1_pool_shared__common_crawl__part_000__data__common_crawl-electronics_and_hardware-0016__shard_00000234.jsonl.zst
42.4 kB
xet
about 2 months ago
eab5a498
soc127__phase1_pool_shared__common_crawl__part_000__data__common_crawl-electronics_and_hardware-0016__shard_00000239.jsonl.zst
48.7 kB
xet
about 2 months ago
2ebf5bf5
soc127__phase1_pool_shared__common_crawl__part_000__data__common_crawl-electronics_and_hardware-0016__shard_00000248.jsonl.zst
44.8 kB
xet
about 2 months ago
de3d9aa2
soc127__phase1_pool_shared__common_crawl__part_000__data__common_crawl-electronics_and_hardware-0016__shard_00000255.jsonl.zst
33.2 kB
xet
about 2 months ago
aa57c443
soc127__phase1_pool_shared__common_crawl__part_000__data__common_crawl-electronics_and_hardware-0016__shard_00000258.jsonl.zst
46.6 kB
xet
about 2 months ago
e2fcb8e7
soc127__phase1_pool_shared__common_crawl__part_000__data__common_crawl-electronics_and_hardware-0016__shard_00000260.jsonl.zst
51.1 kB
xet
about 2 months ago
a4ec60b8
soc127__phase1_pool_shared__common_crawl__part_000__data__common_crawl-electronics_and_hardware-0016__shard_00000267.jsonl.zst
43.1 kB
xet
about 2 months ago
1b655300
soc127__phase1_pool_shared__common_crawl__part_000__data__common_crawl-electronics_and_hardware-0016__shard_00000268.jsonl.zst
34 kB
xet
about 2 months ago
23b3a2c5
soc127__phase1_pool_shared__common_crawl__part_000__data__common_crawl-electronics_and_hardware-0016__shard_00000269.jsonl.zst
40.7 kB
xet
about 2 months ago
27813122
soc127__phase1_pool_shared__common_crawl__part_000__data__common_crawl-electronics_and_hardware-0016__shard_00000270.jsonl.zst
49.7 kB
xet
about 2 months ago
ae643647
soc127__phase1_pool_shared__common_crawl__part_000__data__common_crawl-electronics_and_hardware-0016__shard_00000276.jsonl.zst
51.1 kB
xet
about 2 months ago
79a9cb40
soc127__phase1_pool_shared__common_crawl__part_000__data__common_crawl-electronics_and_hardware-0016__shard_00000278.jsonl.zst
53.3 kB
xet
about 2 months ago
f3cbd294
soc127__phase1_pool_shared__common_crawl__part_000__data__common_crawl-electronics_and_hardware-0016__shard_00000280.jsonl.zst
206 kB
xet
about 2 months ago
c08701ff
soc127__phase1_pool_shared__common_crawl__part_000__data__common_crawl-electronics_and_hardware-0016__shard_00000282.jsonl.zst
75 kB
xet
about 2 months ago
70fd5598
soc127__phase1_pool_shared__common_crawl__part_000__data__common_crawl-electronics_and_hardware-0016__shard_00000283.jsonl.zst
58.8 kB
xet
about 2 months ago
1cdfe380
soc127__phase1_pool_shared__common_crawl__part_000__data__common_crawl-electronics_and_hardware-0016__shard_00000289.jsonl.zst
58.1 kB
xet
about 2 months ago
2a8369d2
soc127__phase1_pool_shared__common_crawl__part_000__data__common_crawl-electronics_and_hardware-0016__shard_00000291.jsonl.zst
65.2 kB
xet
about 2 months ago
f87b8064
soc127__phase1_pool_shared__common_crawl__part_000__data__common_crawl-electronics_and_hardware-0016__shard_00000295.jsonl.zst
44.3 kB
xet
about 2 months ago
46a941d4
soc127__phase1_pool_shared__common_crawl__part_000__data__common_crawl-electronics_and_hardware-0016__shard_00000299.jsonl.zst
66.8 kB
xet
about 2 months ago
fbfa61a0
soc127__phase1_pool_shared__common_crawl__part_000__data__common_crawl-electronics_and_hardware-0016__shard_00000307.jsonl.zst
50.1 kB
xet
about 2 months ago
f726ee77
soc127__phase1_pool_shared__common_crawl__part_000__data__common_crawl-electronics_and_hardware-0016__shard_00000308.jsonl.zst
45.2 kB
xet
about 2 months ago
8e8e64a5
soc127__phase1_pool_shared__common_crawl__part_000__data__common_crawl-electronics_and_hardware-0016__shard_00000309.jsonl.zst
74 kB
xet
about 2 months ago
c785a755
soc127__phase1_pool_shared__common_crawl__part_000__data__common_crawl-electronics_and_hardware-0016__shard_00000316.jsonl.zst
57.6 kB
xet
about 2 months ago
4dc8653d
soc127__phase1_pool_shared__common_crawl__part_000__data__common_crawl-electronics_and_hardware-0016__shard_00000332.jsonl.zst
53.5 kB
xet
about 2 months ago
ae16bd57
soc127__phase1_pool_shared__common_crawl__part_000__data__common_crawl-electronics_and_hardware-0016__shard_00000333.jsonl.zst
66.1 kB
xet
about 2 months ago
e0a61abc
soc127__phase1_pool_shared__common_crawl__part_000__data__common_crawl-electronics_and_hardware-0016__shard_00000336.jsonl.zst
50.7 kB
xet
about 2 months ago
dde954c9
soc127__phase1_pool_shared__common_crawl__part_000__data__common_crawl-electronics_and_hardware-0016__shard_00000339.jsonl.zst
44.7 kB
xet
about 2 months ago
00aa6354
soc127__phase1_pool_shared__common_crawl__part_000__data__common_crawl-electronics_and_hardware-0016__shard_00000341.jsonl.zst
42.2 kB
xet
about 2 months ago
f43d9857
soc127__phase1_pool_shared__common_crawl__part_000__data__common_crawl-electronics_and_hardware-0016__shard_00000343.jsonl.zst
67.7 kB
xet
about 2 months ago
5b22d7a6
soc127__phase1_pool_shared__common_crawl__part_000__data__common_crawl-electronics_and_hardware-0016__shard_00000350.jsonl.zst
64.3 kB
xet
about 2 months ago
86dfb61e
soc127__phase1_pool_shared__common_crawl__part_000__data__common_crawl-electronics_and_hardware-0016__shard_00000351.jsonl.zst
49.4 kB
xet
about 2 months ago
aa82ad20
soc127__phase1_pool_shared__common_crawl__part_000__data__common_crawl-electronics_and_hardware-0016__shard_00000352.jsonl.zst
50.9 kB
xet
about 2 months ago
4590477a
soc127__phase1_pool_shared__common_crawl__part_000__data__common_crawl-electronics_and_hardware-0016__shard_00000353.jsonl.zst
51.9 kB
xet
about 2 months ago
5c690878
soc127__phase1_pool_shared__common_crawl__part_000__data__common_crawl-electronics_and_hardware-0016__shard_00000354.jsonl.zst
44.4 kB
xet
about 2 months ago
9e23f373
soc127__phase1_pool_shared__common_crawl__part_000__data__common_crawl-electronics_and_hardware-0016__shard_00000356.jsonl.zst
39.1 kB
xet
about 2 months ago
c8ba9d7a
soc127__phase1_pool_shared__common_crawl__part_000__data__common_crawl-electronics_and_hardware-0016__shard_00000357.jsonl.zst
58.7 kB
xet
about 2 months ago
f2cbbf5b
soc127__phase1_pool_shared__common_crawl__part_000__data__common_crawl-electronics_and_hardware-0016__shard_00000358.jsonl.zst
76.3 kB
xet
about 2 months ago
de6b2d2a
soc127__phase1_pool_shared__common_crawl__part_000__data__common_crawl-electronics_and_hardware-0016__shard_00000364.jsonl.zst
46.9 kB
xet
about 2 months ago
cf3b0d70
soc127__phase1_pool_shared__common_crawl__part_000__data__common_crawl-electronics_and_hardware-0016__shard_00000373.jsonl.zst
52.2 kB
xet
about 2 months ago
401dede8
soc127__phase1_pool_shared__common_crawl__part_000__data__common_crawl-electronics_and_hardware-0016__shard_00000378.jsonl.zst
47.9 kB
xet
about 2 months ago
cc0b7801
soc127__phase1_pool_shared__common_crawl__part_000__data__common_crawl-electronics_and_hardware-0016__shard_00000382.jsonl.zst
42.4 kB
xet
about 2 months ago
6a5ca59d
soc127__phase1_pool_shared__common_crawl__part_000__data__common_crawl-electronics_and_hardware-0016__shard_00000385.jsonl.zst
60.6 kB
xet
about 2 months ago
f5174d0b
soc127__phase1_pool_shared__common_crawl__part_000__data__common_crawl-electronics_and_hardware-0016__shard_00000387.jsonl.zst
28.6 kB
xet
about 2 months ago
bb2e31fe
soc127__phase1_pool_shared__common_crawl__part_000__data__common_crawl-electronics_and_hardware-0016__shard_00000389.jsonl.zst
97.2 kB
xet
about 2 months ago
9e1a8fb5
soc127__phase1_pool_shared__common_crawl__part_000__data__common_crawl-electronics_and_hardware-0016__shard_00000390.jsonl.zst
37.7 kB
xet
about 2 months ago
c78a5965
soc127__phase1_pool_shared__common_crawl__part_000__data__common_crawl-electronics_and_hardware-0016__shard_00000392.jsonl.zst
37.9 kB
xet
about 2 months ago
b2f75f5e
soc127__phase1_pool_shared__common_crawl__part_000__data__common_crawl-electronics_and_hardware-0017__shard_00000001.jsonl.zst
57.3 kB
xet
about 2 months ago
7f59d40f
soc127__phase1_pool_shared__common_crawl__part_000__data__common_crawl-electronics_and_hardware-0017__shard_00000002.jsonl.zst
104 kB
xet
about 2 months ago
234030f5
soc127__phase1_pool_shared__common_crawl__part_000__data__common_crawl-electronics_and_hardware-0017__shard_00000003.jsonl.zst
69.5 kB
xet
about 2 months ago
1fec85b6
soc127__phase1_pool_shared__common_crawl__part_000__data__common_crawl-electronics_and_hardware-0017__shard_00000004.jsonl.zst
64 kB
xet
about 2 months ago
ca4ebf88
soc127__phase1_pool_shared__common_crawl__part_000__data__common_crawl-electronics_and_hardware-0017__shard_00000005.jsonl.zst
58.8 kB
xet
about 2 months ago
22a9fd08
soc127__phase1_pool_shared__common_crawl__part_000__data__common_crawl-electronics_and_hardware-0017__shard_00000007.jsonl.zst
59.4 kB
xet
about 2 months ago
edc52ed6
soc127__phase1_pool_shared__common_crawl__part_000__data__common_crawl-electronics_and_hardware-0017__shard_00000008.jsonl.zst
56.1 kB
xet
about 2 months ago
2d16e3d1
soc127__phase1_pool_shared__common_crawl__part_000__data__common_crawl-electronics_and_hardware-0017__shard_00000009.jsonl.zst
49.1 kB
xet
about 2 months ago
db11ed11
soc127__phase1_pool_shared__common_crawl__part_000__data__common_crawl-electronics_and_hardware-0017__shard_00000010.jsonl.zst
76.7 kB
xet
about 2 months ago
554398e4
soc127__phase1_pool_shared__common_crawl__part_000__data__common_crawl-electronics_and_hardware-0017__shard_00000011.jsonl.zst
28.8 kB
xet
about 2 months ago
7ce1006a
soc127__phase1_pool_shared__common_crawl__part_000__data__common_crawl-electronics_and_hardware-0017__shard_00000012.jsonl.zst
61.9 kB
xet
about 2 months ago
92d4f839
soc127__phase1_pool_shared__common_crawl__part_000__data__common_crawl-electronics_and_hardware-0017__shard_00000013.jsonl.zst
45.7 kB
xet
about 2 months ago
069fc55d
Total size: 2.37 GB
Files: 46,808
Last updated: Mar 23
Pre-warmed CDN: US, EU