Bucket: HCAI-Lab/dolma3-6t-sample-5000-docs (Human-Centered AI Lab)
Path: HCAI-Lab/dolma3-6t-sample-5000-docs/worker_0070
11.1 GB, 56,043 files, updated about 2 months ago
Name | Size | Uploaded | Xet hash
soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0016__shard_00000263.jsonl.zst | 214 kB | about 2 months ago | 13eeeb6e
soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0016__shard_00000265.jsonl.zst | 179 kB | about 2 months ago | 7374f12e
soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0016__shard_00000266.jsonl.zst | 201 kB | about 2 months ago | 2f679e20
soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0016__shard_00000267.jsonl.zst | 224 kB | about 2 months ago | 30cbf773
soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0016__shard_00000268.jsonl.zst | 253 kB | about 2 months ago | 14c47da0
soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0016__shard_00000269.jsonl.zst | 171 kB | about 2 months ago | d56bf44f
soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0016__shard_00000271.jsonl.zst | 211 kB | about 2 months ago | 944e4275
soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0016__shard_00000272.jsonl.zst | 256 kB | about 2 months ago | 0d0f1086
soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0016__shard_00000274.jsonl.zst | 219 kB | about 2 months ago | cd13d7e7
soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0016__shard_00000275.jsonl.zst | 310 kB | about 2 months ago | 8f04ee6f
soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0016__shard_00000276.jsonl.zst | 203 kB | about 2 months ago | 90fd5911
soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0016__shard_00000277.jsonl.zst | 229 kB | about 2 months ago | 7c9cea66
soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0016__shard_00000278.jsonl.zst | 227 kB | about 2 months ago | 626a9fb6
soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0016__shard_00000279.jsonl.zst | 274 kB | about 2 months ago | a1ce5eef
soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0016__shard_00000280.jsonl.zst | 208 kB | about 2 months ago | 800d8184
soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0016__shard_00000282.jsonl.zst | 247 kB | about 2 months ago | 20c3e907
soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0016__shard_00000283.jsonl.zst | 262 kB | about 2 months ago | bf8bc9d8
soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0016__shard_00000284.jsonl.zst | 262 kB | about 2 months ago | 2dd5a07f
soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0016__shard_00000285.jsonl.zst | 244 kB | about 2 months ago | 89f1d4c5
soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0016__shard_00000286.jsonl.zst | 269 kB | about 2 months ago | 84e18302
soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0016__shard_00000287.jsonl.zst | 238 kB | about 2 months ago | 26f6c516
soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0016__shard_00000290.jsonl.zst | 251 kB | about 2 months ago | 9300de8d
soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0016__shard_00000291.jsonl.zst | 223 kB | about 2 months ago | fad24e07
soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0016__shard_00000293.jsonl.zst | 211 kB | about 2 months ago | fca31634
soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0016__shard_00000294.jsonl.zst | 284 kB | about 2 months ago | 65139156
soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0016__shard_00000295.jsonl.zst | 252 kB | about 2 months ago | 4456920a
soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0016__shard_00000296.jsonl.zst | 261 kB | about 2 months ago | 1bdb0a68
soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0016__shard_00000298.jsonl.zst | 217 kB | about 2 months ago | 447abdc9
soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0016__shard_00000299.jsonl.zst | 272 kB | about 2 months ago | 8afd41c3
soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0016__shard_00000300.jsonl.zst | 284 kB | about 2 months ago | 65bacb7a
soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0016__shard_00000301.jsonl.zst | 184 kB | about 2 months ago | 23e66fc9
soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0016__shard_00000302.jsonl.zst | 230 kB | about 2 months ago | ed6e32fe
soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0016__shard_00000303.jsonl.zst | 257 kB | about 2 months ago | b1ff1714
soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0016__shard_00000305.jsonl.zst | 224 kB | about 2 months ago | 156e7202
soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0016__shard_00000306.jsonl.zst | 309 kB | about 2 months ago | fad34d32
soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0016__shard_00000307.jsonl.zst | 238 kB | about 2 months ago | 997b27b6
soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0016__shard_00000308.jsonl.zst | 284 kB | about 2 months ago | def946ac
soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0016__shard_00000309.jsonl.zst | 282 kB | about 2 months ago | 4788bf26
soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0016__shard_00000310.jsonl.zst | 258 kB | about 2 months ago | e471e3fd
soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0016__shard_00000313.jsonl.zst | 61.1 kB | about 2 months ago | 9070d3b5
soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0016__shard_00000316.jsonl.zst | 112 kB | about 2 months ago | 5c8ae5fa
soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0016__shard_00000318.jsonl.zst | 85.8 kB | about 2 months ago | a1f24d51
soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0016__shard_00000319.jsonl.zst | 415 kB | about 2 months ago | 109f0880
soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0016__shard_00000321.jsonl.zst | 178 kB | about 2 months ago | 63e0c449
soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0016__shard_00000322.jsonl.zst | 299 kB | about 2 months ago | 513047ec
soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0016__shard_00000323.jsonl.zst | 270 kB | about 2 months ago | 9692e9d4
soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0016__shard_00000326.jsonl.zst | 202 kB | about 2 months ago | 0a052732
soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0016__shard_00000327.jsonl.zst | 195 kB | about 2 months ago | b2d1f1dd
soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0016__shard_00000330.jsonl.zst | 242 kB | about 2 months ago | 181c1dc2
soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0016__shard_00000331.jsonl.zst | 206 kB | about 2 months ago | d19669ff
soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0016__shard_00000332.jsonl.zst | 171 kB | about 2 months ago | e1828224
soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0016__shard_00000333.jsonl.zst | 237 kB | about 2 months ago | a11500fe
soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0016__shard_00000334.jsonl.zst | 291 kB | about 2 months ago | 79458bbc
soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0016__shard_00000335.jsonl.zst | 164 kB | about 2 months ago | c2113d4f
soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0016__shard_00000336.jsonl.zst | 264 kB | about 2 months ago | a4c1ce08
soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0016__shard_00000337.jsonl.zst | 319 kB | about 2 months ago | 0b081c71
soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0016__shard_00000338.jsonl.zst | 230 kB | about 2 months ago | 95d06286
soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0016__shard_00000339.jsonl.zst | 300 kB | about 2 months ago | 9b76f409
soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0016__shard_00000340.jsonl.zst | 217 kB | about 2 months ago | ab982a5b
soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0016__shard_00000341.jsonl.zst | 266 kB | about 2 months ago | 14ca19bb
soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0016__shard_00000343.jsonl.zst | 303 kB | about 2 months ago | f53e906b
soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0016__shard_00000344.jsonl.zst | 266 kB | about 2 months ago | c0413a26
soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0016__shard_00000345.jsonl.zst | 243 kB | about 2 months ago | 3aa60dda
soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0016__shard_00000346.jsonl.zst | 238 kB | about 2 months ago | 5d558892
soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0016__shard_00000347.jsonl.zst | 209 kB | about 2 months ago | d747e7b8
soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0016__shard_00000348.jsonl.zst | 294 kB | about 2 months ago | b34c2dbc
soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0016__shard_00000349.jsonl.zst | 310 kB | about 2 months ago | e35461aa
soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0016__shard_00000350.jsonl.zst | 136 kB | about 2 months ago | bb0e8113
soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0016__shard_00000354.jsonl.zst | 213 kB | about 2 months ago | 5d6c8102
soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0016__shard_00000355.jsonl.zst | 207 kB | about 2 months ago | 7cf63b66
soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0016__shard_00000356.jsonl.zst | 259 kB | about 2 months ago | 7ff756ff
soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0016__shard_00000357.jsonl.zst | 254 kB | about 2 months ago | 54b18388
soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0016__shard_00000359.jsonl.zst | 204 kB | about 2 months ago | d26d5135
soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0016__shard_00000360.jsonl.zst | 173 kB | about 2 months ago | 905712ea
soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0016__shard_00000362.jsonl.zst | 306 kB | about 2 months ago | d47161bc
soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0016__shard_00000365.jsonl.zst | 252 kB | about 2 months ago | 64a6d3e2
soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0016__shard_00000366.jsonl.zst | 226 kB | about 2 months ago | abb517b9
soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0016__shard_00000367.jsonl.zst | 227 kB | about 2 months ago | 3beca162
soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0016__shard_00000368.jsonl.zst | 204 kB | about 2 months ago | c2605b34
soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0016__shard_00000369.jsonl.zst | 220 kB | about 2 months ago | 98b4a406
soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0016__shard_00000370.jsonl.zst | 239 kB | about 2 months ago | a4ddce44
soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0016__shard_00000372.jsonl.zst | 338 kB | about 2 months ago | 64cd2aac
soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0016__shard_00000374.jsonl.zst | 223 kB | about 2 months ago | 8bba0638
soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0016__shard_00000375.jsonl.zst | 215 kB | about 2 months ago | 40261df2
soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0016__shard_00000376.jsonl.zst | 216 kB | about 2 months ago | 6c557339
soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0016__shard_00000377.jsonl.zst | 208 kB | about 2 months ago | d4f20a3f
soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0016__shard_00000379.jsonl.zst | 248 kB | about 2 months ago | fec868e9
soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0016__shard_00000381.jsonl.zst | 257 kB | about 2 months ago | 008463a5
soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0016__shard_00000382.jsonl.zst | 243 kB | about 2 months ago | 03158bef
soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0016__shard_00000383.jsonl.zst | 260 kB | about 2 months ago | 11084c2c
soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0016__shard_00000384.jsonl.zst | 235 kB | about 2 months ago | 9b27fa13
soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0016__shard_00000386.jsonl.zst | 200 kB | about 2 months ago | dee0464c
soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0016__shard_00000387.jsonl.zst | 232 kB | about 2 months ago | 33fc826d
soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0016__shard_00000388.jsonl.zst | 270 kB | about 2 months ago | a4b5a1ee
soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0016__shard_00000389.jsonl.zst | 219 kB | about 2 months ago | 4fd90db5
soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0016__shard_00000390.jsonl.zst | 223 kB | about 2 months ago | c053e737
soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0016__shard_00000391.jsonl.zst | 219 kB | about 2 months ago | 3d6a995a
soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0016__shard_00000393.jsonl.zst | 191 kB | about 2 months ago | 722b4680
soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0016__shard_00000394.jsonl.zst | 250 kB | about 2 months ago | 785aa940
soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0016__shard_00000395.jsonl.zst | 245 kB | about 2 months ago | 7a4ed234
Total size: 11.1 GB
Files: 56,043
Last updated: Mar 24
Pre-warmed CDN: US, EU