Bucket: HCAI-Lab/dolma3-6t-sample-5000-docs (Human-Centered AI Lab)
Path: HCAI-Lab/dolma3-6t-sample-5000-docs/worker_0071
11.1 GB, 56,043 files, updated about 2 months ago
| Name | Size | Uploaded | Xet hash |
| --- | --- | --- | --- |
| soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0017__shard_00000300.jsonl.zst | 198 kB | about 2 months ago | 8e3f3ca4 |
| soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0017__shard_00000301.jsonl.zst | 278 kB | about 2 months ago | e4936213 |
| soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0017__shard_00000302.jsonl.zst | 188 kB | about 2 months ago | 547f6637 |
| soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0017__shard_00000303.jsonl.zst | 220 kB | about 2 months ago | 70840e04 |
| soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0017__shard_00000304.jsonl.zst | 193 kB | about 2 months ago | ac3a7ac7 |
| soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0017__shard_00000305.jsonl.zst | 213 kB | about 2 months ago | 02427030 |
| soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0017__shard_00000306.jsonl.zst | 239 kB | about 2 months ago | 71feb70c |
| soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0017__shard_00000307.jsonl.zst | 207 kB | about 2 months ago | f64981a9 |
| soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0017__shard_00000308.jsonl.zst | 165 kB | about 2 months ago | d9abfe3d |
| soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0017__shard_00000309.jsonl.zst | 171 kB | about 2 months ago | aebe92f5 |
| soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0017__shard_00000310.jsonl.zst | 369 kB | about 2 months ago | fa6af246 |
| soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0017__shard_00000311.jsonl.zst | 194 kB | about 2 months ago | ad3096e8 |
| soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0017__shard_00000312.jsonl.zst | 176 kB | about 2 months ago | 2df5a1ae |
| soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0017__shard_00000313.jsonl.zst | 211 kB | about 2 months ago | a0f86372 |
| soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0017__shard_00000314.jsonl.zst | 241 kB | about 2 months ago | e141f743 |
| soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0017__shard_00000315.jsonl.zst | 176 kB | about 2 months ago | e95def0d |
| soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0017__shard_00000316.jsonl.zst | 215 kB | about 2 months ago | 72cc61ad |
| soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0017__shard_00000317.jsonl.zst | 227 kB | about 2 months ago | 980baf75 |
| soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0017__shard_00000318.jsonl.zst | 233 kB | about 2 months ago | 73bfb27d |
| soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0017__shard_00000319.jsonl.zst | 271 kB | about 2 months ago | 50c864f8 |
| soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0017__shard_00000320.jsonl.zst | 259 kB | about 2 months ago | 696c9097 |
| soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0017__shard_00000321.jsonl.zst | 221 kB | about 2 months ago | 68e3b9d3 |
| soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0017__shard_00000322.jsonl.zst | 263 kB | about 2 months ago | e4154ca2 |
| soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0017__shard_00000323.jsonl.zst | 174 kB | about 2 months ago | 73cf9f3e |
| soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0017__shard_00000324.jsonl.zst | 209 kB | about 2 months ago | 88732f2b |
| soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0017__shard_00000325.jsonl.zst | 168 kB | about 2 months ago | 389ff677 |
| soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0017__shard_00000326.jsonl.zst | 262 kB | about 2 months ago | 156403fb |
| soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0017__shard_00000327.jsonl.zst | 226 kB | about 2 months ago | f0b4cbe5 |
| soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0017__shard_00000328.jsonl.zst | 223 kB | about 2 months ago | 9eb198c5 |
| soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0017__shard_00000329.jsonl.zst | 268 kB | about 2 months ago | 0f101efd |
| soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0017__shard_00000330.jsonl.zst | 242 kB | about 2 months ago | 76b1df49 |
| soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0017__shard_00000331.jsonl.zst | 161 kB | about 2 months ago | dcf1389f |
| soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0017__shard_00000332.jsonl.zst | 275 kB | about 2 months ago | 3812ddcb |
| soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0017__shard_00000333.jsonl.zst | 179 kB | about 2 months ago | f4fcece3 |
| soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0017__shard_00000334.jsonl.zst | 292 kB | about 2 months ago | a68572d8 |
| soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0017__shard_00000335.jsonl.zst | 173 kB | about 2 months ago | a2757d7e |
| soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0017__shard_00000336.jsonl.zst | 261 kB | about 2 months ago | dbafc18e |
| soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0017__shard_00000337.jsonl.zst | 167 kB | about 2 months ago | da7765fe |
| soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0017__shard_00000338.jsonl.zst | 229 kB | about 2 months ago | 97e9c221 |
| soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0017__shard_00000339.jsonl.zst | 221 kB | about 2 months ago | 3d24d7df |
| soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0017__shard_00000340.jsonl.zst | 176 kB | about 2 months ago | 8043aaf3 |
| soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0017__shard_00000341.jsonl.zst | 193 kB | about 2 months ago | 62451d16 |
| soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0017__shard_00000342.jsonl.zst | 153 kB | about 2 months ago | aef5d098 |
| soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0017__shard_00000343.jsonl.zst | 200 kB | about 2 months ago | 9c0363ef |
| soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0017__shard_00000344.jsonl.zst | 199 kB | about 2 months ago | aef93b86 |
| soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0017__shard_00000345.jsonl.zst | 222 kB | about 2 months ago | d7f0139b |
| soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0017__shard_00000346.jsonl.zst | 171 kB | about 2 months ago | 25a68517 |
| soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0017__shard_00000347.jsonl.zst | 235 kB | about 2 months ago | 72e9fde5 |
| soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0017__shard_00000348.jsonl.zst | 241 kB | about 2 months ago | 0c1960a9 |
| soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0017__shard_00000349.jsonl.zst | 258 kB | about 2 months ago | 9416e49d |
| soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0017__shard_00000350.jsonl.zst | 225 kB | about 2 months ago | 1ef6d9fa |
| soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0017__shard_00000351.jsonl.zst | 223 kB | about 2 months ago | 6ddc47f1 |
| soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0017__shard_00000352.jsonl.zst | 195 kB | about 2 months ago | 3b223aee |
| soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0017__shard_00000353.jsonl.zst | 155 kB | about 2 months ago | 7398ccf2 |
| soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0017__shard_00000354.jsonl.zst | 262 kB | about 2 months ago | 5a929612 |
| soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0017__shard_00000355.jsonl.zst | 306 kB | about 2 months ago | 3c2cb6ee |
| soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0017__shard_00000356.jsonl.zst | 182 kB | about 2 months ago | 7f3b2522 |
| soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0017__shard_00000357.jsonl.zst | 218 kB | about 2 months ago | 4b13010d |
| soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0017__shard_00000358.jsonl.zst | 161 kB | about 2 months ago | f02454a3 |
| soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0017__shard_00000359.jsonl.zst | 208 kB | about 2 months ago | 98714638 |
| soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0017__shard_00000360.jsonl.zst | 204 kB | about 2 months ago | d50a7037 |
| soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0017__shard_00000361.jsonl.zst | 192 kB | about 2 months ago | 42f6dbfb |
| soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0017__shard_00000362.jsonl.zst | 307 kB | about 2 months ago | d6647661 |
| soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0017__shard_00000363.jsonl.zst | 220 kB | about 2 months ago | 2c9bd720 |
| soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0017__shard_00000364.jsonl.zst | 281 kB | about 2 months ago | 26c22116 |
| soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0017__shard_00000365.jsonl.zst | 246 kB | about 2 months ago | 8271e5ce |
| soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0017__shard_00000366.jsonl.zst | 253 kB | about 2 months ago | 02dfcf17 |
| soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0017__shard_00000367.jsonl.zst | 261 kB | about 2 months ago | 0780c5c1 |
| soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0017__shard_00000368.jsonl.zst | 280 kB | about 2 months ago | 63e3d648 |
| soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0017__shard_00000369.jsonl.zst | 192 kB | about 2 months ago | 7e0918c8 |
| soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0017__shard_00000370.jsonl.zst | 148 kB | about 2 months ago | d9f81dc3 |
| soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0017__shard_00000371.jsonl.zst | 222 kB | about 2 months ago | fb23f242 |
| soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0017__shard_00000372.jsonl.zst | 273 kB | about 2 months ago | 0631291e |
| soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0017__shard_00000373.jsonl.zst | 260 kB | about 2 months ago | 20f30a94 |
| soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0017__shard_00000374.jsonl.zst | 162 kB | about 2 months ago | 99d786ed |
| soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0017__shard_00000375.jsonl.zst | 173 kB | about 2 months ago | ad8e37b0 |
| soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0017__shard_00000376.jsonl.zst | 206 kB | about 2 months ago | a411fa7b |
| soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0017__shard_00000377.jsonl.zst | 201 kB | about 2 months ago | ac47507c |
| soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0017__shard_00000378.jsonl.zst | 247 kB | about 2 months ago | 9f855de0 |
| soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0017__shard_00000379.jsonl.zst | 131 kB | about 2 months ago | 9b11b3bf |
| soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0017__shard_00000380.jsonl.zst | 221 kB | about 2 months ago | 32f004b3 |
| soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0017__shard_00000381.jsonl.zst | 203 kB | about 2 months ago | 7975c802 |
| soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0017__shard_00000382.jsonl.zst | 169 kB | about 2 months ago | d14dd76b |
| soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0017__shard_00000383.jsonl.zst | 255 kB | about 2 months ago | beebf3b9 |
| soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0017__shard_00000384.jsonl.zst | 206 kB | about 2 months ago | 90a91cf6 |
| soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0017__shard_00000385.jsonl.zst | 211 kB | about 2 months ago | c467dcb1 |
| soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0017__shard_00000386.jsonl.zst | 249 kB | about 2 months ago | 81ad7db9 |
| soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0017__shard_00000387.jsonl.zst | 155 kB | about 2 months ago | 1a3a6be8 |
| soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0017__shard_00000388.jsonl.zst | 181 kB | about 2 months ago | 1e374a7c |
| soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0017__shard_00000389.jsonl.zst | 226 kB | about 2 months ago | b0037db8 |
| soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0017__shard_00000390.jsonl.zst | 213 kB | about 2 months ago | 261f26b5 |
| soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0017__shard_00000391.jsonl.zst | 257 kB | about 2 months ago | 75f617e6 |
| soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0017__shard_00000392.jsonl.zst | 220 kB | about 2 months ago | f03e8cd0 |
| soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0017__shard_00000393.jsonl.zst | 192 kB | about 2 months ago | 2853e455 |
| soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0017__shard_00000394.jsonl.zst | 217 kB | about 2 months ago | 6249bb36 |
| soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0017__shard_00000395.jsonl.zst | 176 kB | about 2 months ago | 70990666 |
| soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0017__shard_00000396.jsonl.zst | 246 kB | about 2 months ago | fa7d4358 |
| soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0017__shard_00000397.jsonl.zst | 197 kB | about 2 months ago | 40a8a2fa |
| soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0017__shard_00000398.jsonl.zst | 237 kB | about 2 months ago | a0250c7a |
| soc127__phase1_pool_shared__common_crawl__part_006__data__common_crawl-software-0017__shard_00000399.jsonl.zst | 245 kB | about 2 months ago | b5443334 |
Last updated: Mar 24
Pre-warmed CDN: US, EU