Bucket: HCAI-Lab/dolma3-6t-sample-5000-docs (Human-Centered AI Lab)
Path: HCAI-Lab/dolma3-6t-sample-5000-docs/worker_0014
Files (xet): 11.1 GB, 56,043 files, updated about 2 months ago
Name  Size  Uploaded  Xet hash
soc127__phase1_pool_shared__common_crawl__part_001__data__common_crawl-entertainment-0018__shard_00000821.jsonl.zst  304 kB  about 2 months ago  68a159ab
soc127__phase1_pool_shared__common_crawl__part_001__data__common_crawl-entertainment-0018__shard_00000822.jsonl.zst  203 kB  about 2 months ago  5b09e160
soc127__phase1_pool_shared__common_crawl__part_001__data__common_crawl-entertainment-0018__shard_00000823.jsonl.zst  137 kB  about 2 months ago  2efd091c
soc127__phase1_pool_shared__common_crawl__part_001__data__common_crawl-entertainment-0018__shard_00000824.jsonl.zst  251 kB  about 2 months ago  c4158a6e
soc127__phase1_pool_shared__common_crawl__part_001__data__common_crawl-entertainment-0018__shard_00000825.jsonl.zst  173 kB  about 2 months ago  aa9787e2
soc127__phase1_pool_shared__common_crawl__part_001__data__common_crawl-entertainment-0018__shard_00000826.jsonl.zst  131 kB  about 2 months ago  9355b9bd
soc127__phase1_pool_shared__common_crawl__part_001__data__common_crawl-entertainment-0018__shard_00000827.jsonl.zst  283 kB  about 2 months ago  31c7fc27
soc127__phase1_pool_shared__common_crawl__part_001__data__common_crawl-entertainment-0018__shard_00000828.jsonl.zst  169 kB  about 2 months ago  6c850518
soc127__phase1_pool_shared__common_crawl__part_001__data__common_crawl-entertainment-0018__shard_00000829.jsonl.zst  247 kB  about 2 months ago  47fc6344
soc127__phase1_pool_shared__common_crawl__part_001__data__common_crawl-entertainment-0018__shard_00000830.jsonl.zst  229 kB  about 2 months ago  a0d84606
soc127__phase1_pool_shared__common_crawl__part_001__data__common_crawl-entertainment-0018__shard_00000831.jsonl.zst  214 kB  about 2 months ago  c16bf133
soc127__phase1_pool_shared__common_crawl__part_001__data__common_crawl-entertainment-0018__shard_00000832.jsonl.zst  193 kB  about 2 months ago  55f38405
soc127__phase1_pool_shared__common_crawl__part_001__data__common_crawl-entertainment-0018__shard_00000833.jsonl.zst  166 kB  about 2 months ago  9765821b
soc127__phase1_pool_shared__common_crawl__part_001__data__common_crawl-entertainment-0018__shard_00000834.jsonl.zst  148 kB  about 2 months ago  23f4a4dc
soc127__phase1_pool_shared__common_crawl__part_001__data__common_crawl-entertainment-0018__shard_00000835.jsonl.zst  247 kB  about 2 months ago  94fcfbe9
soc127__phase1_pool_shared__common_crawl__part_001__data__common_crawl-entertainment-0018__shard_00000836.jsonl.zst  199 kB  about 2 months ago  1b9340ee
soc127__phase1_pool_shared__common_crawl__part_001__data__common_crawl-entertainment-0018__shard_00000837.jsonl.zst  224 kB  about 2 months ago  6cd3c9d8
soc127__phase1_pool_shared__common_crawl__part_001__data__common_crawl-entertainment-0018__shard_00000838.jsonl.zst  167 kB  about 2 months ago  af227b3b
soc127__phase1_pool_shared__common_crawl__part_001__data__common_crawl-entertainment-0018__shard_00000839.jsonl.zst  201 kB  about 2 months ago  473fc6e0
soc127__phase1_pool_shared__common_crawl__part_001__data__common_crawl-entertainment-0018__shard_00000840.jsonl.zst  177 kB  about 2 months ago  dc53488f
soc127__phase1_pool_shared__common_crawl__part_001__data__common_crawl-entertainment-0018__shard_00000841.jsonl.zst  208 kB  about 2 months ago  c40c2115
soc127__phase1_pool_shared__common_crawl__part_001__data__common_crawl-entertainment-0018__shard_00000842.jsonl.zst  116 kB  about 2 months ago  b16c6216
soc127__phase1_pool_shared__common_crawl__part_001__data__common_crawl-entertainment-0018__shard_00000843.jsonl.zst  280 kB  about 2 months ago  7a8e9109
soc127__phase1_pool_shared__common_crawl__part_001__data__common_crawl-entertainment-0018__shard_00000844.jsonl.zst  222 kB  about 2 months ago  6c634ee7
soc127__phase1_pool_shared__common_crawl__part_001__data__common_crawl-entertainment-0018__shard_00000845.jsonl.zst  192 kB  about 2 months ago  8b1b9cca
soc127__phase1_pool_shared__common_crawl__part_001__data__common_crawl-entertainment-0018__shard_00000846.jsonl.zst  424 kB  about 2 months ago  6353f3ed
soc127__phase1_pool_shared__common_crawl__part_001__data__common_crawl-entertainment-0018__shard_00000847.jsonl.zst  188 kB  about 2 months ago  902fc775
soc127__phase1_pool_shared__common_crawl__part_001__data__common_crawl-entertainment-0018__shard_00000848.jsonl.zst  224 kB  about 2 months ago  2fde8aed
soc127__phase1_pool_shared__common_crawl__part_001__data__common_crawl-entertainment-0018__shard_00000849.jsonl.zst  213 kB  about 2 months ago  cdca8a38
soc127__phase1_pool_shared__common_crawl__part_001__data__common_crawl-entertainment-0018__shard_00000850.jsonl.zst  185 kB  about 2 months ago  1f8b3823
soc127__phase1_pool_shared__common_crawl__part_001__data__common_crawl-entertainment-0018__shard_00000851.jsonl.zst  199 kB  about 2 months ago  1ab73e24
soc127__phase1_pool_shared__common_crawl__part_001__data__common_crawl-entertainment-0018__shard_00000852.jsonl.zst  160 kB  about 2 months ago  a266e978
soc127__phase1_pool_shared__common_crawl__part_001__data__common_crawl-entertainment-0018__shard_00000853.jsonl.zst  182 kB  about 2 months ago  a4034984
soc127__phase1_pool_shared__common_crawl__part_001__data__common_crawl-entertainment-0018__shard_00000854.jsonl.zst  182 kB  about 2 months ago  748f64a6
soc127__phase1_pool_shared__common_crawl__part_001__data__common_crawl-entertainment-0018__shard_00000855.jsonl.zst  213 kB  about 2 months ago  6b5d9ec7
soc127__phase1_pool_shared__common_crawl__part_001__data__common_crawl-entertainment-0018__shard_00000856.jsonl.zst  185 kB  about 2 months ago  d165f82b
soc127__phase1_pool_shared__common_crawl__part_001__data__common_crawl-entertainment-0018__shard_00000857.jsonl.zst  163 kB  about 2 months ago  712c1c06
soc127__phase1_pool_shared__common_crawl__part_001__data__common_crawl-entertainment-0018__shard_00000858.jsonl.zst  166 kB  about 2 months ago  2973f61c
soc127__phase1_pool_shared__common_crawl__part_001__data__common_crawl-entertainment-0018__shard_00000859.jsonl.zst  193 kB  about 2 months ago  a2bd03c2
soc127__phase1_pool_shared__common_crawl__part_001__data__common_crawl-entertainment-0018__shard_00000860.jsonl.zst  205 kB  about 2 months ago  9052b15b
soc127__phase1_pool_shared__common_crawl__part_001__data__common_crawl-entertainment-0018__shard_00000861.jsonl.zst  175 kB  about 2 months ago  dfafc215
soc127__phase1_pool_shared__common_crawl__part_001__data__common_crawl-entertainment-0018__shard_00000862.jsonl.zst  178 kB  about 2 months ago  c2eb5035
soc127__phase1_pool_shared__common_crawl__part_001__data__common_crawl-entertainment-0018__shard_00000863.jsonl.zst  176 kB  about 2 months ago  0a64cc20
soc127__phase1_pool_shared__common_crawl__part_001__data__common_crawl-entertainment-0018__shard_00000864.jsonl.zst  118 kB  about 2 months ago  6d33311f
soc127__phase1_pool_shared__common_crawl__part_001__data__common_crawl-entertainment-0018__shard_00000865.jsonl.zst  155 kB  about 2 months ago  c1a06bab
soc127__phase1_pool_shared__common_crawl__part_001__data__common_crawl-entertainment-0018__shard_00000866.jsonl.zst  166 kB  about 2 months ago  44eb2f08
soc127__phase1_pool_shared__common_crawl__part_001__data__common_crawl-entertainment-0018__shard_00000867.jsonl.zst  153 kB  about 2 months ago  fc117b95
soc127__phase1_pool_shared__common_crawl__part_001__data__common_crawl-entertainment-0018__shard_00000868.jsonl.zst  257 kB  about 2 months ago  05a9c97b
soc127__phase1_pool_shared__common_crawl__part_001__data__common_crawl-entertainment-0018__shard_00000869.jsonl.zst  193 kB  about 2 months ago  4d19c86b
soc127__phase1_pool_shared__common_crawl__part_001__data__common_crawl-entertainment-0018__shard_00000870.jsonl.zst  178 kB  about 2 months ago  82fb4610
soc127__phase1_pool_shared__common_crawl__part_001__data__common_crawl-entertainment-0018__shard_00000871.jsonl.zst  150 kB  about 2 months ago  fd98afe6
soc127__phase1_pool_shared__common_crawl__part_001__data__common_crawl-entertainment-0018__shard_00000872.jsonl.zst  247 kB  about 2 months ago  7259a1da
soc127__phase1_pool_shared__common_crawl__part_001__data__common_crawl-entertainment-0018__shard_00000873.jsonl.zst  194 kB  about 2 months ago  ec5c8e8d
soc127__phase1_pool_shared__common_crawl__part_001__data__common_crawl-entertainment-0018__shard_00000874.jsonl.zst  232 kB  about 2 months ago  a5738d8e
soc127__phase1_pool_shared__common_crawl__part_001__data__common_crawl-entertainment-0018__shard_00000875.jsonl.zst  155 kB  about 2 months ago  c246222e
soc127__phase1_pool_shared__common_crawl__part_001__data__common_crawl-entertainment-0018__shard_00000876.jsonl.zst  168 kB  about 2 months ago  2e370c9a
soc127__phase1_pool_shared__common_crawl__part_001__data__common_crawl-entertainment-0018__shard_00000877.jsonl.zst  177 kB  about 2 months ago  540a7d12
soc127__phase1_pool_shared__common_crawl__part_001__data__common_crawl-entertainment-0018__shard_00000878.jsonl.zst  197 kB  about 2 months ago  14161973
soc127__phase1_pool_shared__common_crawl__part_001__data__common_crawl-entertainment-0018__shard_00000879.jsonl.zst  221 kB  about 2 months ago  e2df899b
soc127__phase1_pool_shared__common_crawl__part_001__data__common_crawl-entertainment-0018__shard_00000880.jsonl.zst  157 kB  about 2 months ago  316aae45
soc127__phase1_pool_shared__common_crawl__part_001__data__common_crawl-entertainment-0018__shard_00000881.jsonl.zst  214 kB  about 2 months ago  e36ff362
soc127__phase1_pool_shared__common_crawl__part_001__data__common_crawl-entertainment-0018__shard_00000882.jsonl.zst  148 kB  about 2 months ago  affd792e
soc127__phase1_pool_shared__common_crawl__part_001__data__common_crawl-entertainment-0018__shard_00000883.jsonl.zst  335 kB  about 2 months ago  653c343f
soc127__phase1_pool_shared__common_crawl__part_001__data__common_crawl-entertainment-0018__shard_00000884.jsonl.zst  204 kB  about 2 months ago  2ca2a109
soc127__phase1_pool_shared__common_crawl__part_001__data__common_crawl-entertainment-0018__shard_00000885.jsonl.zst  207 kB  about 2 months ago  f9be8eb0
soc127__phase1_pool_shared__common_crawl__part_001__data__common_crawl-entertainment-0018__shard_00000886.jsonl.zst  212 kB  about 2 months ago  bab78fa4
soc127__phase1_pool_shared__common_crawl__part_001__data__common_crawl-entertainment-0018__shard_00000887.jsonl.zst  183 kB  about 2 months ago  87cf7626
soc127__phase1_pool_shared__common_crawl__part_001__data__common_crawl-entertainment-0018__shard_00000888.jsonl.zst  147 kB  about 2 months ago  f21f64ae
soc127__phase1_pool_shared__common_crawl__part_001__data__common_crawl-entertainment-0018__shard_00000889.jsonl.zst  161 kB  about 2 months ago  a0bcbc8b
soc127__phase1_pool_shared__common_crawl__part_001__data__common_crawl-entertainment-0018__shard_00000890.jsonl.zst  129 kB  about 2 months ago  fd006283
soc127__phase1_pool_shared__common_crawl__part_001__data__common_crawl-entertainment-0018__shard_00000891.jsonl.zst  148 kB  about 2 months ago  ff0d16e7
soc127__phase1_pool_shared__common_crawl__part_001__data__common_crawl-entertainment-0018__shard_00000892.jsonl.zst  149 kB  about 2 months ago  7ce2429a
soc127__phase1_pool_shared__common_crawl__part_001__data__common_crawl-entertainment-0018__shard_00000893.jsonl.zst  229 kB  about 2 months ago  b2876b4c
soc127__phase1_pool_shared__common_crawl__part_001__data__common_crawl-entertainment-0018__shard_00000894.jsonl.zst  206 kB  about 2 months ago  ee25f2a5
soc127__phase1_pool_shared__common_crawl__part_001__data__common_crawl-entertainment-0018__shard_00000895.jsonl.zst  198 kB  about 2 months ago  90018bd2
soc127__phase1_pool_shared__common_crawl__part_001__data__common_crawl-entertainment-0018__shard_00000896.jsonl.zst  153 kB  about 2 months ago  cf7c36c9
soc127__phase1_pool_shared__common_crawl__part_001__data__common_crawl-entertainment-0018__shard_00000897.jsonl.zst  222 kB  about 2 months ago  6d870bd8
soc127__phase1_pool_shared__common_crawl__part_001__data__common_crawl-entertainment-0018__shard_00000898.jsonl.zst  197 kB  about 2 months ago  b1f263e6
soc127__phase1_pool_shared__common_crawl__part_001__data__common_crawl-entertainment-0018__shard_00000899.jsonl.zst  193 kB  about 2 months ago  515c81f8
soc127__phase1_pool_shared__common_crawl__part_001__data__common_crawl-entertainment-0018__shard_00000900.jsonl.zst  105 kB  about 2 months ago  957f157b
soc127__phase1_pool_shared__common_crawl__part_001__data__common_crawl-entertainment-0018__shard_00000901.jsonl.zst  203 kB  about 2 months ago  d94dacc7
soc127__phase1_pool_shared__common_crawl__part_001__data__common_crawl-entertainment-0018__shard_00000902.jsonl.zst  214 kB  about 2 months ago  cae602b9
soc127__phase1_pool_shared__common_crawl__part_001__data__common_crawl-entertainment-0018__shard_00000903.jsonl.zst  186 kB  about 2 months ago  e3602264
soc127__phase1_pool_shared__common_crawl__part_001__data__common_crawl-entertainment-0018__shard_00000904.jsonl.zst  137 kB  about 2 months ago  58160bf5
soc127__phase1_pool_shared__common_crawl__part_001__data__common_crawl-entertainment-0018__shard_00000905.jsonl.zst  137 kB  about 2 months ago  6996a50f
soc127__phase1_pool_shared__common_crawl__part_001__data__common_crawl-entertainment-0018__shard_00000906.jsonl.zst  117 kB  about 2 months ago  4f1cc197
soc127__phase1_pool_shared__common_crawl__part_001__data__common_crawl-entertainment-0018__shard_00000907.jsonl.zst  160 kB  about 2 months ago  bed11a6f
soc127__phase1_pool_shared__common_crawl__part_001__data__common_crawl-entertainment-0018__shard_00000908.jsonl.zst  238 kB  about 2 months ago  4b3e6e72
soc127__phase1_pool_shared__common_crawl__part_001__data__common_crawl-entertainment-0018__shard_00000909.jsonl.zst  267 kB  about 2 months ago  49347b6a
soc127__phase1_pool_shared__common_crawl__part_001__data__common_crawl-entertainment-0018__shard_00000910.jsonl.zst  163 kB  about 2 months ago  94b118de
soc127__phase1_pool_shared__common_crawl__part_001__data__common_crawl-entertainment-0018__shard_00000911.jsonl.zst  206 kB  about 2 months ago  ab420d35
soc127__phase1_pool_shared__common_crawl__part_001__data__common_crawl-entertainment-0018__shard_00000912.jsonl.zst  218 kB  about 2 months ago  e0e063ed
soc127__phase1_pool_shared__common_crawl__part_001__data__common_crawl-entertainment-0018__shard_00000913.jsonl.zst  130 kB  about 2 months ago  7ce2b74d
soc127__phase1_pool_shared__common_crawl__part_001__data__common_crawl-entertainment-0018__shard_00000914.jsonl.zst  172 kB  about 2 months ago  bf70999b
soc127__phase1_pool_shared__common_crawl__part_001__data__common_crawl-entertainment-0018__shard_00000915.jsonl.zst  155 kB  about 2 months ago  210ad411
soc127__phase1_pool_shared__common_crawl__part_001__data__common_crawl-entertainment-0018__shard_00000916.jsonl.zst  214 kB  about 2 months ago  6025871d
soc127__phase1_pool_shared__common_crawl__part_001__data__common_crawl-entertainment-0018__shard_00000917.jsonl.zst  284 kB  about 2 months ago  754c72b3
soc127__phase1_pool_shared__common_crawl__part_001__data__common_crawl-entertainment-0018__shard_00000918.jsonl.zst  136 kB  about 2 months ago  91fbfa42
soc127__phase1_pool_shared__common_crawl__part_001__data__common_crawl-entertainment-0018__shard_00000919.jsonl.zst  209 kB  about 2 months ago  3bbdae94
soc127__phase1_pool_shared__common_crawl__part_001__data__common_crawl-entertainment-0018__shard_00000920.jsonl.zst  164 kB  about 2 months ago  38d5acab
Total size: 11.1 GB
Files: 56,043
Last updated: Mar 24
Pre-warmed CDN: US, EU