admarcosai's Collections
Datasets
updated
Beyond Human Data: Scaling Self-Training for Problem-Solving with Language Models
Paper • 2312.06585 • Published • 29
TinyGSM: achieving >80% on GSM8k with small language models
Paper • 2312.09241 • Published • 40
Viewer • Updated • 70k • 2.94k • 89
Paper • 2309.17425 • Published • 6
jondurbin/gutenberg-dpo-v0.1
Viewer • Updated • 918 • 485 • 156
garage-bAInd/Open-Platypus
Viewer • Updated • 24.9k • 3.9k • 409
Viewer • Updated • 243k • 513 • 214
Viewer • Updated • 58.7k • 222 • 46
Viewer • Updated • 1.49M • 727 • 148
Viewer • Updated • 166k • 684 • 114
Viewer • Updated • 198k • 155 • 112
Viewer • Updated • 2.75M • 11.3k • 378
Viewer • Updated • 6.2M • 2.94k • 101
open-web-math/open-web-math
Viewer • Updated • 6.32M • 13.4k • 322
Viewer • Updated • 4.04k • 3.55M • 196
Viewer • Updated • 14.3k • 2.75k • 50
Viewer • Updated • 44.8k • 247 • 53
Viewer • Updated • 6.14k • 14.1k • 194
Viewer • Updated • 262k • 3.41k • 294
argilla/ultrafeedback-binarized-preferences-cleaned
Viewer • Updated • 60.9k • 3.75k • 152
WhiteRabbitNeo/Code-Functions-Level-Cyber
Viewer • Updated • 8.44k • 140 • 26
WhiteRabbitNeo/Code-Functions-Level-General
Viewer • Updated • 8.69k • 20 • 20
Viewer • Updated • 317k • 7.87k • 33
Updated • 2.47k • 128
Viewer • Updated • 183k • 1.05k • 294
selfrag/selfrag_train_data
Viewer • Updated • 146k • 169 • 73
Viewer • Updated • 463k • 58 • 17
Locutusque/UltraTextbooks
Viewer • Updated • 5.52M • 7.9k • 196
Undi95/ConversationChronicles-sharegpt-SHARDED
Viewer • Updated • 787k • 175 • 10
OpenMathInstruct-1: A 1.8 Million Math Instruction Tuning Dataset
Paper • 2402.10176 • Published • 38
Viewer • Updated • 31.1M • 47.9k • 648
togethercomputer/RedPajama-Data-1T
Viewer • Updated • 1.73M • 2.39k • 1.11k
Viewer • Updated • 968M • 11.8k • 878
Viewer • Updated • 276M • 21.6k • 163
MegaPairs: Massive Data Synthesis For Universal Multimodal Retrieval
Paper • 2412.14475 • Published • 55