Phi-4 Collection Phi-4 family of small language, multi-modal and reasoning models. • 17 items • Updated Jul 10, 2025 • 194
Article DeepSeek-R1 Dissection: Understanding PPO & GRPO Without Any Prior Reinforcement Learning Knowledge Feb 7, 2025 • 274
🧠 Reasoning datasets Collection Datasets with reasoning traces for math and code released by the community • 24 items • Updated May 19, 2025 • 182
Reasoning Datasets Collection Distilled synthetic Reasoning datasets • 7 items • Updated Feb 2, 2025 • 61
DeepSeek R1 (All Versions) Collection DeepSeek-R1-0528 is here! The most powerful reasoning open LLM, available in GGUF, original & 4-bit formats. Includes Llama & Qwen distilled models. • 37 items • Updated 7 days ago • 262
gemini-2.0-flash-thinking-exp-1219 Datasets Collection Existing datasets with responses regenerated using gemini-2.0-flash-thinking-exp-1219. Currently only single-turn. • 15 items • Updated Mar 28, 2025 • 6
gemini-exp-1206 Datasets Collection Existing datasets with responses regenerated using gemini-exp-1206. Currently only single-turn. • 3 items • Updated Mar 28, 2025 • 1
story writing favourites Collection Models I personally liked for generating stories in the past. Not a recommendation, most of these are outdated. • 26 items • Updated Nov 14, 2025 • 89
Reasoning Models Collection If this really helps, please upvote for the researchers' hard work • 14 items • Updated Jan 21, 2025 • 1
CoT Datasets Collection If this really helps, please upvote for the researchers' hard work • 15 items • Updated Jan 20, 2025 • 1
Model Merging Collection Model merging is a very popular technique for LLMs nowadays. Here is a chronological list of papers in this space that will help you get started with it! • 30 items • Updated Jun 12, 2024 • 250