krystal7/Reflector-Internalizing-Safety-Llama-3.1-8B-RL • Text Generation • 8B • Updated about 11 hours ago • 43
krystal7/llama-8b-reflect-rl-segmented-math-knowledge-en-t-krystal7-reflectbonus-gs93 • 8B • Updated Jan 22
OThink-R1: Intrinsic Fast/Slow Thinking Mode Switching for Over-Reasoning Mitigation • Paper • 2506.02397 • Published Jun 3, 2025 • 36