Convergent Evolution: How Different Language Models Learn Similar Number Representations Paper • 2604.20817 • Published 28 days ago • 7
FIPO: Eliciting Deep Reasoning with Future-KL Influenced Policy Optimization Paper • 2603.19835 • Published Mar 20 • 351
Qwen3 Cross-layer Transcoders Collection Cross-layer transcoders for models from the Qwen3 family. • 2 items • Updated Dec 1, 2025 • 1
When Do Transformers Learn Heuristics for Graph Connectivity? Paper • 2510.19753 • Published Oct 22, 2025 • 4
Zebra-CoT: A Dataset for Interleaved Vision Language Reasoning Paper • 2507.16746 • Published Jul 22, 2025 • 35
Textual Steering Vectors Can Improve Visual Understanding in Multimodal Large Language Models Paper • 2505.14071 • Published May 20, 2025 • 1