TinyGSM: achieving >80% on GSM8k with small language models Paper • 2312.09241 • Published Dec 14, 2023 • 40
Transformers.js V4 demos Collection A collection of demos built with Transformers.js V4 • 23 items • Updated about 19 hours ago • 47
GR00T N1: An Open Foundation Model for Generalist Humanoid Robots Paper • 2503.14734 • Published Mar 18, 2025 • 7
Avey 1 Research Preview Collection 1.5B preview models trained on 100B tokens of FineWeb, and an instruct-tuned version on smoltalk. • 3 items • Updated Jun 16, 2025 • 7
Action Chunking with Transformers for Image-Based Spacecraft Guidance and Control Paper • 2509.04628 • Published Sep 4, 2025 • 1
TinyStories: How Small Can Language Models Be and Still Speak Coherent English? Paper • 2305.07759 • Published May 12, 2023 • 45
PaperBanana: Automating Academic Illustration for AI Scientists Paper • 2601.23265 • Published Jan 30, 2026 • 222
ConceptMoE: Adaptive Token-to-Concept Compression for Implicit Compute Allocation Paper • 2601.21420 • Published Jan 29, 2026 • 42
Lotus-2: Advancing Geometric Dense Prediction with Powerful Image Generative Model Paper • 2512.01030 • Published Nov 30, 2025 • 20
The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits Paper • 2402.17764 • Published Feb 27, 2024 • 628
Introducing AnyLanguageModel: One API for Local and Remote LLMs on Apple Platforms Article • Nov 20, 2025 • 42
ViDoRe V3: a comprehensive evaluation of retrieval for enterprise use-cases Article • Nov 5, 2025 • 63
FineWeb2: One Pipeline to Scale Them All -- Adapting Pre-Training Data Processing to Every Language Paper • 2506.20920 • Published Jun 26, 2025 • 78