Thinking Machines Lab

company

https://thinkingmachines.ai

Team members 61
updated a model 4 months ago

thinkingmachineslabinc/meta-llama-3-tokenizer

authored a paper 4 months ago

ThreadWeaver: Adaptive Threading for Efficient Parallel Reasoning in Language Models

Paper • 2512.07843 • Published Nov 24, 2025 • 22

published a model 4 months ago

thinkingmachineslabinc/meta-llama-3-instruct-tokenizer

Updated Dec 29, 2025

updated a model 4 months ago

thinkingmachineslabinc/meta-llama-3-instruct-tokenizer

Updated Dec 29, 2025

published a model 5 months ago

thinkingmachineslabinc/meta-llama-3-tokenizer

authored 3 papers about 1 year ago

GPT-4o System Card

Paper • 2410.21276 • Published Oct 25, 2024 • 87

OpenAI o1 System Card

Paper • 2412.16720 • Published Dec 21, 2024 • 37

A PINN Approach to Symbolic Differential Operator Discovery with Sparse Data

Paper • 2212.04630 • Published Dec 9, 2022

authored a paper over 1 year ago

Agent Laboratory: Using LLM Agents as Research Assistants

Paper • 2501.04227 • Published Jan 8, 2025 • 96

authored a paper over 1 year ago

Mixture-of-Transformers: A Sparse and Scalable Architecture for Multi-Modal Foundation Models

Paper • 2411.04996 • Published Nov 7, 2024 • 51

authored a paper over 1 year ago

SwiftKV: Fast Prefill-Optimized Inference with Knowledge-Preserving Model Transformation

Paper • 2410.03960 • Published Oct 4, 2024 • 2

authored a paper almost 2 years ago

MoMa: Efficient Early-Fusion Pre-training with Mixture of Modality-Aware Experts

Paper • 2407.21770 • Published Jul 31, 2024 • 22

authored a paper almost 2 years ago

CharXiv: Charting Gaps in Realistic Chart Understanding in Multimodal LLMs

Paper • 2406.18521 • Published Jun 26, 2024 • 31

authored a paper almost 2 years ago

From Crowdsourced Data to High-Quality Benchmarks: Arena-Hard and BenchBuilder Pipeline

Paper • 2406.11939 • Published Jun 17, 2024 • 8

authored a paper almost 2 years ago

Nearest Neighbor Speculative Decoding for LLM Generation and Attribution

Paper • 2405.19325 • Published May 29, 2024 • 14

authored a paper about 2 years ago

Branch-Train-MiX: Mixing Expert LLMs into a Mixture-of-Experts LLM

Paper • 2403.07816 • Published Mar 12, 2024 • 45

authored a paper about 2 years ago

Chatbot Arena: An Open Platform for Evaluating LLMs by Human Preference

Paper • 2403.04132 • Published Mar 7, 2024 • 41

authored a paper about 2 years ago

Instruction-tuned Language Models are Better Knowledge Learners

Paper • 2402.12847 • Published Feb 20, 2024 • 26

authored 2 papers over 2 years ago

LEVER: Learning to Verify Language-to-Code Generation with Execution

Paper • 2302.08468 • Published Feb 16, 2023 • 1

Efficient Large Scale Language Modeling with Mixtures of Experts

Paper • 2112.10684 • Published Dec 20, 2021 • 2