In a Training Loop 🔄

Akshat Dwivedi

Akshat-Dwivedi

Akshat-Dwivedi-52

AI & ML interests

None yet

Recent Activity

liked a model 8 days ago

Prior-Labs/tabpfn_3

liked a dataset 11 days ago

5551z/VisCoR-55K

liked a model 13 days ago

Zyphra/ZAYA1-8B

upvoted a paper about 2 months ago

MegaTrain: Full Precision Training of 100B+ Parameter Large Language Models on a Single GPU

Paper • 2604.05091 • Published Apr 6 • 47

upvoted a collection 2 months ago

NVIDIA Nemotron v3

Open, Production-ready Enterprise Models • 18 items • Updated 6 days ago • 296

upvoted 3 collections 3 months ago

pplx-embed

Diffusion-Pretrained Dense and Contextual Embeddings • 9 items • Updated 6 days ago • 99

Embeddings datasets ⚡️

This collection gathers datasets for embeddings pre-training and fine-tuning. • 19 items • Updated Apr 7 • 5

SWE-bench

SWE-bench is a benchmark for evaluating language models and AI systems on their ability to resolve real-world GitHub issues. • 4 items • Updated Mar 8, 2025 • 10

upvoted a collection 4 months ago

MMFineReason

High-quality STEM reasoning dataset for Multimodal LLM post-training. • 8 items • Updated 19 days ago • 24

upvoted a paper 5 months ago

Step-DeepResearch Technical Report

Paper • 2512.20491 • Published Dec 23, 2025 • 88

upvoted an article 5 months ago

The Optimal Architecture for Small Language Models

Article • codelion • Dec 26, 2025 • 121

upvoted a collection 5 months ago

SYNTHETIC-2

12 items • Updated Oct 7, 2025 • 20

upvoted a paper 6 months ago

Native Parallel Reasoner: Reasoning in Parallelism via Self-Distilled Reinforcement Learning

Paper • 2512.07461 • Published Dec 8, 2025 • 80

upvoted 2 collections 6 months ago

OpenThinker-Agent

5 items • Updated Dec 6, 2025 • 10

Ministral 3

A collection of edge models, with Base, Instruct and Reasoning variants, in 3 different sizes: 3B, 8B and 14B. All with vision capabilities. • 9 items • Updated Dec 2, 2025 • 167