🔄 In a Training Loop

Reza Sayar PRO

Reza2kn

AI & ML interests

None yet

Recent Activity

updated a model about 1 hour ago

Reza2kn/surya-ocr-2-coreml-runtime

updated a dataset about 12 hours ago

Reza2kn/visualears-persian-asr-16k

liked a model about 15 hours ago

Reza2kn/Shenava-Koochik-0.9

upvoted a paper about 18 hours ago

Unlimited OCR Works

Paper • 2606.23050 • Published 4 days ago • 28

upvoted a paper 7 days ago

MolmoMotion: Forecasting Point Trajectories in 3D with Language Instruction

Paper • 2606.18558 • Published 9 days ago • 50

upvoted 2 papers 13 days ago

Revisiting Articulated Parts Perception in Robot Manipulation

Paper • 2606.08103 • Published 20 days ago • 3

VideoMDM: Towards 3D Human Motion Generation From 2D Supervision

Paper • 2606.13364 • Published 15 days ago • 20

upvoted an article 16 days ago

How to Fine-Tune Nemotron 3.5 ASR for Your Language, Domain, or Accent

Article • nvidia • 21 days ago • 65

upvoted 2 papers 18 days ago

Audio Interaction Model

Paper • 2606.05121 • Published 23 days ago • 119

UniSHARP: Universal Sharp Monocular View Synthesis

Paper • 2606.07514 • Published 21 days ago • 14

upvoted a collection 18 days ago

Sapiens2

28 items • Updated May 15 • 42

upvoted a paper 18 days ago

Sapiens2

Paper • 2604.21681 • Published Apr 23 • 22

upvoted 2 collections 22 days ago

AudioMosaic

ICML2026 AudioMosaic: Contrastive Masked Audio Representation Learning • 15 items • Updated May 10 • 3

MOSS-Audio

An open-source audio understanding model supporting speech recognition, environmental sound analysis, music understanding, time-aware QA, and complex • 9 items • Updated 13 days ago • 66

upvoted a collection 24 days ago

Cosmos3

Omnimodal World Models for Physical AI • 15 items • Updated 14 days ago • 131

upvoted a collection 25 days ago

gliner2 family

GLiNER2 extends the original GLiNER architecture to support multi-task information extraction with a schema-driven interface. • 7 items • Updated May 16 • 53

upvoted 6 papers 28 days ago

CubePart: An Open-Vocabulary Part-Controllable 3D Generator

Paper • 2605.28763 • Published 30 days ago • 14

GEM: Generative Supervision Helps Embodied Intelligence

Paper • 2605.28548 • Published 30 days ago • 41

InstructSAM: Segment Any Instance with Any Instructions

Paper • 2605.26102 • Published May 25 • 18

ControlLight: Towards Controllable, Consistent, and Generalizable Low-Light Enhancement

Paper • 2605.25569 • Published May 25 • 21

Lens: Rethinking Training Efficiency for Foundational Text-to-Image Models

Paper • 2605.21573 • Published May 20 • 111

TriSplat: Simulation-Ready Feed-Forward 3D Scene Reconstruction

Paper • 2605.26115 • Published May 25 • 52

upvoted a collection 30 days ago

Bonsai Image

6 items • Updated 21 days ago • 87