Mohammed Mohammed Ali PRO

MohammedEltoum


AI & ML interests

None yet

Recent Activity

upvoted an article 26 days ago

Job Searcher

upvoted an article about 1 month ago

Harness, Scaffold, and the AI Agent Terms Worth Getting Right

upvoted a paper 5 months ago

The Script is All You Need: An Agentic Framework for Long-Horizon Dialogue-to-Cinematic Video Generation


Organizations

upvoted an article 26 days ago

Job Searcher

Article • build-small-hackathon • 26 days ago • 6

upvoted an article about 1 month ago

Harness, Scaffold, and the AI Agent Terms Worth Getting Right

Article • sergiopaniego, ariG23498 • May 25 • 126

upvoted 3 papers 5 months ago

The Script is All You Need: An Agentic Framework for Long-Horizon Dialogue-to-Cinematic Video Generation

Paper • 2601.17737 • Published Jan 25 • 56

Can LLMs Clean Up Your Mess? A Survey of Application-Ready Data Preparation with LLMs

Paper • 2601.17058 • Published Jan 22 • 190

daVinci-Dev: Agent-native Mid-training for Software Engineering

Paper • 2601.18418 • Published Jan 26 • 126

upvoted a paper 7 months ago

SAM 3: Segment Anything with Concepts

Paper • 2511.16719 • Published Nov 20, 2025 • 138

upvoted a paper 8 months ago

Depth Anything 3: Recovering the Visual Space from Any Views

Paper • 2511.10647 • Published Nov 13, 2025 • 102

upvoted an article 8 months ago

LeRobot v0.4.0: Supercharging OSS Robot Learning

Article • imstevenpmwork, aractingi, pepijn223, CarolinePascal, jadechoghari, fracapuano, AdilZtn, nepyope, thomwolf • Oct 24, 2025 • 50

upvoted a paper 9 months ago

AnyUp: Universal Feature Upsampling

Paper • 2510.12764 • Published Oct 14, 2025 • 13

upvoted 2 papers 10 months ago

Visual Representation Alignment for Multimodal Large Language Models

Paper • 2509.07979 • Published Sep 9, 2025 • 84

OpenVision 2: A Family of Generative Pretrained Visual Encoders for Multimodal Learning

Paper • 2509.01644 • Published Sep 1, 2025 • 34

upvoted a paper 11 months ago

DINOv3

Paper • 2508.10104 • Published Aug 13, 2025 • 311

upvoted an article 11 months ago

Vision Language Model Alignment in TRL ⚡️

Article • sergiopaniego, merve, qgallouedec, kashif, ariG23498 • Aug 7, 2025 • 112

upvoted 2 papers 11 months ago

MolmoAct: Action Reasoning Models that can Reason in Space

Paper • 2508.07917 • Published Aug 11, 2025 • 45

Enhanced Arabic Text Retrieval with Attentive Relevance Scoring

Paper • 2507.23404 • Published Jul 31, 2025 • 3

upvoted a paper 12 months ago

Can Multimodal Foundation Models Understand Schematic Diagrams? An Empirical Study on Information-Seeking QA over Scientific Papers

Paper • 2507.10787 • Published Jul 14, 2025 • 13

upvoted a paper about 1 year ago

AnimaX: Animating the Inanimate in 3D with Joint Video-Pose Diffusion Models

Paper • 2506.19851 • Published Jun 24, 2025 • 60

upvoted an article about 1 year ago

How to Build an MCP Server with Gradio

Article • abidlabs, ysharma • Apr 30, 2025 • 202

upvoted 2 papers about 1 year ago

Qwen3 Technical Report

Paper • 2505.09388 • Published May 14, 2025 • 343

BLIP3-o: A Family of Fully Open Unified Multimodal Models-Architecture, Training and Dataset

Paper • 2505.09568 • Published May 14, 2025 • 100
