Le Thien Phuc Nguyen

plnguyen2908

https://plnguyen2908.github.io/

AI & ML interests

Computer Vision, NLP, Applied AI

Recent Activity

upvoted a paper 9 days ago

MolmoMotion: Forecasting Point Trajectories in 3D with Language Instruction

Paper • 2606.18558 • Published 11 days ago • 53

upvoted a paper 21 days ago

MAOAM: Unified Object and Material Selection with Vision-Language Models

Paper • 2606.04880 • Published 26 days ago • 10

upvoted a paper 23 days ago

Personal AI Agent for Camera Roll VQA

Paper • 2606.05275 • Published 25 days ago • 20

liked a dataset 29 days ago

plnguyen2908/AV-SpeakerBench

Viewer • Updated Dec 15, 2025 • 3.21k • 2.09k • 2

upvoted a paper about 1 month ago

From Plans to Pixels: Learning to Plan and Orchestrate for Open-Ended Image Editing

Paper • 2605.15181 • Published May 14 • 12

upvoted an article about 2 months ago

From PyTorch DDP to Accelerate to Trainer, mastery of distributed training with ease

Article • muellerzr • Oct 21, 2022 • 44

upvoted a paper about 2 months ago

MolmoAct2: Action Reasoning Models for Real-world Deployment

Paper • 2605.02881 • Published May 4 • 355

upvoted a paper 2 months ago

Exploration and Exploitation Errors Are Measurable for Language Model Agents

Paper • 2604.13151 • Published Apr 14 • 25

published a dataset 3 months ago

plnguyen2908/AVHBench_clone

Updated Apr 11 • 6

upvoted a collection 3 months ago

VideoLLaMA2

Collection

Optimized VideoLLaMA with improved spatial-temporal modeling and better audio understanding capability • 13 items • Updated Sep 2, 2025 • 20

liked a model 5 months ago

mozilla-ai/gemma-3-4b-it-llamafile

Text Generation • Updated Mar 31, 2025 • 702 • 6

liked a Space 5 months ago

⚡ AI Deadlines

Space • 772 • Find upcoming AI conference and workshop deadlines

upvoted a collection 5 months ago

VisionLM

Collection

1929 items • Updated May 25 • 151

upvoted an article 6 months ago

Vision Language Model Alignment in TRL ⚡️

Article • sergiopaniego, merve, qgallouedec, kashif, ariG23498 • Aug 7, 2025 • 112

updated a dataset 6 months ago

plnguyen2908/AV-SpeakerBench

Viewer • Updated Dec 15, 2025 • 3.21k • 2.09k • 2

authored 2 papers 7 months ago

See, Hear, and Understand: Benchmarking Audiovisual Human Speech Understanding in Multimodal Large Language Models

Paper • 2512.02231 • Published Dec 1, 2025 • 9

LASER: Lip Landmark Assisted Speaker Detection for Robustness

Paper • 2501.11899 • Published Jan 21, 2025

commented a paper 7 months ago

See, Hear, and Understand: Benchmarking Audiovisual Human Speech Understanding in Multimodal Large Language Models

Paper • 2512.02231 • Published Dec 1, 2025 • 9

liked a Space 7 months ago

⚡ paper-central

Space • 229 • Explore and filter research papers by date, topic, and author

submitted a paper to Daily Papers 7 months ago

See, Hear, and Understand: Benchmarking Audiovisual Human Speech Understanding in Multimodal Large Language Models

Paper • 2512.02231 • Published Dec 1, 2025 • 9
