Motion 3-to-4: 3D Motion Reconstruction for 4D Synthesis Paper • 2601.14253 • Published 7 days ago • 9
V-DPM: 4D Video Reconstruction with Dynamic Point Maps Paper • 2601.09499 • Published 13 days ago • 9
UM-Text: A Unified Multimodal Model for Image Understanding Paper • 2601.08321 • Published 14 days ago • 8
From RAG to Agentic RAG for Faithful Islamic Question Answering Paper • 2601.07528 • Published 15 days ago
Prototypicality Bias Reveals Blindspots in Multimodal Evaluation Metrics Paper • 2601.04946 • Published 19 days ago
ResTok: Learning Hierarchical Residuals in 1D Visual Tokenizers for Autoregressive Image Generation Paper • 2601.03955 • Published 20 days ago • 3
FlowBlending: Stage-Aware Multi-Model Sampling for Fast and High-Fidelity Video Generation Paper • 2512.24724 • Published 27 days ago • 7
Dream2Flow: Bridging Video Generation and Open-World Manipulation with 3D Object Flow Paper • 2512.24766 • Published 27 days ago • 9
What matters for Representation Alignment: Global Information or Spatial Structure? Paper • 2512.10794 • Published Dec 11, 2025 • 9
ThreadWeaver: Adaptive Threading for Efficient Parallel Reasoning in Language Models Paper • 2512.07843 • Published Nov 24, 2025 • 22
Open ASR Leaderboard: Towards Reproducible and Transparent Multilingual and Long-Form Speech Recognition Evaluation Paper • 2510.06961 • Published Oct 8, 2025 • 10
BigCodeArena: Unveiling More Reliable Human Preferences in Code Generation via Execution Paper • 2510.08697 • Published Oct 9, 2025 • 37
Distributional Semantics Tracing: A Framework for Explaining Hallucinations in Large Language Models Paper • 2510.06107 • Published Oct 7, 2025 • 3
Post: Gradio 6.0 is launching this year! We're revamping the core to give you performance improvements and unprecedented customization. Build better, faster. Check out the GitHub milestone to learn what's planned under the hood! https://github.com/gradio-app/gradio/issues?q=is:issue%20state:open%20milestone:%22Gradio%206%22
Post: The new multimodalart/self-forcing model and demo are truly impressive!
Leveraging Vision-Language Pre-training for Human Activity Recognition in Still Images Paper • 2506.13458 • Published Jun 16, 2025
Post: Excited to onboard FeatherlessAI on Hugging Face as an Inference Provider - they bring a fleet of 6,700+ LLMs on-demand on the Hugging Face Hub. Starting today, you'll be able to access all those LLMs (OpenAI compatible) on HF model pages and via OpenAI client libraries too! Go, play with it today: https://huggingface.co/blog/inference-providers-featherless P.S. They're also bringing on more GPUs to support all your concurrent requests!
Post: Time is running out! Less than 24 hours to participate in the MCP Hackathon and win thousands of dollars in prizes! Don't miss this opportunity to showcase your skills. Visit Agents-MCP-Hackathon/AI-Marketing-Content-Creator to register!
Post: NotebookLM Dethroned?! Meet Fluxions vui: The new open-source dialogue generation model. 100M Params, 40k hours audio! Multi-speaker audio. Non-speech sounds (like [laughs]!). MIT License. Is this the future of content creation? Watch the video and decide for yourself! https://huggingface.co/spaces/fluxions/vui-space https://huggingface.co/fluxions/vui