kaifahmad (Mohd Kaif)

upvoted a collection 8 months ago

Agent training frameworks

Collection

1 item • Updated Apr 30, 2025 • 1

upvoted a paper 8 months ago

Seedream 4.0: Toward Next-generation Multimodal Image Generation

Paper • 2509.20427 • Published Sep 24, 2025 • 85

upvoted 2 collections 8 months ago

VideoPrism

Collection

VideoPrism is a foundational video encoder that enables state-of-the-art performance on a large variety of video understanding tasks. • 5 items • Updated Mar 12 • 20

MedGemma Release

Collection

Collection of Gemma 3 variants for performance on medical text and image comprehension to accelerate building healthcare-based AI applications. • 9 items • Updated Mar 12 • 509

upvoted an article 8 months ago

Article

Supercharge your OCR Pipelines with Open Models

merve, ariG23498, davanstrien, hynky, andito, reach-vb, pcuenq • Oct 21, 2025 • 315

upvoted an article 9 months ago

Article

Public AI on Hugging Face Inference Providers 🔥

Jolow, thelastjosh, celinah, julien-c, sbrandeis, Wauplin • Sep 17, 2025 • 24

upvoted a collection 10 months ago

VibeVoice

Collection

Frontier Text-to-Speech Models https://microsoft.github.io/VibeVoice/ • 8 items • Updated Mar 2 • 248

upvoted a collection 11 months ago

Health AI Developer Foundations (HAI-DEF)

Collection

Groups models released for use in health AI by Google. Read more about HAI-DEF at http://goo.gle/hai-def • 22 items • Updated Mar 12 • 227

upvoted 3 articles 11 months ago

Article

Welcome GPT OSS, the new open-source model family from OpenAI!

reach-vb, pcuenq, lewtun, clem, Rocketknight1, clefourrier, celinah, Wauplin, marcsun13, pagezyhf, ahadnagy, joaogante • Aug 5, 2025 • 513

Article

The Open Medical-LLM Leaderboard: Benchmarking Large Language Models in Healthcare

aaditya, pminervini, clefourrier • Apr 19, 2024 • 203

Article

Evaluating Audio Reasoning with Big Bench Audio

mhillsmith, georgewritescode • Dec 20, 2024 • 30

upvoted 2 articles about 1 year ago

Article

Featherless AI on Hugging Face Inference Providers 🔥

wxgeorge, pohnean-recursal, picocreator, celinah, Wauplin, sbrandeis • Jun 12, 2025 • 49

Article

Vision Language Models (Better, faster, stronger)

merve, sergiopaniego, ariG23498, pcuenq, andito • May 12, 2025 • 614

upvoted a collection about 1 year ago

Describe Anything

Collection

Multimodal Large Language Models for Detailed Localized Image and Video Captioning • 7 items • Updated 20 days ago • 63

upvoted an article about 1 year ago

Article

State of open video generation models in Diffusers

sayakpaul, a-r-r-o-w, dn6 • Jan 27, 2025 • 71

upvoted 5 articles over 1 year ago

Article

Welcome Gemma 3: Google's all new multimodal, multilingual, long context open LLM

ariG23498, merve, pcuenq, reach-vb • Mar 12, 2025 • 497

Article

SigLIP 2: A better multilingual vision language encoder

ariG23498, merve, qubvel-hf • Feb 21, 2025 • 220

Article

Welcome to Inference Providers on the Hub 🔥

burkaygur, zeke, aton2006, hassanelmghari, sbrandeis, kramp, julien-c • Jan 28, 2025 • 494

Article

Visually Multilingual: Introducing mcdse-2b

marco • Oct 27, 2024 • 41

Article

Deploying Speech-to-Speech on Hugging Face

andito, derek-thomas, dmaniloff, eustlb • Oct 22, 2024 • 45

Mohd Kaif

AI & ML interests

Organizations
