Uzair2108 (Uzair Ahmed)

Self Forcing Wan 2.1

🎥

322

Real-time video generation

Step1X 3D

🐨

247

image2mesh

Stable Audio Open Zero

🔥

468

Generate immersive audio from text prompts

Fish Audio S1

🏆

697

Convert text to natural-sounding speech audio

Direct3D S2 V1.0 Demo

💻

429

Generate gigascale 3D models with simple text prompts

Meigen MultiTalk

🎙

273

Audio-Driven Multi-Person Conversational Video Generation

DreamO

🐨

599

A Unified Framework for Image Customization

Mistral OCR 3

🌆

79

Try out Mistral's latest OCR with pdfs and images

Qwen3 Demo

📊

853

Chat with an AI assistant and see its thought process

Nanonets OCR

👁

81

Demo for Nanonets-OCR

MedGemma 4B IT

🩻

35

Chat with MedGemma 4B, a medical variant of Gemma 3

Medgemma 27b Text It

😻

12

Generate medically-informed responses using prompts

Parakeet-TDT-0.6b-V2

469

Transcribe audio files with timestamps and download transcripts

Sesame CSM

🌱

862

Conversational speech generation

Kyutai STT 2.6B EN

😻

9

Transcribe English audio files into text

Kyutai Tts Test

🐨

2

Finegrain Image Enhancer

🖼

2.11k

Clarity AI Upscaler Reproduction

Flux.1-dev Upscaler

🔎

1.68k

Upscale low‑resolution images to higher resolution

Song Generation

🎵

726

Generate a song from your lyrics and description

OmniGen2

👀

429

OmniGen2: Unified Image Understanding and Generation.


