shahza1b (Shah Zaib)

liked 4 Spaces about 1 year ago

Self Forcing Wan 2.1

🎥

326

Real-time video generation

core OCR

🥪

222

coreOCR / Camel-Doc-OCR / docscopeOCR / MonkeyOCR

LTX Video Fast

🎥

1.52k

Ultra-fast video model, LTX 0.9.8 13B distilled

Graphify

⚡

204

Create multiple diagram types instantly from JSON!

liked 7 Spaces over 1 year ago

SmolDocling

🦆

263

Convert document images to structured text and data

Sesame CSM

🌱

861

Conversational speech generation

Leffa

👗

617

Generate realistic person images with new clothes or poses

MMAudio — generating synchronized audio from video/text

🔊

969

Generate synchronized audio for videos from text prompts

Manju Dream Booth

👀

196

Generate images from text prompts with customizable settings

Virtual Try-On Diffusion [VTON-D]

👗

95

Diffusion-based multi-modal virtual try-on pipeline demo

MagicQuill

🪶

2.28k

Edit images with scribble‑based color and edge control

liked a model over 1 year ago

briaai/RMBG-2.0

Image Segmentation • 0.2B • Updated Apr 6 • 593k downloads • 1.3k likes

liked 3 Spaces over 1 year ago

BRIA RMBG 2.0

🐢

956

Remove background from any image

Document Parser

📈

32

Convert files to Markdown and extract document metadata

ZoeDepth

🦀

827

Predict depth map from a single image

liked a model over 1 year ago

Shitao/OmniGen-v1

Text-to-Image • 4B • Updated Nov 7, 2024 • 1.59k downloads • 321 likes

liked a Space over 1 year ago

OmniGen

🖼

704

Image generator/identifier/reposer

liked 3 models over 1 year ago

juierror/flan-t5-text2sql-with-schema-v2

tencent/Hunyuan3D-1

defog/sqlcoder2
