Commit History

fix local cache path embedder type detection
d84e45a

Tianshuo-Xu committed on

optimize cold start with local cache paths and font resolution
e51b773

Tianshuo-Xu committed on

Speed up Space by initializing globally and keeping on GPU, remove manual offload
c49775d

Tianshuo-Xu committed on

Fix float8 noise generation and fix gpu container download cache miss
ce4bbb3

Tianshuo-Xu committed on

Pre-load InternVL embedding at startup to save GPU time
4c08c35

TSXu committed on

Enable FA3 by default for ZeroGPU H200
1b5453a

TSXu committed on

Add Flash Attention 3 support (optional)
8fc8d44

TSXu committed on

fp32
974a879

TSXu committed on

Fix CUBLAS errors: enforce dtype consistency at model entry points
82e509e

TSXu committed on

Use pure fp32 for ZeroGPU - disable autocast entirely
b2bfb8e

TSXu committed on

Disable autocast for MLPEmbedder and Modulation to fix CUBLAS errors
aecc9f1

TSXu committed on

Use torch.autocast for automatic mixed precision inference
8af673c

TSXu committed on

Disable TF32 to fix CUBLAS fp16 errors
e7ca422

TSXu committed on

Use fp32 for MLPEmbedder to avoid CUBLAS errors
6d6e01a

TSXu committed on

Switch to full model with fp16/bf16 inference for better performance
39fa408

Txu647 committed on

Add NF4 4-bit inference with bitsandbytes
414150e

Txu647 committed on

UI improvements: move status bar to right side, simplify layout, update defaults to Wang Xizhi
89e2699

TSXu committed on

fix: use float32 instead of bfloat16 for compatibility
e7cbbce

Txu647 committed on

Add batch generation, torch.compile acceleration, fix dtype issues
d3ccd4b

TSXu committed on

fix: ensure consistent dtype in prepare function (use img.dtype)
a800c99

Txu647 committed on

feat: Add UniCalli Chinese calligraphy generator
5c86cdc

Txu647 committed on