Spaces: TSXu / UniCalli_Dev

UniCalli_Dev / src

Commit History

fix local cache path embedder type detection

d84e45a

Tianshuo-Xu committed on Mar 15

optimize cold start with local cache paths and font resolution

e51b773

Tianshuo-Xu committed on Mar 15

Speed up Space by initializing globally and keeping on GPU, remove manual offload

c49775d

Tianshuo-Xu committed on Mar 15

Fix float8 noise generation and fix gpu container download cache miss

ce4bbb3

Tianshuo-Xu committed on Mar 15

Pre-load InternVL embedding at startup to save GPU time

4c08c35

TSXu committed on Jan 30

Enable FA3 by default for ZeroGPU H200

1b5453a

TSXu committed on Jan 30

Add Flash Attention 3 support (optional)

8fc8d44

TSXu committed on Jan 30

fp32

974a879

TSXu committed on Jan 28

Fix CUBLAS errors: enforce dtype consistency at model entry points

82e509e

TSXu committed on Jan 28

Use pure fp32 for ZeroGPU - disable autocast entirely

b2bfb8e

TSXu committed on Jan 28

Disable autocast for MLPEmbedder and Modulation to fix CUBLAS errors

aecc9f1

TSXu committed on Jan 28

Use torch.autocast for automatic mixed precision inference

8af673c

TSXu committed on Jan 28

Disable TF32 to fix CUBLAS fp16 errors

e7ca422

TSXu committed on Jan 28

Use fp32 for MLPEmbedder to avoid CUBLAS errors

6d6e01a

TSXu committed on Jan 28

Switch to full model with fp16/bf16 inference for better performance

39fa408

Txu647 committed on Jan 28

Add NF4 4-bit inference with bitsandbytes

414150e

Txu647 committed on Jan 27

UI improvements: move status bar to right side, simplify layout, update defaults to Wang Xizhi

89e2699

TSXu committed on Jan 27

fix: use float32 instead of bfloat16 for compatibility

e7cbbce

Txu647 committed on Jan 27

Add batch generation, torch.compile acceleration, fix dtype issues

d3ccd4b

TSXu committed on Jan 27

fix: ensure consistent dtype in prepare function (use img.dtype)

a800c99

Txu647 committed on Jan 27

feat: Add UniCalli Chinese calligraphy generator

5c86cdc

Txu647 committed on Jan 27
