mega-asr-bench

Commit History

Clean up debug diagnostics now that WebGPU works end-to-end

2cf4acc
verified

Reza2kn committed 4 days ago

Use INT4 encoder (MatMulNBits — WebGPU-supported); ORT 1.23 keeps webgpu fp16 fix for decoders

b8c2d24
verified

Reza2kn committed 4 days ago

Per-step diagnostics: pinpoint which ORT call crashes

7be4ffd
verified

Reza2kn committed 4 days ago

Surface full error info on transcribe failure

50bb779
verified

Reza2kn committed 4 days ago

Upgrade onnxruntime-web 1.20 → 1.23 (webgpu bundle); revert WASM forcing — 1.23 fixes fp16 transformer NaNs

e6b3306
verified

Reza2kn committed 4 days ago

Force decoders to WASM (WebGPU fp16 in ort-web 1.20 returns all-NaN on this transformer)

d62ebc2
verified

Reza2kn committed 4 days ago

Diagnostics: logit type/len/health/first-8/top-5 with Number() coercion

29d0bb3
verified

Reza2kn committed 4 days ago

Await tensor.getData() for WebGPU outputs (audio_embeds + logits) so data is actually copied back to CPU

1320aec
verified

Reza2kn committed 4 days ago

fp16: use canonical u16 bit-pattern viewed as Float16Array; diagnostic top-5 dump

ed37d13
verified

Reza2kn committed 4 days ago

Encoder: prefer static INT8 (QLinearConv); INT4 fallback. Recovers 92.7%-class quality in browser

9e54b79
verified

Reza2kn committed 4 days ago

Float16Array for fp16 tensors; per-session WebGPU fallback (decoders stay on webgpu)

af482b4
verified

Reza2kn committed 4 days ago

Encoder: try INT8 first, auto-fallback to INT4 if ConvInteger unsupported in browser

a821180
verified

Reza2kn committed 4 days ago

Use ort.webgpu bundle + auto-fallback to wasm if webgpu init fails

96f08e1
verified

Reza2kn committed 4 days ago

Bump cache key to invalidate RTN weights (GPTQ ship)

36417e8
verified

Reza2kn committed 4 days ago

Use INT8 encoder + INT4 decoder (91.9% accuracy); force-English prompt default

61dfe9b
verified

Reza2kn committed 4 days ago

Switch Space to static SDK: pure browser inference via onnxruntime-web

a4c397e
verified

Reza2kn committed 4 days ago

Remove vendor/ (switching to static SDK)

ffd2ab1
verified

Reza2kn committed 4 days ago

Remove requirements.txt (switching to static SDK)

0dfc823
verified

Reza2kn committed 4 days ago

Remove app.py (switching to static SDK)

da91cf8
verified

Reza2kn committed 4 days ago

Switch backend to INT4 ONNX models from Reza2kn/mega-asr-onnx

888006f
verified

Reza2kn committed 4 days ago

Initial: Gradio demo + 8 VITW examples + WER scoring

0c137e3
verified

Reza2kn committed 4 days ago

initial commit

7feced8
verified

Reza2kn committed 4 days ago
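
Several commits above deal with fp16 logits: reading them as raw u16 bit patterns, checking their health, and dumping the top-5 after the all-NaN decoder output. The following is a minimal sketch of that kind of check, not the Space's actual code. The names halfBitsToNumber and logitHealth are hypothetical, and it assumes the logits have already been copied back to the CPU as a Uint16Array of IEEE 754 half-precision bit patterns. It decodes the bits by hand, so it does not depend on Float16Array being available in the runtime.

```typescript
// Hypothetical helper: decode one IEEE 754 binary16 value from its raw 16-bit pattern.
function halfBitsToNumber(bits: number): number {
  const sign = (bits & 0x8000) !== 0 ? -1 : 1;
  const exponent = (bits >> 10) & 0x1f; // 5 exponent bits
  const fraction = bits & 0x03ff;       // 10 fraction bits
  if (exponent === 0) {
    // Zero and subnormals: no implicit leading 1.
    return sign * Math.pow(2, -14) * (fraction / 1024);
  }
  if (exponent === 0x1f) {
    // All-ones exponent encodes infinity (fraction 0) or NaN (fraction non-zero).
    return fraction === 0 ? sign * Infinity : NaN;
  }
  return sign * Math.pow(2, exponent - 15) * (1 + fraction / 1024);
}

interface LogitHealth {
  length: number;   // number of logits inspected
  nanCount: number; // how many decoded to NaN
  top: Array<{ index: number; value: number }>; // k largest non-NaN logits
}

// Hypothetical diagnostic: summarise a buffer of fp16 logits (as u16 bit patterns).
function logitHealth(raw: Uint16Array, k: number = 5): LogitHealth {
  const values = Array.from(raw, (bits) => halfBitsToNumber(bits));
  const nanCount = values.filter((v) => Number.isNaN(v)).length;
  const top = values
    .map((value, index) => ({ index, value }))
    .filter((entry) => !Number.isNaN(entry.value))
    // Explicit comparisons avoid NaN results from Infinity - Infinity.
    .sort((a, b) => (a.value < b.value ? 1 : a.value > b.value ? -1 : 0))
    .slice(0, k);
  return { length: values.length, nanCount, top };
}
```

In this sketch, a result whose nanCount equals its length corresponds to the all-NaN symptom named in the d62ebc2 commit message. A healthy buffer reports nanCount 0 and a plausible top-k.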
