---
license: other
tags:
- onnx
- manga
- text-detection
- inpainting
- webgpu
---

# Tara Models

ONNX models for [Tara](https://github.com/QuatZo/tara), a browser-native visual document translation app.

## Models

| Model | File | Size | Source | Changes |
|-------|------|------|--------|---------|
| RT-DETR-v2 (detector) | `detector.onnx` | 161 MB | [ogkalu/comic-text-and-bubble-detector](https://huggingface.co/ogkalu/comic-text-and-bubble-detector) | Patched AveragePool `ceil_mode=0` for WebGPU compatibility |
| LaMa (inpainter) | `lama-manga-dynamic.onnx` | 197 MB | [ogkalu/lama-manga-onnx-dynamic](https://huggingface.co/ogkalu/lama-manga-onnx-dynamic) | No changes, original model |

## Why this repo?

The original `detector.onnx` uses AveragePool with `ceil_mode=1`, which the WebGPU backend of onnxruntime-web does not support. This repo hosts a patched version in which `ceil_mode` is set to 0. All AveragePool nodes use a 2x2 kernel with stride 2 on even-dimension inputs (640x640), so the pooling windows tile each input exactly and rounding the output size up or down gives the same value. The patch therefore has no effect on the output.
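The rounding argument can be checked with a few lines of arithmetic. The sketch below assumes the standard ONNX AveragePool output-size formula with dilation 1 and no padding; the function name and the smaller feature-map sizes are illustrative and not taken from the model.

```python
def pool_output_size(size: int, kernel: int = 2, stride: int = 2,
                     pad_total: int = 0, ceil_mode: bool = False) -> int:
    """Output length along one axis for a pooling op (dilation 1)."""
    span = size + pad_total - kernel
    if ceil_mode:
        steps = -(-span // stride)  # integer ceiling division
    else:
        steps = span // stride      # integer floor division
    return steps + 1


# 640 is the size quoted above; the smaller sizes are hypothetical
# even feature-map sizes, used only to illustrate the point.
for size in (640, 320, 160, 80):
    floor_out = pool_output_size(size, ceil_mode=False)
    ceil_out = pool_output_size(size, ceil_mode=True)
    assert floor_out == ceil_out == size // 2

# An odd size is where the two modes would diverge.
print(pool_output_size(641, ceil_mode=False), pool_output_size(641, ceil_mode=True))
# prints "320 321"
```

With an even size the division is exact, so both rounding modes produce the same number of windows over the same pixels.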

The LaMa inpainter works unmodified and is included here so that both models come from a single download source.
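For reference, a detector patch of the kind described in this section could look roughly like the following. This is a minimal sketch, not necessarily the script used to produce the hosted file: the function names and file paths are hypothetical, and it assumes the third-party `onnx` package is installed.

```python
def force_floor_pooling(model) -> int:
    """Set ceil_mode=0 on every AveragePool node; return the number of nodes changed."""
    changed = 0
    for node in model.graph.node:
        if node.op_type != "AveragePool":
            continue
        for attr in node.attribute:
            if attr.name == "ceil_mode" and attr.i != 0:
                attr.i = 0
                changed += 1
    return changed


def patch_file(src_path: str, dst_path: str) -> int:
    """Load an ONNX file, patch its AveragePool nodes, validate it, and save a copy."""
    import onnx  # third-party dependency (assumption: `pip install onnx`)

    model = onnx.load(src_path)
    changed = force_floor_pooling(model)
    onnx.checker.check_model(model)  # structural validation of the patched graph
    onnx.save(model, dst_path)
    return changed
```

Calling `patch_file("detector-original.onnx", "detector.onnx")` (both paths hypothetical) would write the patched copy and return how many nodes were modified. Comparing outputs of the original and patched models on a sample image is a sensible extra check.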

## Usage

Tara loads these models through the browser Cache API. On the first visit, the app downloads both models (~360 MB total) and caches them locally. Subsequent visits load them from the cache instead of downloading them again.

Direct download URLs:
- https://huggingface.co/QuatZo/tara-models/resolve/main/detector.onnx
- https://huggingface.co/QuatZo/tara-models/resolve/main/lama-manga-dynamic.onnx
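Outside the browser, the same download-once-then-reuse pattern can be approximated with the Python standard library. This is only a sketch of the idea: it is not how Tara itself loads the models (Tara uses the Cache API), and the function names and cache directory are hypothetical.

```python
import shutil
import urllib.request
from pathlib import Path

BASE_URL = "https://huggingface.co/QuatZo/tara-models/resolve/main"
MODEL_FILES = ("detector.onnx", "lama-manga-dynamic.onnx")


def model_url(filename: str) -> str:
    """Build the direct download URL for a file in this repo."""
    return f"{BASE_URL}/{filename}"


def fetch_model(filename: str, cache_dir: Path) -> Path:
    """Return the local path of a model, downloading it only if it is not cached yet."""
    target = cache_dir / filename
    if target.exists():
        return target  # cache hit: no network access
    cache_dir.mkdir(parents=True, exist_ok=True)
    partial = target.with_name(target.name + ".part")
    # Stream to a temporary file so an interrupted download never looks complete.
    with urllib.request.urlopen(model_url(filename)) as response, open(partial, "wb") as out:
        shutil.copyfileobj(response, out)
    partial.replace(target)
    return target
```

For example, `[fetch_model(name, Path("models")) for name in MODEL_FILES]` would download both files on the first run and return the cached paths on later runs.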

## License

Original models by [ogkalu](https://huggingface.co/ogkalu). See the source repositories for license information.