ruclip-vit-large-patch14-336-onnx


	# Preprocessing Specification

	## Image (visual.onnx)

	- Input shape: `[N, 3, 336, 336]` (NCHW, batch first)
	- Input dtype: float32
	- Channel order: RGB
	- Resolution: 336×336 (center crop, or resize to fill the frame without distorting the aspect ratio)
	- Normalization: per-channel `(pixel / 255 - mean) / std`

	| Channel | mean       | std        |
	|---------|------------|------------|
	| R       | 0.48145466 | 0.26862954 |
	| G       | 0.4578275  | 0.26130258 |
	| B       | 0.40821073 | 0.27577711 |
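
The normalization and layout steps above can be sketched in NumPy as follows. This is a minimal illustration, not the repository's own code: the function name `to_model_input` is hypothetical, and the sketch assumes the images have already been cropped or resized to 336×336 and decoded as uint8 RGB in `[N, H, W, 3]` order.

```python
import numpy as np

# Per-channel statistics from the table above, in RGB order.
MEAN = np.array([0.48145466, 0.4578275, 0.40821073], dtype=np.float32)
STD = np.array([0.26862954, 0.26130258, 0.27577711], dtype=np.float32)
SIZE = 336


def to_model_input(images_hwc: np.ndarray) -> np.ndarray:
    """Convert uint8 RGB images [N, 336, 336, 3] to float32 [N, 3, 336, 336].

    Hypothetical helper: assumes resizing/cropping was done beforehand.
    """
    if images_hwc.ndim != 4 or images_hwc.shape[1:] != (SIZE, SIZE, 3):
        raise ValueError(f"expected [N, {SIZE}, {SIZE}, 3], got {images_hwc.shape}")
    scaled = images_hwc.astype(np.float32) / 255.0  # pixel / 255
    normalized = (scaled - MEAN) / STD  # broadcasts over the channel axis
    # NHWC -> NCHW, made contiguous so it can be passed on as a plain buffer.
    return np.ascontiguousarray(normalized.transpose(0, 3, 1, 2))


# Example: a batch of two pure-red images.
batch = np.zeros((2, SIZE, SIZE, 3), dtype=np.uint8)
batch[..., 0] = 255
out = to_model_input(batch)
print(out.shape, out.dtype)
```

The resulting array matches the documented input shape and dtype for `visual.onnx`; the resize or center-crop step is left to whichever image library is already in use.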

	## Text (textual.onnx)

	- Input shape: `[N, 77]`
	- Input dtype: int64
	- Lowercase: yes
	- Sequence: `[BOS] + token_ids + [EOS]`, pad with 0 to length 77
	- Special IDs: pad=0, unk=1, bos=2, eos=3
	- Tokenizer: `tokenizer.json` or `bpe.model` (YouTokenToMe)
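
The sequence layout above can be sketched in plain Python. The helper name `build_sequence` is hypothetical, and the token IDs are assumed to come from the tokenizer after lowercasing the text. The spec does not say how over-long inputs are handled, so truncating to the first 75 tokens while keeping EOS is an assumption of this sketch.

```python
PAD_ID, UNK_ID, BOS_ID, EOS_ID = 0, 1, 2, 3  # special IDs from the spec
CONTEXT_LENGTH = 77


def build_sequence(token_ids, context_length=CONTEXT_LENGTH):
    """Return [BOS] + token_ids + [EOS], padded with PAD_ID to context_length.

    Assumption (not in the spec): inputs that do not fit are truncated
    so that BOS and EOS are always present.
    """
    body = list(token_ids)[: context_length - 2]
    seq = [BOS_ID] + body + [EOS_ID]
    return seq + [PAD_ID] * (context_length - len(seq))


# Example with made-up token IDs standing in for tokenizer output.
seq = build_sequence([10, 11, 12])
long_seq = build_sequence(range(100, 200))
print(len(seq), seq[:6])
```

A batch of such lists can then be converted to an int64 array of shape `[N, 77]` before being fed to `textual.onnx`.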
