How to use Supabase/gte-small with Transformers.js:
// npm i @huggingface/transformers
import { pipeline } from '@huggingface/transformers';

// Allocate pipeline
const pipe = await pipeline('feature-extraction', 'Supabase/gte-small');
Discrepancy in model sizes
Hello team! The unquantized ONNX model is 133 MB, whereas the PyTorch model is only 66.8 MB. This is unusual. For example, all-MiniLM-L6-v2's unquantized ONNX model is 90 MB, roughly the same size as its PyTorch model.
While this isn't a problem in itself, I wanted to raise the issue for further investigation.
Edit: I found Xenova has also uploaded his own version of this model, here, and it has the same issue.
@varun4 I was confused by this at first too. The PyTorch model for gte-small is stored in 16-bit precision, whereas many other models are 32-bit. The non-quantized ONNX models are always 32-bit, and the quantized ones are 8-bit. This is why the non-quantized ONNX model is double the size of the PyTorch model, and the quantized ONNX model is half the size of the PyTorch model.
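To make the arithmetic concrete, here is a rough sketch of the size estimate. It assumes file size is roughly parameter count times bytes per weight and ignores any file overhead. The parameter count is not an official figure: it is inferred from the 66.8 MB PyTorch file mentioned above, and `estimateSizeMB` is just a hypothetical helper name.

```javascript
// Rough size estimate: bytes = parameters * (bits per weight / 8).
// Assumption: file overhead (graph metadata, headers) is ignored.
const BYTES_PER_MB = 1e6;

// Hypothetical helper, not part of any library.
function estimateSizeMB(paramCount, bitsPerWeight) {
  return (paramCount * bitsPerWeight) / 8 / BYTES_PER_MB;
}

// Parameter count inferred from the 66.8 MB 16-bit PyTorch file (2 bytes per weight).
const pytorchSizeMB = 66.8;
const paramCount = (pytorchSizeMB * BYTES_PER_MB) / 2;

const sizes = {
  fp16: estimateSizeMB(paramCount, 16), // PyTorch checkpoint
  fp32: estimateSizeMB(paramCount, 32), // non-quantized ONNX
  int8: estimateSizeMB(paramCount, 8),  // quantized ONNX
};

for (const [dtype, mb] of Object.entries(sizes)) {
  console.log(`${dtype}: ~${mb.toFixed(1)} MB`);
}
```

Under these assumptions the 32-bit estimate comes out at about 133.6 MB, in line with the 133 MB ONNX file, and the 8-bit estimate at about 33.4 MB, half the PyTorch size.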
That makes sense, thank you!