How are the ONNX files for this model generated?

#21
by bhavikatekwani - opened

๐Ÿ‘‹๐Ÿฝ Hello!

I'm trying to use nomic-embed-text-v1.5 and was wondering how the ONNX files here were created?

I would like to optimize them for use with TensorRT, but I'm running into some issues that might be resolved by understanding how you export the models.

Thanks for your help.

I believe @Xenova converted them; he may be able to share the script. I can answer questions about any errors you're seeing, though! Are you able to post your error logs?

I used Optimum, and you can see how to do it here: https://github.com/huggingface/optimum/pull/1874 (still a WIP PR)
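For reference, an Optimum export of this model boils down to a single CLI call; a sketch (the output directory name is arbitrary, and the exact flags used for the files in this repo may differ):

```shell
# Install Optimum with its ONNX exporter extras.
pip install "optimum[exporters]"

# Export nomic-embed-text-v1.5 to ONNX.
# --trust-remote-code is needed because the model ships custom code on the Hub.
optimum-cli export onnx \
  --model nomic-ai/nomic-embed-text-v1.5 \
  --trust-remote-code \
  nomic-embed-onnx/
```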

Thank you @zpn and @Xenova !

@zpn there are no errors as of now, just that a simple ONNX-to-TensorRT conversion doesn't yield as performant a model as I expected.
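For context, by "simple conversion" I mean the straightforward `trtexec` path, roughly like this (file names are placeholders, and the input names and shape ranges are assumptions you'd adjust to your workload):

```shell
# Build a TensorRT engine directly from the exported ONNX file.
# min/opt/max shapes are illustrative; pick them to match your batch and
# sequence sizes. FP16 is usually the first easy win on modern GPUs.
trtexec --onnx=model.onnx \
        --saveEngine=model.engine \
        --fp16 \
        --minShapes=input_ids:1x128,attention_mask:1x128 \
        --optShapes=input_ids:8x256,attention_mask:8x256 \
        --maxShapes=input_ids:32x512,attention_mask:32x512
```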

@Xenova I'm actually the author of that PR 😄 I was asking about the conversion because the inputs and outputs differ between this repo and what I get via Optimum:

  • model.onnx as you generated it has these inputs:
    [screenshot: model.onnx graph inputs]

  • This is what I get from Optimum (token_embeddings and sentence_embedding):
    [screenshot: Optimum export graph outputs]

Just wanted to make sure that the PR is still correct.

Hmm, I've had mixed results with TensorRT in the past. Are you able to post the ONNX/TensorRT graph? I imagine there may be a lot of unoptimized code.

@zpn actually the TensorRT stuff worked out fine. There may be a lot of unoptimized code, but that's possibly something that can be detected with https://github.com/daquexian/onnx-simplifier?
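In case it's useful, onnx-simplifier is a one-liner from the CLI; a sketch (file names are placeholders):

```shell
pip install onnxsim

# Fold constants and strip redundant ops, writing a simplified copy
# that is usually friendlier to downstream converters like TensorRT.
onnxsim model.onnx model_simplified.onnx
```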

Thanks for the resource, I'll take a look! I'm sure there are a lot of unnecessary expensive ops :)

zpn changed discussion status to closed

Just putting it here, another great resource for optimizing ONNX models: https://github.com/tsingmicro-toolchain/OnnxSlim
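Its CLI usage is similar to onnx-simplifier; a sketch (file names are placeholders):

```shell
pip install onnxslim

# Write an optimized copy of the graph.
onnxslim model.onnx model_slim.onnx
```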
