Add exported onnx model 'model.onnx'

from sentence_transformers import SentenceTransformer

# TODO: Fill in the PR number
pr_number = 2
model = SentenceTransformer(
    "TechWolf/JobBERT-v2",
    revision=f"refs/pr/{pr_number}",
    backend="onnx",
)

# Verify that everything works as expected
embeddings = model.encode(["The weather is lovely today.", "It's so sunny outside!", "He drove to the stadium."])
print(embeddings.shape)

similarities = model.similarity(embeddings, embeddings)
print(similarities)

Ahmad09, Jan 10 (edited Jan 10)

Title: Issue: ONNX Export Missing Asym Layer (768-dim instead of 1024-dim)

Hi,
I'm trying to deploy JobBERT-v2 using Text Embeddings Inference (TEI) for production use.
The Issue:
The exported ONNX model produces 768-dimensional embeddings instead of the expected 1024-dimensional output. This appears to be because the Asym (asymmetric projection) layer that transforms 768→1024 dimensions is not included in the ONNX export.

TEI's logs show the following (when not using ONNX):

WARN: modules.json could be downloaded but parsing the modules failed:
unknown variant sentence_transformers.models.Asym

Why This Matters:

I have a large production dataset already embedded with the 1024-dimensional vectors from sentence-transformers. Re-embedding would require significant time and compute resources.

Request:

Would it be possible to provide an ONNX export that includes the Asym projection layer baked in, so the model outputs 1024-dimensional embeddings? Alternatively, any guidance on how to properly export the full model graph to ONNX would be greatly appreciated.

Thank you for your help!

Update model.onnx with 1024-dim output (includes Asym layer) c46a99d1

Ahmad09

Jan 10

I've updated the ONNX export to include the Asym projection layer.

Changes made:

Re-exported model.onnx using a custom script that wraps the full SentenceTransformer forward pass
The model now outputs 1024-dimensional embeddings (matching the original sentence-transformers output)

Export script used:

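import torch


class FullModel(torch.nn.Module):
    """Wraps a SentenceTransformer so the ONNX export traces every module, not only the transformer."""

    def __init__(self, st_model):
        super().__init__()
        self.model = st_model  # the loaded SentenceTransformer

    def forward(self, input_ids, attention_mask):
        # SentenceTransformer modules pass a features dict from one module to the next
        features = {"input_ids": input_ids, "attention_mask": attention_mask}
        output = self.model(features)
        # Return only the final sentence embedding tensor
        return output["sentence_embedding"]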

This ensures the asymmetric projection layer (768→1024) is included in the ONNX graph.
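For anyone who wants to double-check a re-export like this before merging, here is a minimal sketch of a comparison helper. It is not part of the export script above: `check_embeddings` is a hypothetical name, and the arrays in the example are random stand-ins rather than real model outputs. The idea is to embed the same texts with the original PyTorch model and with the ONNX export, then confirm that the ONNX output has the expected 1024 columns and stays numerically close to the reference.

```python
import numpy as np


def check_embeddings(reference, candidate, expected_dim=1024):
    """Compare embeddings produced by two backends for the same input texts.

    reference: embeddings from the original PyTorch model, shape (n, dim)
    candidate: embeddings from the ONNX export, ideally the same shape
    Returns (dim_ok, max_abs_diff); max_abs_diff is None when the shapes differ.
    """
    reference = np.asarray(reference, dtype=np.float32)
    candidate = np.asarray(candidate, dtype=np.float32)

    # The export is only a drop-in replacement if it yields expected_dim columns.
    dim_ok = candidate.ndim == 2 and candidate.shape[1] == expected_dim
    if reference.shape != candidate.shape:
        return dim_ok, None

    # Largest element-wise deviation between the two backends.
    max_abs_diff = float(np.max(np.abs(reference - candidate)))
    return dim_ok, max_abs_diff


# Stand-in data (assumption): random arrays in place of real embeddings.
rng = np.random.default_rng(0)
ref = rng.normal(size=(3, 1024)).astype(np.float32)

print(check_embeddings(ref, ref + 1e-6))    # same shape, tiny numerical drift
print(check_embeddings(ref, ref[:, :768]))  # a 768-dim export fails the check
```

In practice `reference` and `candidate` would come from running both backends on the same sentences. Small differences from floating-point rounding are expected, so what counts as close enough is a judgement call for the deployment.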

Ready to merge

This branch is ready to get merged automatically.