Instructions to use dleemiller/ModernCE-base-sts with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- sentence-transformers
How to use dleemiller/ModernCE-base-sts with sentence-transformers:
```python
from sentence_transformers import CrossEncoder

model = CrossEncoder("dleemiller/ModernCE-base-sts")

query = "Which planet is known as the Red Planet?"
passages = [
    "Venus is often called Earth's twin because of its similar size and proximity.",
    "Mars, known for its reddish appearance, is often referred to as the Red Planet.",
    "Jupiter, the largest planet in our solar system, has a prominent red spot.",
    "Saturn, famous for its rings, is sometimes mistaken for the Red Planet.",
]

scores = model.predict([(query, passage) for passage in passages])
print(scores)
```
- Notebooks
- Google Colab
- Kaggle
Add ONNX / transformers.js compatibility
Hi @dleemiller,
Thank you for this great model! I'd like to request official ONNX export support for this model so it can be used in the browser with transformers.js, which enables WebGPU/WASM inference.
This would make it easier to run ModernCE on the client side, especially for semantic search use cases.
@Xenova , would you be open to helping or validating an export if the author is interested?
Thanks both!
Sure -- let me do a little investigation to figure out how to structure it in huggingface and I'll work on it.
Looks like a community member converted one here: https://huggingface.co/onnx-community/ModernCE-base-sts-ONNX (possibly one of you?)
If you wanted to add support to this repo too, you just need to add the ONNX files in a subfolder called "onnx" (like this) -- then everything should work as expected!
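For reference, a minimal sketch of what browser-side usage could look like once the ONNX files are in place, assuming the transformers.js v3 API (@huggingface/transformers) and the community conversion linked above; the final sigmoid is an assumption about how the scoring head was exported, not something confirmed in this thread:

```js
// A minimal sketch, not an official example. Assumes the transformers.js v3
// API (@huggingface/transformers) and the community ONNX conversion above.
import { AutoTokenizer, AutoModelForSequenceClassification } from '@huggingface/transformers';

const model_id = 'onnx-community/ModernCE-base-sts-ONNX';
const tokenizer = await AutoTokenizer.from_pretrained(model_id);
const model = await AutoModelForSequenceClassification.from_pretrained(model_id, {
  dtype: 'fp32', // quantized variants, if present in the repo, could be selected instead
});

const query = 'Which planet is known as the Red Planet?';
const passages = [
  "Venus is often called Earth's twin because of its similar size and proximity.",
  'Mars, known for its reddish appearance, is often referred to as the Red Planet.',
];

// A cross-encoder scores each (query, passage) pair jointly.
const inputs = tokenizer(new Array(passages.length).fill(query), {
  text_pair: passages,
  padding: true,
  truncation: true,
});
const { logits } = await model(inputs);
// Assumption: the exported head emits raw logits, so a sigmoid maps them to
// [0, 1] similarity scores, mirroring sentence-transformers' CrossEncoder.
console.log(logits.sigmoid().tolist());
```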
Hi @dleemiller and @Xenova,
Thanks for the quick responses and support!
Just to share: the onnx-community/ModernCE-base-sts-ONNX model runs fine on the CPU (WASM backend) via onnxruntime-web, but unfortunately fails to load with the WebGPU backend.
No special errors are thrown other than a numeric code (e.g., 620973208); could it be that one or more operators are unsupported by the WebGPU execution provider?
Would love to see a version of this model optimized for WebGPU if possible; it would enable fast, client-side semantic search entirely on the GPU.
Thanks again!
@wilwork I uploaded some ONNX files. Give them a try and let me know; I wasn't able to test the WebGPU backend.
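A minimal sketch for exercising the newly uploaded files on the WebGPU backend, assuming the transformers.js v3 `device` option; the dtype choice here is illustrative:

```js
// Sketch only: requests the WebGPU execution provider via transformers.js v3
// (the default backend in the browser is WASM when `device` is omitted).
import { AutoTokenizer, AutoModelForSequenceClassification } from '@huggingface/transformers';

const model_id = 'dleemiller/ModernCE-base-sts';
const tokenizer = await AutoTokenizer.from_pretrained(model_id);
const model = await AutoModelForSequenceClassification.from_pretrained(model_id, {
  device: 'webgpu',
  dtype: 'fp32', // fp16 could also be tried on GPUs that support it
});

const inputs = tokenizer('Which planet is known as the Red Planet?', {
  text_pair: 'Mars, known for its reddish appearance, is often referred to as the Red Planet.',
});
const { logits } = await model(inputs);
console.log(logits.sigmoid().tolist());
```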
Hi @dleemiller, thank you; it's working perfectly with transformers.js! However, I'm encountering an issue where the model fails to load when using WebGPU. @Xenova, if you have a moment, could you kindly help verify or provide some guidance?
Much appreciated!