Reporting a possible bug with the model
Hi,
I am using LM Studio to run this embedding model.
On the following tiny test case, it is crashing:
curl http://localhost:1234/v1/embeddings -H "Content-Type: application/json" -d '{
"model": "text-embedding-snowflake-arctic-embed-l-v2.0",
"input": "PatternEngine::RamReadSignals\nGate * hasSingleFanIngateIgnoreLeftShift(bool ignoreUnsignedExtension, int &shift) const\nhasSingleFanIngateIgnoreLeftShift is an extension to hasSingleFanIngate test where the tested vector LSB bit set at GND are ignored. Purpose is to detect adder chain like: {A+B,000}+C. This can be rewrite into {A,000}+{B,000}+C, which allow adder tree balancing and proper optimization (like ternary adder). For such case the 000 part (LSB) is ignored and the shift parameter returns the number of shifted bits. Params: ignoreUnsignedExtension - the processing of the fan-in gate. Do we allow sign extension. Refer to hasSingleFanIngate for detail explanation on this parameter. (type: bool) | shift - the number of shited bit that is detected. The shift may be 0 is no shift applies. (type: int &) Returns: Returns the leading unique gate."
}'
Notice near the end of the request: (type: int &)
If this text is changed to (type: int), the embedding process completes fine.
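To narrow down exactly which part of the input triggers the crash, a simple bisection over the string can help. Below is a minimal sketch; the `crashes` predicate is hypothetical — in practice it would POST the candidate string to the local /v1/embeddings endpoint and report whether the server crashed:

```python
from typing import Callable

def minimize_crashing_input(text: str, crashes: Callable[[str], bool]) -> str:
    """Shrink `text` to a smaller substring that still triggers the crash.

    Trims characters from the front, then from the back, keeping each
    trim only if the predicate still reports a crash.
    """
    assert crashes(text), "the full input must crash to begin with"
    # Trim from the front with decreasing step sizes.
    lo = 0
    step = len(text) // 2
    while step > 0:
        while lo + step < len(text) and crashes(text[lo + step:]):
            lo += step
        step //= 2
    text = text[lo:]
    # Trim from the back the same way.
    hi = len(text)
    step = len(text) // 2
    while step > 0:
        while hi - step > 0 and crashes(text[:hi - step]):
            hi -= step
        step //= 2
    return text[:hi]
```

With a real predicate wired to the server, this would reveal whether the crash is caused by the "int &" sequence itself or by something about the surrounding context.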
I am running it with the context length set to its maximum value (8K).
LM Studio 0.3.34 (Build 1)
Best Regards
Stéphane
Hi @stephanepetithomme -- sorry for the late reply. I don't have any experience with LM Studio, but I suspect the issue lies in how the llama.cpp backend (which powers LM Studio) handles specific character sequences or memory during the prompt processing phase. Have you tried other Transformer-based embedding models, by any chance?
I have tried these other ones:
text-embedding-nomic-embed-text-v1.5
second-state/Nomic-embed-text-v1.5-Embedding-GGUF
second-state/jina-embeddings-v2-base-code-GGUF
text-embedding-snowflake-arctic-embed-l-v2.0
Only text-embedding-snowflake-arctic-embed-l-v2.0 is crashing.
Thanks for the feedback! That is pretty strange 🤔
Before we dive any deeper do you mind trying out:
- https://huggingface.co/BAAI/bge-m3 (arctic embed 2l has the same architecture as this model but with different weights)
- https://huggingface.co/Snowflake/snowflake-arctic-embed-m-v2.0 (smaller variant of arctic embed with a different architecture)
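For running the same failing input against each candidate model, a small helper that builds the request body may save some typing. A minimal sketch; the model identifiers other than the original one are placeholders — use whatever names LM Studio assigns after downloading:

```python
import json

def build_embeddings_request(model: str, text: str) -> str:
    """Build the JSON body for a POST to the /v1/embeddings endpoint."""
    return json.dumps({"model": model, "input": text})

# The same failing input can then be sent to each model in turn
# (e.g. with urllib.request or curl) and the outcomes compared.
failing_input = "... (type: int &) ..."  # the full string from the report
for model in [
    "text-embedding-snowflake-arctic-embed-l-v2.0",
    "bge-m3",                                  # hypothetical LM Studio name
    "snowflake-arctic-embed-m-v2.0",           # hypothetical LM Studio name
]:
    body = build_embeddings_request(model, failing_input)
```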
Also, in your original question, the model referenced is text-embedding-snowflake-arctic-embed-l-v2.0. If I understand correctly, the weights are not pulled directly from Hugging Face. I’m curious whether there were any conversion or intermediate steps involved; if so, something may be happening there.
On LM Studio it is mentioned:
GGUF quants of Snowflake/snowflake-arctic-embed-l-v2.0 created using llama.cpp
I am currently quite busy with a work deadline.
I will try the things you mentioned, but I may not be able to get to it until next week at the earliest.
Thank you for the help
Of course! Let us know how it goes. I'd also recommend raising this issue with whoever did the GGUF quantization (maybe https://huggingface.co/Casual-Autopsy/snowflake-arctic-embed-l-v2.0-gguf ?)