
The current version always attempts to load the tokenizer from ai-sage/Giga-Embeddings-instruct instead of using the already downloaded model directory. This causes two problems:

  1. Obvious: it is impossible to run inference without access to the HF hub.
  2. Easy to overlook and thus dangerous: if the model on the HF hub changes, the locally downloaded embedder and the freshly fetched tokenizer can become incompatible. In the worst case this would manifest itself only as silently wrong embeddings.

This PR removes all _name_or_path parameters erroneously added to config.json; transformers populates these automatically when loading the model. A corresponding fix is made to the implementation code so that the tokenizer is loaded from the same location as the "main" model.
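A minimal sketch of the pattern being fixed (the helper and paths below are hypothetical, not the actual repository code). The key point is that the tokenizer location is derived from wherever the model was actually loaded, rather than being hard-coded to a hub id:

```python
def resolve_tokenizer_location(model_name_or_path: str) -> str:
    """Return the location the tokenizer should be loaded from.

    Buggy pattern: ignore the argument and always return the hub id
    "ai-sage/Giga-Embeddings-instruct", forcing hub access and risking
    a model/tokenizer mismatch.

    Fixed pattern: reuse the exact directory or hub id the embedder
    itself was loaded from, so the pair always stays in sync.
    """
    return model_name_or_path


# Hypothetical usage with transformers (paths are illustrative):
#
# from transformers import AutoModel, AutoTokenizer
#
# path = "/local/checkpoints/Giga-Embeddings-instruct"
# model = AutoModel.from_pretrained(path, trust_remote_code=True)
# tokenizer = AutoTokenizer.from_pretrained(resolve_tokenizer_location(path))
```

Note that transformers records the load location in `config._name_or_path` at load time, which is why hard-coding it in config.json is both unnecessary and harmful.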

tkhanipov changed pull request status to open
ekolodin changed pull request status to merged
ai-sage org

thank you!
