Questions on Training and Architecture

by crosant13 - opened Oct 28, 2024

edited Nov 4, 2024

I’m exploring this model, particularly its training methods and architectural specifics,
and I have a few questions:

Is the 2nd stage missing from the description of the "Training and Fine-tuning process", or is it a typo?
How architecturally distinct is this model from BGE3, and are there practical differences in its embedding approach?
What evaluation metrics did you use during training, and are any benchmarks available for comparison?
Could you share more about the fine-tuning capabilities—especially regarding generating custom embeddings or using the model in domain-specific applications?
Also, could you share the training code, or give us an idea of exactly how you trained the model?
Thank you in advance for any insights!
