code_retriever training checkpoints
Full Hugging Face Trainer / SentenceTransformer checkpoints for the code_retriever project.
Each checkpoint directory contains everything needed to resume training:
model.safetensors, optimizer.pt, scheduler.pt, rng_state.pth,
trainer_state.json, training_args.bin, tokenizer files, and pooling config.
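Before resuming, it can help to confirm that a downloaded checkpoint folder is complete. The sketch below is only an illustration: the helper name missing_checkpoint_files is hypothetical, and it checks just the file names listed above (tokenizer and pooling files are skipped because their exact names are not given here).

```python
from pathlib import Path

# File names taken from the list above. Tokenizer and pooling files are
# not checked because their exact names are not listed in this README.
EXPECTED_FILES = (
    "model.safetensors",
    "optimizer.pt",
    "scheduler.pt",
    "rng_state.pth",
    "trainer_state.json",
    "training_args.bin",
)


def missing_checkpoint_files(checkpoint_dir):
    """Return the expected file names that are absent from checkpoint_dir."""
    root = Path(checkpoint_dir)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]
```

An empty list means every listed file is present; any returned names point to an incomplete download.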
Contents
| Run | Checkpoints | Notes |
|---|---|---|
| RuModernBERT-base_bs64_lr_2e-05 | checkpoint-12400, checkpoint-33600, checkpoint-46400, checkpoint-82600 | 1st epoch, batch size 64 |
| RuModernBERT-base_bs128_lr_2e-05_2nd_epoch | checkpoint-27200, checkpoint-45400 | 2nd epoch, batch size 128 |
Base model: deepvk/RuModernBERT-base
Download all checkpoints
huggingface-cli download fyaronskiy/code_retriever-saved-checkpoints \
--repo-type model \
--local-dir models/saved_checkpoints
Download a single checkpoint
huggingface-cli download fyaronskiy/code_retriever-saved-checkpoints \
--repo-type model \
--include "RuModernBERT-base_bs64_lr_2e-05/checkpoint-82600/*" \
--local-dir models/saved_checkpoints
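The same single-checkpoint download can probably be done from Python with huggingface_hub.snapshot_download, whose allow_patterns argument plays the role of --include. This is a minimal sketch, not part of the project code: the helper names checkpoint_pattern and download_checkpoint are hypothetical, and the default local_dir simply mirrors the CLI command above.

```python
from pathlib import Path

REPO_ID = "fyaronskiy/code_retriever-saved-checkpoints"


def checkpoint_pattern(run_name, step):
    """Build the glob that selects one checkpoint folder inside the repo."""
    return f"{run_name}/checkpoint-{step}/*"


def download_checkpoint(run_name, step, local_dir="models/saved_checkpoints"):
    """Download one checkpoint and return its local directory (hypothetical helper)."""
    # Imported lazily so checkpoint_pattern works without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    root = snapshot_download(
        repo_id=REPO_ID,
        repo_type="model",
        allow_patterns=[checkpoint_pattern(run_name, step)],
        local_dir=local_dir,
    )
    return Path(root) / run_name / f"checkpoint-{step}"
```

Calling download_checkpoint("RuModernBERT-base_bs64_lr_2e-05", 82600) should fetch the same files as the CLI command above.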
Resume training
- Download the desired run folder or checkpoint.
- In train/train.py, point resume_checkpoint to the checkpoint path and set model_dir to the corresponding run directory under models/.
run_name = "RuModernBERT-base_bs64_lr_2e-05"
model_dir = f"../models/{run_name}"
resume_checkpoint = "../models/saved_checkpoints/RuModernBERT-base_bs64_lr_2e-05/checkpoint-82600"
do_resume_train = True
auto_resume = False
- Launch training as usual, e.g. bash train/train_accelerate.sh.
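To double-check where a checkpoint will resume from, you can read its trainer_state.json. The sketch below assumes the file uses the standard Hugging Face Trainer fields global_step and epoch; the helper name read_resume_point is hypothetical and is not part of train/train.py.

```python
import json
from pathlib import Path


def read_resume_point(checkpoint_dir):
    """Return (global_step, epoch) recorded in the checkpoint's trainer_state.json."""
    state_path = Path(checkpoint_dir) / "trainer_state.json"
    with state_path.open(encoding="utf-8") as f:
        state = json.load(f)
    # Assumed field names: the ones the Hugging Face Trainer normally writes.
    # .get returns None if a field is missing instead of raising.
    return state.get("global_step"), state.get("epoch")
```

For a folder named checkpoint-82600, the reported global_step would be expected to match the number in the folder name.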
Load for inference only
from sentence_transformers import SentenceTransformer
model = SentenceTransformer(
"fyaronskiy/code_retriever-saved-checkpoints/RuModernBERT-base_bs64_lr_2e-05/checkpoint-82600"
)
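If the nested repo path does not resolve in your installed sentence-transformers version, one alternative is to load a checkpoint folder that was already downloaded with the commands above. This is a sketch under that assumption; load_local_checkpoint is a hypothetical helper name.

```python
from pathlib import Path


def load_local_checkpoint(checkpoint_dir):
    """Load a downloaded checkpoint folder as a SentenceTransformer model."""
    path = Path(checkpoint_dir)
    if not path.is_dir():
        raise FileNotFoundError(f"checkpoint folder not found: {path}")
    # Imported lazily so the path check fails fast, before the library is loaded.
    from sentence_transformers import SentenceTransformer

    return SentenceTransformer(str(path))
```

For example, load_local_checkpoint("models/saved_checkpoints/RuModernBERT-base_bs64_lr_2e-05/checkpoint-82600") should return a model whose encode method can then be called on a list of strings.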
For production inference, prefer the published model: fyaronskiy/code_retriever_ru_en.