metadata
license: mit
language:
  - ru
  - en
tags:
  - sentence-transformers
  - code-retrieval
  - training-checkpoints
  - rumodernbert

code_retriever training checkpoints

Full Hugging Face Trainer / SentenceTransformer checkpoints for the code_retriever project.

Each checkpoint directory contains everything needed to resume training: model.safetensors, optimizer.pt, scheduler.pt, rng_state.pth, trainer_state.json, training_args.bin, tokenizer files, and pooling config.
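Before resuming, it can help to confirm that a downloaded checkpoint is complete. The following is a minimal sketch, not part of the project: the helper name `missing_checkpoint_files` is hypothetical, and it checks only the six file names listed above (tokenizer and pooling files are not checked because their exact names are not given here).

```python
from pathlib import Path

# File names copied from the list above; tokenizer and pooling files are
# intentionally left out because their exact names are not stated.
EXPECTED_FILES = (
    "model.safetensors",
    "optimizer.pt",
    "scheduler.pt",
    "rng_state.pth",
    "trainer_state.json",
    "training_args.bin",
)


def missing_checkpoint_files(checkpoint_dir):
    """Return the expected file names that are absent from checkpoint_dir."""
    root = Path(checkpoint_dir)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]
```

An empty list means all six files are present, for example `missing_checkpoint_files("models/saved_checkpoints/RuModernBERT-base_bs64_lr_2e-05/checkpoint-82600")`.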

Contents

| Run | Checkpoints | Notes |
|---|---|---|
| RuModernBERT-base_bs64_lr_2e-05 | checkpoint-12400, checkpoint-33600, checkpoint-46400, checkpoint-82600 | 1st epoch, batch size 64 |
| RuModernBERT-base_bs128_lr_2e-05_2nd_epoch | checkpoint-27200, checkpoint-45400 | 2nd epoch, batch size 128 |

Base model: deepvk/RuModernBERT-base

Download all checkpoints

hf download fyaronskiy/code_retriever-saved-checkpoints \
  --repo-type model \
  --local-dir models/saved_checkpoints

Download a single checkpoint

hf download fyaronskiy/code_retriever-saved-checkpoints \
  --repo-type model \
  --include "RuModernBERT-base_bs64_lr_2e-05/checkpoint-82600/*" \
  --local-dir models/saved_checkpoints
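The same single-checkpoint download can likely be done from Python with `huggingface_hub.snapshot_download`, which accepts `allow_patterns` much like the CLI's `--include`. This is a sketch under that assumption; the helper names `checkpoint_pattern` and `download_checkpoint` are hypothetical and not part of the project.

```python
def checkpoint_pattern(run_name, step):
    """Build the glob that matches every file of one checkpoint in the repo."""
    return f"{run_name}/checkpoint-{step}/*"


def download_checkpoint(run_name, step, local_dir="models/saved_checkpoints"):
    """Download one checkpoint folder and return the local snapshot path."""
    # Imported lazily so checkpoint_pattern works without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id="fyaronskiy/code_retriever-saved-checkpoints",
        repo_type="model",
        allow_patterns=[checkpoint_pattern(run_name, step)],
        local_dir=local_dir,
    )
```

Calling `download_checkpoint("RuModernBERT-base_bs64_lr_2e-05", 82600)` should mirror the CLI command above; it needs network access and the `huggingface_hub` package.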

Resume training

  1. Download the desired run folder or checkpoint.
  2. In train/train.py, point resume_checkpoint to the checkpoint path and set model_dir to the corresponding run directory under models/.
run_name = "RuModernBERT-base_bs64_lr_2e-05"
model_dir = f"../models/{run_name}"
resume_checkpoint = "../models/saved_checkpoints/RuModernBERT-base_bs64_lr_2e-05/checkpoint-82600"
do_resume_train = True
auto_resume = False
  3. Launch training as usual, e.g. bash train/train_accelerate.sh.
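To double-check which point in training a checkpoint corresponds to before resuming, one option is to read its trainer_state.json. The sketch below assumes the file carries the `global_step` and `epoch` fields that the Hugging Face Trainer normally writes; the helper name `checkpoint_progress` is hypothetical.

```python
import json
from pathlib import Path


def checkpoint_progress(checkpoint_dir):
    """Read global_step and epoch from a checkpoint's trainer_state.json."""
    state_path = Path(checkpoint_dir) / "trainer_state.json"
    state = json.loads(state_path.read_text(encoding="utf-8"))
    # .get() returns None if a field is missing instead of raising KeyError.
    return {"global_step": state.get("global_step"), "epoch": state.get("epoch")}
```

For a directory named checkpoint-82600, the reported `global_step` would be expected to match the number in the folder name.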

Load for inference only

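from sentence_transformers import SentenceTransformer

# Load from the local copy fetched in the download step above; a nested
# "repo/run/checkpoint" string is not a valid Hub model ID.
model = SentenceTransformer(
    "models/saved_checkpoints/RuModernBERT-base_bs64_lr_2e-05/checkpoint-82600"
)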

For production inference, prefer the published model: fyaronskiy/code_retriever_ru_en.

import torch
from sentence_transformers import SentenceTransformer, util

device = "cuda" if torch.cuda.is_available() else "cpu"
model = SentenceTransformer("fyaronskiy/code_retriever_ru_en").to(device)

queries = ["Напиши функцию на Python, которая рекурсивно вычисляет факториал числа."]
corpus = [
    """def factorial(n):
    if n == 0:
        return 1
    return n * factorial(n - 1)""",
]

doc_embeddings = model.encode(corpus, convert_to_tensor=True, device=device)
query_embeddings = model.encode(queries, convert_to_tensor=True, device=device)
scores = util.cos_sim(query_embeddings[0], doc_embeddings)[0]
print(scores)
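With more than one document in the corpus, the scores are usually turned into a ranking. The following dependency-free sketch shows what cosine-similarity ranking computes; the function names `cosine` and `rank` and the toy vectors are illustrative assumptions, not part of the project. With real embeddings one would pass plain lists, e.g. from `model.encode(...).tolist()`.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank(query_vec, doc_vecs, top_k=3):
    """Return up to top_k (doc_index, score) pairs, highest score first."""
    scored = [(i, cosine(query_vec, d)) for i, d in enumerate(doc_vecs)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy 2-D vectors standing in for real embeddings.
print(rank([1.0, 0.0], [[0.0, 1.0], [1.0, 0.0], [1.0, 1.0]], top_k=2))
```

The first pair returned is the best match; here that is document 1, which points in the same direction as the query.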