fyaronskiy's picture
Update README.md
f6eb178 verified
|
Raw
History Blame Contribute Delete
2.95 kB
---
license: mit
language:
- ru
- en
tags:
- sentence-transformers
- code-retrieval
- training-checkpoints
- rumodernbert
---
# code_retriever training checkpoints
Full Hugging Face Trainer / SentenceTransformer checkpoints for the
[code_retriever](https://github.com/fedor28/code_retriever) project.
Each checkpoint directory contains everything needed to resume training:
`model.safetensors`, `optimizer.pt`, `scheduler.pt`, `rng_state.pth`,
`trainer_state.json`, `training_args.bin`, tokenizer files, and pooling config.
## Contents
| Run | Checkpoints | Notes |
|-----|-------------|-------|
| `RuModernBERT-base_bs64_lr_2e-05` | `checkpoint-12400`, `checkpoint-33600`, `checkpoint-46400`, `checkpoint-82600` | 1st epoch, batch size 64 |
| `RuModernBERT-base_bs128_lr_2e-05_2nd_epoch` | `checkpoint-27200`, `checkpoint-45400` | 2nd epoch, batch size 128 |
Base model: [`deepvk/RuModernBERT-base`](https://huggingface.co/deepvk/RuModernBERT-base)
## Download all checkpoints
```bash
hf download fyaronskiy/code_retriever-saved-checkpoints \
--repo-type model \
--local-dir models/saved_checkpoints
```
## Download a single checkpoint
```bash
hf download fyaronskiy/code_retriever-saved-checkpoints \
--repo-type model \
--include "RuModernBERT-base_bs64_lr_2e-05/checkpoint-82600/*" \
--local-dir models/saved_checkpoints
```
## Resume training
1. Download the desired run folder or checkpoint.
2. In `train/train.py`, point `resume_checkpoint` to the checkpoint path and
set `model_dir` to the corresponding run directory under `models/`.
```python
run_name = "RuModernBERT-base_bs64_lr_2e-05"
model_dir = f"../models/{run_name}"
resume_checkpoint = "../models/saved_checkpoints/RuModernBERT-base_bs64_lr_2e-05/checkpoint-82600"
do_resume_train = True
auto_resume = False
```
3. Launch training as usual, e.g. `bash train/train_accelerate.sh`.
## Load for inference only
```python
from sentence_transformers import SentenceTransformer
model = SentenceTransformer(
"fyaronskiy/code_retriever-saved-checkpoints/RuModernBERT-base_bs64_lr_2e-05/checkpoint-82600"
)
```
For production inference, prefer the published model:
[`fyaronskiy/code_retriever_ru_en`](https://huggingface.co/fyaronskiy/code_retriever_ru_en).
```python
import torch
from sentence_transformers import SentenceTransformer, util
device = "cuda" if torch.cuda.is_available() else "cpu"
model = SentenceTransformer("fyaronskiy/code_retriever_ru_en").to(device)
queries = ["Напиши функцию на Python, которая рекурсивно вычисляет факториал числа."]
corpus = [
"""def factorial(n):
if n == 0:
return 1
return n * factorial(n - 1)""",
]
doc_embeddings = model.encode(corpus, convert_to_tensor=True, device=device)
query_embeddings = model.encode(queries, convert_to_tensor=True, device=device)
scores = util.cos_sim(query_embeddings[0], doc_embeddings)[0]
print(scores)
```