---
license: apache-2.0
base_model: Qwen/Qwen2.5-0.5B-Instruct
tags:
- fine-tuned
- qlora
- trl
- peft
- helix-llm
library_name: transformers
pipeline_tag: text-generation
---

# Model Card for test1-single-sft

This model is a fine-tuned version of [Qwen/Qwen2.5-0.5B-Instruct](https://huggingface.co/Qwen/Qwen2.5-0.5B-Instruct). It has been trained using [TRL](https://github.com/huggingface/trl).

## Model Details

| Parameter | Value |
|-----------|-------|
| **Base Model** | `Qwen/Qwen2.5-0.5B-Instruct` |
| **Training Type** | qlora |
| **LoRA Rank (r)** | 16 |
| **LoRA Alpha** | 32 |
| **Strategies** | SFT (1ep) |
| **Batch Size** | 4 |

## Training procedure

Training metrics are tracked locally with TensorBoard and MLflow.

### Framework versions

- **PEFT**: 0.18.0
- **TRL**: 0.25.1
- **Transformers**: 4.57.3
- **PyTorch**: 2.9.1
- **Datasets**: 3.6.0
- **Tokenizers**: 0.22.1

## Training Config

The full training configuration is available in `training_config.yaml`.

## Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("Tranium/test1-single-sft")
tokenizer = AutoTokenizer.from_pretrained("Tranium/test1-single-sft")

messages = [{"role": "user", "content": "Hello!"}]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt")

outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

## Training Infrastructure

- **Platform**: single_node
- **GPU**: auto-detect