---
license: apache-2.0
base_model: Qwen/Qwen2.5-0.5B-Instruct
tags:
- fine-tuned
- qlora
- trl
- peft
- helix-llm
library_name: transformers
pipeline_tag: text-generation
---

# Model Card for test1-single-sft

This model is a fine-tuned version of [Qwen/Qwen2.5-0.5B-Instruct](https://huggingface.co/Qwen/Qwen2.5-0.5B-Instruct). It has been trained using [TRL](https://github.com/huggingface/trl).

## Model Details

| Parameter | Value |
|-----------|-------|
| **Base Model** | `Qwen/Qwen2.5-0.5B-Instruct` |
| **Training Type** | qlora |
| **LoRA Rank (r)** | 16 |
| **LoRA Alpha** | 32 |
| **Strategies** | SFT (1ep) |
| **Batch Size** | 4 |

## Training procedure

Training metrics are tracked locally with TensorBoard and MLflow.

### Framework versions

- **PEFT**: 0.18.0
- **TRL**: 0.25.1
- **Transformers**: 4.57.3
- **PyTorch**: 2.9.1
- **Datasets**: 3.6.0
- **Tokenizers**: 0.22.1

## Training Config

The full training configuration is available in `training_config.yaml`.

## Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("Tranium/test1-single-sft")
tokenizer = AutoTokenizer.from_pretrained("Tranium/test1-single-sft")

messages = [{"role": "user", "content": "Hello!"}]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt")

outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

## Training Infrastructure

- **Platform**: single_node
- **GPU**: auto-detect