---
base_model: unsloth/Qwen2.5-3B-Instruct-bnb-4bit
library_name: peft
pipeline_tag: text-generation
language: en
license: apache-2.0
tags:
- lora
- sft
- transformers
- trl
- unsloth
- fine-tuned
datasets:
- AiCloser/sharegpt_cot_dataset
---

# RRT1-3B

A fine-tuned 3B-parameter model specialized for reasoning and chain-of-thought tasks.

## Model Details

This model is a fine-tuned version of unsloth/Qwen2.5-3B-Instruct-bnb-4bit, trained with the Unsloth framework using LoRA (Low-Rank Adaptation) for efficient training.

- **Developed by:** theprint
- **Model type:** Causal Language Model (fine-tuned with LoRA)
- **Language:** en
- **License:** apache-2.0
- **Base model:** unsloth/Qwen2.5-3B-Instruct-bnb-4bit
- **Fine-tuning method:** LoRA with rank 128

## Intended Use

Reasoning, chain-of-thought, and general instruction following.

## Training Details

### Training Data

ShareGPT conversations with chain-of-thought reasoning examples.

- **Dataset:** AiCloser/sharegpt_cot_dataset
- **Format:** sharegpt

### Training Procedure

- **Training epochs:** 3
- **LoRA rank:** 128
- **Learning rate:** 0.0002
- **Batch size:** 4
- **Framework:** Unsloth + transformers + PEFT
- **Hardware:** NVIDIA RTX 5090

## Usage

```python
from unsloth import FastLanguageModel

# Load the model and tokenizer in 4-bit
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="theprint/RRT1-3B",
    max_seq_length=4096,
    dtype=None,
    load_in_4bit=True,
)

# Enable inference mode
FastLanguageModel.for_inference(model)

# Example usage: move inputs to the model's device before generating
inputs = tokenizer(["Your prompt here"], return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=256,
    temperature=0.7,
    do_sample=True,  # temperature only takes effect with sampling enabled
)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
```

## GGUF Quantized Versions

Quantized GGUF versions are available in the `gguf/` directory for use with llama.cpp:

- `RRT1-3B-q4_k_m.gguf` - 4-bit quantization (recommended for most use cases)
- `RRT1-3B-q5_k_m.gguf` - 5-bit quantization (higher quality)
- `RRT1-3B-q8_0.gguf` - 8-bit quantization (highest quality)

## Limitations

The model may hallucinate or produce incorrect information. It is not suitable for critical decision-making.

## Citation

If you use this model, please cite:

```bibtex
@misc{rrt1_3b,
  title={RRT1-3B: Fine-tuned Qwen2.5-3B-Instruct-bnb-4bit},
  author={theprint},
  year={2025},
  publisher={Hugging Face},
  url={https://huggingface.co/theprint/RRT1-3B}
}
```
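## Prompt Format

As an instruction-tuned Qwen2.5 derivative, this model expects ChatML-style prompts. In practice you should let `tokenizer.apply_chat_template` render the conversation for you; the sketch below (the helper name is illustrative, not part of any library) just shows what that rendered format looks like:

```python
# Illustrative sketch of the ChatML-style prompt format used by
# Qwen2.5-Instruct models. In real code, prefer
# tokenizer.apply_chat_template(messages, add_generation_prompt=True).

def build_chatml_prompt(messages):
    """Render a list of {"role", "content"} dicts into a ChatML string."""
    parts = []
    for msg in messages:
        parts.append(f"<|im_start|>{msg['role']}\n{msg['content']}<|im_end|>\n")
    # A trailing assistant header cues the model to generate the reply.
    parts.append("<|im_start|>assistant\n")
    return "".join(parts)

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Explain chain-of-thought prompting briefly."},
]
print(build_chatml_prompt(messages))
```

Passing a raw, untemplated string (as in the minimal Usage example above) will still generate text, but templated prompts generally track the model's fine-tuning distribution more closely.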