---
base_model: unsloth/Qwen2.5-3B-Instruct-bnb-4bit
library_name: peft
pipeline_tag: text-generation
language: en
license: apache-2.0
tags:
- lora
- sft
- transformers
- trl
- unsloth
- fine-tuned
datasets:
- AiCloser/sharegpt_cot_dataset
---

# RRT1-3B

A fine-tuned 3B-parameter model specialized for reasoning and chain-of-thought tasks.

## Model Details

This model is a fine-tuned version of `unsloth/Qwen2.5-3B-Instruct-bnb-4bit`, trained with the Unsloth framework using LoRA (Low-Rank Adaptation) for efficient training.

- **Developed by:** theprint
- **Model type:** Causal Language Model (fine-tuned with LoRA)
- **Language:** en
- **License:** apache-2.0
- **Base model:** unsloth/Qwen2.5-3B-Instruct-bnb-4bit
- **Fine-tuning method:** LoRA with rank 128 (a configuration sketch follows below)

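Only the LoRA rank (r = 128) is documented for this card. As a point of reference, the sketch below shows how such an adapter is typically attached with Unsloth; the target modules, alpha, and dropout values are common choices, not confirmed training settings:

```python
from unsloth import FastLanguageModel

# Load the 4-bit base model.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/Qwen2.5-3B-Instruct-bnb-4bit",
    max_seq_length=4096,
    load_in_4bit=True,
)

# Attach the LoRA adapter. Only r=128 is documented; the rest are assumptions.
model = FastLanguageModel.get_peft_model(
    model,
    r=128,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    lora_alpha=128,
    lora_dropout=0.0,
    use_gradient_checkpointing="unsloth",
)
```
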
## Intended Use

Reasoning, chain-of-thought, and general instruction following.

## Training Details

### Training Data

ShareGPT conversations with chain-of-thought reasoning examples.

- **Dataset:** AiCloser/sharegpt_cot_dataset
- **Format:** sharegpt (a schema illustration follows below)

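For reference, ShareGPT-format records store each conversation as a list of `from`/`value` turns. The snippet below is a generic illustration of that schema (not an actual record from this dataset), together with the usual conversion to the `role`/`content` layout that chat templates expect:

```python
# Illustrative ShareGPT-style record (hypothetical, not taken from the dataset).
record = {
    "conversations": [
        {"from": "human", "value": "What is 17 * 24? Think step by step."},
        {"from": "gpt", "value": "17 * 24 = 17 * 20 + 17 * 4 = 340 + 68 = 408."},
    ]
}

# Map ShareGPT speaker tags onto chat-template roles.
role_map = {"system": "system", "human": "user", "gpt": "assistant"}
messages = [
    {"role": role_map[turn["from"]], "content": turn["value"]}
    for turn in record["conversations"]
]
```
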
### Training Procedure

- **Training epochs:** 3
- **LoRA rank:** 128
- **Learning rate:** 0.0002
- **Batch size:** 4
- **Framework:** Unsloth + transformers + PEFT
- **Hardware:** NVIDIA RTX 5090

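The exact training script is not published with this card. The sketch below only reconstructs the documented hyperparameters in a typical TRL `SFTTrainer` setup; dataset preprocessing, optimizer, and scheduler choices are assumptions:

```python
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# Assumes each record has already been rendered to a "text" column,
# e.g. via tokenizer.apply_chat_template on messages like those shown above.
dataset = load_dataset("AiCloser/sharegpt_cot_dataset", split="train")

trainer = SFTTrainer(
    model=model,                 # the LoRA-wrapped model from the earlier sketch
    processing_class=tokenizer,  # `tokenizer=` on older TRL versions
    train_dataset=dataset,
    args=SFTConfig(
        output_dir="outputs",
        num_train_epochs=3,             # documented above
        per_device_train_batch_size=4,  # documented above
        learning_rate=2e-4,             # documented above (0.0002)
    ),
)
trainer.train()
```
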
## Usage

```python
from unsloth import FastLanguageModel

# Load model and tokenizer (4-bit loading requires a CUDA GPU)
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="theprint/RRT1-3B",
    max_seq_length=4096,
    dtype=None,          # auto-detect compute dtype
    load_in_4bit=True,
)

# Switch to Unsloth's optimized inference mode
FastLanguageModel.for_inference(model)

# Example usage; inputs must live on the same device as the model
inputs = tokenizer(["Your prompt here"], return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.7)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
```

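Because the base model is instruction-tuned, prompts generally behave better when rendered through the tokenizer's chat template rather than passed as raw strings. A minimal sketch, assuming the adapter keeps the base Qwen2.5 chat template:

```python
# Render a user message through the chat template before generating.
messages = [
    {"role": "user", "content": "A train covers 60 km in 45 minutes. What is its average speed in km/h? Reason step by step."}
]
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,  # append the assistant-turn marker
    return_tensors="pt",
).to(model.device)

outputs = model.generate(input_ids, max_new_tokens=256, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Since the repository is published as a PEFT adapter (`library_name: peft`), it should also be loadable with plain `transformers` plus `peft` if you prefer not to install Unsloth.
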
## GGUF Quantized Versions

Quantized GGUF versions are available in the `gguf/` directory for use with llama.cpp:

- `RRT1-3B-q4_k_m.gguf` - 4-bit quantization (recommended for most use cases)
- `RRT1-3B-q5_k_m.gguf` - 5-bit quantization (higher quality)
- `RRT1-3B-q8_0.gguf` - 8-bit quantization (highest quality)

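One convenient way to run these files from Python is the `llama-cpp-python` bindings; the sketch below assumes you have downloaded the q4_k_m file into a local `gguf/` directory:

```python
from llama_cpp import Llama  # pip install llama-cpp-python

# Assumed local path; point this at wherever you saved the GGUF file.
llm = Llama(model_path="gguf/RRT1-3B-q4_k_m.gguf", n_ctx=4096)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Explain, step by step, how to compute 15% of 240."}],
    max_tokens=256,
    temperature=0.7,
)
print(out["choices"][0]["message"]["content"])
```

The same files also work directly with the llama.cpp CLI or any other GGUF-compatible runtime.
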
## Limitations

May hallucinate or provide incorrect information. Not suitable for critical decision-making.

## Citation

If you use this model, please cite:

```bibtex
@misc{rrt1_3b,
  title={RRT1-3B: Fine-tuned Qwen2.5-3B-Instruct-bnb-4bit},
  author={theprint},
  year={2025},
  publisher={Hugging Face},
  url={https://huggingface.co/theprint/RRT1-3B}
}
```