---
base_model: unsloth/qwen2.5-math-1.5b
library_name: peft
pipeline_tag: text-generation
tags:
- base_model:unsloth/qwen2.5-math-1.5b
- lora
- sft
- transformers
- trl
- unsloth
license: apache-2.0
title: TAV (CPU Version)
sdk: gradio
emoji: 👀
colorFrom: green
colorTo: red
sdk_version: 5.49.1
hf_oauth: true
---

# Model Card for TAV (CPU Version)

## Model Details

### Model Description

TAV is a CPU-compatible text-generation model based on `unsloth/qwen2.5-math-1.5b` and fine-tuned with PEFT (LoRA) adapters. It is configured to run in CPU-only environments, with no 4-bit quantization or bitsandbytes dependency.

- **Developed by:** [Your Name / Organization]
- **Shared by:** [Your Name / Organization]
- **Model type:** Causal language model (text generation)
- **Language(s):** English (with mathematical/technical capability)
- **License:** Apache-2.0
- **Finetuned from model:** `unsloth/qwen2.5-math-1.5b`

### Model Sources

- **Repository:** [Hugging Face Model Link]
- **Demo:** [Hugging Face Space Link]

## Uses

### Direct Use

- Generating math and technical answers in English.
- Serving as a chatbot for educational purposes.
- Integration into CPU-only environments.

### Downstream Use

- Can be further fine-tuned for domain-specific tasks.
- Suitable for research or teaching applications.

### Out-of-Scope Use

- Not optimized for GPU-heavy inference or long sequences (over 1024 tokens).
- Not suitable for real-time production under heavy load.

## Bias, Risks, and Limitations

- May produce biased or incorrect answers.
- CPU inference is considerably slower than GPU inference.
- The usable context window is limited by CPU memory constraints.

### Recommendations

- Use moderate token limits to keep generation times reasonable.
- Not intended for high-throughput production environments.
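The token-limit recommendation above can be made concrete with a small helper that caps `max_new_tokens` against a wall-clock budget. This is a sketch: the helper name and the default decoding speed (~8 tokens/s) are assumptions, not measured values for this model; benchmark your own hardware and pass in the real figure.

```python
def cap_max_new_tokens(budget_seconds, tokens_per_second=8.0, hard_limit=1024):
    """Pick a max_new_tokens value that should fit a wall-clock budget.

    tokens_per_second is an assumed CPU decoding speed; measure it on
    your own machine. hard_limit mirrors the ~1024-token ceiling noted
    under Out-of-Scope Use.
    """
    estimate = int(budget_seconds * tokens_per_second)
    return max(1, min(estimate, hard_limit))

# e.g. a 30-second budget at an assumed ~8 tokens/s on CPU:
print(cap_max_new_tokens(30))  # 240
```

The returned value can be passed directly as `max_new_tokens` to the generation pipeline shown below.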
## How to Get Started

Use the CPU-compatible pipeline in Python:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

tokenizer = AutoTokenizer.from_pretrained("unsloth/qwen2.5-math-1.5b")
model = AutoModelForCausalLM.from_pretrained("unsloth/qwen2.5-math-1.5b", device_map="cpu")

# device=-1 keeps the pipeline on the CPU
generator = pipeline("text-generation", model=model, tokenizer=tokenizer, device=-1)

output = generator("Hi, how are you?", max_new_tokens=128, do_sample=True)
print(output[0]["generated_text"])
```