Multiplication LoRA Adapter

A LoRA adapter that teaches Qwen2.5-0.5B to multiply 6-digit numbers by 7 with ~94% accuracy (up from ~3% for the base model).

Model Details

Model Description

This is a LoRA (Low-Rank Adaptation) adapter fine-tuned on a synthetic arithmetic dataset. The adapter teaches the base model to perform multiplication of 6-digit numbers by 7, demonstrating how LoRA can efficiently teach specific computational skills to small language models.

  • Developed by: nlac
  • Model type: LoRA adapter for causal language model
  • Language(s) (NLP): English (arithmetic expressions)
  • License: MIT
  • Finetuned from model: Qwen/Qwen2.5-0.5B-Instruct

Uses

Direct Use

The adapter is designed for multiplying 6-digit numbers (100000-999999) by 7. It expects input in the format {number} * 7 and returns the numeric result.

Out-of-Scope Use

  • Multiplication with numbers outside the 6-digit range (may work but not tested)
  • Multiplication by numbers other than 7
  • General arithmetic operations (addition, subtraction, division)
  • Any non-arithmetic tasks

Bias, Risks, and Limitations

  • The model is trained only on 6-digit numbers multiplied by 7
  • ~6% of predictions may still be incorrect
  • The model may produce incorrect results for inputs outside the training distribution
  • Not suitable for applications requiring 100% arithmetic accuracy

Recommendations

For production use cases requiring reliable arithmetic, use traditional calculators or verified computation libraries. This adapter is primarily educational, demonstrating LoRA fine-tuning capabilities.
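If the adapter is used anyway, callers can guard the input range and verify the output against exact integer arithmetic. A minimal sketch, assuming a hypothetical generate_answer(prompt) helper that wraps the generation code from the next section:

def checked_multiply_by_7(a: int, generate_answer) -> int:
    # The adapter was only trained on 6-digit operands, so reject anything else.
    if not (100000 <= a <= 999999):
        raise ValueError("adapter is only trained on 6-digit numbers")
    predicted = generate_answer(f"{a} * 7").strip()
    exact = a * 7  # verified result from ordinary integer arithmetic
    # Trust the model's answer only when it agrees with the exact computation.
    return int(predicted) if predicted == str(exact) else exact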

How to Get Started with the Model

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load base model and tokenizer
base_model = "Qwen/Qwen2.5-0.5B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForCausalLM.from_pretrained(base_model, device_map="auto")

# Load LoRA adapter
model = PeftModel.from_pretrained(model, "nlac/multiplication-lora-demo-adapter")

# Prepare input
messages = [
    {"role": "system", "content": "You are a helpful calculator that multiplies two numbers. Answer only a number. No preamble."},
    {"role": "user", "content": "123456 * 7"}
]
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Generate
outputs = model.generate(**inputs, max_new_tokens=32, do_sample=False)
answer = tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
print(answer)  # Expected: 864192

Training Details

Training Data

Synthetically generated dataset of 20,000 multiplication examples:

  • Input: 6-digit numbers (100000-999999) multiplied by 7
  • Format: Chat messages with system prompt, user query, and assistant response
  • Prompt variations: {a} * 7, {a}* 7, {a} *7 with optional ? suffix
  • Train/validation split: 90%/10%

Example training item:

[
  {"role": "system", "content": "You are a helpful calculator that multiplies two numbers. Answer only a number. No preamble."},
  {"role": "user", "content": "772694 * 7?"},
  {"role": "assistant", "content": "5408858"}
]
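A minimal sketch of how a dataset in this format could be generated. The prompt variants follow the description above; the file names and the "messages" key are illustrative assumptions:

import json
import random

random.seed(0)

VARIANTS = ["{a} * 7", "{a}* 7", "{a} *7"]
SYSTEM = ("You are a helpful calculator that multiplies two numbers. "
          "Answer only a number. No preamble.")

def make_example():
    a = random.randint(100000, 999999)
    prompt = random.choice(VARIANTS).format(a=a)
    if random.random() < 0.5:  # optional "?" suffix
        prompt += "?"
    return [
        {"role": "system", "content": SYSTEM},
        {"role": "user", "content": prompt},
        {"role": "assistant", "content": str(a * 7)},
    ]

examples = [make_example() for _ in range(20_000)]
split = int(len(examples) * 0.9)  # 90%/10% train/validation split
with open("train.jsonl", "w") as f:
    for ex in examples[:split]:
        f.write(json.dumps({"messages": ex}) + "\n")
with open("val.jsonl", "w") as f:
    for ex in examples[split:]:
        f.write(json.dumps({"messages": ex}) + "\n")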

Training Procedure

Training Hyperparameters

  • Training regime: bf16 mixed precision (fp16 on CPU)
  • Epochs: 3
  • Per-device batch size: 4
  • Gradient accumulation steps: 4 (effective batch size: 16)
  • Learning rate: 1e-3
  • LR scheduler: Cosine annealing
  • Warmup ratio: 0.05
  • Max gradient norm: 1.0
  • Optimizer: AdamW (default)
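These settings map onto a transformers TrainingArguments object roughly as follows; this is a sketch, the output directory name is an illustrative assumption, and the actual training script may differ:

from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="lora-multiplication-adapter",  # illustrative name
    num_train_epochs=3,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=4,  # effective batch size 16
    learning_rate=1e-3,
    lr_scheduler_type="cosine",
    warmup_ratio=0.05,
    max_grad_norm=1.0,
    bf16=True,  # use fp16=True instead where bf16 is unavailable
    optim="adamw_torch",  # AdamW (the default optimizer)
)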

LoRA Configuration

  • LoRA rank (r): 16
  • LoRA alpha: 32
  • LoRA dropout: 0.1
  • Target modules: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
  • Bias: none
  • Task type: CAUSAL_LM
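Expressed as a PEFT LoraConfig, the configuration above corresponds to:

from peft import LoraConfig

lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.1,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    bias="none",
    task_type="CAUSAL_LM",
)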

Speeds, Sizes, Times

  • Training time: ~1 hour on a consumer GPU
  • Adapter size: ~2MB
  • Training samples: 20,000 (~2% of the 900,000 possible 6-digit operands)

Evaluation

Testing Data, Factors & Metrics

Testing Data

Randomly generated 6-digit numbers not seen during training.

Metrics

  • Exact Match Accuracy: Percentage of predictions that exactly match the correct multiplication result.
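A minimal sketch of how this metric can be computed, assuming a hypothetical generate_answer(prompt) helper that wraps the generation code from the "How to Get Started" section (the sample count is arbitrary):

import random

def exact_match_accuracy(generate_answer, n_samples=500, seed=42):
    # Fraction of predictions that exactly equal the true product.
    rng = random.Random(seed)
    correct = 0
    for _ in range(n_samples):
        a = rng.randint(100000, 999999)  # random 6-digit operand
        prediction = generate_answer(f"{a} * 7").strip()
        correct += prediction == str(a * 7)  # exact string match
    return correct / n_samples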

Results

Model                          Exact Match Accuracy
Base Qwen2.5-0.5B-Instruct     ~3%
With LoRA adapter              ~94%

31x improvement in accuracy with only ~2MB of additional parameters.

Summary

The LoRA adapter successfully teaches the small Qwen2.5-0.5B model to multiply 6-digit numbers by 7 with high accuracy. This demonstrates that LoRA fine-tuning can efficiently encode a specific computational skill, even in a model with a limited parameter count.

Technical Specifications

Model Architecture and Objective

  • Base architecture: Qwen2.5-0.5B-Instruct (decoder-only transformer)
  • Adaptation method: LoRA (Low-Rank Adaptation)
  • Objective: Supervised fine-tuning (SFT) with cross-entropy loss

Compute Infrastructure

Hardware

  • Consumer GPU (tested on NVIDIA RTX series)
  • Also trainable on CPU (slower, uses fp16)

Software

  • Python 3.10+
  • PyTorch >= 2.0.0
  • Transformers >= 4.40.0
  • PEFT >= 0.10.0
  • TRL >= 0.8.0
  • bitsandbytes >= 0.43.0 (for 4-bit quantization support)

Framework versions

  • PEFT 0.18.0
  • Transformers 4.40.0+
  • TRL 0.8.0+