Qwen2.5 Financial Reasoning Model (LoRA Fine-Tuned)
A parameter-efficient fine-tuned version of Qwen2.5-3B-Instruct for step-by-step financial reasoning and numerical problem solving.
Model Details
Model Description
This model is a fine-tuned version of Qwen2.5-3B-Instruct, trained to perform step-by-step financial reasoning and calculations using Low-Rank Adaptation (LoRA).
It is designed to:
- Solve finance-related numerical problems
- Provide structured, step-by-step reasoning explanations
- Support educational use cases around core financial concepts
| Field | Details |
|---|---|
| Developed by | Parth Kadoo |
| Model type | Causal Language Model (LLM) |
| Base model | unsloth/qwen2.5-3b-instruct-unsloth-bnb-4bit |
| Fine-tuning method | LoRA (PEFT) |
| Task | Financial reasoning & text generation |
| Language(s) | English |
| License | MIT |
| PEFT Version | 0.18.1 |
Model Sources
- Repository: Hugging Face Model Hub
- Base Model: unsloth/qwen2.5-3b-instruct-unsloth-bnb-4bit
- Dataset: dreamerdeo/finqa
Uses
Direct Use
This model can be used directly for:
- Financial question answering (e.g., interest calculations, ratio analysis)
- Step-by-step numerical problem solving
- Educational purposes: learning and understanding finance concepts
- Prototyping finance-related NLP applications
Downstream Use
This model can be plugged into larger pipelines for:
- Finance-focused chatbots or tutoring assistants
- Automated financial report summarization tools
- RAG (Retrieval-Augmented Generation) systems combined with financial knowledge bases
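As a minimal sketch of the RAG pattern mentioned above (the function, prompt wording, and sample document here are illustrative, not part of this repo), retrieved passages can simply be prepended to the user's question before it is sent to the model:

```python
def build_rag_prompt(question: str, retrieved_docs: list[str]) -> str:
    """Prepend retrieved financial context to the user's question."""
    context = "\n\n".join(f"[Doc {i + 1}] {doc}" for i, doc in enumerate(retrieved_docs))
    return (
        "Use the context below to answer the question step by step.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )

docs = ["FY2023 revenue was $4.2B, up from $3.8B in FY2022."]
prompt = build_rag_prompt("What was the year-over-year revenue growth?", docs)
print(prompt)
```

The assembled string would then be tokenized and passed to `model.generate` exactly as in the quickstart below; the retriever itself (vector store, embedding model) is out of scope for this card.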
Out-of-Scope Use
This model is not intended for:
- Real-world financial advice or investment decisions
- High-stakes trading or portfolio management systems
- Legal, regulatory, or compliance use cases
- Any production application without human oversight and validation
Bias, Risks, and Limitations
- Trained on a relatively small dataset (~5,500 samples), so it may not generalize to all financial domains
- May produce confident but incorrect answers, especially on complex or unseen problems
- Limited to English language only
- Can hallucinate steps or results in edge cases
- May reflect biases present in the FinQA training dataset
- Not a substitute for professional financial advice
Recommendations
- Use for learning and experimentation only
- Always verify numerical outputs independently before acting on them
- Combine with external validation or rule-based checks in real applications
- Users should be made aware of the model's limitations before deployment
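One way to implement the "external validation" recommendation is a rule-based check that parses the model's final number and compares it against an independent calculation. This is only a sketch under the assumption that outputs end with a `Final Answer:` line, as in the example below:

```python
import re
from typing import Optional


def extract_final_answer(text: str) -> Optional[float]:
    """Pull the number from a 'Final Answer:' line, ignoring currency symbols and commas."""
    match = re.search(r"Final Answer:\s*[^\d\-]*([\d,]+(?:\.\d+)?)", text)
    if match is None:
        return None
    return float(match.group(1).replace(",", ""))


model_output = "Step 3: A = 10,000 x 1.21\nFinal Answer: 12,100"
expected = 10_000 * (1 + 0.10) ** 2  # independent ground-truth calculation
answer = extract_final_answer(model_output)
assert answer is not None and abs(answer - expected) < 1e-6
```

A check like this catches arithmetic hallucinations, but not flawed reasoning that happens to land on a plausible number, so human review is still required.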
How to Get Started with the Model

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

model_name = "Parth7007/qwen2.5-finance-model"
tokenizer = AutoTokenizer.from_pretrained(model_name)
# device_map="auto" (requires accelerate) places the model on GPU when available
model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto")

prompt = "If you invest ₹10,000 at 10% annual interest for 2 years, what is the final amount?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=200)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
Example

Input:

> If you invest ₹10,000 at 10% annual interest for 2 years, what is the final amount?

Output:

```
Step 1: Use the compound interest formula: A = P(1 + r)^t
Step 2: A = 10,000 × (1 + 0.10)^2
Step 3: A = 10,000 × 1.21
Final Answer: ₹12,100
```
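The reasoning above can be verified independently with a one-line compound interest helper (not part of the model repo):

```python
def compound_amount(principal: float, rate: float, years: int) -> float:
    """A = P(1 + r)^t, the formula used in the model's reasoning."""
    return principal * (1 + rate) ** years


print(round(compound_amount(10_000, 0.10, 2), 2))  # 12100.0, i.e. ₹12,100
```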
Training Details
Training Data
- Dataset: dreamerdeo/finqa
- Description: FinQA is a financial reasoning dataset containing numerical and conceptual questions derived from earnings reports and financial documents. It requires multi-step reasoning over structured and unstructured financial data.
- Size: ~5,500 training samples
- Preprocessing: Prompts were formatted as instruction-response pairs for supervised fine-tuning (SFT).
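The instruction-response formatting might look like the sketch below; the field names and prompt template are assumptions, since the actual preprocessing script is not published with this card:

```python
def format_sft_example(question: str, answer: str) -> dict:
    """Format one FinQA record as an instruction-response pair for SFT.

    Template and keys are illustrative, not the card author's exact script.
    """
    return {
        "prompt": f"Answer the following finance question step by step.\n\nQuestion: {question}",
        "response": answer,
    }


pair = format_sft_example("What is 10% of 250?", "Step 1: 250 x 0.10 = 25\nFinal Answer: 25")
print(pair["prompt"])
```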
Training Procedure
Training Hyperparameters
| Parameter | Value |
|---|---|
| Training regime | 4-bit quantization (NF4) with bf16 mixed precision |
| Fine-tuning method | LoRA (Low-Rank Adaptation) |
| Trainable parameters | ~0.48% of total parameters |
| Epochs | 2 |
| Final training loss | ~0.82 |
| Optimizer | AdamW (via Unsloth) |
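To see why only a small fraction of weights trains under LoRA: an adapter on a `d_out × d_in` weight matrix adds just `rank × (d_in + d_out)` parameters. The rank and matrix size below are hypothetical, since the exact LoRA config is not listed in this card:

```python
def lora_params(d_in: int, d_out: int, rank: int) -> int:
    """Parameters added by one LoRA adapter: A is (rank x d_in), B is (d_out x rank)."""
    return rank * (d_in + d_out)


# Hypothetical example: a rank-16 adapter on a 2048x2048 attention projection.
full = 2048 * 2048
added = lora_params(2048, 2048, 16)
print(added, f"{added / full:.2%}")  # 65536 parameters, ~1.56% of that one matrix
```

Averaged over the whole model (most layers get no adapter, and embeddings are frozen), the trainable share drops to well under 1%, consistent with the ~0.48% figure above.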
Speeds, Sizes, Times
| Detail | Value |
|---|---|
| Hardware | Google Colab GPU (T4) |
| Cloud Provider | Google Colab |
| Framework | Transformers + PEFT (Unsloth) |
| Precision | 4-bit quantization (bitsandbytes) |
| Training time | ~1-2 hours (estimated on Colab T4) |
Evaluation
Testing Data
Evaluation was performed on held-out examples from the dreamerdeo/finqa dataset.
Factors
- Standard financial calculations (interest, percentages, ratios)
- Multi-step numerical reasoning
- Structured explanation quality
Metrics
⚠️ No formal benchmark evaluation has been conducted yet; performance was assessed through qualitative observation.
Results
Qualitative Observations
- ✅ Produces correct results for standard financial calculations
- ✅ Demonstrates clear step-by-step reasoning capability
- ✅ Outperforms the base model on structured finance questions
- ⚠️ May struggle with highly complex or domain-specific edge cases
Summary
The model shows meaningful improvement over the base Qwen2.5-3B-Instruct on financial reasoning tasks, particularly for multi-step numerical problems. Formal benchmarking (e.g., on FinQA test split) is planned for future iterations.
Technical Specifications
Model Architecture and Objective
- Architecture: Decoder-only Transformer (Qwen2.5 series)
- Objective: Causal language modeling with SFT on financial instruction-response pairs
- Adaptation: LoRA adapters injected into attention layers for parameter-efficient fine-tuning
Compute Infrastructure
Hardware
- Google Colab T4 GPU (15GB VRAM)
Software
| Library | Version |
|---|---|
| transformers | Latest stable |
| peft | 0.18.1 |
| trl | Latest stable |
| unsloth | Latest stable |
| bitsandbytes | Latest stable |
Environmental Impact
Carbon emissions can be estimated using the Machine Learning Impact calculator presented in Lacoste et al. (2019).
| Field | Details |
|---|---|
| Hardware Type | NVIDIA T4 GPU (Google Colab) |
| Hours used | ~1-2 hours (estimated) |
| Cloud Provider | Google (Colab) |
| Compute Region | US (estimated) |
| Carbon Emitted | Minimal (short training run on shared cloud GPU) |
Citation
If you use this model or find it helpful, please consider citing the original FinQA dataset:
BibTeX:

```bibtex
@inproceedings{chen2021finqa,
  title     = {FinQA: A Dataset of Numerical Reasoning over Financial Data},
  author    = {Chen, Zhiyu and Chen, Wenhu and Smiley, Charese and Shah, Sameena and Borova, Iana and Langdon, Dylan and Moussa, Reema and Beane, Matt and Huang, Ting-Hao and Routledge, Bryan and Wang, William Yang},
  booktitle = {Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing (EMNLP)},
  year      = {2021}
}
```

APA: Chen, Z., Chen, W., Smiley, C., Shah, S., Borova, I., Langdon, D., Moussa, R., Beane, M., Huang, T.-H., Routledge, B., & Wang, W. Y. (2021). FinQA: A dataset of numerical reasoning over financial data. In Proceedings of EMNLP 2021.
Glossary
| Term | Definition |
|---|---|
| LoRA | Low-Rank Adaptation: a PEFT method that trains small adapter matrices instead of full model weights |
| PEFT | Parameter-Efficient Fine-Tuning: techniques to fine-tune large models with minimal trainable parameters |
| SFT | Supervised Fine-Tuning: training on labeled instruction-response pairs |
| 4-bit quantization | Reducing model weights to 4-bit precision to lower memory usage |
| FinQA | A financial reasoning benchmark requiring numerical reasoning over earnings reports |
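The memory saving behind the 4-bit quantization entry is simple arithmetic (the 3B parameter count is approximate, and runtime overhead is ignored):

```python
def model_memory_gb(n_params: float, bits_per_param: int) -> float:
    """Approximate weight-storage footprint, ignoring activations and overhead."""
    return n_params * bits_per_param / 8 / 1e9


n = 3e9  # ~3B parameters (Qwen2.5-3B)
print(f"fp16: {model_memory_gb(n, 16):.1f} GB, 4-bit: {model_memory_gb(n, 4):.1f} GB")
# fp16: 6.0 GB, 4-bit: 1.5 GB
```

This 4x reduction is what makes fine-tuning a 3B model feasible on a single Colab T4.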
Model Card Authors
Parth Kadoo
Model Card Contact
For questions, feedback, or collaboration, feel free to reach out via Hugging Face.
Framework Versions
- peft 0.18.1