finanalyst-qwen1.5b

A financial analyst LLM fine-tuned from Qwen2.5-1.5B-Instruct using QLoRA (4-bit NF4 quantisation + LoRA adapters) on custom instruction-tuning data generated from live stock market data. The model produces analyst-style responses for stock analysis, market overviews, and free-form financial Q&A.

It runs fully locally — no API keys required. On Apple Silicon it uses MPS acceleration; on a machine with a CUDA GPU it uses that automatically; otherwise it falls back to CPU.


Model Details

Model Description

finanalyst-qwen1.5b is a parameter-efficient fine-tune of Qwen/Qwen2.5-1.5B-Instruct trained to behave like a senior sell-side analyst. It was fine-tuned using QLoRA: the base model is loaded in 4-bit NF4 quantisation (keeping weights at ~1 GB) while small LoRA adapter matrices are injected into the attention projection layers and trained on financial instruction data. Only 0.14% of the total parameters (2.18M out of 1.54B) were updated during training.
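
The trainable-parameter figure can be reproduced with a few lines of arithmetic. The sketch below assumes the attention geometry of the published Qwen2.5-1.5B configuration (hidden size 1536, 28 layers, 2 key/value heads of dimension 128); those values are not stated in this card. The rank and target modules come from the Training Hyperparameters section.

```python
# Assumed Qwen2.5-1.5B attention geometry (from the base model's published
# config; not stated in this card).
HIDDEN_SIZE = 1536
NUM_LAYERS = 28
NUM_KV_HEADS = 2
HEAD_DIM = 128
LORA_RANK = 16  # r, as listed under Training Hyperparameters


def lora_params(in_features: int, out_features: int, rank: int) -> int:
    """LoRA adds A (rank x in) and B (out x rank) beside a frozen linear layer."""
    return rank * in_features + out_features * rank


q_proj = lora_params(HIDDEN_SIZE, HIDDEN_SIZE, LORA_RANK)              # 49,152
v_proj = lora_params(HIDDEN_SIZE, NUM_KV_HEADS * HEAD_DIM, LORA_RANK)  # 28,672
trainable = NUM_LAYERS * (q_proj + v_proj)

TOTAL_PARAMS = 1_545_893_376  # total reported in this card
print(f"{trainable:,} trainable ({100 * trainable / TOTAL_PARAMS:.2f}%)")
# prints: 2,179,072 trainable (0.14%)
```

The v_proj adapter is smaller than the q_proj adapter because, under the assumed grouped-query attention layout, the value projection maps to only 256 output features.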

The model is the generative core of the AI Stock Market Analyst CLI — a Bloomberg-style terminal that uses this model for all text generation (stock deep-dives, market overviews, watchlist digests, and natural-language Q&A).

  • Developed by: Florian Braun (@iPwnds)
  • Model type: Causal language model — instruction-tuned with QLoRA
  • Language: English
  • License: Apache 2.0
  • Fine-tuned from: Qwen/Qwen2.5-1.5B-Instruct


Uses

Direct Use

The model is designed to generate analyst-style financial commentary given a structured prompt containing live market data (price, fundamentals, news sentiment). It handles three task types out of the box:

  • Stock analysis — given price history, fundamentals, and sentiment, writes a deep-dive covering valuation, momentum, catalysts, and risks.
  • Market overview — given index performance, sector rotation, and top movers, writes a macro narrative.
  • Analyst Q&A — answers free-form financial questions in plain English, referencing provided data where available.
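
As a rough illustration of what a "structured prompt" for the stock analysis task might look like, the sketch below packs a data snapshot and headlines into a system/user message pair. The function name `build_stock_messages`, the field names, and the layout of the user message are hypothetical; the CLI's actual prompt format is not documented here.

```python
from typing import Any


def build_stock_messages(
    ticker: str,
    snapshot: dict[str, Any],
    headlines: list[str],
) -> list[dict[str, str]]:
    """Pack market data into a system + user message pair (hypothetical layout)."""
    data_lines = [f"- {key}: {value}" for key, value in snapshot.items()]
    news_lines = [f"- {headline}" for headline in headlines] or ["- (no recent headlines)"]
    user_content = "\n".join(
        [
            f"Ticker: {ticker}",
            "Market data:",
            *data_lines,
            "Recent headlines:",
            *news_lines,
            "Write a deep-dive covering valuation, momentum, catalysts, and risks.",
        ]
    )
    return [
        {"role": "system", "content": "You are a senior equity analyst. Be concise and data-driven."},
        {"role": "user", "content": user_content},
    ]


# Placeholder values for illustration only, not real market data.
messages = build_stock_messages(
    "XYZ",
    {"price": 100.0, "1m_return_pct": 4.2, "rsi_14": 61.5, "pe_ratio": 25.0},
    ["XYZ announces new product line"],
)
print(messages[1]["content"])
```

The resulting `messages` list has the same role/content shape that the text-generation pipeline accepts.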

Downstream Use

The model plugs directly into the AI Stock Market Analyst CLI via analysis/llm.py, which loads it with AutoPeftModelForCausalLM, merges the LoRA adapter into the base weights for faster inference, and exposes ask_llm() / ask_llm_json() helper functions used throughout the application.
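
The implementation of ask_llm_json() is not shown in this card. A helper of that kind usually has to recover a JSON object from a reply that may contain surrounding prose, so the sketch below shows one way that step could work. The name `parse_json_reply` is hypothetical and this is not the code in analysis/llm.py.

```python
import json


def parse_json_reply(reply: str) -> dict:
    """Return the first JSON object found in a model reply, or raise ValueError."""
    decoder = json.JSONDecoder()
    for start, char in enumerate(reply):
        if char != "{":
            continue
        try:
            # raw_decode parses one JSON value starting at `start` and
            # ignores any trailing text after it.
            value, _ = decoder.raw_decode(reply, start)
        except json.JSONDecodeError:
            continue  # not valid JSON from this brace; try the next one
        return value
    raise ValueError("no JSON object found in reply")


reply = 'Summary {draft}: {"sentiment": "neutral", "risks": ["valuation"]} Hope this helps.'
print(parse_json_reply(reply))
```

Scanning from each opening brace keeps the helper tolerant of chatty preambles, which small models tend to produce even when asked for JSON only.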

It can also be used as a drop-in instruction-following LLM for any financial NLP pipeline that needs analyst-style prose generation.

Out-of-Scope Use

  • Real-time trading decisions — the model does not have access to live data and does not produce structured buy/sell signals. Any output should be treated as educational commentary, not financial advice.
  • Precise numerical forecasting — price targets or earnings estimates produced by the model are illustrative, not quantitative predictions.
  • Non-English text — training data was English-only.
  • Domains outside finance — the fine-tuning data is domain-specific; performance on general instruction-following tasks may be degraded compared to the base model.

Bias, Risks, and Limitations

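  • The training set contains only 68 examples across three task types, and the mix is heavily skewed: 46 Q&A examples, 21 stock analyses, and a single market overview. The model is therefore susceptible to overfitting to the style and tickers present in the training data, and its market-overview behaviour likely relies mostly on the base model's existing ability. Generalisation to less common or non-US equities may be weaker.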
  • Training data was generated from a single point-in-time snapshot of live market data. The model may reproduce market conditions or narratives from that period.
  • The base model, Qwen2.5-1.5B-Instruct, may carry biases inherited from its pre-training corpus.
  • At 1.5B parameters the model is relatively small. Responses are generally coherent but may occasionally hallucinate specific figures (e.g. exact P/E ratios or earnings dates) if not grounded by a data-rich prompt.

Recommendations

Always supply the model with current, factual market data in the prompt (price, fundamentals, news). Do not rely on the model's parametric knowledge for specific numerical claims. All output should be reviewed by a qualified professional before informing any financial decision.
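
One lightweight way to act on this recommendation is to flag figures in a response that never appeared in the prompt. The sketch below is a crude heuristic under stated assumptions: the helper name `ungrounded_numbers` is hypothetical, it matches plain decimal numbers only, and it will miss reformatted values (for example "25" versus "25.0").

```python
import re

# Matches integers and simple decimals such as "25" or "61.5".
NUMBER = re.compile(r"\d+(?:\.\d+)?")


def ungrounded_numbers(prompt: str, response: str) -> list[str]:
    """List numeric tokens in the response that never appear in the prompt."""
    known = set(NUMBER.findall(prompt))
    return [num for num in NUMBER.findall(response) if num not in known]


# Placeholder strings for illustration only, not real market data.
prompt = "Price: 100.0, P/E: 25.0, RSI: 61.5"
response = "At a P/E of 25.0 and RSI of 61.5, the stock looks stretched versus a fair value of 112.4."
print(ungrounded_numbers(prompt, response))
# prints: ['112.4']
```

A flagged number is not necessarily wrong, but it is a claim that came from the model's parametric knowledge rather than from the supplied data and deserves a manual check.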


How to Get Started with the Model

from peft import AutoPeftModelForCausalLM
from transformers import AutoTokenizer, pipeline
import torch

MODEL = "iPwnds/finanalyst-qwen1.5b"

# Detect device
if torch.backends.mps.is_available():
    device_map, dtype = {"": "mps"}, torch.float16
elif torch.cuda.is_available():
    device_map, dtype = "auto", torch.float16
else:
    device_map, dtype = {"": "cpu"}, torch.float32

# Load base model + LoRA adapter, then merge for faster inference
model = AutoPeftModelForCausalLM.from_pretrained(
    MODEL,
    torch_dtype=dtype,
    device_map=device_map,
).merge_and_unload()

tokenizer = AutoTokenizer.from_pretrained(MODEL)

pipe = pipeline("text-generation", model=model, tokenizer=tokenizer)

messages = [
    {"role": "system", "content": "You are a senior equity analyst. Be concise and data-driven."},
    {"role": "user",   "content": "Is NVDA overbought at current levels given its AI growth story?"},
]

result = pipe(messages, max_new_tokens=512, temperature=0.3, do_sample=True)
print(result[0]["generated_text"][-1]["content"])

The model uses the ChatML chat template (<|im_start|> / <|im_end|>) inherited from Qwen2.5-1.5B-Instruct. Always pass messages as a list of role/content dicts rather than raw strings.


Training Details

Training Data

Training data was generated programmatically by scripts/generate_training_data.py, which:

  1. Fetches live fundamentals, price history, and news via yfinance for a curated list of 23 tickers (large-cap US equities across major sectors).
  2. Constructs structured prompts containing price returns, RSI, P/E, market cap, revenue, and news headlines.
  3. Calls the base Qwen2.5-1.5B-Instruct model to generate reference analyst responses.
  4. Saves the prompt–response pairs as JSONL in the {"instruction", "input", "output", "task_type", "ticker"} format.
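
To make the record format in step 4 concrete, the sketch below serialises one record to JSONL and reads it back. The field names come from the format listed above; the field values and the helper names `to_jsonl` and `from_jsonl` are placeholders for illustration, not output of scripts/generate_training_data.py.

```python
import json

FIELDS = ("instruction", "input", "output", "task_type", "ticker")


def to_jsonl(records: list[dict]) -> str:
    """Serialise records as one JSON object per line, keeping only known fields."""
    return "".join(json.dumps({key: rec[key] for key in FIELDS}) + "\n" for rec in records)


def from_jsonl(text: str) -> list[dict]:
    """Parse JSONL text back into a list of records, skipping blank lines."""
    return [json.loads(line) for line in text.split("\n") if line.strip()]


# Placeholder record, not real training data.
record = {
    "instruction": "You are a senior equity analyst. Be concise and data-driven.",
    "input": "Ticker: XYZ\nRSI: 61.5\nP/E: 25.0",
    "output": "XYZ trades at a premium multiple with neutral momentum.",
    "task_type": "stock_analysis",
    "ticker": "XYZ",
}

jsonl_text = to_jsonl([record])
print(from_jsonl(jsonl_text) == [record])
```

Newlines inside a field are escaped by `json.dumps`, so each record still occupies exactly one line of the file.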

The final dataset contains 68 instruction pairs split across:

| Task type | Count |
|---|---|
| stock_analysis | 21 |
| ask (free-form Q&A) | 46 |
| market_overview | 1 |

The 90/10 train/test split gives 61 training examples and 7 evaluation examples.
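
The split sizes follow from the task counts. The sketch below assumes the test size is rounded up, which is how the `datasets` library's `train_test_split` is understood to treat a fractional `test_size`; that rounding rule is an assumption, not something stated in this card.

```python
import math

# Task counts from the table in this section.
counts = {"stock_analysis": 21, "ask": 46, "market_overview": 1}
total = sum(counts.values())                 # 68

TEST_FRACTION = 0.10
n_test = math.ceil(TEST_FRACTION * total)    # ceil(6.8) = 7 (assumed rounding rule)
n_train = total - n_test                     # 61

print(total, n_train, n_test)
# prints: 68 61 7
```

Because there is only one market_overview example, it ends up entirely in either the training set or the evaluation set.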

Training Procedure

Fine-tuning was performed in Google Colab on a T4 GPU (15 GB VRAM) using SFTTrainer from the trl library.

Preprocessing

Each example was formatted into the Qwen2.5 ChatML template:

<|im_start|>system
{instruction}<|im_end|>
<|im_start|>user
{input}<|im_end|>
<|im_start|>assistant
{output}<|im_end|>

Sequences were truncated to a maximum of 512 tokens (set via tokenizer.model_max_length).
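
The template above maps directly onto the dataset fields. The sketch below renders one record into that layout. The helper name `format_example` and the record values are hypothetical, and truncation to 512 tokens is left to the tokenizer, so it is not shown.

```python
CHATML_TEMPLATE = (
    "<|im_start|>system\n{instruction}<|im_end|>\n"
    "<|im_start|>user\n{input}<|im_end|>\n"
    "<|im_start|>assistant\n{output}<|im_end|>"
)


def format_example(record: dict) -> str:
    """Render an instruction/input/output record in the ChatML layout above."""
    return CHATML_TEMPLATE.format(
        instruction=record["instruction"],
        input=record["input"],
        output=record["output"],
    )


# Placeholder record, not real training data.
example = format_example(
    {
        "instruction": "You are a senior equity analyst.",
        "input": "Ticker: XYZ\nP/E: 25.0",
        "output": "XYZ trades at a premium multiple.",
    }
)
print(example)
```

At inference time the tokenizer's chat template produces the same layout from a list of role/content messages, ending with an open assistant turn for the model to complete.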

Training Hyperparameters

| Hyperparameter | Value |
|---|---|
| Base model | Qwen/Qwen2.5-1.5B-Instruct |
| Quantisation | 4-bit NF4 + double quantisation |
| Compute dtype | float16 |
| LoRA rank (r) | 16 |
| LoRA alpha | 32 |
| LoRA target modules | q_proj, v_proj |
| LoRA dropout | 0.05 |
| Trainable parameters | 2,179,072 / 1,545,893,376 (0.14%) |
| Epochs | 3 |
| Per-device batch size | 1 |
| Gradient accumulation steps | 8 (effective batch size: 8) |
| Learning rate | 2e-4 |
| Max sequence length | 512 tokens |
| Mixed precision | None (fp16=False, bf16=False) |
| Gradient checkpointing | Enabled |
| Optimizer | AdamW (default) |

Note on mixed precision: Qwen2.5's layer norms remain in BFloat16 regardless of dtype, which conflicts with PyTorch's fp16 AMP scaler. Setting fp16=False, bf16=False lets bitsandbytes handle precision internally and avoids the _amp_foreach_non_finite_check_and_unscale_cuda error on T4.
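
For readers who want to reproduce the setup, the quantisation and adapter hyperparameters listed above might be expressed as the following configuration objects. This is a sketch, not the training script: `bias="none"` and `task_type="CAUSAL_LM"` are assumptions that the card does not state, and running it requires torch, transformers, and peft to be installed.

```python
import torch
from peft import LoraConfig
from transformers import BitsAndBytesConfig

# 4-bit NF4 quantisation with double quantisation and float16 compute.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.float16,
)

# LoRA adapters on the query and value projections of each attention block.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],
    lora_dropout=0.05,
    bias="none",            # assumption: PEFT default, not stated in this card
    task_type="CAUSAL_LM",  # assumption: implied by the causal LM objective
)
```

In a typical QLoRA workflow, `bnb_config` is passed as `quantization_config` when loading the base model and `lora_config` is passed as `peft_config` to SFTTrainer.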

Speeds, Sizes, Times

| Item | Value |
|---|---|
| Training time | ~4 minutes (T4 GPU, Google Colab) |
| Total steps | 24 |
| Adapter size (pushed to Hub) | ~8 MB |
| Base model size (downloaded at inference) | ~3 GB |
| Training loss (final) | 1.5266 |
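
The step count and file sizes are consistent with the hyperparameters, as the back-of-the-envelope sketch below shows. It assumes that the trainer counts the final partial accumulation batch of each epoch as a step, that adapter weights are stored as 32-bit floats, and that base weights are stored as 16-bit floats; none of these is stated in this card.

```python
import math

TRAIN_EXAMPLES = 61
PER_DEVICE_BATCH = 1
GRAD_ACCUM = 8
EPOCHS = 3

effective_batch = PER_DEVICE_BATCH * GRAD_ACCUM                 # 8
steps_per_epoch = math.ceil(TRAIN_EXAMPLES / effective_batch)   # ceil(7.625) = 8
total_steps = steps_per_epoch * EPOCHS                          # 24

TRAINABLE_PARAMS = 2_179_072
TOTAL_PARAMS = 1_545_893_376
BASE_PARAMS = TOTAL_PARAMS - TRAINABLE_PARAMS

adapter_mb = TRAINABLE_PARAMS * 4 / 1e6   # 4 bytes per float32 weight (assumed)
base_gb = BASE_PARAMS * 2 / 1e9           # 2 bytes per 16-bit weight (assumed)

print(total_steps, round(adapter_mb, 1), round(base_gb, 1))
# prints: 24 8.7 3.1
```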

Evaluation

Testing Data

A random 10% split (7 examples, seed=42) held out from the same generated dataset. The small size means evaluation loss should be interpreted qualitatively — it indicates whether the model is learning the response style rather than serving as a rigorous benchmark.

Metrics

Evaluation loss (cross-entropy on held-out completions) was the primary metric monitored during training. No external benchmarks were run; the model is evaluated end-to-end within the CLI application by inspecting output quality on real analyst prompts.

Results

| Metric | Value |
|---|---|
| Final training loss | 1.5266 |
| Final evaluation loss | 1.4721 |
| Eval runtime | 3.78 s (7 samples) |

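The evaluation loss is slightly lower than the training loss. With only 7 held-out examples, a gap of this size is likely within sampling noise, so it is best read as the absence of obvious overfitting after 3 epochs rather than as evidence of generalisation. The held-out examples are also drawn from the same generated dataset as the training set, so the result says little about new tickers or market conditions.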
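
Cross-entropy losses are easier to interpret as per-token perplexity, which is the exponential of the loss. The sketch below converts the two reported values, assuming both are mean per-token cross-entropy in nats, as is standard for causal LM training.

```python
import math

# Losses reported in the Results table.
train_loss = 1.5266
eval_loss = 1.4721

# Perplexity is exp(mean per-token cross-entropy in nats).
train_ppl = math.exp(train_loss)
eval_ppl = math.exp(eval_loss)

print(f"train perplexity ~ {train_ppl:.2f}, eval perplexity ~ {eval_ppl:.2f}")
```

Both values come out between 4 and 5, meaning the model is on average about as uncertain as a uniform choice among four to five tokens at each position.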

Summary

The model successfully learns the instruction-following format and analyst prose style within 3 epochs on 61 examples. Responses are coherent, stay on-topic, and reference the data provided in the prompt. The primary limitation is dataset size — broader coverage of tickers, task types, and market conditions would improve robustness.


Environmental Impact

Training was performed on a Google Colab T4 GPU for approximately 4 minutes. Estimated carbon emissions are negligible at this scale.

  • Hardware type: NVIDIA T4 (Google Colab)
  • Hours used: ~0.07 hours
  • Cloud provider: Google (Colab)
  • Compute region: US (Colab default)
  • Carbon emitted: < 1 g CO₂eq (estimated)

Technical Specifications

Model Architecture and Objective

  • Base architecture: Qwen2.5-1.5B-Instruct (decoder-only transformer, 1.54B parameters)
  • Fine-tuning method: QLoRA — 4-bit NF4 weight quantisation via bitsandbytes, with low-rank adapter matrices (LoRA) injected into the q_proj and v_proj layers of each attention block
  • Objective: Next-token prediction (causal LM) on ChatML-formatted instruction–response pairs
  • Chat format: ChatML (<|im_start|> / <|im_end|>)

Compute Infrastructure

Hardware

  • Google Colab T4 GPU (15 GB VRAM) for training
  • Apple Silicon (MPS), CUDA GPU, or CPU for inference

Software

| Package | Role |
|---|---|
| transformers | Model loading, tokenisation, pipeline |
| peft | LoRA adapter injection and AutoPeftModelForCausalLM |
| trl | SFTTrainer / SFTConfig for supervised fine-tuning |
| bitsandbytes ≥ 0.46.1 | 4-bit NF4 quantisation |
| accelerate | Device placement and distributed training |
| datasets | Dataset formatting and train/test split |

Model Card Authors

Florian Braun (@iPwnds)

Model Card Contact

huggingface.co/iPwnds
