nielsr's picture
nielsr HF Staff
Improve model card and metadata
0b524d2 verified
|
raw
history blame
2.91 kB
metadata
base_model: Qwen/Qwen2.5-VL-3B-Instruct
library_name: peft
pipeline_tag: image-text-to-text
license: apache-2.0
tags:
  - finance
  - lora
  - vision-language
  - qwen

PyFi-QwenVL-3B-47K

This model is a fine-tuned LoRA adapter for Qwen2.5-VL-3B-Instruct, specialized for financial image understanding. It was introduced as part of the PyFi framework.

Model Details

Summary

PyFi (Pyramid-like Financial Image Understanding) is a framework designed to enhance Visual Language Models (VLMs) in understanding complex financial images through adversarial agents. The framework enables VLMs to reason through question chains in a progressive, simple-to-complex manner across six hierarchical capability levels:

  1. Perception: Basic visual understanding
  2. Data Extraction: Foundational information retrieval
  3. Calculation Analysis: Numerical analysis tasks
  4. Pattern Recognition: Identifying trends and patterns
  5. Logical Reasoning: Complex logical analysis
  6. Decision Support: Strategic decision-making assistance

This specific checkpoint was fine-tuned on approximately 47,000 question-answer chains from the PyFi-600K dataset. In this version ("w/o CoT"), only the question and answer from the final sample in each chain were used during training to target the ultimate reasoning goal.

Training Details

The model was fine-tuned with the following configuration:

  • Optimizer: AdamW
  • Learning Rate: $1.0 \times 10^{-4}$
  • Learning Rate Schedule: Cosine scheduling with warmup ratio of 0.1
  • Training Epochs: 1
  • Batch Size: Effective batch size of 8
  • PEFT: LoRA with full-module adaptation
  • Hardware: 4x NVIDIA RTX 5090 GPUs

Citation

If you use PyFi in your research, please cite:

@article{pyfi2025,
  title={PyFi: Toward Pyramid-like Financial Image Understanding for VLMs via Adversarial Agents},
  author={Zhang, Yuqun and Zhao, Yuxuan and Chen, Sijia},
  journal={arXiv preprint arXiv:2512.14735},
  year={2025}
}

Contact

For questions or inquiries, please contact: