AITRADER committed on
Commit bacf249 · verified · 1 Parent(s): 704bb5b

Update README with usage instructions

Files changed (1)
  1. README.md +37 -102
README.md CHANGED
@@ -1,136 +1,71 @@
  ---
  license: other
  license_name: qwen
- license_link: https://huggingface.co/Qwen/Qwen2.5-VL-7B-Instruct/blob/main/LICENSE
- base_model: Qwen/Qwen3-VL-4B-Instruct
+ license_link: https://huggingface.co/Qwen/Qwen3-VL-4B/blob/main/LICENSE
  tags:
+ - qwen3_vl
+ - image-to-text
  - vision-language-model
  - finance
- - ocr
+ - OCR
  - chart-understanding
  - financial-analysis
- - qwen3-vl
- language:
- - en
- pipeline_tag: image-text-to-text
- library_name: transformers
  ---
 
  # Amsi-fin: Financial Vision-Language Model
 
- A specialized Vision-Language Model fine-tuned for financial document understanding, OCR, chart analysis, and chain-of-thought reasoning.
+ Fine-tuned Qwen3-VL-4B for financial document understanding, chart analysis, and financial reasoning.
 
- ## Model Details
+ ## Quick Start
 
- | Property | Value |
- |----------|-------|
- | **Base Model** | Qwen3-VL-4B-Instruct |
- | **Parameters** | 4 Billion |
- | **Precision** | BF16 |
- | **Context Length** | 131,072 tokens |
- | **Training Stages** | 4 (Progressive Fine-tuning) |
+ ### MLX (Apple Silicon)
 
- ## Capabilities
-
- - **Financial Document OCR**: Extract text from financial reports, statements, and documents
- - **Chart Understanding**: Analyze and interpret financial charts and graphs
- - **Chain-of-Thought Reasoning**: Step-by-step financial analysis and calculations
- - **Mathematical Reasoning**: Financial calculations and numerical analysis
-
- ## Training Data
-
- The model was trained on a curated mix of financial datasets:
+ ```python
+ from mlx_vlm import load, generate
 
- | Stage | Focus | Datasets |
- |-------|-------|----------|
- | A1 | Foundation | FinTrain (70%), FinTrain-Math (15%), OCR (10%), ChartQA (5%) |
- | A2 | Vision/OCR | MultiFinBen-OCR (50%), SecureFinAI-OCR (20%), ChartQA (20%), NuminaMath (10%) |
- | A3 | Reasoning | FinCoT (60%), FinTrain (30%), OCR (5%), ChartQA (5%) |
- | A4 | Consolidation | FinTrain (40%), OCR (20%), FinCoT (20%), ChartQA (10%), NuminaMath (10%) |
+ # IMPORTANT: Use fix_mistral_regex=True
+ model, processor = load('AITRADER/Amsi-fin', fix_mistral_regex=True)
 
- ## Training Configuration
+ # Vision task
+ output = generate(
+ model, processor,
+ image='chart.png',
+ prompt='<|vision_start|><|image_pad|><|vision_end|>Analyze this chart.',
+ max_tokens=500
+ )
 
- ```yaml
- per_device_batch_size: 4
- gradient_accumulation_steps: 2
- learning_rate: 6.0e-6 (final stage)
- max_seq_length: 1024
- precision: bf16
- optimizer: AdamW (fused)
- total_steps: 7000 (across all stages)
+ # Text-only
+ output = generate(
+ model, processor,
+ prompt='Calculate debt-to-equity ratio if debt=120M, equity=80M.',
+ max_tokens=200
+ )
  ```
 
- ## Usage
-
- ### With Transformers
+ ### Transformers (CUDA/CPU)
 
  ```python
  from transformers import AutoProcessor, AutoModelForVision2Seq
  import torch
 
- model_name = "AITRADER/Amsi-fin"
-
- processor = AutoProcessor.from_pretrained(model_name, trust_remote_code=True)
+ processor = AutoProcessor.from_pretrained('AITRADER/Amsi-fin', trust_remote_code=True)
  model = AutoModelForVision2Seq.from_pretrained(
- model_name,
+ 'AITRADER/Amsi-fin',
  torch_dtype=torch.bfloat16,
- device_map="auto",
  trust_remote_code=True
  )
-
- # Example: Analyze a financial document
- from PIL import Image
-
- image = Image.open("financial_report.png")
- messages = [
- {"role": "user", "content": [
- {"type": "image"},
- {"type": "text", "text": "Analyze this financial report and summarize the key metrics."}
- ]}
- ]
-
- inputs = processor(messages, images=[image], return_tensors="pt").to(model.device)
- outputs = model.generate(**inputs, max_new_tokens=512)
- response = processor.decode(outputs[0], skip_special_tokens=True)
- print(response)
  ```
 
- ### Convert to MLX (Apple Silicon)
-
- ```bash
- # Install mlx-lm
- pip install mlx-lm
-
- # Convert to MLX 8-bit quantized
- mlx_lm.convert --hf-path AITRADER/Amsi-fin -q --upload-repo AITRADER/Amsi-fin-MLX-8bit
-
- # Convert to MLX bf16
- mlx_lm.convert --hf-path AITRADER/Amsi-fin --upload-repo AITRADER/Amsi-fin-MLX-bf16
- ```
-
- ## Limitations
-
- - Optimized for English financial documents
- - Best performance on structured financial data (tables, charts, reports)
- - May require fine-tuning for specific financial domains
-
- ## License
-
- This model is released under the same license as the base Qwen3-VL model.
-
- ## Citation
+ ## Capabilities
 
- ```bibtex
- @misc{amsi-fin-2025,
- title={Amsi-fin: Financial Vision-Language Model},
- author={AITRADER},
- year={2025},
- publisher={HuggingFace},
- url={https://huggingface.co/AITRADER/Amsi-fin}
- }
- ```
+ - Financial Document OCR
+ - Chart/Graph Understanding
+ - Financial Reasoning & Calculations
+ - Table Extraction
 
- ## Acknowledgments
+ ## Training Data
 
- - Base model: [Qwen3-VL](https://huggingface.co/Qwen/Qwen3-VL-4B-Instruct)
- - Training datasets: FinTrain, FinCoT, MultiFinBen, ChartQA, NuminaMath
+ - FinTrain (Salesforce)
+ - MultiFinBen-EnglishOCR
+ - ChartQA
+ - FinCoT
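
Note on the text-only example added in this commit: the prompt asks for a debt-to-equity ratio with debt=120M and equity=80M. A minimal sketch of the reference arithmetic, which may help when checking the model's answer, is below. The helper name `debt_to_equity` is hypothetical and is not part of the repository or of any library used in the README.

```python
def debt_to_equity(total_debt: float, total_equity: float) -> float:
    """Return total debt divided by total equity (hypothetical helper)."""
    if total_equity == 0:
        raise ValueError("total_equity must be non-zero")
    return total_debt / total_equity


# Values taken from the README prompt: debt = 120M, equity = 80M.
ratio = debt_to_equity(120e6, 80e6)
print(f"Debt-to-equity ratio: {ratio:.2f}")
```

A model response that reports a ratio of 1.5 (or 150%) agrees with this calculation.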