---
license: mit
tags:
- finance
- llm
- lora
- sentiment-analysis
- named-entity-recognition
- xbrl
pipeline_tag: text-generation
---

# FinLoRA: Financial Large Language Models with LoRA Adaptation

[![Python 3.8+](https://img.shields.io/badge/python-3.8+-blue.svg)](https://www.python.org/downloads/)
[![PyTorch](https://img.shields.io/badge/PyTorch-2.0+-red.svg)](https://pytorch.org/)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)

## Overview

FinLoRA is a framework for fine-tuning large language models on financial tasks using Low-Rank Adaptation (LoRA). This repository contains trained LoRA adapters for a range of financial NLP tasks, including sentiment analysis, named entity recognition, headline classification, and XBRL processing, along with **RAG-enhanced models** for CFA knowledge and FinTagging tasks.

## Model Architecture

- **Base Model**: Meta-Llama-3.1-8B-Instruct
- **Adaptation Method**: LoRA (Low-Rank Adaptation)
- **Quantization**: 8-bit and 4-bit quantization support
- **Tasks**: Financial sentiment analysis, NER, classification, XBRL processing, CFA knowledge, FinTagging

## Available Models

### 8-bit Quantized Models (Recommended)
- `sentiment_llama_3_1_8b_8bits_r8` - Financial sentiment analysis
- `ner_llama_3_1_8b_8bits_r8` - Named entity recognition
- `headline_llama_3_1_8b_8bits_r8` - Financial headline classification
- `xbrl_extract_llama_3_1_8b_8bits_r8` - XBRL tag extraction
- `xbrl_term_llama_3_1_8b_8bits_r8` - XBRL terminology processing
- `financebench_llama_3_1_8b_8bits_r8` - Comprehensive financial benchmark
- `finer_llama_3_1_8b_8bits_r8` - Financial NER
- `formula_llama_3_1_8b_8bits_r8` - Financial formula processing

### RAG-Enhanced Models (New!)
- `cfa_rag_llama_3_1_8b_8bits_r8` - CFA knowledge-enhanced model with RAG
- `fintagging_combined_rag_llama_3_1_8b_8bits_r8` - Combined FinTagging RAG model
- `fintagging_fincl_rag_llama_3_1_8b_8bits_r8` - FinCL RAG-enhanced model
- `fintagging_finni_rag_llama_3_1_8b_8bits_r8` - FinNI RAG-enhanced model

### 4-bit Quantized Models (Memory Efficient)
- `sentiment_llama_3_1_8b_4bits_r4` - Financial sentiment analysis
- `ner_llama_3_1_8b_4bits_r4` - Named entity recognition
- `headline_llama_3_1_8b_4bits_r4` - Financial headline classification
- `xbrl_extract_llama_3_1_8b_4bits_r4` - XBRL tag extraction
- `xbrl_term_llama_3_1_8b_4bits_r4` - XBRL terminology processing
- `financebench_llama_3_1_8b_4bits_r4` - Comprehensive financial benchmark
- `finer_llama_3_1_8b_4bits_r4` - Financial NER
- `formula_llama_3_1_8b_4bits_r4` - Financial formula processing

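The adapter names above follow a fixed `<task>_llama_3_1_8b_<bits>bits_r<rank>` pattern, so the quantization bits and LoRA rank can be recovered from a name programmatically. A minimal sketch (the helper is ours, not part of the repository's API):

```python
import re

# Naming convention observed in the model lists above:
#   <task>_llama_3_1_8b_<bits>bits_r<rank>
_NAME_RE = re.compile(r"^(?P<task>.+)_llama_3_1_8b_(?P<bits>\d+)bits_r(?P<rank>\d+)$")

def parse_adapter_name(name):
    """Split an adapter name into its task, quantization bits, and LoRA rank."""
    m = _NAME_RE.match(name)
    if m is None:
        raise ValueError(f"unrecognized adapter name: {name!r}")
    return {
        "task": m.group("task"),
        "bits": int(m.group("bits")),
        "rank": int(m.group("rank")),
    }

print(parse_adapter_name("xbrl_extract_llama_3_1_8b_4bits_r4"))
# {'task': 'xbrl_extract', 'bits': 4, 'rank': 4}
```

The RAG variants parse the same way; e.g. `fintagging_finni_rag_llama_3_1_8b_8bits_r8` yields the task `fintagging_finni_rag`.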
## Quick Start

### 1. Installation

```bash
# Install dependencies
pip install -r requirements.txt
```

### 2. Basic Usage

```python
from inference import FinLoRAPredictor

# Initialize predictor with an 8-bit model (recommended)
predictor = FinLoRAPredictor(
    model_name="sentiment_llama_3_1_8b_8bits_r8",
    use_4bit=False
)

# Financial sentiment analysis
sentiment = predictor.classify_sentiment(
    "The company's quarterly earnings exceeded expectations by 20%."
)
print(f"Sentiment: {sentiment}")

# Entity extraction
entities = predictor.extract_entities(
    "Apple Inc. reported revenue of $394.3 billion in 2022."
)
print(f"Entities: {entities}")

# Use a 4-bit model for memory efficiency (if you have limited GPU memory)
predictor_4bit = FinLoRAPredictor(
    model_name="sentiment_llama_3_1_8b_4bits_r4",
    use_4bit=True
)

# CPU-only mode (if no GPU is available)
predictor_cpu = FinLoRAPredictor(
    model_name="sentiment_llama_3_1_8b_8bits_r8",
    use_4bit=False
)
# The script automatically detects CPU and adjusts accordingly
```

### 3. Run a Complete Test

```bash
# Test all models (this will download the base Llama model if not present)
python inference.py

# Test a specific model
python -c "
from inference import FinLoRAPredictor
predictor = FinLoRAPredictor('sentiment_llama_3_1_8b_8bits_r8')
print('Model loaded successfully!')
"
```

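Testing all models means one failed load should not abort the whole run. A sketch of the bookkeeping such a smoke test needs (the helper and its signature are illustrative, not the actual `inference.py` logic; a real loader callable would be passed in place of `load_fn`):

```python
def smoke_test(model_names, load_fn):
    """Try to load each adapter, collecting failures instead of aborting.

    `load_fn` is any callable that raises on a failed load, e.g.
    `lambda name: FinLoRAPredictor(name)` in this repository.
    """
    failures = {}
    for name in model_names:
        try:
            load_fn(name)
        except Exception as exc:
            # Keep going so one bad adapter doesn't mask the others.
            failures[name] = str(exc)
    return failures
```

An empty return value means every adapter loaded; a non-empty dict maps each failing adapter to its error message.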
## Usage Examples

### Financial Sentiment Analysis

```python
predictor = FinLoRAPredictor("sentiment_llama_3_1_8b_8bits_r8")

# Test cases
test_texts = [
    "Stock prices are soaring to new heights.",
    "Revenue declined by 15% this quarter.",
    "The company maintained stable performance."
]

for text in test_texts:
    sentiment = predictor.classify_sentiment(text)
    print(f"Text: {text}")
    print(f"Sentiment: {sentiment}\n")
```

### Named Entity Recognition

```python
predictor = FinLoRAPredictor("ner_llama_3_1_8b_8bits_r8")

text = "Apple Inc. reported revenue of $394.3 billion in 2022."
entities = predictor.extract_entities(text)
print(f"Entities: {entities}")
```

### XBRL Processing

```python
predictor = FinLoRAPredictor("xbrl_extract_llama_3_1_8b_8bits_r8")

text = "Total assets: $1,234,567,890. Current assets: $456,789,123."
xbrl_tags = predictor.extract_xbrl_tags(text)
print(f"XBRL Tags: {xbrl_tags}")
```

### RAG-Enhanced Models

```python
# CFA RAG-enhanced model for financial knowledge
predictor = FinLoRAPredictor("cfa_rag_llama_3_1_8b_8bits_r8")

# Enhanced financial analysis with CFA knowledge
response = predictor.generate_response(
    "Explain the concept of discounted cash flow valuation"
)
print(f"CFA Response: {response}")

# FinTagging RAG models for financial information extraction
fintagging_predictor = FinLoRAPredictor("fintagging_combined_rag_llama_3_1_8b_8bits_r8")

# Extract financial information with enhanced context
entities = fintagging_predictor.extract_entities(
    "Apple Inc. reported revenue of $394.3 billion in 2022."
)
print(f"Enhanced Entities: {entities}")
```

### Memory-Efficient 4-bit Models

```python
# For users with limited GPU memory
predictor = FinLoRAPredictor(
    model_name="sentiment_llama_3_1_8b_4bits_r4",
    use_4bit=True
)

# Same API as the 8-bit models
sentiment = predictor.classify_sentiment("The market is performing well.")
```

## Evaluation

### For Competition Organizers

This section provides guidance for evaluating the submitted models.

#### 1. Quick Model Test
```bash
# Test whether all models can be loaded successfully
python test_submission.py
```

#### 2. Comprehensive Evaluation
```bash
# Run the full evaluation on all models and datasets
python comprehensive_evaluation.py

# Check results
cat comprehensive_evaluation_results.json
```

#### 3. Incremental Evaluation
```bash
# Run evaluation on missing tasks only
python incremental_evaluation.py

# Check results
cat incremental_evaluation_results.json
```

#### 4. Evaluation Results
The evaluation results are provided in:
- `comprehensive_evaluation_results.json` - Complete evaluation results
- `incremental_evaluation_results.json` - Missing-task evaluation results

#### 5. Model Performance Summary
All models have been evaluated on multiple financial datasets. See the Performance Results section below for detailed metrics.

### For Researchers

Run a comprehensive evaluation on the financial datasets:

```bash
# Run the full evaluation
python comprehensive_evaluation.py

# Run the incremental evaluation
python incremental_evaluation.py

# Run the robust evaluation
python robust_incremental.py
```

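The incremental evaluation reruns only the tasks missing from an earlier results file. The core bookkeeping can be sketched as follows, assuming a hypothetical results schema that maps task names to metric dicts (the actual schema of `incremental_evaluation_results.json` may differ):

```python
import json

def find_missing_tasks(results_json, expected_tasks):
    """Return the expected tasks that have no entry in a results file.

    Assumes a hypothetical schema: {"<task>": {"f1": ..., "accuracy": ...}}.
    """
    done = set(json.loads(results_json))
    return [t for t in expected_tasks if t not in done]

sample = '{"sentiment": {"f1": 0.333, "accuracy": 0.5}}'
print(find_missing_tasks(sample, ["sentiment", "ner", "headline"]))
# ['ner', 'headline']
```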
## Performance Results

The models have been evaluated on multiple financial datasets:

| Task | Dataset | F1 Score | Accuracy |
|------|---------|----------|----------|
| Sentiment Analysis | Financial Phrasebank | 0.333 | 0.500 |
| NER | Financial NER | 0.889 | 0.800 |
| Classification | Headline Classification | 0.697 | 0.700 |
| XBRL Processing | XBRL Tag Extraction | - | 0.200 |
| Sentiment Analysis | FiQA SA | 0.727 | 0.700 |

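As a quick sanity check, the macro-averaged F1 over the four tasks that report one (XBRL tag extraction reports accuracy only) can be computed directly from the table:

```python
# F1 scores from the table above (XBRL tag extraction omitted: no F1 reported)
f1_scores = {
    "Financial Phrasebank": 0.333,
    "Financial NER": 0.889,
    "Headline Classification": 0.697,
    "FiQA SA": 0.727,
}

# Macro average: unweighted mean over tasks
macro_f1 = sum(f1_scores.values()) / len(f1_scores)
print(round(macro_f1, 4))  # 0.6615
```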
## Project Structure

```
finlora_hf_submission/
├── models/                       # 8-bit LoRA model adapters (13 models)
│   ├── sentiment_llama_3_1_8b_8bits_r8/
│   ├── ner_llama_3_1_8b_8bits_r8/
│   ├── headline_llama_3_1_8b_8bits_r8/
│   ├── xbrl_extract_llama_3_1_8b_8bits_r8/
│   ├── xbrl_term_llama_3_1_8b_8bits_r8/
│   ├── financebench_llama_3_1_8b_8bits_r8/
│   ├── finer_llama_3_1_8b_8bits_r8/
│   ├── formula_llama_3_1_8b_8bits_r8/
│   ├── cfa_rag_llama_3_1_8b_8bits_r8/                  # NEW: CFA RAG model
│   ├── fintagging_combined_rag_llama_3_1_8b_8bits_r8/  # NEW: Combined RAG
│   ├── fintagging_fincl_rag_llama_3_1_8b_8bits_r8/     # NEW: FinCL RAG
│   ├── fintagging_finni_rag_llama_3_1_8b_8bits_r8/     # NEW: FinNI RAG
│   └── xbrl_train.jsonl-meta-llama-Llama-3.1-8B-Instruct-8bits_r8/
├── models_4bit/                  # 4-bit LoRA model adapters (8 models)
│   ├── sentiment_llama_3_1_8b_4bits_r4/
│   ├── ner_llama_3_1_8b_4bits_r4/
│   ├── headline_llama_3_1_8b_4bits_r4/
│   ├── xbrl_extract_llama_3_1_8b_4bits_r4/
│   ├── xbrl_term_llama_3_1_8b_4bits_r4/
│   ├── financebench_llama_3_1_8b_4bits_r4/
│   ├── finer_llama_3_1_8b_4bits_r4/
│   └── formula_llama_3_1_8b_4bits_r4/
├── testdata/                     # Evaluation datasets
│   ├── FinCL-eval-subset.csv
│   └── FinNI-eval-subset.csv
├── rag_system/                   # RAG system components
├── inference.py                  # Main inference script
├── comprehensive_evaluation.py   # Full evaluation script
├── incremental_evaluation.py     # Incremental evaluation
├── robust_incremental.py         # Robust evaluation
├── missing_tests.py              # Missing test detection
├── requirements.txt              # Python dependencies
└── README.md                     # This file
```

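A standard PEFT adapter directory contains an `adapter_config.json` plus the weights as `adapter_model.safetensors` (or the legacy `adapter_model.bin`). A small sketch for sanity-checking the adapter directories above (the helper name is ours):

```python
from pathlib import Path

def looks_like_peft_adapter(adapter_dir):
    """Heuristic check for the usual PEFT adapter file layout."""
    p = Path(adapter_dir)
    has_config = (p / "adapter_config.json").is_file()
    has_weights = any(
        (p / name).is_file()
        for name in ("adapter_model.safetensors", "adapter_model.bin")
    )
    return has_config and has_weights
```

Running this over every subdirectory of `models/` and `models_4bit/` before an evaluation pass catches incomplete uploads early.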
## Environment Requirements

### Minimum Requirements (CPU Mode)
- Python 3.8+
- PyTorch 2.0+
- 8GB RAM
- No GPU required

### Recommended Requirements (GPU Mode)
- Python 3.9+
- PyTorch 2.1+
- CUDA 11.8+ (for NVIDIA GPUs)
- 16GB+ GPU memory
- 32GB+ RAM

### Installation Instructions

```bash
# 1. Clone or download this repository
# 2. Install dependencies
pip install -r requirements.txt

# 3. For GPU support (optional but recommended)
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118

# 4. Verify installation
python -c "import torch; print(f'PyTorch version: {torch.__version__}'); print(f'CUDA available: {torch.cuda.is_available()}')"
```

### Troubleshooting

**If you encounter memory issues:**
- Use 4-bit models instead of 8-bit models
- Reduce the batch size during inference
- Use CPU mode if GPU memory is insufficient

**If models fail to load:**
- Ensure all model files are present in the correct directories
- Check that the base model (Llama-3.1-8B-Instruct) can be downloaded from Hugging Face
- Verify your internet connection for model downloads

**Important Notes for Competition Organizers:**
- The base model (Llama-3.1-8B-Instruct) will be downloaded automatically from Hugging Face on first use (~15GB)
- All LoRA adapters are included in this submission and require no additional downloads
- Models work in both CPU and GPU modes, with automatic device detection
- RAG-enhanced models use the same base model as the regular models

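Following the requirements above, a rough rule of thumb for picking a variant by available GPU memory can be sketched as follows (the threshold mirrors the Recommended Requirements; the helper is illustrative, not part of the repository):

```python
def choose_variant(gpu_mem_gb=None):
    """Pick a quantization variant from available GPU memory in GB.

    Illustrative only: thresholds follow the requirements listed above.
    """
    if gpu_mem_gb is None:
        return "8bit"  # CPU mode: 8-bit adapters, inference runs on CPU (slower)
    if gpu_mem_gb >= 16:
        return "8bit"  # recommended: 8-bit models on GPU
    return "4bit"      # limited GPU memory: NF4 4-bit models

print(choose_variant(24))  # 8bit
print(choose_variant(8))   # 4bit
```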
## Model Details

### Training Configuration
- **LoRA Rank**: 8
- **LoRA Alpha**: 16
- **Learning Rate**: 1e-4
- **Batch Size**: 4
- **Epochs**: 3-5
- **Quantization**: 8-bit (BitsAndBytes) / 4-bit (NF4)

### Training Data
- Financial Phrasebank
- FinGPT datasets (NER, Headline, XBRL)
- BloombergGPT financial datasets
- Custom financial text datasets

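With rank r = 8 and alpha = 16, the LoRA update is W + (alpha/r)·B·A, so the effective scaling is 2.0, and each adapted d_out×d_in projection adds r·(d_in + d_out) trainable parameters. A tiny illustration of that arithmetic (the 4096×4096 projection size is a hypothetical example, not a statement about which modules were adapted here):

```python
def lora_scaling(alpha, r):
    # Effective multiplier applied to the low-rank update B @ A.
    return alpha / r

def lora_trainable_params(d_in, d_out, r):
    # A is (r x d_in) and B is (d_out x r): r * (d_in + d_out) parameters total.
    return r * (d_in + d_out)

print(lora_scaling(16, 8))                   # 2.0
print(lora_trainable_params(4096, 4096, 8))  # 65536
```

This is why rank-8 adapters stay tiny relative to the 8B-parameter base model: only the low-rank factors are trained and shipped.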
## Citation

If you use this work in your research, please cite:

```bibtex
@article{finlora2024,
  title={FinLoRA: Financial Large Language Models with LoRA Adaptation},
  author={Your Name},
  journal={Financial AI Conference},
  year={2024}
}
```

## License

This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.

## Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

## Contact

For questions and support, please open an issue or contact [your-email@example.com](mailto:your-email@example.com).

## Submission Summary

### What's Included
- **21 models total**: 13 8-bit models (9 original + 4 RAG-enhanced) and 8 4-bit models
- **Complete Evaluation Results**: Comprehensive and incremental evaluation results
- **RAG-Enhanced Models**: CFA and FinTagging models with retrieval-augmented knowledge
- **Cross-Platform Support**: Works on CPU, GPU, and various memory configurations
- **Ready-to-Use**: All dependencies specified, automatic device detection

### Quick Start for Competition Organizers
1. Install dependencies: `pip install -r requirements.txt`
2. Test the submission: `python test_submission.py`
3. Run the evaluation: `python comprehensive_evaluation.py`
4. Check results: `cat comprehensive_evaluation_results.json`

### Model Categories
- **Financial NLP**: Sentiment, NER, classification, XBRL processing
- **RAG-Enhanced**: CFA knowledge and FinTagging with retrieval augmentation
- **Memory Options**: Both 8-bit and 4-bit quantized versions available

## Acknowledgments

- Meta for the Llama-3.1-8B-Instruct base model
- Hugging Face for the transformers and PEFT libraries
- The financial NLP community for datasets and benchmarks