# FinLoRA: Financial Large Language Models with LoRA Adaptation

## Overview

FinLoRA is a comprehensive framework for fine-tuning large language models on financial tasks using Low-Rank Adaptation (LoRA). The project provides trained LoRA adapters for a range of financial NLP tasks, including sentiment analysis, named entity recognition (NER), headline classification, XBRL processing, and CFA knowledge integration.

## Model Architecture

- **Base Model**: Meta-Llama-3.1-8B-Instruct
- **Adaptation Method**: LoRA (Low-Rank Adaptation)
- **Quantization**: 8-bit and 4-bit quantization support
- **Tasks**: Financial sentiment analysis, NER, classification, XBRL processing, CFA knowledge integration

## Available Models

### Core Financial Models
- `sentiment_llama_3_1_8b_8bits_r8` - Financial sentiment analysis
- `ner_llama_3_1_8b_8bits_r8` - Named entity recognition
- `headline_llama_3_1_8b_8bits_r8` - Financial headline classification
- `xbrl_extract_llama_3_1_8b_8bits_r8` - XBRL tag extraction
- `xbrl_term_llama_3_1_8b_8bits_r8` - XBRL terminology processing

### Advanced Models
- `financebench_llama_3_1_8b_8bits_r8` - Comprehensive financial benchmark
- `finer_llama_3_1_8b_8bits_r8` - Financial NER
- `formula_llama_3_1_8b_8bits_r8` - Financial formula processing

### RAG Knowledge Base
- CFA RAG knowledge base (FAISS index + JSONL data)
- FinTagging RAG knowledge base (FAISS index + JSONL data)
- RAG system scripts and configuration files

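The adapter directory names encode the task, quantization level, and LoRA rank. As a quick illustration (the helper below is hypothetical, not part of the repository), a name such as `sentiment_llama_3_1_8b_8bits_r8` can be decoded like so:

```python
# Hypothetical helper: decode an adapter directory name such as
# "sentiment_llama_3_1_8b_8bits_r8" into (task, quant_bits, lora_rank).
def parse_adapter_name(name):
    parts = name.split("_")
    lora_rank = int(parts[-1][1:])              # "r8"    -> 8
    quant_bits = int(parts[-2].rstrip("bits"))  # "8bits" -> 8
    task = name.split("_llama_")[0]             # everything before the base-model tag
    return task, quant_bits, lora_rank

print(parse_adapter_name("sentiment_llama_3_1_8b_8bits_r8"))     # → ('sentiment', 8, 8)
print(parse_adapter_name("xbrl_extract_llama_3_1_8b_8bits_r8"))  # → ('xbrl_extract', 8, 8)
```
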
## Quick Start (5 minutes)

### 1. Environment Setup
```bash
# Clone the repository
git clone <repository-url>
cd FinLora——RAG

# Create and activate the environment
conda env create -f FinLoRA/environment.yml
conda activate finenv
```

### 2. Test a Single Model
```python
# Quick test script
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
from peft import PeftModel
import torch

# Check if CUDA is available
device = "cuda" if torch.cuda.is_available() else "cpu"
print(f"Using device: {device}")

# Paths (replace the adapter path with the model you want to test)
adapter_path = "FinLoRA/lora_adapters/8bits_r8/sentiment_llama_3_1_8b_8bits_r8"
base_model_id = "meta-llama/Llama-3.1-8B-Instruct"

# Load tokenizer
tokenizer = AutoTokenizer.from_pretrained(base_model_id)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token

# Configure quantization based on device
if device == "cuda":
    bnb_config = BitsAndBytesConfig(load_in_8bit=True)
    base_model = AutoModelForCausalLM.from_pretrained(
        base_model_id, quantization_config=bnb_config, device_map="auto"
    )
else:
    # CPU mode - no quantization
    base_model = AutoModelForCausalLM.from_pretrained(
        base_model_id, device_map="cpu", torch_dtype=torch.float32
    )

# Load LoRA adapter
model = PeftModel.from_pretrained(base_model, adapter_path)

# Test inference
def quick_test(text):
    inputs = tokenizer(text, return_tensors="pt").to(model.device)
    with torch.no_grad():
        outputs = model.generate(**inputs, max_new_tokens=50, do_sample=True, temperature=0.7)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

# Test
result = quick_test("Classify sentiment: 'The stock market is performing well today.'")
print(result)
```

### 3. Run Full Evaluation
```bash
cd testdata
python comprehensive_evaluation.py
```

## Environment Setup

### Quest Cluster Environment (Original Development)

The original development was done on Northwestern University's Quest cluster with:
- **OS**: Linux 4.18.0-553.64.1.el8_10.x86_64
- **GPU**: NVIDIA H100 80GB HBM3
- **CUDA**: Version 12.8
- **Environment**: `finenv` conda environment

### Option 1: Using Conda (Recommended)

```bash
# Create the environment from the provided environment.yml
conda env create -f FinLoRA/environment.yml

# Activate the environment
conda activate finenv

# Install additional requirements
pip install -r FinLoRA/requirements.txt
```

### Option 2: Manual Installation

#### For GPU Users:
```bash
# Create a new conda environment
conda create -n finlora python=3.11

# Activate the environment
conda activate finlora

# Install PyTorch with CUDA support
conda install pytorch torchvision torchaudio pytorch-cuda=12.1 -c pytorch -c nvidia

# Install core dependencies
pip install transformers==4.45.2
pip install datasets==2.19.1
pip install peft==0.13.2
pip install bitsandbytes==0.44.1
pip install accelerate==1.0.0
pip install deepspeed==0.15.2
pip install sentence-transformers
pip install faiss-cpu
pip install scikit-learn
pip install pandas numpy
```

#### For CPU-Only Users:
```bash
# Create a new conda environment
conda create -n finlora python=3.11

# Activate the environment
conda activate finlora

# Install the PyTorch CPU version
conda install pytorch torchvision torchaudio cpuonly -c pytorch

# Install core dependencies (CPU-compatible versions)
pip install transformers==4.45.2
pip install datasets==2.19.1
pip install peft==0.13.2
pip install accelerate==1.0.0
pip install sentence-transformers
pip install faiss-cpu
pip install scikit-learn
pip install pandas numpy
```

### Option 3: Alternative Platforms

#### Google Colab
```python
# Install dependencies
!pip install transformers==4.45.2
!pip install datasets==2.19.1
!pip install peft==0.13.2
!pip install bitsandbytes==0.44.1
!pip install accelerate==1.0.0
!pip install sentence-transformers
!pip install faiss-cpu
!pip install scikit-learn

# Check GPU availability
import torch
print(f"CUDA available: {torch.cuda.is_available()}")
if torch.cuda.is_available():
    print(f"GPU: {torch.cuda.get_device_name(0)}")
    print(f"GPU memory: {torch.cuda.get_device_properties(0).total_memory / 1e9:.1f} GB")
```

#### AWS EC2 / Azure / Local GPU
```bash
# Install NVIDIA drivers and the CUDA toolkit,
# then follow Option 1 or 2 above
```

#### CPU-Only Mode
```python
# Complete CPU-only model loading example
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel
import torch

# Force CPU usage
device = "cpu"
torch.set_default_device(device)

# Load tokenizer
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-3.1-8B-Instruct")
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token

# Load the base model for CPU (no quantization)
base_model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.1-8B-Instruct",
    device_map="cpu",
    torch_dtype=torch.float32,
    low_cpu_mem_usage=True
)

# Load LoRA adapter
model = PeftModel.from_pretrained(base_model, "path/to/lora/adapter")

# Test inference
def cpu_predict(text):
    inputs = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        outputs = model.generate(**inputs, max_new_tokens=50, do_sample=True, temperature=0.7)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

# Test
result = cpu_predict("Classify sentiment: 'The market is performing well.'")
print(result)
```

## Usage Instructions

### 1. Basic Model Loading and Inference

```python
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
from peft import PeftModel
import torch

# Check device availability
device = "cuda" if torch.cuda.is_available() else "cpu"
print(f"Using device: {device}")

# Load tokenizer
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-3.1-8B-Instruct")
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token

# Configure model loading based on device
if device == "cuda":
    # GPU mode with quantization
    bnb_config = BitsAndBytesConfig(
        load_in_8bit=True,
        llm_int8_threshold=6.0
    )
    base_model = AutoModelForCausalLM.from_pretrained(
        "meta-llama/Llama-3.1-8B-Instruct",
        quantization_config=bnb_config,
        device_map="auto",
        torch_dtype=torch.float16,
        trust_remote_code=True
    )
else:
    # CPU mode without quantization
    base_model = AutoModelForCausalLM.from_pretrained(
        "meta-llama/Llama-3.1-8B-Instruct",
        device_map="cpu",
        torch_dtype=torch.float32,
        low_cpu_mem_usage=True
    )

# Load LoRA adapter
model = PeftModel.from_pretrained(base_model, "path/to/lora/adapter")

# Example inference
def predict(text, max_length=256):
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512).to(model.device)
    with torch.no_grad():
        outputs = model.generate(
            **inputs,
            max_new_tokens=max_length,
            temperature=0.7,
            do_sample=True,
            pad_token_id=tokenizer.eos_token_id
        )
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

# Test the model
result = predict("Classify the sentiment of this financial text: 'The company's revenue increased by 15% this quarter.'")
print(result)
```

### 2. Comprehensive Evaluation

To test all models on the financial datasets:

```bash
# Navigate to the testdata directory
cd testdata

# Run the comprehensive evaluation (works on any platform)
python comprehensive_evaluation.py

# For Quest cluster users only:
# sbatch submit_comprehensive_evaluation.sh
```

**Note**: The evaluation script automatically detects your environment and adjusts accordingly:
- **GPU available**: uses CUDA with quantization
- **CPU only**: uses CPU mode without quantization
- **Memory constraints**: automatically reduces the batch size

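The detection logic can be pictured roughly as follows. This is a sketch only: the function name, memory threshold, and batch sizes are illustrative assumptions, not the actual values used in `comprehensive_evaluation.py`.

```python
# Sketch of environment-based configuration (illustrative values only).
def select_runtime_config(cuda_available, gpu_memory_gb=None):
    """Choose device, quantization, and batch size from detected hardware."""
    if not cuda_available:
        # CPU only: no bitsandbytes quantization, minimal batch size
        return {"device": "cpu", "quantize": False, "batch_size": 1}
    if gpu_memory_gb is not None and gpu_memory_gb < 16:
        # Memory-constrained GPU: keep 8-bit weights, shrink the batch
        return {"device": "cuda", "quantize": True, "batch_size": 2}
    # Roomy GPU (e.g. H100): quantized weights, larger batch
    return {"device": "cuda", "quantize": True, "batch_size": 8}

print(select_runtime_config(False))       # → {'device': 'cpu', 'quantize': False, 'batch_size': 1}
print(select_runtime_config(True, 80.0))  # → {'device': 'cuda', 'quantize': True, 'batch_size': 8}
```
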
### 3. Individual Model Testing

```python
# Test specific financial tasks
from testdata.comprehensive_evaluation import FinLoRAPredictor

# Initialize the predictor
predictor = FinLoRAPredictor("path/to/model")

# Load the model
predictor.load_model()

# Test sentiment analysis
result = predictor.predict("Analyze the sentiment of: 'Stock prices are declining rapidly.'", max_length=50)
print(result)
```

### 4. RAG System Usage

The project includes RAG knowledge bases for enhanced financial understanding:

```python
# Load the RAG system
from FinLoRA.rag.cfa_rag_system import CFARAGSystem

# Initialize the RAG system
rag_system = CFARAGSystem()

# Query the CFA knowledge base
query = "What are the key principles of portfolio management?"
results = rag_system.query(query, top_k=5)

# Use with LoRA models for enhanced responses
enhanced_response = rag_system.generate_enhanced_response(query, model)
```

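Conceptually, the retrieval step embeds the query, scores it against the embedded JSONL passages behind the FAISS index, and returns the top-k matches. A dependency-free sketch of that idea follows; the toy `embed` function is a stand-in for the real sentence-transformer model, not the project's actual code.

```python
# Conceptual sketch of RAG retrieval (toy bag-of-words "embedding").
def embed(text):
    words = text.lower().split()
    return {w: words.count(w) for w in words}

def similarity(a, b):
    # Dot product over the shared vocabulary
    return sum(a[w] * b.get(w, 0) for w in a)

def retrieve(query, passages, top_k=2):
    q = embed(query)
    scored = sorted(passages, key=lambda p: similarity(q, embed(p)), reverse=True)
    return scored[:top_k]

docs = [
    "Portfolio management balances risk and expected return.",
    "XBRL tags identify elements of financial statements.",
    "Diversification reduces unsystematic portfolio risk.",
]
top = retrieve("principles of portfolio management", docs, top_k=2)
print(top[0])  # → Portfolio management balances risk and expected return.
```

In the real system, FAISS replaces the linear scan and a dense embedding model replaces the word-count vectors, but the query → score → top-k flow is the same.
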
## Data Input Formats for Testing

### 1. Financial Sentiment Analysis
**Input Format:**
```python
text = "The company's quarterly earnings exceeded expectations by 20%."
prompt = f"Classify the sentiment of this financial text as positive, negative, or neutral:\n\nText: {text}\n\nSentiment:"
```

**Expected Output:**
- `"positive"` - for positive financial sentiment
- `"negative"` - for negative financial sentiment
- `"neutral"` - for neutral financial sentiment

**Test Examples:**
- "Stock prices are soaring to new heights." → `positive`
- "Revenue declined by 15% this quarter." → `negative`
- "The company maintained stable performance." → `neutral`

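Putting the format together: a small helper (hypothetical, for illustration only) builds the prompt above and pulls the one-word label out of the model's continuation:

```python
# Hypothetical helpers for the sentiment task: build the prompt and
# extract the label from the text generated after it.
LABELS = ("positive", "negative", "neutral")

def build_prompt(text):
    return (
        "Classify the sentiment of this financial text as positive, "
        f"negative, or neutral:\n\nText: {text}\n\nSentiment:"
    )

def parse_sentiment(generated, prompt):
    # Look only at the continuation the model produced after the prompt
    continuation = generated[len(prompt):].lower()
    for label in LABELS:
        if label in continuation:
            return label
    return "unknown"

prompt = build_prompt("Revenue declined by 15% this quarter.")
print(parse_sentiment(prompt + " negative", prompt))  # → negative
```
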
### 2. Named Entity Recognition
**Input Format:**
```python
text = "Apple Inc. reported revenue of $394.3 billion in 2022."
prompt = f"Extract financial entities from the following text:\n\nText: {text}\n\nEntities:"
```

**Expected Output:**
- Company names, financial figures, dates, and financial terms
- Structured entity extraction with context

### 3. XBRL Processing
**Input Format:**
```python
text = "Total assets: $1,234,567,890. Current assets: $456,789,123."
prompt = f"Extract XBRL tags from the following financial statement:\n\nStatement: {text}\n\nXBRL Tags:"
```

**Expected Output:**
- Structured XBRL tag extraction
- Financial statement element identification

### 4. CFA Knowledge Integration
**Input Format:**
```python
question = "Explain the concept of weighted average cost of capital (WACC)."
prompt = f"Answer this CFA-related question using your knowledge base:\n\nQuestion: {question}\n\nAnswer:"
```

**Expected Output:**
- Comprehensive explanation drawing on CFA knowledge
- Structured financial concepts and formulas

### 5. Headline Classification
**Input Format:**
```python
headline = "Federal Reserve announces interest rate cut"
prompt = f"Classify this financial headline:\n\nHeadline: {headline}\n\nClassification:"
```

**Expected Output:**
- Financial news category classification
- Market impact assessment

## Running Without Quest GPU

### Option 1: Local GPU Setup
```bash
# Check GPU availability
nvidia-smi

# Install the CUDA toolkit (if not already installed)
conda install cudatoolkit=11.8

# Run the evaluation with GPU
cd testdata
python comprehensive_evaluation.py
```

### Option 2: CPU-Only Mode
```bash
# Run the evaluation on CPU (slower, but works without a GPU)
cd testdata
python comprehensive_evaluation.py
```

The evaluation script will automatically detect CPU mode and adjust its settings accordingly.

### Option 3: Cloud Platforms

#### Google Colab
```python
# Upload the project files to Colab, then run:
!cd testdata && python comprehensive_evaluation.py
```

#### AWS EC2 / Azure / Local GPU
```bash
# Install NVIDIA drivers and the CUDA toolkit first,
# then follow the environment setup above
cd testdata
python comprehensive_evaluation.py
```

#### Hugging Face Spaces
Deploy as a web application; the model will run on Hugging Face's infrastructure.

### Option 4: Docker with GPU Support
```bash
# Build the Docker image
docker build -t finlora .

# Run with GPU support
docker run --gpus all -it finlora python comprehensive_evaluation.py

# Run without GPU (CPU mode)
docker run -it finlora python comprehensive_evaluation.py
```

### Performance Expectations

| Environment | Expected Speed | Memory Usage | Notes |
|-------------|----------------|--------------|-------|
| Quest H100 | Fastest | ~16 GB | Original development environment |
| Local GPU (RTX 4090) | Fast | ~12 GB | High-end consumer GPU |
| Google Colab T4 | Medium | ~8 GB | Free tier available |
| Google Colab V100 | Fast | ~16 GB | Pro tier required |
| CPU only | Slow | ~32 GB | Requires significant RAM |
| AWS/Azure GPU | Fast | Variable | Depends on instance type |

## Evaluation Results

The models have been evaluated on multiple financial datasets:

### Performance Metrics
- **Financial Phrasebank**: F1 = 0.333, Accuracy = 0.500
- **NER Classification**: F1 = 0.889, Accuracy = 0.800
- **Headline Classification**: F1 = 0.697, Accuracy = 0.700
- **XBRL Tag Extraction**: Accuracy = 0.200
- **FiQA Sentiment Analysis**: F1 = 0.727, Accuracy = 0.700

### Dataset Coverage
- BloombergGPT tasks: Financial Phrasebank, FiQA SA, Headline, NER, ConvFinQA
- XBRL tasks: tag extraction, value extraction, formula construction, formula calculation
- CFA integration: Level 1 and Level 2 knowledge base

## File Structure

```
FinLoRA/
├── lora_adapters/           # Trained LoRA adapters
│   ├── 8bits_r8/            # 8-bit quantized models
│   ├── 4bits_r4/            # 4-bit quantized models
│   └── fp16_r8/             # Full-precision models
├── testdata/                # Evaluation scripts and data
│   ├── comprehensive_evaluation.py
│   ├── incremental_evaluation.py
│   └── submit_*.sh          # SLURM submission scripts
├── rag/                     # RAG system components
├── data/                    # Training and test data
├── environment.yml          # Conda environment specification
└── requirements.txt         # Python dependencies
```

## Environment Verification

Before running the models, verify your environment setup:

```python
# Environment verification script
import os
import sys

import torch
import transformers
import peft
import datasets

print("=== Environment Verification ===")
print(f"Python version: {sys.version}")
print(f"PyTorch version: {torch.__version__}")
print(f"CUDA available: {torch.cuda.is_available()}")
print(f"CUDA version: {torch.version.cuda}")
print(f"Transformers version: {transformers.__version__}")
print(f"PEFT version: {peft.__version__}")
print(f"Datasets version: {datasets.__version__}")

if torch.cuda.is_available():
    print(f"GPU count: {torch.cuda.device_count()}")
    for i in range(torch.cuda.device_count()):
        print(f"GPU {i}: {torch.cuda.get_device_name(i)}")
        print(f"GPU {i} memory: {torch.cuda.get_device_properties(i).total_memory / 1e9:.1f} GB")
else:
    print("Running in CPU mode")

print("=== Model Path Verification ===")
model_paths = [
    "FinLoRA/lora_adapters/8bits_r8/sentiment_llama_3_1_8b_8bits_r8",
    "FinLoRA/lora_adapters/8bits_r8/ner_llama_3_1_8b_8bits_r8",
    "FinLoRA/lora_adapters/8bits_r8/headline_llama_3_1_8b_8bits_r8",
]

for path in model_paths:
    exists = os.path.exists(path)
    print(f"{path}: {'✓' if exists else '✗'}")
```

## Troubleshooting

### Common Issues

1. **CUDA Out of Memory**
   ```python
   # Reduce the batch size or enable gradient checkpointing
   model.gradient_checkpointing_enable()

   # Or fall back to CPU mode
   device = "cpu"
   ```

2. **Model Loading Errors**
   ```python
   # Check the model path and permissions
   import os
   print(os.path.exists("path/to/model"))

   # Check that the base model can be loaded
   from transformers import AutoTokenizer
   tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-3.1-8B-Instruct")
   ```

3. **Dependency Conflicts**
   ```bash
   # Create a fresh environment
   conda create -n finlora_new python=3.11
   conda activate finlora_new
   pip install -r requirements.txt
   ```

4. **CPU Mode Issues**
   ```python
   # Ensure CPU mode is properly configured
   import torch
   torch.set_default_device("cpu")

   # Use low-memory loading
   base_model = AutoModelForCausalLM.from_pretrained(
       "meta-llama/Llama-3.1-8B-Instruct",
       device_map="cpu",
       torch_dtype=torch.float32,
       low_cpu_mem_usage=True
   )
   ```

### Performance Optimization

1. **Memory Optimization**
   - Use 8-bit or 4-bit quantization
   - Enable gradient checkpointing
   - Use DeepSpeed for large models

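A 4-bit loading configuration along these lines pairs with the `4bits_r4` adapters; the parameter values shown are common bitsandbytes defaults, not settings verified against this project's training runs:

```python
import torch
from transformers import BitsAndBytesConfig

# 4-bit NF4 quantization config (common defaults, not project-verified)
bnb_config_4bit = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",             # normalized-float-4 weight format
    bnb_4bit_compute_dtype=torch.float16,  # run matmuls in fp16
    bnb_4bit_use_double_quant=True,        # also quantize the quantization constants
)

# Pass as quantization_config=bnb_config_4bit to
# AutoModelForCausalLM.from_pretrained(...), as in the 8-bit examples above.
```
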
2. **Speed Optimization**
   - Use GPU acceleration
   - Batch processing
   - Model caching

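Batch processing can be as simple as chunking the prompts and passing each chunk through a single tokenizer/`model.generate` call; the chunk size below is illustrative and should be tuned to your memory budget:

```python
# Group items into fixed-size batches (illustrative batch size).
def chunks(items, batch_size):
    for i in range(0, len(items), batch_size):
        yield items[i:i + batch_size]

prompts = [f"Classify sentiment: example {i}" for i in range(10)]
batches = list(chunks(prompts, 4))
print([len(b) for b in batches])  # → [4, 4, 2]
```

Each batch would then be tokenized with `padding=True` and passed to `model.generate` in one call rather than ten.
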
## Citation

If you use this work, please cite:

```bibtex
@article{finlora2024,
  title={FinLoRA: Financial Large Language Models with LoRA Adaptation},
  author={Your Name},
  journal={Financial AI Conference},
  year={2024}
}
```

## License

This project is licensed under the MIT License; see the LICENSE file for details.

## Contact

For questions and support, please contact:
- Email: your.email@domain.com
- GitHub Issues: [Project Repository](https://github.com/your-repo/finlora)

## Acknowledgments

- Meta AI for the Llama-3.1-8B-Instruct base model
- Hugging Face for the transformers library
- Microsoft for the LoRA adaptation technique
- The Quest cluster at Northwestern University for computational resources