---
license: mit
tags:
- finance
- llm
- lora
- sentiment-analysis
- named-entity-recognition
- xbrl
- apollo
- rag
pipeline_tag: text-generation
---

# FinLoRA: Financial Large Language Models with LoRA Adaptation

## Overview

FinLoRA is a framework for fine-tuning large language models on financial tasks using Low-Rank Adaptation (LoRA). This repository contains trained LoRA adapters for financial NLP tasks: sentiment analysis, named entity recognition, headline classification, and XBRL processing. It also includes **RAG-enhanced models** for CFA knowledge and FinTagging tasks, and **APOLLO reasoning layers** for numerical calculations.

## Model Architecture

- **Base Model**: Meta-Llama-3.1-8B-Instruct (downloaded once and cached locally)
- **Adaptation Method**: LoRA (Low-Rank Adaptation)
- **Quantization**: 8-bit and 4-bit quantization support
- **Multi-Layer Support**: RAG + APOLLO layered architecture
- **Local Usage**: All models run locally; Hugging Face access is needed only for the initial base-model download
- **Tasks**: Financial sentiment analysis, NER, classification, XBRL processing, CFA knowledge, FinTagging, numerical reasoning

## Available Models

### 8-bit Quantized Models (Recommended)
- `sentiment_llama_3_1_8b_8bits_r8` - Financial sentiment analysis
- `ner_llama_3_1_8b_8bits_r8` - Named entity recognition
- `headline_llama_3_1_8b_8bits_r8` - Financial headline classification
- `xbrl_extract_llama_3_1_8b_8bits_r8` - XBRL tag extraction
- `xbrl_term_llama_3_1_8b_8bits_r8` - XBRL terminology processing
- `financebench_llama_3_1_8b_8bits_r8` - Comprehensive financial benchmark
- `finer_llama_3_1_8b_8bits_r8` - Financial NER
- `formula_llama_3_1_8b_8bits_r8` - Financial formula processing

### RAG-Enhanced Models (Knowledge-Augmented)
- `cfa_rag_llama_3_1_8b_8bits_r8` - CFA knowledge-enhanced model with RAG
- `fintagging_combined_rag_llama_3_1_8b_8bits_r8` - Combined FinTagging RAG model
- `fintagging_fincl_rag_llama_3_1_8b_8bits_r8` - FinCL RAG-enhanced model
- `fintagging_finni_rag_llama_3_1_8b_8bits_r8` - FinNI RAG-enhanced model

### APOLLO Models (Advanced Reasoning Layer)
- `apollo_cfa_rag_llama_3_1_8b_8bits_r8` - APOLLO reasoning layer for CFA tasks
- `apollo_fintagging_combined_llama_3_1_8b_8bits_r8` - APOLLO reasoning layer for FinTagging tasks

**Note**: APOLLO models are designed to be loaded on top of RAG models for enhanced numerical reasoning and calculation capabilities.

### Bloomberg-Enhanced Models (Specialized Financial Tasks)
- `finlora_lora_ckpt_llama_8bit_r8` - Bloomberg FPB and FIQA specialized model
- `finlora_heads_llama_8bit_r8.pt` - Bloomberg model weights (71MB)

**Note**: Bloomberg models are specialized for Financial Phrasebank (FPB) and Financial Question Answering (FIQA) tasks.

### 4-bit Quantized Models (Memory Efficient)
- `sentiment_llama_3_1_8b_4bits_r4` - Financial sentiment analysis
- `ner_llama_3_1_8b_4bits_r4` - Named entity recognition
- `headline_llama_3_1_8b_4bits_r4` - Financial headline classification
- `xbrl_extract_llama_3_1_8b_4bits_r4` - XBRL tag extraction
- `xbrl_term_llama_3_1_8b_4bits_r4` - XBRL terminology processing
- `financebench_llama_3_1_8b_4bits_r4` - Comprehensive financial benchmark
- `finer_llama_3_1_8b_4bits_r4` - Financial NER
- `formula_llama_3_1_8b_4bits_r4` - Financial formula processing

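
The adapter names listed above follow a visible pattern, `<task>_llama_3_1_8b_<bits>bits_r<rank>`. As a minimal sketch, the pattern can be parsed to recover the task and quantization setting from a name. The helper `parse_adapter_name` is hypothetical and not part of this repository, reading the `r<N>` suffix as the LoRA rank is an assumption, and the Bloomberg checkpoints use a different naming scheme that this sketch deliberately rejects.

```python
import re
from typing import NamedTuple

# Hypothetical helper: not part of inference.py. It only encodes the naming
# pattern visible in the adapter lists (task, quantization bits, r<N> suffix).
_PATTERN = re.compile(r"^(?P<task>.+)_llama_3_1_8b_(?P<bits>[48])bits_r(?P<rank>\d+)$")


class AdapterInfo(NamedTuple):
    task: str
    bits: int
    rank: int  # assumption: the r<N> suffix is the LoRA rank


def parse_adapter_name(name: str) -> AdapterInfo:
    """Split an adapter name into task, quantization bits, and rank suffix."""
    match = _PATTERN.match(name)
    if match is None:
        raise ValueError(f"Unrecognized adapter name: {name!r}")
    return AdapterInfo(match["task"], int(match["bits"]), int(match["rank"]))


info = parse_adapter_name("xbrl_extract_llama_3_1_8b_4bits_r4")
print(info.task, info.bits, info.rank)  # xbrl_extract 4 4
```

A caller could use the `bits` field to decide whether to pass `use_4bit=True` to `FinLoRAPredictor`, keeping the flag and the adapter name in agreement.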
## Quick Start

### 1. Installation

```bash
# Install dependencies
pip install -r requirements.txt
```

### 2. Local Model Setup

**Important**: This project loads models from local disk rather than calling hosted Hugging Face models at inference time.

```bash
# The base Llama-3.1-8B-Instruct model is downloaded automatically to the local cache on first use
# No internet connection is required after this initial setup
# All LoRA adapters are included in this repository
```

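
One way to enforce the no-internet behavior described above is the Hugging Face offline switch. The sketch below assumes that `inference.py` loads the base model through `transformers` and `huggingface_hub`, which this README does not state explicitly. `HF_HUB_OFFLINE` and `TRANSFORMERS_OFFLINE` are environment variables those libraries read. Set them only after the base model has been cached once.

```python
import os

# Assumption: the predictor relies on transformers / huggingface_hub.
# With these set, cached files are used and no network requests are attempted.
os.environ["HF_HUB_OFFLINE"] = "1"
os.environ["TRANSFORMERS_OFFLINE"] = "1"

# Import the predictor only after the variables are set, because the
# libraries read them when they are first imported.
# from inference import FinLoRAPredictor
# predictor = FinLoRAPredictor("sentiment_llama_3_1_8b_8bits_r8")
print({key: os.environ[key] for key in ("HF_HUB_OFFLINE", "TRANSFORMERS_OFFLINE")})
```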
### 3. Basic Usage

```python
from inference import FinLoRAPredictor

# Initialize predictor with 8-bit model (recommended)
predictor = FinLoRAPredictor(
    model_name="sentiment_llama_3_1_8b_8bits_r8",
    use_4bit=False
)

# Financial sentiment analysis
sentiment = predictor.classify_sentiment(
    "The company's quarterly earnings exceeded expectations by 20%."
)
print(f"Sentiment: {sentiment}")

# Entity extraction
entities = predictor.extract_entities(
    "Apple Inc. reported revenue of $394.3 billion in 2022."
)
print(f"Entities: {entities}")
```

### 4. Run Complete Test

```bash
# Test all models (this will download the base Llama model if not present)
python inference.py

# Test specific model
python -c "
from inference import FinLoRAPredictor
predictor = FinLoRAPredictor('sentiment_llama_3_1_8b_8bits_r8')
print('Model loaded successfully!')
"
```

## Usage Examples

### Financial Sentiment Analysis

```python
from inference import FinLoRAPredictor

predictor = FinLoRAPredictor("sentiment_llama_3_1_8b_8bits_r8")

# Test cases
test_texts = [
    "Stock prices are soaring to new heights.",
    "Revenue declined by 15% this quarter.",
    "The company maintained stable performance."
]

for text in test_texts:
    sentiment = predictor.classify_sentiment(text)
    print(f"Text: {text}")
    print(f"Sentiment: {sentiment}\n")
```

### Named Entity Recognition

```python
from inference import FinLoRAPredictor

predictor = FinLoRAPredictor("ner_llama_3_1_8b_8bits_r8")

text = "Apple Inc. reported revenue of $394.3 billion in 2022."
entities = predictor.extract_entities(text)
print(f"Entities: {entities}")
```

### XBRL Processing

```python
from inference import FinLoRAPredictor

predictor = FinLoRAPredictor("xbrl_extract_llama_3_1_8b_8bits_r8")

text = "Total assets: $1,234,567,890. Current assets: $456,789,123."
xbrl_tags = predictor.extract_xbrl_tags(text)
print(f"XBRL Tags: {xbrl_tags}")
```

### RAG-Enhanced Models

```python
from inference import FinLoRAPredictor

# CFA RAG-enhanced model for financial knowledge
predictor = FinLoRAPredictor("cfa_rag_llama_3_1_8b_8bits_r8")

# Enhanced financial analysis with CFA knowledge
response = predictor.generate_response(
    "Explain the concept of discounted cash flow valuation"
)
print(f"CFA Response: {response}")

# FinTagging RAG models for financial information extraction
fintagging_predictor = FinLoRAPredictor("fintagging_combined_rag_llama_3_1_8b_8bits_r8")

# Extract financial information with enhanced context
entities = fintagging_predictor.extract_entities(
    "Apple Inc. reported revenue of $394.3 billion in 2022."
)
print(f"Enhanced Entities: {entities}")
```

### APOLLO Models (Advanced Reasoning)

**Important**: APOLLO models are designed for advanced numerical reasoning and should be used for complex financial calculations.

```python
from inference import FinLoRAPredictor

# Load APOLLO model for advanced reasoning
apollo_predictor = FinLoRAPredictor("apollo_cfa_rag_llama_3_1_8b_8bits_r8")

# Financial calculations and reasoning
calculation = apollo_predictor.generate_response(
    "Calculate the present value of $10,000 received in 3 years with 5% annual discount rate"
)
print(f"APOLLO Calculation: {calculation}")

# Complex financial analysis
analysis = apollo_predictor.generate_response(
    "Analyze the impact of a 2% interest rate increase on a 10-year bond with 3% coupon rate"
)
print(f"APOLLO Analysis: {analysis}")

# Formula processing
formula_result = apollo_predictor.generate_response(
    "Solve: If a company has $1M revenue, 20% profit margin, and 10% growth rate, what's next year's profit?"
)
print(f"APOLLO Formula Result: {formula_result}")
```

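
Because two of the prompts above have closed-form answers, the model's output can be checked against a direct calculation. The following is a reference sketch, not part of the repository. It assumes annual compounding for the present-value prompt and, for the profit prompt, that the 10% growth applies to revenue while the 20% margin stays constant.

```python
def present_value(future_value: float, rate: float, years: int) -> float:
    """Discount a single future cash flow with annual compounding."""
    return future_value / (1 + rate) ** years


def next_year_profit(revenue: float, margin: float, growth: float) -> float:
    """Profit after revenue grows by `growth`, with the margin held constant (assumption)."""
    return revenue * (1 + growth) * margin


# PV = 10,000 / 1.05^3
print(f"PV: ${present_value(10_000, 0.05, 3):,.2f}")  # PV: $8,638.38

# Profit = 1,000,000 * 1.10 * 0.20
print(f"Next-year profit: ${next_year_profit(1_000_000, 0.20, 0.10):,.2f}")  # Next-year profit: $220,000.00
```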
### Multi-Layer LoRA Architecture (RAG + APOLLO)

For tasks that need both retrieved knowledge and numerical reasoning, you can combine the RAG and APOLLO models:

```python
from inference import FinLoRAPredictor

# Step 1: Load RAG model for knowledge retrieval
rag_predictor = FinLoRAPredictor("cfa_rag_llama_3_1_8b_8bits_r8")

# Step 2: Load APOLLO model for reasoning (this will be layered on top)
apollo_predictor = FinLoRAPredictor("apollo_cfa_rag_llama_3_1_8b_8bits_r8")

# Use for complex financial reasoning tasks
complex_query = """
Given the following financial data:
- Revenue: $50M
- Cost of Goods Sold: $30M
- Operating Expenses: $15M
- Tax Rate: 25%

Calculate the net income and explain the calculation steps.
"""

response = apollo_predictor.generate_response(complex_query)
print(f"Multi-Layer Response: {response}")
```

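
For reference, the expected answer to the query above can be computed directly. This sketch is independent of the models and assumes the simplified income statement given in the prompt, with no interest or other items between operating income and tax.

```python
revenue = 50_000_000
cogs = 30_000_000
operating_expenses = 15_000_000
tax_rate = 0.25

gross_profit = revenue - cogs                          # 20,000,000
operating_income = gross_profit - operating_expenses   # 5,000,000
tax = operating_income * tax_rate                      # 1,250,000
net_income = operating_income - tax                    # 3,750,000

print(f"Net income: ${net_income:,.0f}")  # Net income: $3,750,000
```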
### Bloomberg-Enhanced Models (FPB & FIQA Specialized Tasks)

**Important**: Bloomberg models require special environment setup and are optimized for Financial Phrasebank (FPB) and Financial Question Answering (FIQA) tasks.

#### Environment Setup for Bloomberg Models

```bash
# 1. Create conda environment using the provided configuration
conda env create -f finlora_hf_submission/Bloomberg_fpb_and_fiqa/environment_contrasim.yml

# 2. Activate the environment
conda activate finenv

# 3. Navigate to the Bloomberg evaluation directory
cd finlora_hf_submission/Bloomberg_fpb_and_fiqa/
```

#### Testing Bloomberg Models on FPB and FIQA Datasets

```bash
# Run Bloomberg model evaluation
python trytry1.py
```

**Configuration Notes for Testing:**

1. **Dataset Configuration**: In `trytry1.py`, modify the `EVAL_FILES` line:

   ```python
   # Replace with your test datasets
   EVAL_FILES = ["fiqa_test.jsonl", "fpb_test.jsonl"]
   ```

2. **Model Path Configuration**: For local testing, update `BASE_DIR` in `trytry1.py`. Keep exactly one of the two assignments active:

   ```python
   # For local Llama model deployment
   BASE_DIR = "path/to/your/local/llama/model"

   # For Hugging Face online model (original setting)
   # BASE_DIR = "d04e592bb4f6aa9cfee91e2e20afa771667e1d4b"
   ```

3. **Model Components**:
   - `ADAPTER_DIR`: Points to the LoRA adapter (`finlora_lora_ckpt_llama_8bit_r8`)
   - `HEADS_PATH`: Points to the model weights (`finlora_heads_llama_8bit_r8.pt`)

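
Before running the script, it can help to confirm that each file named in `EVAL_FILES` is valid JSON Lines. The sketch below is a generic check and not part of `trytry1.py`. The function `count_jsonl_records` is hypothetical, and it makes no assumption about field names, since the expected record schema is not documented here.

```python
import json
from pathlib import Path


def count_jsonl_records(path: str) -> int:
    """Return the number of non-empty lines, raising if any line is not valid JSON."""
    count = 0
    with Path(path).open(encoding="utf-8") as handle:
        for line_number, line in enumerate(handle, start=1):
            if not line.strip():
                continue  # tolerate blank lines
            try:
                json.loads(line)
            except json.JSONDecodeError as exc:
                raise ValueError(f"{path}:{line_number}: invalid JSON ({exc.msg})") from exc
            count += 1
    return count


if __name__ == "__main__":
    # Same file names as the EVAL_FILES example; adjust to your own datasets.
    for name in ["fiqa_test.jsonl", "fpb_test.jsonl"]:
        if Path(name).exists():
            print(name, count_jsonl_records(name))
        else:
            print(name, "not found")
```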
#### Bloomberg Model Usage Example

```python
# Bloomberg models are specialized for FPB and FIQA tasks
# They provide enhanced performance on financial sentiment analysis
# and financial question answering compared to standard models

# The evaluation script automatically handles:
# - Model loading and configuration
# - Dataset processing
# - Performance metrics calculation
# - Memory management for large models
```

## Local Model Management

### Model Storage
- **Base Model**: Downloaded to the Hugging Face cache (`~/.cache/huggingface/hub/` by default in current `transformers` releases; older releases used `~/.cache/huggingface/transformers/`)
- **LoRA Adapters**: Stored in `models/` directory
- **No Online Dependency**: All models run locally after initial download

### Model Loading Process
1. **Base Model**: Automatically downloaded on first use (~15GB)
2. **LoRA Adapters**: Loaded from local `models/` directory
3. **Quantization**: Applied during loading (8-bit or 4-bit)
4. **Device Detection**: Automatically uses GPU if available, falls back to CPU

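
The loading steps above correspond roughly to the following `transformers` and `peft` calls. This is a sketch of the general technique, not the code in `inference.py`. The function names, the Hub ID `meta-llama/Llama-3.1-8B-Instruct`, and the float16 compute dtype are assumptions, and bitsandbytes quantization has traditionally required a CUDA GPU.

```python
from typing import Any, Dict


def quantization_kwargs(bits: int) -> Dict[str, Any]:
    """Keyword arguments for transformers.BitsAndBytesConfig (8-bit, or 4-bit NF4)."""
    if bits == 8:
        return {"load_in_8bit": True}
    if bits == 4:
        return {"load_in_4bit": True, "bnb_4bit_quant_type": "nf4"}
    raise ValueError("bits must be 4 or 8")


def load_adapter(adapter_dir: str, bits: int = 8,
                 base_id: str = "meta-llama/Llama-3.1-8B-Instruct"):
    """Load the quantized base model, then attach a LoRA adapter from local disk."""
    # Heavy imports stay local so quantization_kwargs works without a GPU stack.
    import torch
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

    kwargs = quantization_kwargs(bits)
    if bits == 4:
        kwargs["bnb_4bit_compute_dtype"] = torch.float16  # assumed compute dtype
    tokenizer = AutoTokenizer.from_pretrained(base_id)
    base = AutoModelForCausalLM.from_pretrained(
        base_id,
        quantization_config=BitsAndBytesConfig(**kwargs),
        device_map="auto",  # places layers on the GPU when one is available
    )
    model = PeftModel.from_pretrained(base, adapter_dir)
    return model.eval(), tokenizer


# Example (downloads the base model on first use):
# model, tokenizer = load_adapter("models/sentiment_llama_3_1_8b_8bits_r8")
print(quantization_kwargs(4))
```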
### Performance Optimization

```python
from inference import FinLoRAPredictor

# For better performance on GPU
predictor = FinLoRAPredictor(
    model_name="sentiment_llama_3_1_8b_8bits_r8",
    use_4bit=False  # Use 8-bit for better performance
)

# For memory-constrained environments
predictor = FinLoRAPredictor(
    model_name="sentiment_llama_3_1_8b_4bits_r4",
    use_4bit=True  # Use 4-bit for memory efficiency
)
```

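
The memory difference between the two settings follows from simple arithmetic on weight storage. As a rough sketch, assuming about 8 billion parameters and counting weights only (activations, the KV cache, and quantization overhead are ignored):

```python
PARAMS = 8e9  # approximate parameter count of an 8B model (assumption)


def weight_memory_gib(bits_per_param: int, params: float = PARAMS) -> float:
    """Approximate weight storage in GiB for a given precision."""
    return params * bits_per_param / 8 / 2**30


for label, bits in [("16-bit", 16), ("8-bit", 8), ("4-bit", 4)]:
    print(f"{label}: ~{weight_memory_gib(bits):.1f} GiB")
# 16-bit: ~14.9 GiB
# 8-bit: ~7.5 GiB
# 4-bit: ~3.7 GiB
```

The 16-bit figure is consistent with the ~15GB base-model download, and the other two give a lower bound on the memory each quantized variant needs.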
## Evaluation

### For Competition Organizers

This section provides guidance for evaluating the submitted models:

#### 1. Quick Model Test
```bash
# Test if all models can be loaded successfully
python test_submission.py
```

#### 2. Comprehensive Evaluation
```bash
# Run full evaluation on all models and datasets
python comprehensive_evaluation.py

# Check results
cat comprehensive_evaluation_results.json
```

#### 3. Incremental Evaluation
```bash
# Run evaluation on missing tasks
python incremental_evaluation.py

# Check results
cat incremental_evaluation_results.json
```

#### 4. APOLLO Model Testing
```bash
# Test APOLLO reasoning capabilities
python -c "
from inference import FinLoRAPredictor
apollo = FinLoRAPredictor('apollo_cfa_rag_llama_3_1_8b_8bits_r8')
result = apollo.generate_response('Calculate 15% of \$1000')
print(f'APOLLO Test: {result}')
"
```

#### 5. Bloomberg Model Testing (FPB & FIQA)
```bash
# Setup Bloomberg environment
conda env create -f finlora_hf_submission/Bloomberg_fpb_and_fiqa/environment_contrasim.yml
conda activate finenv

# Navigate to Bloomberg evaluation directory
cd finlora_hf_submission/Bloomberg_fpb_and_fiqa/

# Configure test datasets in trytry1.py:
# 1. Update EVAL_FILES = ["your_fiqa_test.jsonl", "your_fpb_test.jsonl"]
# 2. Update BASE_DIR for local model path or keep original for Hugging Face

# Run Bloomberg model evaluation
python trytry1.py
```

## Project Structure

```
finlora_hf_submission/
├── models/                                  # 8-bit LoRA model adapters (15 models)
│   ├── sentiment_llama_3_1_8b_8bits_r8/
│   ├── ner_llama_3_1_8b_8bits_r8/
│   ├── headline_llama_3_1_8b_8bits_r8/
│   ├── xbrl_extract_llama_3_1_8b_8bits_r8/
│   ├── xbrl_term_llama_3_1_8b_8bits_r8/
│   ├── financebench_llama_3_1_8b_8bits_r8/
│   ├── finer_llama_3_1_8b_8bits_r8/
│   ├── formula_llama_3_1_8b_8bits_r8/
│   ├── cfa_rag_llama_3_1_8b_8bits_r8/       # RAG-enhanced CFA model
│   ├── fintagging_combined_rag_llama_3_1_8b_8bits_r8/  # Combined RAG
│   ├── fintagging_fincl_rag_llama_3_1_8b_8bits_r8/     # FinCL RAG
│   ├── fintagging_finni_rag_llama_3_1_8b_8bits_r8/     # FinNI RAG
│   ├── apollo_cfa_rag_llama_3_1_8b_8bits_r8/           # APOLLO reasoning layer
│   ├── apollo_fintagging_combined_llama_3_1_8b_8bits_r8/  # APOLLO reasoning layer
│   └── xbrl_train.jsonl-meta-llama-Llama-3.1-8B-Instruct-8bits_r8/
├── Bloomberg_fpb_and_fiqa/                  # Bloomberg specialized models for FPB & FIQA
│   ├── finlora_heads_llama_8bit_r8.pt
│   ├── finlora_lora_ckpt_llama_8bit_r8/
│   ├── environment_contrasim.yml            # Conda environment configuration
│   └── trytry1.py                           # Bloomberg model evaluation script
├── models_4bit/                             # 4-bit LoRA model adapters (8 models)
│   ├── sentiment_llama_3_1_8b_4bits_r4/
│   ├── ner_llama_3_1_8b_4bits_r4/
│   ├── headline_llama_3_1_8b_4bits_r4/
│   ├── xbrl_extract_llama_3_1_8b_4bits_r4/
│   ├── xbrl_term_llama_3_1_8b_4bits_r4/
│   ├── financebench_llama_3_1_8b_4bits_r4/
│   ├── finer_llama_3_1_8b_4bits_r4/
│   └── formula_llama_3_1_8b_4bits_r4/
├── testdata/                                # Evaluation datasets
│   ├── FinCL-eval-subset.csv
│   └── FinNI-eval-subset.csv
├── rag_system/                              # RAG system components
├── inference.py                             # Main inference script
├── comprehensive_evaluation.py              # Full evaluation script
├── incremental_evaluation.py                # Incremental evaluation
├── robust_incremental.py                    # Robust evaluation
├── missing_tests.py                         # Missing test detection
├── requirements.txt                         # Python dependencies
└── README.md                                # This file
```

## Environment Requirements

### Minimum Requirements (CPU Mode)
- Python 3.8+
- PyTorch 2.0+
- 8GB RAM
- No GPU required

### Recommended Requirements (GPU Mode)
- Python 3.9+
- PyTorch 2.1+
- CUDA 11.8+ (for NVIDIA GPUs)
- 16GB+ GPU memory
- 32GB+ RAM

### Installation Instructions

```bash
# 1. Clone or download this repository
# 2. Install dependencies
pip install -r requirements.txt

# 3. For GPU support (optional but recommended)
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118

# 4. Verify installation
python -c "import torch; print(f'PyTorch version: {torch.__version__}'); print(f'CUDA available: {torch.cuda.is_available()}')"
```

### Troubleshooting

**If you encounter memory issues:**
- Use 4-bit models instead of 8-bit models
- Reduce batch size in inference
- Use CPU mode if GPU memory is insufficient

**If models fail to load:**
- Ensure all model files are present in the correct directories
- Check that the base model (Llama-3.1-8B-Instruct) can be downloaded from Hugging Face; the model repository is gated, so the account used must have accepted Meta's license
- Verify the internet connection for the initial model download

**Important Notes for Competition Organizers:**
- The base model (Llama-3.1-8B-Instruct) will be automatically downloaded from Hugging Face on first use (~15GB)
- All LoRA adapters are included in this submission and do not require additional downloads
- Models work in both CPU and GPU modes, with automatic device detection
- APOLLO models provide enhanced reasoning capabilities for complex financial tasks
- All models run locally without requiring an ongoing internet connection

## Model Details

### Training Configuration
- **LoRA Rank**: 8 (the `_r4` suffix on the 4-bit adapters indicates rank 4)
- **LoRA Alpha**: 16
- **Learning Rate**: 1e-4
- **Batch Size**: 4
- **Epochs**: 3-5
- **Quantization**: 8-bit (bitsandbytes) / 4-bit (NF4)

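
To make the rank and alpha settings concrete: LoRA replaces the update to a weight matrix of shape d × k with the product of two low-rank matrices, which adds r·(d + k) trainable parameters and scales the update by alpha / r. The sketch below applies this to a single 4096 × 4096 projection, which matches the hidden size of Llama 3.1 8B. Which modules the adapters actually target is not stated in this README, so the per-matrix figure is illustrative only, and the function names are hypothetical.

```python
def lora_param_count(d: int, k: int, rank: int) -> int:
    """Trainable parameters LoRA adds to one d x k weight matrix (A: rank x k, B: d x rank)."""
    return rank * (d + k)


def lora_scaling(alpha: float, rank: int) -> float:
    """Factor applied to the low-rank update B @ A in standard LoRA."""
    return alpha / rank


rank, alpha = 8, 16
d = k = 4096  # illustrative square projection (assumption)
added = lora_param_count(d, k, rank)

print(f"Added parameters: {added:,}")                    # Added parameters: 65,536
print(f"Share of full matrix: {added / (d * k):.2%}")    # Share of full matrix: 0.39%
print(f"Scaling alpha/r: {lora_scaling(alpha, rank)}")   # Scaling alpha/r: 2.0
```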
### Training Data
- Financial Phrasebank
- FinGPT datasets (NER, Headline, XBRL)
- BloombergGPT financial datasets
- Custom financial text datasets
- APOLLO reasoning datasets for numerical calculations

## License

This project is licensed under the MIT License; see the [LICENSE](LICENSE) file for details.

## Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

## Contact

For questions and support, please open an issue in the repository.

## Submission Summary

### What's Included
- **23 Total Models**: 15 8-bit models (9 original + 4 RAG-enhanced + 2 APOLLO) + 8 4-bit models
- **Complete Evaluation Results**: Comprehensive and incremental evaluation results
- **RAG-Enhanced Models**: CFA and FinTagging models with enhanced knowledge
- **APOLLO Reasoning**: Advanced numerical reasoning and calculation capabilities
- **Cross-Platform Support**: Works on CPU, GPU, and various memory configurations
- **Local Execution**: All models run locally without online dependencies after the initial base-model download
- **Ready-to-Use**: All dependencies specified, automatic device detection

### Quick Start for Competition Organizers
1. Install dependencies: `pip install -r requirements.txt`
2. Test submission: `python test_submission.py`
3. Run evaluation: `python comprehensive_evaluation.py`
4. Test APOLLO reasoning: `python -c "from inference import FinLoRAPredictor; apollo = FinLoRAPredictor('apollo_cfa_rag_llama_3_1_8b_8bits_r8'); print(apollo.generate_response('Calculate 10% of 500'))"`
5. Test Bloomberg models (FPB & FIQA):

   ```bash
   conda env create -f finlora_hf_submission/Bloomberg_fpb_and_fiqa/environment_contrasim.yml
   conda activate finenv
   cd finlora_hf_submission/Bloomberg_fpb_and_fiqa/
   # Configure EVAL_FILES and BASE_DIR in trytry1.py
   python trytry1.py
   ```

6. Check results: `cat comprehensive_evaluation_results.json`

### Model Categories
- **Financial NLP**: Sentiment, NER, Classification, XBRL processing
- **RAG-Enhanced**: CFA knowledge and FinTagging with retrieval augmentation
- **APOLLO Reasoning**: Advanced numerical calculations and financial reasoning
- **Memory Options**: Both 8-bit and 4-bit quantized versions available

## Acknowledgments

- Meta for the Llama-3.1-8B-Instruct base model
- Hugging Face for the transformers and PEFT libraries
- The financial NLP community for datasets and benchmarks
- APOLLO reasoning framework for enhanced numerical capabilities