---
title: Financial QA — RAG vs FT
emoji: 📊
colorFrom: indigo
colorTo: blue
sdk: streamlit
sdk_version: 1.48.1
app_file: app/app.py
pinned: false
license: mit
---

Financial QA System: RAG vs Fine-Tuning

This project implements and compares two approaches for answering questions about Allstate's financial reports:

  1. Retrieval-Augmented Generation (RAG): Combines hybrid document retrieval with generative language models
  2. Fine-Tuned Language Model (FT): Direct fine-tuning of a small language model on financial Q&A

Quick Start

Model Files

Due to file size limitations, model files are not included in this repository. To use the system:

  1. The fine-tuned model is hosted on Hugging Face Hub
  2. The application loads the model directly from Hugging Face Hub at runtime
  3. To download the model locally, use the Hugging Face CLI:

```shell
pip install huggingface_hub
python -c "from huggingface_hub import snapshot_download; snapshot_download('jayyd/financial-qa-model', local_dir='models/fine_tuned_model')"
```

Key Features

RAG System

  • Hybrid Retrieval:
    • Dense retrieval using Sentence Transformers (all-MiniLM-L6-v2)
    • Sparse retrieval using BM25
    • Score fusion for optimal chunk selection
  • Context-Aware Generation:
    • Prompts engineered for financial accuracy
    • Dynamic context window management
    • Multi-chunk answer synthesis
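The score fusion step above can be sketched as min-max normalization of the dense and BM25 scores followed by a weighted blend. This is an illustrative sketch only; the actual fusion weight (`alpha`) and function names in `utils/retriever.py` are assumptions here:

```python
def fuse_scores(dense_scores, sparse_scores, alpha=0.6):
    """Blend dense (embedding) and sparse (BM25) relevance scores.

    Each score list is min-max normalized to [0, 1] so neither retriever
    dominates, then mixed with weight `alpha` (an assumed default).
    """
    def minmax(scores):
        lo, hi = min(scores), max(scores)
        if hi == lo:  # all scores equal -> no ranking signal
            return [0.0] * len(scores)
        return [(s - lo) / (hi - lo) for s in scores]

    d, s = minmax(dense_scores), minmax(sparse_scores)
    return [alpha * di + (1 - alpha) * si for di, si in zip(d, s)]


def top_k_chunks(chunks, fused_scores, k=3):
    """Return the k highest-scoring chunks with their fused scores."""
    ranked = sorted(zip(chunks, fused_scores), key=lambda p: p[1], reverse=True)
    return ranked[:k]
```

In practice the dense scores would come from cosine similarity over all-MiniLM-L6-v2 embeddings and the sparse scores from BM25; the fused ranking selects the chunks passed to the generator.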

Fine-Tuned Model

  • Base Model: DistilGPT2 (small, efficient)
  • Training Data: 30+ carefully curated financial Q&A pairs
  • Optimization: Parameter-efficient fine-tuning
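Before training, each Q&A pair is serialized into a single text string for the causal language model. A minimal sketch, assuming a simple "Question/Answer" prompt template (the project's actual template in `utils/fine_tuning.py` may differ; `<|endoftext|>` is GPT-2's end-of-sequence token):

```python
def format_example(question, answer, eos="<|endoftext|>"):
    """Format one Q&A pair into a training string for DistilGPT2.

    The prompt template here is an assumption for illustration; the same
    template must be used at inference time so the model sees familiar input.
    """
    return f"Question: {question}\nAnswer: {answer}{eos}"
```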

Guardrails

  • Input Validation:
    • Financial keyword detection
    • Query complexity analysis
    • Minimum length requirements
  • Output Validation:
    • Confidence scoring
    • Hallucination detection
    • Answer quality metrics
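The input-validation guardrails can be sketched as a keyword-and-length gate. This is a hypothetical simplification; the keyword list, thresholds, and return types in `utils/guardrails.py` are assumptions:

```python
# Illustrative keyword list; the real guardrail likely uses a larger set.
FINANCIAL_KEYWORDS = {
    "revenue", "income", "loss", "assets", "liabilities", "profit",
    "investment", "portfolio", "earnings", "cash", "equity",
}

def validate_query(query, min_words=3):
    """Return (is_valid, reason) for an incoming user question."""
    words = [w.strip("?.,!").lower() for w in query.split()]
    if len(words) < min_words:
        return False, "query too short"
    if not FINANCIAL_KEYWORDS.intersection(words):
        return False, "no financial keywords detected"
    return True, "ok"
```

Queries that fail validation are rejected before reaching either model, which saves inference time and reduces off-topic hallucinations.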

Evaluation Framework

  • Response time tracking
  • Confidence scoring
  • Chunk relevance metrics
  • Answer quality assessment
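The evaluation loop can be sketched as a wrapper that times each answer and collects the signals listed above. The dictionary keys and the `(answer, confidence)` return shape are illustrative assumptions, not the project's exact API:

```python
import time

def evaluate_answer(answer_fn, query):
    """Run one query and collect per-answer evaluation signals.

    `answer_fn` is any callable (RAG or fine-tuned pipeline) returning
    a (answer_text, confidence) tuple.
    """
    start = time.perf_counter()
    answer, confidence = answer_fn(query)
    return {
        "query": query,
        "answer": answer,
        "confidence": confidence,
        "response_time_s": time.perf_counter() - start,
    }
```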

🔧 Setup Instructions (Run Locally)

1. Clone the Repository

```shell
git clone <your-repo-url>
cd financial_qa_rag_ft
```

2. Create a Virtual Environment (optional)

```shell
python -m venv venv
source venv/bin/activate  # or venv\Scripts\activate on Windows
```

3. Install Dependencies

```shell
pip install -r requirements.txt
```

4. Download Financial Reports

```shell
python utils/download_reports.py
```

5. Extract and Clean Text

Run: notebooks/01_data_preprocessing.ipynb

6. Generate QA Pairs Automatically

```shell
python utils/generate_qa_pairs.py
```

7. Train the Fine-Tuned Model (Optional)

Run: notebooks/03_fine_tuning.ipynb

8. Run the Streamlit App

```shell
streamlit run app/app.py
```

The Streamlit app will automatically load the fine-tuned model from Hugging Face Hub (jayyd/financial-qa-model) when running in "Fine-Tuned" mode. No local model files are needed.


Project Structure

```
financial_qa_rag_ft/
├── app/
│   └── app.py                      # Streamlit web interface with real-time metrics
├── data/
│   ├── processed/                  # Cleaned and segmented text files
│   │   ├── Allstate_2022_10K.txt
│   │   └── Allstate_2023_10K.txt
│   └── raw/                        # Original financial reports
│       ├── Allstate_2022_10K.pdf
│       └── Allstate_2023_10K.pdf
├── models/
│   ├── fine_tuned_model/           # DistilGPT2 fine-tuned on financial QA
│   └── rag_model/                  # Saved embeddings and retrieval indices
├── notebooks/
│   ├── 01_data_preprocessing.ipynb     # PDF parsing and text cleaning
│   ├── 02_rag_pipeline.ipynb           # RAG implementation and testing
│   ├── 03_fine_tuning.ipynb            # Model fine-tuning process
│   ├── 04_evaluation.ipynb             # Individual model evaluation
│   └── 05_evaluation_comparison.ipynb  # Comparative analysis
├── qa_pairs/
│   └── qa_dataset.json             # Curated financial QA pairs
├── utils/
│   ├── chunking.py                 # Smart text segmentation
│   ├── data_preprocessing.py       # PDF processing pipeline
│   ├── evaluation.py               # Comprehensive metrics
│   ├── fine_tuning.py              # Training utilities
│   ├── generator.py                # Answer generation logic
│   ├── guardrails.py               # Input/output validation
│   └── retriever.py                # Hybrid search implementation
├── requirements.txt                # Project dependencies
└── README.md                       # Project documentation
```
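The text segmentation in `utils/chunking.py` can be sketched as an overlapping word-window splitter. This is a simplified stand-in; the real module's segmentation strategy (sentence-aware, token-based, etc.) and default sizes may differ:

```python
def chunk_text(text, chunk_size=200, overlap=50):
    """Split text into overlapping word-window chunks.

    Overlap keeps sentences that straddle a boundary retrievable from
    both neighboring chunks.
    """
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    # max(..., 1) ensures short texts still yield one chunk
    for start in range(0, max(len(words) - overlap, 1), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
    return chunks
```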

Performance Comparison

RAG System

  • Strengths:
    • Higher factual accuracy
    • Better source traceability
    • More robust to unseen questions
  • Metrics:
    • Average response time: ~0.5s
    • Typical confidence: 0.8-0.95
    • Strong chunk relevance scores

Fine-tuned Model

  • Strengths:
    • Faster inference
    • More natural language
    • Consistent response style
  • Metrics:
    • Average response time: ~0.4s
    • Typical confidence: 0.75-0.9
    • Good performance on seen patterns

Example Questions

```
# High-confidence questions
"What was Allstate's total revenue in 2023?"
"How much was the net loss in 2023?"
"What were the total assets in 2022?"

# Complex analytical questions
"How did revenue change from 2022 to 2023?"
"What factors affected profitability in 2023?"
"Compare the investment portfolio returns between 2022 and 2023"
```

Technical Requirements

  • Python 3.8+
  • PyTorch 2.0+
  • Transformers 4.31+
  • Streamlit 1.24+
  • Sentence-Transformers 2.2+
  • See requirements.txt for full list

License

This project is for academic/educational use only. Financial data sourced from Allstate's public reports.

Acknowledgments

  • Built using Hugging Face Transformers
  • Financial data from Allstate's 10-K reports
  • Streamlit for the web interface