title: FinLLM RAG
emoji: ⚡
colorFrom: purple
colorTo: gray
sdk: streamlit
sdk_version: 1.40.1
app_file: app.py
pinned: false
💸 Finance Assistant
This project is a multi-functional financial assistant built with Streamlit. It leverages large language models and retrieval-augmented generation (RAG) to provide a suite of tools for financial analysis, compliance, and data retrieval.
Features
The application is divided into several key functionalities:
- Circular Compliance Assistant: Analyzes user-provided scenarios for compliance against RBI Master Circulars on Management of Advances. It uses a FAISS vector database to retrieve relevant sections of the circular and a language model to generate a detailed compliance report.
- Industry Classification Assistant: Suggests appropriate industry classification codes based on user-provided keywords. This feature also utilizes a RAG pipeline to search through an industry classification master document.
- Calculation Methodology: Provides interactive calculators for key financial metrics:
  - Maximum Permissible Bank Finance (MPBF)
  - Drawing Power (DP)
- Financial Data Assistant: Answers questions about historical (1980-2015) state-wise financial data for India. It can retrieve specific metrics for a given state and year.
- Model 1 Chat: A general-purpose chat interface powered by the gemma2-9b-it model via the Groq API.
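The README does not state which formulas the calculators use. As a rough sketch, assuming the two Tandon Committee lending methods commonly used for MPBF and a margin-based Drawing Power formula, the calculations might look like this. The function names, parameters, and the 25% default margins are illustrative assumptions and are not taken from app.py.

```python
def mpbf_method_1(current_assets, other_current_liabilities):
    """First method of lending (assumed): the bank finances 75% of the
    working capital gap, i.e. current assets minus current liabilities
    other than bank borrowings."""
    working_capital_gap = current_assets - other_current_liabilities
    return 0.75 * working_capital_gap


def mpbf_method_2(current_assets, other_current_liabilities):
    """Second method of lending (assumed): the borrower funds 25% of total
    current assets; the bank finances the rest of the gap."""
    return 0.75 * current_assets - other_current_liabilities


def drawing_power(stock, creditors, debtors,
                  stock_margin=0.25, debtor_margin=0.25):
    """Margin-based drawing power (assumed formula): paid-for stock and
    debtors, each reduced by its margin. Margins are fractions, e.g. 0.25."""
    paid_stock = max(stock - creditors, 0.0)  # unpaid stock is not eligible
    return paid_stock * (1 - stock_margin) + debtors * (1 - debtor_margin)
```

With current assets of 1000 and other current liabilities of 200, the first method gives 600 and the second gives 550, which shows why the second method is the more conservative of the two.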
How It Works
The core of the "Circular Compliance" and "Industry Classification" assistants is a Retrieval-Augmented Generation (RAG) pipeline.
- Indexing: Source documents (Master Circular.pdf, Industry Classification Master.pdf) are chunked, and the text chunks are converted into vector embeddings using a SentenceTransformer model. These embeddings are stored in a FAISS index for efficient similarity search.
- Retrieval: When a user enters a query, the query is embedded, and the FAISS index is searched to find the most relevant document chunks.
- Generation: The retrieved chunks are passed as context, along with the user's query, to a large language model (gemma2-9b-it). The model then generates a comprehensive and context-aware response.
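The three stages above can be sketched end to end. To keep the example dependency-free, a toy bag-of-words embedding and a brute-force cosine search stand in for the SentenceTransformer model and the FAISS index, and the final LLM call is left out. Every name here (embed, build_index, retrieve, build_prompt) is hypothetical, and the sample chunks are placeholder text, not quotes from the circular.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding: lowercase word counts (stand-in for a real encoder)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_index(chunks):
    """Indexing: embed each chunk once and keep the vector with its text."""
    return [(chunk, embed(chunk)) for chunk in chunks]


def retrieve(index, query, k=2):
    """Retrieval: rank chunks by similarity to the query, keep the top k."""
    q = embed(query)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:k]]


def build_prompt(query, context_chunks):
    """Generation input: retrieved chunks as context, then the user's query."""
    context = "\n\n".join(context_chunks)
    return (f"Context:\n{context}\n\n"
            f"Question: {query}\n"
            "Answer using only the context above.")


# Placeholder chunks for illustration only.
chunks = [
    "Working capital limits are assessed using the projected turnover method.",
    "Banks must obtain stock statements from borrowers at regular intervals.",
    "Consortium lending requires sharing of information among member banks.",
]
index = build_index(chunks)
question = "How often should stock statements be obtained?"
top = retrieve(index, question, k=1)
prompt = build_prompt(question, top)
```

In the real pipeline the prompt string would be sent to the language model; swapping the toy pieces for a sentence encoder and a vector index changes the quality of retrieval but not the overall flow.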
The "Financial Data Assistant" works by directly parsing the user's query for state, year, and metric information and looking up the corresponding data from a pre-loaded data file.
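A lookup of this kind can be sketched as follows. The table layout, state list, and metric names are hypothetical stand-ins (the README does not describe the data file's schema), and the values are placeholders rather than real figures.

```python
import re

# Hypothetical table keyed by (state, year, metric); values are placeholders.
DATA = {
    ("Kerala", 2010, "revenue deficit"): 1.0,
    ("Kerala", 2010, "capital expenditure"): 2.0,
    ("Tamil Nadu", 2005, "revenue deficit"): 3.0,
}
# Longest names first so multi-word names win over shorter partial matches.
STATES = sorted({s for s, _, _ in DATA}, key=len, reverse=True)
METRICS = sorted({m for _, _, m in DATA}, key=len, reverse=True)


def parse_query(query):
    """Extract (state, year, metric) from free text; any part may be None."""
    lowered = query.lower()
    state = next((s for s in STATES if s.lower() in lowered), None)
    metric = next((m for m in METRICS if m in lowered), None)
    year = None
    match = re.search(r"\b(19\d{2}|20\d{2})\b", query)
    if match and 1980 <= int(match.group(1)) <= 2015:  # range covered by the data
        year = int(match.group(1))
    return state, year, metric


def answer(query):
    """Look up the parsed triple and format a short reply."""
    state, year, metric = parse_query(query)
    if None in (state, year, metric):
        return "Please specify a state, a year (1980-2015), and a metric."
    value = DATA.get((state, year, metric))
    if value is None:
        return f"No data for {metric} in {state} for {year}."
    return f"{metric.capitalize()} for {state} in {year}: {value}"
```

For example, `answer("What was the revenue deficit of Kerala in 2010?")` resolves to the `("Kerala", 2010, "revenue deficit")` key, while a query missing any of the three parts returns a prompt for more detail instead of guessing.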
Setup and Installation
Clone the Repository:
    git clone <your-repository-url>
    cd <your-repository-directory>
Install Dependencies: Install the necessary Python libraries using the requirements.txt file.
    pip install -r requirements.txt
Set Up Assets: The application requires pre-built FAISS indexes and data files.
- Create a folder named assets in the root directory.
- Generate and place the following files into the assets folder (you will need a separate script to process the source PDFs and JSON to create these files):
  - industry_index.faiss
  - industry_chunks.pkl
  - circular_index.faiss
  - circular_chunks.pkl
  - financial_index.faiss
  - financial_statements.pkl
API Key: Insert your Groq API key directly into the app.py file at the following line:
    GROQ_API_KEY = "your-groq-api-key-here"
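Before launching the app, it can help to confirm that every expected asset is in place. The following is a small sketch built from the file names listed in the Set Up Assets step; the helper name missing_assets and the assumption that it runs from the repository root are illustrative and not part of the project.

```python
from pathlib import Path

# File names taken from the Set Up Assets step.
REQUIRED_ASSETS = [
    "industry_index.faiss",
    "industry_chunks.pkl",
    "circular_index.faiss",
    "circular_chunks.pkl",
    "financial_index.faiss",
    "financial_statements.pkl",
]


def missing_assets(assets_dir="assets"):
    """Return the required file names that are not present in assets_dir."""
    root = Path(assets_dir)
    return [name for name in REQUIRED_ASSETS if not (root / name).is_file()]


if __name__ == "__main__":
    missing = missing_assets()
    if missing:
        print("Missing asset files:", ", ".join(missing))
    else:
        print("All asset files found.")
```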
Usage
Run the Streamlit App: Execute the following command in your terminal:
    streamlit run app.py
Interact with the Application:
- Open the URL provided by Streamlit (usually http://localhost:8501) in your web browser.
- Use the radio buttons at the top of the page to navigate between the different functionalities: "Calculation Methodology", "Circular Compliance", "Industry Classification", "Model 1", and "Model 2".
- Follow the on-screen instructions for each tool.
Dependencies
This project relies on the following major libraries:
- streamlit: For creating the web application interface.
- groq: The client for accessing the Groq API.
- sentence-transformers: For generating text embeddings.
- faiss-cpu: For efficient similarity search in the vector database.
- pandas: For data manipulation, particularly in the financial data assistant.
- numpy: For numerical operations.
- torch & transformers: Core dependencies for the sentence transformer models.