Update README.md
README.md CHANGED
@@ -9,52 +9,90 @@ app_file: app.py
 pinned: false
 ---

-# 💸 Finance Assistant

-##

-Automated financial calculations essential for banking operations:
-- **MPBF (Maximum Permissible Bank Finance)**: Calculate the maximum working capital finance a bank can provide based on RBI norms
-- **Drawing Power (DP)**: Calculate borrowing limits based on current assets with applicable margins

-##
-Intelligent industry code classification system:
-- Suggests appropriate industry classification codes based on keywords
-- Helps with accurate business categorization for regulatory purposes
-- Uses vector search to match business descriptions with standard classifications

-- **LLM**: Groq API with Gemma2-9B-IT model
-- **Data Processing**: Pandas, Pickle

-- GROQ_API_KEY environment variable
-- Pre-built FAISS indexes and chunk files:
-  - `industry_index.faiss` & `industry_chunks.pkl`
-  - `circular_index.faiss` & `circular_chunks.pkl`

-- Financial institutions streamlining operations

# 💸 Finance Assistant

This project is a multi-functional financial assistant built with Streamlit. It uses large language models and retrieval-augmented generation (RAG) to provide a suite of tools for financial analysis, compliance, and data retrieval.

## Features

The application is divided into several key functionalities:

* **Circular Compliance Assistant**: Analyzes user-provided scenarios for compliance with RBI Master Circulars on Management of Advances. It uses a FAISS vector database to retrieve relevant sections of the circular and a language model to generate a detailed compliance report.
* **Industry Classification Assistant**: Suggests appropriate industry classification codes based on user-provided keywords. This feature also uses a RAG pipeline to search through an industry classification master document.
* **Calculation Methodology**: Provides interactive calculators for key financial metrics:
    * **Maximum Permissible Bank Finance (MPBF)**
    * **Drawing Power (DP)**
* **Financial Data Assistant**: Answers questions about historical (1980-2015) state-wise financial data for India. It can retrieve specific metrics for a given state and year.
* **Model 1 Chat**: A general-purpose chat interface powered by the `gemma2-9b-it` model via the Groq API.
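
The README does not state the formulas behind the two calculators. As a rough illustration only, the sketch below assumes the Tandon Committee methods of lending for MPBF and a common simplified, margin-based form of Drawing Power. The function names, parameters, and default margins are hypothetical and may differ from what `app.py` implements.

```python
def mpbf_method_one(current_assets: float, other_current_liabilities: float) -> float:
    """Assumed first method: the bank finances 75% of the working capital gap.

    The working capital gap is current assets minus current liabilities
    other than bank borrowings.
    """
    gap = current_assets - other_current_liabilities
    return max(0.75 * gap, 0.0)


def mpbf_method_two(current_assets: float, other_current_liabilities: float) -> float:
    """Assumed second method: the borrower funds 25% of total current assets.

    The bank finances 75% of current assets minus current liabilities
    other than bank borrowings.
    """
    return max(0.75 * current_assets - other_current_liabilities, 0.0)


def drawing_power(
    stock: float,
    creditors: float,
    debtors: float,
    stock_margin: float = 0.25,   # example margin, not taken from the project
    debtor_margin: float = 0.40,  # example margin, not taken from the project
) -> float:
    """Simplified Drawing Power: paid stock and debtors, each net of its margin."""
    paid_stock = max(stock - creditors, 0.0)
    return paid_stock * (1 - stock_margin) + debtors * (1 - debtor_margin)


if __name__ == "__main__":
    print(mpbf_method_one(current_assets=1000.0, other_current_liabilities=300.0))
    print(mpbf_method_two(current_assets=1000.0, other_current_liabilities=300.0))
    print(drawing_power(stock=500.0, creditors=100.0, debtors=200.0))
```

With current assets of 1,000 and other current liabilities of 300, the second method gives 0.75 × 1,000 − 300 = 450, while the first gives 0.75 × (1,000 − 300) = 525.
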

## How It Works

The core of the "Circular Compliance" and "Industry Classification" assistants is a Retrieval-Augmented Generation (RAG) pipeline.

1. **Indexing**: Source documents (`Master Circular.pdf`, `Industry Classification Master.pdf`) are chunked, and the text chunks are converted into vector embeddings using a `SentenceTransformer` model. These embeddings are stored in a FAISS index for efficient similarity search.
2. **Retrieval**: When a user enters a query, the query is embedded, and the FAISS index is searched to find the most relevant document chunks.
3. **Generation**: The retrieved chunks are passed as context, along with the user's query, to a large language model (`gemma2-9b-it`). The model then generates a comprehensive and context-aware response.
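
The three steps above can be sketched in plain Python. This is an illustration only, not the project's code: `embed` is a toy bag-of-words stand-in for the `SentenceTransformer` encoder, the brute-force cosine ranking in `retrieve` stands in for the FAISS index search, and `build_prompt` only shows how retrieved chunks and the query might be combined before being sent to the model. All function names and the sample chunks are hypothetical.

```python
import math
import re
from collections import Counter
from typing import List


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts (stand-in for a real encoder)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, chunks: List[str], k: int = 2) -> List[str]:
    """Rank every chunk against the query and keep the k most similar."""
    query_vec = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(query_vec, embed(c)), reverse=True)
    return ranked[:k]


def build_prompt(query: str, context_chunks: List[str]) -> str:
    """Combine retrieved chunks and the user query into one prompt string."""
    context = "\n\n".join(context_chunks)
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer using only the context above."


if __name__ == "__main__":
    sample_chunks = [
        "Drawing power is computed from stock and book debts after margins.",
        "Banks must classify advances by industry code.",
        "Working capital limits are reviewed annually.",
    ]
    question = "How is drawing power computed?"
    print(build_prompt(question, retrieve(question, sample_chunks, k=1)))
```

A real index avoids re-embedding every chunk per query: the chunk vectors are computed once at indexing time and only the query is embedded at retrieval time.
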

The "Financial Data Assistant" works by directly parsing the user's query for state, year, and metric information and looking up the corresponding data from a pre-loaded data file.
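
A minimal sketch of this kind of lookup, assuming the data is available as a mapping keyed by state, year, and metric. The state names, metric names, values, and helper functions below are placeholders for illustration, not data or code from the project.

```python
import re

# Placeholder records: (state, year, metric) -> value. Not real data.
RECORDS = {
    ("kerala", 1995, "revenue deficit"): 1.0,
    ("punjab", 2005, "capital expenditure"): 2.0,
}

STATES = sorted({state for state, _, _ in RECORDS})
METRICS = sorted({metric for _, _, metric in RECORDS})


def parse_query(query):
    """Extract (state, year, metric) from free text; missing parts are None."""
    text = query.lower()
    year_match = re.search(r"\b(19[89]\d|20[01]\d)\b", text)
    state = next((s for s in STATES if s in text), None)
    metric = next((m for m in METRICS if m in text), None)
    year = int(year_match.group(1)) if year_match else None
    return state, year, metric


def answer(query):
    """Look up the value for a query, or return None if it cannot be resolved."""
    state, year, metric = parse_query(query)
    if state is None or year is None or metric is None:
        return None
    if not 1980 <= year <= 2015:  # range covered by the data set per the README
        return None
    return RECORDS.get((state, year, metric))


if __name__ == "__main__":
    print(parse_query("What was the revenue deficit of Kerala in 1995?"))
    print(answer("What was the revenue deficit of Kerala in 1995?"))
```

Returning `None` for an unresolved query lets the caller decide how to prompt the user for the missing state, year, or metric.
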

## Setup and Installation

1.  **Clone the Repository**:

    ```bash
    git clone <your-repository-url>
    cd <your-repository-directory>
    ```

2.  **Install Dependencies**:
    Install the necessary Python libraries using the `requirements.txt` file.

    ```bash
    pip install -r requirements.txt
    ```

3.  **Set Up Assets**:
    The application requires pre-built FAISS indexes and data files.

    * Create a folder named `assets` in the root directory.
    * Generate the following files and place them in the `assets` folder (you will need a separate script to process the source PDFs and JSON to create these files):
        * `industry_index.faiss`
        * `industry_chunks.pkl`
        * `circular_index.faiss`
        * `circular_chunks.pkl`
        * `financial_index.faiss`
        * `financial_statements.pkl`

4.  **API Key**:
    Insert your Groq API key directly into the `app.py` file at the following line:

    ```python
    GROQ_API_KEY = "your-groq-api-key-here"
    ```
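
Before launching the app, it can help to confirm that the files from step 3 are in place. The sketch below is a hypothetical helper, not part of the repository. It also reads the key from a `GROQ_API_KEY` environment variable, which the previous version of this README listed as a requirement; whether the current `app.py` still honors that variable is an assumption.

```python
import os
from pathlib import Path

# File names taken from the asset list in step 3.
REQUIRED_ASSETS = [
    "industry_index.faiss",
    "industry_chunks.pkl",
    "circular_index.faiss",
    "circular_chunks.pkl",
    "financial_index.faiss",
    "financial_statements.pkl",
]


def missing_assets(assets_dir="assets"):
    """Return the required file names that are absent from assets_dir."""
    root = Path(assets_dir)
    return [name for name in REQUIRED_ASSETS if not (root / name).is_file()]


def groq_api_key(default=None):
    """Read the key from the environment (assumed variable name: GROQ_API_KEY)."""
    return os.environ.get("GROQ_API_KEY", default)


if __name__ == "__main__":
    missing = missing_assets()
    if missing:
        print("Missing asset files:", ", ".join(missing))
    if not groq_api_key():
        print("GROQ_API_KEY is not set")
```

Reading the key from the environment keeps it out of version control, at the cost of one extra setup step on each machine.
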

## Usage

1.  **Run the Streamlit App**:
    Execute the following command in your terminal:

    ```bash
    streamlit run app.py
    ```

2.  **Interact with the Application**:

    * Open the URL provided by Streamlit (usually `http://localhost:8501`) in your web browser.
    * Use the radio buttons at the top of the page to navigate between the different functionalities: "Calculation Methodology", "Circular Compliance", "Industry Classification", "Model 1", and "Model 2".
    * Follow the on-screen instructions for each tool.

## Dependencies

This project relies on the following major libraries:

* `streamlit`: For creating the web application interface.
* `groq`: The client for accessing the Groq API.
* `sentence-transformers`: For generating text embeddings.
* `faiss-cpu`: For efficient similarity search in the vector database.
* `pandas`: For data manipulation, particularly in the financial data assistant.
* `numpy`: For numerical operations.
* `torch` & `transformers`: Core dependencies for the sentence transformer models.