yashgori20 committed on
Commit 42c2248 · verified · 1 Parent(s): 031582f

Update README.md

Files changed (1)
  1. README.md +75 -37
README.md CHANGED
@@ -9,52 +9,90 @@ app_file: app.py
9
  pinned: false
10
  ---
11
 
12
- # 💸 Finance Assistant - FinLLM RAG
13
 
14
- A comprehensive financial assistant application that helps with banking compliance, industry classification, and financial calculations using RAG (Retrieval-Augmented Generation) technology powered by FAISS vector search and LLM processing.
15
 
16
- ## 🚀 Features
17
 
18
- ### 1. **Calculation Methodology**
19
- Automated financial calculations essential for banking operations:
20
- - **MPBF (Maximum Permissible Bank Finance)**: Calculate the maximum working capital finance a bank can provide based on RBI norms
21
- - **Drawing Power (DP)**: Calculate borrowing limits based on current assets with applicable margins
22
 
23
- ### 2. **Circular Compliance Assistant**
24
- AI-powered RBI compliance analysis using Master Circular on Management of Advances:
25
- - Analyzes complex banking scenarios for compliance
26
- - Provides detailed compliance status with certainty levels
27
- - References specific circular sections and paragraphs
28
- - Offers actionable recommendations for compliance
 
29
 
30
- ### 3. **Industry Classification Assistant**
31
- Intelligent industry code classification system:
32
- - Suggests appropriate industry classification codes based on keywords
33
- - Helps with accurate business categorization for regulatory purposes
34
- - Uses vector search to match business descriptions with standard classifications
35
 
36
- ## 🛠️ Technology Stack
37
 
38
- - **Frontend**: Streamlit
39
- - **Vector Search**: FAISS (Facebook AI Similarity Search)
40
- - **Embeddings**: SentenceTransformers (all-MiniLM-L6-v2)
41
- - **LLM**: Groq API with Gemma2-9B-IT model
42
- - **Data Processing**: Pandas, Pickle
43
 
44
- ## 📋 Prerequisites
45
 
46
- - Python 3.8+
47
- - GROQ_API_KEY environment variable
48
- - Pre-built FAISS indexes and chunk files:
49
-   - `industry_index.faiss` & `industry_chunks.pkl`
50
-   - `circular_index.faiss` & `circular_chunks.pkl`
51
 
52
- ## 🎯 Use Cases
53
 
54
- - Banking professionals ensuring RBI compliance
55
- - Financial analysts performing regulatory calculations
56
- - Business categorization for regulatory filings
57
- - Compliance officers analyzing complex scenarios
58
- - Financial institutions streamlining operations
59
 
60
- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
12
+ # 💸 Finance Assistant
13
 
14
+ This project is a multi-functional financial assistant built with Streamlit. It leverages large language models and retrieval-augmented generation (RAG) to provide a suite of tools for financial analysis, compliance, and data retrieval.
15
 
16
+ ## Features
17
 
18
+ The application is divided into several key functionalities:
 
 
 
19
 
20
+ * **Circular Compliance Assistant**: Analyzes user-provided scenarios for compliance against RBI Master Circulars on Management of Advances. It uses a FAISS vector database to retrieve relevant sections of the circular and a language model to generate a detailed compliance report.
21
+ * **Industry Classification Assistant**: Suggests appropriate industry classification codes based on user-provided keywords. This feature also utilizes a RAG pipeline to search through an industry classification master document.
22
+ * **Calculation Methodology**: Provides interactive calculators for key financial metrics:
23
+     * **Maximum Permissible Bank Finance (MPBF)**
24
+     * **Drawing Power (DP)**
25
+ * **Financial Data Assistant**: Answers questions about historical (1980-2015) state-wise financial data for India. It can retrieve specific metrics for a given state and year.
26
+ * **Model 1 Chat**: A general-purpose chat interface powered by the `gemma2-9b-it` model via the Groq API.
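
The README does not say which formulas the MPBF and Drawing Power calculators use. As a rough sketch only, the two lending methods commonly attributed to the Tandon Committee and a typical drawing power calculation could look like the following. The function names, the 25% default margins, and the choice of methods are assumptions for illustration, not taken from `app.py`.

```python
def mpbf_method_1(current_assets, other_current_liabilities):
    """First method: 75% of the working capital gap.

    other_current_liabilities means current liabilities excluding bank borrowings.
    """
    working_capital_gap = current_assets - other_current_liabilities
    return max(0.75 * working_capital_gap, 0.0)


def mpbf_method_2(current_assets, other_current_liabilities):
    """Second method: 75% of current assets, less other current liabilities."""
    return max(0.75 * current_assets - other_current_liabilities, 0.0)


def drawing_power(stock, creditors, debtors, stock_margin=0.25, debtor_margin=0.25):
    """Paid-for stock and debtors, each reduced by its margin (margins are assumed)."""
    paid_stock = max(stock - creditors, 0.0)
    return paid_stock * (1 - stock_margin) + debtors * (1 - debtor_margin)


if __name__ == "__main__":
    print(mpbf_method_1(1000, 400))
    print(mpbf_method_2(1000, 400))
    print(drawing_power(stock=500, creditors=100, debtors=200))
```

The second method always gives a limit no higher than the first, because the borrower's 25% contribution is taken on total current assets rather than on the gap.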
27
 
28
+ ## How It Works
 
 
 
 
29
 
30
+ The core of the "Circular Compliance" and "Industry Classification" assistants is a Retrieval-Augmented Generation (RAG) pipeline.
31
 
32
+ 1. **Indexing**: Source documents (`Master Circular.pdf`, `Industry Classification Master.pdf`) are chunked, and the text chunks are converted into vector embeddings using a `SentenceTransformer` model. These embeddings are stored in a FAISS index for efficient similarity search.
33
+ 2. **Retrieval**: When a user enters a query, the query is embedded, and the FAISS index is searched to find the most relevant document chunks.
34
+ 3. **Generation**: The retrieved chunks are passed as context, along with the user's query, to a large language model (`gemma2-9b-it`). The model then generates a comprehensive and context-aware response.
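
The retrieval and generation steps above can be illustrated with a small, dependency-free sketch. It is not the app's code: a word-count vector stands in for the `SentenceTransformer` embedding, a sorted cosine-similarity scan stands in for the FAISS search, and the sample chunks and prompt wording are hypothetical.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding: lowercase word counts (stand-in for a SentenceTransformer vector)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(query, chunks, k=2):
    """Return the k chunks most similar to the query (stand-in for a FAISS search)."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]


def build_prompt(query, context_chunks):
    """Combine retrieved chunks and the user's query into one prompt for the LLM."""
    context = "\n\n".join(context_chunks)
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer using only the context above."


if __name__ == "__main__":
    # Hypothetical chunks, invented for this example.
    chunks = [
        "Working capital limits require a margin on stock.",
        "Industry code 1234 covers textile weaving.",
        "Drawing power is computed monthly from stock statements.",
    ]
    query = "textile weaving industry code"
    print(build_prompt(query, retrieve(query, chunks, k=1)))
```

In the real pipeline the prompt would then be sent to `gemma2-9b-it` through the Groq API.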
 
 
35
 
36
+ The "Financial Data Assistant" works by directly parsing the user's query for state, year, and metric information and looking up the corresponding data from a pre-loaded data file.
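
A minimal sketch of that kind of direct parsing is shown below. The state list, metric names, dictionary layout, and the sample value are all hypothetical placeholders, and the real `app.py` may parse queries differently.

```python
import re

# Hypothetical subsets; the real data file defines the actual states and metrics.
STATES = ["Andhra Pradesh", "Gujarat", "Maharashtra", "Tamil Nadu"]
METRICS = ["revenue receipts", "capital expenditure", "fiscal deficit"]


def parse_query(query):
    """Extract (state, year, metric) from a free-text question; missing parts are None."""
    text = query.lower()
    state = next((s for s in STATES if s.lower() in text), None)
    metric = next((m for m in METRICS if m in text), None)
    match = re.search(r"\b(19[89]\d|20[01]\d)\b", query)
    year = int(match.group(1)) if match else None
    # The README states the data covers 1980-2015, so reject years outside that range.
    if year is not None and not 1980 <= year <= 2015:
        year = None
    return state, year, metric


def lookup(data, query):
    """Look up a value in a dict keyed by (state, year, metric), or return None."""
    state, year, metric = parse_query(query)
    if None in (state, year, metric):
        return None
    return data.get((state, year, metric))


if __name__ == "__main__":
    # Placeholder value, not real data.
    data = {("Gujarat", 1995, "fiscal deficit"): 123.4}
    print(lookup(data, "What was the fiscal deficit of Gujarat in 1995?"))
```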
37
 
38
+ ## Setup and Installation
 
 
 
 
39
 
40
+ 1. **Clone the Repository**:
41
 
42
+ ```bash
43
+ git clone <your-repository-url>
44
+ cd <your-repository-directory>
45
+ ```
 
46
 
47
+ 2. **Install Dependencies**:
48
+ Install the necessary Python libraries using the `requirements.txt` file.
49
+
50
+ ```bash
51
+ pip install -r requirements.txt
52
+ ```
53
+
54
+ 3. **Set Up Assets**:
55
+ The application requires pre-built FAISS indexes and data files.
56
+
57
+     * Create a folder named `assets` in the root directory.
58
+     * Generate the following files and place them in the `assets` folder. You will need a separate script that processes the source PDFs and JSON to create them:
59
+         * `industry_index.faiss`
60
+         * `industry_chunks.pkl`
61
+         * `circular_index.faiss`
62
+         * `circular_chunks.pkl`
63
+         * `financial_index.faiss`
64
+         * `financial_statements.pkl`
65
+
66
+ 4. **API Key**:
67
+ Insert your Groq API key directly into the `app.py` file at the following line:
68
+
69
+ ```python
70
+ GROQ_API_KEY = "your-groq-api-key-here"
71
+ ```
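
As an alternative to editing `app.py`, the key can be read from the environment, which keeps it out of version control. The earlier version of this README listed a `GROQ_API_KEY` environment variable as a prerequisite. The helper below is a hypothetical sketch, not code from `app.py`.

```python
import os


def load_groq_api_key(env=os.environ):
    """Return the Groq API key from the environment, or fail with a clear message."""
    key = env.get("GROQ_API_KEY", "").strip()
    if not key:
        raise RuntimeError("GROQ_API_KEY is not set; export it before running the app.")
    return key


# Assumed usage inside app.py:
# GROQ_API_KEY = load_groq_api_key()
```

On a Hugging Face Space, the same variable can be defined as a secret in the Space settings.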
72
+
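
The asset files in step 3 are not produced by the app itself. The sketch below shows one way such a script might build an index and chunk file from text that has already been extracted from a PDF. The chunk sizes, the flat L2 index type, the pickle layout (a plain list of strings), and the function names are assumptions; the model name `all-MiniLM-L6-v2` comes from the earlier version of this README.

```python
import pickle


def chunk_text(text, chunk_size=500, overlap=50):
    """Split text into overlapping character windows, skipping blank windows."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    step = chunk_size - overlap
    windows = (text[i:i + chunk_size] for i in range(0, len(text), step))
    return [w for w in windows if w.strip()]


def build_assets(text, index_path, chunks_path, model_name="all-MiniLM-L6-v2"):
    """Embed the chunks, store them in a FAISS index, and pickle the chunk list."""
    # Imported here so chunk_text can be used without these packages installed.
    import faiss
    import numpy as np
    from sentence_transformers import SentenceTransformer

    chunks = chunk_text(text)
    model = SentenceTransformer(model_name)
    embeddings = np.asarray(model.encode(chunks), dtype="float32")

    index = faiss.IndexFlatL2(embeddings.shape[1])  # exact search; an assumed index type
    index.add(embeddings)
    faiss.write_index(index, index_path)

    with open(chunks_path, "wb") as f:
        pickle.dump(chunks, f)


# Assumed usage, with circular_text holding the extracted PDF text:
# build_assets(circular_text, "assets/circular_index.faiss", "assets/circular_chunks.pkl")
```

The index and the chunk list must be built together, because the app maps each FAISS result position back to the chunk at the same position in the pickle file.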
73
+ ## Usage
74
+
75
+ 1. **Run the Streamlit App**:
76
+ Execute the following command in your terminal:
77
+
78
+ ```bash
79
+ streamlit run app.py
80
+ ```
81
+
82
+ 2. **Interact with the Application**:
83
+
84
+ * Open the URL provided by Streamlit (usually `http://localhost:8501`) in your web browser.
85
+ * Use the radio buttons at the top of the page to navigate between the different functionalities: "Calculation Methodology", "Circular Compliance", "Industry Classification", "Model 1", and "Model 2".
86
+ * Follow the on-screen instructions for each tool.
87
+
88
+ ## Dependencies
89
+
90
+ This project relies on the following major libraries:
91
+
92
+ * `streamlit`: For creating the web application interface.
93
+ * `groq`: The client for accessing the Groq API.
94
+ * `sentence-transformers`: For generating text embeddings.
95
+ * `faiss-cpu`: For efficient similarity search in the vector database.
96
+ * `pandas`: For data manipulation, particularly in the financial data assistant.
97
+ * `numpy`: For numerical operations.
98
+ * `torch` & `transformers`: Core dependencies for the sentence transformer models.