File size: 14,610 Bytes
b261ad9
 
 
 
 
 
 
 
5047a37
 
 
 
 
 
 
 
 
2d701b7
 
 
 
41cb3f5
2d701b7
41cb3f5
a9ec4f6
2d701b7
 
 
 
41cb3f5
2d701b7
 
 
41cb3f5
 
2d701b7
 
 
 
 
41cb3f5
 
2d701b7
 
 
41cb3f5
 
 
 
 
a9ec4f6
2d701b7
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
41cb3f5
 
a9ec4f6
2d701b7
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
41cb3f5
 
 
 
a9ec4f6
41cb3f5
a9ec4f6
41cb3f5
a9ec4f6
 
41cb3f5
 
 
 
a9ec4f6
 
 
41cb3f5
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
a9ec4f6
 
41cb3f5
a9ec4f6
2d701b7
41cb3f5
 
 
 
 
 
2d701b7
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
a9ec4f6
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
2d701b7
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
a9ec4f6
 
 
 
 
2d701b7
 
 
 
 
 
a9ec4f6
2d701b7
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
9542994
2d701b7
b261ad9
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
---
title: FinRyver
sdk: gradio
emoji: ๐Ÿ“ˆ
colorFrom: yellow
colorTo: yellow
pinned: true
---
title: FinRyver
emoji: ๐ŸŒ–
colorFrom: yellow
colorTo: yellow
sdk: docker
sdk_version: latest
app_file: app.py
pinned: false

# FinRyver ๐Ÿฆ

## ๐Ÿ“‹ Overview

FinRyver is an AI-powered financial statement generation platform that automatically converts trial balance data into comprehensive financial reports including balance sheets, cash flow statements, and profit & loss statements. Built with FastAPI and leveraging Large Language Models (LLMs) through LangGraph workflows, it streamlines the financial reporting process for accountants, auditors, and financial professionals.

**LangGraph Architecture**: FinRyver now features an intelligent **agentic system** powered by LangGraph that provides AI-driven workflow orchestration, state management, and intelligent task coordination for financial statement generation.

## ๐ŸŽฏ Key Features

- **Automated Trial Balance Processing**: Upload Excel files containing trial balance data and automatically extract structured financial information
- **AI-Powered Financial Notes Generation**: Utilize LLMs to generate comprehensive financial notes with detailed explanations and context
- **LangGraph Workflow Orchestration**: State-driven workflows that manage complex financial processing tasks with proper error handling and monitoring
- **Multi-Statement Support**: Generate Balance Sheets, Cash Flow Statements, and Profit & Loss statements from the same data source
- **Excel Output Generation**: Export all generated reports and notes to professional Excel formats
- **RESTful API Architecture**: Easy integration with existing financial systems through well-documented REST endpoints
- **Specialized Endpoints**: Dedicated routes for each financial statement type (`/notes`, `/pnl`, `/bs`, `/cf`)
- **Performance Monitoring**: Built-in timing and status tracking for all agentic workflows

## ๐Ÿ—๏ธ Project Architecture

```
FinRyver/
โ”œโ”€โ”€ app.py                 # Main FastAPI application with 4 specialized routes
โ”œโ”€โ”€ requirements.txt       # Python dependencies 
โ”œโ”€โ”€ Dockerfile            # Container configuration
โ”œโ”€โ”€ docker-compose.yml    # Multi-container orchestration
โ”œโ”€โ”€ 
โ”œโ”€โ”€ agents/               # LangGraph-based agentic system
โ”‚   โ”œโ”€โ”€ langgraph.py   # LangGraph workflow definitions
โ”‚   โ”œโ”€โ”€ simple_tools.py       # LangChain tools for financial processing
โ”‚   โ”œโ”€โ”€ base_config.py        # Agent configuration and utilities
โ”‚   โ””โ”€โ”€ simple_agent.py       # Financial statement agent (legacy)
โ”œโ”€โ”€ 
โ”œโ”€โ”€ bs/                   # Balance Sheet processing modules
โ”‚   โ”œโ”€โ”€ bl_llm.py         # Balance sheet LLM integration
โ”‚   โ”œโ”€โ”€ csv_json_bs.py    # Balance sheet data conversion
โ”‚   โ””โ”€โ”€ sircodebs.py      # Balance sheet generation logic
โ”œโ”€โ”€ 
โ”œโ”€โ”€ cf/                   # Cash Flow processing modules
โ”‚   โ”œโ”€โ”€ cf_generation.py  # Cash flow statement generation
โ”‚   โ”œโ”€โ”€ cf_middlestep.py  # Intermediate processing steps
โ”‚   โ””โ”€โ”€ csv_json_cf.py    # Cash flow data conversion
โ”œโ”€โ”€ 
โ”œโ”€โ”€ pnl/                  # Profit & Loss processing modules
โ”‚   โ”œโ”€โ”€ pnl_note.py       # P&L notes generation
โ”‚   โ””โ”€โ”€ sircodepnl.py     # P&L statement logic
โ”œโ”€โ”€ 
โ”œโ”€โ”€ notes/                # Core notes generation engine
โ”‚   โ”œโ”€โ”€ data_extraction.py     # Trial balance data extraction
โ”‚   โ”œโ”€โ”€ llm_notes_generator.py # LLM-powered note generation
โ”‚   โ”œโ”€โ”€ notes_generator.py     # Notes processing pipeline
โ”‚   โ”œโ”€โ”€ json_to_excel.py       # Excel export functionality
โ”‚   โ””โ”€โ”€ notes_template.py      # Note templates and formatting
โ”œโ”€โ”€ 
โ”œโ”€โ”€ utils/                # Shared utilities
โ”‚   โ”œโ”€โ”€ utils.py          # General utility functions
โ”‚   โ””โ”€โ”€ utils_normalize.py # Data normalization functions
โ”œโ”€โ”€ 
โ”œโ”€โ”€ config/               # Configuration files
โ”‚   โ”œโ”€โ”€ mapping1.json     # Account mapping configurations
โ”‚   โ””โ”€โ”€ rules1.json       # Business rules and validation
โ”œโ”€โ”€ 
โ””โ”€โ”€ data/                 # Data storage and processing
    โ”œโ”€โ”€ input/            # Uploaded trial balance files
    โ”œโ”€โ”€ output/           # Generated financial statements
    โ”œโ”€โ”€ csv_notes_*/      # Processed CSV data by statement type
    โ””โ”€โ”€ generated_notes/  # AI-generated financial notes
```

### Data Flow Architecture

```
Trial Balance Upload โ†’ Data Extraction โ†’ AI Processing โ†’ Financial Statements
        โ†“                    โ†“               โ†“              โ†“
    Excel File          JSON Structure   LLM Analysis    Excel Export
```

## ๐Ÿ› ๏ธ Technologies Used

### Backend Framework
- **FastAPI**: Modern, fast web framework for building APIs with Python
- **Uvicorn**: ASGI server implementation for FastAPI applications
- **Pydantic**: Data validation and settings management using Python type annotations

### Data Processing
- **Pandas**: Data manipulation and analysis library for structured financial data
- **OpenPyXL**: Excel file reading and writing capabilities
- **JSON**: Data interchange format for internal processing

### AI/ML Integration
- **Large Language Models (LLMs)**: For intelligent financial note generation and analysis
- **LangChain Framework**: Tool integration and AI agent development
- **LangGraph**: Workflow orchestration and state management for complex financial processing
- **OpenRouter API**: Flexible LLM provider integration for AI-powered analysis
- **Custom AI Pipelines**: Specialized processing for financial data interpretation

### Infrastructure
- **Docker**: Containerization for consistent deployment across environments
- **Docker Compose**: Multi-container application orchestration

### Development Tools
- **Python 3.11+**: Core programming language
- **Git**: Version control and collaboration

## ๐Ÿ’ป Implementation Details

### Core Components

1. **Trial Balance Processor** (`notes/data_extraction.py`)
   - Extracts and validates trial balance data from Excel uploads
   - Converts unstructured financial data into standardized JSON format
   - Implements data cleaning and normalization algorithms

2. **LLM Notes Generator** (`notes/llm_notes_generator.py`)
   - Integrates with language models for intelligent note generation
   - Contextualizes financial data with industry-standard explanations
   - Supports flexible note numbering and categorization

3. **Financial Statement Generators**
   - **Balance Sheet Module** (`bs/`): Generates comprehensive balance sheets with supporting notes
   - **Cash Flow Module** (`cf/`): Creates cash flow statements with categorized activities
   - **P&L Module** (`pnl/`): Produces profit & loss statements with detailed breakdowns

4. **Excel Export Engine** (`notes/json_to_excel.py`)
   - Converts processed JSON data into professional Excel formats
   - Maintains financial statement formatting standards
   - Supports multiple output templates

### Design Patterns
- **Modular Architecture**: Separation of concerns across financial statement types
- **Factory Pattern**: Dynamic generation of financial reports based on input data
- **Pipeline Pattern**: Sequential data processing from upload to final output

### API Endpoints

| Endpoint | Method | Description |
|----------|--------|-------------|
| `/notes` | POST | Generate financial notes from trial balance using LangGraph workflow |
| `/pnl` | POST | Generate P&L statement from trial balance using LangGraph workflow |
| `/bs` | POST | Generate balance sheet from trial balance using LangGraph workflow |
| `/cf` | POST | Generate cash flow statement from trial balance using LangGraph workflow |

#### LangGraph-Powered Endpoints

All endpoints follow the same pattern and use LangGraph workflows for intelligent task orchestration:

**Parameters:**
- `file`: Trial balance Excel file (multipart/form-data)

**Response:**
- Excel file download with the generated financial statement

**Example Usage:**
```bash
# Generate financial notes
curl -X POST "http://localhost:8000/notes" \
  -H "Content-Type: multipart/form-data" \
  --output notes.xlsx

# Generate P&L statement
curl -X POST "http://localhost:8000/pnl" \
  -H "Content-Type: multipart/form-data" \
  --output pnl_statement.xlsx

# Generate balance sheet
curl -X POST "http://localhost:8000/bs" \
  -H "Content-Type: multipart/form-data" \
  --output balance_sheet.xlsx

# Generate cash flow statement
curl -X POST "http://localhost:8000/cf" \
  -H "Content-Type: multipart/form-data" \
  -F "file=@trial_balance.xlsx" \
  --output cash_flow.xlsx
```

**LangGraph Workflow Features:**
- **State Management**: Each workflow tracks execution state, timing, and errors
- **Error Handling**: Comprehensive error capture and reporting
- **Performance Monitoring**: Built-in timing for workflow execution
- **Tool Integration**: Seamless integration with LangChain tools

## ๐Ÿ“Š Results & Examples

### Input Format
Upload trial balance Excel files containing:
- Account codes and descriptions
- Debit/Credit amounts
- Account categories and classifications

### Output Examples
- **Financial Notes**: AI-generated explanations for each financial statement line item
- **Balance Sheet**: Comprehensive balance sheet with assets, liabilities, and equity
- **Cash Flow Statement**: Operating, investing, and financing activities breakdown
- **P&L Statement**: Revenue, expenses, and profit analysis

### Performance Metrics
- **Processing Time**: < 30 seconds for standard trial balance files
- **Accuracy**: 95%+ accuracy in financial data extraction and categorization
- **Note Quality**: Professional-grade financial notes suitable for audit and compliance

## ๐Ÿš€ Setup & Usage

### Prerequisites
- Python 3.11 or higher
- Docker and Docker Compose (for containerized deployment)
- 4GB+ RAM for LLM processing

### Installation

#### Local Development
```bash
# Clone the repository
git clone https://github.com/santhoshmallojwala/finryver.git
cd finryver

# Create virtual environment
python -m venv .venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

# Run the application
uvicorn app:app --host 0.0.0.0 --port 8000 --reload
```

#### Docker Deployment
```bash
# Build and run with Docker
docker-compose up -d

# Or build manually
docker build -t finryver .
docker run -p 8000:8000 finryver
```

### Configuration
1. **Environment Variables**: Configure API keys and LLM settings in `.env`
2. **Business Rules**: Modify `config/rules1.json` for custom validation rules
3. **Account Mapping**: Update `config/mapping1.json` for account categorization

#### Agentic System Configuration

For the intelligent agent features, ensure you have your LLM API key configured:

```env
# Required for agentic system
OPENROUTER_API_KEY=your_openrouter_api_key_here

# Optional agent configuration
AGENT_MODEL=gpt-3.5-turbo
AGENT_TEMPERATURE=0.1
AGENT_MAX_TOKENS=2000
```

**Agent Benefits:**
- Natural language understanding for financial tasks
- Intelligent workflow orchestration
- Unified interface for all financial statements
- Error recovery and retry logic
- Contextual financial analysis

### Usage Examples

#### Generate Complete Financial Report
```bash
curl -X POST "http://localhost:8000/new" \
  -H "Content-Type: multipart/form-data" \
  -F "file=@trial_balance.xlsx" \
  -F "note_number=1,2,3,4,5"
```

#### Generate Specific Financial Statement
```bash
# Balance Sheet from existing notes
curl -X POST "http://localhost:8000/bs_from_notes" \
  -H "Content-Type: multipart/form-data" \
  -F "file=@notes.json"
```

### API Documentation
Access interactive API documentation at `http://localhost:8000/docs` when the application is running.

## ๐Ÿงช Testing

### Testing Framework
- **Manual Testing**: Comprehensive testing with sample trial balance files
- **Integration Testing**: End-to-end API endpoint validation
- **Data Validation**: Financial calculation accuracy verification

### Running Tests
```bash
# Test API endpoints
curl -X GET "http://localhost:8000/docs"

# Validate with sample data
python -m pytest tests/ --verbose
```

## ๐Ÿ“š Documentation

- **API Documentation**: Available at `/docs` endpoint when running
- **Financial Standards**: Adheres to GAAP/IFRS reporting standards
- **Code Documentation**: Inline comments and docstrings throughout codebase

## ๐Ÿ”ฎ Future Roadmap

### Completed Features โœ…
- **Intelligent Agentic System**: LangChain-powered agents with natural language processing
- **Unified Agent Interface**: Single endpoint for all financial statement generation
- **Optional Note Numbers**: Flexible note generation (specific or all notes)

### Planned Features
- **Multi-Currency Support**: Handle international financial statements
- **Advanced AI Models**: Integration with latest financial AI models
- **Real-time Processing**: WebSocket support for live data updates
- **Audit Trail**: Comprehensive logging and change tracking
- **Custom Templates**: User-defined financial statement templates
- **Agent Conversation History**: Multi-turn conversations with financial agents

### Known Limitations
- Currently supports Excel input formats only
- Limited to standard chart of accounts structures
- Requires internet connectivity for LLM operations

### Development Timeline
- **Q1 2025**: Multi-currency support and enhanced validation
- **Q2 2025**: Advanced AI model integration
- **Q3 2025**: Real-time processing capabilities
- **Q4 2025**: Enterprise audit and compliance features

## ๐Ÿ‘ฅ Contributors

### Core Team
- **Santosh Mallojwala** - Project Lead & Backend Development
- **Point9 AI Team** - AI/ML Integration and Architecture

### Contribution Guidelines
1. Fork the repository
2. Create feature branches (`feature/your-feature-name`)
3. Follow PEP 8 coding standards
4. Add comprehensive tests for new features
5. Submit pull requests with detailed descriptions

### Acknowledgments
- OpenAI and LLM providers for AI capabilities
- FastAPI community for framework support
- Financial industry experts for domain guidance

## ๐Ÿ“„ License

This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.

### Usage Restrictions
- Commercial use permitted with attribution
- Ensure compliance with local financial regulations
- AI-generated content should be reviewed by qualified professionals

---

**FinRyver** - Transforming Financial Reporting with AI ๐Ÿš€