---
title: FinRyver
sdk: gradio
emoji: 📈
colorFrom: yellow
colorTo: yellow
pinned: true
---
title: FinRyver
emoji: 🌖
colorFrom: yellow
colorTo: yellow
sdk: docker
sdk_version: latest
app_file: app.py
pinned: false
# FinRyver 🏦
## 📋 Overview
FinRyver is an AI-powered financial statement generation platform that automatically converts trial balance data into comprehensive financial reports including balance sheets, cash flow statements, and profit & loss statements. Built with FastAPI and leveraging Large Language Models (LLMs) through LangGraph workflows, it streamlines the financial reporting process for accountants, auditors, and financial professionals.
**LangGraph Architecture**: FinRyver uses a LangGraph-based **agentic system** for workflow orchestration, state management, and task coordination during financial statement generation.
## 🎯 Key Features
- **Automated Trial Balance Processing**: Upload Excel files containing trial balance data and automatically extract structured financial information
- **AI-Powered Financial Notes Generation**: Utilize LLMs to generate comprehensive financial notes with detailed explanations and context
- **LangGraph Workflow Orchestration**: State-driven workflows that manage complex financial processing tasks with proper error handling and monitoring
- **Multi-Statement Support**: Generate Balance Sheets, Cash Flow Statements, and Profit & Loss statements from the same data source
- **Excel Output Generation**: Export all generated reports and notes to professional Excel formats
- **RESTful API Architecture**: Easy integration with existing financial systems through well-documented REST endpoints
- **Specialized Endpoints**: Dedicated routes for each financial statement type (`/notes`, `/pnl`, `/bs`, `/cf`)
- **Performance Monitoring**: Built-in timing and status tracking for all agentic workflows
## 🏗️ Project Architecture
```
FinRyver/
├── app.py                       # Main FastAPI application with 4 specialized routes
├── requirements.txt             # Python dependencies
├── Dockerfile                   # Container configuration
├── docker-compose.yml           # Multi-container orchestration
│
├── agents/                      # LangGraph-based agentic system
│   ├── langgraph.py             # LangGraph workflow definitions
│   ├── simple_tools.py          # LangChain tools for financial processing
│   ├── base_config.py           # Agent configuration and utilities
│   └── simple_agent.py          # Financial statement agent (legacy)
│
├── bs/                          # Balance Sheet processing modules
│   ├── bl_llm.py                # Balance sheet LLM integration
│   ├── csv_json_bs.py           # Balance sheet data conversion
│   └── sircodebs.py             # Balance sheet generation logic
│
├── cf/                          # Cash Flow processing modules
│   ├── cf_generation.py         # Cash flow statement generation
│   ├── cf_middlestep.py         # Intermediate processing steps
│   └── csv_json_cf.py           # Cash flow data conversion
│
├── pnl/                         # Profit & Loss processing modules
│   ├── pnl_note.py              # P&L notes generation
│   └── sircodepnl.py            # P&L statement logic
│
├── notes/                       # Core notes generation engine
│   ├── data_extraction.py       # Trial balance data extraction
│   ├── llm_notes_generator.py   # LLM-powered note generation
│   ├── notes_generator.py       # Notes processing pipeline
│   ├── json_to_excel.py         # Excel export functionality
│   └── notes_template.py        # Note templates and formatting
│
├── utils/                       # Shared utilities
│   ├── utils.py                 # General utility functions
│   └── utils_normalize.py       # Data normalization functions
│
├── config/                      # Configuration files
│   ├── mapping1.json            # Account mapping configurations
│   └── rules1.json              # Business rules and validation
│
└── data/                        # Data storage and processing
    ├── input/                   # Uploaded trial balance files
    ├── output/                  # Generated financial statements
    ├── csv_notes_*/             # Processed CSV data by statement type
    └── generated_notes/         # AI-generated financial notes
```
### Data Flow Architecture
```
Trial Balance Upload → Data Extraction → AI Processing → Financial Statements
         ↓                   ↓                ↓                  ↓
     Excel File        JSON Structure    LLM Analysis       Excel Export
```
## 🛠️ Technologies Used
### Backend Framework
- **FastAPI**: Modern, fast web framework for building APIs with Python
- **Uvicorn**: ASGI server implementation for FastAPI applications
- **Pydantic**: Data validation and settings management using Python type annotations
### Data Processing
- **Pandas**: Data manipulation and analysis library for structured financial data
- **OpenPyXL**: Excel file reading and writing capabilities
- **JSON**: Data interchange format for internal processing
### AI/ML Integration
- **Large Language Models (LLMs)**: For intelligent financial note generation and analysis
- **LangChain Framework**: Tool integration and AI agent development
- **LangGraph**: Workflow orchestration and state management for complex financial processing
- **OpenRouter API**: Flexible LLM provider integration for AI-powered analysis
- **Custom AI Pipelines**: Specialized processing for financial data interpretation
### Infrastructure
- **Docker**: Containerization for consistent deployment across environments
- **Docker Compose**: Multi-container application orchestration
### Development Tools
- **Python 3.11+**: Core programming language
- **Git**: Version control and collaboration
## 💻 Implementation Details
### Core Components
1. **Trial Balance Processor** (`notes/data_extraction.py`)
- Extracts and validates trial balance data from Excel uploads
- Converts unstructured financial data into standardized JSON format
- Implements data cleaning and normalization algorithms
2. **LLM Notes Generator** (`notes/llm_notes_generator.py`)
- Integrates with language models for intelligent note generation
- Contextualizes financial data with industry-standard explanations
- Supports flexible note numbering and categorization
3. **Financial Statement Generators**
- **Balance Sheet Module** (`bs/`): Generates comprehensive balance sheets with supporting notes
- **Cash Flow Module** (`cf/`): Creates cash flow statements with categorized activities
- **P&L Module** (`pnl/`): Produces profit & loss statements with detailed breakdowns
4. **Excel Export Engine** (`notes/json_to_excel.py`)
- Converts processed JSON data into professional Excel formats
- Maintains financial statement formatting standards
- Supports multiple output templates
### Design Patterns
- **Modular Architecture**: Separation of concerns across financial statement types
- **Factory Pattern**: Dynamic generation of financial reports based on input data
- **Pipeline Pattern**: Sequential data processing from upload to final output
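
The pipeline and factory patterns listed above could be combined roughly as follows. This is an illustrative sketch in plain Python, not FinRyver's actual code: the stage functions, the `GENERATORS` registry, and the row layout are all hypothetical.

```python
from typing import Callable

Rows = list[dict]


def extract(rows: Rows) -> Rows:
    """Stage 1: drop rows without an account name (stand-in for real extraction)."""
    return [r for r in rows if r.get("account")]


def normalize(rows: Rows) -> Rows:
    """Stage 2: trim account names and coerce amounts to float."""
    return [
        {"account": r["account"].strip(), "amount": float(r.get("amount", 0))}
        for r in rows
    ]


def run_pipeline(rows: Rows, stages: list[Callable[[Rows], Rows]]) -> Rows:
    """Pipeline pattern: feed each stage's output into the next stage."""
    for stage in stages:
        rows = stage(rows)
    return rows


def build_pnl(rows: Rows) -> dict:
    """Hypothetical P&L builder: only sums the amounts."""
    return {"statement": "pnl", "total": sum(r["amount"] for r in rows)}


def build_bs(rows: Rows) -> dict:
    """Hypothetical balance sheet builder: only counts line items."""
    return {"statement": "bs", "line_items": len(rows)}


# Factory pattern: pick the builder from the requested statement type.
GENERATORS: dict[str, Callable[[Rows], dict]] = {"pnl": build_pnl, "bs": build_bs}


def generate(statement_type: str, rows: Rows) -> dict:
    """Run the shared pipeline, then dispatch to the matching builder."""
    try:
        builder = GENERATORS[statement_type]
    except KeyError:
        raise ValueError(f"Unsupported statement type: {statement_type!r}") from None
    return builder(run_pipeline(rows, [extract, normalize]))
```

Keeping the stages as plain functions means a new statement type only needs one more entry in the registry, while extraction and normalization stay shared.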
### API Endpoints
| Endpoint | Method | Description |
|----------|--------|-------------|
| `/notes` | POST | Generate financial notes from trial balance using LangGraph workflow |
| `/pnl` | POST | Generate P&L statement from trial balance using LangGraph workflow |
| `/bs` | POST | Generate balance sheet from trial balance using LangGraph workflow |
| `/cf` | POST | Generate cash flow statement from trial balance using LangGraph workflow |
#### LangGraph-Powered Endpoints
All endpoints follow the same pattern and use LangGraph workflows for intelligent task orchestration:
**Parameters:**
- `file`: Trial balance Excel file (multipart/form-data)
**Response:**
- Excel file download with the generated financial statement
**Example Usage:**
```bash
# Generate financial notes
curl -X POST "http://localhost:8000/notes" \
  -H "Content-Type: multipart/form-data" \
  -F "file=@trial_balance.xlsx" \
  --output notes.xlsx
# Generate P&L statement
curl -X POST "http://localhost:8000/pnl" \
  -H "Content-Type: multipart/form-data" \
  -F "file=@trial_balance.xlsx" \
  --output pnl_statement.xlsx
# Generate balance sheet
curl -X POST "http://localhost:8000/bs" \
  -H "Content-Type: multipart/form-data" \
  -F "file=@trial_balance.xlsx" \
  --output balance_sheet.xlsx
# Generate cash flow statement
curl -X POST "http://localhost:8000/cf" \
  -H "Content-Type: multipart/form-data" \
  -F "file=@trial_balance.xlsx" \
  --output cash_flow.xlsx
```
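
The same upload can be made from Python. The sketch below uses only the standard library and assumes, as in the curl examples, that the server listens on `http://localhost:8000` and expects a single form field named `file`. The helper names `encode_multipart` and `generate_statement` are illustrative and not part of FinRyver.

```python
import uuid
from pathlib import Path
from urllib import request

XLSX_MIME = "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet"


def encode_multipart(field: str, filename: str, payload: bytes) -> tuple[bytes, str]:
    """Build a multipart/form-data body with one file part.

    Returns the encoded body and the matching Content-Type header value.
    """
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        f"Content-Type: {XLSX_MIME}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + payload + tail, f"multipart/form-data; boundary={boundary}"


def generate_statement(base_url: str, endpoint: str, xlsx_path: str, out_path: str) -> None:
    """POST a trial balance to /notes, /pnl, /bs, or /cf and save the Excel reply."""
    source = Path(xlsx_path)
    body, content_type = encode_multipart("file", source.name, source.read_bytes())
    url = base_url.rstrip("/") + "/" + endpoint.lstrip("/")
    req = request.Request(
        url, data=body, headers={"Content-Type": content_type}, method="POST"
    )
    with request.urlopen(req) as response:
        Path(out_path).write_bytes(response.read())
```

With the server running, `generate_statement("http://localhost:8000", "cf", "trial_balance.xlsx", "cash_flow.xlsx")` mirrors the last curl command above.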
**LangGraph Workflow Features:**
- **State Management**: Each workflow tracks execution state, timing, and errors
- **Error Handling**: Comprehensive error capture and reporting
- **Performance Monitoring**: Built-in timing for workflow execution
- **Tool Integration**: Seamless integration with LangChain tools
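
To make the state, timing, and error tracking described above concrete, here is a plain-Python sketch of the idea. It deliberately does not use the LangGraph API, and `WorkflowState`, `run_workflow`, and the step names are hypothetical rather than taken from `agents/langgraph.py`.

```python
import time
from typing import Callable, Optional, TypedDict


class WorkflowState(TypedDict):
    status: str                # "pending", "success", or "failed"
    completed: list[str]       # names of the steps that finished
    timings: dict[str, float]  # seconds spent in each attempted step
    error: Optional[str]       # message from the failing step, if any
    data: dict                 # payload handed from step to step


Step = Callable[[dict], dict]


def run_workflow(steps: list[tuple[str, Step]], data: dict) -> WorkflowState:
    """Run steps in order, recording timing per step and stopping at the first error."""
    state: WorkflowState = {
        "status": "pending",
        "completed": [],
        "timings": {},
        "error": None,
        "data": data,
    }
    for name, step in steps:
        started = time.perf_counter()
        try:
            state["data"] = step(state["data"])
        except Exception as exc:
            state["status"] = "failed"
            state["error"] = f"{name}: {exc}"
            return state
        finally:
            # Runs on success and on failure, so failed steps are timed too.
            state["timings"][name] = time.perf_counter() - started
        state["completed"].append(name)
    state["status"] = "success"
    return state
```

Because the state is returned even when a step fails, a caller can report which step broke and how long each attempted step took.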
## 📊 Results & Examples
### Input Format
Upload trial balance Excel files containing:
- Account codes and descriptions
- Debit/Credit amounts
- Account categories and classifications
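
A trial balance is only usable if total debits equal total credits, so a pre-upload check along these lines can catch bad files early. The column names in `REQUIRED_COLUMNS` are assumptions made for illustration; the headers FinRyver actually expects may differ.

```python
from decimal import Decimal

# Hypothetical column names; adjust to the headers in your workbook.
REQUIRED_COLUMNS = {"account_code", "description", "debit", "credit"}


def validate_trial_balance(rows: list[dict]) -> dict:
    """Check required columns and that total debits equal total credits."""
    errors = []
    total_debit = Decimal("0")
    total_credit = Decimal("0")
    for index, row in enumerate(rows, start=1):
        missing = REQUIRED_COLUMNS - row.keys()
        if missing:
            errors.append(f"row {index}: missing {sorted(missing)}")
            continue
        # Empty cells (None or "") count as zero; Decimal avoids float rounding.
        total_debit += Decimal(str(row["debit"] or 0))
        total_credit += Decimal(str(row["credit"] or 0))
    if total_debit != total_credit:
        errors.append(f"out of balance by {total_debit - total_credit}")
    return {"debit": total_debit, "credit": total_credit, "errors": errors}
```

An empty `errors` list means every row had the expected columns and the file balances.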
### Output Examples
- **Financial Notes**: AI-generated explanations for each financial statement line item
- **Balance Sheet**: Comprehensive balance sheet with assets, liabilities, and equity
- **Cash Flow Statement**: Operating, investing, and financing activities breakdown
- **P&L Statement**: Revenue, expenses, and profit analysis
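
For the balance sheet in particular, the output should satisfy the accounting equation, Assets = Liabilities + Equity. A minimal check, assuming each line item carries a hypothetical `section` and `amount` field (the generated workbook's real layout is not specified here), could look like this:

```python
from decimal import Decimal


def summarize_balance_sheet(items: list[dict]) -> dict:
    """Total line items per section and test Assets = Liabilities + Equity."""
    totals = {
        "assets": Decimal("0"),
        "liabilities": Decimal("0"),
        "equity": Decimal("0"),
    }
    for item in items:
        section = item["section"]
        if section not in totals:
            raise ValueError(f"Unknown section: {section!r}")
        totals[section] += Decimal(str(item["amount"]))
    balanced = totals["assets"] == totals["liabilities"] + totals["equity"]
    return {**totals, "balanced": balanced}
```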
### Performance Metrics
- **Processing Time**: < 30 seconds for standard trial balance files
- **Accuracy**: 95%+ accuracy in financial data extraction and categorization
- **Note Quality**: Professional-grade financial notes suitable for audit and compliance
## 🚀 Setup & Usage
### Prerequisites
- Python 3.11 or higher
- Docker and Docker Compose (for containerized deployment)
- 4GB+ RAM for LLM processing
### Installation
#### Local Development
```bash
# Clone the repository
git clone https://github.com/santhoshmallojwala/finryver.git
cd finryver
# Create virtual environment
python -m venv .venv
source .venv/bin/activate # On Windows: .venv\Scripts\activate
# Install dependencies
pip install -r requirements.txt
# Run the application
uvicorn app:app --host 0.0.0.0 --port 8000 --reload
```
#### Docker Deployment
```bash
# Build and run with Docker
docker-compose up -d
# Or build manually
docker build -t finryver .
docker run -p 8000:8000 finryver
```
### Configuration
1. **Environment Variables**: Configure API keys and LLM settings in `.env`
2. **Business Rules**: Modify `config/rules1.json` for custom validation rules
3. **Account Mapping**: Update `config/mapping1.json` for account categorization
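
The schema of `config/mapping1.json` is not documented here, so the following is only a guess at how an account mapping might be applied: a flat object from account-code prefix to category, resolved by longest matching prefix. Both `EXAMPLE_MAPPING` and `categorize` are hypothetical.

```python
import json

# Hypothetical layout; the real config/mapping1.json may look different.
EXAMPLE_MAPPING = json.loads(
    '{"1": "Assets", "12": "Receivables", "2": "Liabilities", "3": "Equity"}'
)


def categorize(account_code: str, mapping: dict[str, str], default: str = "Unmapped") -> str:
    """Return the category of the longest mapping key that prefixes the code."""
    matches = [prefix for prefix in mapping if account_code.startswith(prefix)]
    if not matches:
        return default
    return mapping[max(matches, key=len)]
```

Longest-prefix matching lets a specific entry such as `12` override a broader one such as `1` without reordering the file.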
#### Agentic System Configuration
For the intelligent agent features, ensure you have your LLM API key configured:
```env
# Required for agentic system
OPENROUTER_API_KEY=your_openrouter_api_key_here
# Optional agent configuration
AGENT_MODEL=gpt-3.5-turbo
AGENT_TEMPERATURE=0.1
AGENT_MAX_TOKENS=2000
```
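
A small loader for these variables might look like the sketch below. It assumes the variables are already present in the process environment (for example exported by Docker Compose or a `.env` loader). It reuses the example values above as fallbacks, which is an assumption rather than FinRyver's documented defaults. `AgentSettings` and `load_agent_settings` are illustrative names.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class AgentSettings:
    api_key: str
    model: str
    temperature: float
    max_tokens: int


def load_agent_settings(env: Optional[Mapping[str, str]] = None) -> AgentSettings:
    """Read agent settings from the environment and fail fast without an API key."""
    env = os.environ if env is None else env
    api_key = env.get("OPENROUTER_API_KEY", "").strip()
    if not api_key:
        raise RuntimeError("OPENROUTER_API_KEY is not set")
    return AgentSettings(
        api_key=api_key,
        # Fallbacks copy the example values above (assumed, not confirmed defaults).
        model=env.get("AGENT_MODEL", "gpt-3.5-turbo"),
        temperature=float(env.get("AGENT_TEMPERATURE", "0.1")),
        max_tokens=int(env.get("AGENT_MAX_TOKENS", "2000")),
    )
```

Passing a plain dictionary instead of `os.environ` makes the loader easy to exercise in tests.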
**Agent Benefits:**
- Natural language understanding for financial tasks
- Intelligent workflow orchestration
- Unified interface for all financial statements
- Error recovery and retry logic
- Contextual financial analysis
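
How the retry logic is implemented is not described here. One common approach for flaky LLM calls is exponential backoff, sketched below under that assumption. `call_with_retry` and its parameters are hypothetical and not part of FinRyver.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retry(
    func: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call func(); after a failure wait base_delay * 2**n seconds and try again.

    The exception from the final attempt is re-raised unchanged.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return func()
        except Exception:
            if attempt == attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)
    raise AssertionError("unreachable")  # the loop always returns or raises
```

The `sleep` parameter exists so tests can substitute a recorder and avoid real waiting.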
### Usage Examples
#### Generate Complete Financial Report
```bash
curl -X POST "http://localhost:8000/new" \
-H "Content-Type: multipart/form-data" \
-F "file=@trial_balance.xlsx" \
-F "note_number=1,2,3,4,5"
```
#### Generate Specific Financial Statement
```bash
# Balance Sheet from existing notes
curl -X POST "http://localhost:8000/bs_from_notes" \
-H "Content-Type: multipart/form-data" \
-F "file=@notes.json"
```
### API Documentation
Access interactive API documentation at `http://localhost:8000/docs` when the application is running.
## 🧪 Testing
### Testing Framework
- **Manual Testing**: Comprehensive testing with sample trial balance files
- **Integration Testing**: End-to-end API endpoint validation
- **Data Validation**: Financial calculation accuracy verification
### Running Tests
```bash
# Confirm the server is up by fetching the interactive docs page
curl -X GET "http://localhost:8000/docs"
# Validate with sample data
python -m pytest tests/ --verbose
```
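
When validating endpoints end to end, it helps to assert that the response body really is an Excel workbook rather than an error page. The helper below is a hypothetical test utility, not part of the repository. It relies on the fact that `.xlsx` files are ZIP archives and checks for the parts that Excel and OpenPyXL normally write.

```python
import io
import zipfile


def looks_like_xlsx(payload: bytes) -> bool:
    """Return True if payload is a ZIP archive with the usual .xlsx parts."""
    try:
        with zipfile.ZipFile(io.BytesIO(payload)) as archive:
            names = set(archive.namelist())
    except zipfile.BadZipFile:
        return False
    return "[Content_Types].xml" in names and "xl/workbook.xml" in names
```

In a pytest test this could be used as `assert looks_like_xlsx(response_bytes)` on the bytes returned by `/notes`, `/pnl`, `/bs`, or `/cf`.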
## 📚 Documentation
- **API Documentation**: Available at `/docs` endpoint when running
- **Financial Standards**: Adheres to GAAP/IFRS reporting standards
- **Code Documentation**: Inline comments and docstrings throughout codebase
## 🔮 Future Roadmap
### Completed Features ✅
- **Intelligent Agentic System**: LangChain-powered agents with natural language processing
- **Unified Agent Interface**: Single endpoint for all financial statement generation
- **Optional Note Numbers**: Flexible note generation (specific or all notes)
### Planned Features
- **Multi-Currency Support**: Handle international financial statements
- **Advanced AI Models**: Integration with latest financial AI models
- **Real-time Processing**: WebSocket support for live data updates
- **Audit Trail**: Comprehensive logging and change tracking
- **Custom Templates**: User-defined financial statement templates
- **Agent Conversation History**: Multi-turn conversations with financial agents
### Known Limitations
- Currently supports Excel input formats only
- Limited to standard chart of accounts structures
- Requires internet connectivity for LLM operations
### Development Timeline
- **Q1 2025**: Multi-currency support and enhanced validation
- **Q2 2025**: Advanced AI model integration
- **Q3 2025**: Real-time processing capabilities
- **Q4 2025**: Enterprise audit and compliance features
## 👥 Contributors
### Core Team
- **Santosh Mallojwala** - Project Lead & Backend Development
- **Point9 AI Team** - AI/ML Integration and Architecture
### Contribution Guidelines
1. Fork the repository
2. Create feature branches (`feature/your-feature-name`)
3. Follow PEP 8 coding standards
4. Add comprehensive tests for new features
5. Submit pull requests with detailed descriptions
### Acknowledgments
- OpenAI and LLM providers for AI capabilities
- FastAPI community for framework support
- Financial industry experts for domain guidance
## 📄 License
This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
### Usage Restrictions
- Commercial use permitted with attribution
- Ensure compliance with local financial regulations
- AI-generated content should be reviewed by qualified professionals
---
**FinRyver** - Transforming Financial Reporting with AI 🚀