finryver-dev / README.md
Sahil Garg
agent added, files name changed
a9ec4f6
|
raw
history blame
13.7 kB

FinRyver ๐Ÿฆ

๐Ÿ“‹ Overview

FinRyver is an AI-powered financial statement generation platform that automatically converts trial balance data into comprehensive financial reports including balance sheets, cash flow statements, and profit & loss statements. Built with FastAPI and leveraging Large Language Models (LLMs), it streamlines the financial reporting process for accountants, auditors, and financial professionals by automating the generation of detailed financial notes and statements from structured trial balance inputs.

New in 2025: FinRyver now features an intelligent agentic system powered by LangChain that provides AI-driven automation, natural language processing, and intelligent decision-making for financial statement generation.

๐ŸŽฏ Key Features

  • Automated Trial Balance Processing: Upload Excel files containing trial balance data and automatically extract structured financial information
  • AI-Powered Financial Notes Generation: Utilize LLMs to generate comprehensive financial notes with detailed explanations and context
  • Intelligent Agentic System: LangChain-powered agents that understand natural language instructions and automate complex financial workflows
  • Multi-Statement Support: Generate Balance Sheets, Cash Flow Statements, and Profit & Loss statements from the same data source
  • Excel Output Generation: Export all generated reports and notes to professional Excel formats
  • RESTful API Architecture: Easy integration with existing financial systems through well-documented REST endpoints
  • Flexible Note Selection: Generate specific financial notes by number or create comprehensive reports covering all relevant sections
  • Unified Agent Interface: Single /agent/generate endpoint for intelligent financial statement generation

๐Ÿ—๏ธ Project Architecture

FinRyver/
โ”œโ”€โ”€ app.py                 # Main FastAPI application with API endpoints
โ”œโ”€โ”€ requirements.txt       # Python dependencies
โ”œโ”€โ”€ Dockerfile            # Container configuration
โ”œโ”€โ”€ docker-compose.yml    # Multi-container orchestration
โ”œโ”€โ”€ 
โ”œโ”€โ”€ agents/               # Agentic system (LangChain)
โ”‚   โ”œโ”€โ”€ base_config.py    # Agent configuration and utilities
โ”‚   โ”œโ”€โ”€ simple_agent.py   # Financial statement agent
โ”‚   โ””โ”€โ”€ simple_tools.py   # LangChain tools for financial processing
โ”œโ”€โ”€ 
โ”œโ”€โ”€ bs/                   # Balance Sheet processing modules
โ”‚   โ”œโ”€โ”€ bl_llm.py         # Balance sheet LLM integration
โ”‚   โ”œโ”€โ”€ csv_json_bs.py    # Balance sheet data conversion
โ”‚   โ””โ”€โ”€ sircodebs.py      # Balance sheet generation logic
โ”œโ”€โ”€ 
โ”œโ”€โ”€ cf/                   # Cash Flow processing modules
โ”‚   โ”œโ”€โ”€ cf_generation.py  # Cash flow statement generation
โ”‚   โ”œโ”€โ”€ cf_middlestep.py  # Intermediate processing steps
โ”‚   โ””โ”€โ”€ csv_json_cf.py    # Cash flow data conversion
โ”œโ”€โ”€ 
โ”œโ”€โ”€ pnl/                  # Profit & Loss processing modules
โ”‚   โ”œโ”€โ”€ pnl_note.py       # P&L notes generation
โ”‚   โ””โ”€โ”€ sircodepnl.py     # P&L statement logic
โ”œโ”€โ”€ 
โ”œโ”€โ”€ notes/                # Core notes generation engine
โ”‚   โ”œโ”€โ”€ data_extraction.py     # Trial balance data extraction
โ”‚   โ”œโ”€โ”€ llm_notes_generator.py # LLM-powered note generation
โ”‚   โ”œโ”€โ”€ notes_generator.py     # Notes processing pipeline
โ”‚   โ”œโ”€โ”€ json_to_excel.py       # Excel export functionality
โ”‚   โ””โ”€โ”€ notes_template.py      # Note templates and formatting
โ”œโ”€โ”€ 
โ”œโ”€โ”€ utils/                # Shared utilities
โ”‚   โ”œโ”€โ”€ utils.py          # General utility functions
โ”‚   โ””โ”€โ”€ utils_normalize.py # Data normalization functions
โ”œโ”€โ”€ 
โ”œโ”€โ”€ config/               # Configuration files
โ”‚   โ”œโ”€โ”€ mapping1.json     # Account mapping configurations
โ”‚   โ””โ”€โ”€ rules1.json       # Business rules and validation
โ”œโ”€โ”€ 
โ””โ”€โ”€ data/                 # Data storage and processing
    โ”œโ”€โ”€ input/            # Uploaded trial balance files
    โ”œโ”€โ”€ output/           # Generated financial statements
    โ”œโ”€โ”€ csv_notes_*/      # Processed CSV data by statement type
    โ””โ”€โ”€ generated_notes/  # AI-generated financial notes

Data Flow Architecture

Trial Balance Upload โ†’ Data Extraction โ†’ AI Processing โ†’ Financial Statements
        โ†“                    โ†“               โ†“              โ†“
    Excel File          JSON Structure   LLM Analysis    Excel Export

๐Ÿ› ๏ธ Technologies Used

Backend Framework

  • FastAPI: Modern, fast web framework for building APIs with Python
  • Uvicorn: ASGI server implementation for FastAPI applications
  • Pydantic: Data validation and settings management using Python type annotations

Data Processing

  • Pandas: Data manipulation and analysis library for structured financial data
  • OpenPyXL: Excel file reading and writing capabilities
  • JSON: Data interchange format for internal processing

AI/ML Integration

  • Large Language Models (LLMs): For intelligent financial note generation and analysis
  • LangChain Framework: Agentic system for intelligent financial statement processing
  • OpenRouter API: Flexible LLM provider integration for AI-powered analysis
  • Custom AI Pipelines: Specialized processing for financial data interpretation

Infrastructure

  • Docker: Containerization for consistent deployment across environments
  • Docker Compose: Multi-container application orchestration

Development Tools

  • Python 3.11+: Core programming language
  • Git: Version control and collaboration

๐Ÿ’ป Implementation Details

Core Components

  1. Trial Balance Processor (notes/data_extraction.py)

    • Extracts and validates trial balance data from Excel uploads
    • Converts unstructured financial data into standardized JSON format
    • Implements data cleaning and normalization algorithms
  2. LLM Notes Generator (notes/llm_notes_generator.py)

    • Integrates with language models for intelligent note generation
    • Contextualizes financial data with industry-standard explanations
    • Supports flexible note numbering and categorization
  3. Financial Statement Generators

    • Balance Sheet Module (bs/): Generates comprehensive balance sheets with supporting notes
    • Cash Flow Module (cf/): Creates cash flow statements with categorized activities
    • P&L Module (pnl/): Produces profit & loss statements with detailed breakdowns
  4. Excel Export Engine (notes/json_to_excel.py)

    • Converts processed JSON data into professional Excel formats
    • Maintains financial statement formatting standards
    • Supports multiple output templates

Design Patterns

  • Modular Architecture: Separation of concerns across financial statement types
  • Factory Pattern: Dynamic generation of financial reports based on input data
  • Pipeline Pattern: Sequential data processing from upload to final output

API Endpoints

Endpoint Method Description
/new POST Generate financial notes and Excel output from trial balance
/hardcoded POST Process predefined trial balance templates
/bs_from_notes POST Generate balance sheet from existing notes
/pnl_from_notes POST Generate P&L statement from existing notes
/cf_from_notes POST Generate cash flow statement from existing notes
/agent/generate POST NEW: Intelligent agent-based financial statement generation

Agentic System Endpoint

POST /agent/generate - Unified intelligent financial statement generation

Parameters:

  • file: Trial balance Excel file
  • note_numbers: Optional comma-separated note numbers (empty = all notes)
  • statement_type: "all", "notes", "balance_sheet", "pnl", or "cash_flow"

Example Usage:

curl -X POST "http://localhost:8000/agent/generate" \
  -H "Content-Type: multipart/form-data" \
  -F "file=@trial_balance.xlsx" \
  -F "note_numbers=2,3,4,5" \
  -F "statement_type=all"

๐Ÿ“Š Results & Examples

Input Format

Upload trial balance Excel files containing:

  • Account codes and descriptions
  • Debit/Credit amounts
  • Account categories and classifications

Output Examples

  • Financial Notes: AI-generated explanations for each financial statement line item
  • Balance Sheet: Comprehensive balance sheet with assets, liabilities, and equity
  • Cash Flow Statement: Operating, investing, and financing activities breakdown
  • P&L Statement: Revenue, expenses, and profit analysis

Performance Metrics

  • Processing Time: < 30 seconds for standard trial balance files
  • Accuracy: 95%+ accuracy in financial data extraction and categorization
  • Note Quality: Professional-grade financial notes suitable for audit and compliance

๐Ÿš€ Setup & Usage

Prerequisites

  • Python 3.11 or higher
  • Docker and Docker Compose (for containerized deployment)
  • 4GB+ RAM for LLM processing

Installation

Local Development

# Clone the repository
git clone https://github.com/santhoshmallojwala/finryver.git
cd finryver

# Create virtual environment
python -m venv .venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

# Run the application
uvicorn app:app --host 0.0.0.0 --port 8000 --reload

Docker Deployment

# Build and run with Docker
docker-compose up -d

# Or build manually
docker build -t finryver .
docker run -p 8000:8000 finryver

Configuration

  1. Environment Variables: Configure API keys and LLM settings in .env
  2. Business Rules: Modify config/rules1.json for custom validation rules
  3. Account Mapping: Update config/mapping1.json for account categorization

Agentic System Configuration

For the intelligent agent features, ensure you have your LLM API key configured:

# Required for agentic system
OPENROUTER_API_KEY=your_openrouter_api_key_here

# Optional agent configuration
AGENT_MODEL=gpt-3.5-turbo
AGENT_TEMPERATURE=0.1
AGENT_MAX_TOKENS=2000

Agent Benefits:

  • Natural language understanding for financial tasks
  • Intelligent workflow orchestration
  • Unified interface for all financial statements
  • Error recovery and retry logic
  • Contextual financial analysis

Usage Examples

Generate Complete Financial Report

curl -X POST "http://localhost:8000/new" \
  -H "Content-Type: multipart/form-data" \
  -F "file=@trial_balance.xlsx" \
  -F "note_number=1,2,3,4,5"

Generate Specific Financial Statement

# Balance Sheet from existing notes
curl -X POST "http://localhost:8000/bs_from_notes" \
  -H "Content-Type: multipart/form-data" \
  -F "file=@notes.json"

API Documentation

Access interactive API documentation at http://localhost:8000/docs when the application is running.

๐Ÿงช Testing

Testing Framework

  • Manual Testing: Comprehensive testing with sample trial balance files
  • Integration Testing: End-to-end API endpoint validation
  • Data Validation: Financial calculation accuracy verification

Running Tests

# Test API endpoints
curl -X GET "http://localhost:8000/docs"

# Validate with sample data
python -m pytest tests/ --verbose

๐Ÿ“š Documentation

  • API Documentation: Available at /docs endpoint when running
  • Financial Standards: Adheres to GAAP/IFRS reporting standards
  • Code Documentation: Inline comments and docstrings throughout codebase

๐Ÿ”ฎ Future Roadmap

Completed Features โœ…

  • Intelligent Agentic System: LangChain-powered agents with natural language processing
  • Unified Agent Interface: Single endpoint for all financial statement generation
  • Optional Note Numbers: Flexible note generation (specific or all notes)

Planned Features

  • Multi-Currency Support: Handle international financial statements
  • Advanced AI Models: Integration with latest financial AI models
  • Real-time Processing: WebSocket support for live data updates
  • Audit Trail: Comprehensive logging and change tracking
  • Custom Templates: User-defined financial statement templates
  • Agent Conversation History: Multi-turn conversations with financial agents

Known Limitations

  • Currently supports Excel input formats only
  • Limited to standard chart of accounts structures
  • Requires internet connectivity for LLM operations

Development Timeline

  • Q1 2025: Multi-currency support and enhanced validation
  • Q2 2025: Advanced AI model integration
  • Q3 2025: Real-time processing capabilities
  • Q4 2025: Enterprise audit and compliance features

๐Ÿ‘ฅ Contributors

Core Team

  • Santosh Mallojwala - Project Lead & Backend Development
  • Point9 AI Team - AI/ML Integration and Architecture

Contribution Guidelines

  1. Fork the repository
  2. Create feature branches (feature/your-feature-name)
  3. Follow PEP 8 coding standards
  4. Add comprehensive tests for new features
  5. Submit pull requests with detailed descriptions

Acknowledgments

  • OpenAI and LLM providers for AI capabilities
  • FastAPI community for framework support
  • Financial industry experts for domain guidance

๐Ÿ“„ License

This project is licensed under the MIT License - see the LICENSE file for details.

Usage Restrictions

  • Commercial use permitted with attribution
  • Ensure compliance with local financial regulations
  • AI-generated content should be reviewed by qualified professionals

FinRyver - Transforming Financial Reporting with AI ๐Ÿš€