Sahil Garg committed
Commit 41cb3f5 · 1 Parent(s): dea72cd
improvised langgraph implementation
- README.md +48 -26
- agents/{langgraph_routes.py → langgraph.py} +0 -0
- app.py +6 -325
- requirements.txt +1 -1
README.md
CHANGED
@@ -2,34 +2,35 @@

 ## Overview

-FinRyver is an AI-powered financial statement generation platform that automatically converts trial balance data into comprehensive financial reports including balance sheets, cash flow statements, and profit & loss statements. Built with FastAPI and leveraging Large Language Models (LLMs), it streamlines the financial reporting process for accountants, auditors, and financial professionals

-**

 ## Key Features

 - **Automated Trial Balance Processing**: Upload Excel files containing trial balance data and automatically extract structured financial information
 - **AI-Powered Financial Notes Generation**: Utilize LLMs to generate comprehensive financial notes with detailed explanations and context
-- **
 - **Multi-Statement Support**: Generate Balance Sheets, Cash Flow Statements, and Profit & Loss statements from the same data source
 - **Excel Output Generation**: Export all generated reports and notes to professional Excel formats
 - **RESTful API Architecture**: Easy integration with existing financial systems through well-documented REST endpoints
-- **
-- **

 ## Project Architecture

 ```
 FinRyver/
-├── app.py # Main FastAPI application with
-├── requirements.txt # Python dependencies
 ├── Dockerfile # Container configuration
 ├── docker-compose.yml # Multi-container orchestration
 ├──
-├── agents/ #
-│ ├──
-│ ├──
-│
 ├──
 ├── bs/ # Balance Sheet processing modules
 │ ├── bl_llm.py # Balance sheet LLM integration
@@ -89,7 +90,8 @@ Trial Balance Upload → Data Extraction → AI Processing → Financial Stateme

 ### AI/ML Integration
 - **Large Language Models (LLMs)**: For intelligent financial note generation and analysis
-- **LangChain Framework**:
 - **OpenRouter API**: Flexible LLM provider integration for AI-powered analysis
 - **Custom AI Pipelines**: Specialized processing for financial data interpretation

@@ -134,31 +136,51 @@ Trial Balance Upload → Data Extraction → AI Processing → Financial Stateme

 | Endpoint | Method | Description |
 |----------|--------|-------------|
-| `/
-| `/
-| `/
-| `/
-| `/cf_from_notes` | POST | Generate cash flow statement from existing notes |
-| `/agent/generate` | POST | **NEW**: Intelligent agent-based financial statement generation |

-####

-

 **Parameters:**
-- `file`: Trial balance Excel file
-
-

 **Example Usage:**
 ```bash
-
   -H "Content-Type: multipart/form-data" \
   -F "file=@trial_balance.xlsx" \
-
-  -F "statement_type=all"
 ```

 ## Results & Examples

 ### Input Format
@@ -2,34 +2,35 @@

 ## Overview

+FinRyver is an AI-powered financial statement generation platform that automatically converts trial balance data into comprehensive financial reports including balance sheets, cash flow statements, and profit & loss statements. Built with FastAPI and leveraging Large Language Models (LLMs) through LangGraph workflows, it streamlines the financial reporting process for accountants, auditors, and financial professionals.

+**LangGraph Architecture**: FinRyver now features an intelligent **agentic system** powered by LangGraph that provides AI-driven workflow orchestration, state management, and intelligent task coordination for financial statement generation.

 ## Key Features

 - **Automated Trial Balance Processing**: Upload Excel files containing trial balance data and automatically extract structured financial information
 - **AI-Powered Financial Notes Generation**: Utilize LLMs to generate comprehensive financial notes with detailed explanations and context
+- **LangGraph Workflow Orchestration**: State-driven workflows that manage complex financial processing tasks with proper error handling and monitoring
 - **Multi-Statement Support**: Generate Balance Sheets, Cash Flow Statements, and Profit & Loss statements from the same data source
 - **Excel Output Generation**: Export all generated reports and notes to professional Excel formats
 - **RESTful API Architecture**: Easy integration with existing financial systems through well-documented REST endpoints
+- **Specialized Endpoints**: Dedicated routes for each financial statement type (`/notes`, `/pnl`, `/bs`, `/cf`)
+- **Performance Monitoring**: Built-in timing and status tracking for all agentic workflows

 ## Project Architecture

 ```
 FinRyver/
+├── app.py # Main FastAPI application with 4 specialized routes
+├── requirements.txt # Python dependencies
 ├── Dockerfile # Container configuration
 ├── docker-compose.yml # Multi-container orchestration
 ├──
+├── agents/ # LangGraph-based agentic system
+│ ├── langgraph.py # LangGraph workflow definitions
+│ ├── simple_tools.py # LangChain tools for financial processing
+│ ├── base_config.py # Agent configuration and utilities
+│ ├── simple_agent.py # Financial statement agent (legacy)
 ├──
 ├── bs/ # Balance Sheet processing modules
 │ ├── bl_llm.py # Balance sheet LLM integration
@@ -89,7 +90,8 @@ Trial Balance Upload → Data Extraction → AI Processing → Financial Stateme

 ### AI/ML Integration
 - **Large Language Models (LLMs)**: For intelligent financial note generation and analysis
+- **LangChain Framework**: Tool integration and AI agent development
+- **LangGraph**: Workflow orchestration and state management for complex financial processing
 - **OpenRouter API**: Flexible LLM provider integration for AI-powered analysis
 - **Custom AI Pipelines**: Specialized processing for financial data interpretation

@@ -134,31 +136,51 @@ Trial Balance Upload → Data Extraction → AI Processing → Financial Stateme

 | Endpoint | Method | Description |
 |----------|--------|-------------|
+| `/notes` | POST | Generate financial notes from trial balance using LangGraph workflow |
+| `/pnl` | POST | Generate P&L statement from trial balance using LangGraph workflow |
+| `/bs` | POST | Generate balance sheet from trial balance using LangGraph workflow |
+| `/cf` | POST | Generate cash flow statement from trial balance using LangGraph workflow |

+#### LangGraph-Powered Endpoints

+All endpoints follow the same pattern and use LangGraph workflows for intelligent task orchestration:

 **Parameters:**
+- `file`: Trial balance Excel file (multipart/form-data)
+
+**Response:**
+- Excel file download with the generated financial statement

 **Example Usage:**
 ```bash
+# Generate financial notes
+curl -X POST "http://localhost:8000/notes" \
+  -H "Content-Type: multipart/form-data" \
+  --output notes.xlsx
+
+# Generate P&L statement
+curl -X POST "http://localhost:8000/pnl" \
+  -H "Content-Type: multipart/form-data" \
+  --output pnl_statement.xlsx
+
+# Generate balance sheet
+curl -X POST "http://localhost:8000/bs" \
+  -H "Content-Type: multipart/form-data" \
+  --output balance_sheet.xlsx
+
+# Generate cash flow statement
+curl -X POST "http://localhost:8000/cf" \
   -H "Content-Type: multipart/form-data" \
   -F "file=@trial_balance.xlsx" \
+  --output cash_flow.xlsx
 ```

+**LangGraph Workflow Features:**
+- **State Management**: Each workflow tracks execution state, timing, and errors
+- **Error Handling**: Comprehensive error capture and reporting
+- **Performance Monitoring**: Built-in timing for workflow execution
+- **Tool Integration**: Seamless integration with LangChain tools
+
 ## Results & Examples

 ### Input Format
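Each route in the README expects the trial balance in a multipart `file` field, so a client has to send that part and let the `Content-Type` header carry the boundary it generated. The following is a minimal standard-library sketch of such a client. The helper names `build_multipart` and `make_request` are hypothetical; the base URL, the routes, and the `file` field name are taken from the README text above.

```python
import urllib.request
import uuid
from typing import Tuple

XLSX_MIME = "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet"


def build_multipart(field: str, filename: str, payload: bytes) -> Tuple[bytes, str]:
    """Encode a single file part; return (body, Content-Type header value)."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        f"Content-Type: {XLSX_MIME}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    # The boundary must appear in the header so the server can split the parts.
    return head + payload + tail, f"multipart/form-data; boundary={boundary}"


def make_request(base_url: str, route: str, filename: str, payload: bytes) -> urllib.request.Request:
    """Build (but do not send) a POST request that uploads one Excel file."""
    body, content_type = build_multipart("file", filename, payload)
    return urllib.request.Request(
        f"{base_url}{route}",
        data=body,
        headers={"Content-Type": content_type},
        method="POST",
    )


# Usage, assuming the API is running locally as in the README examples:
# with open("trial_balance.xlsx", "rb") as fh:
#     req = make_request("http://localhost:8000", "/pnl", "trial_balance.xlsx", fh.read())
# with urllib.request.urlopen(req) as resp, open("pnl_statement.xlsx", "wb") as out:
#     out.write(resp.read())
```

With curl, `-F "file=@trial_balance.xlsx"` produces the same kind of body and sets the boundary-bearing header on its own.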
agents/{langgraph_routes.py → langgraph.py}
RENAMED
File without changes
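The README describes `agents/langgraph.py` as holding the workflow definitions, with each workflow tracking execution state, timing, and errors. The file's content is not shown on this page, so the following is only a plain-Python sketch of that state-tracking idea. It does not use LangGraph's API, and `WorkflowState`, `run_steps`, and the two demo steps are hypothetical names, not the signature of `run_workflow`.

```python
import time
from typing import Callable, List, TypedDict


class WorkflowState(TypedDict):
    file_path: str
    status: str        # "pending", "running", "success", or "error"
    output_path: str
    error: str
    elapsed_s: float


Step = Callable[[WorkflowState], WorkflowState]


def run_steps(state: WorkflowState, steps: List[Step]) -> WorkflowState:
    """Run steps in order, recording status, error text, and wall-clock time."""
    start = time.perf_counter()
    current = dict(state, status="running")  # copy so the caller's dict is untouched
    try:
        for step in steps:
            current = step(current)
        current["status"] = "success"
    except Exception as exc:
        # Capture the failure in the state instead of letting it propagate.
        current["status"] = "error"
        current["error"] = str(exc)
    current["elapsed_s"] = time.perf_counter() - start
    return current


def fake_generate(state: WorkflowState) -> WorkflowState:
    """Stand-in for a real generation step; only fills in an output path."""
    return {**state, "output_path": "data/pnl_statement.xlsx"}


def failing_step(state: WorkflowState) -> WorkflowState:
    raise ValueError("trial balance is empty")


initial: WorkflowState = {
    "file_path": "trial_balance.xlsx",
    "status": "pending",
    "output_path": "",
    "error": "",
    "elapsed_s": 0.0,
}
ok = run_steps(initial, [fake_generate])
bad = run_steps(initial, [failing_step])
```

A route can then inspect `status` and either stream `output_path` back or turn `error` into an HTTP 500, which mirrors the `status` / `error` keys the removed handlers in `app.py` checked on tool results.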
app.py
CHANGED
@@ -1,21 +1,9 @@
-from fastapi import FastAPI, APIRouter, UploadFile, File,
-from fastapi.responses import
-from typing import Optional, Dict, Any
-import pandas as pd
 import os
 import shutil
-import json
-import subprocess
 import logging
-
-# Import utilities and logic from modular files
-from utils.utils import clean_value
-from notes.data_extraction import extract_trial_balance_data, analyze_and_save_results
-from notes.llm_notes_generator import FlexibleFinancialNoteGenerator
-from notes.notes_generator import process_json
-from notes.json_to_excel import json_to_xlsx
-from utils.utils_normalize import normalize_llm_note_json, normalize_llm_notes_json
-from agents.langgraph_routes import run_workflow

 # Configure logging for the application
 logging.basicConfig(level=logging.INFO)
@@ -35,23 +23,10 @@ async def startup_event():
 @app.on_event("shutdown")
 async def shutdown_event():
     logger.info("Financial Notes Generator API is shutting down.")
-router = APIRouter()

-
-    os.makedirs("data/input", exist_ok=True)
-    file_location = f"data/input/{file.filename}"
-    with open(file_location, "wb") as buffer:
-        shutil.copyfileobj(file.file, buffer)
-    structured_data = extract_trial_balance_data(file_location)
-    output_file = "data/output1/parsed_trial_balance.json"
-    analyze_and_save_results(structured_data, output_file)
-    with open(output_file, "r", encoding="utf-8") as f:
-        parsed_data = json.load(f)
-    tb_df = pd.DataFrame(parsed_data if isinstance(parsed_data, list) else parsed_data.get("trial_balance", parsed_data))
-    tb_df['amount'] = tb_df['amount'].apply(clean_value)
-    return tb_df

-@router.post("/new")
 async def llm_generate_and_excel(
     file: UploadFile = File(...),
     note_number: Optional[str] = Form(None)
|
@@ -102,302 +77,8 @@ async def llm_generate_and_excel(
|
|
| 102 |
excel_path,
|
| 103 |
filename=os.path.basename(excel_path),
|
| 104 |
media_type="application/vnd.openxmlformats-officedocument.spreadsheetml.sheet"
|
| 105 |
-
)
|
| 106 |
-
|
| 107 |
-
@router.post("/hardcoded")
|
| 108 |
-
async def run_full_pipeline(
|
| 109 |
-
file: UploadFile = File(...),
|
| 110 |
-
note_number: Optional[str] = Form(None)
|
| 111 |
-
):
|
| 112 |
-
os.makedirs("data/input", exist_ok=True)
|
| 113 |
-
file_location = f"data/input/{file.filename}"
|
| 114 |
-
with open(file_location, "wb") as buffer:
|
| 115 |
-
shutil.copyfileobj(file.file, buffer)
|
| 116 |
-
os.makedirs("data/output1", exist_ok=True)
|
| 117 |
-
structured_data = extract_trial_balance_data(file_location)
|
| 118 |
-
output1_json = "data/output1/parsed_trial_balance.json"
|
| 119 |
-
analyze_and_save_results(structured_data, output1_json)
|
| 120 |
-
os.makedirs("data/output2", exist_ok=True)
|
| 121 |
-
try:
|
| 122 |
-
process_json(output1_json)
|
| 123 |
-
except ImportError:
|
| 124 |
-
logger.error("main16_23.process_json not found. Please ensure 'app/main16_23.py' exists and is named correctly.")
|
| 125 |
-
raise HTTPException(status_code=500, detail="main16_23.process_json not found. Please ensure 'app/main16_23.py' exists and is named correctly.")
|
| 126 |
-
except Exception as e:
|
| 127 |
-
logger.error(f"main16_23.process_json failed: {e}")
|
| 128 |
-
raise HTTPException(status_code=500, detail=f"main16_23.process_json failed: {e}")
|
| 129 |
-
notes_json = "data/output2/notes_output.json"
|
| 130 |
-
with open(notes_json, "r", encoding="utf-8") as f:
|
| 131 |
-
notes_data = json.load(f)
|
| 132 |
-
if isinstance(notes_data, dict):
|
| 133 |
-
for key in ["notes", "trial_balance"]:
|
| 134 |
-
if key in notes_data:
|
| 135 |
-
notes_data = notes_data[key]
|
| 136 |
-
break
|
| 137 |
-
def wrap_notes(notes):
|
| 138 |
-
return {"notes": notes}
|
| 139 |
-
if note_number:
|
| 140 |
-
numbers = [n.strip() for n in note_number.split(",")]
|
| 141 |
-
notes_data = [
|
| 142 |
-
note for note in notes_data
|
| 143 |
-
if str(note.get('note_number', '')).strip() in numbers
|
| 144 |
-
]
|
| 145 |
-
filtered_json = "data/output2/notes_output_filtered.json"
|
| 146 |
-
with open(filtered_json, "w", encoding="utf-8") as f2:
|
| 147 |
-
json.dump(wrap_notes(notes_data), f2, ensure_ascii=False, indent=2)
|
| 148 |
-
json_input_for_excel = filtered_json
|
| 149 |
-
else:
|
| 150 |
-
temp_json = "data/output2/notes_output_wrapped.json"
|
| 151 |
-
with open(temp_json, "w", encoding="utf-8") as f2:
|
| 152 |
-
json.dump(wrap_notes(notes_data), f2, ensure_ascii=False, indent=2)
|
| 153 |
-
json_input_for_excel = temp_json
|
| 154 |
-
os.makedirs("data/output3", exist_ok=True)
|
| 155 |
-
try:
|
| 156 |
-
output3_xlsx = "data/output3/final_output.xlsx"
|
| 157 |
-
json_to_xlsx(json_input_for_excel, output3_xlsx)
|
| 158 |
-
except ImportError:
|
| 159 |
-
logger.error("json_xlsx.json_to_xlsx not found")
|
| 160 |
-
raise HTTPException(status_code=500, detail="json_xlsx.json_to_xlsx not found")
|
| 161 |
-
except Exception as e:
|
| 162 |
-
logger.error(f"json_xlsx.json_to_xlsx failed: {e}")
|
| 163 |
-
raise HTTPException(status_code=500, detail=f"json_xlsx.json_to_xlsx failed: {e}")
|
| 164 |
-
return FileResponse(
|
| 165 |
-
output3_xlsx,
|
| 166 |
-
filename=os.path.basename(output3_xlsx),
|
| 167 |
-
media_type="application/vnd.openxmlformats-officedocument.spreadsheetml.sheet"
|
| 168 |
-
)
|
| 169 |
-
|
-def run_subprocess(
-    script_path: str,
-    args: list,
-    env: Dict[str, str],
-    cwd: str
-) -> subprocess.CompletedProcess:
-    try:
-        logger.info(f"Running {script_path} with args {args} in {cwd}")
-        result = subprocess.run(
-            ["python", script_path] + args,
-            capture_output=True,
-            text=True,
-            check=True,
-            env=env,
-            cwd=cwd
-        )
-        logger.debug(f"{script_path} STDOUT:\n{result.stdout}")
-        logger.debug(f"{script_path} STDERR:\n{result.stderr}")
-        return result
-    except subprocess.CalledProcessError as e:
-        logger.error(f"{script_path} failed: {e}")
-        logger.error(f"STDOUT: {e.stdout}")
-        logger.error(f"STDERR: {e.stderr}")
-        raise HTTPException(
-            status_code=500,
-            detail=f"{script_path} failed: {e}\nSTDOUT:\n{e.stdout}\nSTDERR:\n{e.stderr}"
-        )
-
-def extract_output_file(stdout: str, keyword: str = "Output file:") -> Optional[str]:
-    for line in stdout.splitlines():
-        if keyword in line:
-            return line.split(keyword)[-1].strip()
-    return None
-
-@router.post("/bs_from_notes")
-async def bs_from_notes(file: UploadFile = File(...)):
-    os.makedirs("data/input", exist_ok=True)
-    input_excel_path = os.path.join("data/input", file.filename)
-    with open(input_excel_path, "wb") as buffer:
-        shutil.copyfileobj(file.file, buffer)
-    logger.info(f"Uploaded Excel saved to: {input_excel_path}")
-    logger.info(f"Files in data/input/: {os.listdir('data/input')}")
-    env = os.environ.copy()
-    if os.getenv("OPENROUTER_API_KEY"):
-        env["OPENROUTER_API_KEY"] = os.getenv("OPENROUTER_API_KEY")
-    env["INPUT_FILE"] = "data/clean_financial_data_bs.json"
-    cwd = os.getenv("PROJECT_ROOT", os.getcwd())
-    # Run Balance Sheet Data Extractor
-    run_subprocess("bs/balance_sheet_data_extractor.py", [input_excel_path], env, cwd)
-    logger.info(f"Files in data/csv_notes_bs/: {os.listdir('data/csv_notes_bs') if os.path.exists('data/csv_notes_bs') else 'data/csv_notes_bs does not exist'}")
-    # Run Balance Sheet CSV to JSON Converter
-    run_subprocess("bs/balance_sheet_csv_to_json_converter.py", [], env, cwd)
-    logger.info(f"data/clean_financial_data_bs.json exists: {os.path.exists('data/clean_financial_data_bs.json')}")
-    # Run Balance Sheet Generator
-    result = run_subprocess("bs/balance_sheet_generator.py", [], env, cwd)
-    output_file = extract_output_file(result.stdout)
-    if output_file and not os.path.isabs(output_file):
-        output_file_path = os.path.join(cwd, output_file)
-    else:
-        output_file_path = output_file
-    if not output_file or not os.path.exists(output_file_path):
-        debug_msg = f"\nSTDOUT:\n{result.stdout}\nSTDERR:\n{result.stderr}"
-        logger.error(f"Could not determine output file from balance_sheet_generator.py output.{debug_msg}")
-        raise HTTPException(status_code=500, detail=f"Could not determine output file from balance_sheet_generator.py output.{debug_msg}")
-    logger.info(f"Pipeline completed. Output file: {output_file_path}")
-    return FileResponse(
-        output_file_path,
-        filename=os.path.basename(output_file_path),
-        media_type="application/vnd.openxmlformats-officedocument.spreadsheetml.sheet"
-    )
-
-@router.post("/pnl_from_notes")
-async def pnl_from_notes(file: UploadFile = File(...)):
-    os.makedirs("data/input", exist_ok=True)
-    input_excel_path = os.path.join("data/input", file.filename)
-    with open(input_excel_path, "wb") as buffer:
-        shutil.copyfileobj(file.file, buffer)
-    logger.info(f"Uploaded Excel saved to: {input_excel_path}")
-    logger.info(f"Files in data/input/: {os.listdir('data/input')}")
-    env = os.environ.copy()
-    if os.getenv("OPENROUTER_API_KEY"):
-        env["OPENROUTER_API_KEY"] = os.getenv("OPENROUTER_API_KEY")
-    env["INPUT_FILE"] = "data/clean_financial_data_pnl.json"
-    cwd = os.getenv("PROJECT_ROOT", os.getcwd())
-    # Run Profit & Loss Data Extractor
-    run_subprocess("pnl/profit_loss_data_extractor.py", [input_excel_path], env, cwd)
-    csv_notes_pnl_path = os.path.join(cwd, 'data/csv_notes_pnl')
-    logger.info(f"Files in {csv_notes_pnl_path}/: {os.listdir(csv_notes_pnl_path) if os.path.exists(csv_notes_pnl_path) else f'{csv_notes_pnl_path} does not exist'}")
-    # Run Profit & Loss CSV to JSON Converter
-    run_subprocess("pnl/profit_loss_csv_to_json_converter.py", [], env, cwd)
-    json_path = os.path.join(cwd, 'data/clean_financial_data_pnl.json')
-    logger.info(f"data/clean_financial_data_pnl.json exists: {os.path.exists(json_path)}")
-    # Run Profit & Loss Statement Generator
-    run_subprocess("pnl/profit_loss_statement_generator.py", [], env, cwd)
-    # Use fixed output file path
-    output_file_path = os.path.join(cwd, "data/pnl_statement.xlsx")
-    if not os.path.exists(output_file_path):
-        logger.error(f"Could not find expected output file for P&L statement: {output_file_path}")
-        raise HTTPException(status_code=500, detail=f"Could not find expected output file for P&L statement: {output_file_path}")
-    logger.info(f"Pipeline completed. Output file: {output_file_path}")
-    return FileResponse(
-        output_file_path,
-        filename=os.path.basename(output_file_path),
-        media_type="application/vnd.openxmlformats-officedocument.spreadsheetml.sheet"
-    )
-
-@router.post("/cf_from_notes")
-async def cf_from_notes(file: UploadFile = File(...)):
-    os.makedirs("data/input", exist_ok=True)
-    input_excel_path = os.path.join("data/input", file.filename)
-    with open(input_excel_path, "wb") as buffer:
-        shutil.copyfileobj(file.file, buffer)
-    logger.info(f"Uploaded Excel saved to: {input_excel_path}")
-    logger.info(f"Files in data/input/: {os.listdir('data/input')}")
-    env = os.environ.copy()
-    cwd = os.getenv("PROJECT_ROOT", os.getcwd())
-    # Step 1: Run Cash Flow Data Extractor
-    run_subprocess("cf/cash_flow_data_extractor.py", [input_excel_path], env, cwd)
-    csv_notes_cfs_path = os.path.join(cwd, 'data/csv_notes_cfs')
-    logger.info(f"Files in {csv_notes_cfs_path}/: {os.listdir(csv_notes_cfs_path) if os.path.exists(csv_notes_cfs_path) else f'{csv_notes_cfs_path} does not exist'}")
-    # Step 2: Run Cash Flow CSV to JSON Converter
-    run_subprocess("cf/cash_flow_csv_to_json_converter.py", [], env, cwd)
-    json_path = os.path.join(cwd, 'data/clean_financial_data_cfs.json')
-    logger.info(f"data/clean_financial_data_cfs.json exists: {os.path.exists(json_path)}")
-    # Step 3: Run Cash Flow Data Processor
-    run_subprocess("cf/cash_flow_data_processor.py", [], env, cwd)
-    extracted_json_path = os.path.join(cwd, 'data/extracted_cfs_data.json')
-    logger.info(f"data/extracted_cfs_data.json exists: {os.path.exists(extracted_json_path)}")
-    # Step 4: Run Cash Flow Statement Generator
-    result = run_subprocess("cf/cash_flow_statement_generator.py", [], env, cwd)
-    output_file = "data/cash_flow_statements.xlsx"
-    output_file_path = os.path.join(cwd, output_file)
-    if not os.path.exists(output_file_path):
-        output_file_path = os.path.join(cwd, "data/cash_flow_statements.xlsx")
-    if not os.path.exists(output_file_path):
-        debug_msg = f"\nSTDOUT:\n{result.stdout}\nSTDERR:\n{result.stderr}"
-        logger.error(f"Could not determine output file from cf_generation.py output.{debug_msg}")
-        raise HTTPException(status_code=500, detail=f"Could not determine output file from cf_generation.py output.{debug_msg}")
-    logger.info(f"Pipeline completed. Output file: {output_file_path}")
-    return FileResponse(
-        output_file_path,
-        filename=os.path.basename(output_file_path),
-        media_type="application/vnd.openxmlformats-officedocument.spreadsheetml.sheet"
-    )
-
-@router.post("/agent/generate")
-async def agent_generate_statements(
-    file: UploadFile = File(...),
-    note_numbers: str = Form(""),
-    statement_type: str = Form("notes")  # notes, balance_sheet, pnl, cash_flow only
-):
-    """Unified intelligent generation endpoint.
-
-    Behaviors:
-    - balance_sheet / pnl / cash_flow: run respective tool directly and return the Excel file
-    """
-    try:
-        # Validate statement_type
-        valid_types = {"balance_sheet", "pnl", "cash_flow", "notes"}
-        if statement_type not in valid_types:
-            raise HTTPException(status_code=400, detail=f"Invalid statement_type: {statement_type}. Allowed: balance_sheet, pnl, cash_flow, notes")

-        # Persist uploaded file
-        upload_dir = "data/input"
-        os.makedirs(upload_dir, exist_ok=True)
-        file_path = os.path.join(upload_dir, file.filename)
-        with open(file_path, "wb") as buffer:
-            shutil.copyfileobj(file.file, buffer)
-
-        # Direct tool imports (lazy to avoid import cost if not needed)
-        from agents.simple_tools import (
-            generate_notes_full_pipeline_from_path,
-            generate_balance_sheet,
-            generate_pnl_statement,
-            generate_cash_flow_statement,
-        )
-
-        # Single statement short-circuit (return actual file)
-        if statement_type in {"balance_sheet", "pnl", "cash_flow", "notes"}:
-            if statement_type == "notes":
-                output = generate_notes_full_pipeline_from_path(file_path, note_numbers)
-                if output["status"] != "success":
-                    logger.error(f"Notes generation pipeline failed: {output.get('error')}")
-                    raise HTTPException(status_code=500, detail=f"Notes generation pipeline failed: {output.get('error')}")
-                output3_xlsx = output["output_xlsx_path"]
-                return FileResponse(
-                    output3_xlsx,
-                    filename=os.path.basename(output3_xlsx),
-                    media_type="application/vnd.openxmlformats-officedocument.spreadsheetml.sheet"
-                )
-
-            if statement_type == "balance_sheet":
-                bs_result = generate_balance_sheet.run({"file_path": file_path}) if hasattr(generate_balance_sheet, "run") else generate_balance_sheet(file_path)
-                if bs_result.get("status") != "success":
-                    raise HTTPException(status_code=500, detail=f"Balance sheet generation failed: {bs_result.get('error')}")
-                # Expect directory with XLSX files
-                output_dir = bs_result.get("output_path", "data/output/")
-                # Choose first xlsx file
-                if os.path.isdir(output_dir):
-                    xlsx_files = [f for f in os.listdir(output_dir) if f.endswith('.xlsx')]
-                    if not xlsx_files:
-                        raise HTTPException(status_code=500, detail="No balance sheet Excel file produced")
-                    output_file = os.path.join(output_dir, xlsx_files[0])
-                else:
-                    output_file = output_dir
-                return FileResponse(output_file, filename=os.path.basename(output_file), media_type="application/vnd.openxmlformats-officedocument.spreadsheetml.sheet")
-
-            if statement_type == "pnl":
-                pnl_result = generate_pnl_statement.run({"file_path": file_path}) if hasattr(generate_pnl_statement, "run") else generate_pnl_statement(file_path)
-                if pnl_result.get("status") != "success":
-                    raise HTTPException(status_code=500, detail=f"P&L generation failed: {pnl_result.get('error')}")
-                output_path = pnl_result.get("output_path", "data/pnl_statement.xlsx")
-                if not os.path.exists(output_path):
-                    raise HTTPException(status_code=500, detail="P&L Excel file not found")
-                return FileResponse(output_path, filename=os.path.basename(output_path), media_type="application/vnd.openxmlformats-officedocument.spreadsheetml.sheet")
-
-            if statement_type == "cash_flow":
-                cf_result = generate_cash_flow_statement.run({"file_path": file_path}) if hasattr(generate_cash_flow_statement, "run") else generate_cash_flow_statement(file_path)
-                if cf_result.get("status") != "success":
-                    raise HTTPException(status_code=500, detail=f"Cash flow generation failed: {cf_result.get('error')}")
-                output_path = cf_result.get("output_path", "data/cash_flow_statements.xlsx")
-                if not os.path.exists(output_path):
-                    raise HTTPException(status_code=500, detail="Cash flow Excel file not found")
-                return FileResponse(output_path, filename=os.path.basename(output_path), media_type="application/vnd.openxmlformats-officedocument.spreadsheetml.sheet")
-
-
-    except HTTPException:
-        raise
-    except Exception as e:
-        logger.error(f"Error in agent statement generation: {e}")
-        raise HTTPException(status_code=500, detail=f"Agent generation failed: {str(e)}")

 @router.post("/notes")
 async def notes_route(file: UploadFile = File(...)):
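The removed `/bs_from_notes` route located its result by scanning the generator script's stdout for an `Output file:` marker and resolving a relative path against the working directory. A small sketch of that convention follows. `extract_output_file` is reproduced from the removed code, while `resolve_output` and the sample stdout text are made up for illustration.

```python
import os
from typing import Optional


def extract_output_file(stdout: str, keyword: str = "Output file:") -> Optional[str]:
    """Return the text after the first line containing the marker, or None."""
    for line in stdout.splitlines():
        if keyword in line:
            return line.split(keyword)[-1].strip()
    return None


def resolve_output(stdout: str, cwd: str) -> Optional[str]:
    """Hypothetical helper: turn the reported path into one usable from cwd."""
    reported = extract_output_file(stdout)
    if reported is None:
        return None
    # Absolute paths are used as-is; relative ones are anchored at cwd.
    return reported if os.path.isabs(reported) else os.path.join(cwd, reported)


# Invented sample of what a generator script might print.
sample_stdout = "Loading notes...\nOutput file: data/output/balance_sheet.xlsx\nDone"
resolved = resolve_output(sample_stdout, "/srv/finryver")
```

Because the contract lives in free-form log text, a changed log line silently breaks the route, which is one reason returning a structured result (such as the `status` / `output_path` dictionaries used by the tool functions) tends to be more robust.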
@@ -1,21 +1,9 @@
+from fastapi import FastAPI, APIRouter, UploadFile, File, HTTPException
+from fastapi.responses import FileResponse
 import os
 import shutil
 import logging
+from agents.langgraph import run_workflow

 # Configure logging for the application
 logging.basicConfig(level=logging.INFO)
@@ -35,23 +23,10 @@ async def startup_event():
 @app.on_event("shutdown")
 async def shutdown_event():
     logger.info("Financial Notes Generator API is shutting down.")

+router = APIRouter()

+"""@router.post("/new")
 async def llm_generate_and_excel(
     file: UploadFile = File(...),
     note_number: Optional[str] = Form(None)
@@ -102,302 +77,8 @@ async def llm_generate_and_excel(
         excel_path,
         filename=os.path.basename(excel_path),
         media_type="application/vnd.openxmlformats-officedocument.spreadsheetml.sheet"
+    )"""


 @router.post("/notes")
 async def notes_route(file: UploadFile = File(...)):
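The removed handlers all persisted the upload by joining `data/input` with the client-supplied `file.filename`. If the remaining routes keep that pattern (their bodies are not shown in this diff), one thing they may want is to strip directory components from the name first. The sketch below shows one way to do that with the standard library; `safe_upload_path` and `save_upload` are hypothetical helpers, not functions from this repository.

```python
import os
import shutil
from pathlib import PureWindowsPath
from typing import BinaryIO


def safe_upload_path(upload_dir: str, client_filename: str) -> str:
    """Keep only the final path component of a client-supplied file name."""
    # PureWindowsPath treats both "/" and "\\" as separators, so it strips
    # directory parts regardless of which style the client sent.
    name = PureWindowsPath(client_filename).name
    if name in {"", ".", ".."}:
        raise ValueError(f"unusable upload file name: {client_filename!r}")
    return os.path.join(upload_dir, name)


def save_upload(stream: BinaryIO, upload_dir: str, client_filename: str) -> str:
    """Copy an uploaded stream into upload_dir and return the saved path."""
    os.makedirs(upload_dir, exist_ok=True)
    target = safe_upload_path(upload_dir, client_filename)
    with open(target, "wb") as buffer:
        shutil.copyfileobj(stream, buffer)
    return target
```

Inside a FastAPI route this would be called as `save_upload(file.file, "data/input", file.filename)`, mirroring the `shutil.copyfileobj(file.file, buffer)` call used throughout the removed code.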
requirements.txt
CHANGED
@@ -13,4 +13,4 @@ langchain
 langchain-openai
 langchain-community
 langchain-core
-
@@ -13,4 +13,4 @@ langchain
 langchain-openai
 langchain-community
 langchain-core
+langgraph