Spaces:
Sleeping
title: Moneyrag
emoji: ๐ฐ
colorFrom: purple
colorTo: indigo
sdk: docker
pinned: false
license: apache-2.0
short_description: Where did my money go? Chat with your bank statements
app_port: 8501
MoneyRAG - Personal Finance Transaction Analysis
AI-powered financial transaction analysis using RAG (Retrieval-Augmented Generation) with Model Context Protocol (MCP) integration.
Features
- Smart CSV Ingestion: Automatically maps any CSV format to standardized transaction schema using LLM
- Multi-Provider Support: Works with Google Gemini and OpenAI models
- Merchant Enrichment: Automatically enriches transactions with web-searched merchant information
- Dual Storage: SQLite for structured queries + Qdrant for semantic search
- MCP Integration: Leverages Model Context Protocol for tool-based agent interactions
- Interactive UI: Streamlit-based web interface for chat-based analysis
- Dockerized: Complete containerized deployment ready for production
Architecture
%%{init: {'theme': 'base', 'themeVariables': { 'primaryColor': '#fff', 'primaryBorderColor': '#333', 'primaryTextColor': '#333', 'lineColor': '#666' }}}%%
graph TD
%% --- Top Layer: Entry Point ---
subgraph UI["๐ป User Interface"]
Streamlit["๐ Streamlit Web App<br/><i>Interactive Dashboard</i>"]
end
%% --- Middle Layer: Split Processes ---
%% Left Column: Ingestion (The Write Path)
subgraph Ingestion["๐ฅ Data Pipeline (Write)"]
direction TB
CSV["๐ CSV Upload<br/><i>Raw Data</i>"]
Mapper["๐ง LLM Mapper<br/><i>Schema Norm.</i>"]
Enrich["๐ Web Enrich<br/><i>DuckDuckGo</i>"]
CSV --> Mapper
Mapper --> Enrich
end
%% Right Column: Intelligence (The Read Path)
subgraph Agent["๐ค AI Orchestration (Read)"]
direction TB
Brain["๐งฉ LangGraph Agent<br/><i>Controller</i>"]
LLM["โจ LLM Model<br/><i>Gemini / GPT-4</i>"]
Brain <-->|Inference| LLM
end
subgraph MCP["๐ง MCP Tool Server"]
direction LR
SQL_Tool["โก SQL Tool<br/><i>Structured</i>"]
Vector_Tool["๐ฏ Vector Tool<br/><i>Semantic</i>"]
end
%% --- Bottom Layer: Persistence ---
subgraph Storage["๐พ Storage Layer"]
direction LR
SQLite[("๐๏ธ SQLite")]
Qdrant[("๐ฎ Qdrant")]
end
%% --- Connections & Logic ---
%% 1. User Actions
Streamlit -->|1. Upload| CSV
Streamlit -->|3. Query| Brain
%% 2. Ingestion to Storage flow
Enrich -->|2. Store| SQLite
Enrich -->|2. Embed| Qdrant
%% 3. Agent to Tools flow
Brain -->|4. Route| SQL_Tool
Brain -->|4. Route| Vector_Tool
%% 4. Tools to Storage flow (Vertical alignment matches)
SQL_Tool <-->|5. Read/Write| SQLite
Vector_Tool <-->|5. Search| Qdrant
%% 5. Return Path
Brain -.->|6. Response| Streamlit
%% --- Styling ---
classDef ui fill:#E3F2FD,stroke:#1565C0,stroke-width:2px,color:#0D47A1,rx:10,ry:10
classDef ingest fill:#E8F5E9,stroke:#2E7D32,stroke-width:2px,color:#1B5E20,rx:5,ry:5
classDef agent fill:#F3E5F5,stroke:#7B1FA2,stroke-width:2px,color:#4A148C,rx:5,ry:5
classDef mcp fill:#FFF3E0,stroke:#EF6C00,stroke-width:2px,color:#E65100,rx:5,ry:5
classDef storage fill:#ECEFF1,stroke:#455A64,stroke-width:2px,color:#263238,rx:5,ry:5
class Streamlit ui
class CSV,Mapper,Enrich ingest
class Brain,LLM agent
class SQL_Tool,Vector_Tool mcp
class SQLite,Qdrant storage
%% Curve the lines for better readability
linkStyle default interpolate basis
Quick Start
Docker (Recommended)
./docker-run.sh
Choose option 1 to build and run, then open http://localhost:8501
Local Development
python -m venv .venv
source .venv/bin/activate # Windows: .venv\Scripts\activate
pip install -r requirements.txt
streamlit run app.py
Getting Started Resources
๐ API Keys
- Google Gemini: Get API key from Google AI Studio
- OpenAI: Get API key from OpenAI Platform
๐ฅ Download Transaction History
- Chase Credit Card: Video Guide
- Discover Credit Card: Video Guide
Usage
- Enter your API key in the sidebar
- Upload CSV transaction files
- Ask questions in natural language
Example Questions
- "How much did I spend on restaurants last month?"
- "What are my top 5 spending categories?"
- "Show me all transactions over $100"
- "Find all Starbucks transactions"
- "Analyze my spending patterns"
Supported CSV Formats
MoneyRAG automatically handles different CSV formats including:
- Chase Bank: Negative values for spending
- Discover: Positive values for spending
- Custom formats: LLM-based column mapping
Required information (can have any column names):
- Date
- Merchant/Description
- ASupported CSV Formats
MoneyRAG automatically handles different CSV formats:
- Chase Bank, Discover, and custom formats
- LLM-based column mapping (works with any column names)
- Required: Date, Merchant/Description, Amount
Configuration
Supported Models:
- Google: gemini-2.0-flash-exp, gemini-1.5-flash, gemini-1.5-pro
- OpenAI: gpt-4o, gpt-4o-mini
Note: API keys entered through UI, no environment variables needed. docker ps docker inspect money-rag-app | grep Health
### Reset everything
```bash
docker-compose down -v
docker rmi money_rag-money-rag
./docker-run.sh # Choose option 1
MCP Server Issues
The MCP server runs as a subprocess. If you see connection errors:
- Check logs:
docker-compose logs -f - Verify mcp_server.py exists:
docker exec money-rag-app ls -la
Permission Issues
chmod +x docker-run.sh
sudo chown -R $USER:$USER data logs
Production Deployment
Using Docker Hub
Tag and push:
docker tag money-rag:latest your-username/money-rag:latest docker push your-username/money-rag:latestPull and run on server:
docker pull your-username/money-rag:latest docker run -d -p 8501:8501 your-username/money-rag:latest
Cloud Platforms
Google Cloud Run:
gcloud builds submit --tag gcr.io/PROJECT-ID/money-rag
gcloud run deploy money-rag \
--image gcr.io/PROJECT-ID/money-rag \
--platform managed \
--allow-unauthenticated
AWS ECS / Azure Container Instances:
- Build and push to respective container registries
- Deploy using platform-specific CLI tools
Security Notes
โ ๏ธ Important:
- API keys are entered via UI and stored only in session state (not persisted)
- Keys are cleared when browser session ends
- Transaction data is session-based and ephemeral
- No sensitive data stored in environment variables or files
- For production, implement secure session management and authentication
Development
Hot Reload
Mount code as volume in docker-compose.yml:
volumes:
- ./app.py:/app/app.py
- ./money_rag.py:/app/money_rag.py
- ./mcp_server.py:/app/mcp_server.py
Testing
# Run unit tests (if available)
pytest tests/
# Test CSV ingestion
python -c "from money_rag import MoneyRAG; ..."
Technologies
Core Framework:
- LangChain (>=1.2.3): Agent orchestration and tool integration
- LangGraph (>=1.0.6): Conversational agent with memory
- langchain-mcp-adapters (>=0.2.1): Model Context Protocol integration
LLM Providers:
- langchain-google-genai (>=2.0.0): Google Gemini integration
- langchain-openai (>=1.1.7): OpenAI GPT integration
Storage & Search:
- Qdrant (>=1.16.2): Vector database for semantic search
- SQLite (via SQLAlchemy >=2.0.45): Relational database for structured queries
Tools & Services:
- FastMCP (>=2.14.3): MCP server implementation
- DuckDuckGo Search (>=8.1.1): Web search for merchant enrichment Container issues:
docker-compose logs
docker-compose down -v # Reset everything
./docker-run.sh # Rebuild
Permission issues:
chmod +x docker-run.sh
Technologies
- LangChain & LangGraph: Agent orchestration
- Google Gemini / OpenAI GPT: LLM providers
- Qdrant: Vector database
- SQLite: Structured storage
- FastMCP: Model Context Protocol
- Streamlit: Web interface
Contributors
License
MIT