---
title: Automated Task Manager
emoji: π
colorFrom: pink
colorTo: pink
sdk: docker
app_port: 8501
tags:
  - streamlit
pinned: false
short_description: Extract tasks from Gmail with AI-powered GraphRAG
license: mit
---
# Automated Task Manager

A production-ready, graph-aware reasoning assistant for task understanding, recommendations, and trustworthy GPT answers, grounded in structured user-uploaded email data with persistent storage.
## Latest Major Upgrades
- Hybrid QA Pipeline: GraphRAG (topic-centered retrieval) + ChainQA (LLM answer synthesis) + RAGAS (automated answer evaluation)
- LangSmith Integration: End-to-end tracing, debugging, and evaluation for every Q&A session
- Neo4j Graph Database: Enterprise-grade graph storage for performance and scalability
- Neon PostgreSQL: Persistent database storage for emails and tasks with serverless scaling
- OpenAI Embeddings: `text-embedding-3-small` for superior semantic search
- Production Architecture: full ETL pipeline with ACID compliance, data integrity, and horizontal scaling
## Project Structure
The project is organized into modular components for database storage, graph operations, reasoning, LangGraph workflows, and Streamlit UI delivery:
| File | Purpose |
|---|---|
| **Core Database & Graph** | |
| `utils/database.py` | PostgreSQL operations for persistent email/task storage (Neon-optimized) |
| `utils/neo4j_graph_writer.py` | Converts extracted task JSON to the Neo4j graph database |
| `utils/neo4j_graphrag.py` | Neo4j-powered GraphRAG query system with OpenAI embeddings |
| **Email Processing** | |
| `utils/email_parser.py` | Parses Gmail Takeout `.mbox` files into a structured email DataFrame |
| `utils/embedding.py` | OpenAI `text-embedding-3-small` + FAISS index creation |
| **AI Pipeline** | |
| `utils/prompt_template.py` | GPT prompt templates for reasoning and extraction |
| `utils/langgraph_nodes.py` | LangGraph node definitions for each pipeline step |
| `utils/langgraph_dag.py` | Defines DAGs: agent chat and email-to-graph extraction |
| **User Interface** | |
| `app.py` | Streamlit entry point: upload `.mbox`, run the extraction pipeline |
| `pages/My_Calendar.py` | Monthly calendar view of extracted tasks |
| `pages/AI_Chatbot.py` | Neo4j-powered chatbot interface for graph-based QA |
| **Configuration** | |
| `requirements.txt` | Python dependencies (OpenAI, Neo4j, PostgreSQL, LangSmith, RAGAS) |
| `.env` | Environment variables (`DATABASE_URL`, `NEO4J_URI`, `OPENAI_API_KEY`, `LANGCHAIN_API_KEY`) |
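As a rough sketch of what `utils/email_parser.py` does (the actual implementation may differ), Gmail Takeout `.mbox` files can be read with Python's standard `mailbox` module. `parse_mbox` below is a hypothetical helper, not the project's code:

```python
import mailbox
from email.header import decode_header, make_header

def parse_mbox(path):
    """Parse a Gmail Takeout .mbox file into a list of structured records."""
    rows = []
    for msg in mailbox.mbox(path):
        # Decode RFC 2047 encoded headers (e.g. non-ASCII subjects).
        subject = str(make_header(decode_header(msg.get("Subject", ""))))
        body = ""
        if msg.is_multipart():
            # Take the first text/plain part as the body.
            for part in msg.walk():
                if part.get_content_type() == "text/plain":
                    body = part.get_payload(decode=True).decode(errors="replace")
                    break
        else:
            payload = msg.get_payload(decode=True)
            body = payload.decode(errors="replace") if payload else ""
        rows.append({
            "from": msg.get("From", ""),
            "to": msg.get("To", ""),
            "date": msg.get("Date", ""),
            "subject": subject,
            "body": body.strip(),
        })
    return rows
```

The resulting records can then be loaded into a DataFrame (e.g. `pd.DataFrame(rows)`) for filtering and storage.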
## Hybrid QA Pipeline (GraphRAG + ChainQA + RAGAS)
Each Q&A session uses a hybrid pipeline:
- GraphRAG provides graph-based retrieval for structure and grounding (using OpenAI embeddings and topic expansion).
- ChainQA (LLM-based reasoning) synthesizes a fluent, explainable answer from the retrieved graph context.
- RAGAS evaluates the answer's quality against the retrieved context and user query, using metrics such as faithfulness and context recall. This evaluation runs automatically after each answer-generation step (as long as the question, answer, and context are formatted correctly).
This approach replaces Cypher-generating QA with graph-grounded semantic retrieval (GraphRAG) and LLM answer synthesis (ChainQA).
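In outline, each Q&A turn composes three stages. The sketch below is illustrative rather than the project's actual code; the three callables are assumptions standing in for the GraphRAG retriever, the ChainQA LLM chain, and the RAGAS evaluator:

```python
def hybrid_qa(question, retrieve, synthesize, evaluate):
    """Run one hybrid Q&A turn: retrieve -> synthesize -> evaluate.

    retrieve(question)                  -> list[str]  graph-grounded context passages
    synthesize(question, context)       -> str        LLM-generated answer
    evaluate(question, answer, context) -> dict       RAGAS-style scores
    """
    context = retrieve(question)                  # GraphRAG: topic-centered retrieval
    answer = synthesize(question, context)        # ChainQA: LLM answer synthesis
    scores = evaluate(question, answer, context)  # RAGAS: automated evaluation
    return {"answer": answer, "context": context, "scores": scores}
```

Injecting the stages as callables keeps the orchestration testable with stubs before wiring in Neo4j, the LLM, and RAGAS.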
## Streamlined UI

All Q&A now goes through the hybrid pipeline above. The interface exposes only two actions, "Load Emails" and "Extract Tasks with AI"; the separate "Parse Email" button has been removed.
## RAGAS Evaluation

RAGAS scores each answer against the retrieved context and user query, using metrics like faithfulness and context recall. Note: RAGAS requires the question, answer, and contexts to be formatted and passed correctly; otherwise scores may be missing or inaccurate.
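For reference, RAGAS expects each sample as a record containing the question, the generated answer, and the retrieved contexts as a list of strings. A minimal sketch of shaping that record (`build_ragas_sample` is a hypothetical helper; the field names follow the ragas library's conventions):

```python
def build_ragas_sample(question, answer, contexts):
    """Shape one Q&A turn into the record format RAGAS evaluates.

    contexts must be a list of strings, even for a single passage;
    passing a bare string is a common cause of missing scores.
    """
    if isinstance(contexts, str):
        contexts = [contexts]
    return {
        "question": question,
        "answer": answer,
        "contexts": list(contexts),
    }
```

A list of such records can be wrapped in a `datasets.Dataset` and passed to `ragas.evaluate` with metrics like faithfulness; context recall additionally needs a reference (ground-truth) field.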
## LangSmith Integration (Tracing & Evaluation)

LangSmith lets you trace, debug, and evaluate every Q&A session:

1. Sign up at smith.langchain.com and get your API key
2. Install the LangSmith SDK:

   ```bash
   pip install langsmith
   ```

3. Set environment variables:

   ```bash
   LANGCHAIN_TRACING_V2=true
   LANGCHAIN_API_KEY=your-langsmith-api-key
   LANGCHAIN_PROJECT=your-project-name
   ```

4. Tracing is automatic for all LangChain chains, retrievers, and LLM calls. For custom code, use:

   ```python
   from langsmith import traceable

   @traceable(name="HybridQAChain")
   def hybrid_qa_chain(user_query, graph_context):
       # ... LLM and RAGAS logic ...
       return answer, ragas_scores
   ```

5. View traces and RAGAS scores in your LangSmith dashboard for every user question.
## Full ETL Workflow

1. Upload: the user uploads an `Inbox.mbox` file (extracted from Gmail Takeout)
2. Smart Email Filtering: apply intelligent filters (date range, keywords, content length, etc.)
3. Clean & Normalize: parse emails, sanitize content, and store in Neon PostgreSQL
4. Embed & Index: create OpenAI vector embeddings and store them in a FAISS index for semantic search
5. LLM Extraction: extract tasks, people, dates, etc. as structured JSON
6. Human-in-the-Loop Validation (if needed): the user can review and correct extracted tasks
7. Persist: validated tasks and relationships are stored in PostgreSQL and Neo4j
8. Answer: all Q&A runs through the hybrid GraphRAG + ChainQA + RAGAS pipeline for reliable, explainable answers
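To illustrate the "store in Neo4j" step, extracted task JSON can be converted into parameterized Cypher `MERGE` statements. This is a hypothetical sketch, not the actual `utils/neo4j_graph_writer.py`, and the task schema (`title`, `person`, `due_date`) is assumed for illustration:

```python
def task_to_cypher(task):
    """Convert one extracted task dict into a (query, params) pair.

    MERGE keeps re-runs idempotent: nodes and relationships are
    created only if they do not already exist.
    """
    query = (
        "MERGE (t:Task {title: $title}) "
        "SET t.due_date = $due_date "
        "MERGE (p:Person {name: $person}) "
        "MERGE (p)-[:ASSIGNED_TO]->(t)"
    )
    params = {
        "title": task["title"],
        "due_date": task.get("due_date"),
        "person": task["person"],
    }
    return query, params
```

With the official `neo4j` Python driver, each pair would then be executed as `session.run(query, params)` inside a transaction.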
## Technical Architecture
- GraphRAG: Topic-centered retrieval from Neo4j using OpenAI embeddings
- ChainQA: LLM (GPT-4) generates answers from graph context
- RAGAS: Automated evaluation of answer quality (faithfulness, context recall, etc.)
- LangSmith: Tracing, debugging, and experiment tracking for every Q&A session
### Data Flow

```
Gmail Takeout → .mbox Upload → Smart Filtering → Email Parsing → Neon PostgreSQL (parsed_email) →
OpenAI Embedding → FAISS Index → LLM Extraction → HITL Validation →
PostgreSQL (tasks) + Neo4j Graph → GraphRAG Retrieval → ChainQA Answer → RAGAS Evaluation → UI
```
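The embedding and retrieval stage can be sketched as follows. `embed_texts` assumes the OpenAI Python SDK v1 client interface with `text-embedding-3-small` as used in this project; the nearest-neighbor search is shown as plain cosine similarity, standing in for the FAISS index:

```python
import math

def embed_texts(client, texts, model="text-embedding-3-small"):
    """Batch-embed texts with the OpenAI embeddings endpoint."""
    resp = client.embeddings.create(model=model, input=texts)
    return [item.embedding for item in resp.data]

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query_vec, index_vecs, k=3):
    """Return indices of the k most similar vectors (FAISS stand-in)."""
    ranked = sorted(range(len(index_vecs)),
                    key=lambda i: cosine(query_vec, index_vecs[i]),
                    reverse=True)
    return ranked[:k]
```

In production, FAISS replaces the brute-force `top_k` with an approximate index so retrieval stays fast as the email corpus grows.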
## Environment Setup

### Required Environment Variables

Create a `.env` file in your project root:

```bash
# OpenAI Configuration (Required)
OPENAI_API_KEY=sk-your_openai_api_key_here

# Neon PostgreSQL (Required)
DATABASE_URL=postgresql://username:password@ep-xxx-xxx.us-east-1.aws.neon.tech/neondb?sslmode=require

# Neo4j Configuration (Required)
NEO4J_URI=bolt://localhost:7687
NEO4J_USERNAME=neo4j
NEO4J_PASSWORD=your_neo4j_password

# LangSmith (Optional but recommended)
LANGCHAIN_TRACING_V2=true
LANGCHAIN_API_KEY=your-langsmith-api-key
LANGCHAIN_PROJECT=your-project-name
```
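A small startup check can fail fast when required variables are missing. A sketch using only the standard library (`check_env` is a hypothetical helper; the variable names match the `.env` above):

```python
import os

REQUIRED = ["OPENAI_API_KEY", "DATABASE_URL", "NEO4J_URI",
            "NEO4J_USERNAME", "NEO4J_PASSWORD"]

def check_env(env=os.environ):
    """Return the required variables that are missing or empty."""
    return [name for name in REQUIRED if not env.get(name)]
```

Calling this at app startup lets the Streamlit UI surface missing configuration before the pipeline runs.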
## Troubleshooting

- Database Connection Issues: check the logs for PostgreSQL, Neo4j, or OpenAI error messages
- LangSmith Issues: ensure your API key and project name are set; see smith.langchain.com
- RAGAS Evaluation: if RAGAS scores are missing, check your Python environment and logs for errors
- File Upload: only `.mbox` files are supported; files up to 200 MB are processed
## System Benefits
- Reliable, explainable answers: All Q&A is grounded in graph data, with automated evaluation
- Traceable and debuggable: Every Q&A session is logged and traceable in LangSmith
- Production-ready: Enterprise databases, scalable architecture, and robust error handling
- Easy to use: Streamlined UI, clear workflow, and persistent storage
## Ready to Deploy

Your automated task manager is now production-ready with:

- ✅ Hybrid QA pipeline (GraphRAG + ChainQA + RAGAS)
- ✅ LangSmith tracing and evaluation
- ✅ Enterprise-grade databases (Neo4j + Neon PostgreSQL)
- ✅ Superior AI embeddings (OpenAI `text-embedding-3-small`)
- ✅ Persistent data storage
- ✅ Scalable architecture
- ✅ Professional deployment

Perfect for: teams, consultants, project managers, and anyone who needs to extract actionable insights from email data with enterprise-grade reliability and AI-powered intelligence.