PrimoGreedy-Agent / README.md
CiscsoPonce's picture
docs: update README and showcase page with dashboard, online evals, prompt versioning
d072f0c
metadata
title: PrimoGreedy Agent
emoji: πŸ’Έ
colorFrom: green
colorTo: blue
sdk: docker
pinned: false
app_port: 7860

PrimoGreedy Agent

PrimoGreedy is an automated, AI-driven financial analysis agent designed to hunt, filter, and evaluate Micro-Cap and Small-Cap stocks. It acts as a ruthless "Logic Firewall," aggressively rejecting high-debt, cash-burning, and overvalued companies before deploying a multi-agent LLM pipeline to write highly structured fundamental investment memos β€” and optionally execute paper trades via Alpaca.


Core Architecture (The LangGraph Engine)

The system is built on LangGraph following modern best practices (partial state updates, Command routing, Send parallel fan-out, checkpointing, RetryPolicy, and multi-agent subgraphs).

Hunter Pipeline (src/agent.py / src/whale_hunter.py)

START --> initial_routing --> [chat] --> END
                          \-> [scout] --> [gatekeeper] --Command--> [analyst] --> END
                                              \--Command--> [scout]  (retry)
  1. Scout Node β€” Discovers candidates via yFinance screener + Brave Search trending, scores and ranks them, and pops the best unseen ticker.

  2. Gatekeeper Node β€” Strict quantitative firewall using the Command pattern for routing:

    • Market Cap: $5M -- $500M
    • Share Price: under $30.00
    • Zombie Filter: rejects unprofitable companies with < 6 months cash runway
    • Routes directly to analyst (PASS / retries exhausted) or back to scout (FAIL) via Command.
  3. Analyst Node β€” Two modes controlled by USE_DEBATE env var:

    • Single-LLM (default): Senior Broker analysis via OpenRouter (6-model fallback chain) with structured InvestmentVerdict output.
    • Multi-Agent Debate (USE_DEBATE=true): Three-agent Investment Committee subgraph (Pitcher β†’ Skeptic β†’ Judge) that produces a hallucination-resistant verdict.

    Both modes fetch SEC EDGAR 10-K/10-Q filings (US equities), call Finnhub tools for deep fundamentals, and compute Kelly Criterion position sizing.

Workflow Pipeline (src/workflows/workflow.py)

START --> [data_collection] --> [technical_analysis] --> [news_intelligence] --> [portfolio_manager] --> END

A linear 4-node pipeline for deep single-ticker analysis (used by main.py CLI).

Multi-Agent Debate (src/agents/debate.py)

When USE_DEBATE=true, the analyst node runs a 3-agent LangGraph subgraph:

START --> [pitcher (Gemma)] --> [skeptic (Mistral)] --> [judge (Nemotron)] --> END
  1. The Pitcher β€” Writes the strongest bullish thesis using only provided data.
  2. The Skeptic β€” Challenges the bull case, flagging any fabricated claims.
  3. The Judge β€” Synthesises the debate into a structured InvestmentVerdict, downgrading if fabrications were found.

Models are configurable via DEBATE_PITCHER_MODEL, DEBATE_SKEPTIC_MODEL, DEBATE_JUDGE_MODEL env vars.

Parallel Region Orchestrator (src/whale_hunter.py)

The daily cron dispatches all 4 markets (USA, UK, Canada, Australia) in parallel via the LangGraph Send API:

START --> dispatch_regions --> [hunt_region: USA]       \
                          \-> [hunt_region: UK]         |-- region_results --> END
                          \-> [hunt_region: Canada]     |
                          \-> [hunt_region: Australia]  /

Each hunt_region invokes the full per-region subgraph (scout -> gatekeeper -> analyst -> email).

Supports catalyst-triggered single-ticker mode via CATALYST_TICKER env var (used by repository_dispatch from the VPS polling daemon).


LangGraph Features Used

Feature Where Purpose
Partial state updates All agent nodes Nodes return dict with only changed keys
Annotated reducers src/core/state.py candidates and candidate_scores use operator.add
Command pattern Gatekeeper nodes Combines state update + routing in a single return
Send API whale_hunter.py orchestrator Parallel fan-out across 4 market regions
InMemorySaver All 3 graphs Checkpointing with thread_id for state persistence
RetryPolicy All nodes max_attempts=3, initial_interval=2.0 for transient errors
recursion_limit All invoke() calls Set to 30 to prevent infinite loops
Structured output Analyst nodes with_structured_output(InvestmentVerdict) for validated verdicts
Subgraph src/agents/debate.py Pitcher/Skeptic/Judge multi-agent debate as nested graph
@tool decorator sec_edgar.py, finance_tools.py LangChain tool pattern for API integrations
START / END All graphs Modern entry-point API (no deprecated set_entry_point)

Key Modules

The Interactive UI (app.py)

A Chainlit-powered chat interface.

  • AUTO β€” Smart scan (yFinance screener + Brave trending)
  • @Handle β€” Social scout from X/Twitter accounts
  • PORTFOLIO β€” View the agent's paper trade track record
  • BACKTEST β€” Run backtest on paper portfolio
  • Direct Ticker β€” Type any ticker (e.g., AAPL) for an instant deep-dive

The Morning Cron (src/whale_hunter.py)

Headless agent running as a GitHub Action daily cron. Hunts all 4 regions in parallel, evaluates candidates through the full pipeline, and emails HTML reports via Resend.

Alpaca Paper Trading (src/broker/alpaca.py)

Optional Alpaca Markets integration for live paper trading of US equities:

  • Automatically submits market orders for BUY / STRONG BUY verdicts on US tickers
  • Calculates share quantity from Kelly position sizing and account equity
  • Safety limits: minimum $1 order, maximum 25% of equity per position
  • Records order_id, fill_price, and broker_status to VPS
  • Dry-run mode when ALPACA_ENABLED is not set (logs but doesn't submit)

VPS Data Layer (vps/)

Optional FastAPI + DuckDB backend deployed on a VPS (behind Tailscale) that replaces local JSON files for persistence:

  • seen_tickers β€” Prevents re-analysing the same ticker (30 days for BUY/STRONG BUY, 14 days for AVOID/WATCH to allow re-evaluation)
  • paper_portfolio β€” Records all paper trades with Kelly sizing, Alpaca order IDs, and fill prices
  • agent_runs β€” Operational metrics for LangSmith correlation
  • Live Dashboard (GET /dashboard) β€” Chart.js-powered portfolio dashboard with summary cards, verdict distribution donut, Kelly sizing bar chart, sortable trade table, and seen-ticker feed. Auto-refreshes every 5 minutes. Dark theme.
  • GET /portfolio/summary β€” Lightweight aggregated stats endpoint (no yFinance calls)

The agent (src/core/memory.py, src/portfolio_tracker.py) auto-detects the VPS via VPS_API_URL env var and falls back to local JSON files when unavailable.

Catalyst Polling Daemon (vps/catalyst_poll.py)

VPS-based systemd timer that polls every 15 minutes during US market hours for intraday triggers:

  • Volume spike β€” Current volume > 3x average daily volume
  • Price move β€” Intraday move > 10%
  • Insider filing β€” New SEC Form 4 purchase for a tracked ticker

When triggered, fires a GitHub Actions repository_dispatch event to run the pipeline for that specific ticker.

Grading Engine (scripts/ + src/core/online_eval.py)

Automated quality assurance via LangSmith Evaluators, split into offline and online tiers:

Offline Evaluators (run on demand against golden dataset):

  • scripts/build_golden_dataset.py β€” Curates 50 representative traces into a LangSmith Dataset
  • scripts/evaluators.py β€” 5 custom evaluators:
    • Catalyst Grounding (LLM-as-a-Judge) β€” Scores whether claims are backed by data
    • Company Identity (LLM-as-a-Judge) β€” Catches "name-trap" hallucinations
    • Format β€” Validates headers, no duplicates, Kelly present for BUY
    • Verdict Validity β€” Ensures verdict is one of the 4 valid values
    • Kelly Math β€” Checks allocation is within [1%, 25%] bounds
  • scripts/run_evals.py β€” Runs all evaluators against the golden dataset

Online Evaluators (run inline during every cron):

  • src/core/online_eval.py β€” After each analyst verdict, the cheap evaluators (format_score, verdict_validity_score) run automatically and post results as LangSmith feedback on the run. Zero extra LLM cost.

Annotation Queue:

  • WATCH, AVOID, and fallback-path verdicts are automatically tagged with needs_review=true in LangSmith metadata, so the team can filter and review edge cases in the LangSmith UI.

Prompt A/B Testing:

  • src/prompts/senior_broker.py supports a PROMPT_VERSION env var to pin to a specific LangSmith Hub commit. Deploy two cron runs with different versions and compare results in LangSmith Experiments.

SEC EDGAR Ground Truth (src/sec_edgar.py)

Fetches the most recent 10-K or 10-Q filing from SEC EDGAR for US equities and extracts two investment-critical sections:

  • Item 7: Management's Discussion & Analysis (MD&A) β€” Management's own view of operations
  • Item 1A: Risk Factors β€” Legally mandated disclosure of what could go wrong

Uses the EDGAR EFTS full-text search API with BeautifulSoup HTML parsing and RecursiveCharacterTextSplitter for section truncation. Injected into the analyst prompt as {sec_context} for non-US equities this is skipped gracefully.

Structured Verdicts (src/models/verdict.py)

Pydantic model enforcing one of 4 verdict types:

STRONG BUY | BUY | WATCH | AVOID

Used via llm.with_structured_output(InvestmentVerdict) with a graceful fallback to plain text LLM output.

Kelly Criterion Position Sizing (src/models/kelly.py)

Computes optimal position size from historical portfolio performance using the Kelly Criterion:

  • Calculates win rate, average win %, and average loss % from VPS or local trade history
  • Applies the Kelly formula: f* = (win_rate / avg_loss) - ((1 - win_rate) / avg_win)
  • Uses conservative half-Kelly with verdict-based scaling:
    • STRONG BUY β†’ 100% of half-Kelly
    • BUY β†’ 70% of half-Kelly
    • WATCH β†’ 30% of half-Kelly
  • Clamped to 1% -- 25% to prevent over-concentration
  • Requires minimum 5 historical trades before activating (returns 0% otherwise)

Position sizing is computed post-LLM and injected into the InvestmentVerdict model, appearing in both the report output and the paper trade record.


Quick Start Guide

1. Environment Setup

python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt

2. Configuration (.env)

OPENROUTER_API_KEY=your_key       # LLM Inference (6-model fallback chain)
FINNHUB_API_KEY=your_key          # Deep Fundamentals & Insider Data
BRAVE_API_KEY=your_key            # Web Search
RESEND_API_KEY_CISCO=your_key     # Email Reporting (Cron only)

# Optional: VPS Data API
VPS_API_URL=http://your-vps:8080
VPS_API_KEY=your_vps_key

# Optional: Alpaca Paper Trading (US equities only)
ALPACA_API_KEY=your_key
ALPACA_SECRET_KEY=your_secret
ALPACA_ENABLED=true

# Optional: Multi-Agent Debate
USE_DEBATE=true                   # Enable pitcher/skeptic/judge pipeline

# Optional: LangSmith Observability
LANGCHAIN_TRACING_V2=true
LANGCHAIN_API_KEY=your_key
LANGCHAIN_PROJECT=primogreedy

# Optional: Prompt A/B Testing
PROMPT_VERSION=latest             # Pin to a specific Hub commit hash for A/B tests

3. Launching the UI

chainlit run app.py -w

4. Running the Workflow CLI

python3 main.py

5. Deploying the VPS API (optional)

bash vps/deploy.sh

6. Running Evaluations (optional)

python scripts/build_golden_dataset.py   # Build the golden dataset from LangSmith
python scripts/run_evals.py              # Run all evaluators

Project Structure

primogreedy/
β”œβ”€β”€ app.py                          # Chainlit web UI entry point
β”œβ”€β”€ main.py                         # Workflow CLI entry point
β”œβ”€β”€ requirements.txt                # Python dependencies (LangChain 1.0 LTS)
β”œβ”€β”€ src/
β”‚   β”œβ”€β”€ agent.py                    # Interactive Chainlit pipeline (scout/gatekeeper/analyst)
β”‚   β”œβ”€β”€ whale_hunter.py             # Daily cron pipeline + parallel Send orchestrator
β”‚   β”œβ”€β”€ llm.py                      # OpenRouter LLM with 6-model fallback + structured output
β”‚   β”œβ”€β”€ sec_edgar.py                # SEC EDGAR 10-K/10-Q filing fetcher + parser (@tool)
β”‚   β”œβ”€β”€ finance_tools.py            # Finnhub tools (@tool decorated)
β”‚   β”œβ”€β”€ portfolio_tracker.py        # Paper trade recording + Alpaca execution
β”‚   β”œβ”€β”€ email_utils.py              # Resend email dispatch
β”‚   β”œβ”€β”€ core/
β”‚   β”‚   β”œβ”€β”€ state.py                # AgentState (TypedDict with Annotated reducers + debate fields)
β”‚   β”‚   β”œβ”€β”€ memory.py               # Seen-tickers ledger (VPS or local JSON)
β”‚   β”‚   β”œβ”€β”€ search.py               # Brave Search wrapper (with retry/backoff)
β”‚   β”‚   β”œβ”€β”€ ticker_utils.py         # Ticker extraction, suffix resolution, noise filtering
β”‚   β”‚   β”œβ”€β”€ online_eval.py          # Inline LangSmith evaluators + annotation queue
β”‚   β”‚   └── logger.py               # Logging config
β”‚   β”œβ”€β”€ models/
β”‚   β”‚   β”œβ”€β”€ verdict.py              # InvestmentVerdict Pydantic model (with header-stripping)
β”‚   β”‚   └── kelly.py                # Kelly Criterion position sizing (with 10-min cache)
β”‚   β”œβ”€β”€ agents/
β”‚   β”‚   β”œβ”€β”€ debate.py               # Multi-agent pitcher/skeptic/judge subgraph
β”‚   β”‚   β”œβ”€β”€ data_collection_agent.py
β”‚   β”‚   β”œβ”€β”€ technical_analysis_agent.py
β”‚   β”‚   β”œβ”€β”€ news_intelligence_agent.py
β”‚   β”‚   └── portfolio_manager_agent.py
β”‚   β”œβ”€β”€ broker/
β”‚   β”‚   └── alpaca.py               # Alpaca Paper Trading order router + execution
β”‚   β”œβ”€β”€ workflows/
β”‚   β”‚   β”œβ”€β”€ workflow.py             # 4-node linear workflow graph
β”‚   β”‚   └── state.py                # Workflow-specific AgentState
β”‚   β”œβ”€β”€ discovery/
β”‚   β”‚   β”œβ”€β”€ screener.py             # yFinance micro-cap screener
β”‚   β”‚   β”œβ”€β”€ scoring.py              # Quantitative candidate scoring
β”‚   β”‚   └── insider_feed.py         # SEC EDGAR / Finnhub insider data
β”‚   └── prompts/
β”‚       └── senior_broker.py        # LangSmith Hub prompt template
β”œβ”€β”€ scripts/
β”‚   β”œβ”€β”€ build_golden_dataset.py     # LangSmith golden dataset builder
β”‚   β”œβ”€β”€ evaluators.py               # Custom LangSmith evaluators (5 scorers)
β”‚   └── run_evals.py                # Evaluation runner
β”œβ”€β”€ vps/
β”‚   β”œβ”€β”€ api.py                      # FastAPI + DuckDB data API (with broker fields + dashboard)
β”‚   β”œβ”€β”€ catalyst_poll.py            # Intraday catalyst polling daemon
β”‚   β”œβ”€β”€ schema.sql                  # DuckDB table definitions
β”‚   β”œβ”€β”€ deploy.sh                   # VPS deployment script
β”‚   └── requirements.txt            # VPS-specific dependencies
└── .github/workflows/
    └── hunter.yml                  # Daily cron + catalyst dispatch GitHub Action

The Philosophy

PrimoGreedy does not try to predict the future. It relies on strict Benjamin Graham math (Intrinsic Value = sqrt(22.5 * EPS * BookValue)) to establish a baseline Margin of Safety, then applies Peter Lynch's logic to find the catalyst and Charlie Munger's inversion to find the catch. It is designed to say AVOID far more often than it says BUY.