Folio Project Design

This document outlines how the Folio codebase is structured and how data flows through the application. Folio provides tools for analyzing and visualizing investment portfolios, with a focus on stocks and options, through both a web-based dashboard and a command-line interface (CLI).

Application Overview

Folio is a Python application that exposes its portfolio analysis through two interfaces:

  1. Web Interface: A Dash-based web application for visualizing portfolio data
  2. CLI Interface (focli): A command-line interface for portfolio analysis and simulation

Both interfaces use the same core library (src/folio/) for business logic, in line with the separation of concerns principles described later in this document. The primary domain entities are outlined below; data_model.py is the source of truth for the data model.

Deployment Modes

Folio can run in multiple deployment environments:

  • Local Development: Running directly on a developer's machine
  • Docker Container: Running in a containerized environment
  • Hugging Face Spaces: Deployed as a Hugging Face Space for public access

The application detects its environment and adjusts settings accordingly, such as cache directories and logging behavior.
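As a rough illustration of environment detection, the sketch below distinguishes the three deployment modes. It is not Folio's actual detection code: the function names, the returned labels, and the cache paths are assumptions made for this example. It relies on two conventions, namely that Hugging Face Spaces exposes a SPACE_ID environment variable and that Docker creates a /.dockerenv file at the container root.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def detect_environment(
    env: Optional[Mapping[str, str]] = None,
    dockerenv_path: Path = Path("/.dockerenv"),
) -> str:
    """Return 'huggingface', 'docker', or 'local' (labels are hypothetical)."""
    env = os.environ if env is None else env
    # Hugging Face Spaces sets SPACE_ID inside the running Space.
    if env.get("SPACE_ID"):
        return "huggingface"
    # Docker creates this marker file at the root of a container.
    if dockerenv_path.exists():
        return "docker"
    return "local"


def cache_dir_for(environment: str) -> Path:
    """Pick a cache directory per environment (paths are illustrative only)."""
    if environment == "local":
        return Path(".cache")
    # Containers often have a read-only app directory, so use a temp location.
    return Path("/tmp/folio_cache")
```

Passing the environment mapping and marker path as parameters keeps the check easy to exercise in tests without touching the real process environment.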

Core Data Model

The core data model consists of several key classes that represent portfolio components:

  • Position: Base class for all positions
    • StockPosition: Represents a stock position with quantity, price, beta, etc.
    • OptionPosition: Represents an option position with strike, expiry, option type, delta, etc.
  • PortfolioGroup: Groups a stock with its related options (e.g., AAPL stock with AAPL options)
  • PortfolioSummary: Contains aggregated metrics for the entire portfolio
  • ExposureBreakdown: Detailed breakdown of exposure metrics by category

These classes are defined in data_model.py and provide the foundation for all portfolio analysis.
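The class hierarchy above might be sketched with dataclasses as follows. The class names come from this document, but every field name, type, and default here is an assumption for illustration; data_model.py remains the authoritative definition.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Position:
    """Base class: fields shared by every position (hypothetical field set)."""
    ticker: str
    quantity: float
    price: float


@dataclass
class StockPosition(Position):
    beta: float = 1.0


@dataclass
class OptionPosition(Position):
    strike: float = 0.0
    expiry: str = ""           # ISO date string, e.g. "2025-06-20"
    option_type: str = "CALL"  # "CALL" or "PUT"
    delta: float = 0.0


@dataclass
class PortfolioGroup:
    """A stock together with the options written on it."""
    ticker: str
    stock_position: Optional[StockPosition] = None
    option_positions: List[OptionPosition] = field(default_factory=list)


# Example: AAPL stock grouped with one short AAPL call.
stock = StockPosition("AAPL", quantity=10, price=150.0, beta=1.2)
call = OptionPosition("AAPL", quantity=-1, price=5.0, strike=160.0,
                      expiry="2025-06-20", option_type="CALL", delta=0.4)
group = PortfolioGroup("AAPL", stock_position=stock, option_positions=[call])
```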

Data Flow

The data flow in Folio follows these main steps:

  1. Data Input: User uploads a portfolio CSV file or loads a sample portfolio
  2. Data Processing: The CSV is parsed, validated, and transformed into structured portfolio data
  3. Position Grouping: Stocks and their related options are grouped together
  4. Metrics Calculation: Exposure, beta, and other metrics are calculated for each position and group
  5. Visualization: The processed data is displayed in the dashboard with charts and tables
  6. Interactivity: User interactions trigger callbacks that update the displayed data

CSV Processing

When a user uploads a CSV file, the following process occurs:

  1. The file is validated for security in security.py
  2. The CSV is parsed into a pandas DataFrame
  3. The DataFrame is processed by process_portfolio_data() in portfolio.py
  4. Stock positions are identified and processed
  5. Option positions are parsed and matched to their underlying stocks
  6. Cash-like positions are identified using cash_detection.py
  7. Portfolio groups and summary metrics are calculated

Stock Data Fetching

Folio uses a pluggable data fetching system to retrieve stock data:

  1. A DataFetcherInterface defined in stockdata.py provides a common interface
  2. Concrete implementations include YFinanceDataFetcher and FMP (Financial Modeling Prep) fetchers
  3. A singleton pattern ensures only one data fetcher is created throughout the application
  4. The data source can be configured at runtime through the folio.yaml configuration file
  5. Data is cached to improve performance and reduce API calls
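The pluggable-fetcher pattern described above can be sketched as an abstract interface, a caching wrapper, and a module-level singleton accessor. Only the name DataFetcherInterface is taken from this document; the fetch_price method, StaticDataFetcher, CachingFetcher, and get_data_fetcher are hypothetical names, and the canned prices stand in for a real API call.

```python
from abc import ABC, abstractmethod
from typing import Dict, Optional, Type


class DataFetcherInterface(ABC):
    """Common interface; the method name is an assumption for this sketch."""

    @abstractmethod
    def fetch_price(self, ticker: str) -> float:
        ...


class StaticDataFetcher(DataFetcherInterface):
    """Stand-in for a real fetcher: returns canned prices and counts calls."""

    def __init__(self) -> None:
        self.calls = 0

    def fetch_price(self, ticker: str) -> float:
        self.calls += 1
        return {"AAPL": 150.0, "SPY": 500.0}[ticker]


class CachingFetcher(DataFetcherInterface):
    """Wraps another fetcher and remembers prices it has already fetched."""

    def __init__(self, inner: DataFetcherInterface) -> None:
        self._inner = inner
        self._cache: Dict[str, float] = {}

    def fetch_price(self, ticker: str) -> float:
        if ticker not in self._cache:
            self._cache[ticker] = self._inner.fetch_price(ticker)
        return self._cache[ticker]


# Maps a configured source name to a fetcher class (hypothetical registry).
_REGISTRY: Dict[str, Type[DataFetcherInterface]] = {"static": StaticDataFetcher}
_instance: Optional[DataFetcherInterface] = None


def get_data_fetcher(source: str = "static") -> DataFetcherInterface:
    """Create the fetcher on first use and return the same object afterwards."""
    global _instance
    if _instance is None:
        _instance = CachingFetcher(_REGISTRY[source]())
    return _instance
```

Because callers depend only on the interface, swapping the data source becomes a configuration change rather than a code change in the calling modules.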

Options Processing

Option positions require special processing:

  1. Option descriptions are parsed in options.py to extract strike, expiry, and option type
  2. QuantLib is used for option pricing and Greeks calculations
  3. Delta exposure is calculated as delta * notional value
  4. Options are matched to their underlying stocks to form portfolio groups
  5. Option metrics are aggregated into the portfolio summary
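A minimal worked example of the delta exposure formula follows. It assumes notional value is the signed contract quantity times a contract multiplier of 100 times the underlying price; that definition, the sign convention, and the function names are assumptions for this sketch rather than Folio's confirmed implementation.

```python
def notional_value(quantity: float, underlying_price: float,
                   multiplier: int = 100) -> float:
    """Signed notional: negative quantity means a short option position."""
    return quantity * multiplier * underlying_price


def delta_exposure(delta: float, quantity: float, underlying_price: float,
                   multiplier: int = 100) -> float:
    """Delta exposure = delta * notional value."""
    return delta * notional_value(quantity, underlying_price, multiplier)


# Long 2 calls with delta 0.5 on a $150 underlying: positive (long) exposure.
long_calls = delta_exposure(0.5, 2, 150.0)

# Short 1 put with delta -0.3: both signs are negative, so exposure is long.
short_put = delta_exposure(-0.3, -1, 150.0)
```

Under these assumptions the long calls contribute about 15,000 of exposure and the short put about 4,500, which matches the intuition that selling a put is a bullish position.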

Portfolio Metrics Calculation

Portfolio metrics are calculated in several steps:

  1. Individual position metrics are calculated first (market value, beta, exposure)
  2. Positions are grouped by underlying ticker
  3. Group-level metrics are calculated (net exposure, beta-adjusted exposure)
  4. Portfolio-level metrics are calculated (total exposure, portfolio beta, etc.)
  5. Exposure breakdowns are created for visualization

The canonical implementations for these calculations are in portfolio_value.py.
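The bottom-up aggregation described in these steps can be illustrated with a small sketch. It assumes each position has already been reduced to a (ticker, exposure, beta) tuple and that beta-adjusted exposure is exposure times beta; the function names and dictionary keys are hypothetical, and portfolio_value.py remains the canonical implementation.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

PositionRow = Tuple[str, float, float]  # (ticker, exposure, beta)


def group_exposures(positions: Iterable[PositionRow]) -> Dict[str, Dict[str, float]]:
    """Sum exposure and beta-adjusted exposure per underlying ticker."""
    groups: Dict[str, Dict[str, float]] = defaultdict(
        lambda: {"net_exposure": 0.0, "beta_adjusted_exposure": 0.0}
    )
    for ticker, exposure, beta in positions:
        groups[ticker]["net_exposure"] += exposure
        groups[ticker]["beta_adjusted_exposure"] += exposure * beta
    return dict(groups)


def portfolio_totals(groups: Dict[str, Dict[str, float]]) -> Dict[str, float]:
    """Roll group-level metrics up to the portfolio level."""
    return {
        "net_exposure": sum(g["net_exposure"] for g in groups.values()),
        "beta_adjusted_exposure": sum(
            g["beta_adjusted_exposure"] for g in groups.values()
        ),
    }


positions = [
    ("AAPL", 15000.0, 1.2),   # long stock or long-delta option exposure
    ("AAPL", -4500.0, 1.2),   # offsetting short exposure on the same ticker
    ("SPY", 10000.0, 1.0),
]
groups = group_exposures(positions)
totals = portfolio_totals(groups)
```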

Web UI Components

The web UI is built with Dash and consists of several key components:

  1. Summary Cards: Display high-level portfolio metrics
  2. Charts: Visualize portfolio allocation and exposure
  3. Portfolio Table: Display all positions with key metrics
  4. Position Details: Show detailed information for a selected position
  5. P&L Chart: Visualize profit/loss scenarios for options strategies

Each component is defined in the components directory and registered with callbacks in app.py.

Component Interaction

Components interact through Dash callbacks:

  1. Data is stored in dcc.Store components that act as client-side state
  2. User interactions trigger callbacks that update the stored data
  3. Components subscribe to changes in the stored data and update accordingly
  4. This pattern allows for a reactive UI without page reloads

CLI Interface

The CLI (focli) is a command-line tool for portfolio analysis and simulation:

Architecture

  1. Shell: An interactive shell implemented in shell.py using the cmd module
  2. Commands: Command handlers in the commands directory
  3. Formatters: Output formatting utilities in formatters.py
  4. Utils: CLI-specific utilities in utils.py

Command Structure

The CLI follows a command-subcommand structure:

folio> command [subcommand] [options]

Key commands include:

  • simulate: Simulate portfolio performance with SPY changes
  • position: Analyze a specific position group
  • portfolio: View and analyze portfolio data

Separation of Concerns

The CLI strictly adheres to the separation of concerns principles:

  • Command handlers only handle parsing, validation, and presentation
  • All business logic is delegated to the core library
  • No calculation or simulation logic exists in the CLI layer
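A thin command handler in this style might look like the sketch below, built on the standard library cmd module. The simulate_spy_change function stands in for a core-library call and uses a simple first-order estimate (beta-adjusted exposure times the SPY move); that formula, the "simulate spy <percent>" syntax, and the class name FolioShell are all assumptions for illustration.

```python
import cmd
import io


def simulate_spy_change(beta_adjusted_exposure: float, spy_change: float) -> float:
    """Hypothetical core-library function: first-order P&L for a SPY move."""
    return beta_adjusted_exposure * spy_change


class FolioShell(cmd.Cmd):
    """Interactive shell: parses and validates input, then delegates."""

    prompt = "folio> "

    def __init__(self, beta_adjusted_exposure: float, **kwargs) -> None:
        super().__init__(**kwargs)
        self.beta_adjusted_exposure = beta_adjusted_exposure

    def do_simulate(self, arg: str) -> None:
        """simulate spy <percent>: estimate P&L for a SPY move."""
        parts = arg.split()
        if len(parts) != 2 or parts[0] != "spy":
            self.stdout.write("usage: simulate spy <percent>\n")
            return
        try:
            pct = float(parts[1])
        except ValueError:
            self.stdout.write(f"invalid percent: {parts[1]}\n")
            return
        # All calculation happens in the core function, not in the handler.
        pnl = simulate_spy_change(self.beta_adjusted_exposure, pct / 100.0)
        self.stdout.write(f"Estimated P&L for SPY {pct:+.1f}%: {pnl:+,.2f}\n")

    def do_exit(self, arg: str) -> bool:
        """exit: leave the shell."""
        return True


# Drive one command programmatically and capture the output.
out = io.StringIO()
shell = FolioShell(10000.0, stdout=out)
shell.onecmd("simulate spy -5")
```

The handler contains only parsing, validation, and formatting, so the same core function could back a web callback without any change.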

Key Modules

Core Library (src/folio/)

Data Processing

  • portfolio.py: Core portfolio processing logic
  • portfolio_value.py: Canonical implementations of portfolio value calculations
  • simulator.py: Portfolio and position simulation logic
  • options.py: Option pricing and Greeks calculations
  • cash_detection.py: Identification of cash-like positions

Data Fetching

  • stockdata.py: Common interface for data fetchers
  • yfinance.py: Yahoo Finance data fetcher
  • fmp.py: Financial Modeling Prep data fetcher

Application Core

  • data_model.py: Core data structures
  • logger.py: Logging configuration
  • security.py: Security utilities for validating user inputs

Web UI (src/folio/)

UI Components

  • components/: UI components for the dashboard
    • charts.py: Portfolio visualization charts
    • portfolio_table.py: Table of portfolio positions
    • position_details.py: Detailed view of a position
    • pnl_chart.py: Profit/loss visualization
    • summary_cards.py: High-level portfolio metrics

Web Application

  • app.py: Main Dash application setup and callbacks

CLI Interface (src/focli/)

Command Handling

  • shell.py: Interactive shell implementation
  • commands/: Command handlers
    • simulate.py: Portfolio simulation commands
    • position.py: Position analysis commands
    • portfolio.py: Portfolio management commands

Presentation

  • formatters.py: Output formatting utilities
  • utils.py: CLI-specific utilities (no business logic)

Configuration

Folio uses a YAML configuration file (folio.yaml) for runtime settings:

  • Data Source: Configure which data source to use (Yahoo Finance or FMP)
  • Cache Settings: Configure cache directories and TTL
  • UI Settings: Configure dashboard appearance and behavior

The configuration is loaded at startup and can be overridden by environment variables.
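The precedence between file settings and environment variables can be sketched as a layered merge: defaults first, then values from the parsed configuration file, then environment overrides. The setting keys, default values, and the FOLIO_DATA_SOURCE and FOLIO_CACHE_TTL variable names are hypothetical; the sketch takes an already-parsed dictionary in place of reading folio.yaml.

```python
import os
from typing import Any, Dict, Mapping, Optional

# Hypothetical defaults used when neither the file nor the environment sets a value.
DEFAULTS: Dict[str, Any] = {"data_source": "yfinance", "cache_ttl": 3600}


def load_settings(file_settings: Mapping[str, Any],
                  env: Optional[Mapping[str, str]] = None) -> Dict[str, Any]:
    """Merge defaults, file settings, and environment overrides, in that order."""
    env = os.environ if env is None else env
    settings: Dict[str, Any] = {**DEFAULTS, **file_settings}
    # Environment variables win over the file (variable names are assumptions).
    if "FOLIO_DATA_SOURCE" in env:
        settings["data_source"] = env["FOLIO_DATA_SOURCE"]
    if "FOLIO_CACHE_TTL" in env:
        settings["cache_ttl"] = int(env["FOLIO_CACHE_TTL"])
    return settings


from_file = load_settings({"data_source": "fmp"}, env={})
overridden = load_settings(
    {"data_source": "fmp"},
    env={"FOLIO_DATA_SOURCE": "yfinance", "FOLIO_CACHE_TTL": "60"},
)
```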

Error Handling

Folio's error handling follows four principles:

  1. Fail Fast, Fail Transparently: Errors are raised early and clearly
  2. Graceful Degradation: The application continues to function even if some components fail
  3. Structured Logging: Errors are logged with context for debugging
  4. User Feedback: Error messages are displayed to the user when appropriate

Testing

The test suite is built on the following:

  • Unit Tests: Test individual functions and classes
  • Integration Tests: Test interactions between components
  • Mock Data: Use mock data for testing to avoid API calls

Tests are organized to mirror the structure of the source code, with test files corresponding to source files.

Development Workflow

To add new features to Folio:

  1. UI Components: Add new components in the components/ directory
  2. Data Processing: Extend the data model in data_model.py and processing logic in portfolio.py
  3. Callbacks: Add new callbacks in app.py to handle user interactions
  4. Testing: Add tests for new functionality

Separation of Concerns

Folio strictly adheres to separation of concerns principles:

Core Library vs Interface Layers

  1. Core Library (src/folio/):

    • Contains ALL business logic, data processing, and calculation functionality
    • Provides a stable API for interface layers to use
    • Should never depend on interface-specific code
  2. Interface Layers (src/focli/, web UI):

    • Handle user interaction, command parsing, and result presentation
    • Call core library functions to perform business operations
    • Should NEVER contain business logic
    • Focus solely on translating user inputs to core library calls and formatting outputs

Business Logic Placement

Business logic must ALWAYS reside in the core library, not in interface layers. Examples include:

  • Calculations and algorithms
  • Data transformations
  • Simulation logic
  • Portfolio analysis
  • Value calculations

Interface layers should be thin wrappers around the core library, focusing only on:

  • Parsing user input
  • Calling appropriate core library functions
  • Formatting and presenting results
  • Managing UI state

Conclusion

Folio is designed with a clean separation of concerns:

  • Business logic is centralized in the core library
  • Data fetching is abstracted behind interfaces
  • Data processing is separated from UI components
  • UI components are modular and reusable
  • Configuration is externalized for flexibility
  • Interface layers are thin and focused on user interaction

This architecture keeps the codebase maintainable, testable, and extensible: a new feature is implemented once in the core library and then exposed through whichever interface needs it.