Folio Project Design

This document outlines how the Folio codebase is structured and how data flows through the application. Folio provides tools for analyzing and visualizing investment portfolios, with a focus on stocks and options, through both a web-based dashboard and a command-line interface (CLI).

Application Overview

Folio is a Python application that exposes its portfolio analysis through two interfaces:

Web Interface: A Dash-based web application for visualizing portfolio data
CLI (focli): A command-line interface for portfolio analysis and simulation

Both interfaces use the same core library (src/folio/) for business logic, in line with the separation of concerns principles described later in this document. The primary domain entities are outlined below; data_model.py is the source of truth for the data model.

Deployment Modes

Folio can run in multiple deployment environments:

Local Development: Running directly on a developer's machine
Docker Container: Running in a containerized environment
Hugging Face Spaces: Deployed as a Hugging Face Space for public access

The application detects its environment and adjusts settings accordingly, such as cache directories and logging behavior.
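
As a rough illustration of environment detection, the sketch below picks a cache directory per environment. It assumes Hugging Face Spaces exposes a SPACE_ID environment variable and that Docker containers contain a /.dockerenv file; the function names, labels, and cache paths are hypothetical and not taken from the Folio codebase.

```python
import os
from pathlib import Path


def detect_environment(env=None, dockerenv_path="/.dockerenv"):
    """Return 'huggingface', 'docker', or 'local' (hypothetical labels)."""
    env = os.environ if env is None else env
    # Assumption: Hugging Face Spaces sets SPACE_ID for the running Space.
    if env.get("SPACE_ID"):
        return "huggingface"
    # Heuristic: Docker usually creates /.dockerenv inside containers.
    if Path(dockerenv_path).exists():
        return "docker"
    return "local"


def cache_dir_for(environment):
    """Pick a cache directory per environment (paths are illustrative)."""
    if environment == "huggingface":
        return Path("/tmp/folio_cache")
    if environment == "docker":
        return Path("/app/.cache")
    return Path.home() / ".folio" / "cache"
```

Passing the environment mapping as an argument keeps the detection logic testable without modifying os.environ.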

Core Data Model

The core data model consists of several key classes that represent portfolio components:

Position: Base class for all positions
- StockPosition: Represents a stock position with quantity, price, beta, etc.
- OptionPosition: Represents an option position with strike, expiry, option type, delta, etc.
PortfolioGroup: Groups a stock with its related options (e.g., AAPL stock with AAPL options)
PortfolioSummary: Contains aggregated metrics for the entire portfolio
ExposureBreakdown: Detailed breakdown of exposure metrics by category

These classes are defined in data_model.py and provide the foundation for all portfolio analysis.
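
The class hierarchy above might look roughly like the following dataclasses. The class names come from this document, but every field name and default here is an assumption; data_model.py remains the authoritative definition.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Position:
    """Base class; field names are illustrative assumptions."""
    ticker: str
    quantity: float
    price: float


@dataclass
class StockPosition(Position):
    beta: float = 1.0

    @property
    def market_value(self) -> float:
        # Shares times price per share.
        return self.quantity * self.price


@dataclass
class OptionPosition(Position):
    strike: float = 0.0
    expiry: str = ""          # ISO date string in this sketch
    option_type: str = "CALL"  # "CALL" or "PUT"
    delta: float = 0.0


@dataclass
class PortfolioGroup:
    """A stock together with the options written on it."""
    ticker: str
    stock_position: Optional[StockPosition] = None
    option_positions: List[OptionPosition] = field(default_factory=list)
```

For example, an AAPL group would hold one StockPosition and a list of AAPL OptionPosition objects.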

Data Flow

The data flow in Folio follows these main steps:

Data Input: User uploads a portfolio CSV file or loads a sample portfolio
Data Processing: The CSV is parsed, validated, and transformed into structured portfolio data
Position Grouping: Stocks and their related options are grouped together
Metrics Calculation: Exposure, beta, and other metrics are calculated for each position and group
Visualization: The processed data is displayed in the dashboard with charts and tables
Interactivity: User interactions trigger callbacks that update the displayed data

CSV Processing

When a user uploads a CSV file, the following process occurs:

The file is validated for security in security.py
The CSV is parsed into a pandas DataFrame
The DataFrame is processed by process_portfolio_data() in portfolio.py
Stock positions are identified and processed
Option positions are parsed and matched to their underlying stocks
Cash-like positions are identified using cash_detection.py
Portfolio groups and summary metrics are calculated
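
The classification part of these steps can be sketched as follows. Folio parses the CSV with pandas; this sketch uses the standard csv module only to stay dependency-free. The column names (Symbol, Type, Quantity) and the cash-like ticker set are assumptions for illustration, not the format Folio actually expects.

```python
import csv
import io

# Illustrative set of money-market tickers treated as cash (assumption).
CASH_LIKE = {"SPAXX", "FDRXX"}


def classify_rows(csv_text):
    """Split CSV rows into stock, option, and cash-like buckets."""
    buckets = {"stock": [], "option": [], "cash": []}
    for row in csv.DictReader(io.StringIO(csv_text)):
        symbol = (row.get("Symbol") or "").strip()
        if not symbol:
            # Fail early on rows that cannot be attributed to a ticker.
            raise ValueError("row is missing a symbol")
        if symbol in CASH_LIKE:
            kind = "cash"
        elif (row.get("Type") or "").strip().lower() == "option":
            kind = "option"
        else:
            kind = "stock"
        buckets[kind].append(
            {"symbol": symbol, "quantity": float(row["Quantity"])}
        )
    return buckets
```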

Stock Data Fetching

Folio uses a pluggable data fetching system to retrieve stock data:

A DataFetcherInterface defined in stockdata.py provides a common interface
Concrete implementations include YFinanceDataFetcher and FMP (Financial Modeling Prep) fetchers
A singleton pattern ensures only one data fetcher is created throughout the application
The data source can be configured at runtime through the folio.yaml configuration file
Data is cached to improve performance and reduce API calls
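
A minimal sketch of the pluggable-fetcher idea, combining an abstract interface, a caching wrapper, and a module-level singleton. Only the name DataFetcherInterface comes from this document; the fetch_price method, StaticDataFetcher, CachingFetcher, and get_fetcher are hypothetical stand-ins, and the real interface in stockdata.py may differ.

```python
from abc import ABC, abstractmethod


class DataFetcherInterface(ABC):
    """Common interface; the method name is an assumption."""

    @abstractmethod
    def fetch_price(self, ticker):
        ...


class StaticDataFetcher(DataFetcherInterface):
    """Stand-in for a real fetcher; serves prices from a dict."""

    def __init__(self, prices):
        self._prices = dict(prices)
        self.calls = 0

    def fetch_price(self, ticker):
        self.calls += 1
        return self._prices[ticker]


class CachingFetcher(DataFetcherInterface):
    """Wraps another fetcher and remembers results per ticker."""

    def __init__(self, inner):
        self._inner = inner
        self._cache = {}

    def fetch_price(self, ticker):
        if ticker not in self._cache:
            self._cache[ticker] = self._inner.fetch_price(ticker)
        return self._cache[ticker]


_fetcher = None


def get_fetcher(factory):
    """Create the fetcher on first use, then always return that instance."""
    global _fetcher
    if _fetcher is None:
        _fetcher = factory()
    return _fetcher
```

In this sketch the factory argument is where a configured data source would be selected; later calls ignore it and return the existing instance.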

Options Processing

Option positions require special processing:

Option descriptions are parsed in options.py to extract strike, expiry, and option type
QuantLib is used for option pricing and Greeks calculations
Delta exposure is calculated as delta * notional value
Options are matched to their underlying stocks to form portfolio groups
Option metrics are aggregated into the portfolio summary
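
The delta exposure formula can be written out as a small worked example. It assumes that notional value is contracts times a 100-share multiplier times the underlying price, and that short positions carry a negative contract count; Folio's own definition of notional value may differ.

```python
# Assumption: standard US equity option contract size.
CONTRACT_MULTIPLIER = 100


def notional_value(contracts, underlying_price, multiplier=CONTRACT_MULTIPLIER):
    """Signed notional: negative contract counts represent short positions."""
    return contracts * multiplier * underlying_price


def delta_exposure(delta, contracts, underlying_price):
    """Delta exposure = delta * notional value, as described above."""
    return delta * notional_value(contracts, underlying_price)
```

Under these assumptions, two long calls with a delta of 0.5 on a 150.00 underlying give 0.5 * (2 * 100 * 150) = 15,000 of exposure, and one short call with the same delta gives -7,500.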

Portfolio Metrics Calculation

Portfolio metrics are calculated in several steps:

Individual position metrics are calculated first (market value, beta, exposure)
Positions are grouped by underlying ticker
Group-level metrics are calculated (net exposure, beta-adjusted exposure)
Portfolio-level metrics are calculated (total exposure, portfolio beta, etc.)
Exposure breakdowns are created for visualization

The canonical implementations for these calculations are in portfolio_value.py.
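
The aggregation order above (position, then group, then portfolio) can be sketched as follows. The formulas are common conventions assumed for illustration: net exposure as stock value plus option delta exposures, beta-adjusted exposure as net exposure times beta, and portfolio beta as the exposure-weighted ratio. portfolio_value.py remains the canonical implementation.

```python
def group_metrics(stock_value, beta, option_delta_exposures):
    """Group-level metrics for one underlying ticker."""
    net = stock_value + sum(option_delta_exposures)
    return {"net_exposure": net, "beta_adjusted_exposure": net * beta}


def portfolio_metrics(groups):
    """Roll group metrics up to portfolio level."""
    total = sum(g["net_exposure"] for g in groups)
    total_beta_adj = sum(g["beta_adjusted_exposure"] for g in groups)
    # Avoid dividing by zero for an empty or fully hedged portfolio.
    beta = total_beta_adj / total if total else 0.0
    return {
        "total_exposure": total,
        "beta_adjusted_exposure": total_beta_adj,
        "portfolio_beta": beta,
    }
```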

Web UI Components

The web UI is built with Dash and consists of several key components:

Summary Cards: Display high-level portfolio metrics
Charts: Visualize portfolio allocation and exposure
Portfolio Table: Display all positions with key metrics
Position Details: Show detailed information for a selected position
P&L Chart: Visualize profit/loss scenarios for options strategies

Each component is defined in the components directory and registered with callbacks in app.py.

Component Interaction

Components interact through Dash callbacks:

Data is stored in dcc.Store components that act as client-side state
User interactions trigger callbacks that update the stored data
Components subscribe to changes in the stored data and update accordingly
This pattern allows for a reactive UI without page reloads

CLI

The CLI (focli) is a command-line tool for portfolio analysis and simulation.

Architecture

Shell: An interactive shell implemented in shell.py using the cmd module
Commands: Command handlers in the commands directory
Formatters: Output formatting utilities in formatters.py
Utils: CLI-specific utilities in utils.py

Command Structure

The CLI follows a command-subcommand structure:

folio> command [subcommand] [options]

Key commands include:

simulate: Simulate portfolio performance with SPY changes
position: Analyze a specific position group
portfolio: View and analyze portfolio data

Separation of Concerns

The CLI strictly adheres to the separation of concerns principles:

Command handlers only handle parsing, validation, and presentation
All business logic is delegated to the core library
No calculation or simulation logic exists in the CLI layer
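
A thin command handler built on the standard cmd module might look like the sketch below. The handler only parses, validates, and prints; the calculation lives in a separate function that stands in for a core-library call. FolioShell, simulate_spy_change, the "simulate spy <percent>" syntax, and the beta-times-move formula are all hypothetical and do not reflect the actual focli commands.

```python
import cmd
import shlex


def simulate_spy_change(positions, spy_change):
    """Stand-in for a core-library function (hypothetical formula)."""
    return sum(p["value"] * p["beta"] * spy_change for p in positions)


class FolioShell(cmd.Cmd):
    prompt = "folio> "

    def __init__(self, positions, **kwargs):
        super().__init__(**kwargs)
        self.positions = positions

    def do_simulate(self, arg):
        """simulate spy <percent>: estimate P&L for a SPY move."""
        parts = shlex.split(arg)
        # Parsing and validation stay in the CLI layer.
        if len(parts) != 2 or parts[0] != "spy":
            self.stdout.write("usage: simulate spy <percent>\n")
            return
        try:
            pct = float(parts[1])
        except ValueError:
            self.stdout.write("percent must be a number\n")
            return
        # The calculation itself is delegated to the core function.
        pnl = simulate_spy_change(self.positions, pct / 100)
        self.stdout.write(f"Estimated P&L: {pnl:+.2f}\n")

    def do_exit(self, arg):
        """exit: leave the shell."""
        return True
```

Calling FolioShell(positions).cmdloop() would start the interactive prompt; onecmd("simulate spy 10") runs a single command, which is convenient in tests.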

Key Modules

Core Library (src/folio/)

Data Processing

portfolio.py: Core portfolio processing logic
portfolio_value.py: Canonical implementations of portfolio value calculations
simulator.py: Portfolio and position simulation logic
options.py: Option pricing and Greeks calculations
cash_detection.py: Identification of cash-like positions

Data Fetching

stockdata.py: Common interface for data fetchers
yfinance.py: Yahoo Finance data fetcher
fmp.py: Financial Modeling Prep data fetcher

Application Core

data_model.py: Core data structures
logger.py: Logging configuration
security.py: Security utilities for validating user inputs

Web UI (src/folio/)

UI Components

components/: UI components for the dashboard
- charts.py: Portfolio visualization charts
- portfolio_table.py: Table of portfolio positions
- position_details.py: Detailed view of a position
- pnl_chart.py: Profit/loss visualization
- summary_cards.py: High-level portfolio metrics

Web Application

app.py: Main Dash application setup and callbacks

CLI Interface (src/focli/)

Command Handling

shell.py: Interactive shell implementation
commands/: Command handlers
- simulate.py: Portfolio simulation commands
- position.py: Position analysis commands
- portfolio.py: Portfolio management commands

Presentation

formatters.py: Output formatting utilities
utils.py: CLI-specific utilities (no business logic)

Configuration

Folio uses a YAML configuration file (folio.yaml) for runtime settings:

Data Source: Configure which data source to use (Yahoo Finance or FMP)
Cache Settings: Configure cache directories and TTL
UI Settings: Configure dashboard appearance and behavior

The configuration is loaded at startup and can be overridden by environment variables.
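
One way to layer environment variables over file-based settings is sketched below. It starts from an already-parsed configuration dict, so no YAML library is needed; the keys, default values, and the FOLIO_ prefix are assumptions, not the names Folio actually uses.

```python
import os

# Hypothetical settings as they might look after parsing folio.yaml.
DEFAULTS = {"data_source": "yfinance", "cache_ttl": 3600}


def apply_env_overrides(config, env=None, prefix="FOLIO_"):
    """Return a copy of config with PREFIX + KEY environment overrides."""
    env = os.environ if env is None else env
    merged = dict(config)
    for key, current in config.items():
        raw = env.get(prefix + key.upper())
        if raw is None:
            continue
        # Coerce to the type of the existing value (str and int only;
        # booleans would need explicit parsing).
        merged[key] = type(current)(raw)
    return merged
```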

Error Handling

Folio's error handling follows these principles:

Fail Fast, Fail Transparently: Errors are raised early and clearly
Graceful Degradation: The application continues to function even if some components fail
Structured Logging: Errors are logged with context for debugging
User Feedback: Error messages are displayed to the user when appropriate
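
Failing fast and degrading gracefully apply to different kinds of errors. The sketch below illustrates one possible split: invalid user input raises immediately, while a failed market-data lookup is logged with context and reported as missing. The function names, the choice of exceptions, and the decision to return None are assumptions, not Folio's actual behavior.

```python
import logging

logger = logging.getLogger("folio.sketch")


def parse_quantity(raw):
    """Fail fast: reject bad input with a clear message."""
    try:
        return float(raw)
    except (TypeError, ValueError) as exc:
        raise ValueError(f"invalid quantity: {raw!r}") from exc


def beta_or_none(ticker, fetch_beta):
    """Degrade gracefully: a failed lookup does not stop processing."""
    try:
        return fetch_beta(ticker)
    except (ConnectionError, TimeoutError, KeyError) as exc:
        # Log with context so the failure stays visible.
        logger.warning("beta lookup failed for %s: %s", ticker, exc)
        return None
```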

Testing

The codebase is tested in the following ways:

Unit Tests: Test individual functions and classes
Integration Tests: Test interactions between components
Mock Data: Use mock data for testing to avoid API calls

Tests are organized to mirror the structure of the source code, with test files corresponding to source files.
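
A test that avoids API calls might replace the data fetcher with a mock, as in this sketch using unittest.mock from the standard library. The position_value function and the fetch_price method are hypothetical examples rather than names from the Folio codebase.

```python
from unittest.mock import Mock


def position_value(ticker, quantity, fetcher):
    """Example function under test: quantity times the fetched price."""
    return quantity * fetcher.fetch_price(ticker)


def test_position_value_uses_fetcher():
    # The mock stands in for a real fetcher, so no network call is made.
    fetcher = Mock()
    fetcher.fetch_price.return_value = 200.0

    assert position_value("AAPL", 5, fetcher) == 1000.0
    fetcher.fetch_price.assert_called_once_with("AAPL")
```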

Development Workflow

To add new features to Folio:

UI Components: Add new components in the components/ directory
Data Processing: Extend the data model in data_model.py and the processing logic in portfolio.py
Callbacks: Add new callbacks in app.py to handle user interactions
Testing: Add tests for new functionality

Separation of Concerns

Folio strictly adheres to separation of concerns principles:

Core Library vs Interface Layers

Core Library (src/folio/):
- Contains ALL business logic, data processing, and calculation functionality
- Provides a stable API for interface layers to use
- Should never depend on interface-specific code
Interface Layers (src/focli/, web UI):
- Handle user interaction, command parsing, and result presentation
- Call core library functions to perform business operations
- Should NEVER contain business logic
- Focus solely on translating user inputs to core library calls and formatting outputs

Business Logic Placement

Business logic must ALWAYS reside in the core library, not in interface layers. Examples include:

Calculations and algorithms
Data transformations
Simulation logic
Portfolio analysis
Value calculations

Interface layers should be thin wrappers around the core library, focusing only on:

Parsing user input
Calling appropriate core library functions
Formatting and presenting results
Managing UI state

Conclusion

Folio is designed with a clean separation of concerns:

Business logic is centralized in the core library
Data fetching is abstracted behind interfaces
Data processing is separated from UI components
UI components are modular and reusable
Configuration is externalized for flexibility
Interface layers are thin and focused on user interaction

This architecture makes the codebase maintainable, testable, and extensible, allowing for easy addition of new features and improvements.