dystomachina committed
Commit 203d1e9 · Parent: 8803281

refactor(data-fetcher): simplify data fetcher to use only YFinance


BREAKING CHANGE: Removed FMP data fetcher and related configuration options

- Removed src/fmp.py module and all related tests
- Consolidated DataFetcherSingleton into a single implementation in src/stockdata.py
- Removed data_source configuration option from folio.yaml
- Simplified create_data_fetcher to only support YFinance
- Updated all imports and usages to use the simplified data fetcher
- Removed scripts/check_beta.py and tests/fetch_sample_data.py that depended on FMP
- Updated documentation to reflect YFinance as the only supported data source

This change simplifies the codebase by removing unnecessary abstraction and
focusing on a single data source implementation.
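
The `create_data_fetcher` simplification can be pictured with a minimal sketch. Everything below is hypothetical — `YFinanceDataFetcher`, `cache_ttl`, and the module-level singleton are assumed names illustrating the bullet points above, not the actual contents of `src/stockdata.py`:

```python
# Hypothetical sketch of the simplified factory in src/stockdata.py.
# YFinanceDataFetcher and cache_ttl are assumed names, not the real API.

class YFinanceDataFetcher:
    """Fetches stock data from Yahoo Finance via the yfinance package."""

    def __init__(self, cache_ttl: int = 300):
        self.cache_ttl = cache_ttl


_instance = None  # module-level singleton, replacing DataFetcherSingleton


def create_data_fetcher(cache_ttl: int = 300) -> YFinanceDataFetcher:
    """Return the single YFinance-backed fetcher; no data_source switch needed."""
    global _instance
    if _instance is None:
        _instance = YFinanceDataFetcher(cache_ttl=cache_ttl)
    return _instance
```

The point of the change is visible even in this sketch: with only one backend, the factory no longer needs a `data_source` switch or a separate singleton class.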

.env.example CHANGED
@@ -1,8 +1,6 @@
 # Environment variables for the Folio application
 
 # API Keys
-# FMP (FinancialModelingPrep) API - not required when using yfinance:
-FMP_API_KEY=your_fmp_api_key_here
 
 # Used for portfolio analysis premium feature
 GEMINI_API_KEY=your_gemini_api_key_here
.gitignore CHANGED
@@ -80,5 +80,5 @@ src/lab/portfolio.csv
 .executive.md
 private-data/*
 
-# Documentation folder (temporary exclusion)
-docs/
+# Local documentation folder
+.docs/
DOCKER.md CHANGED
@@ -99,7 +99,7 @@ If you prefer to use Docker directly without Make:
 
 - **Port conflicts**: If port 8050 is already in use, modify the `PORT` environment variable in your `.env` file and update the port mapping in `docker-compose.yml`.
 
-- **Data source issues**: By default, the application uses yfinance as the data source. If you want to use FMP instead, you'll need to set the FMP_API_KEY in your `.env` file and change DATA_SOURCE to 'fmp'.
+- **Data source**: The application uses yfinance as the data source for stock data.
 
 - **Volume mounting**: If you're making changes to the code and want to see them reflected immediately, ensure the volumes in `docker-compose.yml` are correctly mapping your local directories.
 
README.md CHANGED
@@ -17,20 +17,19 @@ Folio is a powerful web-based dashboard for analyzing and optimizing your invest
 
 - **Complete Portfolio Visibility**: See your entire financial picture in one place
 - **Smart Risk Assessment**: Understand your portfolio's risk profile with beta analysis
-- **AI-Powered Insights**: Get personalized investment advice from our AI portfolio advisor
 - **Cash & Equivalents Detection**: Automatically identifies money market and cash-like positions
-- **Option Analytics**: Detailed metrics for options including implied volatility and Greeks
+- **Option Analytics**: Detailed metrics for options including delta exposure and notional value
 - **Zero Cost**: Free to use, with no hidden fees or subscriptions
 
 ## Key Features
 
 - **Portfolio Summary**: View total exposure, beta, and allocation breakdown
 - **Position Details**: Analyze individual positions with detailed metrics
-- **AI Portfolio Advisor**: Get personalized investment advice powered by Google's Gemini AI
+- **Position Grouping**: Automatically groups stocks with their related options
+- **P&L Visualization**: See potential profit/loss scenarios for option strategies
 - **Filtering & Sorting**: Filter by position type and sort by various metrics
 - **Real-time Data**: Uses Yahoo Finance API for up-to-date market data
 - **Responsive Design**: Works seamlessly on desktop and mobile devices
-- **Dark Mode**: Easy on the eyes for late-night financial analysis
 
 ## Getting Started
 
@@ -99,12 +98,14 @@ For more Docker commands and options, see [DOCKER.md](DOCKER.md).
 
 For information about logging configuration, see [docs/logging.md](docs/logging.md).
 
+For a detailed explanation of the project architecture, see [docs/project-design.md](docs/project-design.md).
+
 ## Using Folio
 
 1. **Upload Your Portfolio**: Use the upload button to import a CSV file with your holdings
 2. **Explore Your Data**: View summary metrics and detailed breakdowns of your investments
 3. **Filter and Sort**: Focus on specific asset types or metrics that matter to you
-4. **Get AI Insights**: Click the "Robot Advisor" button to get personalized advice about your portfolio
+4. **Analyze Positions**: Click on any position to see detailed metrics and P&L scenarios
 5. **Export or Share**: Save your analysis or share insights with your financial advisor
 
 ## Sample Portfolio
docs/ai-rules.md ADDED
@@ -0,0 +1,14 @@
---
description: Miscellaneous rules to get the AI to behave
globs: *
alwaysApply: true
---
# General rules for AI
- Prior to generating any code, carefully read the project conventions
- Read [project-design.md](docs/project-design.md) to understand the codebase
- Read [project-conventions.md](docs/project-conventions.md) to understand _how_ to write code for the codebase

## Prohibited actions

- Do not run `make folio`. This is for the user to run only.
- Do not use `git` commands unless explicitly asked.
docs/coding-guidelines.md ADDED
@@ -0,0 +1,405 @@
# A Concise, Opinionated Guide to Writing Good Code (with Python examples)

This guide summarizes core principles for writing clean, maintainable, and effective code. It's opinionated and rule-based, designed to provide clear direction for junior developers. Adhering to these rules will help you build better software and become a more valuable team member. Python examples are provided for clarity.

## 1. Naming Matters Immensely

* **Rule:** Use intention-revealing names.
  * **Don't:** `d = (datetime.now() - start_date).days`
  * **Do:** `elapsed_time_in_days = (datetime.now() - start_date).days`
* **Rule:** Avoid disinformation.
  * **Don't:** `account_list = {"id": 1, "name": "Alice"}` (It's a dictionary, not a list)
  * **Do:** `account_data = {"id": 1, "name": "Alice"}` or `account_dict = ...`
* **Rule:** Use pronounceable and searchable names.
  * **Don't:** `genymdhms = datetime.now().strftime('%Y%m%d%H%M%S')`
  * **Do:** `generation_timestamp = datetime.now().strftime('%Y%m%d%H%M%S')`
* **Rule:** Be consistent.
  * **Don't:** Using `fetch_user_data`, `getUserInfo`, `retrieve_client_details` in the same project.
  * **Do:** Consistently use one style, e.g., `get_user_data`, `get_order_info`, `get_product_details`.

## 2. Functions Should Be Small and Focused

* **Rule:** Functions must do **one thing**.
  * **Don't:**
    ```python
    def process_user_data(user_id):
        # Fetches data
        response = requests.get(f"/api/users/{user_id}")
        user_data = response.json()
        # Validates data
        if not user_data.get("email"):
            raise ValueError("Email missing")
        # Saves data
        db.save(user_data)
        # Sends notification
        send_email(user_data["email"], "Welcome!")
        return user_data
    ```
  * **Do:** Break it down:
    ```python
    def fetch_user_data(user_id):
        response = requests.get(f"/api/users/{user_id}")
        response.raise_for_status()  # Raise HTTP errors
        return response.json()

    def validate_user_data(user_data):
        if not user_data.get("email"):
            raise ValueError("Email missing")
        # ... other validations

    def save_user_data(user_data):
        db.save(user_data)

    def send_welcome_email(email_address):
        send_email(email_address, "Welcome!")

    def register_user(user_id):
        user_data = fetch_user_data(user_id)
        validate_user_data(user_data)
        save_user_data(user_data)
        send_welcome_email(user_data["email"])
        return user_data
    ```
* **Rule:** Functions must be **small**. (The "Do" example above also illustrates this.)
* **Rule:** Minimize function arguments.
  * **Don't:** `def create_user(name, email, password, dob, address, phone, role, is_active): ...`
  * **Do:**
    ```python
    class UserProfile:
        def __init__(self, name, email, dob, address, phone):
            # ... initialization ...

    def create_user(profile: UserProfile, password: str, role: str, is_active: bool = True):
        # ... use profile attributes ...
    ```
    Or pass a dictionary:
    ```python
    def create_user(user_details: dict):
        # Access details via user_details['name'], user_details['email'], etc.
        # Consider using TypedDict for better structure if using Python 3.8+
        ...
    ```
* **Rule:** Avoid side effects where possible.
  * **Don't (Hidden Side Effect):**
    ```python
    user_list = []

    def add_user_if_valid(name, email):
        if "@" in email:
            user_list.append({"name": name, "email": email})  # Modifies global state
            return True
        return False
    ```
  * **Do (Explicit):**
    ```python
    def create_user_record(name, email):
        if "@" not in email:
            raise ValueError("Invalid email")
        return {"name": name, "email": email}

    # Usage
    try:
        new_user = create_user_record("Bob", "bob@example.com")
        user_list.append(new_user)  # State change happens outside the function
    except ValueError as e:
        print(f"Error: {e}")
    ```

## 3. Comments Are for "Why," Not "What"

* **Rule:** Comment the "Why," not the "What."
  * **Don't:**
    ```python
    # Check if user is eligible
    if age >= 18 and country == "US":  # This just repeats the code
        is_eligible = True
    ```
  * **Do:**
    ```python
    # User must be a legal adult in the US to qualify for this specific offer.
    if age >= 18 and country == "US":
        is_eligible = True
    ```
* **Rule:** Do **not** leave commented-out code.
  * **Don't:**
    ```python
    def calculate_total(items):
        total = 0
        for item in items:
            total += item['price']
        # tax = total * 0.10  # Old tax calculation
        # total += tax
        total *= 1.10  # Apply 10% tax
        return total
    ```
  * **Do:** Remove the commented lines. Use Git history if you need to see the old calculation.
    ```python
    def calculate_total(items):
        total = sum(item['price'] for item in items)
        total *= 1.10  # Apply 10% tax
        return total
    ```
* **Rule:** Keep comments up-to-date. (Self-explanatory: if the logic changes, update or remove the comment.)
* **Rule:** Avoid redundant comments.
  * **Don't:**
    ```python
    count = 0  # Initialize count
    count += 1  # Increment count
    ```
  * **Do:** Just the code is enough.
    ```python
    count = 0
    count += 1
    ```

## 4. Formatting and Structure Enhance Readability

* **Rule:** Use a consistent style guide (e.g., PEP 8 for Python). Use tools like `Black`, `Flake8`, `isort`.
  * **Don't:** Inconsistent spacing, line lengths, import orders.
  * **Do:** Code automatically formatted by tools like `Black`.
* **Rule:** Top-down narrative.
  * **Don't:** Define helper functions *before* the main function that uses them, forcing the reader to jump around.
  * **Do:**
    ```python
    def main_process():
        data = _fetch_data()
        result = _process_data(data)
        _save_result(result)

    # --- Helper functions defined below ---
    def _fetch_data(): ...
    def _process_data(data): ...
    def _save_result(result): ...
    ```
    *(Note: A leading underscore `_` often indicates internal/helper functions.)*
* **Rule:** Keep related concepts vertically close. (The example above also shows this.)
* **Rule:** Use whitespace.
  * **Don't:**
    ```python
    def process(a,b,c):
        x=a+b
        y=x*c
        if y>10:
            print("Large")
        else:
            print("Small")
        z=y-a
        return z
    ```
  * **Do:**
    ```python
    def process(a, b, c):
        intermediate_value = a + b
        final_value = intermediate_value * c

        if final_value > 10:
            print("Large")
        else:
            print("Small")

        adjusted_value = final_value - a
        return adjusted_value
    ```

## 5. Keep It Simple (KISS & YAGNI)

* **Rule:** KISS (Keep It Simple, Stupid).
  * **Don't:** Using complex metaprogramming or obscure language features when a simple loop or conditional would suffice.
  * **Do:** Prefer straightforward, readable solutions.
* **Rule:** YAGNI (You Ain't Gonna Need It).
  * **Don't:** Adding configuration options, database fields, or API endpoints for features that *might* be needed in the future but aren't required now.
  * **Do:** Implement only what's necessary for the current requirements.
* **Rule:** Avoid premature optimization.
  * **Don't:** Spending hours micro-optimizing a function with string concatenations before profiling to see if it's even a bottleneck.
  * **Do:** Write clean code first. If performance is an issue (measure it!), profile and optimize the specific hotspots. Often, a better algorithm beats micro-optimization.

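The KISS and YAGNI rules above can be made concrete with a small sketch (the exporter scenario and all names here are invented for this guide):

```python
# YAGNI: don't build a pluggable exporter framework when one format is required.

# ❌ Speculative generality: a base class and registry nobody asked for
class Exporter:
    def export(self, rows):
        raise NotImplementedError

class CsvExporter(Exporter):
    def export(self, rows):
        return "\n".join(",".join(map(str, r)) for r in rows)

EXPORTERS = {"csv": CsvExporter()}  # only ever one entry

def export(rows, fmt="csv"):
    return EXPORTERS[fmt].export(rows)

# ✅ YAGNI: the current requirement is CSV, so write exactly that
def export_csv(rows):
    return "\n".join(",".join(map(str, r)) for r in rows)
```

If a second format ever becomes a real requirement, introducing the abstraction then is a small, well-informed refactor; introducing it now is pure speculation.
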
## 6. Don't Repeat Yourself (DRY)

* **Rule:** Avoid duplication.
  * **Don't:**
    ```python
    def process_file_a(path):
        # 10 lines of validation logic
        if not valid: return None
        # Process file A specific logic
        ...

    def process_file_b(path):
        # Same 10 lines of validation logic copied here
        if not valid: return None
        # Process file B specific logic
        ...
    ```
  * **Do:**
    ```python
    def _validate_input(path):
        # 10 lines of validation logic
        return is_valid

    def process_file_a(path):
        if not _validate_input(path): return None
        # Process file A specific logic
        ...

    def process_file_b(path):
        if not _validate_input(path): return None
        # Process file B specific logic
        ...
    ```

## 7. Handle Errors Gracefully

* **Rule:** Use exceptions over error codes.
  * **Don't:**
    ```python
    def divide(a, b):
        if b == 0:
            return -1  # Error code
        return a / b

    result = divide(10, 0)
    if result == -1:
        print("Error: Division by zero")
    ```
  * **Do:**
    ```python
    def divide(a, b):
        if b == 0:
            raise ValueError("Cannot divide by zero")
        return a / b

    try:
        result = divide(10, 0)
    except ValueError as e:
        print(f"Error: {e}")
    ```
* **Rule:** Provide context with errors.
  * **Don't:** `raise Exception("Error!")`
  * **Do:** `raise ValueError(f"Invalid user ID format: '{user_id_str}'")`

## 8. Test Your Code

* **Rule:** Write unit tests (using frameworks like `pytest` or `unittest`).
  * **Don't:** Skipping tests because the code "looks simple."
  * **Do:**
    ```python
    # Example using pytest
    from my_module import add

    def test_add_positive_numbers():
        assert add(2, 3) == 5

    def test_add_negative_numbers():
        assert add(-1, -1) == -2

    def test_add_mixed_numbers():
        assert add(5, -3) == 2
    ```
* **Rule:** Test behavior, not implementation.
  * **Don't:** Writing a test that checks if a specific private helper method (`_helper`) was called.
  * **Do:** Writing a test that checks if the public method produces the correct output or state change, regardless of which internal helpers were used.
* **Rule:** Keep tests clean, readable, and fast. (Apply the same principles from this guide to your test code.)

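A minimal sketch of the behavior-vs-implementation distinction (the `ShoppingCart` class and its discount policy are invented for illustration):

```python
class ShoppingCart:
    def __init__(self):
        self._items = []

    def _apply_discount(self, total):  # private helper; an implementation detail
        return total * 0.9 if total > 100 else total

    def add(self, price):
        self._items.append(price)

    def total(self):
        return self._apply_discount(sum(self._items))

# ❌ Implementation-coupled: breaks if _apply_discount is renamed or inlined
# def test_total(): assert hasattr(cart, "_apply_discount")

# ✅ Behavior: checks only the observable result
def test_discount_applied_over_100():
    cart = ShoppingCart()
    cart.add(80)
    cart.add(40)
    assert cart.total() == 108  # 120 with 10% discount applied
```

The second test keeps passing through any refactor that preserves the pricing behavior, which is exactly what a regression test is for.
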
## 9. Practice Continuous Refactoring

* **Rule:** Follow the Boy Scout Rule.
  * **Don't:** Seeing a poorly named variable or a slightly complex block of code and leaving it because "it works."
  * **Do:** Taking a few moments to rename the variable or extract a small function to improve clarity before committing your primary change.
* **Rule:** Refactoring is part of development. (This is a mindset, less about specific code examples.)

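A tiny before/after sketch of a Boy Scout cleanup (the function and field names are invented for illustration):

```python
# Before: while fixing a bug nearby, you notice this cryptic helper
def chk(d):
    return d.get("qty", 0) * d.get("px", 0.0)

# After: a 30-second cleanup done alongside the primary change
def position_market_value(position):
    quantity = position.get("qty", 0)
    price = position.get("px", 0.0)
    return quantity * price
```

The behavior is identical, but the next reader no longer has to reverse-engineer what `chk` does.
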
## 10. Optimize for Readability

* **Rule:** Code is read more than written.
  * **Don't:** Using overly clever one-liners or complex list comprehensions that are hard to decipher.
    ```python
    # Clever but potentially hard to read
    result = [x**2 for x in range(10) if x % 2 == 0 and x > 3]
    ```
  * **Do:** Prioritize clarity, even if it means slightly more verbose code.
    ```python
    result = []
    for x in range(10):
        is_even = x % 2 == 0
        is_greater_than_3 = x > 3
        if is_even and is_greater_than_3:
            result.append(x**2)

    # Or a more readable comprehension if appropriate
    result = [x**2 for x in range(4, 10, 2)]  # Clearer range
    ```

## 11. Python-Specific Best Practices

* **Rule:** Embrace Pythonic idioms.
  * **Use List Comprehensions (when clear):** Prefer `squares = [x*x for x in numbers]` over manual `for` loop appends for simple transformations.
  * **Use Context Managers (`with` statement):** Ensure resources like files or network connections are properly closed.
    ```python
    # Don't
    f = open("myfile.txt", "w")
    try:
        f.write("Hello")
    finally:
        f.close()

    # Do
    with open("myfile.txt", "w") as f:
        f.write("Hello")
    # File is automatically closed here, even if errors occur
    ```
  * **Iterate Directly:** Iterate over sequences directly instead of using index manipulation.
    ```python
    # Don't
    for i in range(len(my_list)):
        print(my_list[i])

    # Do
    for item in my_list:
        print(item)

    # Do (if the index is needed)
    for i, item in enumerate(my_list):
        print(f"Index {i}: {item}")
    ```
* **Rule:** Use Type Hinting (Python 3.5+). It improves readability, enables static analysis tools (`mypy`), and clarifies intent.
  ```python
  # Don't
  def greet(name):
      print("Hello " + name)

  # Do
  def greet(name: str) -> None:
      print("Hello " + name)

  def add(a: int, b: int) -> int:
      return a + b
  ```
* **Rule:** Use Virtual Environments (`venv`). Isolate project dependencies to avoid conflicts between projects. Always create and activate a virtual environment before installing packages (`pip install ...`).
* **Rule:** Prefer f-strings (Python 3.6+) for string formatting. They are generally more readable and often faster than `.format()` or `%` formatting.
  ```python
  name = "Alice"
  age = 30

  # Don't (older styles)
  print("Name: %s, Age: %d" % (name, age))
  print("Name: {}, Age: {}".format(name, age))

  # Do
  print(f"Name: {name}, Age: {age}")
  ```
* **Rule:** Understand Mutable Default Arguments. Be wary of using mutable types (like lists or dicts) as default function arguments, as they are shared across calls.
  ```python
  # Don't (potential bug)
  def add_item(item, my_list=[]):
      my_list.append(item)
      return my_list

  list1 = add_item(1)  # [1]
  list2 = add_item(2)  # [1, 2] - Unexpected!

  # Do
  def add_item(item, my_list=None):
      if my_list is None:
          my_list = []
      my_list.append(item)
      return my_list

  list1 = add_item(1)  # [1]
  list2 = add_item(2)  # [2] - Correct
  ```
docs/project-conventions.md ADDED
@@ -0,0 +1,360 @@
---
description: Concise coding conventions for the Folio project
alwaysApply: true
---

# Folio Project Conventions

This document outlines the key coding conventions for the Folio project. These conventions are designed to help maintain code quality, readability, and consistency across the codebase.

## Project Tech Stack

- **Web Framework**: Dash (Python)
- **Data Processing**: Pandas, NumPy
- **Financial Data**: Yahoo Finance API (via yfinance)
- **Testing**: Pytest
- **Linting**: Flake8, Black, isort

+
18
+ ## Core Conventions
19
+
20
+ ### 1. Fail Fast and Transparently
21
+
22
+ Never hide errors with default values. Financial data must be accurate or explicitly marked as unavailable.
23
+
24
+ ```python
25
+ # ❌ Bad: Hiding errors with defaults
26
+ def get_beta(ticker):
27
+ try:
28
+ return data_fetcher.get_beta(ticker)
29
+ except Exception:
30
+ return 1.0 # Dangerous default!
31
+
32
+ # ✅ Good: Transparent failure
33
+ def get_beta(ticker):
34
+ try:
35
+ return data_fetcher.get_beta(ticker)
36
+ except Exception as e:
37
+ logger.error(f"Failed to get beta for {ticker}: {e}", exc_info=True)
38
+ raise # Let the caller handle the error
39
+ ```
40
+
41
+ ### 2. Use Intention-Revealing Names
42
+
43
+ Names should clearly communicate what a variable, function, or class is for.
44
+
45
+ ```python
46
+ # ❌ Bad: Unclear names
47
+ def calc(p, q):
48
+ return p * q * 1.1
49
+
50
+ # ✅ Good: Clear names
51
+ def calculate_total_with_tax(price, quantity):
52
+ return price * quantity * 1.1
53
+ ```
54
+
55
+ ### 3. Write Small, Focused Functions
56
+
57
+ Each function should do one thing well and be reasonably small.
58
+
59
+ ```python
60
+ # ❌ Bad: Function doing too much
61
+ def process_portfolio(portfolio_data):
62
+ # Validate data
63
+ if not portfolio_data:
64
+ raise ValueError("Empty portfolio")
65
+
66
+ # Calculate metrics
67
+ total_value = 0
68
+ total_beta_adjusted = 0
69
+ for position in portfolio_data:
70
+ price = position["price"]
71
+ quantity = position["quantity"]
72
+ beta = get_beta(position["ticker"])
73
+ value = price * quantity
74
+ total_value += value
75
+ total_beta_adjusted += value * beta
76
+
77
+ # Generate report
78
+ report = {
79
+ "total_value": total_value,
80
+ "portfolio_beta": total_beta_adjusted / total_value if total_value else 0,
81
+ "positions": len(portfolio_data)
82
+ }
83
+
84
+ # Save to database
85
+ db.save_portfolio_report(report)
86
+
87
+ return report
88
+
89
+ # ✅ Good: Functions with single responsibilities
90
+ def validate_portfolio(portfolio_data):
91
+ if not portfolio_data:
92
+ raise ValueError("Empty portfolio")
93
+ return portfolio_data
94
+
95
+ def calculate_position_metrics(position):
96
+ price = position["price"]
97
+ quantity = position["quantity"]
98
+ beta = get_beta(position["ticker"])
99
+ value = price * quantity
100
+ beta_adjusted = value * beta
101
+ return {"value": value, "beta_adjusted": value * beta}
102
+
103
+ def calculate_portfolio_metrics(portfolio_data):
104
+ validated_data = validate_portfolio(portfolio_data)
105
+
106
+ position_metrics = [calculate_position_metrics(pos) for pos in validated_data]
107
+
108
+ total_value = sum(pos["value"] for pos in position_metrics)
109
+ total_beta_adjusted = sum(pos["beta_adjusted"] for pos in position_metrics)
110
+
111
+ return {
112
+ "total_value": total_value,
113
+ "portfolio_beta": total_beta_adjusted / total_value if total_value else 0,
114
+ "positions": len(portfolio_data)
115
+ }
116
+
117
+ def save_portfolio_report(report):
118
+ db.save_portfolio_report(report)
119
+ return report
120
+
121
+ def process_portfolio(portfolio_data):
122
+ metrics = calculate_portfolio_metrics(portfolio_data)
123
+ return save_portfolio_report(metrics)
124
+ ```
125
+
126
+ ### 4. Validate Early, Return Fast
127
+
128
+ Check inputs at the beginning of functions to avoid deep nesting and keep the happy path clean.
129
+
130
+ ```python
131
+ # ❌ Bad: Deeply nested conditionals
132
+ def process_data(data):
133
+ if data is not None:
134
+ if "ticker" in data:
135
+ if data["ticker"] != "":
136
+ # Process the data...
137
+ return result
138
+ else:
139
+ return None
140
+ else:
141
+ return None
142
+ else:
143
+ return None
144
+
145
+ # ✅ Good: Early validation
146
+ def process_data(data):
147
+ if data is None:
148
+ raise ValueError("Data cannot be None")
149
+ if "ticker" not in data:
150
+ raise ValueError("Missing required 'ticker' field")
151
+ if data["ticker"] == "":
152
+ raise ValueError("Ticker cannot be empty")
153
+
154
+ # Process the data...
155
+ return result
156
+ ```
157
+
158
+ ### 5. Comment the "Why," Not the "What"
159
+
160
+ Explain reasoning behind complex code, not obvious operations.
161
+
162
+ ```python
163
+ # ❌ Bad: Commenting the obvious
164
+ # Calculate the sum of prices
165
+ total = sum(item.price for item in items)
166
+
167
+ # ❌ Bad: Commented-out code
168
+ # Old calculation method
169
+ # for item in items:
170
+ # total += item.price
171
+
172
+ # ✅ Good: Explaining the why
173
+ # Apply 15% discount for bulk orders (>10 items) per company policy
174
+ if len(items) > 10:
175
+ total *= 0.85
176
+ ```
177
+
178
+ ### 6. Write Minimal, Effective Tests
179
+
180
+ Focus on testing critical business logic, not framework functionality.
181
+
182
+ ```python
183
+ # ❌ Bad: Testing framework functionality
184
+ def test_dataframe_creation():
185
+ # This just tests pandas functionality, not our code
186
+ data = {"ticker": ["AAPL"], "price": [150]}
187
+ df = pd.DataFrame(data)
188
+ assert len(df) == 1
189
+ assert "ticker" in df.columns
190
+
191
+ # ✅ Good: Testing critical business logic
192
+ def test_portfolio_beta_calculation():
193
+ # Arrange: Set up test data
194
+ portfolio = Portfolio()
195
+ portfolio.add_position(
196
+ StockPosition(ticker="AAPL", quantity=10, price=150)
197
+ )
198
+
199
+ # Mock external dependencies
200
+ data_fetcher = MagicMock()
201
+ data_fetcher.get_beta.return_value = 1.2
202
+
203
+ # Act: Call the method under test
204
+ beta = portfolio.calculate_beta(data_fetcher=data_fetcher)
205
+
206
+ # Assert: Verify the result
207
+ assert beta == 1.2
208
+ data_fetcher.get_beta.assert_called_once_with("AAPL")
209
+ ```
210
+
211
+ ### 7. Embrace Pythonic Idioms
212
+
213
+ Use Python's built-in features to write cleaner, more readable code.
214
+
215
+ ```python
216
+ # ❌ Bad: Non-Pythonic code
217
+ result = []
218
+ for i in range(len(items)):
219
+ if items[i].price > 100:
220
+ result.append(items[i].name)
221
+
222
+ # ✅ Good: Pythonic code
223
+ result = [item.name for item in items if item.price > 100]
224
+
225
+ # ❌ Bad: Manual resource management
226
+ f = open("data.csv", "r")
227
+ try:
228
+ data = f.read()
229
+ finally:
230
+ f.close()
231
+
232
+ # ✅ Good: Context manager
233
+ with open("data.csv", "r") as f:
234
+ data = f.read()
235
+ ```
236
+
237
+ ### 8. Use Type Hints
238
+
239
+ Add type hints to improve readability and enable static analysis.
240
+
241
+ ```python
242
+ # ❌ Bad: No type hints
243
+ def calculate_position_value(quantity, price):
244
+ return quantity * price
245
+
246
+ # ✅ Good: With type hints
247
+ def calculate_position_value(quantity: float, price: float) -> float:
248
+ return quantity * price
249
+
250
+ # Even better: With more specific types and docstring
251
+ from typing import Dict, List, Optional
252
+
253
+ def get_positions_by_sector(
254
+ positions: List[Dict[str, any]],
255
+ sector: Optional[str] = None
256
+ ) -> Dict[str, List[Dict[str, any]]]:
257
+ """
258
+ Group positions by sector.
259
+
260
+ Args:
261
+ positions: List of position dictionaries
262
+ sector: Optional sector to filter by
263
+
264
+ Returns:
265
+ Dictionary mapping sectors to lists of positions
266
+ """
267
+ result = {}
268
+ for position in positions:
269
+ pos_sector = position.get("sector", "Unknown")
270
+ if sector and pos_sector != sector:
271
+ continue
272
+ if pos_sector not in result:
273
+ result[pos_sector] = []
274
+ result[pos_sector].append(position)
275
+ return result
276
+ ```
277
+
278
+ ### 9. Handle Errors Gracefully
279
+
280
+ Use exceptions with context and handle them appropriately.
281
+
282
+ ```python
283
+ # ❌ Bad: Using error codes
284
+ def divide_stocks(total_value, num_stocks):
285
+ if num_stocks == 0:
286
+ return -1 # Error code
287
+ return total_value / num_stocks
288
+
289
+ # Usage
290
+ result = divide_stocks(1000, 0)
291
+ if result == -1:
292
+ print("Error: Cannot divide by zero")
293
+
294
+ # ✅ Good: Using exceptions
295
+ def divide_stocks(total_value: float, num_stocks: int) -> float:
296
+ if num_stocks == 0:
297
+ raise ValueError("Cannot divide by zero stocks")
298
+ return total_value / num_stocks
299
+
300
+ # Usage
301
+ try:
302
+ result = divide_stocks(1000, 0)
303
+ except ValueError as e:
304
+ logger.error(f"Portfolio calculation error: {e}")
305
+ # Handle the error appropriately
306
+ ```
307
+
308
+ ### 10. Keep It Simple (KISS)
309
+
310
+ Prefer simple, straightforward solutions over complex ones.
311
+
312
+ ```python
313
+ # ❌ Bad: Overly complex
314
+ def is_valid_ticker(ticker):
315
+ if ticker is not None:
316
+ if isinstance(ticker, str):
317
+ if len(ticker) > 0:
318
+ if len(ticker) <= 5:
319
+ if ticker.isalpha():
320
+ return True
321
+ return False
322
+
323
+ # ✅ Good: Simple and clear
324
+ def is_valid_ticker(ticker: str) -> bool:
325
+ return (
326
+ isinstance(ticker, str) and
327
+ 1 <= len(ticker) <= 5 and
328
+ ticker.isalpha()
329
+ )
330
+ ```
331
+
332
+ ## Additional Guidelines
333
+
334
+ 1. **Follow the Boy Scout Rule**: Leave the code cleaner than you found it.
335
+
336
+ 2. **Don't Repeat Yourself (DRY)**: Extract repeated code into reusable functions.
337
+
338
+ 3. **You Aren't Gonna Need It (YAGNI)**: Don't add functionality until it's necessary.
339
+
340
+ 4. **Optimize After Measuring**: Profile code to identify actual bottlenecks before optimizing.
341
+
342
+ 5. **Use Consistent Formatting**: Use Black, Flake8, and isort to maintain consistent code style.
343
+
344
+ 6. **Imports at Top**: Always place all imports at the top of the file.
345
+
346
+ 7. **No Unused Code**: Remove commented-out code and unused imports/variables.
347
+
348
+ 8. **Configuration Over Hardcoding**: Use configuration files for values that might change.
349
+
350
+ 9. **Log with Context**: Include relevant information in log messages.
351
+
352
+ 10. **Make Small, Focused Changes**: Don't modify unrelated code when implementing a feature or fixing a bug.
353
+
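Guidelines 8 and 9 can be shown together in a short sketch (the config shape and key names below are illustrative, not taken from the Folio codebase):

```python
import logging

logger = logging.getLogger(__name__)

# Default values live in one place instead of being scattered as magic numbers.
DEFAULT_CONFIG = {"cache": {"ttl": 86400}}

def get_cache_ttl(config: dict) -> int:
    """Read the cache TTL from config, falling back to a documented default."""
    ttl = config.get("cache", {}).get("ttl", DEFAULT_CONFIG["cache"]["ttl"])
    # Log with context: the message says which value was chosen, not just "done".
    logger.info("Using cache TTL of %d seconds", ttl)
    return ttl
```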
354
+ ## Benefits of Following These Conventions
355
+
356
+ - **Readability**: Code is easier to understand at a glance
357
+ - **Maintainability**: Simpler structure makes changes easier and safer
358
+ - **Testability**: Clear paths make testing more straightforward
359
+ - **Reliability**: Proper error handling prevents unexpected behavior
360
+ - **Performance**: Well-structured code leads to better performance
docs/project-design.md ADDED
@@ -0,0 +1,192 @@
1
+ ---
2
+ description: This document explains the system architecture and data flow of the Folio application
3
+ globs: *
4
+ alwaysApply: true
5
+ ---
6
+
7
+ # Folio Project Design
8
+
9
+ This document outlines how the Folio codebase is structured and how data flows through the application. Folio is a web-based dashboard for analyzing and visualizing investment portfolios, with a focus on stocks and options.
10
+
11
+ ## Application Overview
12
+
13
+ Folio is a Python-based web application built with Dash that provides comprehensive portfolio analysis capabilities. The primary domain entities for this app are outlined below. For an authoritative overview of the data model, [data_model.py](src/folio/data_model.py) is the source of truth.
14
+
15
+ ## Deployment Modes
16
+
17
+ Folio can run in multiple deployment environments:
18
+
19
+ - **Local Development**: Running directly on a developer's machine
20
+ - **Docker Container**: Running in a containerized environment
21
+ - **Hugging Face Spaces**: Deployed as a Hugging Face Space for public access
22
+
23
+ The application detects its environment and adjusts settings accordingly, such as cache directories and logging behavior.
24
+
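A minimal sketch of this kind of environment detection; the `SPACE_ID` variable (set by Hugging Face Spaces), the `/.dockerenv` check, and the cache paths are assumptions, not the actual Folio logic:

```python
import os
from pathlib import Path

def detect_environment() -> str:
    """Best-effort detection of the runtime environment."""
    if os.environ.get("SPACE_ID"):      # Hugging Face Spaces sets this variable
        return "huggingface"
    if Path("/.dockerenv").exists():    # present inside most Docker containers
        return "docker"
    return "local"

def cache_dir_for(env: str) -> str:
    """Pick a writable cache directory for the detected environment."""
    return "/tmp/folio-cache" if env in ("huggingface", "docker") else ".cache"
```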
25
+ ## Core Data Model
26
+
27
+ The core data model consists of several key classes that represent portfolio components:
28
+
29
+ - **Position**: Base class for all positions
30
+ - **StockPosition**: Represents a stock position with quantity, price, beta, etc.
31
+ - **OptionPosition**: Represents an option position with strike, expiry, option type, delta, etc.
32
+ - **PortfolioGroup**: Groups a stock with its related options (e.g., AAPL stock with AAPL options)
33
+ - **PortfolioSummary**: Contains aggregated metrics for the entire portfolio
34
+ - **ExposureBreakdown**: Detailed breakdown of exposure metrics by category
35
+
36
+ These classes are defined in [data_model.py](src/folio/data_model.py) and provide the foundation for all portfolio analysis.
37
+
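An illustrative sketch of how these entities relate (fields are simplified; `data_model.py` remains the authoritative definition):

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class StockPosition:
    ticker: str
    quantity: float
    price: float
    beta: float = 1.0

    @property
    def market_value(self) -> float:
        return self.quantity * self.price

@dataclass
class OptionPosition:
    ticker: str        # underlying ticker
    option_type: str   # "CALL" or "PUT"
    strike: float
    expiry: str        # e.g. "2025-06-20"
    quantity: float
    delta: float

@dataclass
class PortfolioGroup:
    """A stock plus the options written on the same underlying."""
    ticker: str
    stock: Optional[StockPosition] = None
    options: List[OptionPosition] = field(default_factory=list)
```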
38
+ ## Data Flow
39
+
40
+ The data flow in Folio follows these main steps:
41
+
42
+ 1. **Data Input**: User uploads a portfolio CSV file or loads a sample portfolio
43
+ 2. **Data Processing**: The CSV is parsed, validated, and transformed into structured portfolio data
44
+ 3. **Position Grouping**: Stocks and their related options are grouped together
45
+ 4. **Metrics Calculation**: Exposure, beta, and other metrics are calculated for each position and group
46
+ 5. **Visualization**: The processed data is displayed in the dashboard with charts and tables
47
+ 6. **Interactivity**: User interactions trigger callbacks that update the displayed data
48
+
49
+ ### CSV Processing
50
+
51
+ When a user uploads a CSV file, the following process occurs:
52
+
53
+ 1. The file is validated for security in [security.py](src/folio/security.py)
54
+ 2. The CSV is parsed into a pandas DataFrame
55
+ 3. The DataFrame is processed by `process_portfolio_data()` in [portfolio.py](src/folio/portfolio.py)
56
+ 4. Stock positions are identified and processed
57
+ 5. Option positions are parsed and matched to their underlying stocks
58
+ 6. Cash-like positions are identified using [cash_detection.py](src/folio/cash_detection.py)
59
+ 7. Portfolio groups and summary metrics are calculated
60
+
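The parsing and grouping steps can be compressed into a sketch (the column names assume a Fidelity-style export; the real logic lives in `portfolio.py`):

```python
import csv
import io
from typing import Dict, List

def parse_portfolio_csv(csv_text: str) -> List[Dict[str, str]]:
    """Steps 1-2: parse the validated upload into row dictionaries."""
    return list(csv.DictReader(io.StringIO(csv_text)))

def group_positions(rows: List[Dict[str, str]]) -> Dict[str, List[Dict[str, str]]]:
    """Steps 4-6: group rows by ticker so options sit with their underlying."""
    groups: Dict[str, List[Dict[str, str]]] = {}
    for row in rows:
        groups.setdefault(row["Symbol"], []).append(row)
    return groups
```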
61
+ ### Stock Data Fetching
62
+
63
+ Folio uses a small data fetching layer to retrieve stock data:
64
+
65
+ 1. A `DataFetcherInterface` defined in [stockdata.py](src/stockdata.py) provides a common interface
66
+ 2. `YFinanceDataFetcher` is the sole concrete implementation, backed by Yahoo Finance data
67
+ 3. A singleton pattern ensures only one data fetcher is created throughout the application
68
+ 4. Data is cached to improve performance and reduce API calls
70
+
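The interface and singleton arrangement might look roughly like this (names approximate `stockdata.py` rather than quoting it):

```python
from abc import ABC, abstractmethod
from typing import List, Optional

class DataFetcherInterface(ABC):
    """Common interface implemented by all data fetchers."""

    @abstractmethod
    def fetch_data(self, ticker: str, period: str = "3m") -> List[float]:
        ...

class YFinanceDataFetcher(DataFetcherInterface):
    def fetch_data(self, ticker: str, period: str = "3m") -> List[float]:
        # The real implementation calls Yahoo Finance and caches the result.
        return []

_fetcher: Optional[DataFetcherInterface] = None

def get_data_fetcher() -> DataFetcherInterface:
    """Create the fetcher on first use, then reuse the same instance."""
    global _fetcher
    if _fetcher is None:
        _fetcher = YFinanceDataFetcher()
    return _fetcher
```

Because `get_data_fetcher()` always returns the same object, any in-memory caches on the fetcher are shared across the whole application.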
71
+ ### Options Processing
72
+
73
+ Option positions require special processing:
74
+
75
+ 1. Option descriptions are parsed in [options.py](src/folio/options.py) to extract strike, expiry, and option type
76
+ 2. QuantLib is used for option pricing and Greeks calculations
77
+ 3. Delta exposure is calculated as delta * notional value
78
+ 4. Options are matched to their underlying stocks to form portfolio groups
79
+ 5. Option metrics are aggregated into the portfolio summary
80
+
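Step 3 in concrete terms (the 100-share contract multiplier is the standard US equity-option convention, assumed here):

```python
def option_delta_exposure(delta: float, quantity: float, underlying_price: float,
                          contract_size: int = 100) -> float:
    """Delta exposure = delta * notional value of the contracts."""
    notional = quantity * contract_size * underlying_price
    return delta * notional
```

For example, two long calls with delta 0.5 on a $100 underlying contribute $10,000 of positive delta exposure.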
81
+ ### Portfolio Metrics Calculation
82
+
83
+ Portfolio metrics are calculated in several steps:
84
+
85
+ 1. Individual position metrics are calculated first (market value, beta, exposure)
86
+ 2. Positions are grouped by underlying ticker
87
+ 3. Group-level metrics are calculated (net exposure, beta-adjusted exposure)
88
+ 4. Portfolio-level metrics are calculated (total exposure, portfolio beta, etc.)
89
+ 5. Exposure breakdowns are created for visualization
90
+
91
+ The canonical implementations for these calculations are in [portfolio_value.py](src/folio/portfolio_value.py).
92
+
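A miniature version of steps 2 through 4, where each group is reduced to a (net exposure, beta) pair; the canonical implementations are in `portfolio_value.py`:

```python
from typing import Dict, List, Tuple

def portfolio_metrics(groups: List[Tuple[float, float]]) -> Dict[str, float]:
    """Aggregate (net_exposure, beta) pairs into portfolio-level metrics."""
    net = sum(exposure for exposure, _ in groups)
    beta_adjusted = sum(exposure * beta for exposure, beta in groups)
    # Portfolio beta is the exposure-weighted average beta.
    portfolio_beta = beta_adjusted / net if net else 0.0
    return {
        "net_exposure": net,
        "beta_adjusted": beta_adjusted,
        "portfolio_beta": portfolio_beta,
    }
```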
93
+ ## UI Components
94
+
95
+ The UI is built with Dash and consists of several key components:
96
+
97
+ 1. **Summary Cards**: Display high-level portfolio metrics
98
+ 2. **Charts**: Visualize portfolio allocation and exposure
99
+ 3. **Portfolio Table**: Display all positions with key metrics
100
+ 4. **Position Details**: Show detailed information for a selected position
101
+ 5. **P&L Chart**: Visualize profit/loss scenarios for options strategies
102
+
103
+ Each component is defined in the [components](src/folio/components) directory and registered with callbacks in [app.py](src/folio/app.py).
104
+
105
+ ### Component Interaction
106
+
107
+ Components interact through Dash callbacks:
108
+
109
+ 1. Data is stored in `dcc.Store` components that act as client-side state
110
+ 2. User interactions trigger callbacks that update the stored data
111
+ 3. Components subscribe to changes in the stored data and update accordingly
112
+ 4. This pattern allows for a reactive UI without page reloads
113
+
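Stripped of Dash specifics, the pattern is an observable store with subscriber callbacks. This is a schematic stand-in, not real Dash code:

```python
from typing import Callable, Dict, List

class Store:
    """Minimal analogue of a dcc.Store plus Dash's callback wiring."""

    def __init__(self) -> None:
        self._data: Dict = {}
        self._subscribers: List[Callable[[Dict], None]] = []

    def subscribe(self, callback: Callable[[Dict], None]) -> None:
        """Register a component's update function, like an @app.callback."""
        self._subscribers.append(callback)

    def update(self, **changes) -> None:
        """A user interaction updates the data; subscribers re-render."""
        self._data.update(changes)
        for callback in self._subscribers:
            callback(self._data)
```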
114
+ ## Key Modules
115
+
116
+ ### Data Processing
117
+
118
+ - **portfolio.py**: Core portfolio processing logic
119
+ - **portfolio_value.py**: Canonical implementations of portfolio value calculations
120
+ - **options.py**: Option pricing and Greeks calculations
121
+ - **cash_detection.py**: Identification of cash-like positions
122
+
123
+ ### Data Fetching
124
+
125
+ - **stockdata.py**: Data fetcher interface, `create_data_fetcher`, and singleton management
126
+ - **yfinance.py**: Yahoo Finance data fetcher
128
+
129
+ ### UI Components
130
+
131
+ - **components/**: UI components for the dashboard
132
+ - **charts.py**: Portfolio visualization charts
133
+ - **portfolio_table.py**: Table of portfolio positions
134
+ - **position_details.py**: Detailed view of a position
135
+ - **pnl_chart.py**: Profit/loss visualization
136
+ - **summary_cards.py**: High-level portfolio metrics
137
+
138
+ ### Application Core
139
+
140
+ - **app.py**: Main Dash application setup and callbacks
141
+ - **data_model.py**: Core data structures
142
+ - **logger.py**: Logging configuration
143
+ - **security.py**: Security utilities for validating user inputs
144
+
145
+ ## Configuration
146
+
147
+ Folio uses a YAML configuration file (`folio.yaml`) for runtime settings:
148
+
150
+ - **Cache Settings**: Configure cache directories and TTL
151
+ - **UI Settings**: Configure dashboard appearance and behavior
152
+
153
+ The configuration is loaded at startup and can be overridden by environment variables.
154
+
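The load-then-override behavior can be sketched as follows (the keys and the `FOLIO_` environment-variable prefix are illustrative assumptions):

```python
import os
from typing import Dict

DEFAULTS: Dict[str, str] = {"cache.ttl": "86400", "ui.theme": "light"}

def load_config(file_config: Dict[str, str]) -> Dict[str, str]:
    """Merge defaults, then folio.yaml values, then environment overrides."""
    config = {**DEFAULTS, **file_config}
    for key in config:
        # FOLIO_CACHE_TTL overrides "cache.ttl", and so on.
        env_key = "FOLIO_" + key.upper().replace(".", "_")
        if env_key in os.environ:
            config[key] = os.environ[env_key]
    return config
```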
155
+ ## Error Handling
156
+
157
+ Folio implements robust error handling:
158
+
159
+ 1. **Fail Fast, Fail Transparently**: Errors are raised early and clearly
160
+ 2. **Graceful Degradation**: The application continues to function even if some components fail
161
+ 3. **Structured Logging**: Errors are logged with context for debugging
162
+ 4. **User Feedback**: Error messages are displayed to the user when appropriate
163
+
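Principles 1 through 3 combined in a small sketch (the fetch function and its fallback are hypothetical):

```python
import logging

logger = logging.getLogger(__name__)

def fetch_beta(ticker: str) -> float:
    # Fail fast: invalid input is rejected immediately and transparently.
    if not ticker or not ticker.isalpha():
        raise ValueError(f"Invalid ticker: {ticker!r}")
    raise ConnectionError("data source unavailable")  # simulate an outage

def beta_or_default(ticker: str, default: float = 1.0) -> float:
    """Graceful degradation: log with context, then fall back to a default."""
    try:
        return fetch_beta(ticker)
    except ConnectionError as exc:
        logger.warning("Beta lookup failed for %s, using %.1f: %s", ticker, default, exc)
        return default
```

Note that `ValueError` deliberately propagates: programming and input errors fail fast, while transient data-source errors degrade gracefully.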
164
+ ## Testing
165
+
166
+ The codebase includes comprehensive tests:
167
+
168
+ - **Unit Tests**: Test individual functions and classes
169
+ - **Integration Tests**: Test interactions between components
170
+ - **Mock Data**: Use mock data for testing to avoid API calls
171
+
172
+ Tests are organized to mirror the structure of the source code, with test files corresponding to source files.
173
+
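A mock-data unit test might look like this (the fetcher method name `get_beta` is an assumption standing in for the real interface):

```python
from unittest.mock import MagicMock

def beta_weighted_value(fetcher, ticker: str, quantity: float, price: float) -> float:
    """Toy function under test: scale market value by the fetched beta."""
    return fetcher.get_beta(ticker) * quantity * price

def test_beta_weighted_value():
    # Mock the data fetcher so the test makes no network or API calls.
    fetcher = MagicMock()
    fetcher.get_beta.return_value = 1.5
    assert beta_weighted_value(fetcher, "AAPL", 10, 100.0) == 1500.0
    fetcher.get_beta.assert_called_once_with("AAPL")

test_beta_weighted_value()
```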
174
+ ## Development Workflow
175
+
176
+ To add new features to Folio:
177
+
178
+ 1. **UI Components**: Add new components in the `components/` directory
179
+ 2. **Data Processing**: Extend the data model in `data_model.py` and the processing logic in `portfolio.py`
180
+ 3. **Callbacks**: Add new callbacks in `app.py` to handle user interactions
181
+ 4. **Testing**: Add tests for new functionality
182
+
183
+ ## Conclusion
184
+
185
+ Folio is designed with a clean separation of concerns:
186
+
187
+ - Data fetching is abstracted behind interfaces
188
+ - Data processing is separated from UI components
189
+ - UI components are modular and reusable
190
+ - Configuration is externalized for flexibility
191
+
192
+ This architecture makes the codebase maintainable, testable, and extensible, allowing for easy addition of new features and improvements.
scripts/check_beta.py DELETED
@@ -1,175 +0,0 @@
1
- """
2
- Beta Calculation Validation Script
3
-
4
- This script fetches historical data and calculates beta values for a predefined list of symbols
5
- to validate how the beta calculation works in practice. It uses raw beta calculation without
6
- any of the special case handling found in the portfolio processing code.
7
-
8
- Beta measures the volatility of a security in relation to the market (using SPY as proxy).
9
- - Beta > 1: More volatile than the market
10
- - Beta = 1: Same volatility as the market
11
- - Beta < 1: Less volatile than the market
12
- - Beta < 0: Moves in the opposite direction as the market
13
-
14
- Latest Beta Values (as of 2025-04-01):
15
- SPAXX**: Not available (money market fund, no market data)
16
- FMPXX: Not available (money market fund, no market data)
17
- FFRHX: 0.0553 (money market fund)
18
- TLT: -0.0145 (long-term treasury ETF, negative correlation)
19
- SHY: 0.0107 (short-term treasury ETF)
20
- BIL: 0.0005 (1-3 month T-bill ETF, extremely low beta)
21
- MCHI: 0.7130 (China ETF, significant market exposure)
22
- IEFA: 0.7862 (International ETF, significant market exposure)
23
- SPY: 1.0000 (S&P 500 ETF, market benchmark)
24
- AAPL: 1.2029 (Tech stock, higher volatility than market)
25
- GOOGL: 1.2695 (Tech stock, higher volatility than market)
26
- INVALID: Not available (invalid symbol for testing error handling)
27
-
28
- Usage:
29
- python scripts/check_beta.py
30
-
31
- Note: This script calculates raw beta values without the additional logic that might be
32
- applied in the main application, such as fallbacks for cash-like positions or special
33
- handling of missing data.
34
- """
35
-
36
- import os
37
- import sys
38
-
39
- import pandas as pd
40
-
41
- # Adjust path to import from src
42
- if __name__ == "__main__":
43
- script_dir = os.path.dirname(__file__)
44
- project_root = os.path.abspath(os.path.join(script_dir, ".."))
45
- sys.path.insert(0, project_root)
46
-
47
- from src.fmp import DataFetcher
48
- from src.folio.logger import logger # Use the same logger if desired
49
- from src.folio.utils import is_cash_or_short_term
50
-
51
-
52
- def calculate_raw_beta(
53
- ticker: str, fetcher: DataFetcher, market_data: pd.DataFrame | None
54
- ) -> float | str:
55
- """Fetches data and calculates raw beta without special handling."""
56
- # Early validation
57
- if market_data is None:
58
- return "Error: Market data not available"
59
-
60
- try:
61
- # Fetch and validate stock data
62
- logger.info(f"Fetching data for {ticker}...")
63
- stock_data = fetcher.fetch_data(ticker)
64
-
65
- # Data validation checks
66
- error_msg = _validate_data(ticker, stock_data)
67
- if error_msg:
68
- return error_msg
69
-
70
- # Calculate returns
71
- logger.info(f"Calculating returns for {ticker}...")
72
- stock_returns = stock_data["Close"].pct_change().dropna()
73
- market_returns = market_data["Close"].pct_change().dropna()
74
-
75
- # Align data by index
76
- aligned_stock, aligned_market = stock_returns.align(
77
- market_returns, join="inner"
78
- )
79
-
80
- # Validate aligned data
81
- if aligned_stock.empty or len(aligned_stock) < 2:
82
- return f"Error: Not enough overlapping data points after alignment for {ticker} (need >= 2)"
83
-
84
- # Calculate beta
85
- logger.info(f"Calculating variance/covariance for {ticker}...")
86
- market_variance = aligned_market.var()
87
- covariance = aligned_stock.cov(aligned_market)
88
-
89
- # Validate variance and covariance
90
- error_msg = _validate_variance_covariance(market_variance, covariance)
91
- if error_msg:
92
- return error_msg
93
-
94
- # Calculate and return beta
95
- beta = covariance / market_variance
96
- return beta
97
-
98
- except Exception as e:
99
- return f"Error calculating beta for {ticker}: {e}"
100
-
101
-
102
- def _validate_data(ticker: str, stock_data: pd.DataFrame | None) -> str | None:
103
- """Validates stock data and returns error message if invalid."""
104
- if stock_data is None or stock_data.empty:
105
- return f"Error: No data fetched for {ticker}"
106
- if len(stock_data) < 2:
107
- return f"Error: Not enough data points for {ticker} (need >= 2)"
108
- return None
109
-
110
-
111
- def _validate_variance_covariance(
112
- market_variance: float, covariance: float
113
- ) -> str | None:
114
- """Validates variance and covariance calculations and returns error message if invalid."""
115
- if pd.isna(market_variance) or abs(market_variance) < 1e-12:
116
- return f"Error: Market variance is zero or near-zero ({market_variance})"
117
- if pd.isna(covariance):
118
- return "Error: Covariance calculation resulted in NaN"
119
- return None
120
-
121
-
122
- if __name__ == "__main__":
123
- symbols_to_check = [
124
- "SPAXX**",
125
- "FMPXX",
126
- "FFRHX",
127
- "TLT", # 20+ Year Treasury Bond ETF
128
- "SHY", # 1-3 Year Treasury Bond ETF
129
- "BIL", # 1-3 Month T-Bill ETF
130
- "MCHI", # iShares MSCI China ETF
131
- "IEFA", # iShares Core MSCI EAFE ETF
132
- "SPY", # S&P 500 ETF
133
- "AAPL", # Apple Stock
134
- "GOOGL", # Google Stock
135
- "INVALID", # Test an invalid ticker
136
- ]
137
-
138
- try:
139
- fetcher = DataFetcher()
140
- if fetcher is None:
141
- raise RuntimeError("Fetcher initialization failed")
142
- # Fetch market data once
143
- market_data = (
144
- fetcher.fetch_market_data()
145
- ) # Assumes this fetches S&P500 or similar
146
- if market_data is None or market_data.empty:
147
- sys.exit(1)
148
-
149
- except Exception:
150
- sys.exit(1)
151
-
152
- # Calculate beta for each symbol and store results
153
- results = {}
154
- for symbol in symbols_to_check:
155
- beta_result = calculate_raw_beta(symbol, fetcher, market_data)
156
- results[symbol] = beta_result
157
-
158
- # Display results in a formatted table
159
-
160
- for symbol, result in results.items():
161
- if isinstance(result, float):
162
- is_cash = is_cash_or_short_term(symbol, beta=result)
163
- classification = "CASH-LIKE" if is_cash else "MARKET-CORRELATED"
164
- else:
165
- # Error case
166
- logger.error(f"Error for {symbol}: {result}")
167
-
168
- # Summary statistics
169
- success_count = sum(1 for r in results.values() if isinstance(r, float))
170
- error_count = len(results) - success_count
171
- cash_like_count = sum(
172
- 1
173
- for s, r in results.items()
174
- if isinstance(r, float) and is_cash_or_short_term(s, beta=r)
175
- )
src/fmp.py DELETED
@@ -1,243 +0,0 @@
1
- """
2
- Data fetcher for stock data using Financial Modeling Prep API
3
- """
4
-
5
- import logging
6
- import os
7
- from datetime import datetime, timedelta
8
-
9
- import pandas as pd
10
- import requests
11
-
12
- from src.stockdata import DataFetcherInterface
13
-
14
- # Setup logging
15
- logging.basicConfig(level=logging.INFO)
16
- logger = logging.getLogger(__name__)
17
-
18
- # Constants
19
- HTTP_SUCCESS = 200
20
-
21
-
22
- class DataFetcher(DataFetcherInterface):
23
- """Class to fetch stock data from Financial Modeling Prep API"""
24
-
25
- # Default period for beta calculations
26
- beta_period = "3m"
27
-
28
- def __init__(self, cache_dir=".cache_fmp"):
29
- """Initialize with cache directory"""
30
- self.cache_dir = cache_dir
31
- self.api_key = os.environ.get("FMP_API_KEY")
32
-
33
- # If not in environment, try to get from config
34
- if not self.api_key:
35
- try:
36
- from src.v2.config import config
37
-
38
- self.api_key = config.get("data.fmp.api_key")
39
- except ImportError:
40
- logger.warning(
41
- "Could not import config from src.v2.config, will rely on environment variable"
42
- )
43
-
44
- self.cache_ttl = 86400 # Default to 1 day
45
-
46
- # Try to get cache TTL from config if available
47
- try:
48
- from src.v2.config import config
49
-
50
- self.cache_ttl = config.get("app.cache.ttl", 86400)
51
- except ImportError:
52
- logger.warning(
53
- "Could not import config from src.v2.config, using default cache TTL"
54
- )
55
-
56
- # Create cache directory if it doesn't exist
57
- os.makedirs(cache_dir, exist_ok=True)
58
-
59
- # Check for API key
60
- if not self.api_key:
61
- raise ValueError(
62
- "No API key found. Please set the FMP_API_KEY environment variable or "
63
- "configure it in the config file."
64
- )
65
-
66
- def fetch_data(self, ticker, period="3m", interval="1d"):
67
- """
68
- Fetch stock data for a ticker
69
-
70
- Args:
71
- ticker (str): Stock ticker symbol
72
- period (str): Time period ('3m', '6m', '1y', etc.)
73
- interval (str): Data interval ('1d', '1wk', etc.)
74
-
75
- Returns:
76
- pandas.DataFrame: DataFrame with stock data
77
-
78
- Raises:
79
- ValueError: If no data is returned from API
80
- """
81
- # Check cache first
82
- cache_file = os.path.join(self.cache_dir, f"{ticker}_{period}_{interval}.csv")
83
-
84
- # Use the centralized cache validation logic
85
- from src.stockdata import should_use_cache
86
-
87
- should_use, reason = should_use_cache(cache_file, self.cache_ttl)
88
-
89
- if should_use:
90
- logger.debug(f"Loading cached data for {ticker}: {reason}")
91
- return pd.read_csv(cache_file, index_col=0, parse_dates=True)
92
- else:
93
- logger.debug(f"Cache for {ticker} is not valid: {reason}")
94
-
95
- # Try to fetch from API
96
- try:
97
- logger.info(f"Fetching data for {ticker} from API")
98
- df = self._fetch_from_api(ticker, period)
99
-
100
- if df is not None and not df.empty:
101
- # Save to cache
102
- df.to_csv(cache_file)
103
- return df
104
- else:
105
- # This is a valid case - API returned no data for a valid ticker
106
- logger.warning(f"No data returned from API for {ticker}")
107
- # Raise a specific error instead of returning an empty DataFrame
108
- raise ValueError(f"No historical data found for {ticker}")
109
- except (ValueError, requests.exceptions.RequestException) as e:
110
- # These are expected errors that can happen with valid inputs
111
- # For example, a valid ticker that has no data available or network issues
112
- logger.warning(f"Data fetch error for {ticker}: {e}")
113
-
114
- # Only use expired cache for expected data errors, not for programming errors
115
- if os.path.exists(cache_file):
116
- logger.warning(f"Using expired cache for {ticker} as fallback")
117
- try:
118
- return pd.read_csv(cache_file, index_col=0, parse_dates=True)
119
- except (pd.errors.ParserError, pd.errors.EmptyDataError) as cache_e:
120
- logger.error(f"Error reading cache for {ticker}: {cache_e}")
121
- # If we can't read the cache, re-raise the original error
122
- raise e from cache_e
123
-
124
- # If this is a "No historical data" error and we have no cache,
125
- # it's reasonable to return an empty DataFrame with the expected structure
126
- if "No historical data found" in str(e):
127
- logger.warning(
128
- f"No historical data found for {ticker} and no cache available"
129
- )
130
- return pd.DataFrame(columns=["Open", "High", "Low", "Close", "Volume"])
131
-
132
- # For other data errors with no cache, re-raise
133
- raise
134
- except (ImportError, NameError, AttributeError, TypeError, SyntaxError) as e:
135
- # These are programming errors that should never be caught silently
136
- logger.critical(f"Critical error in data fetcher: {e}", exc_info=True)
137
- raise
138
- except Exception as e:
139
- # For other unexpected errors, log and re-raise
140
- logger.error(
141
- f"Unexpected error fetching data for {ticker}: {e}", exc_info=True
142
- )
143
- raise
144
-
145
- def fetch_market_data(self, market_index="SPY", period=None, interval="1d"):
146
- """
147
- Fetch market index data for beta calculations.
148
-
149
- Args:
150
- market_index (str): Market index ticker symbol (default: 'SPY' for S&P 500 ETF)
151
- period (str, optional): Time period. If None, uses beta_period.
152
- interval (str): Data interval ('1d', '1wk', etc.)
153
-
154
- Returns:
155
- pandas.DataFrame: DataFrame with market index data
156
- """
157
- # Use the class beta_period if period is None
158
- if period is None:
159
- period = self.beta_period
160
- logger.info(f"Using default beta period: {period}")
161
-
162
- logger.debug(f"Fetching market data for {market_index}")
163
- return self.fetch_data(market_index, period, interval)
164
-
165
- def _fetch_from_api(self, ticker, period="5y"):
166
- """Fetch data from Financial Modeling Prep API"""
167
- # Determine date range based on period
168
- end_date = datetime.now()
169
-
170
- if period.endswith("y"):
171
- years = int(period[:-1])
172
- start_date = end_date - timedelta(days=365 * years)
173
- elif period.endswith("m"):
174
- months = int(period[:-1])
175
- start_date = end_date - timedelta(days=30 * months)
176
- else:
177
- # Default to 1 year
178
- start_date = end_date - timedelta(days=365)
179
-
180
- # Format dates for API
181
- start_str = start_date.strftime("%Y-%m-%d")
182
- end_str = end_date.strftime("%Y-%m-%d")
183
-
184
- # Construct API URL
185
- base_url = "https://financialmodelingprep.com/api/v3/historical-price-full"
186
- url = f"{base_url}/{ticker}?from={start_str}&to={end_str}&apikey={self.api_key}"
187
-
188
- # Make request
189
- response = requests.get(url)
190
-
191
- if response.status_code != HTTP_SUCCESS:
192
- raise ValueError(
193
- f"API request failed with status code {response.status_code}: {response.text}"
194
- )
195
-
196
- # Parse response
197
- data = response.json()
198
-
199
- if "historical" not in data:
200
- # This is not a critical error - just log a warning and return empty DataFrame
201
- logger.warning(f"No historical data found for {ticker}")
202
- return pd.DataFrame(columns=["Open", "High", "Low", "Close", "Volume"])
203
-
204
- # Convert to DataFrame
205
- df = pd.DataFrame(data["historical"])
206
-
207
- # Convert date to datetime and set as index
208
- df["date"] = pd.to_datetime(df["date"])
209
- df = df.set_index("date")
210
-
211
- # Sort by date (ascending)
212
- df = df.sort_index()
213
-
214
- # Rename columns to match expected format
215
- df = df.rename(
216
- columns={
217
- "open": "Open",
218
- "high": "High",
219
- "low": "Low",
220
- "close": "Close",
221
- "volume": "Volume",
222
- }
223
- )
224
-
225
- return df
226
-
227
- def _fetch_data(self, url, params=None):
228
- try:
229
- response = requests.get(url, params=params)
230
- if response.status_code == HTTP_SUCCESS:
231
- return response.json()
232
- else:
233
- logger.error(f"Failed to fetch data: {response.status_code}")
234
- return None
235
- except Exception as e:
236
- logger.error(f"Error fetching data: {e}")
237
- return None
238
-
239
-
240
- if __name__ == "__main__":
241
- # Simple test
242
- fetcher = DataFetcher()
243
- data = fetcher.fetch_data("AAPL", period="1y")
src/folio/README.md DELETED
@@ -1,119 +0,0 @@
1
- # Folio - Portfolio Dashboard
2
-
3
- ## Overview
4
-
5
- Folio is a web-based dashboard for analyzing and visualizing investment portfolios. It provides a comprehensive view of your portfolio's composition, risk metrics, and exposure analysis with a focus on stocks and options.
6
-
7
- ## Features
8
-
9
- - **Portfolio Analysis**: View your entire portfolio with key metrics like value, beta, and exposure
10
- - **Position Grouping**: Automatically groups stocks with their related options
11
- - **Risk Metrics**: Calculates beta and beta-adjusted exposure for all positions
12
- - **Options Analysis**: Provides delta exposure and other option-specific metrics
13
- - **Interactive UI**: Filter, sort, and search your portfolio with real-time updates
14
- - **Position Details**: Drill down into specific positions for detailed analysis
15
- - **CSV Import**: Upload portfolio data from CSV exports (compatible with Fidelity exports)
16
- - **Auto-Refresh**: Periodically refreshes data to keep metrics current
17
-
18
- ## Getting Started
19
-
20
- ### Prerequisites
21
-
22
- - Python 3.9+
23
- - Required packages (see `requirements.txt` in the project root)
24
-
25
- ### Running the Dashboard
26
-
27
- ```bash
28
- # From the project root directory:
29
-
30
- # Start with default settings (will prompt for file upload)
31
- make folio
32
-
33
- # Start with a specific portfolio file
34
- make folio portfolio=path/to/portfolio.csv
35
-
36
- # Or run directly with Python
37
- python -m src.folio --portfolio path/to/portfolio.csv --port 8051
38
- ```
39
-
40
- The dashboard will be available at http://127.0.0.1:8051/ (or your specified port).
41
-
42
- ## Project Structure
43
-
44
- ```
45
- src/folio/
46
- ├── __init__.py # Package initialization
47
- ├── __main__.py # Entry point for running as a module
48
- ├── app.py # Main Dash application setup and callbacks
49
- ├── components/ # UI components
50
- │ ├── __init__.py
51
- │ ├── portfolio_table.py # Portfolio table component
52
- │ └── position_details.py # Position details modal
53
- ├── data_model.py # Data models and type definitions
54
- ├── logger.py # Logging configuration
55
- └── utils.py # Utility functions for data processing
56
- ```
57
-
58
- ## Data Model
59
-
60
- The application uses the following key data structures:
61
-
62
- - **Position**: Base class for all positions (stocks and options)
63
- - **StockPosition**: Represents a stock position
64
- - **OptionPosition**: Represents an option position with strike, expiry, etc.
65
- - **PortfolioGroup**: Groups a stock with its related options
66
- - **PortfolioSummary**: Contains aggregated metrics for the entire portfolio
67
- - **ExposureBreakdown**: Detailed breakdown of exposure metrics
68
-
69
- ## Development Guide
70
-
71
- ### Adding New Features
72
-
73
- 1. **UI Components**: Add new components in the `components/` directory
74
- 2. **Data Processing**: Extend the data model in `data_model.py` and processing logic in `utils.py`
75
- 3. **Callbacks**: Add new callbacks in `app.py` to handle user interactions
76
-
77
- ### Coding Standards
78
-
79
- - Use type hints for all functions and methods
80
- - Document functions with docstrings (Google style)
81
- - Log important operations and errors using the logger
82
- - Handle exceptions gracefully with appropriate error messages
83
- - Follow the existing pattern for callback registration
84
-
85
- ### Testing
86
-
87
- While there's no formal test suite yet, you can test your changes by:
88
-
89
- 1. Running the application with a sample portfolio
90
- 2. Verifying that all UI components render correctly
91
- 3. Checking that calculations produce expected results
92
- 4. Testing edge cases (empty portfolio, invalid data, etc.)
93
-
94
- ## Troubleshooting
95
-
96
- ### Common Issues
97
-
98
- - **Missing Data**: Ensure your CSV has all required columns (Symbol, Description, Quantity, etc.)
99
- - **Port Conflicts**: If the default port is in use, specify a different port with `--port`
100
- - **Data Fetching Errors**: Check network connectivity for beta data retrieval
101
-
102
- ### Logging
103
-
104
- Logs are stored in the `logs/` directory with timestamps. Check these logs for detailed error information.
105
-
106
- ## Future Improvements
107
-
108
- - Add unit tests for core functionality
109
- - Implement additional portfolio metrics (Sharpe ratio, VaR, etc.)
110
- - Add visualization components (charts, graphs)
111
- - Support for additional data sources beyond CSV
112
- - Enhanced options analytics with Greeks (gamma, theta, vega)
113
-
114
- ## Contributing
115
-
116
- 1. Follow the existing code style and patterns
117
- 2. Document your changes thoroughly
118
- 3. Test your changes with various portfolio data
119
- 4. Submit a pull request with a clear description of your changes
 
src/folio/data_fetcher_singleton.py DELETED
@@ -1,97 +0,0 @@
1
- """Singleton module for data fetcher.
2
-
3
- This module provides a singleton instance of the data fetcher to ensure
4
- it's only initialized once across the application.
5
- """
6
-
7
- import os
8
-
9
- import yaml
10
-
11
- from src.stockdata import create_data_fetcher
12
-
13
- from .logger import logger
14
-
15
-
16
- class DataFetcherSingleton:
17
- """Singleton class for data fetcher."""
18
-
19
- _instance = None
20
- _initialized = False
21
-
22
- @classmethod
23
- def get_instance(cls):
24
- """Get the singleton instance of the data fetcher.
25
-
26
- Returns:
27
- DataFetcherInterface: The data fetcher instance.
28
- """
29
- if cls._instance is None:
30
- cls._instance = cls._initialize_data_fetcher()
31
- return cls._instance
32
-
33
- @classmethod
34
- def _initialize_data_fetcher(cls):
35
- """Initialize the data fetcher.
36
-
37
- Returns:
38
- DataFetcherInterface: The initialized data fetcher.
39
-
40
- Raises:
41
- RuntimeError: If the data fetcher initialization fails.
42
- """
43
- if cls._initialized:
44
- return cls._instance
45
-
46
- # Load configuration
47
- config = cls._load_config()
48
-
49
- try:
50
- # Get data source from config (default to "yfinance" if not specified)
51
- data_source = config.get("app", {}).get("data_source", "yfinance")
52
- logger.info(f"Using data source: {data_source}")
53
-
54
- # Create data fetcher using factory
55
- data_fetcher = create_data_fetcher(source=data_source)
56
-
57
- if data_fetcher is None:
58
- raise RuntimeError(
59
- "Data fetcher initialization failed but didn't raise an exception"
60
- )
61
-
62
- cls._initialized = True
63
- return data_fetcher
64
- except ValueError as e:
65
- logger.error(f"Failed to initialize data fetcher: {e}")
66
- # Re-raise to fail fast rather than continuing with a null reference
67
- raise RuntimeError(
68
- f"Critical component data fetcher could not be initialized: {e}"
69
- ) from e
70
-
71
- @staticmethod
72
- def _load_config():
73
- """Load configuration from folio.yaml.
74
-
75
- Returns:
76
- dict: The configuration dictionary.
77
- """
78
- config_path = os.path.join(os.path.dirname(__file__), "folio.yaml")
79
- if os.path.exists(config_path):
80
- try:
81
- with open(config_path) as f:
82
- return yaml.safe_load(f) or {}
83
- except Exception as e:
84
- logger.warning(
85
- f"Failed to load folio.yaml: {e}. Using default configuration."
86
- )
87
- return {}
88
-
89
-
90
- # Convenience function to get the data fetcher instance
91
- def get_data_fetcher():
92
- """Get the singleton instance of the data fetcher.
93
-
94
- Returns:
95
- DataFetcherInterface: The data fetcher instance.
96
- """
97
- return DataFetcherSingleton.get_instance()
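The deleted module above duplicated the singleton already provided by `src/stockdata.py`, which is why it could be removed. Stripped of the fetcher-specific details, the pattern it implemented reduces to:

```python
class _Singleton:
    _instance = None

    @classmethod
    def get_instance(cls):
        # Initialize on first access only; every later call returns the cached object.
        if cls._instance is None:
            cls._instance = object()  # stand-in for the real data fetcher
        return cls._instance

a = _Singleton.get_instance()
b = _Singleton.get_instance()
print(a is b)
```

Because both modules held their own `_instance`, the app could end up with two fetchers (and two caches) before this refactor; consolidating into one module restores the single-instance guarantee.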
 
src/folio/folio.yaml CHANGED
@@ -2,8 +2,6 @@
2
  # TODO: this isn't being used yet. Please update the TODOs to individual sections below as you implement them
3
 
4
  app:
5
- # Data source configuration
6
- data_source: "yfinance" # Options: "fmp", "yfinance"
7
 
8
  # Cache configuration
9
  cache:
 
2
  # TODO: this isn't being used yet. Please update the TODOs to individual sections below as you implement them
3
 
4
  app:
 
 
5
 
6
  # Cache configuration
7
  cache:
src/folio/portfolio.py CHANGED
@@ -7,10 +7,7 @@ This module provides core functionality for portfolio analysis, including:
7
  - Portfolio metrics and summary calculations
8
  """
9
 
10
- import os
11
-
12
  import pandas as pd
13
- import yaml
14
 
15
  from src.stockdata import get_data_fetcher
16
 
@@ -33,18 +30,8 @@ from .portfolio_value import (
33
  )
34
  from .utils import clean_currency_value, get_beta
35
 
36
- # Load configuration
37
- config_path = os.path.join(os.path.dirname(__file__), "folio.yaml")
38
- config = {}
39
- if os.path.exists(config_path):
40
- try:
41
- with open(config_path) as f:
42
- config = yaml.safe_load(f) or {}
43
- except Exception as e:
44
- logger.warning(f"Failed to load folio.yaml: {e}. Using default configuration.")
45
-
46
  # Get the singleton data fetcher instance
47
- data_fetcher = get_data_fetcher(config=config)
48
 
49
 
50
  def process_portfolio_data(
 
7
  - Portfolio metrics and summary calculations
8
  """
9
 
 
 
10
  import pandas as pd
 
11
 
12
  from src.stockdata import get_data_fetcher
13
 
 
30
  )
31
  from .utils import clean_currency_value, get_beta
32
 
 
 
 
 
 
 
 
 
 
 
33
  # Get the singleton data fetcher instance
34
+ data_fetcher = get_data_fetcher()
35
 
36
 
37
  def process_portfolio_data(
src/folio/roadmap.md DELETED
@@ -1,262 +0,0 @@
1
- # Folio Product Roadmap
2
-
3
- ## Overview
4
-
5
- This roadmap outlines the strategic direction for Folio, our portfolio dashboard application. Features are prioritized based on their estimated Return on Investment (ROI), considering development effort, user impact, and alignment with our core value proposition of providing comprehensive portfolio analysis and risk management.
6
-
7
- ## Priority Matrix
8
-
9
- | Priority | Feature | Effort | Impact | ROI |
10
- |----------|---------|--------|--------|-----|
11
- | 1 | Enhanced Options Analytics | Medium | High | ★★★★★ |
12
- | 2 | Portfolio Visualization | Medium | High | ★★★★★ |
13
- | 3 | Performance Tracking | Medium | High | ★★★★☆ |
14
- | 4 | Scenario Analysis & Stress Testing | High | High | ★★★★☆ |
15
- | 5 | Additional Portfolio Metrics | Low | Medium | ★★★★☆ |
16
- | 6 | Multi-Source Data Import | Medium | Medium | ★★★☆☆ |
17
- | 7 | Portfolio Optimization | High | Medium | ★★★☆☆ |
18
- | 8 | Mobile Responsiveness | Medium | Medium | ★★★☆☆ |
19
- | 9 | User Accounts & Cloud Sync | High | Medium | ★★☆☆☆ |
20
- | 10 | API Service | High | Low | ★★☆☆☆ |
21
-
22
- ## Detailed Feature Descriptions
23
-
24
- ### 1. Enhanced Options Analytics (★★★★★)
25
-
26
- **Description:** Extend the current options analysis with comprehensive Greeks calculations and visualization.
27
-
28
- **Components:**
29
- - Complete implementation of all Greeks (Delta, Gamma, Theta, Vega, Rho)
30
- - Options strategy identification and analysis
31
- - Implied volatility surface visualization
32
- - Options expiration calendar view
33
-
34
- **Business Value:**
35
- - Provides deeper insights for options traders
36
- - Differentiates from basic portfolio trackers
37
- - Addresses current TODOs in the codebase
38
- - Builds on existing foundation with high leverage
39
-
40
- **Implementation Effort:** Medium (3-4 weeks)
41
-
42
- ---
43
-
44
- ### 2. Portfolio Visualization (★★★★★)
45
-
46
- **Description:** Add comprehensive data visualization components to provide visual insights into portfolio composition and risk.
47
-
48
- **Components:**
49
- - Asset allocation pie/treemap charts
50
- - Exposure breakdown visualizations
51
- - Risk metrics dashboards
52
- - Position correlation heatmaps
53
- - Historical performance charts
54
-
55
- **Business Value:**
56
- - Dramatically improves user experience and insights
57
- - Makes complex data more accessible
58
- - Leverages existing Plotly/Dash capabilities
59
- - High visual impact for demos and marketing
60
-
61
- **Implementation Effort:** Medium (3-4 weeks)
62
-
63
- ---
64
-
65
- ### 3. Performance Tracking (★★★★☆)
66
-
67
- **Description:** Implement historical performance tracking to monitor portfolio changes over time.
68
-
69
- **Components:**
70
- - Historical snapshots of portfolio state
71
- - Performance metrics calculation (returns, drawdowns)
72
- - Benchmark comparison
73
- - Attribution analysis (which positions drove performance)
74
- - Customizable time period selection
75
-
76
- **Business Value:**
77
- - Enables users to track investment performance
78
- - Provides accountability for investment decisions
79
- - Creates stickier product with historical data value
80
- - Complements existing risk analysis features
81
-
82
- **Implementation Effort:** Medium (4-5 weeks)
83
-
84
- ---
85
-
86
- ### 4. Scenario Analysis & Stress Testing (★★★★☆)
87
-
88
- **Description:** Allow users to model portfolio behavior under different market scenarios.
89
-
90
- **Components:**
91
- - Market shock simulations (e.g., -20% market crash)
92
- - Interest rate change scenarios
93
- - Volatility spike modeling
94
- - Custom scenario builder
95
- - Historical scenario replay (e.g., 2008 crash, 2020 COVID)
96
-
97
- **Business Value:**
98
- - Provides forward-looking risk assessment
99
- - Highly valuable for risk management
100
- - Differentiates from basic portfolio trackers
101
- - Appeals to sophisticated investors
102
-
103
- **Implementation Effort:** High (6-8 weeks)
104
-
105
- ---
106
-
107
- ### 5. Additional Portfolio Metrics (★★★★☆)
108
-
109
- **Description:** Expand the set of portfolio metrics beyond current beta and exposure analysis.
110
-
111
- **Components:**
112
- - Sharpe ratio, Sortino ratio, and other risk-adjusted return metrics
113
- - Value at Risk (VaR) calculations
114
- - Factor exposure analysis (size, value, momentum, etc.)
115
- - Sector/industry exposure breakdown
116
- - Correlation metrics with major indices
117
-
118
- **Business Value:**
119
- - Enhances risk assessment capabilities
120
- - Relatively easy to implement with high value
121
- - Builds on existing data model
122
- - Addresses TODOs in current codebase
123
-
124
- **Implementation Effort:** Low (2-3 weeks)
125
-
126
- ---
127
-
128
- ### 6. Multi-Source Data Import (★★★☆☆)
129
-
130
- **Description:** Expand beyond CSV imports to support multiple brokerage data sources.
131
-
132
- **Components:**
133
- - Direct API connections to major brokerages
134
- - Support for additional CSV/Excel formats
135
- - Automated mapping of different data formats
136
- - Manual position entry interface
137
- - Data validation and error handling
138
-
139
- **Business Value:**
140
- - Reduces friction in user onboarding
141
- - Expands potential user base
142
- - Improves data accuracy and freshness
143
- - Addresses limitation in current implementation
144
-
145
- **Implementation Effort:** Medium (4-6 weeks)
146
-
147
- ---
148
-
149
- ### 7. Portfolio Optimization (★★★☆☆)
150
-
151
- **Description:** Provide recommendations for portfolio improvements based on modern portfolio theory.
152
-
153
- **Components:**
154
- - Efficient frontier calculation
155
- - Optimization for different objectives (max return, min risk, etc.)
156
- - Position sizing recommendations
157
- - Hedging suggestions
158
- - Tax-efficient rebalancing recommendations
159
-
160
- **Business Value:**
161
- - Moves from analysis to actionable recommendations
162
- - Significant value-add for users
163
- - Potential premium feature
164
- - Differentiator from competitors
165
-
166
- **Implementation Effort:** High (8-10 weeks)
167
-
168
- ---
169
-
170
- ### 8. Mobile Responsiveness (★★★☆☆)
171
-
172
- **Description:** Optimize the UI for mobile and tablet devices.
173
-
174
- **Components:**
175
- - Responsive layout redesign
176
- - Touch-friendly controls
177
- - Mobile-optimized tables and charts
178
- - Progressive web app capabilities
179
- - Offline mode for basic functionality
180
-
181
- **Business Value:**
182
- - Expands usage contexts
183
- - Improves accessibility
184
- - Meets modern user expectations
185
- - Potential for mobile app distribution
186
-
187
- **Implementation Effort:** Medium (3-5 weeks)
188
-
189
- ---
190
-
191
- ### 9. User Accounts & Cloud Sync (★★☆☆☆)
192
-
193
- **Description:** Implement user authentication and cloud storage for portfolios.
194
-
195
- **Components:**
196
- - User registration and authentication
197
- - Secure portfolio data storage
198
- - Multi-portfolio support
199
- - Sharing and collaboration features
200
- - Premium account tiers
201
-
202
- **Business Value:**
203
- - Enables monetization strategies
204
- - Creates persistent user relationships
205
- - Allows for multi-device access
206
- - Foundation for social/collaborative features
207
-
208
- **Implementation Effort:** High (6-8 weeks)
209
-
210
- ---
211
-
212
- ### 10. API Service (★★☆☆☆)
213
-
214
- **Description:** Create a public API for programmatic access to Folio analytics.
215
-
216
- **Components:**
217
- - RESTful API design
218
- - Authentication and rate limiting
219
- - Documentation and SDK
220
- - Webhook support for portfolio updates
221
- - Integration examples
222
-
223
- **Business Value:**
224
- - Enables integration with other tools
225
- - Potential for developer ecosystem
226
- - Additional monetization channel
227
- - Automation capabilities for power users
228
-
229
- **Implementation Effort:** High (6-8 weeks)
230
-
231
- ## Implementation Phases
232
-
233
- ### Phase 1: Core Enhancement (Q2 2025)
234
- - Enhanced Options Analytics
235
- - Portfolio Visualization
236
- - Additional Portfolio Metrics
237
-
238
- ### Phase 2: Advanced Analytics (Q3 2025)
239
- - Performance Tracking
240
- - Scenario Analysis & Stress Testing
241
- - Multi-Source Data Import
242
-
243
- ### Phase 3: Platform Expansion (Q4 2025)
244
- - Portfolio Optimization
245
- - Mobile Responsiveness
246
- - User Accounts & Cloud Sync
247
- - API Service
248
-
249
- ## Success Metrics
250
-
251
- For each feature, we will track:
252
- - User adoption rate
253
- - Time spent using the feature
254
- - User feedback and satisfaction
255
- - Impact on key performance indicators
256
- - Technical stability and performance
257
-
258
- ## Conclusion
259
-
260
- This roadmap focuses on building upon Folio's core strengths in portfolio analysis while expanding into new capabilities that enhance user value. The highest ROI features leverage our existing data model and technical foundation while addressing clear user needs for deeper analytics and visualization.
261
-
262
- By prioritizing enhanced options analytics, visualization, and performance tracking in the near term, we can deliver significant value quickly while building toward more ambitious features like scenario analysis and portfolio optimization.
 
src/folio/utils.py CHANGED
@@ -28,11 +28,8 @@ def load_config():
28
  return {}
29
 
30
 
31
- # Get configuration
32
- config = load_config()
33
-
34
  # Get the singleton data fetcher instance
35
- data_fetcher = get_data_fetcher(config=config)
36
 
37
 
38
  def get_beta(ticker: str, description: str = "") -> float:
 
28
  return {}
29
 
30
 
 
 
 
31
  # Get the singleton data fetcher instance
32
+ data_fetcher = get_data_fetcher()
33
 
34
 
35
  def get_beta(ticker: str, description: str = "") -> float:
src/stockdata.py CHANGED
@@ -59,21 +59,17 @@ class DataFetcherInterface(ABC):
59
  pass
60
 
61
 
62
- def create_data_fetcher(source="yfinance", cache_dir=None):
63
  """
64
- Factory function to create the appropriate data fetcher.
65
 
66
  Args:
67
- source (str): Data source to use ('yfinance' or 'fmp')
68
  cache_dir (str, optional): Cache directory. If None, uses default.
69
 
70
  Returns:
71
- DataFetcherInterface: An instance of the appropriate data fetcher
72
-
73
- Raises:
74
- ValueError: If the specified source is not supported
75
  """
76
- # Set default cache directories based on data source and environment
77
  # In Hugging Face Spaces, use /tmp for cache
78
  is_huggingface = (
79
  os.environ.get("HF_SPACE") == "1" or os.environ.get("SPACE_ID") is not None
@@ -82,23 +78,15 @@ def create_data_fetcher(source="yfinance", cache_dir=None):
82
  if cache_dir is None:
83
  if is_huggingface:
84
  # Use /tmp directory for Hugging Face
85
- cache_dir = "/tmp/cache_yf" if source == "yfinance" else "/tmp/cache_fmp"
86
  else:
87
  # Use local directory for other environments
88
- cache_dir = ".cache_yf" if source == "yfinance" else ".cache_fmp"
89
-
90
- if source == "yfinance":
91
- from src.yfinance import YFinanceDataFetcher
92
 
93
- logger.info(f"Creating YFinance data fetcher with cache dir: {cache_dir}")
94
- return YFinanceDataFetcher(cache_dir=cache_dir)
95
- elif source == "fmp":
96
- from src.fmp import DataFetcher
97
 
98
- logger.info(f"Creating FMP data fetcher with cache dir: {cache_dir}")
99
- return DataFetcher(cache_dir=cache_dir)
100
- else:
101
- raise ValueError(f"Unknown data source: {source}")
102
 
103
 
104
  # Singleton data fetcher class
@@ -106,9 +94,10 @@ class DataFetcherSingleton:
106
  """Singleton class for data fetcher."""
107
 
108
  _instance = None
 
109
 
110
  @classmethod
111
- def get_instance(cls, source=None, cache_dir=None, config=None):
112
  """
113
  Get the singleton instance of the data fetcher.
114
 
@@ -116,11 +105,7 @@ class DataFetcherSingleton:
116
  the application, preventing duplicate initialization.
117
 
118
  Args:
119
- source (str, optional): Data source to use ('yfinance' or 'fmp').
120
- If None, uses the value from config or defaults to 'yfinance'.
121
  cache_dir (str, optional): Cache directory. If None, uses default.
122
- config (dict, optional): Configuration dictionary. If provided,
123
- used to determine the data source if source is None.
124
 
125
  Returns:
126
  DataFetcherInterface: The singleton data fetcher instance.
@@ -131,22 +116,16 @@ class DataFetcherSingleton:
131
  if cls._instance is not None:
132
  return cls._instance
133
 
134
- # Determine the data source
135
- if source is None:
136
- if config is not None:
137
- source = config.get("app", {}).get("data_source", "yfinance")
138
- else:
139
- source = "yfinance"
140
-
141
  try:
142
- logger.info(f"Using data source: {source}")
143
- cls._instance = create_data_fetcher(source=source, cache_dir=cache_dir)
144
 
145
  if cls._instance is None:
146
  raise RuntimeError(
147
  "Data fetcher initialization failed but didn't raise an exception"
148
  )
149
 
 
150
  return cls._instance
151
  except ValueError as e:
152
  logger.error(f"Failed to initialize data fetcher: {e}")
@@ -157,7 +136,7 @@ class DataFetcherSingleton:
157
 
158
 
159
  # Convenience function to maintain backward compatibility
160
- def get_data_fetcher(source=None, cache_dir=None, config=None):
161
  """
162
  Get the singleton instance of the data fetcher.
163
 
@@ -165,16 +144,13 @@ def get_data_fetcher(source=None, cache_dir=None, config=None):
165
  for backward compatibility.
166
 
167
  Args:
168
- source (str, optional): Data source to use ('yfinance' or 'fmp').
169
- If None, uses the value from config or defaults to 'yfinance'.
170
  cache_dir (str, optional): Cache directory. If None, uses default.
171
- config (dict, optional): Configuration dictionary. If provided,
172
- used to determine the data source if source is None.
173
 
174
  Returns:
175
  DataFetcherInterface: The singleton data fetcher instance.
176
  """
177
- return DataFetcherSingleton.get_instance(source, cache_dir, config)
178
 
179
 
180
  # Cache management functions
 
59
  pass
60
 
61
 
62
+ def create_data_fetcher(cache_dir=None):
63
  """
64
+ Factory function to create a YFinance data fetcher.
65
 
66
  Args:
 
67
  cache_dir (str, optional): Cache directory. If None, uses default.
68
 
69
  Returns:
70
+ DataFetcherInterface: An instance of YFinanceDataFetcher
 
 
 
71
  """
72
+ # Set default cache directory based on environment
73
  # In Hugging Face Spaces, use /tmp for cache
74
  is_huggingface = (
75
  os.environ.get("HF_SPACE") == "1" or os.environ.get("SPACE_ID") is not None
 
78
  if cache_dir is None:
79
  if is_huggingface:
80
  # Use /tmp directory for Hugging Face
81
+ cache_dir = "/tmp/cache_yf"
82
  else:
83
  # Use local directory for other environments
84
+ cache_dir = ".cache_yf"
 
 
 
85
 
86
+ from src.yfinance import YFinanceDataFetcher
 
 
 
87
 
88
+ logger.info(f"Creating YFinance data fetcher with cache dir: {cache_dir}")
89
+ return YFinanceDataFetcher(cache_dir=cache_dir)
 
 
90
 
91
 
92
  # Singleton data fetcher class
 
94
  """Singleton class for data fetcher."""
95
 
96
  _instance = None
97
+ _initialized = False
98
 
99
  @classmethod
100
+ def get_instance(cls, cache_dir=None):
101
  """
102
  Get the singleton instance of the data fetcher.
103
 
 
105
  the application, preventing duplicate initialization.
106
 
107
  Args:
 
 
108
  cache_dir (str, optional): Cache directory. If None, uses default.
 
 
109
 
110
  Returns:
111
  DataFetcherInterface: The singleton data fetcher instance.
 
116
  if cls._instance is not None:
117
  return cls._instance
118
 
 
 
 
 
 
 
 
119
  try:
120
+ logger.info("Initializing YFinance data fetcher")
121
+ cls._instance = create_data_fetcher(cache_dir=cache_dir)
122
 
123
  if cls._instance is None:
124
  raise RuntimeError(
125
  "Data fetcher initialization failed but didn't raise an exception"
126
  )
127
 
128
+ cls._initialized = True
129
  return cls._instance
130
  except ValueError as e:
131
  logger.error(f"Failed to initialize data fetcher: {e}")
 
136
 
137
 
138
  # Convenience function to maintain backward compatibility
139
+ def get_data_fetcher(cache_dir=None, **kwargs):
140
  """
141
  Get the singleton instance of the data fetcher.
142
 
 
144
  for backward compatibility.
145
 
146
  Args:
 
 
147
  cache_dir (str, optional): Cache directory. If None, uses default.
148
+ **kwargs: Additional arguments that are ignored (for backward compatibility)
 
149
 
150
  Returns:
151
  DataFetcherInterface: The singleton data fetcher instance.
152
  """
153
+ return DataFetcherSingleton.get_instance(cache_dir)
154
 
155
 
156
  # Cache management functions
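The `**kwargs` shim added to `get_data_fetcher` above is what keeps old call sites (which passed `source=` and `config=`) working after the removal. A standalone sketch of that compatibility pattern, with a dict standing in for the real fetcher object:

```python
def get_data_fetcher(cache_dir=None, **kwargs):
    # Extra keyword arguments (e.g. the old source= / config=) are accepted
    # and silently ignored so existing call sites don't need to change.
    if kwargs:
        print(f"ignoring legacy arguments: {sorted(kwargs)}")
    return {"source": "yfinance", "cache_dir": cache_dir or ".cache_yf"}

# A pre-refactor call site still works unchanged:
fetcher = get_data_fetcher(source="fmp", config={"app": {}})
print(fetcher["source"], fetcher["cache_dir"])
```

Swallowing unknown kwargs is a deliberate trade-off: it avoids touching every caller in one commit, at the cost of no longer failing loudly on typos in keyword names.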
tests/fetch_sample_data.py DELETED
@@ -1,110 +0,0 @@
1
- """
2
- Script to fetch sample data from the FMP API for testing purposes.
3
-
4
- This script fetches data for a few representative tickers and saves it to JSON files
5
- for reference when creating mock data and tests.
6
- """
7
-
8
- import json
9
- import os
10
- import sys
11
-
12
- # Add the project root to the Python path
13
- sys.path.insert(0, os.path.abspath(os.path.join(os.path.dirname(__file__), '..')))
14
-
15
-
16
- from src.fmp import DataFetcher
17
-
18
- # Create output directory
19
- OUTPUT_DIR = "tests/test_data"
20
- os.makedirs(OUTPUT_DIR, exist_ok=True)
21
-
22
- # List of tickers to fetch data for
23
- TICKERS = [
24
- "SPY", # S&P 500 ETF (market benchmark)
25
- "AAPL", # High beta tech stock
26
- "GOOGL", # Another high beta tech stock
27
- "SO", # Low volatility utility stock
28
- "TLT", # Treasury ETF (negative correlation with market)
29
- "BIL", # Short-term treasury (very low volatility)
30
- "EFA", # International ETF
31
- "EEM", # Emerging markets ETF
32
- ]
33
-
34
- # Periods to fetch
35
- PERIODS = ["1y", "5y"]
36
-
37
- def main():
38
- """Fetch sample data and save to files."""
39
-
40
- # Initialize data fetcher
41
- fetcher = DataFetcher()
42
-
43
- # Fetch data for each ticker and period
44
- for ticker in TICKERS:
45
- for period in PERIODS:
46
- try:
47
-
48
- # Fetch data
49
- df = fetcher.fetch_data(ticker, period=period)
50
-
51
- if df is not None and not df.empty:
52
- # Save to CSV
53
- csv_path = os.path.join(OUTPUT_DIR, f"{ticker}_{period}.csv")
54
- df.to_csv(csv_path)
55
-
56
- # Save first 5 rows to JSON for reference
57
- json_path = os.path.join(OUTPUT_DIR, f"{ticker}_{period}_sample.json")
58
- sample_data = df.head(5).reset_index().to_dict(orient="records")
59
- with open(json_path, "w") as f:
60
- json.dump(sample_data, f, indent=2, default=str)
61
- else:
62
- pass
63
-
64
- except Exception:
65
- pass
66
-
67
- # Calculate and save beta values
68
- betas = {}
69
-
70
- # Use 5-year data for more accurate beta calculation
71
- market_data = fetcher.fetch_market_data("SPY", period="5y")
72
- market_returns = market_data["Close"].pct_change().dropna()
73
-
74
- for ticker in TICKERS:
75
- try:
76
- # Skip SPY (beta = 1.0 by definition)
77
- if ticker == "SPY":
78
- betas[ticker] = 1.0
79
- continue
80
-
81
- # Fetch data and calculate beta
82
- stock_data = fetcher.fetch_data(ticker, period="5y")
83
- stock_returns = stock_data["Close"].pct_change().dropna()
84
-
85
- # Align data
86
- common_dates = stock_returns.index.intersection(market_returns.index)
87
- if len(common_dates) < 30: # Require at least 30 data points
88
- continue
89
-
90
- aligned_stock = stock_returns.loc[common_dates]
91
- aligned_market = market_returns.loc[common_dates]
92
-
93
- # Calculate beta
94
- covariance = aligned_stock.cov(aligned_market)
95
- market_variance = aligned_market.var()
96
- beta = covariance / market_variance
97
-
98
- betas[ticker] = beta
99
-
100
- except Exception:
101
- pass
102
-
103
- # Save beta values
104
- beta_path = os.path.join(OUTPUT_DIR, "beta_values.json")
105
- with open(beta_path, "w") as f:
106
- json.dump(betas, f, indent=2)
107
-
108
-
109
- if __name__ == "__main__":
110
- main()
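The beta math this deleted script used (and which `get_beta` in `src/folio/utils.py` presumably still relies on) is plain covariance over variance on aligned return series. A self-contained sketch with made-up returns:

```python
def beta(stock, market):
    """Beta = Cov(stock, market) / Var(market), both with sample (n-1) scaling."""
    n = len(stock)
    ms, mm = sum(stock) / n, sum(market) / n
    cov = sum((s - ms) * (m - mm) for s, m in zip(stock, market)) / (n - 1)
    var = sum((m - mm) ** 2 for m in market) / (n - 1)
    return cov / var

market = [0.010, -0.005, 0.007, -0.012, 0.004]
stock = [0.015, -0.008, 0.010, -0.020, 0.006]  # moves roughly 1.5x the market
print(round(beta(stock, market), 2))
```

The removed script also required at least 30 overlapping data points before trusting the estimate; any replacement sample-data tooling should keep a similar floor.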
 
tests/test_data_fetcher.py DELETED
@@ -1,471 +0,0 @@
1
- """
2
- Tests for the DataFetcher class in src/fmp.py
3
-
4
- These tests verify the core functionality of the DataFetcher class, including:
5
- 1. Initialization and configuration
6
- 2. Data fetching and caching
7
- 3. Error handling
8
- 4. Data format and structure
9
-
10
- The tests use mocking to avoid actual API calls and to provide consistent test data.
11
- """
12
-
13
- import os
14
- import sys
15
- import time
16
- from datetime import datetime, timedelta
17
- from unittest.mock import MagicMock, patch
18
-
19
- import pandas as pd
20
- import pytest
21
- import requests
22
-
23
- # Add the project root to the Python path
24
- sys.path.insert(0, os.path.abspath(os.path.join(os.path.dirname(__file__), "..")))
25
-
26
- from src.fmp import DataFetcher
27
-
28
- # Import mock data utilities
29
- from tests.test_data.mock_stock_data import (
30
- get_mock_raw_data,
31
- get_real_beta,
32
- get_real_data,
33
- )
34
-
35
-
36
- @pytest.fixture
37
- def mock_response():
38
- """Create a mock response object for requests with real data."""
39
- mock = MagicMock()
40
- mock.status_code = 200
41
-
42
- # Use real data structure from our collected samples
43
- mock.json.return_value = get_mock_raw_data("AAPL", "1y")
44
- return mock
45
-
46
-
47
- @pytest.fixture
48
- def mock_spy_response():
49
- """Create a mock response object for SPY data."""
50
- mock = MagicMock()
51
- mock.status_code = 200
52
-
53
- # Use real data structure from our collected samples
54
- mock.json.return_value = get_mock_raw_data("SPY", "1y")
55
- return mock
56
-
57
-
58
- @pytest.fixture
59
- def mock_empty_response():
60
- """Create a mock response with no historical data."""
61
- mock = MagicMock()
62
- mock.status_code = 200
63
- mock.json.return_value = {"symbol": "INVALID", "historical": []}
64
- return mock
65
-
66
-
67
- @pytest.fixture
68
- def mock_error_response():
69
- """Create a mock response with an error status code."""
70
- mock = MagicMock()
71
- mock.status_code = 401
72
- mock.text = "Unauthorized: Invalid API key"
73
- return mock
74
-
75
-
76
- @pytest.fixture
77
- def temp_cache_dir(tmpdir):
78
- """Create a temporary directory for cache files."""
79
- cache_dir = tmpdir.mkdir("test_cache")
80
- return str(cache_dir)
81
-
82
-
83
- @pytest.fixture
84
- def sample_dataframe():
85
- """Create a sample DataFrame with the expected structure using real data."""
86
- return get_real_data("AAPL", "1y").head(5)
87
-
88
-
89
- class TestDataFetcherInitialization:
90
- """Tests for DataFetcher initialization and configuration."""
91
-
92
- def test_init_with_default_cache_dir(self):
93
- """Test initialization with default cache directory."""
94
- with patch.dict(os.environ, {"FMP_API_KEY": "test_key"}):
95
- fetcher = DataFetcher()
96
- assert fetcher.cache_dir == ".cache_fmp"
97
- assert fetcher.api_key == "test_key"
98
- assert fetcher.cache_ttl == 86400 # Default TTL
99
-
100
- def test_init_with_custom_cache_dir(self, temp_cache_dir):
101
- """Test initialization with custom cache directory."""
102
- with patch.dict(os.environ, {"FMP_API_KEY": "test_key"}):
103
- fetcher = DataFetcher(cache_dir=temp_cache_dir)
104
- assert fetcher.cache_dir == temp_cache_dir
105
- assert os.path.exists(temp_cache_dir) # Directory should be created
106
-
107
- def test_init_with_config_api_key(self):
108
- """Test initialization with API key from config."""
109
- # Clear environment variable to ensure we use config
110
- with patch.dict(os.environ, {}, clear=True):
111
-             with patch("src.v2.config.config.get", return_value="config_key"):
-                 fetcher = DataFetcher()
-                 assert fetcher.api_key == "config_key"
-
-     def test_init_with_env_api_key_precedence(self):
-         """Test that environment variable takes precedence over config."""
-         with patch.dict(os.environ, {"FMP_API_KEY": "env_key"}):
-             with patch("src.v2.config.config.get", return_value="config_key"):
-                 fetcher = DataFetcher()
-                 assert fetcher.api_key == "env_key"
-
-     def test_init_with_custom_ttl(self):
-         """Test initialization with custom cache TTL from config."""
-         with patch.dict(os.environ, {"FMP_API_KEY": "test_key"}):
-             with patch(
-                 "src.v2.config.config.get",
-                 side_effect=lambda key, default=None: 3600
-                 if key == "app.cache.ttl"
-                 else default,
-             ):
-                 fetcher = DataFetcher()
-                 assert fetcher.cache_ttl == 3600
-
-     def test_init_without_api_key(self):
-         """Test initialization without API key raises ValueError."""
-         with patch.dict(os.environ, {}, clear=True):
-             with patch("src.v2.config.config.get", return_value=None):
-                 with pytest.raises(ValueError, match="No API key found"):
-                     DataFetcher()
-
-
- class TestDataFetching:
-     """Tests for data fetching functionality."""
-
-     def test_fetch_data_api_call(self, mock_response, temp_cache_dir):
-         """Test fetching data from API."""
-         with patch.dict(os.environ, {"FMP_API_KEY": "test_key"}):
-             with patch("requests.get", return_value=mock_response):
-                 fetcher = DataFetcher(cache_dir=temp_cache_dir)
-                 df = fetcher.fetch_data("AAPL", period="1y")
-
-                 # Check DataFrame structure
-                 assert isinstance(df, pd.DataFrame)
-                 assert len(df) > 0  # Don't check exact length as it may vary
-
-                 # Check that required columns exist
-                 required_columns = ["Open", "High", "Low", "Close", "Volume"]
-                 for col in required_columns:
-                     assert col in df.columns, f"Column {col} not found in DataFrame"
-
-                 assert df.index.name == "date"
-                 assert pd.api.types.is_datetime64_dtype(df.index)
-
-     def test_fetch_data_cache_creation(self, mock_response, temp_cache_dir):
-         """Test that data is cached after fetching."""
-         with patch.dict(os.environ, {"FMP_API_KEY": "test_key"}):
-             with patch("requests.get", return_value=mock_response):
-                 fetcher = DataFetcher(cache_dir=temp_cache_dir)
-                 fetcher.fetch_data("AAPL", period="1y")
-
-                 # Check that cache file was created
-                 cache_file = os.path.join(temp_cache_dir, "AAPL_1y_1d.csv")
-                 assert os.path.exists(cache_file)
-
-     def test_fetch_data_from_cache(
-         self, mock_response, temp_cache_dir, sample_dataframe
-     ):
-         """Test fetching data from cache."""
-         with patch.dict(os.environ, {"FMP_API_KEY": "test_key"}):
-             # Create cache file
-             cache_file = os.path.join(temp_cache_dir, "AAPL_1y_1d.csv")
-             sample_dataframe.to_csv(cache_file)
-
-             # Set modification time to be recent (within cache TTL)
-             os.utime(cache_file, (time.time(), time.time()))
-
-             with patch("requests.get", return_value=mock_response) as mock_get:
-                 fetcher = DataFetcher(cache_dir=temp_cache_dir)
-                 df = fetcher.fetch_data("AAPL", period="1y")
-
-                 # API should not be called
-                 mock_get.assert_not_called()
-
-                 # Data should match sample
-                 pd.testing.assert_frame_equal(df, sample_dataframe)
-
-     def test_fetch_data_expired_cache(
-         self, mock_response, temp_cache_dir, sample_dataframe
-     ):
-         """Test fetching data with expired cache."""
-         with patch.dict(os.environ, {"FMP_API_KEY": "test_key"}):
-             # Create cache file
-             cache_file = os.path.join(temp_cache_dir, "AAPL_1y_1d.csv")
-             sample_dataframe.to_csv(cache_file)
-
-             # Set modification time to be old (beyond cache TTL)
-             old_time = time.time() - 100000  # Well beyond default TTL
-             os.utime(cache_file, (old_time, old_time))
-
-             with patch("requests.get", return_value=mock_response) as mock_get:
-                 fetcher = DataFetcher(cache_dir=temp_cache_dir)
-                 fetcher.fetch_data("AAPL", period="1y")
-
-                 # API should be called
-                 mock_get.assert_called_once()
-
-     def test_fetch_market_data(self, mock_response, temp_cache_dir):
-         """Test fetching market data."""
-         with patch.dict(os.environ, {"FMP_API_KEY": "test_key"}):
-             with patch("requests.get", return_value=mock_response):
-                 fetcher = DataFetcher(cache_dir=temp_cache_dir)
-                 df = fetcher.fetch_market_data(market_index="SPY", period="1y")
-
-                 # Check DataFrame structure
-                 assert isinstance(df, pd.DataFrame)
-                 assert len(df) > 0  # Don't check exact length as it may vary
-
-                 # Check that required columns exist
-                 required_columns = ["Open", "High", "Low", "Close", "Volume"]
-                 for col in required_columns:
-                     assert col in df.columns, f"Column {col} not found in DataFrame"
-
-
- class TestErrorHandling:
-     """Tests for error handling in DataFetcher."""
-
-     def test_api_error_response(self, mock_error_response, temp_cache_dir):
-         """Test handling of API error responses."""
-         with patch.dict(os.environ, {"FMP_API_KEY": "test_key"}):
-             with patch("requests.get", return_value=mock_error_response):
-                 fetcher = DataFetcher(cache_dir=temp_cache_dir)
-                 with pytest.raises(ValueError, match="API request failed"):
-                     fetcher.fetch_data("AAPL", period="1y")
-
-     def test_empty_data_response(self, mock_empty_response, temp_cache_dir):
-         """Test handling of empty data responses."""
-         with patch.dict(os.environ, {"FMP_API_KEY": "test_key"}):
-             # Update the mock response to not include 'historical' key
-             mock_empty_response.json.return_value = {"symbol": "INVALID"}
-
-             with patch("requests.get", return_value=mock_empty_response):
-                 fetcher = DataFetcher(cache_dir=temp_cache_dir)
-                 # Now we expect an empty DataFrame instead of an exception
-                 result = fetcher.fetch_data("INVALID", period="1y")
-                 assert isinstance(result, pd.DataFrame)
-                 assert result.empty or len(result) == 0
-                 assert "Open" in result.columns
-                 assert "Close" in result.columns
-
-     def test_network_error_with_fallback(self, temp_cache_dir, sample_dataframe):
-         """Test fallback to expired cache on network error."""
-         with patch.dict(os.environ, {"FMP_API_KEY": "test_key"}):
-             # Create cache file
-             cache_file = os.path.join(temp_cache_dir, "AAPL_1y_1d.csv")
-             sample_dataframe.to_csv(cache_file)
-
-             # Set modification time to be old (beyond cache TTL)
-             old_time = time.time() - 100000  # Well beyond default TTL
-             os.utime(cache_file, (old_time, old_time))
-
-             # Simulate network error
-             with patch(
-                 "requests.get",
-                 side_effect=requests.exceptions.ConnectionError("Network error"),
-             ):
-                 fetcher = DataFetcher(cache_dir=temp_cache_dir)
-                 df = fetcher.fetch_data("AAPL", period="1y")
-
-                 # Should fall back to cache
-                 pd.testing.assert_frame_equal(df, sample_dataframe)
-
-     def test_network_error_without_fallback(self, temp_cache_dir):
-         """Test network error without cache fallback raises exception."""
-         with patch.dict(os.environ, {"FMP_API_KEY": "test_key"}):
-             # Simulate network error with no cache
-             with patch(
-                 "requests.get",
-                 side_effect=requests.exceptions.ConnectionError("Network error"),
-             ):
-                 fetcher = DataFetcher(cache_dir=temp_cache_dir)
-                 with pytest.raises(requests.exceptions.ConnectionError):
-                     fetcher.fetch_data("AAPL", period="1y")
-
-
- class TestDataFormat:
-     """Tests for data format and structure."""
-
-     def test_date_parsing(self, mock_response, temp_cache_dir):
-         """Test that dates are properly parsed and set as index."""
-         with patch.dict(os.environ, {"FMP_API_KEY": "test_key"}):
-             with patch("requests.get", return_value=mock_response):
-                 fetcher = DataFetcher(cache_dir=temp_cache_dir)
-                 df = fetcher.fetch_data("AAPL", period="1y")
-
-                 # Check index is datetime
-                 assert pd.api.types.is_datetime64_dtype(df.index)
-                 assert df.index.name == "date"
-                 # Don't check exact date as it may vary
-
-     def test_column_renaming(self, mock_response, temp_cache_dir):
-         """Test that columns are properly renamed."""
-         with patch.dict(os.environ, {"FMP_API_KEY": "test_key"}):
-             with patch("requests.get", return_value=mock_response):
-                 fetcher = DataFetcher(cache_dir=temp_cache_dir)
-                 df = fetcher.fetch_data("AAPL", period="1y")
-
-                 # Check that required columns exist
-                 required_columns = ["Open", "High", "Low", "Close", "Volume"]
-                 for col in required_columns:
-                     assert col in df.columns, f"Column {col} not found in DataFrame"
-
-     def test_data_sorting(self, temp_cache_dir):
-         """Test that data is sorted by date in ascending order."""
-         # Modify mock response to have unsorted dates
-         unsorted_response = MagicMock()
-         unsorted_response.status_code = 200
-         unsorted_response.json.return_value = {
-             "symbol": "AAPL",
-             "historical": [
-                 {
-                     "date": "2023-01-05",
-                     "open": 127.13,
-                     "high": 127.77,
-                     "low": 124.76,
-                     "close": 125.02,
-                     "volume": 80829500,
-                 },
-                 {
-                     "date": "2023-01-03",
-                     "open": 130.28,
-                     "high": 130.9,
-                     "low": 124.17,
-                     "close": 125.07,
-                     "volume": 112117500,
-                 },
-                 {
-                     "date": "2023-01-04",
-                     "open": 126.89,
-                     "high": 128.66,
-                     "low": 125.08,
-                     "close": 126.36,
-                     "volume": 88883500,
-                 },
-             ],
-         }
-
-         with patch.dict(os.environ, {"FMP_API_KEY": "test_key"}):
-             with patch("requests.get", return_value=unsorted_response):
-                 fetcher = DataFetcher(cache_dir=temp_cache_dir)
-                 df = fetcher.fetch_data("AAPL", period="1y")
-
-                 # Check sorting
-                 assert df.index[0] < df.index[1] < df.index[2]
-                 assert df.index[0] == pd.Timestamp("2023-01-03")
-                 assert df.index[1] == pd.Timestamp("2023-01-04")
-                 assert df.index[2] == pd.Timestamp("2023-01-05")
-
-
- class TestPeriodHandling:
-     """Tests for period handling in date range calculation."""
-
-     def test_period_years(self, mock_response, temp_cache_dir):
-         """Test period handling for years."""
-         with patch.dict(os.environ, {"FMP_API_KEY": "test_key"}):
-             with patch("requests.get", return_value=mock_response) as mock_get:
-                 fetcher = DataFetcher(cache_dir=temp_cache_dir)
-                 fetcher.fetch_data("AAPL", period="2y")
-
-                 # Extract URL from the call
-                 url = mock_get.call_args[0][0]
-
-                 # Check that date range is approximately 2 years
-                 today = datetime.now().strftime("%Y-%m-%d")
-                 two_years_ago = (datetime.now() - timedelta(days=365 * 2)).strftime(
-                     "%Y-%m"
-                 )
-
-                 assert today in url
-                 assert two_years_ago in url
-
-     def test_period_months(self, mock_response, temp_cache_dir):
-         """Test period handling for months."""
-         with patch.dict(os.environ, {"FMP_API_KEY": "test_key"}):
-             with patch("requests.get", return_value=mock_response) as mock_get:
-                 fetcher = DataFetcher(cache_dir=temp_cache_dir)
-                 fetcher.fetch_data("AAPL", period="6m")
-
-                 # Extract URL from the call
-                 url = mock_get.call_args[0][0]
-
-                 # Check that date range is approximately 6 months
-                 today = datetime.now().strftime("%Y-%m-%d")
-                 six_months_ago = (datetime.now() - timedelta(days=30 * 6)).strftime(
-                     "%Y-%m"
-                 )
-
-                 assert today in url
-                 assert six_months_ago in url
-
-     def test_period_default(self, mock_response, temp_cache_dir):
-         """Test default period handling."""
-         with patch.dict(os.environ, {"FMP_API_KEY": "test_key"}):
-             with patch("requests.get", return_value=mock_response) as mock_get:
-                 fetcher = DataFetcher(cache_dir=temp_cache_dir)
-                 fetcher.fetch_data("AAPL", period="invalid")
-
-                 # Extract URL from the call
-                 url = mock_get.call_args[0][0]
-
-                 # Check that date range is approximately 1 year (default)
-                 today = datetime.now().strftime("%Y-%m-%d")
-                 one_year_ago = (datetime.now() - timedelta(days=365)).strftime("%Y-%m")
-
-                 assert today in url
-                 assert one_year_ago in url
-
-
- class TestBetaCalculation:
-     """Tests for beta calculation using the DataFetcher."""
-
-     def test_beta_calculation(self, mock_response, mock_spy_response, temp_cache_dir):
-         """Test beta calculation with mock data."""
-         with patch.dict(os.environ, {"FMP_API_KEY": "test_key"}):
-             with patch(
-                 "requests.get",
-                 side_effect=lambda url, params=None: mock_spy_response
-                 if "SPY" in url
-                 else mock_response,
-             ):
-                 fetcher = DataFetcher(cache_dir=temp_cache_dir)
-
-                 # Get stock and market data
-                 stock_data = fetcher.fetch_data("AAPL", period="1y")
-                 market_data = fetcher.fetch_market_data("SPY", period="1y")
-
-                 # Calculate beta manually
-                 stock_returns = stock_data["Close"].pct_change().dropna()
-                 market_returns = market_data["Close"].pct_change().dropna()
-
-                 # Align data
-                 common_dates = stock_returns.index.intersection(market_returns.index)
-                 stock_returns = stock_returns.loc[common_dates]
-                 market_returns = market_returns.loc[common_dates]
-
-                 # Calculate beta
-                 covariance = stock_returns.cov(market_returns)
-                 market_variance = market_returns.var()
-                 beta = covariance / market_variance
-
-                 # Compare with expected beta from real data
-                 get_real_beta("AAPL")
-
-                 # Beta should be within a reasonable range of the expected value
-                 # The exact value will differ due to the mock data and date ranges
-                 assert 0.5 < beta < 2.0, f"Beta {beta} is outside reasonable range"
-
-                 # For information only - not a strict test
-
-
- if __name__ == "__main__":
-     pytest.main(["-v", "test_data_fetcher.py"])
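
The cache tests removed above all hinge on one freshness rule: a cached CSV is reused only when its modification time falls within the TTL, and an expired or missing file triggers a refetch. A minimal sketch of that check (a hypothetical helper for illustration, not the actual code in `src/stockdata.py`):

```python
import os
import time


def cache_is_fresh(path: str, ttl_seconds: int) -> bool:
    """Return True if the cache file exists and its mtime is within the TTL."""
    if not os.path.exists(path):
        return False
    age = time.time() - os.path.getmtime(path)
    return age < ttl_seconds
```

The removed tests manipulated exactly this signal with `os.utime`: touching the file with the current time made it fresh, while setting an mtime 100000 seconds in the past forced an API call (or a cache fallback on network error).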
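
The deleted `TestBetaCalculation` computed beta by hand as cov(stock returns, market returns) / var(market returns) over the dates the two series share. The same arithmetic, isolated on a few synthetic closing prices (the numbers below are made up for illustration and imply nothing about real AAPL/SPY betas):

```python
import pandas as pd

# Synthetic daily closes for a stock and a market index (hypothetical values)
dates = pd.date_range("2023-01-03", periods=5, freq="D")
stock = pd.Series([125.07, 126.36, 125.02, 127.10, 126.50], index=dates)
market = pd.Series([380.82, 383.76, 379.38, 385.20, 384.10], index=dates)

# Daily percentage returns; the first row is NaN and gets dropped
stock_returns = stock.pct_change().dropna()
market_returns = market.pct_change().dropna()

# Align on shared dates, then beta = cov(stock, market) / var(market)
common = stock_returns.index.intersection(market_returns.index)
beta = stock_returns.loc[common].cov(market_returns.loc[common]) / market_returns.loc[
    common
].var()
```

With both series moving in the same direction each day, the covariance is positive and so is the resulting beta; the removed test only asserted the looser sanity bound `0.5 < beta < 2.0` because mock data and date ranges made an exact value meaningless.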