AJAY KASU committed on
Commit cafdd88 · 0 Parent(s)

Initial Release: QuantScale AI Institutional Engine
.env.example ADDED
@@ -0,0 +1 @@
+ HF_TOKEN=hf_your_hugging_face_token_here
.gitignore ADDED
@@ -0,0 +1 @@
+ .env
Dockerfile ADDED
+ FROM python:3.10-slim
+
+ WORKDIR /app
+
+ COPY requirements.txt .
+ RUN pip install --no-cache-dir -r requirements.txt
+
+ COPY . .
+
+ # Expose API port
+ EXPOSE 7860
+
+ # Run FastAPI
+ CMD ["uvicorn", "api.app:app", "--host", "0.0.0.0", "--port", "7860"]
README.md ADDED
@@ -0,0 +1,122 @@
+ ---
+ title: QuantScaleAI
+ emoji: 📈
+ colorFrom: blue
+ colorTo: green
+ sdk: docker
+ pinned: false
+ app_port: 7860
+ ---
+
+ # QuantScale AI: Automated Direct Indexing & Attribution Engine
+
+ **QuantScale AI** is an institutional-grade portfolio optimization engine designed to replicate the "Direct Indexing" capabilities of top asset managers (e.g., Goldman Sachs, BlackRock).
+
+ [![Hugging Face Spaces](https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Live%20Demo-blue)](https://huggingface.co/spaces/AJAYKASU/QuantScaleAI)
+ [![API Docs](https://img.shields.io/badge/Swagger-API%20Docs-green)](https://ajaykasu-quantscaleai.hf.space/docs)
+
+ It specifically addresses the challenge of **Personalized Indexing at Scale**: allowing 60,000+ client portfolios to track a benchmark (the S&P 500) while accommodating client-specific constraints (values-based exclusions such as "No Energy") and providing automated, high-precision performance attribution.
+
+ ---
+
+ ## Key Features
+
+ ### 1. Quantitative Engine (The Math)
+ - **Tracking Error Minimization**: Uses `cvxpy` to solve the quadratic programming problem of minimizing active risk.
+ - **Robust Risk Modeling**: Implements **Ledoit-Wolf covariance shrinkage** to handle the "high dimensionality, low sample size" problem inherent in 500-stock correlation matrices.
+ - **Direct Indexing**: Optimizes individual stock weights rather than ETFs, enabling granular customization.
+
+ ### 2. Wealth Management Features
+ - **Tax-Loss Harvesting**: Automated identification of loss lots with **wash-sale proxy logic**.
+   - *Example*: Detects a loss in Chevron (CVX) -> suggests a swap into ExxonMobil (XOM) to maintain Energy exposure without triggering wash-sale rules.
+ - **Sector Caching**: Local caching layer to handle API rate limits and ensure low-latency performance for demos.
+
+ ### 3. AI Integration (Alpha Generation)
+ - **Attribution Precision**: Uses the **Brinson-Fachler attribution model** to decompose excess return into the **Allocation Effect** (sector weighting) and the **Selection Effect** (stock picking).
+ - **Hugging Face Integration**: Feeds high-signal attribution data (top 5 contributors/detractors) into `Meta-Llama-3-8B-Instruct` to generate polished, natural-language client commentary.
+
+ ---
+
+ ## Mathematical Formulation
+
+ The core optimizer solves the following quadratic program:
+
+ $$
+ \min_{w} \quad (w - w_b)^T \Sigma (w - w_b)
+ $$
+
+ **Subject to:**
+
+ $$
+ \sum_{i=1}^{N} w_i = 1 \quad (\text{Fully Invested})
+ $$
+
+ $$
+ w_i \ge 0 \quad (\text{Long Only})
+ $$
+
+ $$
+ w_{excluded} = 0 \quad (\text{Sector Constraints})
+ $$
+
+ Where:
+ - $w$ is the vector of portfolio weights.
+ - $w_b$ is the vector of benchmark weights.
+ - $\Sigma$ is the Ledoit-Wolf shrunk covariance matrix.
+
+ ---
+
+ ## Tech Stack
+ - **Languages**: Python 3.10+
+ - **Optimization**: `cvxpy`, `scikit-learn` (Ledoit-Wolf)
+ - **Data**: `yfinance` (market data), `pandas`, `numpy`
+ - **AI/LLM**: `huggingface_hub` (Inference API)
+ - **API**: `FastAPI` (async REST endpoints)
+ - **Architecture**: Object-oriented (abstract managers, Pydantic schemas)
+
+ ---
+
+ ## Installation & Usage
+
+ 1. **Clone & Install**
+    ```bash
+    git clone https://github.com/AjayKasu1/QuantScaleAI.git
+    cd QuantScaleAI
+    pip install -r requirements.txt
+    ```
+
+ 2. **Configure Credentials**
+    Rename `.env.example` to `.env` and add your Hugging Face token:
+    ```env
+    HF_TOKEN=hf_...
+    ```
+
+ 3. **Run the API**
+    ```bash
+    uvicorn api.app:app --reload
+    ```
+    POST to `http://127.0.0.1:8000/optimize` with:
+    ```json
+    {
+      "client_id": "CLIENT_01",
+      "excluded_sectors": ["Energy"]
+    }
+    ```
+
+ ---
+
+ ## Architecture
+
+ ```mermaid
+ graph TD
+     A[Client Request] --> B[FastAPI Layer]
+     B --> C[QuantScaleSystem]
+     C --> D[MarketDataEngine]
+     D --> E[(Sector Cache)]
+     C --> F[RiskModel]
+     F --> G[PortfolioOptimizer]
+     G --> H[AttributionEngine]
+     H --> I[AIReporter]
+     I --> J((Hugging Face API))
+     J --> I
+     I --> B
+ ```
__pycache__/config.cpython-311.pyc ADDED
Binary file (2.27 kB)
 
__pycache__/config.cpython-39.pyc ADDED
Binary file (1.39 kB)
 
__pycache__/main.cpython-311.pyc ADDED
Binary file (4.95 kB)
 
ai/__pycache__/ai_reporter.cpython-39.pyc ADDED
Binary file (2.11 kB)
 
ai/__pycache__/prompts.cpython-39.pyc ADDED
Binary file (1.6 kB)
 
ai/ai_reporter.py ADDED
@@ -0,0 +1,76 @@
+ import logging
+ from datetime import datetime
+
+ from huggingface_hub import InferenceClient
+ from core.schema import AttributionReport
+ from ai.prompts import SYSTEM_PROMPT, ATTRIBUTION_PROMPT_TEMPLATE
+ from config import settings
+
+ logger = logging.getLogger(__name__)
+
+ class AIReporter:
+     """
+     Generates natural language commentary using the Hugging Face Inference API.
+     Model used: meta-llama/Meta-Llama-3-8B-Instruct (or a similar instruct
+     model available via the API).
+     """
+
+     def __init__(self):
+         token = settings.HF_TOKEN.get_secret_value() if settings.HF_TOKEN else None
+
+         if token:
+             self.client = InferenceClient(token=token)
+         else:
+             self.client = None
+             logger.warning("HF_TOKEN not found. AI features will be disabled.")
+
+         # Default to a robust instruction-tuned model
+         self.model_id = "meta-llama/Meta-Llama-3-8B-Instruct"
+
+     def generate_report(self,
+                         attribution_report: AttributionReport,
+                         excluded_sector: str) -> str:
+         """
+         Constructs the prompt and calls the HF API to generate the commentary.
+         """
+         # Current date in a fixed format, e.g. "February 03, 2026"
+         current_date = datetime.now().strftime("%B %d, %Y")
+
+         if not self.client:
+             return f"AI Commentary Unavailable (missing HF_TOKEN). Current Date: {current_date}"
+
+         logger.info("Generating AI Commentary...")
+
+         # Format the user prompt; the template carries the attribution data,
+         # and the date is pinned explicitly so the model cannot invent one.
+         user_prompt = f"""
+ Current Date: {current_date}
+ INSTRUCTION: Start your commentary exactly with the header: "Market Commentary - {current_date}"
+ """ + ATTRIBUTION_PROMPT_TEMPLATE.format(
+             excluded_sector=excluded_sector,
+             total_active_return=attribution_report.total_active_return * 100,  # convert to %
+             allocation_effect=attribution_report.allocation_effect * 100,
+             selection_effect=attribution_report.selection_effect * 100,
+             top_contributors=", ".join(attribution_report.top_contributors),
+             top_detractors=", ".join(attribution_report.top_detractors),
+             current_date=current_date,
+         )
+
+         messages = [
+             {"role": "system", "content": SYSTEM_PROMPT},
+             {"role": "user", "content": user_prompt},
+         ]
+
+         try:
+             response = self.client.chat_completion(
+                 model=self.model_id,
+                 messages=messages,
+                 max_tokens=500,
+                 temperature=0.7,
+             )
+             commentary = response.choices[0].message.content
+             logger.info("AI Commentary generated successfully.")
+             return commentary
+         except Exception as e:
+             logger.error(f"Failed to generate AI report: {e}")
+             return "Error generating commentary. Please check the API connection."
ai/prompts.py ADDED
@@ -0,0 +1,38 @@
+ # System prompt for the Portfolio Manager persona
+ SYSTEM_PROMPT = """You are a Senior Portfolio Manager at a top-tier Asset Management firm (e.g., Goldman Sachs, BlackRock).
+ Your goal is to write a concise, professional, and insightful performance commentary for a High Net Worth client.
+ Your tone should be:
+ 1. Professional and reassuring.
+ 2. Mathematically precise (cite the numbers).
+ 3. Explanatory (explain 'why' something happened).
+
+ Avoid generic financial advice. Focus strictly on the attribution data provided.
+ """
+
+ # User prompt template
+ ATTRIBUTION_PROMPT_TEMPLATE = """
+ Write a "Trailing 30-Day Risk & Performance Attribution" report relative to the S&P 500 benchmark.
+
+ ## Constraints Applied
+ - Exclusions: {excluded_sector}
+
+ ## Brinson-Fachler Attribution Data (Trailing 30 Days)
+ - Total Active Return (Alpha): {total_active_return:.2f}%
+ - Allocation Effect (Impact of Exclusions): {allocation_effect:.2f}%
+ - Selection Effect (Impact of Stock Picking): {selection_effect:.2f}%
+
+ ## Attribution Detail
+ - Top Active Contributors: {top_contributors}
+ - Top Active Detractors: {top_detractors}
+
+ ## Guidelines for the Narrative:
+ 1. **Timeframe**: Use the EXACT date provided. Write "For the trailing 30-day period ending {current_date}..." Do NOT generalize to "the month of...".
+ 2. **Ticker Validation (CRITICAL)**: Always verify tickers. ExxonMobil is XOM, Chevron is CVX. Do NOT swap them.
+ 3. **Attribution Logic**:
+    - If a sector is excluded (0% weight), attribute ALL gains/losses to the **Allocation Effect**.
+    - Do NOT mention the 'Selection Effect' for sectors where we hold 0% (e.g., if Energy is excluded, we didn't "select" bad Energy stocks; we simply didn't own the sector).
+ 4. **Detractor Clarity**:
+    - If an EXCLUDED stock (like AMZN, XOM, CVX) is listed as a "Top Detractor", explicitly state: "We suffered a drag because the portfolio missed out on the rally in [Stock] due to exclusion constraints."
+
+ Write a professional, concise 3-paragraph commentary.
+ """
analytics/__pycache__/attribution.cpython-39.pyc ADDED
Binary file (2.93 kB)
 
analytics/__pycache__/risk_model.cpython-39.pyc ADDED
Binary file (1.89 kB)
 
analytics/__pycache__/tax_module.cpython-39.pyc ADDED
Binary file (3.55 kB)
 
analytics/attribution.py ADDED
@@ -0,0 +1,113 @@
+ import logging
+ from typing import Dict
+
+ import pandas as pd
+
+ from core.schema import AttributionReport
+
+ logger = logging.getLogger(__name__)
+
+ class AttributionEngine:
+     """
+     Implements the Brinson-Fachler attribution model.
+     Decomposes portfolio excess return into:
+     1. Allocation Effect: value added by sector weighting decisions.
+     2. Selection Effect: value added by stock picking within sectors.
+     """
+
+     def generate_attribution_report(self,
+                                     portfolio_weights: Dict[str, float],
+                                     benchmark_weights: Dict[str, float],
+                                     asset_returns: pd.Series,
+                                     sector_map: Dict[str, str]) -> AttributionReport:
+         """
+         Calculates attribution effects.
+
+         Args:
+             portfolio_weights: Ticker -> weight
+             benchmark_weights: Ticker -> weight
+             asset_returns: Ticker -> return over the period
+             sector_map: Ticker -> sector
+
+         Returns:
+             AttributionReport object
+         """
+         # Build a single DataFrame over the union of tickers
+         all_tickers = set(portfolio_weights.keys()) | set(benchmark_weights.keys())
+         df = pd.DataFrame(index=list(all_tickers))
+
+         df['wp'] = df.index.map(portfolio_weights).fillna(0.0)
+         df['wb'] = df.index.map(benchmark_weights).fillna(0.0)
+         df['ret'] = df.index.map(asset_returns).fillna(0.0)
+         df['sector'] = df.index.map(sector_map).fillna("Unknown")
+
+         # Aggregate to sector level: portfolio/benchmark sector weights
+         # (w_pi, w_bi) and sector returns (R_pi, R_bi)
+         sector_groups = df.groupby('sector')
+         attribution_rows = []
+         total_benchmark_return = (df['wb'] * df['ret']).sum()
+
+         for sector, data in sector_groups:
+             w_p = data['wp'].sum()
+             w_b = data['wb'].sum()
+
+             # Guard against division by zero when a sector weight is 0
+             R_p = (data['wp'] * data['ret']).sum() / w_p if w_p > 0 else 0
+             R_b = (data['wb'] * data['ret']).sum() / w_b if w_b > 0 else 0
+
+             # Brinson-Fachler allocation: (w_p - w_b) * (R_b - R_total_benchmark)
+             allocation_effect = (w_p - w_b) * (R_b - total_benchmark_return)
+
+             # Selection: w_b * (R_p - R_b). Benchmark-weighted selection, with
+             # the interaction term kept separate below.
+             selection_effect = w_b * (R_p - R_b)
+
+             # Interaction: (w_p - w_b) * (R_p - R_b)
+             interaction_effect = (w_p - w_b) * (R_p - R_b)
+
+             attribution_rows.append({
+                 'sector': sector,
+                 'allocation': allocation_effect,
+                 'selection': selection_effect,
+                 'interaction': interaction_effect,
+                 'total_effect': allocation_effect + selection_effect + interaction_effect
+             })
+
+         attr_df = pd.DataFrame(attribution_rows)
+
+         total_allocation = attr_df['allocation'].sum()
+         total_selection = attr_df['selection'].sum()
+         total_interaction = attr_df['interaction'].sum()  # bundled into selection in the report
+
+         # Top contributors/detractors: approximate each asset's contribution
+         # to active return as active weight * asset return
+         df['active_weight'] = df['wp'] - df['wb']
+         df['contribution'] = df['active_weight'] * df['ret']
+
+         sorted_contrib = df.sort_values(by='contribution', ascending=False)
+         top_contributors = sorted_contrib.head(5).index.tolist()
+         top_detractors = sorted_contrib.tail(5).index.tolist()
+
+         # Narrative skeleton (to be expanded by the AI reporter)
+         narrative_raw = (
+             f"Total Active Return: {(total_allocation + total_selection + total_interaction):.4f}. "
+             f"Allocation Effect: {total_allocation:.4f}. "
+             f"Selection Effect: {total_selection + total_interaction:.4f}."
+         )
+
+         return AttributionReport(
+             allocation_effect=total_allocation,
+             selection_effect=total_selection + total_interaction,
+             total_active_return=(total_allocation + total_selection + total_interaction),
+             top_contributors=top_contributors,
+             top_detractors=top_detractors,
+             narrative=narrative_raw
+         )
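As a quick sanity check on these formulas, the three effects (allocation, selection, interaction) must sum exactly to the total active return. A tiny two-sector example in plain `numpy`, with made-up weights and returns:

```python
import numpy as np

# Hypothetical two-sector example: Energy (excluded) and "Everything else".
w_p = np.array([0.0, 1.0])    # portfolio sector weights
w_b = np.array([0.1, 0.9])    # benchmark sector weights
R_p = np.array([0.0, 0.04])   # portfolio sector returns
R_b = np.array([0.05, 0.03])  # benchmark sector returns

R_bench = float(w_b @ R_b)                  # total benchmark return
allocation = (w_p - w_b) * (R_b - R_bench)  # Brinson-Fachler allocation
selection = w_b * (R_p - R_b)               # benchmark-weighted selection
interaction = (w_p - w_b) * (R_p - R_b)     # interaction

active = float(w_p @ R_p) - R_bench
decomposed = float(allocation.sum() + selection.sum() + interaction.sum())
assert abs(active - decomposed) < 1e-12     # the three effects sum to alpha
print(round(active, 4))  # 0.008
```

Note how the excluded Energy sector contributes only through the allocation and interaction terms, which is exactly the behavior the prompt guidelines in `ai/prompts.py` ask the model to respect.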
analytics/risk_model.py ADDED
@@ -0,0 +1,58 @@
+ import logging
+
+ import pandas as pd
+ from sklearn.covariance import LedoitWolf
+
+ logger = logging.getLogger(__name__)
+
+ class RiskModel:
+     """
+     Computes the covariance matrix of asset returns using Ledoit-Wolf shrinkage.
+     This is essential for high-dimensional portfolios (N > 500) where the
+     sample covariance matrix is often ill-conditioned or noisy.
+     """
+
+     def compute_covariance_matrix(self, returns: pd.DataFrame) -> pd.DataFrame:
+         """
+         Calculates the shrunk covariance matrix.
+
+         Args:
+             returns (pd.DataFrame): Historical daily returns (Date index, Ticker columns).
+
+         Returns:
+             pd.DataFrame: Covariance matrix (Ticker index, Ticker columns).
+         """
+         if returns.empty:
+             logger.error("Returns dataframe is empty. Cannot compute covariance.")
+             raise ValueError("Empty returns dataframe.")
+
+         logger.info(f"Computing Ledoit-Wolf shrinkage covariance for {returns.shape[1]} assets...")
+
+         # scikit-learn expects (n_samples, n_features); the returns frame is
+         # already (n_days, n_tickers), which matches.
+         lw = LedoitWolf()
+         try:
+             lw.fit(returns.values)
+
+             # Reconstruct a labelled DataFrame from the estimated matrix
+             cov_df = pd.DataFrame(
+                 lw.covariance_,
+                 index=returns.columns,
+                 columns=returns.columns
+             )
+             logger.info("Covariance matrix computation successful.")
+             return cov_df
+         except Exception as e:
+             logger.error(f"Failed to compute covariance matrix: {e}")
+             raise
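The motivation for shrinkage can be shown in a few lines: with more assets than observations, the sample covariance matrix is singular, while the Ledoit-Wolf estimate stays positive definite and therefore usable in the optimizer's quadratic form. A minimal sketch with synthetic data (the 60×100 dimensions are chosen for illustration):

```python
import numpy as np
from sklearn.covariance import LedoitWolf

rng = np.random.default_rng(0)
# 60 days of returns for 100 assets: fewer samples than assets,
# so the sample covariance is rank-deficient.
X = rng.normal(scale=0.01, size=(60, 100))

sample_cov = np.cov(X, rowvar=False)
lw = LedoitWolf().fit(X)

print(np.linalg.matrix_rank(sample_cov))                # well below 100
print(np.all(np.linalg.eigvalsh(lw.covariance_) > 0))   # shrunk matrix is positive definite
print(0.0 < lw.shrinkage_ <= 1.0)                       # shrinkage intensity estimated from data
```

The `shrinkage_` attribute is the data-driven blend between the sample covariance and a scaled identity target; it grows as the sample becomes less informative relative to the dimensionality.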
analytics/tax_module.py ADDED
@@ -0,0 +1,99 @@
+ import logging
+ from datetime import date, timedelta
+ from typing import Dict, List, Optional
+
+ import pandas as pd
+
+ from core.schema import TaxLot, HarvestOpportunity, TickerData
+
+ logger = logging.getLogger(__name__)
+
+ class TaxEngine:
+     """
+     Identifies tax-loss harvesting opportunities and suggests proxies
+     to avoid wash-sale violations.
+     """
+
+     def __init__(self, risk_model=None):
+         self.risk_model = risk_model
+
+     def check_wash_sale_rule(self, symbol: str, transaction_date: date,
+                              recent_transactions: List[Dict]) -> bool:
+         """
+         Checks if a sale would trigger a wash sale. The IRS window is
+         +/- 30 days around the sale; this simplified simulation only
+         looks for purchases in the 30 days *before* the sale.
+         """
+         limit_date = transaction_date - timedelta(days=30)
+
+         for txn in recent_transactions:
+             if txn['symbol'] == symbol and txn['type'] == 'buy':
+                 if limit_date <= txn['date'] <= transaction_date:
+                     return True
+         return False
+
+     def find_proxy(self, loser_ticker: str, sector: str,
+                    candidate_tickers: List[TickerData],
+                    correlation_matrix: Optional[pd.DataFrame] = None) -> str:
+         """
+         Finds a suitable proxy stock in the same sector: ideally highly
+         correlated (to maintain tracking) but not "substantially identical".
+         """
+         # Filter for peers in the same sector
+         sector_peers = [t.symbol for t in candidate_tickers if t.sector == sector and t.symbol != loser_ticker]
+
+         if not sector_peers:
+             return "SPY"  # fallback: broad-market ETF
+
+         if correlation_matrix is not None and not correlation_matrix.empty:
+             try:
+                 # Pick the sector peer most correlated with the loser
+                 if loser_ticker in correlation_matrix.index:
+                     corrs = correlation_matrix[loser_ticker]
+                     peer_corrs = corrs[corrs.index.isin(sector_peers)]
+                     if not peer_corrs.empty:
+                         best_proxy = peer_corrs.idxmax()
+                         logger.info(f"Found proxy for {loser_ticker} using correlation: {best_proxy} (corr: {peer_corrs.max():.2f})")
+                         return best_proxy
+             except Exception as e:
+                 logger.warning(f"Correlation lookup failed: {e}. Falling back to first sector peer.")
+
+         # Fallback: pick the first peer in the sector
+         return sector_peers[0]
+
+     def harvest_losses(self, portfolio_lots: List[TaxLot],
+                        market_prices: Dict[str, float],
+                        candidate_tickers: List[TickerData],
+                        correlation_matrix: Optional[pd.DataFrame] = None) -> List[HarvestOpportunity]:
+         """
+         Scans the portfolio for lots with more than a 10% unrealized loss.
+         """
+         opportunities = []
+
+         for lot in portfolio_lots:
+             # Refresh the current price if available
+             if lot.symbol in market_prices:
+                 lot.current_price = market_prices[lot.symbol]
+
+             # Harvest threshold: loss of 10% or worse
+             if lot.loss_percentage <= -0.10:
+                 # Look up the sector for this ticker from the candidate universe
+                 ticker_obj = next((t for t in candidate_tickers if t.symbol == lot.symbol), None)
+                 sector = ticker_obj.sector if ticker_obj else "Unknown"
+
+                 proxy = self.find_proxy(lot.symbol, sector, candidate_tickers, correlation_matrix)
+
+                 opp = HarvestOpportunity(
+                     sell_ticker=lot.symbol,
+                     buy_proxy_ticker=proxy,
+                     quantity=lot.quantity,
+                     estimated_loss_harvested=abs(lot.unrealized_pl),
+                     reason=f"Loss of {lot.loss_percentage*100:.1f}% exceeds the 10% threshold."
+                 )
+                 opportunities.append(opp)
+
+         logger.info(f"Identified {len(opportunities)} tax-loss harvesting opportunities.")
+         return opportunities
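The windowing logic in `check_wash_sale_rule` can be illustrated standalone. The helper below is hypothetical; it mirrors the engine's simplified one-sided 30-day lookback, not the full IRS +/- 30-day rule:

```python
from datetime import date, timedelta

def is_wash_sale(symbol, sale_date, transactions, window_days=30):
    # Hypothetical helper mirroring TaxEngine.check_wash_sale_rule:
    # flag a sale if the same symbol was bought within `window_days`
    # before the sale date (inclusive).
    start = sale_date - timedelta(days=window_days)
    return any(
        t["symbol"] == symbol and t["type"] == "buy"
        and start <= t["date"] <= sale_date
        for t in transactions
    )

txns = [{"symbol": "CVX", "type": "buy", "date": date(2024, 3, 10)}]
print(is_wash_sale("CVX", date(2024, 3, 25), txns))  # True: bought 15 days before the sale
print(is_wash_sale("XOM", date(2024, 3, 25), txns))  # False: different ticker
```

This is why the engine swaps into a correlated same-sector proxy (e.g. CVX -> XOM) instead of repurchasing the harvested name inside the window.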
api/__pycache__/app.cpython-311.pyc ADDED
Binary file (2.76 kB)
 
api/app.py ADDED
@@ -0,0 +1,47 @@
+ import logging
+
+ from fastapi import FastAPI, HTTPException
+ from fastapi.responses import FileResponse
+ from fastapi.staticfiles import StaticFiles
+
+ from core.schema import OptimizationRequest
+ from main import QuantScaleSystem
+
+ app = FastAPI(title="QuantScale AI API", version="1.0.0")
+ logger = logging.getLogger("API")
+
+ # Singleton system instance shared across requests
+ system = QuantScaleSystem()
+
+ # Mount static files for the web UI
+ app.mount("/static", StaticFiles(directory="api/static"), name="static")
+
+ @app.get("/")
+ def root():
+     """Serves the AI interface."""
+     return FileResponse('api/static/index.html')
+
+ @app.get("/health")
+ def health_check():
+     return {"status": "healthy", "service": "QuantScale AI Direct Indexing"}
+
+ @app.post("/optimize", response_model=dict)
+ def optimize_portfolio(request: OptimizationRequest):
+     """
+     Optimizes a portfolio based on exclusions and generates an AI attribution report.
+     """
+     try:
+         result = system.run_pipeline(request)
+         if not result:
+             raise HTTPException(status_code=500, detail="Pipeline failed to execute.")
+
+         return {
+             "client_id": request.client_id,
+             "allocations": result['optimization'].weights,
+             "tracking_error": result['optimization'].tracking_error,
+             "attribution_narrative": result['commentary']
+         }
+     except HTTPException:
+         # Re-raise HTTP errors as-is instead of re-wrapping them below
+         raise
+     except Exception as e:
+         logger.error(f"API Error: {e}")
+         raise HTTPException(status_code=500, detail=str(e))
api/static/index.html ADDED
@@ -0,0 +1,537 @@
+ <!DOCTYPE html>
+ <html lang="en">
+
+ <head>
+   <meta charset="UTF-8">
+   <meta name="viewport" content="width=device-width, initial-scale=1.0">
+   <title>QuantScale AI</title>
+   <link
+     href="https://fonts.googleapis.com/css2?family=Inter:wght@400;500;600&family=JetBrains+Mono:wght@400;700&display=swap"
+     rel="stylesheet">
+   <script src="https://cdn.jsdelivr.net/npm/chart.js"></script>
+   <script src="https://cdnjs.cloudflare.com/ajax/libs/html2pdf.js/0.10.1/html2pdf.bundle.min.js"></script>
+   <style>
+     :root {
+       --bg-color: #0f1117;
+       --card-bg: #1e212b;
+       --accent: #3b82f6;
+       --text-primary: #e2e8f0;
+       --text-secondary: #94a3b8;
+       --success: #10b981;
+     }
+
+     body {
+       font-family: 'Inter', sans-serif;
+       background-color: var(--bg-color);
+       color: var(--text-primary);
+       margin: 0;
+       display: flex;
+       flex-direction: column;
+       align-items: center;
+       min-height: 100vh;
+     }
+
+     .container {
+       width: 100%;
+       max-width: 900px;
+       padding: 2rem;
+       box-sizing: border-box;
+     }
+
+     header {
+       text-align: center;
+       margin-bottom: 3rem;
+     }
+
+     h1 {
+       font-size: 2.5rem;
+       margin-bottom: 0.5rem;
+       background: linear-gradient(90deg, #60a5fa, #34d399);
+       -webkit-background-clip: text;
+       -webkit-text-fill-color: transparent;
+     }
+
+     .subtitle {
+       color: var(--text-secondary);
+       font-size: 1.1rem;
+     }
+
+     .input-area {
+       background-color: var(--card-bg);
+       padding: 1.5rem;
+       border-radius: 12px;
+       box-shadow: 0 4px 6px -1px rgba(0, 0, 0, 0.1);
+       margin-bottom: 2rem;
+     }
+
+     textarea {
+       width: 100%;
+       background-color: #0f1117;
+       border: 1px solid #2d3748;
+       color: var(--text-primary);
+       border-radius: 8px;
+       padding: 1rem;
+       font-family: 'Inter', sans-serif;
+       font-size: 1rem;
+       resize: none;
+       height: 80px;
+       box-sizing: border-box;
+       outline: none;
+       transition: border-color 0.2s;
+     }
+
+     textarea:focus {
+       border-color: var(--accent);
+     }
+
+     .btn-primary {
+       background-color: var(--accent);
+       color: white;
+       border: none;
+       padding: 0.75rem 1.5rem;
+       border-radius: 8px;
+       font-weight: 600;
+       cursor: pointer;
+       margin-top: 1rem;
+       width: 100%;
+       transition: opacity 0.2s;
+     }
+
+     .btn-primary:hover {
+       opacity: 0.9;
+     }
+
+     .loader {
+       display: none;
+       text-align: center;
+       margin: 2rem 0;
+       color: var(--accent);
+     }
+
+     #results {
+       display: none;
+       animation: fadeIn 0.5s ease;
+     }
+
+     .report-grid {
+       display: grid;
+       grid-template-columns: 1fr 1fr;
+       gap: 1.5rem;
+       margin-bottom: 2rem;
+     }
+
+     .card {
+       background-color: var(--card-bg);
+       padding: 1.5rem;
+       border-radius: 12px;
+       border: 1px solid #2d3748;
+     }
+
+     h3 {
+       margin-top: 0;
+       font-size: 0.9rem;
+       text-transform: uppercase;
+       letter-spacing: 0.05em;
+       color: var(--text-secondary);
+     }
+
+     .metric {
+       font-size: 2rem;
+       font-weight: 700;
+       color: var(--text-primary);
+     }
+
+     .metric-label {
+       font-size: 0.875rem;
+       color: var(--text-secondary);
+     }
+
+     .narrative-box {
+       background-color: #1e212b;
+       border-left: 4px solid var(--success);
+       padding: 1.5rem;
+       border-radius: 0 12px 12px 0;
+       line-height: 1.6;
+     }
+
+     .holding-list {
+       max-height: 300px;
+       overflow-y: auto;
+       font-family: 'JetBrains Mono', monospace;
+       font-size: 0.9rem;
+     }
+
+     .holding-item {
+       display: flex;
+       justify-content: space-between;
+       padding: 0.5rem 0;
+       border-bottom: 1px solid #2d3748;
+     }
+
+     @keyframes fadeIn {
+       from {
+         opacity: 0;
+         transform: translateY(10px);
+       }
+       to {
+         opacity: 1;
+         transform: translateY(0);
+       }
+     }
+
+     /* PDF export styles (professional document mode) */
+     .pdf-mode {
+       /* Override theme variables for print: white background, black text */
+       --bg-color: #ffffff !important;
+       --card-bg: transparent !important;
+       --text-primary: #000000 !important;
+       --text-secondary: #000000 !important; /* force subtitles to black */
+       --accent: #000000 !important;
+
+       background-color: #ffffff !important;
+       color: #000000 !important;
+       padding: 40px;
+     }
+
+     .pdf-mode .report-grid {
+       gap: 2rem;
+     }
+
+     .pdf-mode .card {
+       background-color: transparent !important;
+       border: 1px solid #000000 !important; /* sharp black border */
+       box-shadow: none !important;
+       border-radius: 4px !important; /* sharper corners */
+       padding: 1.5rem !important;
+       color: #000000 !important;
+     }
+
+     .pdf-mode h1 {
+       background: none !important;
+       -webkit-text-fill-color: #000000 !important;
+       color: #000000 !important;
+       font-size: 24pt !important;
+       margin-bottom: 5px !important;
+     }
+
+     .pdf-mode .subtitle {
+       color: #333333 !important;
+       font-size: 14pt !important;
+       margin-bottom: 20px !important;
+     }
+
+     .pdf-mode h1,
+     .pdf-mode h2,
+     .pdf-mode h3,
+     .pdf-mode p {
+       color: #000000 !important;
+     }
+
+     .pdf-mode h3 {
+       color: #000000 !important;
+       font-weight: 800 !important;
+       border-bottom: 1px solid #000000;
+       padding-bottom: 5px;
+       margin-bottom: 15px;
+       font-size: 12pt !important;
+     }
+
+     .pdf-mode .metric {
+       color: #000000 !important;
+       font-size: 28pt !important;
+     }
+
+     .pdf-mode .metric-label {
+       color: #333333 !important;
+       font-size: 10pt !important;
+       font-weight: 500 !important;
+     }
+
+     .pdf-mode .holding-item {
+       border-bottom: 1px solid #dddddd !important;
+       color: #000000 !important;
+       font-size: 10pt !important;
+     }
+
+     .pdf-mode .narrative-box {
+       background-color: transparent !important; /* no grey box */
+       color: #000000 !important;
+       border-left: 4px solid #000000 !important; /* black accent */
+       padding-left: 15px !important;
+       font-size: 11pt !important;
+       line-height: 1.5 !important;
+       text-align: justify;
+     }
+
+     /* Chart legends render on canvas, so CSS can only nudge contrast here;
+        the JS below re-colors the legend before export. */
+     .pdf-mode canvas {
+       filter: contrast(1.2); /* slight boost */
+     }
+   </style>
+ </head>
+
+ <body>
+
+   <div class="container">
+     <header>
+       <h1>QuantScale AI</h1>
+       <div class="subtitle">Direct Indexing & Attribution Engine</div>
+     </header>
+
+     <div class="input-area">
+       <textarea id="userInput"
+         placeholder="Describe your goal, e.g., 'Optimize my $100k portfolio but exclude the Energy and Utilities sectors.'"></textarea>
+       <button class="btn-primary" onclick="runOptimization()">Generate Portfolio Strategy</button>
+     </div>
+
+     <div class="loader" id="loader">
+       Running Convex Optimization & AI Model...
+     </div>
+
+     <div id="results">
+       <!-- Download button -->
+       <div style="text-align: right; margin-bottom: 1rem;">
+         <button onclick="downloadPDF()"
+           style="background: transparent; border: 1px solid #3b82f6; color: #3b82f6; padding: 0.5rem 1rem; border-radius: 6px; cursor: pointer; font-family: 'Inter', sans-serif;">
+           📄 Generate Institutional Report
+         </button>
+       </div>
+
+       <!-- Top metrics -->
+       <div class="report-grid">
+         <div class="card">
+           <h3>Projected Tracking Error</h3>
+           <div class="metric" id="teMetric">0.00%</div>
+           <div class="metric-label">vs S&P 500 Benchmark</div>
+         </div>
+         <div class="card">
+           <h3>Excluded Sectors</h3>
+           <div class="metric" id="excludedMetric" style="color: #ef4444;">None</div>
+           <div class="metric-label">Constraints applied</div>
+         </div>
+       </div>
+
+       <!-- AI commentary -->
+       <div class="card" style="margin-bottom: 2rem;">
+         <h3>AI Performance Attribution</h3>
+         <div id="aiNarrative" class="narrative-box"></div>
+       </div>
+
+       <!-- Holdings & chart -->
+       <div class="report-grid">
+         <div class="card">
+           <h3>Top Holdings</h3>
+           <div class="holding-list" id="holdingsList"></div>
+         </div>
+         <div class="card">
+           <h3>Sector Allocation</h3>
+           <canvas id="allocationChart"></canvas>
+         </div>
+       </div>
+     </div>
+   </div>
+
+   <script>
+     async function downloadPDF() {
+       const element = document.getElementById('results');
+       const btn = element.querySelector('button');
+
+       // 1. Switch to PDF mode
+       element.classList.add('pdf-mode');
+       if (btn) btn.style.display = 'none';
+
+       // 2. Force the chart legend to black text (no animation)
+       if (myChart) {
+         myChart.options.plugins.legend.labels.color = '#000000';
+         myChart.options.scales = myChart.options.scales || {};
+         myChart.update('none');
+       }
+
+       // 3. Wait for the canvas to repaint (the "freeze" strategy)
+       await new Promise(resolve => setTimeout(resolve, 500));
+
+       const opt = {
+         margin: 1,
+         filename: 'QuantScale_Institutional_Report.pdf',
+         image: { type: 'jpeg', quality: 0.98 },
+         html2canvas: { scale: 3, backgroundColor: '#ffffff', useCORS: true, letterRendering: true },
+         jsPDF: { unit: 'in', format: 'letter', orientation: 'portrait' }
366
+ };
367
+
368
+ // 4. Generate & Save
369
+ await html2pdf().set(opt).from(element).save();
370
+
371
+ // 5. Cleanup / Restore
372
+ element.classList.remove('pdf-mode');
373
+ if (btn) btn.style.display = 'inline-block';
374
+
375
+ // Restore Chart Colors
376
+ if (myChart) {
377
+ myChart.options.plugins.legend.labels.color = '#94a3b8';
378
+
379
+ // Revert to animation default or none?
380
+ // Using 'none' to snap back instantly.
381
+ myChart.update('none');
382
+ }
383
+ }
384
+
385
+ async function runOptimization() {
386
+ const input = document.getElementById('userInput').value;
387
+ const loader = document.getElementById('loader');
388
+ const results = document.getElementById('results');
389
+
390
+ // UI Reset
391
+ results.style.display = 'none';
392
+ loader.style.display = 'block';
393
+
394
+     // 1. Simple Intent Parsing (Client-Side for Demo Speed)
+     const sectorKeywords = {
+         "Energy": ["energy", "oil", "gas"],
+         "Technology": ["technology", "tech", "software", "it"],
+         "Financials": ["financials", "finance", "banks"],
+         "Healthcare": ["healthcare", "health", "pharma"],
+         "Utilities": ["utilities", "utility"],
+         "Materials": ["materials", "mining"],
+         "Consumer Discretionary": ["consumer", "retail", "discretionary"], // Note: Amazon is here
+         "Real Estate": ["real estate", "reit"],
+         "Communication Services": ["communication", "media", "telecom"] // Google/Meta/Netflix
+     };
+
+     // Single-stock mapping (common FAANG+ names)
+     const stockKeywords = {
+         "AMZN": ["amazon"],
+         "AAPL": ["apple", "iphone"],
+         "MSFT": ["microsoft", "windows"],
+         "GOOGL": ["google", "alphabet"],
+         "META": ["meta", "facebook"],
+         "TSLA": ["tesla"],
+         "NVDA": ["nvidia", "chips"],
+         "NFLX": ["netflix"]
+     };
+
+     let excluded = [];
+     let excludedTickers = [];
+     const lowerInput = input.toLowerCase();
+
+     // Check sectors (any matched keyword excludes the whole sector)
+     for (const [sector, keywords] of Object.entries(sectorKeywords)) {
+         if (keywords.some(k => lowerInput.includes(k))) {
+             excluded.push(sector);
+         }
+     }
+
+     // Check tickers
+     for (const [ticker, keywords] of Object.entries(stockKeywords)) {
+         if (keywords.some(k => lowerInput.includes(k))) {
+             excludedTickers.push(ticker);
+         }
+     }
+
+     // If nothing matched, the request is simply sent with no exclusions.
+
+     const payload = {
+         "client_id": "Web_User",
+         "excluded_sectors": excluded,
+         "excluded_tickers": excludedTickers,
+         "initial_investment": 100000
+     };
+
+     try {
+         const response = await fetch('/optimize', {
+             method: 'POST',
+             headers: { 'Content-Type': 'application/json' },
+             body: JSON.stringify(payload)
+         });
+
+         const data = await response.json();
+
+         // Display results
+         const allExclusions = [...excluded, ...excludedTickers];
+         displayData(data, allExclusions);
+         loader.style.display = 'none';
+         results.style.display = 'block';
+
+     } catch (error) {
+         alert("Optimization Failed: " + error);
+         loader.style.display = 'none';
+     }
+ }
+
+ function displayData(data, excluded) {
+     // Metrics
+     document.getElementById('teMetric').innerText = (data.tracking_error * 100).toFixed(4) + "%";
+     document.getElementById('excludedMetric').innerText = excluded.length > 0 ? excluded.join(", ") : "None";
+
+     // AI text: minimal Markdown cleanup (**bold** -> <b>, newlines -> <br>)
+     let narrative = data.attribution_narrative || "No commentary generated.";
+     narrative = narrative.replace(/\*\*(.*?)\*\*/g, '<b>$1</b>').replace(/\n/g, '<br>');
+     document.getElementById('aiNarrative').innerHTML = narrative;
+
+     // Holdings list (top 15, sorted by weight)
+     const listObj = document.getElementById('holdingsList');
+     listObj.innerHTML = '';
+     const sorted = Object.entries(data.allocations).sort((a, b) => b[1] - a[1]).slice(0, 15);
+
+     sorted.forEach(([ticker, weight]) => {
+         const div = document.createElement('div');
+         div.className = 'holding-item';
+         div.innerHTML = `<span>${ticker}</span><span>${(weight * 100).toFixed(2)}%</span>`;
+         listObj.appendChild(div);
+     });
+
+     // Chart
+     renderChart(data.allocations);
+ }
+
+ let myChart = null;
+ function renderChart(allocations) {
+     const ctx = document.getElementById('allocationChart').getContext('2d');
+     if (myChart) myChart.destroy();
+
+     // Simplification: a production build would map Ticker -> Sector here;
+     // for the demo, show the top 5 tickers vs "Others".
+     const sorted = Object.entries(allocations).sort((a, b) => b[1] - a[1]);
+     const top5 = sorted.slice(0, 5);
+     const others = sorted.slice(5).reduce((acc, curr) => acc + curr[1], 0);
+
+     const labels = top5.map(x => x[0]).concat(["Others"]);
+     const data = top5.map(x => x[1]).concat([others]);
+
+     myChart = new Chart(ctx, {
+         type: 'doughnut',
+         data: {
+             labels: labels,
+             datasets: [{
+                 data: data,
+                 backgroundColor: ['#3b82f6', '#10b981', '#f59e0b', '#ef4444', '#8b5cf6', '#475569'],
+                 borderWidth: 0
+             }]
+         },
+         options: {
+             responsive: true,
+             plugins: {
+                 legend: { position: 'right', labels: { color: '#94a3b8' } }
+             }
+         }
+     });
+ }
+ </script>
+ </body>
+
+ </html>
config.py ADDED
@@ -0,0 +1,37 @@
+ import os
+ from typing import Optional
+ from pydantic import BaseModel, Field, SecretStr
+
+ class Settings(BaseModel):
+     """
+     Application configuration, loaded from environment variables via os.getenv.
+     """
+
+     # API Keys
+     HF_TOKEN: Optional[SecretStr] = Field(default_factory=lambda: SecretStr(os.getenv("HF_TOKEN", "")) if os.getenv("HF_TOKEN") else None, description="Hugging Face API Token")
+
+     # Data Configuration
+     DATA_CACHE_DIR: str = Field(default="./data_cache", description="Directory to store cached market data")
+     SECTOR_MAP_FILE: str = Field(default="./data/sector_map.json", description="Path to sector mapping cache")
+
+     # Optimization Defaults
+     MAX_WEIGHT: float = Field(default=0.05, description="Maximum weight for a single asset")
+     MIN_WEIGHT: float = Field(default=0.00, description="Minimum weight for a single asset")
+
+     # Universe
+     BENCHMARK_TICKER: str = Field(default="^GSPC", description="Benchmark Ticker (S&P 500)")
+
+     # System
+     LOG_LEVEL: str = Field(default="INFO", description="Logging level")
+
+
+ # Global settings instance
+ try:
+     settings = Settings()
+ except Exception as e:
+     print(f"WARNING: Settings failed to load ({e}); falling back to defaults without an HF token.")
+     # Without HF_TOKEN the AI narrative feature simply degrades gracefully later.
+     settings = Settings(HF_TOKEN=None)
core/__pycache__/schema.cpython-311.pyc ADDED
Binary file (6.08 kB). View file
 
core/__pycache__/schema.cpython-39.pyc ADDED
Binary file (4.01 kB). View file
 
core/schema.py ADDED
@@ -0,0 +1,97 @@
+ from typing import List, Dict, Optional
+ from pydantic import BaseModel, Field, validator
+ import pandas as pd
+ from datetime import date
+
+ class TickerData(BaseModel):
+     """
+     Represents a single stock's metadata and price history.
+     """
+     symbol: str
+     sector: str
+     price_history: Dict[str, float] = Field(default_factory=dict, description="Date (ISO) -> Adj Close Price")
+
+     @property
+     def latest_price(self) -> float:
+         if not self.price_history:
+             return 0.0
+         # Sort by date key and get last value
+         return self.price_history[sorted(self.price_history.keys())[-1]]
+
+ class OptimizationRequest(BaseModel):
+     """
+     User request for portfolio optimization.
+     """
+     client_id: str
+     initial_investment: float = 100000.0
+     excluded_sectors: List[str] = Field(default_factory=list, description="List of sectors to exclude (e.g., ['Energy'])")
+     excluded_tickers: List[str] = Field(default_factory=list, description="List of specific tickers to exclude (e.g., ['AMZN'])")
+     benchmark: str = "^GSPC"
+
+     class Config:
+         json_schema_extra = {
+             "example": {
+                 "client_id": "Demo_User_1",
+                 "initial_investment": 100000.0,
+                 "excluded_sectors": ["Energy"],
+                 "excluded_tickers": ["AMZN"],
+                 "benchmark": "^GSPC"
+             }
+         }
+
+ class OptimizationResult(BaseModel):
+     """
+     Output of the optimization engine.
+     """
+     weights: Dict[str, float] = Field(..., description="Ticker -> Optimal Weight")
+     tracking_error: float
+     status: str
+
+     @validator('weights')
+     def validate_weights(cls, v):
+         # Filter out near-zero weights for cleanliness
+         return {k: val for k, val in v.items() if val > 0.0001}
+
+ class TaxLot(BaseModel):
+     """
+     A specific purchase lot of a stock.
+     """
+     symbol: str
+     purchase_date: date
+     quantity: int
+     cost_basis_per_share: float
+     current_price: float
+
+     @property
+     def unrealized_pl(self) -> float:
+         return (self.current_price - self.cost_basis_per_share) * self.quantity
+
+     @property
+     def is_loss(self) -> bool:
+         return self.unrealized_pl < 0
+
+     @property
+     def loss_percentage(self) -> float:
+         if self.cost_basis_per_share == 0:
+             return 0.0
+         return (self.current_price - self.cost_basis_per_share) / self.cost_basis_per_share
+
+ class HarvestOpportunity(BaseModel):
+     """
+     A suggestion to harvest a tax loss.
+     """
+     sell_ticker: str
+     buy_proxy_ticker: str
+     quantity: int
+     estimated_loss_harvested: float
+     reason: str
+
+ class AttributionReport(BaseModel):
+     """
+     Brinson Attribution Data.
+     """
+     allocation_effect: float
+     selection_effect: float
+     total_active_return: float
+     top_contributors: List[str]
+     top_detractors: List[str]
+     narrative: str
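For intuition, the tax-lot arithmetic encoded by `TaxLot` reduces to two formulas: unrealized P/L is `(current_price - cost_basis_per_share) * quantity`, and the loss percentage is the signed return on cost basis. A dependency-free sketch of that arithmetic (plain functions for illustration, not the pydantic model itself):

```python
# Illustrative re-statement of the TaxLot P/L arithmetic from core/schema.py.
def unrealized_pl(cost_basis_per_share: float, current_price: float, quantity: int) -> float:
    """Unrealized profit/loss for a single purchase lot."""
    return (current_price - cost_basis_per_share) * quantity

def loss_percentage(cost_basis_per_share: float, current_price: float) -> float:
    """Signed return relative to cost basis; guards the zero-cost edge case."""
    if cost_basis_per_share == 0:
        return 0.0
    return (current_price - cost_basis_per_share) / cost_basis_per_share

# A lot of 10 shares bought at $150, now trading at $120: a $300 loss (-20%),
# which makes it a candidate HarvestOpportunity.
pl = unrealized_pl(150.0, 120.0, 10)   # -300.0
pct = loss_percentage(150.0, 120.0)    # -0.2
```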
data/__pycache__/data_manager.cpython-311.pyc ADDED
Binary file (10.6 kB). View file
 
data/__pycache__/data_manager.cpython-39.pyc ADDED
Binary file (5.49 kB). View file
 
data/__pycache__/optimizer.cpython-39.pyc ADDED
Binary file (4.1 kB). View file
 
data/data_manager.py ADDED
@@ -0,0 +1,152 @@
+ import yfinance as yf
+ import pandas as pd
+ import numpy as np
+ import json
+ import os
+ import logging
+ from typing import List, Dict, Optional
+ from core.schema import TickerData
+ from config import settings
+
+ logging.basicConfig(level=settings.LOG_LEVEL)
+ logger = logging.getLogger(__name__)
+
+ class SectorCache:
+     """
+     Manages a local cache of Ticker -> Sector mappings to avoid
+     yfinance API throttling and improve speed.
+     """
+     def __init__(self, cache_file: str = settings.SECTOR_MAP_FILE):
+         self.cache_file = cache_file
+         self.sector_map = self._load_cache()
+
+     def _load_cache(self) -> Dict[str, str]:
+         if os.path.exists(self.cache_file):
+             try:
+                 with open(self.cache_file, 'r') as f:
+                     return json.load(f)
+             except Exception as e:
+                 logger.error(f"Failed to load sector cache: {e}")
+                 return {}
+         return {}
+
+     def save_cache(self):
+         os.makedirs(os.path.dirname(self.cache_file), exist_ok=True)
+         with open(self.cache_file, 'w') as f:
+             json.dump(self.sector_map, f, indent=2)
+
+     def get_sector(self, ticker: str) -> Optional[str]:
+         return self.sector_map.get(ticker)
+
+     def update_sector(self, ticker: str, sector: str):
+         self.sector_map[ticker] = sector
+
+ class MarketDataEngine:
+     """
+     Handles robust data ingestion (static universe file, yfinance).
+     Implements data cleaning and validation policies.
+     """
+     def __init__(self):
+         self.sector_cache = SectorCache()
+
+     def fetch_sp500_tickers(self) -> List[str]:
+         """
+         Loads S&P 500 components from a static JSON file (Production Mode).
+         Eliminates the dependency on Wikipedia scraping.
+         """
+         try:
+             universe_file = os.path.join(os.path.dirname(__file__), 'sp500_universe.json')
+
+             # If the universe file is missing, fall back to the built-in list
+             if not os.path.exists(universe_file):
+                 logger.warning("Universe file not found. Using fallback.")
+                 return self._get_fallback_tickers()
+
+             with open(universe_file, 'r') as f:
+                 universe_data = json.load(f)
+
+             tickers = []
+             for item in universe_data:
+                 ticker = item['ticker']
+                 sector = item['sector']
+                 tickers.append(ticker)
+                 self.sector_cache.update_sector(ticker, sector)
+
+             self.sector_cache.save_cache()
+             logger.info(f"Successfully loaded {len(tickers)} tickers from static universe.")
+             return tickers
+
+         except Exception as e:
+             logger.error(f"Error loading universe: {e}")
+             return self._get_fallback_tickers()
+
+     def _get_fallback_tickers(self) -> List[str]:
+         # Fallback for demo reliability
+         fallback_map = {
+             "AAPL": "Information Technology", "MSFT": "Information Technology", "GOOGL": "Communication Services",
+             "AMZN": "Consumer Discretionary", "NVDA": "Information Technology", "META": "Communication Services",
+             "TSLA": "Consumer Discretionary", "BRK-B": "Financials", "V": "Financials", "UNH": "Health Care",
+             "XOM": "Energy", "JNJ": "Health Care", "JPM": "Financials", "PG": "Consumer Staples",
+             "LLY": "Health Care", "MA": "Financials", "CVX": "Energy", "MRK": "Health Care",
+             "HD": "Consumer Discretionary", "PEP": "Consumer Staples", "COST": "Consumer Staples"
+         }
+         for t, s in fallback_map.items():
+             self.sector_cache.update_sector(t, s)
+         return list(fallback_map.keys())
+
+     def fetch_market_data(self, tickers: List[str], start_date: str = "2023-01-01") -> pd.DataFrame:
+         """
+         Fetches adjusted close prices for a list of tickers.
+         """
+         if not tickers:
+             logger.warning("No tickers provided to fetch.")
+             return pd.DataFrame()
+
+         logger.info(f"Downloading data for {len(tickers)} tickers from {start_date}...")
+         # yf.download returns auto-adjusted 'Close' prices by default in recent versions
+         data = yf.download(tickers, start=start_date, progress=False)
+
+         if data.empty:
+             logger.error("No data fetched from yfinance.")
+             return pd.DataFrame()
+
+         # Handle MultiIndex columns (Price, Ticker)
+         if hasattr(data.columns, 'levels') and 'Close' in data.columns.levels[0]:
+             data = data['Close']
+         elif 'Close' in data.columns:
+             data = data['Close']
+         elif 'Adj Close' in data.columns:
+             data = data['Adj Close']
+         else:
+             # Last-resort fallback: assume the first len(tickers) columns are prices
+             logger.warning("Could not find Close/Adj Close. Using first level.")
+             data = data.iloc[:, :len(tickers)]
+
+         return self._clean_data(data)
+
+     def _clean_data(self, df: pd.DataFrame) -> pd.DataFrame:
+         """
+         Applies data quality rules:
+         1. Drop columns with > 10% missing data.
+         2. Forward fill, then backward fill, remaining NaNs.
+         """
+         initial_count = len(df.columns)
+
+         # Rule 1: Drop > 10% missing
+         missing_frac = df.isnull().mean()
+         drop_cols = missing_frac[missing_frac > 0.10].index.tolist()
+         df_clean = df.drop(columns=drop_cols)
+
+         dropped_count = len(drop_cols)
+         if dropped_count > 0:
+             logger.warning(f"Dropped {dropped_count} tickers due to >10% missing data: {drop_cols[:5]}...")
+
+         # Rule 2: Fill NaNs
+         df_clean = df_clean.ffill().bfill()
+
+         logger.info(f"Data cleaning complete. Retained {len(df_clean.columns)}/{initial_count} tickers.")
+         return df_clean
+
+     def get_sector_map(self) -> Dict[str, str]:
+         return self.sector_cache.sector_map
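The `_clean_data` policy (drop any column with more than 10% missing observations, then forward/backward fill the rest) can be exercised in isolation. A small self-contained sketch assuming only pandas, with hypothetical ticker columns:

```python
import numpy as np
import pandas as pd

def clean_prices(df: pd.DataFrame, max_missing: float = 0.10) -> pd.DataFrame:
    """Drop tickers with too many gaps, then ffill/bfill what remains."""
    keep = df.columns[df.isnull().mean() <= max_missing]
    return df[keep].ffill().bfill()

prices = pd.DataFrame({
    # 1 gap out of 10 observations (10% missing): kept, gap forward-filled
    "AAPL": [100.0, np.nan, 102.0, 103.0, 104.0, 105.0, 106.0, 107.0, 108.0, 109.0],
    # 2 gaps out of 10 (20% missing): exceeds the threshold, dropped
    "GHOST": [50.0, np.nan, np.nan, 53.0, 54.0, 55.0, 56.0, 57.0, 58.0, 59.0],
})
cleaned = clean_prices(prices)
# cleaned has a single column "AAPL" with no NaNs; the gap carries 100.0 forward
```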
data/optimizer.py ADDED
@@ -0,0 +1,160 @@
+ import cvxpy as cp
+ import pandas as pd
+ import numpy as np
+ import logging
+ from typing import List, Dict, Optional
+ from core.schema import OptimizationResult
+ from config import settings
+
+ logger = logging.getLogger(__name__)
+
+ class PortfolioOptimizer:
+     """
+     Quantitative optimization engine using CVXPY.
+     Objective: minimize tracking error against a benchmark.
+     Constraints:
+         1. Full investment (sum w = 1)
+         2. Long only (w >= 0)
+         3. Sector/ticker exclusions (w[excluded] = 0)
+     """
+
+     def __init__(self):
+         pass
+
+     def optimize_portfolio(self,
+                            covariance_matrix: pd.DataFrame,
+                            tickers: List[str],
+                            benchmark_weights: pd.DataFrame,
+                            sector_map: Dict[str, str],
+                            excluded_sectors: List[str],
+                            excluded_tickers: Optional[List[str]] = None) -> OptimizationResult:
+         """
+         Solves the tracking error minimization problem.
+
+         Args:
+             covariance_matrix: (N x N) Ledoit-Wolf shrunk covariance matrix.
+             tickers: List of N tickers.
+             benchmark_weights: (N x 1) weights of the benchmark (e.g. S&P 500).
+                 Un-held assets should have 0 weight.
+             sector_map: Dictionary mapping ticker -> sector.
+             excluded_sectors: List of sectors to exclude.
+             excluded_tickers: List of specific tickers to exclude.
+
+         Returns:
+             OptimizationResult containing weights and status.
+         """
+         excluded_tickers = excluded_tickers or []
+         n_assets = len(tickers)
+         if covariance_matrix.shape != (n_assets, n_assets):
+             raise ValueError(f"Covariance matrix shape {covariance_matrix.shape} does not match tickers count {n_assets}")
+
+         logger.info(f"Setting up CVXPY optimization for {n_assets} assets...")
+
+         # Variables
+         w = cp.Variable(n_assets)
+
+         # Benchmark weights vector (aligned to tickers)
+         if isinstance(benchmark_weights, (pd.Series, pd.DataFrame)):
+             w_b = benchmark_weights.reindex(tickers).fillna(0).values.flatten()
+         else:
+             w_b = np.array(benchmark_weights)
+
+         # Objective: minimize active-weight variance
+         active_weights = w - w_b
+         tracking_error_variance = cp.quad_form(active_weights, covariance_matrix.values)
+         objective = cp.Minimize(tracking_error_variance)
+
+         # 1. Identify exclusions first so the weight constraints can adapt
+         excluded_indices = []
+         mask_vector = np.zeros(n_assets)
+
+         # Sector exclusions
+         if excluded_sectors:
+             logger.info(f"Applying sector exclusions: {excluded_sectors}")
+             for i, ticker in enumerate(tickers):
+                 sector = sector_map.get(ticker, "Unknown")
+                 for excl in excluded_sectors:
+                     if excl.lower() == sector.lower() or (excl == "Technology" and sector == "Information Technology"):
+                         excluded_indices.append(i)
+                         mask_vector[i] = 1
+
+         # Ticker exclusions
+         if excluded_tickers:
+             logger.info(f"Applying ticker exclusions: {excluded_tickers}")
+             for i, ticker in enumerate(tickers):
+                 if ticker in excluded_tickers:
+                     excluded_indices.append(i)
+                     mask_vector[i] = 1
+
+         excluded_indices = list(set(excluded_indices))  # Dedupe
+
+         logger.debug(f"Excluded mask covers {int(mask_vector.sum())} assets out of {n_assets}")
+
+         if len(excluded_indices) == n_assets:
+             raise ValueError("All assets excluded! Cannot optimize.")
+
+         # 2. Dynamic constraints
+         n_active = n_assets - len(excluded_indices)
+         if n_active == 0:
+             n_active = 1
+
+         min_avg_weight = 1.0 / n_active
+         dynamic_max = max(0.20, min_avg_weight * 1.5)
+
+         MAX_WEIGHT_LIMIT = dynamic_max
+         logger.debug(f"Active assets={n_active}, min avg={min_avg_weight:.4f}, dynamic max limit={MAX_WEIGHT_LIMIT:.4f}")
+
+         constraints = [
+             cp.sum(w) == 1,
+             w >= 0,
+             w <= MAX_WEIGHT_LIMIT
+         ]
+
+         # Apply exclusions
+         if excluded_indices:
+             constraints.append(w[excluded_indices] == 0)
+
+         # Problem
+         prob = cp.Problem(objective, constraints)
+
+         try:
+             logger.info("Solving quadratic programming problem...")
+             # verbose=True to surface solver output in the logs
+             prob.solve(verbose=True)
+         except Exception as e:
+             logger.error(f"Optimization crashed: {e}")
+             raise
+
+         # Check solver status
+         if prob.status not in [cp.OPTIMAL, cp.OPTIMAL_INACCURATE]:
+             logger.error(f"Optimization failed with status: {prob.status}")
+             raise ValueError(f"Solver failed: {prob.status}")
+
+         # Extract weights
+         optimal_weights = w.value
+         if optimal_weights is None:
+             raise ValueError("Solver returned None for weights.")
+
+         # Zero out solver noise below 1 bp
+         optimal_weights[optimal_weights < 1e-4] = 0
+
+         # Format result
+         weight_dict = {
+             tickers[i]: float(optimal_weights[i])
+             for i in range(n_assets)
+             if optimal_weights[i] > 0
+         }
+
+         # Tracking error = volatility of active returns = sqrt(active variance)
+         te = np.sqrt(prob.value) if prob.value > 0 else 0.0
+
+         logger.info(f"Optimization solved. Tracking Error: {te:.4f}")
+
+         return OptimizationResult(
+             weights=weight_dict,
+             tracking_error=te,
+             status=prob.status
+         )
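The quantity the solver minimizes is active-weight variance; the reported tracking error is its square root, TE = sqrt((w - w_b)' Σ (w - w_b)). A hand-rolled NumPy check on a toy two-asset covariance (illustrative numbers, not the engine's data):

```python
import numpy as np

def tracking_error(w: np.ndarray, w_b: np.ndarray, cov: np.ndarray) -> float:
    """Volatility of active returns for portfolio w vs benchmark w_b."""
    active = w - w_b
    return float(np.sqrt(active @ cov @ active))

cov = np.array([[0.04, 0.00],
                [0.00, 0.01]])   # uncorrelated assets with 20% / 10% vol
w_b = np.array([0.5, 0.5])       # benchmark
w   = np.array([0.6, 0.4])       # portfolio tilted 10% toward asset 1

te = tracking_error(w, w_b, cov)
# active = [0.1, -0.1]; variance = 0.1**2 * 0.04 + 0.1**2 * 0.01 = 0.0005
```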
data/sector_map.json ADDED
@@ -0,0 +1,505 @@
+ {
+   "MMM": "Industrials",
+   "AOS": "Industrials",
+   "ABT": "Health Care",
+   "ABBV": "Health Care",
+   "ACN": "Information Technology",
+   "ADBE": "Information Technology",
+   "AMD": "Information Technology",
+   "AES": "Utilities",
+   "AFL": "Financials",
+   "A": "Health Care",
+   "APD": "Materials",
+   "ABNB": "Consumer Discretionary",
+   "AKAM": "Information Technology",
+   "ALB": "Materials",
+   "ARE": "Real Estate",
+   "ALGN": "Health Care",
+   "ALLE": "Industrials",
+   "LNT": "Utilities",
+   "ALL": "Financials",
+   "GOOGL": "Communication Services",
+   "GOOG": "Communication Services",
+   "MO": "Consumer Staples",
+   "AMZN": "Consumer Discretionary",
+   "AMCR": "Materials",
+   "AEE": "Utilities",
+   "AEP": "Utilities",
+   "AXP": "Financials",
+   "AIG": "Financials",
+   "AMT": "Real Estate",
+   "AWK": "Utilities",
+   "AMP": "Financials",
+   "AME": "Industrials",
+   "AMGN": "Health Care",
+   "APH": "Information Technology",
+   "ADI": "Information Technology",
+   "AON": "Financials",
+   "APA": "Energy",
+   "APO": "Financials",
+   "AAPL": "Information Technology",
+   "AMAT": "Information Technology",
+   "APP": "Information Technology",
+   "APTV": "Consumer Discretionary",
+   "ACGL": "Financials",
+   "ADM": "Consumer Staples",
+   "ARES": "Financials",
+   "ANET": "Information Technology",
+   "AJG": "Financials",
+   "AIZ": "Financials",
+   "T": "Communication Services",
+   "ATO": "Utilities",
+   "ADSK": "Information Technology",
+   "ADP": "Industrials",
+   "AZO": "Consumer Discretionary",
+   "AVB": "Real Estate",
+   "AVY": "Materials",
+   "AXON": "Industrials",
+   "BKR": "Energy",
+   "BALL": "Materials",
+   "BAC": "Financials",
+   "BAX": "Health Care",
+   "BDX": "Health Care",
+   "BRK-B": "Financials",
+   "BBY": "Consumer Discretionary",
+   "TECH": "Health Care",
+   "BIIB": "Health Care",
+   "BLK": "Financials",
+   "BX": "Financials",
+   "XYZ": "Financials",
+   "BK": "Financials",
+   "BA": "Industrials",
+   "BKNG": "Consumer Discretionary",
+   "BSX": "Health Care",
+   "BMY": "Health Care",
+   "AVGO": "Information Technology",
+   "BR": "Industrials",
+   "BRO": "Financials",
+   "BF-B": "Consumer Staples",
+   "BLDR": "Industrials",
+   "BG": "Consumer Staples",
+   "BXP": "Real Estate",
+   "CHRW": "Industrials",
+   "CDNS": "Information Technology",
+   "CPT": "Real Estate",
+   "CPB": "Consumer Staples",
+   "COF": "Financials",
+   "CAH": "Health Care",
+   "CCL": "Consumer Discretionary",
+   "CARR": "Industrials",
+   "CVNA": "Consumer Discretionary",
+   "CAT": "Industrials",
+   "CBOE": "Financials",
+   "CBRE": "Real Estate",
+   "CDW": "Information Technology",
+   "COR": "Health Care",
+   "CNC": "Health Care",
+   "CNP": "Utilities",
+   "CF": "Materials",
+   "CRL": "Health Care",
+   "SCHW": "Financials",
+   "CHTR": "Communication Services",
+   "CVX": "Energy",
+   "CMG": "Consumer Discretionary",
+   "CB": "Financials",
+   "CHD": "Consumer Staples",
+   "CI": "Health Care",
+   "CINF": "Financials",
+   "CTAS": "Industrials",
+   "CSCO": "Information Technology",
+   "C": "Financials",
+   "CFG": "Financials",
+   "CLX": "Consumer Staples",
+   "CME": "Financials",
+   "CMS": "Utilities",
+   "KO": "Consumer Staples",
+   "CTSH": "Information Technology",
+   "COIN": "Financials",
+   "CL": "Consumer Staples",
+   "CMCSA": "Communication Services",
+   "FIX": "Industrials",
+   "CAG": "Consumer Staples",
+   "COP": "Energy",
+   "ED": "Utilities",
+   "STZ": "Consumer Staples",
+   "CEG": "Utilities",
+   "COO": "Health Care",
+   "CPRT": "Industrials",
+   "GLW": "Information Technology",
+   "CPAY": "Financials",
+   "CTVA": "Materials",
+   "CSGP": "Real Estate",
+   "COST": "Consumer Staples",
+   "CTRA": "Energy",
+   "CRH": "Materials",
+   "CRWD": "Information Technology",
+   "CCI": "Real Estate",
+   "CSX": "Industrials",
+   "CMI": "Industrials",
+   "CVS": "Health Care",
+   "DHR": "Health Care",
+   "DRI": "Consumer Discretionary",
+   "DDOG": "Information Technology",
+   "DVA": "Health Care",
+   "DAY": "Industrials",
+   "DECK": "Consumer Discretionary",
+   "DE": "Industrials",
+   "DELL": "Information Technology",
+   "DAL": "Industrials",
+   "DVN": "Energy",
+   "DXCM": "Health Care",
+   "FANG": "Energy",
+   "DLR": "Real Estate",
+   "DG": "Consumer Staples",
+   "DLTR": "Consumer Staples",
+   "D": "Utilities",
+   "DPZ": "Consumer Discretionary",
+   "DASH": "Consumer Discretionary",
+   "DOV": "Industrials",
+   "DOW": "Materials",
+   "DHI": "Consumer Discretionary",
+   "DTE": "Utilities",
+   "DUK": "Utilities",
+   "DD": "Materials",
+   "ETN": "Industrials",
+   "EBAY": "Consumer Discretionary",
+   "ECL": "Materials",
+   "EIX": "Utilities",
+   "EW": "Health Care",
+   "EA": "Communication Services",
+   "ELV": "Health Care",
+   "EME": "Industrials",
+   "EMR": "Industrials",
+   "ETR": "Utilities",
+   "EOG": "Energy",
+   "EPAM": "Information Technology",
+   "EQT": "Energy",
+   "EFX": "Industrials",
+   "EQIX": "Real Estate",
+   "EQR": "Real Estate",
+   "ERIE": "Financials",
+   "ESS": "Real Estate",
+   "EL": "Consumer Staples",
+   "EG": "Financials",
+   "EVRG": "Utilities",
+   "ES": "Utilities",
+   "EXC": "Utilities",
+   "EXE": "Energy",
+   "EXPE": "Consumer Discretionary",
+   "EXPD": "Industrials",
+   "EXR": "Real Estate",
+   "XOM": "Energy",
+   "FFIV": "Information Technology",
+   "FDS": "Financials",
+   "FICO": "Information Technology",
+   "FAST": "Industrials",
+   "FRT": "Real Estate",
+   "FDX": "Industrials",
+   "FIS": "Financials",
+   "FITB": "Financials",
+   "FSLR": "Information Technology",
+   "FE": "Utilities",
+   "FISV": "Financials",
+   "F": "Consumer Discretionary",
+   "FTNT": "Information Technology",
+   "FTV": "Industrials",
+   "FOXA": "Communication Services",
+   "FOX": "Communication Services",
+   "BEN": "Financials",
+   "FCX": "Materials",
+   "GRMN": "Consumer Discretionary",
+   "IT": "Information Technology",
+   "GE": "Industrials",
+   "GEHC": "Health Care",
+   "GEV": "Industrials",
+   "GEN": "Information Technology",
+   "GNRC": "Industrials",
+   "GD": "Industrials",
+   "GIS": "Consumer Staples",
+   "GM": "Consumer Discretionary",
+   "GPC": "Consumer Discretionary",
+   "GILD": "Health Care",
+   "GPN": "Financials",
+   "GL": "Financials",
+   "GDDY": "Information Technology",
+   "GS": "Financials",
+   "HAL": "Energy",
+   "HIG": "Financials",
+   "HAS": "Consumer Discretionary",
+   "HCA": "Health Care",
+   "DOC": "Real Estate",
+   "HSIC": "Health Care",
+   "HSY": "Consumer Staples",
+   "HPE": "Information Technology",
+   "HLT": "Consumer Discretionary",
+   "HOLX": "Health Care",
+   "HD": "Consumer Discretionary",
+   "HON": "Industrials",
+   "HRL": "Consumer Staples",
+   "HST": "Real Estate",
+   "HWM": "Industrials",
+   "HPQ": "Information Technology",
+   "HUBB": "Industrials",
+   "HUM": "Health Care",
+   "HBAN": "Financials",
+   "HII": "Industrials",
+   "IBM": "Information Technology",
+   "IEX": "Industrials",
+   "IDXX": "Health Care",
+   "ITW": "Industrials",
+   "INCY": "Health Care",
+   "IR": "Industrials",
+   "PODD": "Health Care",
+   "INTC": "Information Technology",
+   "IBKR": "Financials",
+   "ICE": "Financials",
+   "IFF": "Materials",
+   "IP": "Materials",
+   "INTU": "Information Technology",
+   "ISRG": "Health Care",
+   "IVZ": "Financials",
+   "INVH": "Real Estate",
+   "IQV": "Health Care",
+   "IRM": "Real Estate",
+   "JBHT": "Industrials",
+   "JBL": "Information Technology",
+   "JKHY": "Financials",
+   "J": "Industrials",
+   "JNJ": "Health Care",
+   "JCI": "Industrials",
+   "JPM": "Financials",
+   "KVUE": "Consumer Staples",
+   "KDP": "Consumer Staples",
+   "KEY": "Financials",
+ "KEYS": "Information Technology",
275
+ "KMB": "Consumer Staples",
276
+ "KIM": "Real Estate",
277
+ "KMI": "Energy",
278
+ "KKR": "Financials",
279
+ "KLAC": "Information Technology",
280
+ "KHC": "Consumer Staples",
281
+ "KR": "Consumer Staples",
282
+ "LHX": "Industrials",
283
+ "LH": "Health Care",
284
+ "LRCX": "Information Technology",
285
+ "LW": "Consumer Staples",
286
+ "LVS": "Consumer Discretionary",
287
+ "LDOS": "Industrials",
288
+ "LEN": "Consumer Discretionary",
289
+ "LII": "Industrials",
290
+ "LLY": "Health Care",
291
+ "LIN": "Materials",
292
+ "LYV": "Communication Services",
293
+ "LMT": "Industrials",
294
+ "L": "Financials",
295
+ "LOW": "Consumer Discretionary",
296
+ "LULU": "Consumer Discretionary",
297
+ "LYB": "Materials",
298
+ "MTB": "Financials",
299
+ "MPC": "Energy",
300
+ "MAR": "Consumer Discretionary",
301
+ "MRSH": "Financials",
302
+ "MLM": "Materials",
303
+ "MAS": "Industrials",
304
+ "MA": "Financials",
305
+ "MTCH": "Communication Services",
306
+ "MKC": "Consumer Staples",
307
+ "MCD": "Consumer Discretionary",
308
+ "MCK": "Health Care",
309
+ "MDT": "Health Care",
310
+ "MRK": "Health Care",
311
+ "META": "Communication Services",
312
+ "MET": "Financials",
313
+ "MTD": "Health Care",
314
+ "MGM": "Consumer Discretionary",
315
+ "MCHP": "Information Technology",
316
+ "MU": "Information Technology",
317
+ "MSFT": "Information Technology",
318
+ "MAA": "Real Estate",
319
+ "MRNA": "Health Care",
320
+ "MOH": "Health Care",
321
+ "TAP": "Consumer Staples",
322
+ "MDLZ": "Consumer Staples",
323
+ "MPWR": "Information Technology",
324
+ "MNST": "Consumer Staples",
325
+ "MCO": "Financials",
326
+ "MS": "Financials",
327
+ "MOS": "Materials",
328
+ "MSI": "Information Technology",
329
+ "MSCI": "Financials",
330
+ "NDAQ": "Financials",
331
+ "NTAP": "Information Technology",
332
+ "NFLX": "Communication Services",
333
+ "NEM": "Materials",
334
+ "NWSA": "Communication Services",
335
+ "NWS": "Communication Services",
336
+ "NEE": "Utilities",
337
+ "NKE": "Consumer Discretionary",
338
+ "NI": "Utilities",
339
+ "NDSN": "Industrials",
340
+ "NSC": "Industrials",
341
+ "NTRS": "Financials",
342
+ "NOC": "Industrials",
343
+ "NCLH": "Consumer Discretionary",
344
+ "NRG": "Utilities",
345
+ "NUE": "Materials",
346
+ "NVDA": "Information Technology",
347
+ "NVR": "Consumer Discretionary",
348
+ "NXPI": "Information Technology",
349
+ "ORLY": "Consumer Discretionary",
350
+ "OXY": "Energy",
351
+ "ODFL": "Industrials",
352
+ "OMC": "Communication Services",
353
+ "ON": "Information Technology",
354
+ "OKE": "Energy",
355
+ "ORCL": "Information Technology",
356
+ "OTIS": "Industrials",
357
+ "PCAR": "Industrials",
358
+ "PKG": "Materials",
359
+ "PLTR": "Information Technology",
360
+ "PANW": "Information Technology",
361
+ "PSKY": "Communication Services",
362
+ "PH": "Industrials",
363
+ "PAYX": "Industrials",
364
+ "PAYC": "Industrials",
365
+ "PYPL": "Financials",
366
+ "PNR": "Industrials",
367
+ "PEP": "Consumer Staples",
368
+ "PFE": "Health Care",
369
+ "PCG": "Utilities",
370
+ "PM": "Consumer Staples",
371
+ "PSX": "Energy",
372
+ "PNW": "Utilities",
373
+ "PNC": "Financials",
374
+ "POOL": "Consumer Discretionary",
375
+ "PPG": "Materials",
376
+ "PPL": "Utilities",
377
+ "PFG": "Financials",
378
+ "PG": "Consumer Staples",
379
+ "PGR": "Financials",
380
+ "PLD": "Real Estate",
381
+ "PRU": "Financials",
382
+ "PEG": "Utilities",
383
+ "PTC": "Information Technology",
384
+ "PSA": "Real Estate",
385
+ "PHM": "Consumer Discretionary",
386
+ "PWR": "Industrials",
387
+ "QCOM": "Information Technology",
388
+ "DGX": "Health Care",
389
+ "Q": "Information Technology",
390
+ "RL": "Consumer Discretionary",
391
+ "RJF": "Financials",
392
+ "RTX": "Industrials",
393
+ "O": "Real Estate",
394
+ "REG": "Real Estate",
395
+ "REGN": "Health Care",
396
+ "RF": "Financials",
397
+ "RSG": "Industrials",
398
+ "RMD": "Health Care",
399
+ "RVTY": "Health Care",
400
+ "HOOD": "Financials",
401
+ "ROK": "Industrials",
402
+ "ROL": "Industrials",
403
+ "ROP": "Information Technology",
404
+ "ROST": "Consumer Discretionary",
405
+ "RCL": "Consumer Discretionary",
406
+ "SPGI": "Financials",
407
+ "CRM": "Information Technology",
408
+ "SNDK": "Information Technology",
409
+ "SBAC": "Real Estate",
410
+ "SLB": "Energy",
411
+ "STX": "Information Technology",
412
+ "SRE": "Utilities",
413
+ "NOW": "Information Technology",
414
+ "SHW": "Materials",
415
+ "SPG": "Real Estate",
416
+ "SWKS": "Information Technology",
417
+ "SJM": "Consumer Staples",
418
+ "SW": "Materials",
419
+ "SNA": "Industrials",
420
+ "SOLV": "Health Care",
421
+ "SO": "Utilities",
422
+ "LUV": "Industrials",
423
+ "SWK": "Industrials",
424
+ "SBUX": "Consumer Discretionary",
425
+ "STT": "Financials",
426
+ "STLD": "Materials",
427
+ "STE": "Health Care",
428
+ "SYK": "Health Care",
429
+ "SMCI": "Information Technology",
430
+ "SYF": "Financials",
431
+ "SNPS": "Information Technology",
432
+ "SYY": "Consumer Staples",
433
+ "TMUS": "Communication Services",
434
+ "TROW": "Financials",
435
+ "TTWO": "Communication Services",
436
+ "TPR": "Consumer Discretionary",
437
+ "TRGP": "Energy",
438
+ "TGT": "Consumer Staples",
439
+ "TEL": "Information Technology",
440
+ "TDY": "Information Technology",
441
+ "TER": "Information Technology",
442
+ "TSLA": "Consumer Discretionary",
443
+ "TXN": "Information Technology",
444
+ "TPL": "Energy",
445
+ "TXT": "Industrials",
446
+ "TMO": "Health Care",
447
+ "TJX": "Consumer Discretionary",
448
+ "TKO": "Communication Services",
449
+ "TTD": "Communication Services",
450
+ "TSCO": "Consumer Discretionary",
451
+ "TT": "Industrials",
452
+ "TDG": "Industrials",
453
+ "TRV": "Financials",
454
+ "TRMB": "Information Technology",
455
+ "TFC": "Financials",
456
+ "TYL": "Information Technology",
457
+ "TSN": "Consumer Staples",
458
+ "USB": "Financials",
459
+ "UBER": "Industrials",
460
+ "UDR": "Real Estate",
461
+ "ULTA": "Consumer Discretionary",
462
+ "UNP": "Industrials",
463
+ "UAL": "Industrials",
464
+ "UPS": "Industrials",
465
+ "URI": "Industrials",
466
+ "UNH": "Health Care",
467
+ "UHS": "Health Care",
468
+ "VLO": "Energy",
469
+ "VTR": "Real Estate",
470
+ "VLTO": "Industrials",
471
+ "VRSN": "Information Technology",
472
+ "VRSK": "Industrials",
473
+ "VZ": "Communication Services",
474
+ "VRTX": "Health Care",
475
+ "VTRS": "Health Care",
476
+ "VICI": "Real Estate",
477
+ "V": "Financials",
478
+ "VST": "Utilities",
479
+ "VMC": "Materials",
480
+ "WRB": "Financials",
481
+ "GWW": "Industrials",
482
+ "WAB": "Industrials",
483
+ "WMT": "Consumer Staples",
484
+ "DIS": "Communication Services",
485
+ "WBD": "Communication Services",
486
+ "WM": "Industrials",
487
+ "WAT": "Health Care",
488
+ "WEC": "Utilities",
489
+ "WFC": "Financials",
490
+ "WELL": "Real Estate",
491
+ "WST": "Health Care",
492
+ "WDC": "Information Technology",
493
+ "WY": "Real Estate",
494
+ "WSM": "Consumer Discretionary",
495
+ "WMB": "Energy",
496
+ "WTW": "Financials",
497
+ "WDAY": "Information Technology",
498
+ "WYNN": "Consumer Discretionary",
499
+ "XEL": "Utilities",
500
+ "XYL": "Industrials",
501
+ "YUM": "Consumer Discretionary",
502
+ "ZBRA": "Information Technology",
503
+ "ZBH": "Health Care",
504
+ "ZTS": "Health Care"
505
+ }
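The ticker-to-GICS-sector map above is what the optimizer consumes to turn a sector exclusion into a set of tickers to zero out. A minimal sketch of that lookup, on a small inline slice of the map (the helper name `excluded_tickers` is illustrative, not the repo's actual API):

```python
# A small slice of the ticker -> GICS sector map shown above
sector_map = {
    "DVN": "Energy",
    "FANG": "Energy",
    "DLR": "Real Estate",
    "DDOG": "Information Technology",
}

def excluded_tickers(sector_map, excluded_sectors):
    """Map a list of excluded sectors to the tickers that must be zero-weighted."""
    return {t for t, s in sector_map.items() if s in excluded_sectors}

print(sorted(excluded_tickers(sector_map, ["Energy"])))  # ['DVN', 'FANG']
```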
data/sp500_universe.json ADDED
@@ -0,0 +1,266 @@
+ [
+ {
+ "ticker": "AAPL",
+ "sector": "Information Technology"
+ },
+ {
+ "ticker": "MSFT",
+ "sector": "Information Technology"
+ },
+ {
+ "ticker": "NVDA",
+ "sector": "Information Technology"
+ },
+ {
+ "ticker": "AVGO",
+ "sector": "Information Technology"
+ },
+ {
+ "ticker": "ADBE",
+ "sector": "Information Technology"
+ },
+ {
+ "ticker": "CRM",
+ "sector": "Information Technology"
+ },
+ {
+ "ticker": "CSCO",
+ "sector": "Information Technology"
+ },
+ {
+ "ticker": "AMD",
+ "sector": "Information Technology"
+ },
+ {
+ "ticker": "INTC",
+ "sector": "Information Technology"
+ },
+ {
+ "ticker": "AMZN",
+ "sector": "Consumer Discretionary"
+ },
+ {
+ "ticker": "TSLA",
+ "sector": "Consumer Discretionary"
+ },
+ {
+ "ticker": "HD",
+ "sector": "Consumer Discretionary"
+ },
+ {
+ "ticker": "MCD",
+ "sector": "Consumer Discretionary"
+ },
+ {
+ "ticker": "NKE",
+ "sector": "Consumer Discretionary"
+ },
+ {
+ "ticker": "LOW",
+ "sector": "Consumer Discretionary"
+ },
+ {
+ "ticker": "SBUX",
+ "sector": "Consumer Discretionary"
+ },
+ {
+ "ticker": "GOOGL",
+ "sector": "Communication Services"
+ },
+ {
+ "ticker": "GOOG",
+ "sector": "Communication Services"
+ },
+ {
+ "ticker": "META",
+ "sector": "Communication Services"
+ },
+ {
+ "ticker": "NFLX",
+ "sector": "Communication Services"
+ },
+ {
+ "ticker": "DIS",
+ "sector": "Communication Services"
+ },
+ {
+ "ticker": "CMCSA",
+ "sector": "Communication Services"
+ },
+ {
+ "ticker": "VZ",
+ "sector": "Communication Services"
+ },
+ {
+ "ticker": "T",
+ "sector": "Communication Services"
+ },
+ {
+ "ticker": "BRK-B",
+ "sector": "Financials"
+ },
+ {
+ "ticker": "JPM",
+ "sector": "Financials"
+ },
+ {
+ "ticker": "V",
+ "sector": "Financials"
+ },
+ {
+ "ticker": "MA",
+ "sector": "Financials"
+ },
+ {
+ "ticker": "BAC",
+ "sector": "Financials"
+ },
+ {
+ "ticker": "WFC",
+ "sector": "Financials"
+ },
+ {
+ "ticker": "MS",
+ "sector": "Financials"
+ },
+ {
+ "ticker": "GS",
+ "sector": "Financials"
+ },
+ {
+ "ticker": "BLK",
+ "sector": "Financials"
+ },
+ {
+ "ticker": "UNH",
+ "sector": "Health Care"
+ },
+ {
+ "ticker": "LLY",
+ "sector": "Health Care"
+ },
+ {
+ "ticker": "JNJ",
+ "sector": "Health Care"
+ },
+ {
+ "ticker": "MRK",
+ "sector": "Health Care"
+ },
+ {
+ "ticker": "ABBV",
+ "sector": "Health Care"
+ },
+ {
+ "ticker": "PFE",
+ "sector": "Health Care"
+ },
+ {
+ "ticker": "AMGN",
+ "sector": "Health Care"
+ },
+ {
+ "ticker": "TMO",
+ "sector": "Health Care"
+ },
+ {
+ "ticker": "PG",
+ "sector": "Consumer Staples"
+ },
+ {
+ "ticker": "COST",
+ "sector": "Consumer Staples"
+ },
+ {
+ "ticker": "PEP",
+ "sector": "Consumer Staples"
+ },
+ {
+ "ticker": "KO",
+ "sector": "Consumer Staples"
+ },
+ {
+ "ticker": "WMT",
+ "sector": "Consumer Staples"
+ },
+ {
+ "ticker": "PM",
+ "sector": "Consumer Staples"
+ },
+ {
+ "ticker": "XOM",
+ "sector": "Energy"
+ },
+ {
+ "ticker": "CVX",
+ "sector": "Energy"
+ },
+ {
+ "ticker": "COP",
+ "sector": "Energy"
+ },
+ {
+ "ticker": "SLB",
+ "sector": "Energy"
+ },
+ {
+ "ticker": "EOG",
+ "sector": "Energy"
+ },
+ {
+ "ticker": "MPC",
+ "sector": "Energy"
+ },
+ {
+ "ticker": "LIN",
+ "sector": "Materials"
+ },
+ {
+ "ticker": "SHW",
+ "sector": "Materials"
+ },
+ {
+ "ticker": "FCX",
+ "sector": "Materials"
+ },
+ {
+ "ticker": "CAT",
+ "sector": "Industrials"
+ },
+ {
+ "ticker": "UNP",
+ "sector": "Industrials"
+ },
+ {
+ "ticker": "GE",
+ "sector": "Industrials"
+ },
+ {
+ "ticker": "HON",
+ "sector": "Industrials"
+ },
+ {
+ "ticker": "NEE",
+ "sector": "Utilities"
+ },
+ {
+ "ticker": "DUK",
+ "sector": "Utilities"
+ },
+ {
+ "ticker": "SO",
+ "sector": "Utilities"
+ },
+ {
+ "ticker": "PLD",
+ "sector": "Real Estate"
+ },
+ {
+ "ticker": "AMT",
+ "sector": "Real Estate"
+ },
+ {
+ "ticker": "EQIX",
+ "sector": "Real Estate"
+ }
+ ]
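`data/sp500_universe.json` stores the universe as a flat list of `{"ticker", "sector"}` records; folding it into a dict yields the same ticker-to-sector shape the rest of the pipeline uses. A small sketch (the inline JSON string stands in for reading the actual file):

```python
import json

# Stand-in for json.load(open("data/sp500_universe.json"))
raw = '[{"ticker": "AAPL", "sector": "Information Technology"}, {"ticker": "XOM", "sector": "Energy"}]'
universe = json.loads(raw)

# Collapse the record list into a ticker -> sector lookup
sector_map = {row["ticker"]: row["sector"] for row in universe}
print(sector_map["XOM"])  # Energy
```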
debug_optimizer_tech.py ADDED
@@ -0,0 +1,82 @@
+ import pandas as pd
+ import numpy as np
+ import logging
+ from data.optimizer import PortfolioOptimizer
+ from core.schema import OptimizationResult
+
+ # Configure logging
+ logging.basicConfig(level=logging.INFO)
+ logger = logging.getLogger(__name__)
+
+ def test_optimizer_exclusion():
+     print("\n--- STARTING OPTIMIZER DEBUG TEST ---\n")
+
+     # 1. Mock data setup (mini S&P 500)
+     tickers = ["AAPL", "MSFT", "GOOGL", "XOM", "CVX", "JPM", "BAC", "JNJ", "PFE", "NEE"]
+     n = len(tickers)
+
+     # Sector map (Tech, Energy, Financials, Health Care, Utilities)
+     sector_map = {
+         "AAPL": "Information Technology",
+         "MSFT": "Information Technology",
+         "GOOGL": "Communication Services",  # Often grouped with Tech
+         "XOM": "Energy",
+         "CVX": "Energy",
+         "JPM": "Financials",
+         "BAC": "Financials",
+         "JNJ": "Health Care",
+         "PFE": "Health Care",
+         "NEE": "Utilities"
+     }
+
+     # Mock covariance: diagonal (uncorrelated assets) with low variance, for simplicity
+     np.random.seed(42)
+     cov_data = np.eye(n) * 0.0004
+     cov_df = pd.DataFrame(cov_data, index=tickers, columns=tickers)
+
+     # Benchmark weights (equal-weight benchmark for this test)
+     bench_weights = pd.Series(np.ones(n) / n, index=tickers)
+
+     # 2. Instantiate optimizer
+     opt = PortfolioOptimizer()
+
+     # 3. Test cases
+
+     # Case A: no exclusions
+     print("\n[Case A] No Exclusions")
+     res_a = opt.optimize_portfolio(cov_df, tickers, bench_weights, sector_map, [])
+     print(f"Status: {res_a.status}, TE: {res_a.tracking_error:.4f}")
+
+     # Case B: exclude Energy (2 stocks)
+     print("\n[Case B] Exclude Energy")
+     res_b = opt.optimize_portfolio(cov_df, tickers, bench_weights, sector_map, ["Energy"])
+     print(f"Status: {res_b.status}, TE: {res_b.tracking_error:.4f}")
+     print(f"Weights: {res_b.weights}")
+     assert "XOM" not in res_b.weights
+     assert "CVX" not in res_b.weights
+
+     # Case C: exclude the Tech heavyweights (AAPL, MSFT) -- this is the case
+     # that tends to break tight constraints.
+     print("\n[Case C] Exclude Technology (The Failure Case)")
+     try:
+         # sector_map uses the GICS name "Information Technology", but the
+         # frontend sends the short alias "Technology". The optimizer's
+         # exclusion check is expected to match both forms, so pass the alias
+         # here to exercise that path.
+         res_c = opt.optimize_portfolio(cov_df, tickers, bench_weights, sector_map, ["Technology"])
+         print(f"Status: {res_c.status}, TE: {res_c.tracking_error:.4f}")
+         print(f"Weights: {res_c.weights}")
+
+         # Verification
+         if "AAPL" in res_c.weights or "MSFT" in res_c.weights:
+             print("❌ FAILURE: Tech stocks still in portfolio!")
+         else:
+             print("✅ SUCCESS: Tech stocks removed!")
+
+     except Exception as e:
+         print(f"❌ CRASHED: {e}")
+
+ if __name__ == "__main__":
+     test_optimizer_exclusion()
debug_yf.py ADDED
@@ -0,0 +1,9 @@
+ import yfinance as yf
+ tickers = ["AAPL", "MSFT"]
+ data = yf.download(tickers, start="2024-01-01", progress=False)
+ print("Columns:", data.columns)
+ try:
+     print(data['Adj Close'].head())
+ except Exception as e:
+     print("Error accessing Adj Close:", e)
+ print(data.head())
main.py ADDED
@@ -0,0 +1,147 @@
+ import logging
+ import pandas as pd
+ from typing import Dict, Any
+
+ from config import settings
+ from data.data_manager import MarketDataEngine
+ from analytics.risk_model import RiskModel
+ from data.optimizer import PortfolioOptimizer
+ from analytics.tax_module import TaxEngine
+ from analytics.attribution import AttributionEngine
+ from ai.ai_reporter import AIReporter
+ from core.schema import OptimizationRequest, TickerData
+
+ # Set up logging
+ logging.basicConfig(level=settings.LOG_LEVEL, format='%(asctime)s - %(name)s - %(levelname)s - %(message)s')
+ logger = logging.getLogger("QuantScaleAI")
+
+ class QuantScaleSystem:
+     def __init__(self):
+         self.data_engine = MarketDataEngine()
+         self.risk_model = RiskModel()
+         self.optimizer = PortfolioOptimizer()
+         self.tax_engine = TaxEngine()
+         self.attribution_engine = AttributionEngine()
+         self.ai_reporter = AIReporter()
+
+     def run_pipeline(self, request: OptimizationRequest):
+         logger.info(f"Starting pipeline for Client {request.client_id}...")
+
+         # 1. Fetch universe (S&P 500)
+         tickers = self.data_engine.fetch_sp500_tickers()
+         # For faster demo runs, uncomment to limit the universe:
+         # tickers = tickers[:50]
+
+         # 2. Get market data (last ~2 years, for the covariance estimate)
+         data = self.data_engine.fetch_market_data(tickers, start_date="2023-01-01")
+         if data.empty:
+             logger.error("No market data available. Aborting.")
+             return None
+
+         returns = data.pct_change().dropna()
+
+         # 3. Compute risk model, aligned to the tickers that survived fetching
+         valid_tickers = returns.columns.tolist()
+         cov_matrix = self.risk_model.compute_covariance_matrix(returns)
+
+         # 4. Build benchmark weights (S&P 500 proxy).
+         # Live index weights require expensive data, so approximate the
+         # cap-weighted S&P 500: hard-code the mega-cap weights and spread the
+         # remainder equally. This keeps tracking error realistic when testing
+         # sector exclusions.
+         n_assets = len(valid_tickers)
+         benchmark_weights = pd.Series(0.0, index=valid_tickers)
+
+         # Approximate mega-cap weights (rough early-2026 snapshot; total
+         # market cap is heavily skewed toward the Magnificent 7)
+         top_weights = {
+             "MSFT": 0.070, "AAPL": 0.065, "NVDA": 0.060,
+             "AMZN": 0.035, "GOOGL": 0.020, "GOOG": 0.020,
+             "META": 0.020, "TSLA": 0.015, "BRK-B": 0.015,
+             "LLY": 0.012, "AVGO": 0.012, "JPM": 0.010
+         }
+
+         current_total = 0.0
+         for t, w in top_weights.items():
+             if t in valid_tickers:
+                 benchmark_weights[t] = w
+                 current_total += w
+
+         # Distribute the remaining weight equally among the rest
+         remaining_weight = 1.0 - current_total
+         remaining_count = n_assets - len([t for t in top_weights if t in valid_tickers])
+
+         if remaining_count > 0:
+             avg_rest = remaining_weight / remaining_count
+             for t in valid_tickers:
+                 if benchmark_weights[t] == 0.0:
+                     benchmark_weights[t] = avg_rest
+
+         # Normalize, just in case
+         benchmark_weights = benchmark_weights / benchmark_weights.sum()
+
+         # 5. Optimize portfolio
+         sector_map = self.data_engine.get_sector_map()
+
+         opt_result = self.optimizer.optimize_portfolio(
+             covariance_matrix=cov_matrix,
+             tickers=valid_tickers,
+             benchmark_weights=benchmark_weights,
+             sector_map=sector_map,
+             excluded_sectors=request.excluded_sectors,
+             excluded_tickers=request.excluded_tickers
+         )
+
+         if opt_result.status != "optimal":
+             logger.warning("Optimization might be suboptimal.")
+
+         # 6. Attribution analysis on simulated performance:
+         # compound the trailing month (~21 trading days) of returns
+         last_month = returns.iloc[-21:]
+         asset_period_return = (1 + last_month).prod() - 1
+
+         attribution = self.attribution_engine.generate_attribution_report(
+             portfolio_weights=opt_result.weights,
+             benchmark_weights=benchmark_weights.to_dict(),
+             asset_returns=asset_period_return,
+             sector_map=sector_map
+         )
+
+         # 7. AI reporting: combine exclusions for the narrative
+         exclusions_list = request.excluded_sectors + request.excluded_tickers
+         excluded = ", ".join(exclusions_list) if exclusions_list else "None"
+
+         commentary = self.ai_reporter.generate_report(attribution, excluded)
+
+         return {
+             "optimization": opt_result,
+             "attribution": attribution,
+             "commentary": commentary
+         }
+
+ if __name__ == "__main__":
+     # Test run with a typical values-based (ESG) constraint
+     req = OptimizationRequest(
+         client_id="TEST_001",
+         excluded_sectors=["Energy"]
+     )
+     system = QuantScaleSystem()
+     result = system.run_pipeline(req)
+
+     if result:
+         print("\n--- AI COMMENTARY ---\n")
+         print(result['commentary'])
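The attribution step in `run_pipeline` compounds the trailing window of daily returns into one period return per asset via `(1 + last_month).prod() - 1`. On toy data the arithmetic looks like:

```python
import pandas as pd

# Daily returns for two assets over 5 days (toy data)
returns = pd.DataFrame({
    "AAPL": [0.01, -0.005, 0.002, 0.0, 0.003],
    "XOM":  [0.0, 0.01, -0.002, 0.001, 0.0],
})

# Compound the trailing window (3 days here; main.py uses ~21) into a
# single period return per asset
window = returns.iloc[-3:]
period_return = (1 + window).prod() - 1
print(round(period_return["AAPL"], 6))  # 0.005006
```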
qes_scale_optimizer.ipynb ADDED
@@ -0,0 +1,93 @@
+ {
+ "cells": [
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "# QuantScale AI: Automated Direct Indexing & Attribution\n",
+ "## Goldman Sachs Quant Prep Project\n",
+ "\n",
+ "This notebook demonstrates the end-to-end workflow:\n",
+ "1. **Data Ingestion**: Scraping S&P 500 & fetching market data.\n",
+ "2. **Risk Modeling**: Computing Ledoit-Wolf Shrinkage Covariance.\n",
+ "3. **Optimization**: Minimizing Tracking Error with Sector Exclusion Constraints.\n",
+ "4. **AI Reporting**: Using Hugging Face to generate professional commentary."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "!pip install -r requirements.txt"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "from main import QuantScaleSystem\n",
+ "from core.schema import OptimizationRequest\n",
+ "import matplotlib.pyplot as plt\n",
+ "\n",
+ "# Initialize System\n",
+ "system = QuantScaleSystem()\n",
+ "\n",
+ "# Test Case: Optimization with Energy Exclusion\n",
+ "req = OptimizationRequest(client_id=\"COLAB_USER\", excluded_sectors=[\"Energy\"])\n",
+ "result = system.run_pipeline(req)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# Visualization of Weights\n",
+ "if result:\n",
+ "    weights = result['optimization'].weights\n",
+ "    plt.figure(figsize=(12, 6))\n",
+ "    plt.bar(range(len(weights)), list(weights.values()), align='center')\n",
+ "    plt.title('Optimized Portfolio Weights (Energy Excluded)')\n",
+ "    plt.xlabel('Assets')\n",
+ "    plt.ylabel('Weight')\n",
+ "    plt.show()"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# AI Commentary\n",
+ "print(result['commentary'])"
+ ]
+ }
+ ],
+ "metadata": {
+ "kernelspec": {
+ "display_name": "Python 3",
+ "language": "python",
+ "name": "python3"
+ },
+ "language_info": {
+ "codemirror_mode": {
+ "name": "ipython",
+ "version": 3
+ },
+ "file_extension": ".py",
+ "mimetype": "text/x-python",
+ "name": "python",
+ "nbconvert_exporter": "python",
+ "pygments_lexer": "ipython3",
+ "version": "3.10.12"
+ }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 2
+ }
requirements.txt ADDED
@@ -0,0 +1,14 @@
+ cvxpy>=1.4.1
+ yfinance>=0.2.33
+ google-generativeai>=0.3.2
+ pandas>=2.1.4
+ numpy>=1.26.3
+ scikit-learn>=1.3.2
+ fastapi>=0.109.0
+ uvicorn>=0.27.0
+ pydantic>=2.5.3
+ python-dotenv>=1.0.0
+ matplotlib>=3.8.2
+ scipy>=1.11.4
+ huggingface_hub>=0.20.0