**STAT 4830 Frontend Technical Description**
---
Dhruv Gupta, Kelly Wang, Didrik Wiig-Andersen, Aiden Lee, Frank Ma
STAT 4830, Project PRISM
**Project Proposal**
As part of our class, we have created a model for portfolio allocation and optimization based on online gradient ascent. We want to build an interactive frontend for the project that lets users simulate and test performance across different year ranges, hyperparameter combinations, and stock universes.
Users should receive immediate feedback, as well as comparisons against several benchmarks in terms of both cumulative returns and annualized Sharpe ratio.
We should include as many relevant and valid graphs as possible to make the website visually appealing.
**Proposed Technical Stack**
* **Hugging Face:** Our Python backend will be hosted in a Hugging Face Space, which will run the model and return the results.
* **React:** The frontend will be built entirely in React. The app was bootstrapped with `npx create-react-app`.
**Backend Setup**
We have cloned our Hugging Face Space (a blank Gradio template project) into our project and renamed its folder to "backend".
**Necessary Dependencies**
import torch
import pandas as pd
import matplotlib as mpl
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import yfinance as yf
from datetime import datetime
import numpy as np
import seaborn as sns
import wrds
import random
**Backend Development Steps**
These steps are to be followed by Cursor Agent running Claude 3.7 Sonnet. Complete only one step at a time, and update the readme file after each step is completed. Do NOT work ahead, and do not set up extra steps in advance. We have begun by cloning our Hugging Face Space (blank Gradio template) into our project and naming the folder backend.
1. Set up the file directory and all necessary introductory files for the project. Ensure we have installed and are able to run any necessary dependencies.
*Completed: Created `backend/requirements.txt` and `backend/app.py`.*
2. Store the following list of stock tickers, organized by sector, in a JSON file
*Completed: Created `backend/data/tickers_by_sector.json`.*
[
  {
    "sector": "Technology",
    "tickers": ["AAPL", "MSFT", "NVDA", "GOOGL", "META", "AVGO", "ORCL", "IBM", "CSCO", "TSM", "ASML", "AMD", "TXN", "INTC", "MU", "QCOM", "LRCX", "NXPI", "ADI"]
  },
  {
    "sector": "Consumer Discretionary",
    "tickers": ["AMZN", "TSLA", "NKE", "MCD", "SBUX", "YUM", "GM", "F", "RIVN", "NIO", "TTWO", "EA", "GME", "AMC"]
  },
  {
    "sector": "Financials",
    "tickers": ["JPM", "V", "MA", "GS", "MS", "BAC", "C", "AXP", "SCHW"]
  },
  {
    "sector": "Health Care",
    "tickers": ["UNH", "JNJ", "LLY", "PFE", "MRNA", "BMY", "GILD", "CVS", "VRTX", "ISRG"]
  },
  {
    "sector": "Consumer Staples",
    "tickers": ["WMT", "PG", "TGT", "KO", "PEP", "TSN", "CAG", "SYY", "HRL", "MDLZ"]
  },
  {
    "sector": "Energy",
    "tickers": ["XOM", "CVX", "NEE", "DUK", "SO", "D", "ENB", "SLB", "EOG", "PSX"]
  },
  {
    "sector": "Industrials",
    "tickers": ["DE", "LMT", "RTX", "BA", "CAT", "GE", "HON", "UPS", "EMR", "NOC", "FDX", "CSX", "UNP", "DAL"]
  },
  {
    "sector": "Real Estate",
    "tickers": ["PLD", "AMT", "EQIX", "O", "SPG", "VICI", "DLR", "WY", "EQR", "PSA"]
  },
  {
    "sector": "Materials",
    "tickers": ["ADM", "BG", "CF", "MOS", "FMC"]
  },
  {
    "sector": "Communication Services",
    "tickers": ["NFLX", "DIS", "PARA", "WBD", "CMCSA", "SPOT", "LYV"]
  }
]
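As a rough illustration of how the backend might consume this file, the sketch below parses a sector list of the same shape and derives both a sector lookup and a flat ticker list. The inline `SAMPLE` string and the helper names `tickers_by_sector` and `all_tickers` are hypothetical; the real code would read `backend/data/tickers_by_sector.json` from disk instead.

```python
import json

# Hypothetical excerpt with the same shape as tickers_by_sector.json.
SAMPLE = """
[
  {"sector": "Materials", "tickers": ["ADM", "BG", "CF", "MOS", "FMC"]},
  {"sector": "Financials", "tickers": ["JPM", "V", "MA"]}
]
"""


def tickers_by_sector(raw: str) -> dict:
    """Map each sector name to its list of tickers."""
    return {entry["sector"]: entry["tickers"] for entry in json.loads(raw)}


def all_tickers(raw: str) -> list:
    """Flatten every sector's tickers into one list, preserving file order."""
    return [t for entry in json.loads(raw) for t in entry["tickers"]]


sectors = tickers_by_sector(SAMPLE)
flat = all_tickers(SAMPLE)
```

The sector lookup suits the "toggle by sector" selection in the dashboard, while the flat list suits "toggle all".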
3. We have stored the data for each of these tickers from 1-1-2007 to 4-1-2025 in a file called `data/stock_data.csv` and the daily risk-free return values in `data/risk_free_data.csv`. Please read them in and save them as DataFrames for future reference
*Completed: Loaded data into global DataFrames in `backend/app.py`.*
4. Create a route that, given optional inputs of start date, end date, and tickers, creates a dataframe that only contains data on the given tickers for the given timeframe
*Completed: Created `filter_data` function in `backend/utils.py` and test interface in `backend/app.py`.*
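For orientation, here is a minimal sketch of what such a filter could look like. It is not the actual `filter_data` in `backend/utils.py`, whose signature is not shown here. The name `filter_data_sketch`, the date-indexed layout, and the `rf` column name are assumptions made for illustration.

```python
import pandas as pd


def filter_data_sketch(df, start_date=None, end_date=None, tickers=None):
    """Hypothetical stand-in for filter_data; the real signature may differ."""
    out = df.sort_index()
    # Label-based slicing on a DatetimeIndex includes both endpoints;
    # a None bound leaves that side of the range open.
    out = out.loc[start_date:end_date]
    if tickers is not None:
        # Keep requested tickers that exist, plus 'rf' if present
        # (assumed name of the risk-free column).
        keep = [t for t in tickers if t in out.columns]
        if "rf" in out.columns and "rf" not in keep:
            keep.append("rf")
        out = out[keep]
    return out.copy()


# Tiny synthetic frame, for illustration only.
idx = pd.date_range("2020-01-01", periods=5, freq="D")
demo = pd.DataFrame({"AAPL": 0.01, "MSFT": 0.02, "rf": 0.0001}, index=idx)
subset = filter_data_sketch(demo, "2020-01-02", "2020-01-04", ["AAPL"])
```

Leaving all three arguments optional means a call with no arguments returns the full dataset, which matches the "optional inputs" wording of the step.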
5. Build out a route that runs our OGD on a given dataframe, takes in hyperparameters, and returns both the day-to-day weights and day-to-day returns (both cumulative and by ticker)
*Completed: Created `run_ogd` function in `backend/optimization.py` and integrated into Gradio interface in `backend/app.py`.*
# Objective function
def calculate_sharpe(
    returns: torch.Tensor,
    risk_free_rate: torch.Tensor = None
):
    # Subtract the risk-free rate when one is supplied
    if risk_free_rate is not None:
        excess_returns = returns - risk_free_rate
    else:
        excess_returns = returns
    sharpe = torch.mean(excess_returns, dim=0) / torch.std(excess_returns, dim=0)
    return sharpe
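def calculate_sortino(
    returns: torch.Tensor,
    min_acceptable_return: torch.Tensor
):
    # Fall back to raw returns when no minimum acceptable return is given
    if min_acceptable_return is not None:
        excess_returns = returns - min_acceptable_return
    else:
        excess_returns = returns
    # Downside deviation: std of the excess returns with non-negative values zeroed
    downside_deviation = torch.std(
        torch.where(excess_returns < 0, excess_returns, torch.tensor(0.0)),
    )
    # eps is a small constant (e.g. 1e-8) that must be defined before this is called
    sortino = torch.mean(excess_returns, dim=0) / (downside_deviation + eps**2)
    return sortino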
def calculate_max_drawdown(
    returns: torch.Tensor
):
    """calculates max drawdown for the duration of the returns passed
    i.e. expects returns to be trimmed to the period of interest
    measured here as the drop from the period's peak to its final value
    max drawdown is defined to be positive, takes the range [0, 1)
    for returns greater than -1
    """
    cum_returns = (returns + 1).cumprod(dim=0)
    return (cum_returns.max() - cum_returns[-1]) / (cum_returns.max() + eps ** 2)
def calculate_turnover(
    new_weights: torch.Tensor,
    prev_weights: torch.Tensor
):
    """Turnover is defined to be the sum of absolute differences
    between the new weights and the previous weights, divided by 2.
    Takes the range [0, inf) in general, and at most 1 when both
    weight vectors are non-negative and sum to 1
    This value should be minimized
    """
    return torch.sum(torch.abs(new_weights - prev_weights)) / 2
def calculate_objective_func(
    returns: torch.Tensor,
    risk_free_rate: torch.Tensor,
    new_weights,
    prev_weights,
    alphas = (1, 1, 1)
):
    # Reward Sortino; penalize drawdown and turnover, each scaled by its alpha
    return (
        alphas[0] * calculate_sortino(returns, risk_free_rate)
        - alphas[1] * calculate_max_drawdown(returns)
        - alphas[2] * calculate_turnover(
            new_weights,
            prev_weights
        )
    )
# set up
window_size = 10
return_logs = torch.zeros(
    size = (returns.shape[0],),
    dtype=torch.float32
)
rolling_return_list = []
# returns.shape[1] - 1 because we don't allow investing in
# risk free asset for the moment
print("Initializing optimization...")
weights = torch.rand(
    size = (returns.shape[1] - 1,),
    requires_grad=True
)
optimizer = torch.optim.SGD([weights], lr=0.5)
weights_log = torch.zeros((returns.shape[0], returns.shape[1] - 1))
for i, date in enumerate(returns.index):
    if i % 5 == 0:
        print(f"Step {i} of {returns.shape[0]}", end = '\r')
    normalized_weights = torch.nn.functional.softmax(weights, dim=0)
    # assumes 'rf' is the last column, so [:-1] keeps only the stock returns
    daily_returns = torch.tensor(
        returns.loc[date].values[:-1],
        dtype=torch.float32
    )
    ret = torch.dot(normalized_weights, daily_returns)
    # for logging
    return_logs[i] = ret.detach()
    rolling_return_list.append(ret)
    if len(rolling_return_list) > window_size:
        rolling_return_list.pop(0)
    if len(rolling_return_list) < 2:
        # torch.std of a single return is NaN, so skip the update on the first day
        weights_log[i] = normalized_weights.detach()
        continue
    past_returns = torch.stack(rolling_return_list)
    # same window as past_returns: the last window_size days, including day i
    past_rf = torch.tensor(
        returns.iloc[max(0, i - window_size + 1):i + 1]['rf'].values,
        dtype=torch.float32
    )
    objective = -calculate_objective_func(
        past_returns,
        past_rf,
        normalized_weights,
        weights_log[i - 1]
    )
    optimizer.zero_grad()
    objective.backward(retain_graph=True)
    optimizer.step()
    # detach so the log stores values only and does not extend the autograd graph
    weights_log[i] = normalized_weights.detach()
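The loop above optimizes unconstrained raw `weights` and maps them through a softmax before investing, so the allocated weights stay positive and sum to one without an explicit projection step. The pure-Python sketch below illustrates that mapping together with the turnover penalty. The helper names `softmax` and `turnover` and the sample inputs are illustrative only, not part of the project code.

```python
import math


def softmax(raw):
    """Map unconstrained scores to positive weights that sum to 1."""
    m = max(raw)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in raw]
    total = sum(exps)
    return [e / total for e in exps]


def turnover(new_w, prev_w):
    """Half the sum of absolute weight changes between two allocations."""
    return sum(abs(n - p) for n, p in zip(new_w, prev_w)) / 2


prev = softmax([0.0, 0.0, 0.0])   # equal scores give equal weights
new = softmax([2.0, 0.0, -1.0])   # a higher score gets a larger weight
cost = turnover(new, prev)
```

Because both allocations lie on the simplex, `cost` falls between 0 (no trading) and 1 (the whole portfolio is replaced).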
6. Build out a route that given a dataframe returns the day to day returns for an equal weight portfolio
7. Build out a route that given a dataframe returns the day to day returns for a "random portfolio" – you fully (and equally) invest in 3 randomly selected stocks on any given day
8. Build out a unified route that, given a set of hyperparameters, start date, and end date, creates the dataframe, runs OGD, and also runs both the benchmarks, then returns all the data
9. Bundle it all into a well structured API
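One possible shape for the unified response in steps 8 and 9 is sketched below: a single JSON-friendly dictionary holding the shared date axis plus daily and cumulative returns per strategy. The function name `build_response`, the key names, and the strategy labels are assumptions for illustration. The actual route may return a different structure.

```python
import json
import operator
from itertools import accumulate


def cumulative(daily):
    """Growth of 1 unit invested: running product of (1 + r)."""
    return list(accumulate((1.0 + r for r in daily), operator.mul))


def build_response(dates, strategy_returns):
    """Bundle per-strategy daily and cumulative returns under one date axis."""
    if any(len(r) != len(dates) for r in strategy_returns.values()):
        raise ValueError("every strategy needs one return per date")
    return {
        "dates": list(dates),
        "strategies": {
            name: {
                "daily_returns": list(daily),
                "cumulative_returns": cumulative(daily),
            }
            for name, daily in strategy_returns.items()
        },
    }


# Illustrative values only; real inputs would come from the OGD and benchmark runs.
resp = build_response(
    ["2020-01-02", "2020-01-03"],
    {"ogd": [0.10, -0.50], "equal_weight": [0.0, 0.0]},
)
body = json.dumps(resp)  # plain lists and floats serialize directly
```

Keeping one shared `dates` array avoids repeating the date axis per strategy, which keeps the payload small when the frontend plots several series on the same chart.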
**Frontend Development Steps**
1. Create a file directory with images, components, data and pages. Create a global API variable that is set and can be edited for where the server is hosted
2. Develop a header and footer for the project
3. Create the layout for a dashboard that will take up exactly 100vh
    1. On the right 1/4th, we should have a list of our 111 stock tickers, grouped by sector. There should be a way to select stock tickers in batches (i.e. toggle all, toggle by sector, etc.) or individually
4. On the top 2/3 of the left-hand side, we should have 2 graphs: cumulative returns and weight evolution. These graphs must be highly responsive, updating to cover the run's time range and the selected tickers
    2. On the top of the left-hand side, above the stock tickers, we should have a horizontal menu to set our 4 variables
    3. There should also be a smart and relevant place to run "Allocate Portfolio"
5. On the bottom 1/3rd of the left-hand side, we should have 3 graphs in a row that provide more specific statistics. I will leave it up to you to decide what these graphs should show
**Frontend Considerations**
1. We want the frontend to be as clean and modern as possible, considering our target audience is 16-24 year olds. Take heavy inspiration from the UI of Notion. It should default to a "dark mode"
2. We want the frontend to feel responsive and provide micro or fake feedback while we wait for the OGD, as it may take quite long. Maybe run some fake simulations using geometric Brownian motion while we wait
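To make the placeholder-simulation idea concrete, here is a minimal sketch of a geometric Brownian motion path generator. It is written in Python to match the rest of this document, so the React frontend would need an equivalent in JavaScript. The function name `simulate_gbm_path`, the parameter values, and the daily step of 1/252 years are assumptions for illustration.

```python
import math
import random


def simulate_gbm_path(s0, mu, sigma, n_steps, dt=1 / 252, seed=None):
    """Simulate one GBM path using the exact log-normal step.

    s0: starting value, mu: annual drift, sigma: annual volatility,
    dt: step length in years (1/252 is roughly one trading day).
    """
    rng = random.Random(seed)
    path = [s0]
    for _ in range(n_steps):
        z = rng.gauss(0.0, 1.0)  # standard normal shock
        growth = math.exp((mu - 0.5 * sigma ** 2) * dt + sigma * math.sqrt(dt) * z)
        path.append(path[-1] * growth)
    return path


# Illustrative parameters only: 5% drift, 20% volatility, 50 daily steps.
placeholder = simulate_gbm_path(100.0, 0.05, 0.2, 50, seed=1)
```

Since these paths are decorative, the UI should label them clearly as placeholders so users do not mistake them for model output.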
# Small epsilon for Sharpe calculation
eps = 1e-8
ANNUAL_TRADING_DAYS = 252
def run_equal_weight(data_df: pd.DataFrame) -> pd.Series:
    """Calculates daily returns for a static equal-weight portfolio.
    Args:
        data_df (pd.DataFrame): DataFrame with dates as index, ticker returns,
            and an 'rf' column.
    Returns:
        pd.Series: Daily returns of the equal-weight portfolio.
    """
    stock_returns = data_df.drop(columns=['rf'], errors='ignore')
    if stock_returns.empty:
        return pd.Series(dtype=float, name="EqualWeightReturn")
    # Calculate the mean return across all stocks for each day
    daily_returns = stock_returns.mean(axis=1)
    return daily_returns.rename("EqualWeightReturn")
def run_random_portfolio(
    data_df: pd.DataFrame,
    num_stocks: int = 3,
    rebalance_days: int = 20
) -> pd.Series:
    """Calculates daily returns for a randomly selected portfolio,
    rebalanced periodically.
    Args:
        data_df (pd.DataFrame): DataFrame with dates as index, ticker returns,
            and an 'rf' column.
        num_stocks (int): Number of stocks to randomly select.
        rebalance_days (int): How often to re-select stocks.
    Returns:
        pd.Series: Daily returns of the random portfolio.
    """
    stock_returns = data_df.drop(columns=['rf'], errors='ignore')
    if stock_returns.empty or stock_returns.shape[1] < num_stocks:
        print("Warning: Not enough stocks available for random portfolio.")
        return pd.Series(dtype=float, name="RandomPortfolioReturn")
    tickers = stock_returns.columns.tolist()
    portfolio_returns = pd.Series(index=data_df.index, dtype=float)
    selected_tickers = []
    for i, date in enumerate(data_df.index):
        # Rebalance check
        if i % rebalance_days == 0 or not selected_tickers:
            if len(tickers) >= num_stocks:
                selected_tickers = random.sample(tickers, num_stocks)
            else:  # Should not happen based on initial check, but safe
                selected_tickers = tickers
            # print(f"Rebalancing Random Portfolio on {date.date()}: {selected_tickers}")
        # Calculate return for the day using selected tickers
        daily_returns = stock_returns.loc[date, selected_tickers]
        portfolio_returns[date] = daily_returns.mean()  # Equal weight among selected
    return portfolio_returns.rename("RandomPortfolioReturn")
# --- Performance Metrics ---
def calculate_cumulative_returns(returns_series: pd.Series) -> pd.Series:
    """Calculates cumulative returns from a daily returns series."""
    return (1 + returns_series.fillna(0)).cumprod()
def calculate_performance_metrics(returns_series: pd.Series, rf_series: pd.Series) -> dict:
    """Calculates annualized Sharpe Ratio and Max Drawdown."""
    if returns_series.empty or returns_series.isnull().all():
        return {"Annualized Sharpe Ratio": 0.0, "Max Drawdown": 0.0, "Cumulative Return": 1.0}
    cumulative_return = (1 + returns_series.fillna(0)).cumprod().iloc[-1]
    # Align risk-free rate series to the returns series index
    aligned_rf = rf_series.reindex(returns_series.index).fillna(0)
    # Calculate Excess Returns
    excess_returns = returns_series - aligned_rf
    # Annualized Sharpe Ratio
    # Use np.sqrt(ANNUAL_TRADING_DAYS) for annualization factor
    mean_excess_return = excess_returns.mean()
    std_dev_excess_return = excess_returns.std()
    sharpe_ratio = (mean_excess_return / (std_dev_excess_return + eps)) * np.sqrt(ANNUAL_TRADING_DAYS)
    # Max Drawdown
    cumulative = calculate_cumulative_returns(returns_series)
    peak = cumulative.expanding(min_periods=1).max()
    drawdown = (cumulative - peak) / (peak + eps)  # Drawdown is negative or zero
    max_drawdown = abs(drawdown.min())  # Max drawdown is positive
    return {
        "Annualized Sharpe Ratio": round(sharpe_ratio, 4),
        "Max Drawdown": round(max_drawdown, 4),
        "Cumulative Return": round(cumulative_return, 4)
    }
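As a sanity check on the two metrics above, the following pure-Python sketch mirrors the same formulas on small hand-made inputs. It omits the `eps` guard for readability, and the helper names `annualized_sharpe` and `max_drawdown` and the sample numbers are illustrative only.

```python
import math
import statistics

ANNUAL_TRADING_DAYS = 252


def annualized_sharpe(excess_returns):
    """Mean over sample std of daily excess returns, scaled by sqrt(252)."""
    mean = statistics.fmean(excess_returns)
    std = statistics.stdev(excess_returns)  # sample std (n - 1), like pandas' default
    return mean / std * math.sqrt(ANNUAL_TRADING_DAYS)


def max_drawdown(daily_returns):
    """Largest fractional drop of cumulative value below its running peak."""
    value, peak, worst = 1.0, float("-inf"), 0.0
    for r in daily_returns:
        value *= 1.0 + r
        peak = max(peak, value)  # running peak starts at the first cumulative value
        worst = max(worst, (peak - value) / peak)
    return worst


excess = [0.01, -0.005, 0.02, 0.0]
sharpe = annualized_sharpe(excess)
# Cumulative value goes 1.10, 0.55, 0.66, so the worst drop from the peak is half.
mdd = max_drawdown([0.10, -0.50, 0.20])
```

Scaling every excess return by the same positive factor leaves the Sharpe ratio unchanged, which is a quick way to check an implementation.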