---
title: Trading Forecasting Backend
colorFrom: blue
colorTo: green
sdk: docker
app_port: 7860
pinned: false
---

# Trading Forecasting Backend

This folder is now a standalone Hugging Face Docker Space backend. Upload the contents of this `backend` folder to a Hugging Face Space repository, upload the separate `dataset` folder to a Hugging Face Dataset repository, and deploy the separate `frontend` folder to Netlify.

The backend contains the quantitative model code, training scripts, and model outputs from the forecasting research workspace. The primary market data and alternative data live in the separate Dataset repo and are downloaded at runtime.

## Hugging Face Space Setup

Create a new Hugging Face Space with the Docker SDK, then upload this backend folder as the Space root.

Required Space variables/secrets:

- `FRONTEND_ORIGINS`: your Netlify URL, for example `https://your-site.netlify.app`.
- `CRON_SECRET`: a long shared secret. Use the same value in Netlify.
- `HF_DATASET_REPO_ID`: your Hugging Face Dataset repo id, for example `your-username/your-forecasting-dataset`.

Useful optional settings:

- `AUTO_UPDATE_ENABLED=true`
- `AUTO_RETRAIN_ENABLED=true`
- `AUTO_UPDATE_ON_START=false`
- `DATASET_SYNC_ON_START=true`
- `HF_DATASET_REVISION=main`
- `DAILY_UPDATE_TIME=17:30`
- `UPDATE_TIMEZONE=Asia/Kolkata`
- `MARKET_BUILD_WORKERS=2`

The app listens on port `7860` and exposes Swagger docs at `/docs`.
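
The settings above could be read at startup along these lines. This is a minimal sketch, not the actual `app.py` code: the `Settings` dataclass, `load_settings`, the comma-separated format for `FRONTEND_ORIGINS`, the accepted boolean spellings, and the use of the listed example values as defaults are all assumptions made for illustration.

```python
import os
from dataclasses import dataclass
from typing import List, Mapping

# Assumption: these spellings count as "true"; anything else is false.
TRUE_VALUES = {"1", "true", "yes", "on"}
REQUIRED = ("FRONTEND_ORIGINS", "CRON_SECRET", "HF_DATASET_REPO_ID")


def _flag(env: Mapping[str, str], name: str, default: bool) -> bool:
    """Parse a boolean-like environment value, falling back to a default."""
    raw = env.get(name)
    if raw is None:
        return default
    return raw.strip().lower() in TRUE_VALUES


@dataclass(frozen=True)
class Settings:
    frontend_origins: List[str]
    cron_secret: str
    dataset_repo_id: str
    dataset_revision: str
    auto_update_enabled: bool
    auto_retrain_enabled: bool
    auto_update_on_start: bool
    dataset_sync_on_start: bool
    daily_update_time: str
    update_timezone: str
    market_build_workers: int


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from environment variables, failing fast on missing ones."""
    missing = [name for name in REQUIRED if not env.get(name, "").strip()]
    if missing:
        raise RuntimeError("Missing required settings: " + ", ".join(missing))
    # Assumption: several origins may be given as a comma-separated list.
    origins = [o.strip() for o in env["FRONTEND_ORIGINS"].split(",") if o.strip()]
    return Settings(
        frontend_origins=origins,
        cron_secret=env["CRON_SECRET"],
        dataset_repo_id=env["HF_DATASET_REPO_ID"].strip(),
        dataset_revision=env.get("HF_DATASET_REVISION", "main"),
        auto_update_enabled=_flag(env, "AUTO_UPDATE_ENABLED", True),
        auto_retrain_enabled=_flag(env, "AUTO_RETRAIN_ENABLED", True),
        auto_update_on_start=_flag(env, "AUTO_UPDATE_ON_START", False),
        dataset_sync_on_start=_flag(env, "DATASET_SYNC_ON_START", True),
        daily_update_time=env.get("DAILY_UPDATE_TIME", "17:30"),
        update_timezone=env.get("UPDATE_TIMEZONE", "Asia/Kolkata"),
        market_build_workers=int(env.get("MARKET_BUILD_WORKERS", "2")),
    )
```

Passing a plain dict instead of `os.environ` makes the loader easy to exercise without touching the real environment.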

## API Routes

- `GET /health` - Space health, file checks, latest data date, and update status.
- `GET /api/status` - same payload as `/health`, intended for frontend polling.
- `GET /api/forecast/latest` - latest stock high/low, first-extrema, and Nifty forecasts.
- `GET /api/models/summaries` - model summary JSONs.
- `GET /api/data/catalog` - searchable data manifest.
- `GET /api/data/sample?category=bars&asset=nifty50&timeframe=1d` - small sample from a manifest dataset.
- `POST /api/cron/tick` - Netlify scheduled ping endpoint; starts an update only when due.
- `POST /api/update/start` - manual update trigger. Send the `x-admin-secret` header if `CRON_SECRET` or `ADMIN_SECRET` is set.
- `POST /api/dataset/sync` - manually sync the Hugging Face Dataset repo into the Space runtime.
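
A client for the routes above might look like the following sketch, which uses only the standard library. The helper names `build_request` and `call` are hypothetical, and the sketch assumes the endpoints return JSON bodies, which this README does not state explicitly.

```python
import json
from typing import Any, Mapping, Optional
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def build_request(
    base_url: str,
    path: str,
    method: str = "GET",
    params: Optional[Mapping[str, str]] = None,
    admin_secret: Optional[str] = None,
) -> Request:
    """Assemble a request for one backend route without sending it."""
    url = base_url.rstrip("/") + path
    if params:
        url += "?" + urlencode(params)
    headers = {"Accept": "application/json"}
    if admin_secret:
        # Header name taken from the route list; only sent when a secret is given.
        headers["x-admin-secret"] = admin_secret
    return Request(url, method=method, headers=headers)


def call(request: Request, timeout: float = 30.0) -> Any:
    """Send the request and decode the body (assumed to be JSON)."""
    with urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

For example, `build_request(space_url, "/api/data/sample", params={"category": "bars", "asset": "nifty50", "timeframe": "1d"})` reproduces the sample query shown in the route list, and `build_request(space_url, "/api/update/start", method="POST", admin_secret=secret)` prepares a manual update trigger.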

## Netlify Keep-Awake Cron

The `frontend` folder now includes:

- `frontend/netlify.toml`
- `frontend/netlify/functions/keep-space-awake.mjs`

On Netlify, set these environment variables:

- `HUGGING_FACE_SPACE_URL=https://YOUR-HF-USERNAME-YOUR-SPACE.hf.space`
- `CRON_SECRET=<same value as the Space CRON_SECRET>`

The scheduled function runs every 10 minutes and calls `/api/cron/tick`. This keeps the Space warm and lets the backend start its daily update/retrain job after the configured market-close time.
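
The "start only when due" behaviour of `/api/cron/tick` can be illustrated with a small check like the one below. It is a sketch under stated assumptions, not the backend's actual logic: the function names are hypothetical, the rule is simply "the local time has passed `DAILY_UPDATE_TIME` and no update has run yet today", and weekends and market holidays are ignored.

```python
from datetime import date, datetime, time, timedelta, timezone, tzinfo
from typing import Optional

# Fixed-offset stand-in for Asia/Kolkata (UTC+05:30, no daylight saving time).
IST = timezone(timedelta(hours=5, minutes=30))


def parse_daily_time(value: str) -> time:
    """Turn an 'HH:MM' string such as '17:30' into a time object."""
    hour, minute = value.split(":")
    return time(int(hour), int(minute))


def is_update_due(
    now: datetime,
    daily_time: str,
    tz: tzinfo,
    last_run_date: Optional[date],
) -> bool:
    """Return True when a tick at `now` should start the daily update.

    `now` must be timezone-aware; `last_run_date` is the local calendar date
    of the most recent update, or None if no update has run yet.
    """
    local_now = now.astimezone(tz)
    if last_run_date is not None and last_run_date >= local_now.date():
        return False  # already ran today
    return local_now.time() >= parse_daily_time(daily_time)
```

With ticks arriving every 10 minutes, a check like this fires on the first tick after 17:30 local time and stays quiet for the rest of the day. In place of the fixed offset, `zoneinfo.ZoneInfo("Asia/Kolkata")` can be passed as `tz` where the time zone database is available.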

## Layout

- `app.py` - FastAPI backend app for Hugging Face Spaces.
- `Dockerfile` - Docker Space runtime setup.
- `requirements.txt` - Python dependencies.
- `research_runtime/Code/models/` - trainable model packages and the small latest forecast/summary outputs needed by the API.
- `research_runtime/Code/scripts/data_ingestion/` - data refresh scripts used by update jobs.
- `research_runtime/Code/scripts/data_preparation/` - research data rebuild scripts used by update jobs.

`research_runtime/Data/` and `research_runtime/Alt Data/` are intentionally not bundled in the Space repo anymore. They now live in the separate Hugging Face Dataset repo and are downloaded into `research_runtime/` by the backend when `HF_DATASET_REPO_ID` is set.
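
The dataset download described above could be done with `snapshot_download` from the `huggingface_hub` package. The sketch below is an assumption about how such a sync might be wired, not a copy of the backend code: the helper names are hypothetical, and it assumes `huggingface_hub` is installed in the Space image.

```python
import os
from pathlib import Path
from typing import Dict, Mapping, Optional


def dataset_download_kwargs(
    env: Mapping[str, str],
    runtime_dir: Path = Path("research_runtime"),
) -> Optional[Dict[str, str]]:
    """Build snapshot_download arguments, or None when no dataset repo is set."""
    repo_id = env.get("HF_DATASET_REPO_ID", "").strip()
    if not repo_id:
        return None
    return {
        "repo_id": repo_id,
        "repo_type": "dataset",
        "revision": env.get("HF_DATASET_REVISION", "main"),
        # Files keep their repo-relative paths (Data/..., Alt Data/...) under this folder.
        "local_dir": str(runtime_dir),
    }


def sync_dataset(env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Download the Dataset repo into the runtime folder; skip if unconfigured."""
    kwargs = dataset_download_kwargs(env)
    if kwargs is None:
        return None
    # Imported lazily so the module loads even without the third-party package.
    from huggingface_hub import snapshot_download

    return snapshot_download(**kwargs)
```

Keeping the argument building separate from the download makes the configuration logic testable without network access.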

## Main Model Outputs To Wire First

- Stock high/low forecasts: `research_runtime/Code/models/stock_high_low_forecaster/outputs/latest_forecasts.csv`
- Stock high/low metrics: `research_runtime/Code/models/stock_high_low_forecaster/outputs/metrics_by_symbol.csv`
- First-extrema forecasts: `research_runtime/Code/models/first_extrema_forecaster/outputs/latest_forecasts.csv`
- Nifty forecasts: `research_runtime/Code/models/nifty_forecaster/outputs/forecaster_latest_forecasts.csv`
- Nifty summary: `research_runtime/Code/models/nifty_forecaster/outputs/forecaster_summary.json`
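
As an illustration of wiring these outputs, the sketch below loads the three forecast CSVs into plain dictionaries. The function names and the `OUTPUTS` keys are hypothetical; the column names are not documented here, so rows are returned exactly as `csv.DictReader` yields them, and a missing file yields an empty list rather than an error.

```python
import csv
from pathlib import Path
from typing import Dict, List, Union

# Paths relative to research_runtime/, taken from the list above.
OUTPUTS = {
    "stock_high_low": "Code/models/stock_high_low_forecaster/outputs/latest_forecasts.csv",
    "first_extrema": "Code/models/first_extrema_forecaster/outputs/latest_forecasts.csv",
    "nifty": "Code/models/nifty_forecaster/outputs/forecaster_latest_forecasts.csv",
}


def read_rows(path: Union[str, Path]) -> List[Dict[str, str]]:
    """Read a CSV file into a list of row dicts; return [] if it is missing."""
    path = Path(path)
    if not path.is_file():
        return []
    with path.open(newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))


def load_latest_forecasts(
    runtime_dir: Union[str, Path] = "research_runtime",
) -> Dict[str, List[Dict[str, str]]]:
    """Load every forecast CSV, keyed by a short model name."""
    root = Path(runtime_dir)
    return {name: read_rows(root / rel) for name, rel in OUTPUTS.items()}
```

All values come back as strings, so any numeric conversion would need to happen once the real column names are known.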

## Training Entrypoints

Run these from `backend/research_runtime` so project-relative paths resolve correctly:

```powershell
python Code\models\stock_high_low_forecaster\train.py
python Code\models\first_extrema_forecaster\train.py
python Code\models\nifty_forecaster\train.py
```
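
Because the Space itself runs in a Linux container, a platform-neutral way to launch the same three scripts may be useful. The runner below is a hypothetical sketch: `build_commands` and `run_all` are not part of the repository, and the scripts are run one after another with the working directory set to `research_runtime`.

```python
import subprocess
import sys
from pathlib import Path
from typing import List, Union

# The three training entrypoints, relative to research_runtime/.
TRAIN_SCRIPTS = [
    Path("Code/models/stock_high_low_forecaster/train.py"),
    Path("Code/models/first_extrema_forecaster/train.py"),
    Path("Code/models/nifty_forecaster/train.py"),
]


def build_commands(python: str = sys.executable) -> List[List[str]]:
    """Return one command line per training script, using the given interpreter."""
    return [[python, str(script)] for script in TRAIN_SCRIPTS]


def run_all(runtime_dir: Union[str, Path] = "research_runtime") -> None:
    """Run each training script in order; stop at the first failure."""
    for command in build_commands():
        # cwd makes the project-relative paths inside the scripts resolve.
        subprocess.run(command, cwd=runtime_dir, check=True)
```

`pathlib` renders the script paths with the separator of the current platform, so the same code works in PowerShell and in the container.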

## Data Labels

These live in the separate Dataset repo:

- Raw minute OHLCV: `Data/raw/minute/*_minute.csv`
- Processed bars: `Data/processed/bars/{1m,5m,1h,4h,1d}/*.csv`
- Processed features: `Data/processed/features/{1m,5m,1h,4h,1d}/*.csv`
- Market panels: `Data/processed/panels/*_market_panel.csv`
- Master daily panel: `Data/processed/panels/daily_master_panel.csv`
- Data manifest: `Data/metadata/manifest.csv`
- Feature dictionary: `Data/metadata/feature_dictionary.csv`
- Options features: `Alt Data/options/processed/*_options_daily_features.csv`
- Institutional panel: `Alt Data/institutional/processed/institutional_daily_panel.csv`
- External daily panel: `Alt Data/external/processed/external_daily_panel.csv`
- Corporate events: `Alt Data/corporate/processed/corporate_announcements.csv`
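
The `{1m,5m,1h,4h,1d}` patterns above are shell-style brace shorthand for one folder per timeframe. A small sketch of expanding them with `pathlib` follows; the function name `list_processed_files` is hypothetical, and the sketch assumes the Dataset repo has already been downloaded into the given runtime folder.

```python
from pathlib import Path
from typing import Dict, List, Union

TIMEFRAMES = ("1m", "5m", "1h", "4h", "1d")


def list_processed_files(
    runtime_dir: Union[str, Path],
    kind: str = "bars",
) -> Dict[str, List[Path]]:
    """List CSV files per timeframe under Data/processed/<kind>/.

    `kind` is "bars" or "features"; a missing timeframe folder gives [].
    """
    root = Path(runtime_dir) / "Data" / "processed" / kind
    files: Dict[str, List[Path]] = {}
    for timeframe in TIMEFRAMES:
        folder = root / timeframe
        files[timeframe] = sorted(folder.glob("*.csv")) if folder.is_dir() else []
    return files
```

A listing like this is one way to check which timeframes arrived after a dataset sync before any update job runs.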

## Frontend Wiring Notes

The current frontend uses static mock data in `frontend/index.html` and `frontend/script.js`.

- Forecast cards can call `/api/forecast/latest`.
- Model accuracy and version/date stats can call `/api/models/summaries`.
- Market Data can call `/api/data/catalog` and `/api/data/sample`.

## Pruned From Backend

- Kotak credential/runtime files.
- Live-trading scripts and live broker artifacts.
- Kotak monitor artifacts and cached NSE temp folders.
- Python `__pycache__` folders.
- CatBoost generated training-log folder.
- One-off maintenance/backfill scripts.
- Backtest artifacts, chart images, old trade reports, test prediction dumps, generated training datasets, and saved model binaries.

`KOTAKBANK` CSV files remain because those are normal market datasets for Kotak Mahindra Bank, not broker-runtime files.