	# Chronos 2 Time Series Forecasting Application

	A production-ready web application for testing Amazon's Chronos 2 time series forecasting model using the latest `Chronos2Pipeline` API. Built with Dash for enterprise scalability and designed for both local development and cloud deployment.

	## Features

	- Latest Chronos 2 API: Uses `Chronos2Pipeline.predict_df()` with DataFrame-based interface
	- Interactive Forecasting: Generate forecasts up to 365 days with adjustable confidence intervals
	- Dual Model Support: Switch between Fast (Chronos-Bolt) and Accurate (Chronos-2) variants
	- Multivariate Ready: Built on Chronos 2 architecture supporting multivariate forecasting
	- Flexible Data Input: Upload CSV/Excel files or use sample datasets
	- Rich Visualizations: Interactive Plotly charts with confidence bands and zoom capabilities
	- Data Quality Analysis: Automatic preprocessing with quality reports
	- GPU Acceleration: Automatic CUDA support with CPU fallback
	- Security Hardened: Non-root Docker containers, server-side validation, filename sanitization
	- Production Ready: Designed for deployment on local machines or Databricks Apps
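
As an illustration of the filename sanitization mentioned above, here is a minimal sketch. The function name `sanitize_filename`, the allowed character set, and the `upload` fallback are assumptions made for this example, not the app's actual `validators.py`.

```python
import re
from pathlib import PureWindowsPath


def sanitize_filename(name: str, max_length: int = 100) -> str:
    """Reduce an uploaded file name to a safe base name (illustrative only)."""
    # PureWindowsPath splits on both "/" and "\", so directory parts
    # from either platform are dropped here.
    base = PureWindowsPath(name).name
    # Replace anything outside a conservative whitelist with "_".
    base = re.sub(r"[^A-Za-z0-9._-]", "_", base)
    # Strip leading dots so the result is never "." or ".." or a hidden file.
    base = base.lstrip(".")
    # Fall back to a fixed name if nothing usable is left.
    return base[:max_length] or "upload"


traversal = sanitize_filename("../../etc/passwd")
windows = sanitize_filename("C:\\temp\\my file.csv")
dots_only = sanitize_filename("..")
```

Note that truncating to `max_length` can cut off the extension of a very long name; a stricter version would preserve the suffix.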

	## Architecture

	Built following best practices for scalability and maintainability:

	- Dash Framework: Stateless callbacks that can scale to many concurrent users when served by multiple workers
	- Plotly Visualizations: Smooth rendering of 100K+ data points
	- Model Caching: Chronos 2 loaded once at startup for fast inference
	- Client-Side State: Efficient state management without server sessions
	- Modular Design: Clean separation of components, services, and utilities
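
The model-caching idea can be sketched as a small thread-safe cache that calls a loader at most once per variant. This is only an illustration: `ModelCache` and `fake_loader` are hypothetical names, and in the real app the loader would presumably build the Chronos pipeline instead of returning a string.

```python
import threading
from typing import Any, Callable, Dict, List


class ModelCache:
    """Load each model variant at most once and reuse it afterwards."""

    def __init__(self, loader: Callable[[str], Any]) -> None:
        self._loader = loader            # hypothetical: would build the real pipeline
        self._models: Dict[str, Any] = {}
        self._lock = threading.Lock()    # guards concurrent first-time loads

    def get(self, variant: str) -> Any:
        with self._lock:
            if variant not in self._models:
                self._models[variant] = self._loader(variant)
            return self._models[variant]


load_calls: List[str] = []


def fake_loader(variant: str) -> str:
    """Stand-in for an expensive model load; records every call."""
    load_calls.append(variant)
    return f"pipeline-for-{variant}"


cache = ModelCache(fake_loader)
first = cache.get("fast")
second = cache.get("fast")  # served from the cache, loader is not called again
```

Holding the lock during the load keeps the sketch simple; it also means a slow first load blocks other callers until it finishes.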

	## Installation

	### Prerequisites

	- Python 3.10+
	- CUDA-capable GPU (optional, for faster inference)
	- 8GB+ RAM (4-8GB for model + overhead)

	### Local Setup

	1. Clone the repository
	```bash
	git clone <repository-url>
	cd chronos2-forecasting-app
	```

	2. Create a virtual environment
	```bash
	python -m venv venv

	# On Windows
	venv\Scripts\activate

	# On Linux/Mac
	source venv/bin/activate
	```

	3. Install dependencies
	```bash
	pip install -r requirements.txt
	```

	4. Run the application
	```bash
	python app.py
	```

	5. Access the app
	Open your browser to `http://127.0.0.1:8050`

	## Usage Guide

	### Quick Start

	1. Load Sample Data
	- Click one of the sample dataset buttons (Electricity, Retail, Manufacturing)
	- Or upload your own CSV/Excel file

	2. Configure Data
	- Select the date column
	- Select the target variable to forecast
	- (Optional) Select an ID column if the file contains multiple series

	3. Set Forecast Parameters
	- Adjust the forecast horizon (1-365 days)
	- Select confidence levels (80%, 90%, 95%, 99%)
	- Choose model variant (Fast or Accurate)

	4. Generate Forecast
	- Click "Generate Forecast" button
	- Wait for model inference (typically 1-5 seconds)
	- View interactive chart with confidence intervals
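
The steps above can be sketched in code. The helpers below turn the selected confidence levels into quantile levels and pass them to `Chronos2Pipeline.predict_df()`. The helper names are hypothetical. The `from_pretrained` and `predict_df` keyword arguments follow the public `chronos-forecasting` README as best I can tell and should be checked against the installed version. `model_path` is left to the caller on purpose.

```python
from typing import Any, Dict, Iterable, List, Tuple


def interval_quantiles(level: float) -> Tuple[float, float]:
    """Return the (lower, upper) quantiles of a central interval, e.g. 0.80 -> (0.1, 0.9)."""
    if not 0.0 < level < 1.0:
        raise ValueError("level must be strictly between 0 and 1")
    tail = (1.0 - level) / 2.0
    # Rounding removes floating-point noise such as 0.09999999999999998.
    return round(tail, 4), round(1.0 - tail, 4)


def quantile_levels(levels: Iterable[float]) -> List[float]:
    """Collect the median plus both bounds of every requested interval, sorted."""
    wanted = {0.5}
    for level in levels:
        wanted.update(interval_quantiles(level))
    return sorted(wanted)


def build_predict_kwargs(horizon: int, levels: Iterable[float]) -> Dict[str, Any]:
    """Assemble keyword arguments for predict_df (column names are assumptions)."""
    if not 1 <= horizon <= 365:
        raise ValueError("horizon must be between 1 and 365")
    return {
        "prediction_length": horizon,
        "quantile_levels": quantile_levels(levels),
        "id_column": "id",
        "timestamp_column": "timestamp",
        "target": "target",
    }


def forecast(context_df, model_path: str, horizon: int, levels: Iterable[float], device: str = "cpu"):
    """Run one forecast; requires the chronos-forecasting package (not imported at module level)."""
    from chronos import Chronos2Pipeline

    # A real app would cache the pipeline instead of reloading it on every call.
    pipeline = Chronos2Pipeline.from_pretrained(model_path, device_map=device)
    return pipeline.predict_df(context_df, **build_predict_kwargs(horizon, levels))


example_levels = quantile_levels([0.80, 0.95])
```

Here `context_df` is expected to be a long-format DataFrame with `id`, `timestamp`, and `target` columns, matching the data format noted in the changelog.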

	### Data Requirements

	Your data should have:
	- Date column: Any standard date format
	- Target column: Numeric values to forecast
	- Minimum rows: At least 2x the forecast horizon
	- File size: Up to 100MB
	- Formats: CSV, XLSX, XLS
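
A minimal sketch of how these requirements could be checked on the server. The function name `check_upload` and the message strings are made up for this example; only the limits come from the list above.

```python
from pathlib import Path
from typing import List

ALLOWED_EXTENSIONS = {".csv", ".xlsx", ".xls"}
MAX_FILE_BYTES = 100 * 1024 * 1024  # the 100MB limit listed above


def check_upload(filename: str, size_bytes: int, n_rows: int, horizon: int) -> List[str]:
    """Return a list of problems; an empty list means the upload looks acceptable."""
    problems: List[str] = []
    if Path(filename).suffix.lower() not in ALLOWED_EXTENSIONS:
        problems.append("unsupported format")
    if size_bytes > MAX_FILE_BYTES:
        problems.append("file too large")
    if n_rows < 2 * horizon:
        problems.append("not enough history")
    return problems


ok = check_upload("sales.csv", size_bytes=1024, n_rows=60, horizon=30)
bad = check_upload("sales.txt", size_bytes=200 * 1024 * 1024, n_rows=10, horizon=30)
```

Returning every problem at once, rather than raising on the first one, lets the UI show the user a complete list in a single pass.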

	### Tips for Best Results

	- Use at least 2x the forecast horizon in historical data
	- Clean your data before upload (though the app handles basic preprocessing)
	- Start with the Fast model variant for quick testing
	- Use the Accurate variant for final forecasts
	- Higher confidence levels produce wider intervals, which gives more conservative forecasts

	## Project Structure

	```
	chronos2-forecasting-app/
	├── app.py                  # Main Dash application
	├── components/             # UI components
	│   ├── upload.py           # File upload component
	│   ├── chart.py            # Chart generation
	│   └── controls.py         # Parameter controls
	├── services/               # Business logic
	│   ├── model_service.py    # Chronos model wrapper
	│   ├── data_processor.py   # Data preprocessing
	│   └── cache_manager.py    # Caching logic
	├── utils/                  # Utilities
	│   ├── validators.py       # Input validation
	│   └── metrics.py          # Forecast metrics
	├── config/                 # Configuration
	│   ├── settings.py         # Environment settings
	│   └── constants.py        # App constants
	├── datasets/               # Sample datasets
	├── static/                 # Static assets
	│   └── custom.css          # Custom styles
	├── requirements.txt        # Python dependencies
	├── Dockerfile              # Container definition
	└── README.md               # This file
	```

	## Configuration

	### Environment Variables

	- `ENVIRONMENT`: Set to `local` or `production`
	- `DEVICE`: Set to `auto`, `cuda`, or `cpu`
	- `LOG_LEVEL`: Set to `DEBUG`, `INFO`, `WARNING`, or `ERROR`
	- `DATABRICKS_APP_PORT`: Port for Databricks deployment (default: 8080)

	### Local vs Databricks Configuration

	The app automatically detects the environment and adjusts settings:

	Local Development:
	- Host: 127.0.0.1
	- Port: 8050
	- Debug: Enabled
	- Storage: Local directories

	Databricks Deployment:
	- Host: 0.0.0.0
	- Port: 8080 (or DATABRICKS_APP_PORT)
	- Debug: Disabled
	- Storage: /tmp and /dbfs
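
The two profiles above could be resolved from the environment variables roughly as follows. This is a sketch, not the app's `settings.py`: the `Settings` class, the `load_settings` name, and the defaults for `ENVIRONMENT`, `DEVICE`, and `LOG_LEVEL` are assumptions.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    environment: str
    host: str
    port: int
    debug: bool
    device: str
    log_level: str


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build settings from a mapping of environment variables."""
    environment = env.get("ENVIRONMENT", "local")  # assumed default
    production = environment == "production"
    return Settings(
        environment=environment,
        # Bind to all interfaces only when deployed.
        host="0.0.0.0" if production else "127.0.0.1",
        port=int(env.get("DATABRICKS_APP_PORT", "8080")) if production else 8050,
        debug=not production,
        device=env.get("DEVICE", "auto"),          # assumed default
        log_level=env.get("LOG_LEVEL", "INFO"),    # assumed default
    )


local = load_settings({})
deployed = load_settings({"ENVIRONMENT": "production", "DATABRICKS_APP_PORT": "9000"})
```

Passing the mapping as an argument makes the function easy to test without touching the real process environment.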

	## Deployment

	### Hugging Face Spaces (Recommended for Free Hosting)

	The easiest way to deploy this app for free:

	1. Create a Hugging Face account at https://huggingface.co

	2. Create a new Space
	- Go to https://huggingface.co/spaces
	- Click "Create new Space"
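	- Select "Docker" as the SDK (Spaces has no dedicated Dash SDK; Dash apps run as Docker Spaces)
	- Choose a name for your Space

	3. Upload your code
	- Option A: Connect your GitHub repository (recommended)
	- Option B: Upload files directly through the web interface

	4. Configure the Space
	- The `Dockerfile` defines the entry point, which should start `app.py`
	- Make sure the port the app listens on matches the Space's `app_port` setting (7860 by default)
	- The free CPU tier of Hugging Face Spaces provides 16GB RAM (sufficient for Chronos-2)
	- Optional: Request GPU upgrade for faster inference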

	5. Access your deployed app
	- Your app will be live at: `https://huggingface.co/spaces/YOUR_USERNAME/YOUR_SPACE_NAME`

	Note: First startup may take 2-3 minutes as the Chronos-2 model downloads (~500MB).

	### Docker Deployment

	1. Build the image
	```bash
	docker build -t chronos2-forecasting .
	```

	2. Run the container
	```bash
	docker run -p 8080:8080 chronos2-forecasting
	```

	3. With GPU support
	```bash
	docker run --gpus all -p 8080:8080 chronos2-forecasting
	```

	### Databricks Apps Deployment

	1. Upload code to DBFS
	```bash
	databricks fs cp -r . dbfs:/apps/chronos2-forecasting/
	```

	2. Create Databricks App
	- Use the Databricks Apps UI
	- Point to the uploaded directory
	- Set environment variable: `ENVIRONMENT=production`

	3. Configure resources
	- Minimum: 8GB RAM
	- Recommended: GPU instance for faster inference

	### Production Considerations

	- Memory: Allocate 6-8GB for the model + overhead
	- Scaling: Use multiple workers with Gunicorn
	- Monitoring: Check `/health` endpoint for status
	- Logging: Logs to stdout for easy collection
	- Timeouts: Set to 300s+ for large forecasts
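
These considerations could be captured in a Gunicorn configuration file, which is plain Python. The file name `gunicorn.conf.py`, the `WEB_CONCURRENCY` override, and the default of 4 workers are choices made for this sketch, not settings confirmed by the project.

```python
# gunicorn.conf.py (hypothetical file; adjust values to your deployment)
import os

# Listen on all interfaces, on the same port variable the app documents.
bind = f"0.0.0.0:{os.environ.get('DATABRICKS_APP_PORT', '8080')}"

# Number of worker processes; overridable through the environment.
workers = int(os.environ.get("WEB_CONCURRENCY", "4"))

# Allow long-running forecasts before a worker is considered hung (seconds).
timeout = 300

# "-" sends access and error logs to stdout/stderr for easy collection.
accesslog = "-"
errorlog = "-"
```

It would be started with something like `gunicorn -c gunicorn.conf.py app:server`, assuming `app.py` exposes the underlying Flask server as `server = app.server`. Each worker is a separate process and will typically hold its own copy of the model, so the memory figure above should be budgeted per worker.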

	## API Reference

	### Health Check Endpoint

	```
	GET /health
	```

	Returns:
	```json
	{
	"status": "healthy",
	"model_loaded": true,
	"model_variant": "fast",
	"device": "cuda"
	}
	```
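
A monitoring script might consume this endpoint as sketched below. The function names are hypothetical, and the readiness rule (status `healthy` and `model_loaded` true) is an assumption based on the sample response.

```python
import json
import urllib.request
from typing import Any, Dict


def is_ready(health: Dict[str, Any]) -> bool:
    """Treat the app as ready only when it is healthy and the model is loaded."""
    return health.get("status") == "healthy" and health.get("model_loaded") is True


def fetch_health(base_url: str, timeout: float = 5.0) -> Dict[str, Any]:
    """GET <base_url>/health and decode the JSON body."""
    url = f"{base_url.rstrip('/')}/health"
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Offline check against the sample response shown above.
sample = json.loads(
    '{"status": "healthy", "model_loaded": true, "model_variant": "fast", "device": "cuda"}'
)
ready = is_ready(sample)
```

Against a running instance, `is_ready(fetch_health("http://127.0.0.1:8050"))` would perform the same check over HTTP.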

	## Troubleshooting

	### Model Loading Issues

	Problem: Model fails to load
	- Check available memory (need 4-8GB)
	- Try CPU mode: Set `DEVICE=cpu`
	- Check internet connection (first run downloads model)

	### GPU Not Detected

	Problem: CUDA device not found
	- Verify CUDA installation: `python -c "import torch; print(torch.cuda.is_available())"`
	- Install correct PyTorch version for your CUDA
	- App will automatically fall back to CPU
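
The fallback behaviour can be sketched as a pure function of the `DEVICE` setting and CUDA availability. `resolve_device` and `cuda_available` are hypothetical names; the only library call used is `torch.cuda.is_available()`.

```python
def cuda_available() -> bool:
    """Report whether PyTorch can see a CUDA device; False if torch is missing."""
    try:
        import torch
    except ImportError:
        return False
    return torch.cuda.is_available()


def resolve_device(requested: str, has_cuda: bool) -> str:
    """Map the DEVICE setting (auto, cuda, cpu) to the device actually used."""
    requested = requested.lower()
    if requested not in {"auto", "cuda", "cpu"}:
        raise ValueError(f"unknown device setting: {requested!r}")
    if requested == "cpu":
        return "cpu"
    # Both "auto" and "cuda" fall back to CPU when no GPU is visible.
    return "cuda" if has_cuda else "cpu"


chosen = resolve_device("auto", has_cuda=False)
```

In the app this would be called as `resolve_device(os.environ.get("DEVICE", "auto"), cuda_available())`; keeping the CUDA probe separate makes the decision logic testable on machines without a GPU.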

	### Upload Failures

	Problem: File upload fails
	- Check file size (<100MB)
	- Verify file format (CSV, XLSX, XLS)
	- Ensure file is not corrupted

	### Slow Performance

	Problem: Forecasts take too long
	- Use Fast model variant instead of Accurate
	- Reduce forecast horizon
	- Enable GPU acceleration
	- Limit data points (app decimates to 10K for display)
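
The README does not say how the app decimates data for display. One simple approach, shown here as an assumption, is stride-based sampling that keeps every k-th point.

```python
import math
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def decimate(points: Sequence[T], max_points: int = 10_000) -> List[T]:
    """Keep every k-th point so that at most max_points remain."""
    if max_points < 1:
        raise ValueError("max_points must be at least 1")
    if len(points) <= max_points:
        return list(points)
    # Smallest stride that brings the series under the limit.
    stride = math.ceil(len(points) / max_points)
    return list(points[::stride])


series = list(range(25_000))
reduced = decimate(series)
```

Stride sampling is cheap but can drop short spikes; shape-preserving methods such as min/max bucketing or LTTB are common alternatives when peaks matter.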

	### Memory Errors

	Problem: Out of memory during inference
	- Switch to Fast model variant (smaller)
	- Use CPU instead of GPU if GPU memory is the bottleneck
	- Reduce batch size in model_service.py
	- Close other applications
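
Reducing the batch size amounts to sending fewer series through the model per call. A generic chunking helper is sketched below; `batched` is a hypothetical name, and how `model_service.py` actually batches its input is not documented here.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def batched(items: Sequence[T], batch_size: int) -> Iterator[List[T]]:
    """Yield consecutive chunks of at most batch_size items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield list(items[start:start + batch_size])


series_ids = ["a", "b", "c", "d", "e"]
chunks = list(batched(series_ids, 2))
```

Smaller batches lower peak memory at the cost of more model calls, so halving the batch size until the error disappears is a reasonable first step.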

	## Performance Tuning

	### For Development
	- Enable debug mode for detailed logging
	- Use Fast model variant
	- Work with smaller datasets initially

	### For Production
	- Disable debug mode
	- Use GPU for inference
	- Enable caching (already configured)
	- Use Gunicorn with 4 workers
	- Set up monitoring and alerting

	## Contributing

	Contributions are welcome! Please:

	1. Fork the repository
	2. Create a feature branch
	3. Make your changes
	4. Add tests if applicable
	5. Submit a pull request

	## License

	This project is provided as-is for educational and research purposes.

	## Acknowledgments

	- Chronos Model: Amazon Science
	- Dash Framework: Plotly
	- Sample Data: Generated for demonstration purposes

	## Support

	For issues, questions, or suggestions:
	- Open an issue in the repository
	- Check existing documentation
	- Review troubleshooting guide above

	## Changelog

	### Version 1.0.1 (Latest - Chronos 2 Full Implementation)
	- BREAKING: Migrated to Chronos 2 API with `Chronos2Pipeline`
	- Fixed deprecated pandas methods (`fillna(method=...)` → `ffill()`/`bfill()`)
	- Updated to `chronos-forecasting==2.0.0` package
	- Fixed type hints (`any` → `Any`) across all modules
	- Added DataFrame-based prediction interface
	- Security improvements:
	  - Non-root user in Docker container
	  - Server-side file validation
	  - Filename sanitization
	- Health check timeout configuration
	- Updated model paths to support Chronos-2 (s3://autogluon/chronos-2)
	- Fixed data format compatibility (id/timestamp/target columns)
	- Added `requests` library for health checks

	### Version 1.0.0 (Initial Release)
	- Chronos 2 model integration
	- Single-page Dash application
	- CSV/Excel upload support
	- Interactive visualizations
	- Confidence interval display
	- Sample datasets included
	- Docker deployment ready
	- Databricks Apps compatible

	## Roadmap

	Future enhancements being considered:
	- Multi-series forecasting UI
	- Model comparison features
	- Export forecast results
	- Custom model fine-tuning
	- Real-time data streaming
	- Advanced metrics dashboard
	- API-only mode for programmatic access

	---

	Built with Dash and Chronos 2 for production-ready time series forecasting.