---
title: "Dacon Broadcast Article Performance Predictor"
emoji: "📰"
colorFrom: blue
colorTo: purple
sdk: docker
app_port: 7860
short_description: "AI-powered article KPI predictions and SEO recommendations"
tags:
  - flask
  - seo
  - analytics
  - journalism
datasets: []
models: []
suggested_hardware: cpu-upgrade
suggested_storage: medium
pinned: false
---

# Dacon Broadcast Article Performance Predictor

This project hosts a Flask web application that predicts article performance and provides AI-powered SEO recommendations.

## Local development

1. Create a virtual environment and install dependencies:

   ```powershell
   python -m venv .venv
   .\.venv\Scripts\Activate.ps1
   pip install -r requirements.txt
   ```

2. Ensure the model artifacts are generated:

   ```powershell
   .\.venv\Scripts\python.exe train_and_save_models.py
   ```

3. Add your Google Generative AI key to a `.env` file:

   ```ini
   GEMINI_API_KEY=your-api-key
   ```

4. Run the development server:

   ```powershell
   .\.venv\Scripts\python.exe app.py
   ```

## Production deployment (Gunicorn + Nginx)

1. **Copy the project to the server** (e.g., `/srv/dacon_broadcast_paper`).
2. **Create a virtual environment** and install requirements as above.
3. **Generate artifacts** on the server or copy them from the local build.
4. **Configure environment variables**:

   ```bash
   echo "GEMINI_API_KEY=your-api-key" | sudo tee /etc/dacon_app.env
   ```

5. **Test Gunicorn manually**:

   ```bash
   cd /srv/dacon_broadcast_paper
   source .venv/bin/activate
   gunicorn --bind 127.0.0.1:8000 --workers 3 --timeout 120 wsgi:application
   ```

### systemd service

Use `deploy/dacon_app.service` as a template:

```bash
sudo cp deploy/dacon_app.service /etc/systemd/system/dacon_app.service
sudo systemctl daemon-reload
sudo systemctl enable dacon_app
sudo systemctl start dacon_app
sudo systemctl status dacon_app
```

Adjust the `WorkingDirectory`, `ExecStart`, and `Environment` entries to match your server paths, or reference `/etc/dacon_app.env` with `EnvironmentFile=` if preferred.

### Nginx reverse proxy

1. Install Nginx (`sudo apt install nginx`).
2. Copy the provided config:

   ```bash
   sudo cp deploy/dacon_app.nginx.conf /etc/nginx/sites-available/dacon_app
   sudo ln -s /etc/nginx/sites-available/dacon_app /etc/nginx/sites-enabled/
   sudo nginx -t
   sudo systemctl reload nginx
   ```

3. Update `server_name` and any path aliases before reloading.
4. (Optional) Enable HTTPS via Certbot:

   ```bash
   sudo apt install certbot python3-certbot-nginx
   sudo certbot --nginx -d your-domain.com
   ```

### Firewall and health checks

- Open ports 80/443 via `ufw` or your cloud provider's security group.
- Use the `/healthz` endpoint for health monitoring.
- Logs:
  - Application: `journalctl -u dacon_app`
  - Nginx: `/var/log/nginx/access.log`, `/var/log/nginx/error.log`

## File overview

- `app.py` – Flask application with prediction and SEO endpoints.
- `wsgi.py` – WSGI entrypoint for production servers.
- `deploy/dacon_app.service` – sample systemd unit for Gunicorn.
- `deploy/dacon_app.nginx.conf` – sample Nginx reverse proxy configuration.
- `train_and_save_models.py` – pipeline that creates the required artifacts.
- `data_csv/` – CSV inputs used by the app.

## Troubleshooting

- If Gunicorn crashes, check for missing artifacts under `artifacts/`.
- Ensure the `.env` file or environment variables include `GEMINI_API_KEY`.
- Increase `client_max_body_size` in Nginx if large payloads are expected.
- For Windows hosting, consider running Gunicorn/Nginx via WSL2, or use IIS + FastCGI with `wsgi.py`.
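Both the Gunicorn command above and the IIS/FastCGI option target `wsgi:application`, so `wsgi.py` only needs to re-export the Flask app. A minimal sketch of what it typically looks like, assuming `app.py` defines a Flask instance named `app` (the actual name in this repo may differ):

```python
# wsgi.py -- minimal WSGI entrypoint sketch (assumes app.py exposes a Flask instance `app`)
from app import app as application  # Gunicorn/IIS import the `application` object

if __name__ == "__main__":
    # Handy for a quick local smoke test; production servers import `application` directly.
    application.run(host="127.0.0.1", port=8000)
```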
## Hugging Face Spaces deployment (Docker Space)

Hugging Face Spaces support custom web apps through Docker. Use the provided `Dockerfile` to containerize the app and expose it via Gunicorn.

1. **Prepare the repository**
   - Ensure all required artifacts (`*.pkl`) and the `data_csv/` folder are committed (Spaces pull the repo directly).
   - Keep individual files under 1 GB (the Spaces limit); use Git LFS for large artifacts if needed.
2. **Create a new Space**
   - On Hugging Face, click **Create Space**, choose the **Docker** SDK, and name it (e.g., `username/dacon-predictor`).
   - Leave the hardware at the default unless more RAM is required (~16 GB recommended because of the NLP dependencies).
3. **Push the code**
   - Create the Space and add it as a Git remote locally:

     ```bash
     huggingface-cli repo create username/dacon-predictor --type=space --space-sdk=docker
     git remote add space https://huggingface.co/spaces/username/dacon-predictor
     git push space main
     ```

   - Alternatively, clone the empty Space repo and copy the project files into it before pushing.
4. **Secrets & configuration**
   - In the Space settings, add a secret named `GEMINI_API_KEY` with your Google Generative AI key.
   - Optional: set `GUNICORN_WORKERS` to tune concurrency.
5. **Container build**
   - Spaces builds the `Dockerfile`, which installs system deps (OpenJDK, MeCab) and Python requirements, then launches Gunicorn bound to `$PORT` (HF uses port 7860 by default).
   - The app serves `index.html` via Flask, so no additional frontend wiring is required.
6. **Testing & monitoring**
   - Once the build finishes, open the Space URL to verify predictions and SEO generation.
   - Check the Space logs (Settings → Logs) for build/runtime issues, especially MeCab/Java errors.

### Space-specific tips

- **Cold start latency**: Spaces sleep when idle; the first request may take longer while the model artifacts load.
- **Resource usage**: If memory spikes occur (pandas + scikit-learn + MeCab), upgrade to a larger hardware tier.
- **Background tasks**: This setup serves HTTP requests only; long-running offline jobs should be run outside Spaces.
- **Security**: Secrets set in the HF UI aren't exposed in the repo. Avoid committing a `.env` with real keys.
- **Custom domains**: Hugging Face supports domain mapping on paid tiers if you need branding.
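On Spaces, `GEMINI_API_KEY` arrives as an environment variable (from the secret) and traffic is routed to `$PORT`, so the app should read its configuration from the environment rather than rely solely on a `.env` file. A hedged sketch of that startup logic, assuming the app uses `python-dotenv` for the local `.env` described earlier (the actual handling in `app.py` may differ):

```python
import os

from dotenv import load_dotenv  # assumption: python-dotenv is in requirements.txt

load_dotenv()  # loads .env for local runs; effectively a no-op on Spaces

# Secret configured in the Space settings (or .env locally).
GEMINI_API_KEY = os.environ.get("GEMINI_API_KEY")
if not GEMINI_API_KEY:
    # Illustrative guard only; the real app may degrade gracefully instead.
    raise RuntimeError("GEMINI_API_KEY is not set; add it as a Space secret or to .env")

# Spaces inject PORT; fall back to the app_port declared in the frontmatter (7860).
PORT = int(os.environ.get("PORT", "7860"))
```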