---
title: Streamlit Chatbot
emoji: "🗨️"
colorFrom: indigo
colorTo: purple
sdk: docker
app_file: app.py
pinned: false
---
# Streamlit Chatbot ✨
A lightweight chatbot built with [Streamlit](https://streamlit.io/) and the open-source `microsoft/DialoGPT-small` language model from [Hugging Face](https://huggingface.co/). This repository is ready to be deployed to [Hugging Face Spaces](https://huggingface.co/spaces) automatically through GitHub Actions.
## Features
* 🤖 **Open-source LLM** – Uses a small conversational model that runs comfortably on the free CPU hardware offered by Spaces (no GPU required).
* 💬 **Chat interface** – Powered by Streamlit 1.30+ `st.chat_*` components.
* 🔁 **Persistent history** – `st.session_state` keeps the conversation context for the duration of the browser session.
* 🚀 **1-click deploy** – Push to the `main` branch and GitHub Actions mirrors the repository to your Space.
---
## Quick start (local)
```bash
# 1. Install dependencies
python -m venv .venv && source .venv/bin/activate
pip install -r requirements.txt
# 2. Launch the app
streamlit run app.py
```
The app will open in your browser at `http://localhost:8501`.
---
## Quick start (Docker)
If you prefer to run the chatbot in a container instead of a local virtual environment, use the provided `Dockerfile`.
```bash
# 1. Build the image (tagged "streamlit-chatbot")
docker build -t streamlit-chatbot .
# 2. Run the container and expose the app on http://localhost:8501
docker run --rm -it -e PORT=8501 -p 8501:8501 streamlit-chatbot
```
The container entrypoint launches Streamlit on the port given by the `PORT` environment variable (the same variable Hugging Face uses). By passing `-e PORT=8501` and mapping `-p 8501:8501`, you can access the interface in your browser at `http://localhost:8501`.
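A minimal `Dockerfile` compatible with this setup might look like the following. This is an illustrative sketch; the repository's actual file may differ.

```dockerfile
FROM python:3.11-slim

WORKDIR /app

# Install dependencies first so Docker can cache this layer.
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

COPY . .

# Hugging Face Spaces injects PORT; default to 8501 for local runs.
ENV PORT=8501
EXPOSE 8501

# Shell form so ${PORT} is expanded at container start.
CMD streamlit run app.py --server.port=${PORT} --server.address=0.0.0.0
```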
---
## Manual deploy to Hugging Face Spaces (CLI)
If you'd rather push the repository yourself (skipping GitHub Actions):
```bash
# 1. Authenticate once (stores your token locally)
huggingface-cli login # paste your HF_TOKEN when prompted
# 2. (First time only) create the Space as a Docker Space
huggingface-cli repo create afscomercial/streamlit-chatbot \
--repo-type space --space-sdk docker -y # change the name accordingly
# 3. Add the new remote and push
cd path/to/streamlit_chatbot
git lfs install # enables Large-File Storage just in case
git remote add hf \
https://huggingface.co/spaces/afscomercial/streamlit-chatbot
git push hf main --force # overwrite contents of the Space
```
After the push the Space will rebuild the Docker image and redeploy automatically.
---
## Repository layout
```
.
├── app.py               # Streamlit application – chat UI
├── fine_tune.py         # Script to fine-tune the base LLM on JSONL data
├── requirements.txt     # Python dependencies
├── data/                # Example datasets (small, can live in git)
│   └── aviation_conversations.jsonl
├── research/            # Jupyter notebooks / ad-hoc DS experiments (untracked by CI)
├── .streamlit/
│   └── config.toml      # UI & server settings
├── .github/
│   └── workflows/
│       └── deploy-to-spaces.yml  # CI/CD – auto-deploy app
├── Dockerfile           # Container definition for Docker/HF Spaces
└── README.md
```
## Research folder
The `research/` directory is reserved for exploratory notebooks, data-science experiments, and scratch work that shouldn't affect the production application. Feel free to place notebooks, CSVs, or prototype scripts here. Anything computationally heavy or containing large files should **not** be committed; the folder is in the `.gitignore` by default.
## Fine-tuning the model (aviation example)
This repo ships with a tiny JSON-Lines dataset in `data/` that contains sample Q&A about aviation. A GitHub Action (`train-model.yml`) fine-tunes `microsoft/DialoGPT-small` on that data and pushes the checkpoint to the Hub as `afscomercial/streamlit-chatbot-aviation` (or the repo name you set in the `MODEL_REPO` secret).
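JSON-Lines is simply one JSON object per line. The records below are hypothetical (the actual field names are whatever `fine_tune.py` parses from `aviation_conversations.jsonl`), but the read/write pattern needs nothing beyond the standard library:

```python
import json

# Hypothetical records; the real dataset's schema may use different keys.
records = [
    {"prompt": "What does VFR stand for?", "response": "Visual Flight Rules."},
    {"prompt": "What is a NOTAM?", "response": "A notice with information essential to flight operations."},
]

# Write one JSON object per line (the JSONL convention)...
with open("sample.jsonl", "w", encoding="utf-8") as f:
    for rec in records:
        f.write(json.dumps(rec) + "\n")

# ...and read it back the same way.
with open("sample.jsonl", encoding="utf-8") as f:
    loaded = [json.loads(line) for line in f]

assert loaded == records
```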
You can also run it locally:
```bash
python -m venv .venv && source .venv/bin/activate
pip install -r requirements.txt
export HF_TOKEN=YOUR_WRITE_TOKEN
export MODEL_REPO="afscomercial/streamlit-chatbot-aviation"
python fine_tune.py
```
The script will train for one epoch (change `EPOCHS` if you wish) and push the new weights to the model repo.
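For orientation, the core of such a script usually boils down to the Hugging Face `Trainer` API. The skeleton below is hypothetical, not the actual `fine_tune.py`; reading `EPOCHS` and `MODEL_REPO` from the environment here is purely illustrative.

```python
import os

from transformers import (AutoModelForCausalLM, AutoTokenizer, Trainer,
                          TrainingArguments)


def build_trainer(dataset, model_name: str = "microsoft/DialoGPT-small") -> Trainer:
    """Hypothetical skeleton of a fine-tuning run; fine_tune.py may differ."""
    tok = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(model_name)
    args = TrainingArguments(
        output_dir="finetuned-aviation",
        num_train_epochs=int(os.environ.get("EPOCHS", 1)),  # one epoch by default
        push_to_hub=True,                                   # enables trainer.push_to_hub()
        hub_model_id=os.environ.get("MODEL_REPO"),          # target repo on the Hub
    )
    return Trainer(model=model, args=args, train_dataset=dataset, tokenizer=tok)


# Usage sketch: trainer = build_trainer(tokenized_dataset)
#               trainer.train()
#               trainer.push_to_hub()
```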
### Where the model is used
`app.py` now reads the environment variable `MODEL_REPO` (defaulting to `afscomercial/streamlit-chatbot-aviation`). At startup, the Streamlit app downloads the fine-tuned checkpoint instead of the vanilla DialoGPT model.
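That lookup presumably amounts to a one-line `os.environ` read with a fallback, along these lines:

```python
import os

# Falls back to the fine-tuned checkpoint when MODEL_REPO is not set.
MODEL_REPO = os.environ.get("MODEL_REPO", "afscomercial/streamlit-chatbot-aviation")
```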
### Pushing the fine-tuned model to the Hub
There are two convenient ways to upload your checkpoint to the Hub once the training run is finished.
#### Option 1 – let the training script push automatically
`fine_tune.py` ends with `trainer.push_to_hub()`, so you only need to:
```bash
# 1 · Authenticate once (stores your token locally)
huggingface-cli login # paste your HF access token
# 2 · (First time only) create the model repo on the Hub
huggingface-cli repo create <USER>/<MODEL_REPO> -y
# e.g. huggingface-cli repo create your-username/streamlit-chatbot-aviation -y
# 3 · Point the run to that repo (default shown below)
export MODEL_REPO="your-username/streamlit-chatbot-aviation"
# 4 · Launch training – the script will commit + push automatically
python fine_tune.py
```
#### Option 2 – push an existing folder manually
If you already have the fine-tuned files on disk (e.g. in `finetuned-aviation/`):
```bash
# 1 · Create the repo once
huggingface-cli repo create your-username/streamlit-chatbot-aviation -y
# 2 · Clone the empty repo & copy your files into it
git clone https://huggingface.co/your-username/streamlit-chatbot-aviation
cd streamlit-chatbot-aviation
cp -r /path/to/finetuned-aviation/* .
# 3 · Commit and push
git add .
git commit -m "First fine-tuned checkpoint"
git push
```
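Equivalently, the upload can be scripted from Python with the `huggingface_hub` client. The sketch below assumes a write token is available (from `huggingface-cli login` or `HF_TOKEN`); the function name `push_checkpoint` is illustrative.

```python
from huggingface_hub import HfApi


def push_checkpoint(local_dir: str, repo_id: str) -> None:
    """Create the model repo (if needed) and upload a checkpoint folder."""
    api = HfApi()  # uses the token stored by `huggingface-cli login`
    api.create_repo(repo_id, exist_ok=True)
    api.upload_folder(
        folder_path=local_dir,
        repo_id=repo_id,
        repo_type="model",
        commit_message="First fine-tuned checkpoint",
    )


# Example (requires a write token):
# push_checkpoint("finetuned-aviation", "your-username/streamlit-chatbot-aviation")
```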
After the checkpoint is online, simply point the Streamlit app to it (locally or on Spaces) with:
```bash
export MODEL_REPO="your-username/streamlit-chatbot-aviation"
streamlit run app.py
```
## Architecture
```mermaid
graph TD
subgraph "Frontend"
U["User<br/>Browser"] -->|"HTTP 8501"| A["Streamlit<br/>Chatbot (app.py)"]
end
subgraph "Backend"
A -->|"Load fine-tuned weights<br/>+ tokenizer"| M["LLM<br/>DialoGPT-fine-tuned"]
A -->|"Generate reply"| M
M -->|"Response"| A
end
subgraph "Model Hub"
MH["Hugging Face<br/>Model Repo"]
end
MH --> M
subgraph "Training"
DS["Dataset<br/>aviation_conversations.jsonl"]
FT["fine_tune.py<br/>(HF Trainer)"]
DS --> FT
FT -->|"Push to Hub"| MH
end
CI["GitHub Actions<br/>train-model.yml"] --> FT
CI2["GitHub Actions<br/>deploy-to-spaces.yml"] -->|"Docker Image"| HFSpace["HF Space<br/>Docker Runtime"]
HFSpace --> A
```
---