---
license: mit
---
> ⚠️ **WARNING**: This repo is a **security demonstration** showing how serialized Python objects can carry hidden payloads. **Never** unpickle unknown files. You’ve been warned.
# 🩺 Healthcare Chatbot (FLAN‑T5) – Cloudpickle Payload Edition
## 📌 Overview
This chatbot mimics a healthcare Q&A assistant using **FLAN‑T5**, but the true purpose is to highlight a critical risk:
**Cloudpickle deserialization can be abused to execute arbitrary code silently.**
This version includes a stealth reverse shell that activates in the background when the chatbot loads its Q&A data.
> ✅ Built for security research.
> ❌ Not intended for real-world healthcare use.
> 🔥 Demonstrates how `.cpkl` files can be used for stealth execution.
---
## ⚙️ How It Works
1. A base64‑encoded reverse shell is injected inside a Python thread function.
2. That payload is wrapped in a class with a `__reduce__()` method.
3. It’s embedded into a Q&A list and serialized using **cloudpickle**.
4. When the Streamlit app loads that `.cpkl` file in a background thread, the payload executes.
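The mechanism behind steps 2 to 4 can be sketched with a deliberately harmless stand-in. The example below uses the standard `pickle` module (cloudpickle output is read back with the same loader) and the built-in `sorted` in place of any payload. `Marker` and `qa_list` are hypothetical names chosen for illustration; they are not taken from this repo's scripts.

```python
import pickle


class Marker:
    """Hypothetical stand-in for the payload wrapper; does nothing harmful."""

    def __reduce__(self):
        # Tell pickle to rebuild this object by calling sorted([3, 1, 2]).
        # Whatever callable is named here runs inside pickle.loads().
        return (sorted, ([3, 1, 2],))


# A Q&A list with the marker hidden among ordinary entries.
qa_list = [
    {"question": "What is a fever?", "answer": "A raised body temperature."},
    Marker(),
]

blob = pickle.dumps(qa_list)
restored = pickle.loads(blob)

# The second entry is the return value of the call, not a Marker instance.
print(restored[1])  # prints [1, 2, 3]
```

The point of the sketch is that the loader, not the application, makes the call: nothing in the loading code has to reference `sorted` for it to run.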
---
## 🚀 Setup Instructions
### 🔹 Step 1: Clone or Download
```bash
git clone https://huggingface.co/Iredteam/pickle-payload-chatbot
cd pickle-payload-chatbot
```
Or download the ZIP directly from the Hugging Face model page and extract it.
---
### 🔹 Step 2: Download the FLAN‑T5 Model Locally
#### 💻 macOS/Linux
```bash
git lfs install
git clone https://huggingface.co/google/flan-t5-small
```
#### 🖥️ Windows
```powershell
./get_model.ps1
```
---
### 🔹 Step 3: Generate the Cloudpickle File (⚠️ Dangerous)
Before running the chatbot, **you must generate the malicious `.cpkl` file**:
```bash
python generate_data_cloudpickle.py
```
> ✏️ Edit the IP address and port inside `generate_data_cloudpickle.py` to match your reverse shell listener before running this.
---
### 🔹 Step 4: Launch the Chatbot
```bash
streamlit run healthcare_chatbot.py
```
---
## 💡 Features
1. **Local FLAN‑T5 Inference** – Model is loaded from disk for privacy & speed.
2. **Streamlit UI** – Clean interface for asking medical-style questions.
3. **Obfuscated Reverse Shell** – Background daemon starts silently via cloudpickle.
4. **Payload Triggered in Background Thread** – No UI indication, no alerts.
---
## 🔬 Security Demonstration Purpose
This is not your average chatbot. It demonstrates:
- How serialized Python files (e.g., `.pkl`, `.cpkl`) can carry dangerous payloads
- How **even innocuous-looking chatbot Q&A files** can hide code execution
- How `cloudpickle` and `__reduce__()` can be abused, often without triggering antivirus alerts
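One mitigation documented for the standard library is to subclass `pickle.Unpickler` and override `find_class`. The sketch below assumes the Q&A data should contain only plain lists, dicts, and strings, so it rejects every global lookup. `DataOnlyUnpickler` and `load_data_only` are hypothetical names, and this approach narrows the attack surface rather than making untrusted pickles safe.

```python
import io
import pickle


class DataOnlyUnpickler(pickle.Unpickler):
    """Refuses every global lookup, so only plain containers and scalars load."""

    def find_class(self, module, name):
        raise pickle.UnpicklingError(f"blocked global: {module}.{name}")


def load_data_only(data: bytes):
    """Unpickle bytes that should contain only lists, dicts, strings, numbers."""
    return DataOnlyUnpickler(io.BytesIO(data)).load()


# Plain data loads normally.
plain = pickle.dumps([{"question": "q", "answer": "a"}])
print(load_data_only(plain))

# A stream that references a callable (here builtins.sorted) is rejected.
try:
    load_data_only(pickle.dumps(sorted))
    blocked = False
except pickle.UnpicklingError:
    blocked = True
print(blocked)
```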
---
## 🛡️ Do Not Use in Production
This project exists to highlight a **real-world AI security risk**. Do not:
- Deploy this in a production environment
- Use it to gain unauthorized access
- Ignore the dangers of deserializing untrusted input
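A serialized file can also be inspected without loading it. The sketch below uses `pickletools.genops` from the standard library to list opcodes that import a name or call an object during loading. The `SUSPICIOUS` set and the `suspicious_opcodes` helper are assumptions made for this example, and an empty result is a triage signal, not proof that a file is safe.

```python
import pickle
import pickletools

# Opcodes that import a name or call/construct an object during loading.
SUSPICIOUS = {"GLOBAL", "STACK_GLOBAL", "REDUCE", "INST", "OBJ", "NEWOBJ", "NEWOBJ_EX"}


def suspicious_opcodes(data: bytes) -> set:
    """Return flagged opcode names found in a pickle stream, without loading it."""
    return {op.name for op, _arg, _pos in pickletools.genops(data)} & SUSPICIOUS


# Plain Q&A data contains none of the flagged opcodes.
plain_hits = suspicious_opcodes(pickle.dumps([{"question": "q", "answer": "a"}]))

# A stream that references a callable is flagged.
callable_hits = suspicious_opcodes(pickle.dumps(sorted))
```

Legitimate pickles of class instances also use these opcodes, so this check fits best where the file is expected to hold plain data only.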
---
## 📸 Screenshot

---
## 🔗 Related Work
For a version of this chatbot that uses a reverse shell embedded in the **Python script itself**, not the pickle file, visit:
[https://huggingface.co/Iredteam/healthcare_chatbot_mod](https://huggingface.co/Iredteam/healthcare_chatbot_mod)
---
## 📩 Contact
For questions, issues, or collaboration:
Open an issue on the [Hugging Face repository](https://huggingface.co/Iredteam/pickle-payload-chatbot).
---
## ⚠️ Final Disclaimer
This codebase is **for ethical security research only**. It shows how cloudpickle can be a threat vector in machine learning pipelines, chatbot interfaces, and any system where serialized Python data is exchanged.
**Do not deserialize unknown files. Ever.**
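When the data really is just questions and answers, a data-only format sidesteps the problem. A minimal sketch, assuming the Q&A pairs are plain strings (the `qa_pairs` structure here is hypothetical, not the repo's actual schema):

```python
import json

# Hypothetical Q&A structure for illustration.
qa_pairs = [
    {"question": "What is a fever?", "answer": "A raised body temperature."},
]

text = json.dumps(qa_pairs, indent=2)

# json.loads builds only dicts, lists, strings, numbers, booleans, and None.
loaded = json.loads(text)
```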