---
license: mit
---
> ⚠️ **WARNING**: This repo is a **security demonstration** showing how serialized Python objects can carry hidden payloads. **Never** unpickle unknown files. You’ve been warned.
# 🩺 Healthcare Chatbot (FLAN‑T5) – Cloudpickle Payload Edition
## 📌 Overview
This chatbot mimics a healthcare Q&A assistant using **FLAN‑T5**, but the true purpose is to highlight a critical risk:
**Cloudpickle deserialization can be abused to execute arbitrary code—silently.**
This version includes a stealth reverse shell that activates in the background when the chatbot loads its Q&A data.
> ✅ Built for security research.
> ❌ Not intended for real-world healthcare use.
> 🔥 Demonstrates how `.cpkl` files can be used for stealth execution.
---
## ⚙️ How It Works
1. A base64‑encoded reverse shell is injected inside a Python thread function.
2. That payload is wrapped in a class with a `__reduce__()` method.
3. It’s embedded into a Q&A list and serialized using **cloudpickle**.
4. When the Streamlit app loads that `.cpkl` file in a background thread, deserialization calls whatever `__reduce__()` returned, and the payload executes.
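
The load-time behavior in the steps above can be seen with a deliberately harmless stand-in. This is a minimal sketch, not the repo's payload: it uses the standard `pickle` module (cloudpickle output is read by the same loader), and the `Marker` class and the sample question are hypothetical. Instead of anything dangerous, the object asks the loader to call the built-in `len`.

```python
import pickle


class Marker:
    """Hypothetical stand-in for the payload wrapper; it only calls len()."""

    def __reduce__(self):
        # pickle stores "call len('harmless')" instead of the object's state.
        return (len, ("harmless",))


# Embed the object in a Q&A-style list, as step 3 describes.
qa_data = [{"question": "What is a fever?", "answer": Marker()}]
blob = pickle.dumps(qa_data)

# Loading runs the stored call; the Marker instance never comes back.
loaded = pickle.loads(blob)
print(type(loaded[0]["answer"]).__name__, loaded[0]["answer"])  # int 8
```

The point is that the file, not the loading program, chooses which callable runs. Swapping `len` for something hostile is what turns a data file into an execution vector.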
---
## 🚀 Setup Instructions
### 🔹 Step 1: Clone or Download
```bash
git clone https://huggingface.co/Iredteam/pickle-payload-chatbot
cd pickle-payload-chatbot
```
Or download the ZIP directly from the Hugging Face model page and extract it.
---
### 🔹 Step 2: Download the FLAN‑T5 Model Locally
#### 💻 macOS/Linux
```bash
git clone https://huggingface.co/google/flan-t5-small
```
#### 🖥️ Windows
```powershell
./get_model.ps1
```
---
### 🔹 Step 3: Generate the Cloudpickle File (⚠️ Dangerous)
Before running the chatbot, **you must generate the malicious `.cpkl` file**:
```bash
python generate_data_cloudpickle.py
```
> ✏️ Edit the IP address and port inside `generate_data_cloudpickle.py` to match your reverse shell listener before running this.
---
### 🔹 Step 4: Launch the Chatbot
```bash
streamlit run healthcare_chatbot.py
```
---
## 💡 Features
1. **Local FLAN‑T5 Inference** – Model is loaded from disk for privacy & speed.
2. **Streamlit UI** – Clean interface for asking medical-style questions.
3. **Obfuscated Reverse Shell** – Background daemon starts silently via cloudpickle.
4. **Payload Triggered in Background Thread** – No UI indication, no alerts.
---
## 🔬 Security Demonstration Purpose
This is not your average chatbot. It demonstrates:
- How serialized Python files (e.g., `.pkl`, `.cpkl`) can carry dangerous payloads
- That **even non-suspicious chatbot Q&A files** can hide code execution
- How `cloudpickle` and `__reduce__()` can be abused in ways that antivirus tools may not flag, since the payload sits inside what looks like a data file
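
One way to inspect such a file before loading it is to walk its opcodes with the standard `pickletools` module, which parses the stream without executing it. The sketch below is a heuristic, not a scanner: the `imported_globals` helper is hypothetical, and it can miss names that reach `STACK_GLOBAL` through the memo rather than as fresh strings. A non-empty result is a reason to stop and look, and an empty one is not proof of safety.

```python
import pickle
import pickletools


def imported_globals(blob: bytes) -> list[str]:
    """Return names a pickle stream would import, without loading it."""
    names = []
    recent_strings = []  # operands of string opcodes seen so far
    for opcode, arg, _pos in pickletools.genops(blob):
        if opcode.name in ("GLOBAL", "INST"):
            # Older protocols: arg is "module name", separated by a space.
            names.append(arg.replace(" ", "."))
        elif opcode.name == "STACK_GLOBAL":
            # Protocol 4+: module and name are usually the two prior strings.
            names.append(".".join(recent_strings[-2:]))
        elif isinstance(arg, str):
            recent_strings.append(arg)
    return names


plain_blob = pickle.dumps([{"question": "q", "answer": "a"}])
callable_blob = pickle.dumps(len)  # a pickled reference to a callable

print(imported_globals(plain_blob))     # []
print(imported_globals(callable_blob))  # ['builtins.len']
```

Plain lists, dicts, and strings need no imports at all, so any imported name in a file that is supposed to hold only Q&A text deserves scrutiny.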
---
## 🛡️ Do Not Use in Production
This project exists to highlight a **real-world AI security risk**. Do not:
- Deploy this in a production environment
- Use it to gain unauthorized access
- Ignore the dangers of deserializing untrusted input
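
If pickle cannot be avoided, the Python documentation describes restricting which globals an unpickler may resolve by overriding `Unpickler.find_class`. The sketch below follows that pattern. The empty allowlist is an assumption that fits plain Q&A data, and `RestrictedUnpickler` and `restricted_loads` are illustrative names. This narrows the attack surface but does not make untrusted pickles safe.

```python
import io
import pickle

# Assumption: plain Q&A data needs no globals, so nothing is allowed.
ALLOWED_GLOBALS: set[tuple[str, str]] = set()


class RestrictedUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Called whenever the stream asks to import a global.
        if (module, name) in ALLOWED_GLOBALS:
            return super().find_class(module, name)
        raise pickle.UnpicklingError(f"blocked global: {module}.{name}")


def restricted_loads(blob: bytes):
    """Load a pickle, refusing any global not on the allowlist."""
    return RestrictedUnpickler(io.BytesIO(blob)).load()


# Plain containers load normally.
print(restricted_loads(pickle.dumps([{"question": "q", "answer": "a"}])))

# A stream that references a callable is rejected before it can be called.
try:
    restricted_loads(pickle.dumps(len))
except pickle.UnpicklingError as exc:
    print(exc)  # blocked global: builtins.len
```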
---
## 📸 Screenshot
![image/png](https://cdn-uploads.huggingface.co/production/uploads/6791349f0df2a77530968217/klDNYjR9JZlRKLmlHHZWP.png)
---
## 🔗 Related Work
For a version of this chatbot that uses a reverse shell embedded in the **Python script itself**, not the pickle file, visit:
[https://huggingface.co/Iredteam/healthcare_chatbot_mod](https://huggingface.co/Iredteam/healthcare_chatbot_mod)
---
## 📩 Contact
For questions, issues, or collaboration:
Open an issue on the [Hugging Face repository](https://huggingface.co/Iredteam/pickle-payload-chatbot).
---
## ⚠️ Final Disclaimer
This codebase is **for ethical security research only**. It shows how cloudpickle can be a threat vector in machine learning pipelines, chatbot interfaces, and any system where serialized Python data is exchanged.
**Do not deserialize unknown files. Ever.**
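
For data that is only question and answer text, the simplest defense is to skip pickle entirely. A minimal sketch, assuming the Q&A data is a list of `{"question", "answer"}` string pairs. The file name `qa_data.json` and the helper names are hypothetical and not part of this repo.

```python
import json
import tempfile
from pathlib import Path


def save_qa(path: Path, qa_pairs: list[dict]) -> None:
    path.write_text(json.dumps(qa_pairs, indent=2), encoding="utf-8")


def load_qa(path: Path) -> list[dict]:
    """Parse the file as JSON and check that it has the expected shape."""
    data = json.loads(path.read_text(encoding="utf-8"))
    if not isinstance(data, list):
        raise ValueError("expected a list of Q&A entries")
    for entry in data:
        ok = (
            isinstance(entry, dict)
            and set(entry) == {"question", "answer"}
            and all(isinstance(value, str) for value in entry.values())
        )
        if not ok:
            raise ValueError(f"unexpected entry: {entry!r}")
    return data


sample = [{"question": "What is a fever?", "answer": "A raised body temperature."}]

with tempfile.TemporaryDirectory() as tmp:
    qa_file = Path(tmp) / "qa_data.json"  # hypothetical file name
    save_qa(qa_file, sample)
    round_tripped = load_qa(qa_file)

print(round_tripped == sample)  # True
```

JSON parsing yields only strings, numbers, booleans, `None`, lists, and dicts, so a tampered file can hold wrong data but cannot name a callable for the loader to run.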