---
title: Privacy Preserving Machine Learning
emoji: π
colorFrom: gray
colorTo: blue
sdk: gradio
sdk_version: 6.9.0
app_file: app.py
pinned: false
license: mit
short_description: Project demonstrates privacy-preserving techniques for ML
---
# Privacy-Preserving Machine Learning Demo
## Project Overview
This project demonstrates privacy-preserving techniques for machine learning on sensitive healthcare data. It implements:
- **SHA-256 Hashing** for direct identifiers (SSN)
- **Pseudonymization** for names
- **K-Anonymity Generalization** for DOB and income
- **Laplace Noise** (Differential Privacy) for numerical values
- **Differentially Private ML Training** using IBM's diffprivlib
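The first three techniques can be sketched with the standard library alone. This is a minimal illustration, not the code in `privacy_ml_solution.py`: the function names, the `SECRET_SALT` value, the `Patient_0001` label format, and the decade and 25,000-wide income bands are all assumptions chosen for the example. The salt is also an addition here; it is included because SSNs have only about a billion possible values, so an unsalted SHA-256 digest can be reversed by brute force.

```python
import hashlib
from collections import Counter
from datetime import date

# Hypothetical secret; in practice load it from an environment variable.
SECRET_SALT = b"replace-with-a-secret-value"


def hash_ssn(ssn: str) -> str:
    """Return a salted SHA-256 hex digest of an SSN, ignoring punctuation."""
    digits = "".join(ch for ch in ssn if ch.isdigit())
    return hashlib.sha256(SECRET_SALT + digits.encode("utf-8")).hexdigest()


class Pseudonymizer:
    """Map each distinct name to a stable, meaningless label."""

    def __init__(self, prefix: str = "Patient"):
        self._prefix = prefix
        self._mapping = {}

    def __call__(self, name: str) -> str:
        key = name.strip().lower()
        if key not in self._mapping:
            self._mapping[key] = f"{self._prefix}_{len(self._mapping) + 1:04d}"
        return self._mapping[key]


def generalize_dob(dob: date) -> str:
    """Coarsen a date of birth to its decade, e.g. 1987-05-14 becomes '1980s'."""
    return f"{dob.year // 10 * 10}s"


def generalize_income(income: float, width: int = 25_000) -> str:
    """Coarsen an income to a fixed-width band."""
    low = int(income // width) * width
    return f"{low}-{low + width - 1}"


def smallest_group_size(quasi_identifiers) -> int:
    """Return k: the size of the smallest group sharing one quasi-identifier tuple."""
    return min(Counter(quasi_identifiers).values())
```

Generalization alone does not guarantee k-anonymity. A helper such as `smallest_group_size` can confirm that every (DOB band, income band) combination is shared by at least k records; if it is not, the bands need to be widened or the outlying records suppressed.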
## Files Included
| File | Description |
|------|-------------|
| `app.py` | Gradio web interface (main entry point for HF Spaces) |
| `privacy_ml_solution.py` | Core ML pipeline with all privacy techniques |
| `requirements.txt` | Python dependencies |
| `Assignment2Dataset-1_encrypted.csv` | The encrypted/anonymized dataset |
| `model_comparison_results.csv` | Performance metrics comparing models |
| `Privacy_Preserving_ML_Report.docx` | Comprehensive academic report |
| `Technical_Documentation.docx` | Code and library documentation |
---
## Deploying to Hugging Face Spaces (Step-by-Step for Beginners)
### Step 1: Create a Hugging Face Account
1. Go to [huggingface.co](https://huggingface.co)
2. Click "Sign Up" in the top right
3. Fill in your details and verify your email
### Step 2: Create a New Space
1. Once logged in, click your profile picture → "New Space"
2. Fill in the form:
   - **Owner**: Select your username
   - **Space name**: e.g., `privacy-ml-demo`
   - **License**: MIT
   - **SDK**: Select **Gradio**
   - **Hardware**: Keep as "CPU Basic" (free)
3. Click "Create Space"
### Step 3: Upload Your Files
**Option A: Using the Web Interface (Easiest)**
1. In your new Space, click the "Files" tab
2. Click "Add file" → "Upload files"
3. Upload these files:
   - `app.py` (REQUIRED - this is the entry point)
   - `requirements.txt` (REQUIRED)
   - `Assignment2Dataset-1_encrypted.csv` (optional sample data)
4. Wait for the build to complete (~2-3 minutes)
**Option B: Using Git (For More Control)**
```bash
# Clone your space
git clone https://huggingface.co/spaces/YOUR_USERNAME/privacy-ml-demo
cd privacy-ml-demo
# Copy your files into the directory
cp /path/to/app.py .
cp /path/to/requirements.txt .
# Commit and push
git add .
git commit -m "Initial upload"
git push
```
### Step 4: Wait for Build
1. Click the "App" tab to see your space building
2. Watch the logs for any errors
3. Once complete, your app will be live!
### Step 5: Test Your App
1. Your app is now live at: `https://huggingface.co/spaces/YOUR_USERNAME/privacy-ml-demo`
2. Upload a CSV file and adjust the epsilon slider
3. Click "Run Privacy Analysis" to see results
---
## Local Testing (Before Deployment)
```bash
# Create virtual environment
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
# Install dependencies
pip install -r requirements.txt
# Run the app
python app.py
# Open http://localhost:7860 in your browser
```
---
## Troubleshooting
### "Build failed" Error
- Check that `requirements.txt` has correct package names
- View build logs for specific error messages
### "App crashed" Error
- Ensure `app.py` has `demo.launch()` at the bottom
- Check for syntax errors in your code
### Slow Loading
- Free tier spaces "sleep" after inactivity
- First load takes ~30 seconds to wake up
### Memory Issues
- Reduce `n_estimators` in RandomForest
- Use smaller test datasets
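One way to produce a smaller test dataset is to draw a random sample of rows from the full CSV. The sketch below uses reservoir sampling so that only `n_rows` rows are held in memory at a time; the function name and the file names in the usage note are hypothetical, and it assumes the CSV has a single header row.

```python
import csv
import random


def sample_csv_rows(src, dst, n_rows: int, seed: int = 0) -> int:
    """Copy the header plus a uniform random sample of n_rows data rows.

    src and dst are open text file objects. Returns the number of data rows written.
    """
    rng = random.Random(seed)
    reader = csv.reader(src)
    header = next(reader)

    reservoir = []
    for i, row in enumerate(reader):
        if i < n_rows:
            # Fill the reservoir with the first n_rows rows.
            reservoir.append(row)
        else:
            # Replace an existing entry with probability n_rows / (i + 1).
            j = rng.randint(0, i)
            if j < n_rows:
                reservoir[j] = row

    writer = csv.writer(dst)
    writer.writerow(header)
    writer.writerows(reservoir)
    return len(reservoir)
```

For example, opening `full.csv` for reading and `small.csv` for writing (both with `newline=""`) and calling `sample_csv_rows(src, dst, 500)` would write a 500-row sample suitable for upload to a free-tier Space.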
---
## Understanding the Results
| Metric | What it Means |
|--------|---------------|
| **Accuracy** | Percentage of predictions that are correct |
| **F1 Score** | Harmonic mean of precision and recall |
| **Epsilon (ε)** | Privacy budget; lower values mean stronger privacy and more noise |
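To make the first two metrics concrete, here is a small pure-Python sketch of how they are computed for binary labels. The function names and the toy labels are assumptions for illustration; the project itself may compute these with a library.

```python
def accuracy(y_true, y_pred) -> float:
    """Fraction of predictions that match the true labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def f1_score(y_true, y_pred, positive=1) -> float:
    """Harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        # No true positives: precision and recall are both zero (or undefined).
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy imbalanced example: 8 negatives, 2 positives, model always predicts 0.
y_true = [0] * 8 + [1] * 2
y_pred = [0] * 10
print(accuracy(y_true, y_pred))  # 0.8
print(f1_score(y_true, y_pred))  # 0.0
```

The toy example shows why both metrics are reported: on imbalanced data a model can reach high accuracy while never detecting the positive class, and F1 exposes that.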
### Privacy Level Guide
- ε = 0.1-0.5: Very high privacy, some accuracy loss
- ε = 1.0: Balanced (recommended)
- ε = 5.0+: Lower privacy, minimal accuracy impact
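The effect of ε on noise can be seen directly in the Laplace mechanism, where the noise scale is sensitivity / ε. The sketch below uses only the standard library and assumes a sensitivity of 1.0; the function names are hypothetical, and the project's own pipeline may add noise differently (for example through diffprivlib).

```python
import random


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Sample Laplace(0, scale) as the difference of two exponential draws."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def privatize(value: float, sensitivity: float, epsilon: float,
              rng: random.Random) -> float:
    """Add Laplace noise with scale = sensitivity / epsilon to a value."""
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    return value + laplace_noise(sensitivity / epsilon, rng)


# Compare the average absolute noise at the three guide levels.
rng = random.Random(42)
for epsilon in (0.1, 1.0, 5.0):
    errors = [abs(privatize(0.0, 1.0, epsilon, rng)) for _ in range(10_000)]
    print(f"epsilon={epsilon}: mean |noise| is about {sum(errors) / len(errors):.2f}")
```

The expected absolute noise equals the scale, so with sensitivity 1.0 it is about 10 at ε = 0.1, about 1 at ε = 1.0, and about 0.2 at ε = 5.0.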
---
## Learn More
- [Differential Privacy Explained](https://desfontain.es/privacy/differential-privacy-awesomeness.html)
- [IBM diffprivlib Documentation](https://diffprivlib.readthedocs.io/)
- [Gradio Documentation](https://gradio.app/docs/)
- [Hugging Face Spaces Guide](https://huggingface.co/docs/hub/spaces)
---
## License
MIT License - Feel free to use and modify for your projects.
Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference