---
title: Privacy Preserving Machine Learning
emoji: 🏆
colorFrom: gray
colorTo: blue
sdk: gradio
sdk_version: 6.9.0
app_file: app.py
pinned: false
license: mit
short_description: Project demonstrates privacy-preserving techniques for ML
---

# Privacy-Preserving Machine Learning Demo

## 🔒 Project Overview

This project demonstrates privacy-preserving techniques for machine learning on sensitive healthcare data. It implements:

- **SHA-256 Hashing** for direct identifiers (SSN)
- **Pseudonymization** for names
- **K-Anonymity Generalization** for DOB and income
- **Laplace Noise** (Differential Privacy) for numerical values
- **Differentially Private ML Training** using IBM's diffprivlib

## 📁 Files Included

| File | Description |
|------|-------------|
| `app.py` | Gradio web interface (main entry point for HF Spaces) |
| `privacy_ml_solution.py` | Core ML pipeline with all privacy techniques |
| `requirements.txt` | Python dependencies |
| `Assignment2Dataset-1_encrypted.csv` | The encrypted/anonymized dataset |
| `model_comparison_results.csv` | Performance metrics comparing models |
| `Privacy_Preserving_ML_Report.docx` | Comprehensive academic report |
| `Technical_Documentation.docx` | Code and library documentation |

---

## 🚀 Deploying to Hugging Face Spaces (Step-by-Step for Beginners)

### Step 1: Create a Hugging Face Account

1. Go to [huggingface.co](https://huggingface.co)
2. Click "Sign Up" in the top right
3. Fill in your details and verify your email

### Step 2: Create a New Space

1. Once logged in, click your profile picture → "New Space"
2. Fill in the form:
   - **Owner**: Select your username
   - **Space name**: e.g., `privacy-ml-demo`
   - **License**: MIT
   - **SDK**: Select **Gradio**
   - **Hardware**: Keep as "CPU Basic" (free)
3. Click "Create Space"

### Step 3: Upload Your Files

**Option A: Using the Web Interface (Easiest)**

1. In your new Space, click the "Files" tab
2. Click "Add file" → "Upload files"
3. Upload these files:
   - `app.py` (REQUIRED - this is the entry point)
   - `requirements.txt` (REQUIRED)
   - `Assignment2Dataset-1_encrypted.csv` (optional sample data)
4. Wait for the build to complete (~2-3 minutes)

**Option B: Using Git (For More Control)**

```bash
# Clone your space
git clone https://huggingface.co/spaces/YOUR_USERNAME/privacy-ml-demo
cd privacy-ml-demo

# Copy your files into the directory
cp /path/to/app.py .
cp /path/to/requirements.txt .

# Commit and push
git add .
git commit -m "Initial upload"
git push
```

### Step 4: Wait for Build

1. Click the "App" tab to see your space building
2. Watch the logs for any errors
3. Once complete, your app will be live!

### Step 5: Test Your App

1. Your app is now live at: `https://huggingface.co/spaces/YOUR_USERNAME/privacy-ml-demo`
2. Upload a CSV file and adjust the epsilon slider
3. Click "Run Privacy Analysis" to see results

---

## ⚙️ Local Testing (Before Deployment)

```bash
# Create virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

# Run the app
python app.py

# Open http://localhost:7860 in your browser
```

---

## 🔧 Troubleshooting

### "Build failed" Error

- Check that `requirements.txt` has correct package names
- View build logs for specific error messages

### "App crashed" Error

- Ensure `app.py` has `demo.launch()` at the bottom
- Check for syntax errors in your code

### Slow Loading

- Free tier spaces "sleep" after inactivity
- First load takes ~30 seconds to wake up

### Memory Issues

- Reduce `n_estimators` in RandomForest
- Use smaller test datasets

---

## 📊 Understanding the Results

| Metric | What it Means |
|--------|---------------|
| **Accuracy** | % of correct predictions |
| **F1 Score** | Balance of precision and recall |
| **Epsilon (ε)** | Privacy budget - lower = more privacy |

### Privacy Level Guide

- ε = 0.1-0.5: Very high privacy, some accuracy loss
- ε = 1.0: Balanced (recommended)
- ε = 5.0+: Lower privacy, minimal accuracy impact

---

## 📚 Learn More

- [Differential Privacy Explained](https://desfontain.es/privacy/differential-privacy-awesomeness.html)
- [IBM diffprivlib Documentation](https://diffprivlib.readthedocs.io/)
- [Gradio Documentation](https://gradio.app/docs/)
- [Hugging Face Spaces Guide](https://huggingface.co/docs/hub/spaces)

---

## 📝 License

MIT License - Feel free to use and modify for your projects.

Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
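
---

## 🧮 Example: How Epsilon Scales Laplace Noise

The privacy level guide says that a lower ε means more privacy and more accuracy loss. The sketch below illustrates why, using only the Python standard library. It is not the code from `privacy_ml_solution.py`: the function names, the sensitivity of 1.0, and the sample sizes are assumptions chosen for illustration. It draws Laplace noise with scale `sensitivity / epsilon`, so the typical size of the noise grows as ε shrinks.

```python
import random


def add_laplace_noise(value: float, sensitivity: float, epsilon: float,
                      rng: random.Random) -> float:
    """Return value plus Laplace(0, sensitivity / epsilon) noise.

    Hypothetical helper for illustration; not taken from the project code.
    """
    if epsilon <= 0 or sensitivity <= 0:
        raise ValueError("epsilon and sensitivity must be positive")
    scale = sensitivity / epsilon
    # The difference of two independent exponential draws with mean `scale`
    # follows a Laplace distribution centred at 0 with that scale.
    noise = rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
    return value + noise


def mean_abs_noise(epsilon: float, n: int = 2000, seed: int = 0) -> float:
    """Average absolute noise added to 0.0 over n draws (sensitivity = 1.0)."""
    rng = random.Random(seed)
    total = sum(abs(add_laplace_noise(0.0, 1.0, epsilon, rng)) for _ in range(n))
    return total / n


if __name__ == "__main__":
    # Compare the epsilon values listed in the privacy level guide.
    for eps in (0.1, 1.0, 5.0):
        print(f"epsilon={eps}: mean |noise| is about {mean_abs_noise(eps):.2f}")
```

The expected absolute noise equals the scale, so with a sensitivity of 1.0 it is roughly `1 / epsilon`. In a real pipeline the sensitivity depends on the query and on how each column is bounded.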
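
---

## 🔑 Example: Hashing and Pseudonymizing Identifiers

The project overview lists SHA-256 hashing for SSNs and pseudonymization for names. A minimal sketch of both ideas follows. The names `hash_identifier` and `Pseudonymizer`, the optional salt, and the `Person_0001` label format are hypothetical and are not taken from the project code.

```python
import hashlib


def hash_identifier(value: str, salt: bytes = b"") -> str:
    """Return the SHA-256 hex digest of an identifier such as an SSN.

    Dashes and surrounding whitespace are dropped first so that
    "123-45-6789" and "123456789" produce the same digest.
    """
    normalized = value.strip().replace("-", "")
    return hashlib.sha256(salt + normalized.encode("utf-8")).hexdigest()


class Pseudonymizer:
    """Map each distinct name to a stable label (hypothetical format)."""

    def __init__(self, prefix: str = "Person") -> None:
        self._prefix = prefix
        self._mapping: dict[str, str] = {}

    def __call__(self, name: str) -> str:
        key = name.strip().lower()
        if key not in self._mapping:
            # Assign the next sequential label the first time a name is seen.
            self._mapping[key] = f"{self._prefix}_{len(self._mapping) + 1:04d}"
        return self._mapping[key]


if __name__ == "__main__":
    pseudonymize = Pseudonymizer()
    print(hash_identifier("123-45-6789", salt=b"example-salt"))
    print(pseudonymize("Alice Smith"), pseudonymize("Bob Jones"))
```

Because SSNs come from a small input space, an unsalted SHA-256 digest can be reversed by trying every candidate value. Keeping the salt secret is one common mitigation, which is why the sketch accepts one.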