MLPayGrade Advanced Salary Predictor - Hugging Face Spaces Deployment

🚀 Quick Deploy to Hugging Face Spaces

This repository contains everything needed to deploy your MLPayGrade salary prediction model on Hugging Face Spaces.

πŸ“ Files Structure

Deployment Files/
├── app.py                    # Main Gradio app for Hugging Face
├── requirements.txt          # Python dependencies
├── best_model.pkl            # Trained LightGBM model
├── scaler.pkl                # Feature scaler
├── feature_names.json        # Feature names list
├── deployment_functions.pkl  # Feature engineering functions
├── shap_explainer.pkl        # SHAP explainer
└── shap_importance.json      # Feature importance rankings
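
Before uploading, it can help to confirm that every file in the listing is present locally. The following is a minimal sketch using only the standard library; the helper name `missing_files` is hypothetical and is not part of the project's own code.

```python
from pathlib import Path

# File names taken from the listing in this README.
REQUIRED_FILES = [
    "app.py",
    "requirements.txt",
    "best_model.pkl",
    "scaler.pkl",
    "feature_names.json",
    "deployment_functions.pkl",
    "shap_explainer.pkl",
    "shap_importance.json",
]


def missing_files(root):
    """Return the required file names that are absent from `root`."""
    root = Path(root)
    return [name for name in REQUIRED_FILES if not (root / name).is_file()]


if __name__ == "__main__":
    # "Deployment Files" is the folder name used in this README.
    missing = missing_files("Deployment Files")
    print("All files present" if not missing else f"Missing: {missing}")
```

An empty list means the folder is complete; any names returned are the files still to be added.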

🎯 Model Information

  • Algorithm: LightGBM Regressor
  • Features: 85 Clean Features (No Data Leakage)
  • Performance: R² = 0.2848, MAE = $44,323.68, RMSE = $64,868.74
  • Data: 2024 ML/AI Job Market Data
  • Validation: Metrics reported on held-out test data after data leakage was corrected

🚀 Deployment Steps

1. Create Hugging Face Account

2. Create New Space

  • Click the "New Space" button
  • Choose "Gradio" as the SDK
  • Set visibility (Public or Private)
  • Choose a license

3. Upload Files

  • Upload all files from the Deployment Files/ folder
  • Make sure app.py is in the root directory
  • Confirm that the model files (*.pkl, *.json) are included in the upload

4. Automatic Deployment

  • Hugging Face will automatically install dependencies from requirements.txt
  • The app will be available at: https://huggingface.co/spaces/YOUR_USERNAME/SPACE_NAME
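
The same steps can also be scripted rather than done through the web UI. The sketch below assumes the `huggingface_hub` package is installed and that you are already logged in (for example via `huggingface-cli login`); the function names `space_url` and `deploy_space` are hypothetical helpers, not part of this project.

```python
def space_url(username, space_name):
    """Build the public URL of a Space from its owner and name."""
    return f"https://huggingface.co/spaces/{username}/{space_name}"


def deploy_space(username, space_name, folder="Deployment Files", private=False):
    """Create a Gradio Space (if needed) and upload the contents of `folder`."""
    # Imported lazily so this module loads even where huggingface_hub is absent.
    from huggingface_hub import HfApi

    api = HfApi()  # picks up the locally saved access token
    repo_id = f"{username}/{space_name}"
    api.create_repo(
        repo_id=repo_id,
        repo_type="space",
        space_sdk="gradio",
        private=private,
        exist_ok=True,  # do not fail if the Space already exists
    )
    api.upload_folder(folder_path=folder, repo_id=repo_id, repo_type="space")
    return space_url(username, space_name)
```

Calling `deploy_space("YOUR_USERNAME", "SPACE_NAME")` would create the Space, push the folder, and return the URL where the app is served once the build finishes.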

🔧 Features

Job Configuration

  • Job Title: Data Scientist, ML Engineer, AI Engineer, Data Engineer, Data Analyst
  • Experience Level: Entry, Mid, Senior, Executive
  • Company Size: Small (<50), Medium (50-250), Large (>250)
  • Employment Type: Full-time, Part-time, Contract, Freelance
  • Location: US, CA, GB, AU, DE, FR, etc.
  • Remote Work: On-site, Hybrid, Remote

Model Outputs

  • Predicted Salary: Annual salary in USD
  • Detailed Explanation: Feature breakdown and model information
  • What-If Analysis: Interactive parameter exploration
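
A what-if analysis amounts to re-running the prediction while changing one input and holding the others fixed. The sketch below illustrates that idea only: `toy_predict` is a hypothetical stand-in with made-up numbers, not the deployed LightGBM model, and the input keys are assumptions.

```python
def what_if(predict, base_inputs, field, candidates):
    """Predict once per candidate value of `field`, holding other inputs fixed."""
    results = {}
    for value in candidates:
        scenario = {**base_inputs, field: value}  # copy, then override one field
        results[value] = predict(scenario)
    return results


def toy_predict(inputs):
    """Hypothetical stand-in for the real model; the numbers are made up."""
    base = {"Entry": 80_000, "Mid": 110_000, "Senior": 150_000, "Executive": 190_000}
    bonus = 10_000 if inputs["remote"] == "Remote" else 0
    return base[inputs["experience"]] + bonus


base_inputs = {"experience": "Mid", "remote": "Hybrid"}
by_experience = what_if(
    toy_predict, base_inputs, "experience", ["Entry", "Mid", "Senior", "Executive"]
)
```

Because each scenario is a fresh copy, `base_inputs` is left unchanged and can be reused to explore a different field.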

📊 Model Performance

| Metric | Value | Status |
|---|---|---|
| R² Score | 0.2848 | ✅ Honest |
| MAE | $44,323.68 | ✅ Realistic |
| RMSE | $64,868.74 | ✅ Appropriate |
| Data Leakage | None | ✅ Clean |
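
For readers who want to see how these three metrics are defined, here is a minimal sketch in plain Python. The salaries are made-up values used only to exercise the function, and `regression_metrics` is a hypothetical helper, not the project's evaluation code.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return (r2, mae, rmse) for two equal-length sequences of numbers."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n          # mean absolute error
    ss_res = sum(e * e for e in errors)            # residual sum of squares
    rmse = math.sqrt(ss_res / n)                   # root mean squared error
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot                     # share of variance explained
    return r2, mae, rmse


# Made-up salaries, only to exercise the function.
actual = [100_000, 150_000, 200_000]
predicted = [110_000, 140_000, 180_000]
r2, mae, rmse = regression_metrics(actual, predicted)
```

Read this way, an R² of 0.2848 means the model accounts for roughly 28% of the variance in salaries, while MAE and RMSE express the typical error in dollars.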

🎯 Key Advantages

  1. No Data Leakage: All features are legitimate and domain-driven
  2. Honest Performance: Realistic R² score reflects true predictive power
  3. Clean Architecture: Proper train-test separation
  4. Domain Knowledge: Features based on industry understanding
  5. Interactive UI: User-friendly Gradio interface

πŸ” Technical Details

Feature Engineering

  • Ordinal Encodings: Experience level, company size, employment type
  • Interaction Features: Experience × Size, Experience × Remote, Size × Remote
  • Geographic Features: Country-based location encoding
  • Complexity Features: Job title word count, location diversity
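
The feature types listed above can be illustrated with a short sketch. The integer codes, the numeric treatment of remote work, and the feature names below are all assumptions made for illustration; the actual mappings live in deployment_functions.pkl and may differ.

```python
# Assumed ordinal codes; the values used by the real pipeline may differ.
EXPERIENCE = {"Entry": 0, "Mid": 1, "Senior": 2, "Executive": 3}
COMPANY_SIZE = {"Small": 0, "Medium": 1, "Large": 2}
REMOTE = {"On-site": 0, "Hybrid": 1, "Remote": 2}


def engineer_features(job_title, experience, company_size, remote):
    """Build ordinal, interaction, and complexity features for one job."""
    exp = EXPERIENCE[experience]
    size = COMPANY_SIZE[company_size]
    rem = REMOTE[remote]
    return {
        # Ordinal encodings
        "experience_level": exp,
        "company_size": size,
        "remote": rem,
        # Interaction features: products of the ordinal codes
        "exp_x_size": exp * size,
        "exp_x_remote": exp * rem,
        "size_x_remote": size * rem,
        # Complexity feature: number of words in the job title
        "title_word_count": len(job_title.split()),
    }


features = engineer_features("Data Scientist", "Senior", "Large", "Hybrid")
```

Multiplying the codes lets the model treat, say, a senior role at a large company differently from what the two features would suggest separately.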

Model Architecture

  • Algorithm: LightGBM (Gradient Boosting)
  • Preprocessing: RobustScaler for feature scaling
  • Validation: Proper train-test split (no data leakage)
  • Explainability: SHAP analysis ready
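
To show why robust scaling suits salary-like data with outliers, the sketch below centres values on the median and divides by the interquartile range, which is the idea behind scikit-learn's `RobustScaler` with default settings. It is an illustration only: the deployed app would load the fitted scaler.pkl rather than recompute these statistics, and `robust_scale` is a hypothetical helper.

```python
import statistics


def robust_scale(values):
    """Centre on the median and divide by the interquartile range (IQR)."""
    median = statistics.median(values)
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    if iqr == 0:
        iqr = 1.0  # no spread between quartiles: avoid dividing by zero
    return [(v - median) / iqr for v in values]


# The outlier (100) does not distort how the other values are scaled.
scaled = robust_scale([1, 2, 3, 4, 100])
# scaled == [-1.0, -0.5, 0.0, 0.5, 48.5]
```

Because the median and IQR ignore extreme values, the four typical points stay in a small range while the outlier remains visibly large.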

🌐 Access Your Deployed App

Once deployed, your app will be available at:

https://huggingface.co/spaces/YOUR_USERNAME/MLPayGrade-Salary-Predictor

📈 Usage Examples

Example 1: Senior Data Scientist

  • Job Title: "Data Scientist"
  • Experience: Senior Level
  • Company Size: Large
  • Location: US
  • Remote: Hybrid
  • Predicted: ~$180,000

Example 2: Entry ML Engineer

  • Job Title: "ML Engineer"
  • Experience: Entry Level
  • Company Size: Medium
  • Location: CA
  • Remote: On-site
  • Predicted: ~$95,000

🎉 Benefits of Hugging Face Deployment

  1. Reliability: Hosted by Hugging Face, no local setup needed
  2. Scalability: Handles multiple users simultaneously
  3. Sharing: Easy to share with stakeholders
  4. Updates: Simple to update and redeploy
  5. Professional: Looks professional for presentations

🔧 Troubleshooting

Common Issues

  1. Model Loading Error: Ensure all .pkl files are uploaded
  2. Dependency Issues: Check requirements.txt compatibility
  3. Memory Limits: The free tier has a 16 GB RAM limit
  4. File Size: Ensure model files are under the Space's storage limits

Solutions

  1. Verify Files: Check all required files are present
  2. Update Dependencies: Use compatible package versions
  3. Optimize Model: Reduce model size if needed
  4. Check Logs: Use Hugging Face logs for debugging

📞 Support

For deployment issues:

  1. Check Hugging Face documentation
  2. Review error logs in your Space
  3. Verify all files are properly uploaded
  4. Ensure dependencies are compatible

MLPayGrade Advanced Track - Built with excellence and honesty! 🎯