Spaces: Dreipfelt/getaround-api

Dreipfelt committed 19 days ago

Commit 372c427 (verified) · 1 parent: 52631bd

Upload 7 files


Files changed (8)

.gitattributes +1 -0
Dockerfile +10 -0
README.md +183 -0
app.py +177 -0
feature_names.json +1 -0
model_metrics.json +1 -0
pipeline.pkl +3 -0
requirements.txt +12 -0

.gitattributes ADDED

	@@ -0,0 +1 @@


+pipeline.pkl filter=lfs diff=lfs merge=lfs -text

Dockerfile ADDED

	@@ -0,0 +1,10 @@

+FROM python:3.10-slim
+WORKDIR /app
+COPY requirements.txt .
+RUN pip install --no-cache-dir -r requirements.txt
+COPY . .
+CMD ["uvicorn", "app:app", "--host", "0.0.0.0", "--port", "7860"]

README.md ADDED

	@@ -0,0 +1,183 @@

+# 🚗 GetAround — Delay Analysis & Pricing Prediction
+> Certification CDSD — Data Science & Deployment Project — Jedha Bootcamp
+---
+## 📌 Project Overview
+GetAround is a peer-to-peer car rental platform. Late vehicle returns create friction
+for subsequent rentals, leading to customer dissatisfaction and cancellations.
+This project addresses two strategic challenges:
+- **Operational optimization** — Analyzing late checkouts and simulating minimum delay
+  thresholds to reduce conflicts between consecutive rentals.
+- **Pricing optimization** — Serving a Machine Learning model via a production API to
+  help owners set optimal daily rental prices.
+---
+## 🔗 Production Links
+| Service | URL |
+|---------|-----|
+| 📊 Dashboard | https://huggingface.co/spaces/Dreipfelt/getaround-dashboard |
+| 🔌 API | https://Dreipfelt-getaround-api.hf.space |
+| 📄 API Docs | https://Dreipfelt-getaround-api.hf.space/docs |
+| ⚙️ Swagger UI | https://Dreipfelt-getaround-api.hf.space/swagger |
+| 💻 GitHub | https://github.com/Data-Science-Designer-and-Developer/Project_GetAround |
+---
+## 🎯 Business Objectives
+### Delay Management
+- Measure how often drivers return cars late
+- Quantify the impact on subsequent rentals
+- Simulate different minimum delay thresholds (0 to 720 minutes)
+- Help Product Management choose:
+  - an optimal delay **threshold**
+  - an appropriate **scope** (all cars vs Connect only)
+### Pricing Optimization
+- Train an ML model on car characteristics
+- Serve predictions via a REST API
+- Allow real-time price prediction through a `/predict` endpoint
+---
+## 📊 Dashboard
+The interactive dashboard allows Product Managers to:
+- Visualize the distribution of late checkouts
+- Compare Connect vs Mobile check-in types
+- Simulate the trade-off between blocked rentals and resolved issues
+- Filter by scope and threshold in real time
+- Get a live price prediction from the API
+🔗 https://huggingface.co/spaces/Dreipfelt/getaround-dashboard
+---
+## 🤖 Machine Learning API
+### Model
+| Property | Value |
+|----------|-------|
+| Algorithm | XGBoost Regressor (sklearn Pipeline) |
+| Target | rental_price_per_day (€) |
+| R² | ~0.68 |
+| RMSE | XX € ← replace with the value from the notebook |
+| Features | 28 (mileage, engine_power, fuel, color, car_type, options…) |
+> **Baseline context:** a naive model predicting the dataset mean achieves R² = 0.
+> Our model's R² of 0.68 represents a substantial improvement over this baseline,
+> explaining 68% of price variance from car characteristics alone.
+### Endpoint `/predict`
+- **Method**: POST
+- **Input**: JSON with key `input` — list of lists (one per car)
+- **Validation**: each row must contain exactly the number of features defined in
+  `feature_names.json`; the API returns a `422` error with a descriptive message
+  if the input is malformed.
+```bash
+curl -X POST "https://Dreipfelt-getaround-api.hf.space/predict" \
+     -H "Content-Type: application/json" \
+     -d '{"input": [[150000, 120, 1, 1, 1, 0, 1, 1, 0]]}'
+```
+**Response:**
+```json
+{"prediction": [104.75]}
+```
+📄 Full documentation: https://Dreipfelt-getaround-api.hf.space/docs
+⚙️ Swagger UI: https://Dreipfelt-getaround-api.hf.space/swagger
+---
+## 🗂️ Repository Structure
+```
+Project_GetAround/
+├── api/                        # FastAPI application
+│   ├── app.py                  # API endpoints
+│   ├── Dockerfile              # Docker configuration
+│   └── feature_names.json      # Model feature names
+│
+├── dashboard/                  # Streamlit dashboard
+│   ├── app.py                  # Dashboard application
+│   └── requirements.txt
+│
+├── notebooks/                  # Jupyter notebooks
+│   ├── 01_EDA_delays.ipynb     # Delay analysis
+│   └── 02_ML_pricing.ipynb     # ML model training
+│
+├── .gitignore
+└── README.md
+```
+---
+## 🛠️ Tech Stack
+| Category | Tools |
+|----------|-------|
+| Language | Python 3.10 |
+| Dashboard | Streamlit, Plotly |
+| API | FastAPI, Uvicorn |
+| ML | Scikit-learn, XGBoost Regressor |
+| Deployment | Hugging Face Spaces, Docker |
+| Version Control | Git, GitHub |
+---
+## 🔒 Data & Privacy (RGPD / GDPR)
+The datasets used in this project (`get_around_delay_analysis.xlsx` and the pricing
+dataset) contain **no personal data**: rental IDs are anonymous identifiers, and no
+name, email, phone number, or precise location is present.
+The API processes only technical car characteristics (mileage, engine power, equipment
+options) submitted by the user. This data is used for real-time inference only and is
+**not stored or logged** after the response is returned.
+The service is hosted on **Hugging Face Spaces** (EU infrastructure), consistent with
+GDPR requirements. No third-party analytics or tracking is used.
+---
+## ⚙️ Local Setup
+```bash
+# Clone the repo
+git clone https://github.com/Data-Science-Designer-and-Developer/Project_GetAround.git
+cd Project_GetAround
+# Install dependencies
+pip install -r dashboard/requirements.txt
+# Run the dashboard
+streamlit run dashboard/app.py
+# Run the API
+cd api
+uvicorn app:app --reload
+# API available at http://localhost:8000
+# Swagger UI at http://localhost:8000/swagger
+# Custom docs at http://localhost:8000/docs
+```
+---
+## 👤 Author
+**Frédéric**
+CDSD Candidate — Data Scientist
+Jedha Bootcamp
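The baseline note in the README (a mean predictor scores R² = 0) can be checked numerically. The following is a minimal sketch, assuming the usual definition R² = 1 - SS_res / SS_tot. The prices are made-up illustration values, not rows from the project's dataset, and `r2_score` here is a hand-written helper rather than the scikit-learn function of the same name.

```python
def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


# Hypothetical daily prices in euros (illustration only)
prices = [80.0, 100.0, 120.0, 140.0]
mean_price = sum(prices) / len(prices)

# A naive model that always predicts the dataset mean
baseline_r2 = r2_score(prices, [mean_price] * len(prices))

# A model that predicts every price exactly
perfect_r2 = r2_score(prices, prices)

print(baseline_r2, perfect_r2)
```

For the mean predictor the residual sum of squares equals the total sum of squares, so the score is 0 by construction; any positive R² measures the share of variance explained beyond that baseline.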

app.py ADDED

	@@ -0,0 +1,177 @@

+import os
+import json
+import joblib
+import pandas as pd
+import numpy as np
+from fastapi import FastAPI
+from fastapi.responses import HTMLResponse
+from pydantic import BaseModel, ConfigDict
+# ── Load model and feature names ────────────────────────────────────────
+BASE_DIR = os.path.dirname(os.path.abspath(__file__))
+PIPELINE_PATH = os.path.join(BASE_DIR, "pipeline.pkl")
+FEATURES_PATH = os.path.join(BASE_DIR, "feature_names.json")
+METRICS_PATH = os.path.join(BASE_DIR, "model_metrics.json")
+pipeline = joblib.load(PIPELINE_PATH)
+with open(FEATURES_PATH, "r", encoding="utf-8") as f:
+    feature_names = json.load(f)
+# ── Initialize app ──────────────────────────────────────────────────────
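+app = FastAPI(
+    title="GetAround Pricing API",
+    description="Predicts the optimal rental price per day for a car",
+    version="1.0.0",
+    # Serve the built-in Swagger UI at /swagger so it does not shadow the custom /docs page below
+    docs_url="/swagger",
+)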
+# ── Input schema ────────────────────────────────────────────────────────
+class PredictInput(BaseModel):
+    input: list[list]
+    model_config = ConfigDict(
+        json_schema_extra={
+            "example": {
+                "input": [
+                    [150000, 120, 1, 1, 1, 0, 1, 1, 0]
+                ]
+            }
+        }
+    )
+# ── Root route ──────────────────────────────────────────────────────────
+@app.get("/", response_class=HTMLResponse)
+def root():
+    return """
+    <html>
+        <body style="font-family: Arial; text-align: center; padding: 50px;">
+            <h1>🚗 GetAround Pricing API</h1>
+            <p>API is running!</p>
+            <a href="/docs">📄 Go to Documentation</a>
+        </body>
+    </html>
+    """
+# ── /predict route ──────────────────────────────────────────────────────
+@app.post("/predict")
+def predict(data: PredictInput):
+    # Convert input to DataFrame with correct column names
+    X = pd.DataFrame(data.input, columns=feature_names)
+    # Make predictions
+    predictions = pipeline.predict(X)
+    # Round to 2 decimals and return as list
+    return {"prediction": [round(float(p), 2) for p in predictions]}
+# ── /docs route ─────────────────────────────────────────────────────────
+@app.get("/docs", response_class=HTMLResponse)
+def documentation():
+    return """
+    <!DOCTYPE html>
+    <html lang="en">
+    <head>
+        <meta charset="UTF-8">
+        <meta name="viewport" content="width=device-width, initial-scale=1.0">
+        <title>GetAround API Documentation</title>
+        <style>
+            * { margin: 0; padding: 0; box-sizing: border-box; }
+            body { font-family: 'Segoe UI', Arial, sans-serif; background: #f5f7fa; color: #333; }
+            header { background: #1a1a2e; color: white; padding: 40px; text-align: center; }
+            header h1 { font-size: 2.5em; margin-bottom: 10px; }
+            header p { color: #aaa; font-size: 1.1em; }
+            .container { max-width: 900px; margin: 40px auto; padding: 0 20px; }
+            .endpoint { background: white; border-radius: 12px; padding: 30px; margin-bottom: 30px; box-shadow: 0 2px 10px rgba(0,0,0,0.08); }
+            .endpoint h2 { font-size: 1.4em; margin-bottom: 15px; display: flex; align-items: center; gap: 12px; }
+            .badge { padding: 5px 14px; border-radius: 20px; font-size: 0.85em; font-weight: bold; }
+            .post { background: #d4edda; color: #155724; }
+            .get  { background: #cce5ff; color: #004085; }
+            .url  { background: #1a1a2e; color: #00d4aa; padding: 12px 18px; border-radius: 8px; font-family: monospace; margin: 15px 0; }
+            .section-title { font-weight: bold; margin: 20px 0 8px; color: #555; text-transform: uppercase; font-size: 0.85em; letter-spacing: 1px; }
+            pre { background: #f8f9fa; border: 1px solid #e9ecef; border-radius: 8px; padding: 15px; font-family: monospace; font-size: 0.9em; overflow-x: auto; }
+            .param-table { width: 100%; border-collapse: collapse; margin-top: 10px; }
+            .param-table th { background: #f1f3f5; padding: 10px; text-align: left; font-size: 0.85em; color: #555; }
+            .param-table td { padding: 10px; border-bottom: 1px solid #f1f3f5; font-size: 0.9em; }
+            .tag { background: #e9ecef; padding: 2px 8px; border-radius: 4px; font-family: monospace; font-size: 0.85em; }
+            footer { text-align: center; padding: 30px; color: #aaa; font-size: 0.9em; }
+        </style>
+    </head>
+    <body>
+    <header>
+        <h1>🚗 GetAround Pricing API</h1>
+        <p>Predict the optimal rental price per day for any car</p>
+    </header>
+    <div class="container">
+        <div class="endpoint">
+            <h2><span class="badge post">POST</span>/predict</h2>
+            <p>Returns a predicted rental price per day based on the car's characteristics.</p>
+            <div class="url">/predict</div>
+            <div class="section-title">Input</div>
+            <p>JSON body with key <span class="tag">input</span> — a list of lists (one per car).</p>
+            <table class="param-table">
+                <tr><th>#</th><th>Feature</th><th>Type</th><th>Example</th></tr>
+                <tr><td>1</td><td>mileage</td><td>float</td><td>150000.0</td></tr>
+                <tr><td>2</td><td>engine_power</td><td>float</td><td>120.0</td></tr>
+                <tr><td>3</td><td>private_parking_available</td><td>bool (0/1)</td><td>1.0</td></tr>
+                <tr><td>4</td><td>has_gps</td><td>bool (0/1)</td><td>1.0</td></tr>
+                <tr><td>5</td><td>has_air_conditioning</td><td>bool (0/1)</td><td>1.0</td></tr>
+                <tr><td>6</td><td>automatic_car</td><td>bool (0/1)</td><td>0.0</td></tr>
+                <tr><td>7</td><td>has_getaround_connect</td><td>bool (0/1)</td><td>1.0</td></tr>
+                <tr><td>8</td><td>has_speed_regulator</td><td>bool (0/1)</td><td>1.0</td></tr>
+                <tr><td>9</td><td>winter_tires</td><td>bool (0/1)</td><td>0.0</td></tr>
+            </table>
+            <div class="section-title">Request Example</div>
+            <pre>curl -X POST "https://your-url/predict" \
+     -H "Content-Type: application/json" \
     -d '{"input": [[150000, 120, 1, 1, 1, 0, 1, 1, 0]]}'</pre>
+            <div class="section-title">Response Example</div>
+            <pre>{"prediction": [89.5]}</pre>
+        </div>
+        <div class="endpoint">
+            <h2><span class="badge get">GET</span>/</h2>
+            <p>Health check — confirms the API is running.</p>
+            <div class="url">/</div>
+        </div>
+        <div class="endpoint">
+            <h2><span class="badge get">GET</span>/docs</h2>
+            <p>This documentation page.</p>
+            <div class="url">/docs</div>
+        </div>
+        <div class="endpoint">
+            <h2>🤖 Model Information</h2>
+            <table class="param-table">
+                <tr><th>Property</th><th>Value</th></tr>
+                <tr><td>Algorithm</td><td>XGBoost Regressor (via sklearn Pipeline)</td></tr>
+                <tr><td>Target</td><td>rental_price_per_day (€)</td></tr>
+                <tr><td>RMSE</td><td>~XX €</td></tr>
+                <tr><td>R²</td><td>~0.XX</td></tr>
+            </table>
+            <p style="margin-top:12px; color:#888; font-size:0.85em;">
+                ⚠️ Replace RMSE and R² with your actual results from the notebook.
+            </p>
+        </div>
+    </div>
+    <footer>GetAround Pricing API — Built with FastAPI 🚀</footer>
+    </body>
+    </html>
+    """
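The README in this commit states that `/predict` rejects rows of the wrong length with a `422` error, but `predict` as committed passes `data.input` straight to `pd.DataFrame`, so a short row would most likely surface as an unhandled `ValueError` instead. Below is a minimal sketch of the row-length check the README describes. `validate_rows` is a hypothetical helper, not part of the committed code, and the feature list is copied from `feature_names.json`.

```python
def validate_rows(rows, feature_names):
    """Return a list of error messages; an empty list means the input is well-formed."""
    expected = len(feature_names)
    errors = []
    if not rows:
        errors.append("input must contain at least one row")
    for i, row in enumerate(rows):
        if len(row) != expected:
            errors.append(f"row {i}: expected {expected} values, got {len(row)}")
    return errors


# Feature order copied from feature_names.json in this commit
FEATURE_NAMES = [
    "model_key", "mileage", "engine_power", "fuel", "paint_color", "car_type",
    "private_parking_available", "has_gps", "has_air_conditioning",
    "automatic_car", "has_getaround_connect", "has_speed_regulator", "winter_tires",
]

# The 9-value row used in the README and schema examples fails the check
errors = validate_rows([[150000, 120, 1, 1, 1, 0, 1, 1, 0]], FEATURE_NAMES)
print(errors)
```

Inside the route, a non-empty result could be turned into `raise HTTPException(status_code=422, detail=errors)` using FastAPI's `HTTPException`, which would match the behaviour the README promises.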

feature_names.json ADDED

	@@ -0,0 +1 @@


+["model_key", "mileage", "engine_power", "fuel", "paint_color", "car_type", "private_parking_available", "has_gps", "has_air_conditioning", "automatic_car", "has_getaround_connect", "has_speed_regulator", "winter_tires"]
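This file lists 13 feature names, while the request examples elsewhere in the commit pass 9 values per row. One way to keep client requests aligned with the file is to build each row from a dict keyed by feature name. The sketch below assumes this approach: `build_payload` is a hypothetical helper, and the categorical values are placeholders. Whether the committed pipeline expects raw strings or encoded values for `model_key`, `fuel`, `paint_color`, and `car_type` is not visible in this diff.

```python
import json

# Feature order copied from feature_names.json in this commit
FEATURE_NAMES = [
    "model_key", "mileage", "engine_power", "fuel", "paint_color", "car_type",
    "private_parking_available", "has_gps", "has_air_conditioning",
    "automatic_car", "has_getaround_connect", "has_speed_regulator", "winter_tires",
]


def build_payload(cars, feature_names=FEATURE_NAMES):
    """Turn a list of feature dicts into the {"input": [[...], ...]} body used by /predict."""
    rows = []
    for car in cars:
        missing = [name for name in feature_names if name not in car]
        if missing:
            raise KeyError(f"missing features: {missing}")
        # Emit values in the exact order the model was trained on
        rows.append([car[name] for name in feature_names])
    return {"input": rows}


# Hypothetical car; the four categorical values are placeholders, not confirmed categories
car = {
    "model_key": "example_brand", "mileage": 150000, "engine_power": 120,
    "fuel": "example_fuel", "paint_color": "example_color", "car_type": "example_type",
    "private_parking_available": 1, "has_gps": 1, "has_air_conditioning": 1,
    "automatic_car": 0, "has_getaround_connect": 1, "has_speed_regulator": 1,
    "winter_tires": 0,
}

payload = build_payload([car])
body = json.dumps(payload)  # JSON string to send as the POST body
```

Keying by name means a reordered or extended `feature_names.json` changes the row layout automatically, and a missing feature fails on the client before any request is sent.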

model_metrics.json ADDED

	@@ -0,0 +1 @@


+{"RMSE": 16.602761905202982, "MAE": 10.496041297912598, "R\u00b2": 0.7382780909538269, "CV_RMSE_mean": 16.862179946899413, "CV_RMSE_std": 1.2674684824599929}
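`app.py` defines `METRICS_PATH` but never reads this file, and both the README and the custom docs page still carry metric placeholders. The sketch below shows one way these values could be formatted for display. `format_metrics` is a hypothetical helper, and the JSON string is a copy of the committed file contents. Note that this file reports R² ≈ 0.74 while the README states ~0.68; the diff does not say which figure is current.

```python
import json

# Contents of model_metrics.json as committed (raw string keeps the \u00b2 escape intact)
METRICS_JSON = (
    r'{"RMSE": 16.602761905202982, "MAE": 10.496041297912598, '
    r'"R\u00b2": 0.7382780909538269, "CV_RMSE_mean": 16.862179946899413, '
    r'"CV_RMSE_std": 1.2674684824599929}'
)


def format_metrics(metrics):
    """Round raw metrics to two decimals and attach units for display."""
    return {
        "RMSE": f"{metrics['RMSE']:.2f} €",
        "MAE": f"{metrics['MAE']:.2f} €",
        "R²": f"{metrics['R²']:.2f}",
        "CV RMSE": f"{metrics['CV_RMSE_mean']:.2f} ± {metrics['CV_RMSE_std']:.2f} €",
    }


metrics = json.loads(METRICS_JSON)  # the \u00b2 escape decodes to the key "R²"
display = format_metrics(metrics)
print(display)
```

In the app itself, the same dict could be loaded once at startup from `METRICS_PATH` and interpolated into the HTML table instead of hard-coding the values.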

pipeline.pkl ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:39b815b2a06f3cd4c43e246dfc5af9d8178a328c2cc87bf0b4acfee2af903981
+size 348162

requirements.txt ADDED

	@@ -0,0 +1,12 @@

+fastapi==0.115.0
+uvicorn==0.30.6
+pandas==2.2.2
+numpy==1.26.4
+scikit-learn==1.5.1
+joblib==1.4.2
+xgboost==3.1.2
+pydantic==2.9.2