Spaces: VolarisLLC/LungCancerPrediction


saifisvibinn committed on Nov 11, 2025

Commit 6f2b9f4 (1 parent: ac19fe4)

Replace user registration API with lung cancer prediction API

Files changed (8)

Dockerfile +3 -0
FASTAPI_README.md +0 -195
README.md +49 -15
best_lung_cancer_model.joblib +0 -0
main.py +322 -175
model_loader.py +141 -0
requirements.txt +5 -1
scaler.joblib +0 -0

Dockerfile CHANGED

@@ -12,6 +12,9 @@ RUN pip install --no-cache-dir -r requirements.txt
 # Copy application code
 COPY main.py .
+COPY model_loader.py .
+COPY best_lung_cancer_model.joblib .
+COPY scaler.joblib .
 COPY start.sh .
 # Make startup script executable
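
The three added `COPY` lines mean the image now depends on `model_loader.py`, `best_lung_cancer_model.joblib`, and `scaler.joblib` being present next to `main.py`. A minimal sketch of a startup check for those files follows; the helper name `missing_artifacts` and the idea of running it before loading the model are assumptions for illustration, not part of this commit.

```python
from pathlib import Path

# File names are taken from the COPY lines in the Dockerfile diff.
REQUIRED_ARTIFACTS = (
    "model_loader.py",
    "best_lung_cancer_model.joblib",
    "scaler.joblib",
)


def missing_artifacts(base_dir, names=REQUIRED_ARTIFACTS):
    """Return the names from `names` that are not regular files in `base_dir`."""
    base = Path(base_dir)
    return [name for name in names if not (base / name).is_file()]
```

Calling `missing_artifacts(".")` from the working directory of the container would return an empty list when all three files were copied, which gives a clearer failure message than an unpickling error later on.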

FASTAPI_README.md DELETED

@@ -1,195 +0,0 @@
-# FastAPI Backend - User Registration API
-A clean, production-ready FastAPI backend with user registration functionality.
-## Features
-- ✅ RESTful API endpoints
-- ✅ Automatic Swagger/OpenAPI documentation
-- ✅ Pydantic models for request validation
-- ✅ Age validation (18+)
-- ✅ Clean, readable, production-ready code
-- ✅ Comprehensive error handling
-- ✅ Type hints throughout
-## Installation
-1. **Install dependencies:**
-   ```bash
-   pip install -r requirements.txt
-   ```
-## Running the API
-### Development Mode (with auto-reload)
-```bash
-uvicorn main:app --reload
-```
-### Production Mode
-```bash
-uvicorn main:app --host 0.0.0.0 --port 8000
-```
-The API will be available at:
-- **API**: http://localhost:8000
-- **Swagger UI**: http://localhost:8000/docs
-- **ReDoc**: http://localhost:8000/redoc
-## API Endpoints
-### 1. GET /status
-Check API status.
-**Response:**
-```json
-{
-  "status": "API is running"
-}
-```
-**Example:**
-```bash
-curl http://localhost:8000/status
-```
-### 2. POST /register
-Register a new user.
-**Request Body:**
-```json
-{
-  "name": "John Doe",
-  "email": "john.doe@example.com",
-  "age": 25
-}
-```
-**Success Response (201 Created):**
-```json
-{
-  "success": true,
-  "message": "User registered successfully",
-  "user": {
-    "name": "John Doe",
-    "email": "john.doe@example.com",
-    "age": 25
-  }
-}
-```
-**Error Response (400 Bad Request):**
-```json
-{
-  "success": false,
-  "error": "User must be at least 18",
-  "status_code": 400
-}
-```
-**Example:**
-```bash
-curl -X POST http://localhost:8000/register \
-  -H "Content-Type: application/json" \
-  -d '{
-    "name": "John Doe",
-    "email": "john.doe@example.com",
-    "age": 25
-  }'
-```
-## Validation Rules
-- **name**: Required, 1-100 characters, cannot be empty
-- **email**: Required, must be valid email format
-- **age**: Required, must be 18 or older
-## API Documentation
-FastAPI automatically generates interactive API documentation:
-- **Swagger UI**: http://localhost:8000/docs
-- **ReDoc**: http://localhost:8000/redoc
-## Testing
-### Test with cURL
-**Status endpoint:**
-```bash
-curl http://localhost:8000/status
-```
-**Register endpoint (valid):**
-```bash
-curl -X POST http://localhost:8000/register \
-  -H "Content-Type: application/json" \
-  -d '{"name": "Jane Smith", "email": "jane@example.com", "age": 25}'
-```
-**Register endpoint (invalid age):**
-```bash
-curl -X POST http://localhost:8000/register \
-  -H "Content-Type: application/json" \
-  -d '{"name": "Young User", "email": "young@example.com", "age": 16}'
-```
-### Test with Python
-```python
-import requests
-# Test status endpoint
-response = requests.get("http://localhost:8000/status")
-print(response.json())
-# Test register endpoint
-response = requests.post(
-    "http://localhost:8000/register",
-    json={
-        "name": "John Doe",
-        "email": "john@example.com",
-        "age": 25
-    }
-)
-print(response.json())
-```
-## Project Structure
-```
-.
-├── main.py              # FastAPI application
-├── requirements.txt     # Python dependencies
-└── FASTAPI_README.md    # This file
-```
-## Code Quality
-- ✅ Type hints throughout
-- ✅ Comprehensive docstrings
-- ✅ Pydantic models for validation
-- ✅ Proper HTTP status codes
-- ✅ Error handling
-- ✅ Clean, readable code structure
-- ✅ Production-ready patterns
-## Next Steps
-To make this production-ready, consider adding:
-1. **Database Integration**: Store users in a database (PostgreSQL, MongoDB, etc.)
-2. **Authentication**: Add JWT or OAuth2 authentication
-3. **Password Hashing**: If adding passwords, use bcrypt or similar
-4. **Email Verification**: Send confirmation emails
-5. **Rate Limiting**: Prevent abuse
-6. **Logging**: Add structured logging
-7. **Testing**: Add unit and integration tests
-8. **Docker**: Containerize the application
-9. **Environment Variables**: Use .env for configuration
-10. **CORS**: Configure CORS if needed for frontend integration
-## License
-This project is provided as-is for educational purposes.

README.md CHANGED

@@ -1,6 +1,14 @@
-# FastAPI User Registration Backend
-A clean, production-ready FastAPI backend with user registration functionality.
 ## Quick Start
@@ -24,21 +32,47 @@ See `QUICK_DEPLOY.md` for quick deployment instructions (Railway recommended).
 For detailed deployment options, see `DEPLOYMENT.md`.
-## Documentation
-- **Quick Start Guide:** `FASTAPI_README.md`
-- **Deployment Guide:** `DEPLOYMENT.md`
-- **Quick Deploy:** `QUICK_DEPLOY.md`
 ## API Endpoints
 - `GET /status` - Check API status
-- `POST /register` - Register a new user (requires name, email, age 18+)
-## Features
-- ✅ RESTful API endpoints
-- ✅ Automatic Swagger/OpenAPI documentation
-- ✅ Pydantic models for request validation
-- ✅ Age validation (18+)
-- ✅ Clean, production-ready code

+# Lung Cancer Prediction API
+A FastAPI-based REST API for predicting lung cancer risk based on patient symptoms and characteristics.
+## Features
+- ✅ RESTful API endpoints
+- ✅ Automatic Swagger/OpenAPI documentation
+- ✅ Pydantic models for request validation
+- ✅ CORS support for web applications
+- ✅ Production-ready with error handling
 ## Quick Start
 For detailed deployment options, see `DEPLOYMENT.md`.
 ## API Endpoints
+- `GET /` - API information
 - `GET /status` - Check API status
+- `POST /predict` - Predict lung cancer risk
+## Request Format
+```json
+{
+  "gender": "M",
+  "age": 65,
+  "smoking": "YES",
+  "yellow_fingers": "NO",
+  "anxiety": "NO",
+  "peer_pressure": "NO",
+  "chronic_disease": "YES",
+  "fatigue": "YES",
+  "allergy": "NO",
+  "wheezing": "YES",
+  "alcohol": "NO",
+  "coughing": "YES",
+  "shortness_of_breath": "YES",
+  "swallowing_difficulty": "NO",
+  "chest_pain": "YES"
+}
+```
+## Response Format
+```json
+{
+  "success": true,
+  "prediction": "YES",
+  "probability": 87.5,
+  "message": "Prediction: YES (Confidence: 87.50%)"
+}
+```
+## Notes
+- This application is for educational/research purposes only
+- Medical predictions should always be verified by healthcare professionals
+- The model accuracy depends on the quality of the training data
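
The request format documented in the new README can be exercised from a small client. The sketch below builds the JSON body and a `POST /predict` request with the standard library only; the helper names (`build_payload`, `build_predict_request`, `send`) and the base URL `http://localhost:8000` are illustrative assumptions, while the field names and `YES`/`NO` values come from the README.

```python
import json
import urllib.request

# Symptom field names as listed in the README request format.
SYMPTOM_FIELDS = (
    "smoking", "yellow_fingers", "anxiety", "peer_pressure",
    "chronic_disease", "fatigue", "allergy", "wheezing", "alcohol",
    "coughing", "shortness_of_breath", "swallowing_difficulty", "chest_pain",
)


def build_payload(gender, age, positive_symptoms):
    """Build a /predict body; symptoms not listed as positive are sent as "NO"."""
    unknown = set(positive_symptoms) - set(SYMPTOM_FIELDS)
    if unknown:
        raise ValueError(f"unknown symptom fields: {sorted(unknown)}")
    payload = {"gender": gender, "age": age}
    for field in SYMPTOM_FIELDS:
        payload[field] = "YES" if field in positive_symptoms else "NO"
    return payload


def build_predict_request(base_url, payload):
    """Create (but do not send) a JSON POST request for the /predict endpoint."""
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/predict",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request):
    """Send the request and decode the JSON response (needs a running server)."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

With a server running locally, `send(build_predict_request("http://localhost:8000", build_payload("M", 65, {"smoking", "coughing"})))` should return a dictionary shaped like the README response format.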

best_lung_cancer_model.joblib ADDED

Binary file (59.7 kB).
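
A pickled scikit-learn model is sensitive to the library version that loads it, which is why `main.py` in this commit prints troubleshooting steps suggesting `scikit-learn==1.2.2`. A minimal sketch of a pre-load version check follows. It assumes, without confirmation from the commit, that 1.2.2 is the version the model was trained with, and the helper names are hypothetical.

```python
import re
from importlib import metadata

# Assumption: the version named in main.py's troubleshooting output.
ASSUMED_TRAINING_VERSION = "1.2.2"


def major_minor(version):
    """Extract (major, minor) from a version string such as "1.2.2"."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def versions_compatible(installed, trained=ASSUMED_TRAINING_VERSION):
    """Treat versions as compatible when major and minor numbers match."""
    return major_minor(installed) == major_minor(trained)


def installed_version(package="scikit-learn"):
    """Return the installed version of `package`, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None
```

Comparing only major and minor numbers is a design choice for this sketch; a stricter deployment could require an exact match.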

main.py CHANGED

@@ -1,114 +1,226 @@
 """
-FastAPI Backend Application
-A simple REST API with user registration functionality.
 """
 from fastapi import FastAPI, HTTPException, status, Request
 from fastapi.responses import JSONResponse
 from fastapi.exceptions import RequestValidationError
-from pydantic import BaseModel, EmailStr, Field, field_validator
-import uvicorn
 import os
 # Initialize FastAPI application
-# This automatically enables Swagger UI at /docs and ReDoc at /redoc
 app = FastAPI(
-    title="User Registration API",
-    description="A simple API for user registration with validation",
     version="1.0.0",
-    docs_url="/docs",  # Swagger UI documentation
-    redoc_url="/redoc"  # Alternative API documentation
 )
 # ============================================================================
 # Pydantic Models for Request/Response Validation
 # ============================================================================
-class RegisterRequest(BaseModel):
     """
-    Request model for user registration.
-    Validates name, email, and age fields.
     """
-    name: str = Field(
-        ...,
-        min_length=1,
-        max_length=100,
-        description="User's full name",
-        examples=["John Doe"]
-    )
-    email: EmailStr = Field(
-        ...,
-        description="User's email address",
-        examples=["john.doe@example.com"]
-    )
-    age: int = Field(
-        ...,
-        description="User's age (must be 18 or older)",
-        examples=[25]
-    )
-    @field_validator('name')
     @classmethod
-    def validate_name(cls, v: str) -> str:
-        """Validate that name is not empty after stripping whitespace."""
-        if not v.strip():
-            raise ValueError("Name cannot be empty")
-        return v.strip()
-    @field_validator('age')
     @classmethod
-    def validate_age(cls, v: int) -> int:
-        """Validate that age is 18 or older."""
-        if v < 18:
-            raise ValueError("User must be at least 18")
         return v
-    class Config:
-        """Pydantic configuration."""
-        json_schema_extra = {
-            "example": {
-                "name": "John Doe",
-                "email": "john.doe@example.com",
-                "age": 25
-            }
-        }
-class RegisterResponse(BaseModel):
     """
-    Response model for successful registration.
     """
-    success: bool = Field(
-        ...,
-        description="Indicates if registration was successful",
-        examples=[True]
-    )
-    message: str = Field(
-        ...,
-        description="Confirmation message",
-        examples=["User registered successfully"]
-    )
-    user: dict = Field(
-        ...,
-        description="Registered user information",
-        examples=[{
-            "name": "John Doe",
-            "email": "john.doe@example.com",
-            "age": 25
-        }]
-    )
 class StatusResponse(BaseModel):
     """
     Response model for status endpoint.
     """
-    status: str = Field(
-        ...,
-        description="API status message",
-        examples=["API is running"]
-    )
 # ============================================================================
@@ -118,24 +230,19 @@ class StatusResponse(BaseModel):
 @app.get(
     "/",
     summary="API Root",
-    description="Root endpoint with API information and available endpoints",
     tags=["Info"]
 )
 async def root():
-    """
-    Root endpoint that provides API information.
-    Returns:
-        dict: API information and available endpoints
-    """
     return {
-        "message": "Welcome to the User Registration API",
         "version": "1.0.0",
         "docs": "/docs",
         "redoc": "/redoc",
         "endpoints": {
             "GET /status": "Check API status",
-            "POST /register": "Register a new user (requires name, email, age 18+)"
         }
     }
@@ -144,7 +251,7 @@ async def root():
     "/status",
     response_model=StatusResponse,
     summary="Check API Status",
-    description="Returns the current status of the API",
     tags=["Health"]
 )
 async def get_status():
@@ -152,90 +259,154 @@ async def get_status():
     Health check endpoint.
     Returns:
-        JSONResponse: Status message indicating the API is running
-    Example Response:
-        {
-            "status": "API is running"
-        }
     """
-    return StatusResponse(status="API is running")
 @app.post(
-    "/register",
-    response_model=RegisterResponse,
-    status_code=status.HTTP_201_CREATED,
-    summary="Register a New User",
-    description="Register a new user with name, email, and age. Age must be 18 or older.",
-    tags=["Users"]
 )
-async def register_user(user_data: RegisterRequest):
     """
-    Register a new user endpoint.
-    This endpoint accepts user registration data and validates:
-    - Name: Must be non-empty string (1-100 characters)
-    - Email: Must be a valid email format
-    - Age: Must be 18 or older
     Args:
-        user_data (RegisterRequest): User registration data
     Returns:
-        RegisterResponse: Success confirmation with user data
     Raises:
-        HTTPException: 400 Bad Request if validation fails
-        HTTPException: 422 Unprocessable Entity if request format is invalid
-    Example Request:
-        {
-            "name": "John Doe",
-            "email": "john.doe@example.com",
-            "age": 25
-        }
-    Example Response:
-        {
-            "success": true,
-            "message": "User registered successfully",
-            "user": {
-                "name": "John Doe",
-                "email": "john.doe@example.com",
-                "age": 25
-            }
-        }
     """
-    # Age validation is handled by Pydantic field_validator
-    # In a real application, you would:
-    # 1. Check if email already exists in database
-    # 2. Hash password if included
-    # 3. Save user to database
-    # 4. Send confirmation email
-    # For now, we'll just return a success response
-    return RegisterResponse(
-        success=True,
-        message="User registered successfully",
-        user={
-            "name": user_data.name,
-            "email": user_data.email,
-            "age": user_data.age
-        }
-    )
 # ============================================================================
-# Custom Exception Handlers
 # ============================================================================
 @app.exception_handler(HTTPException)
-async def http_exception_handler(request, exc: HTTPException):
-    """
-    Custom handler for HTTP exceptions.
-    Returns consistent error response format.
-    """
     return JSONResponse(
         status_code=exc.status_code,
         content={
@@ -248,29 +419,8 @@ async def http_exception_handler(request, exc: HTTPException):
 @app.exception_handler(RequestValidationError)
 async def validation_exception_handler(request: Request, exc: RequestValidationError):
-    """
-    Custom handler for Pydantic validation errors.
-    Converts validation errors to 400 Bad Request with custom message for age.
-    """
     errors = exc.errors()
-    # Check if the error is related to age validation
-    for error in errors:
-        error_loc = error.get("loc", [])
-        error_msg = str(error.get("msg", ""))
-        # Check if this is an age validation error
-        if "age" in error_loc and ("User must be at least 18" in error_msg or "18" in error_msg):
-            return JSONResponse(
-                status_code=status.HTTP_400_BAD_REQUEST,
-                content={
-                    "success": False,
-                    "error": "User must be at least 18",
-                    "status_code": 400
-                }
-            )
-    # For other validation errors, return standard format
     return JSONResponse(
         status_code=status.HTTP_422_UNPROCESSABLE_ENTITY,
         content={
@@ -287,12 +437,10 @@ async def validation_exception_handler(request: Request, exc: RequestValidationE
 # ============================================================================
 if __name__ == "__main__":
-    # Run the application using uvicorn
     # Get port from environment variable (for deployment) or default to 8000
     port = int(os.environ.get("PORT", 8000))
     # --reload enables auto-reload on code changes (development only)
-    # In production, reload should be False
     reload = os.environ.get("ENVIRONMENT", "development") == "development"
     uvicorn.run(
@@ -301,4 +449,3 @@ if __name__ == "__main__":
         port=port,
         reload=reload
     )

 """
+FastAPI Lung Cancer Prediction API
+A RESTful API for predicting lung cancer risk based on patient symptoms and characteristics.
 """
 from fastapi import FastAPI, HTTPException, status, Request
 from fastapi.responses import JSONResponse
 from fastapi.exceptions import RequestValidationError
+from fastapi.middleware.cors import CORSMiddleware
+from pydantic import BaseModel, Field, field_validator
+import numpy as np
+import sys
+import warnings
 import os
+import uvicorn
+warnings.filterwarnings('ignore')
 # Initialize FastAPI application
 app = FastAPI(
+    title="Lung Cancer Prediction API",
+    description="A RESTful API for predicting lung cancer risk based on patient symptoms",
     version="1.0.0",
+    docs_url="/docs",
+    redoc_url="/redoc"
 )
+# Enable CORS for all origins
+app.add_middleware(
+    CORSMiddleware,
+    allow_origins=["*"],
+    allow_credentials=True,
+    allow_methods=["*"],
+    allow_headers=["*"],
+)
+# ============================================================================
+# Model Loading with Compatibility Handling
+# ============================================================================
+model = None
+scaler = None
+# Try to load using the robust loader
+try:
+    import sklearn
+    print(f"scikit-learn version: {sklearn.__version__}")
+    # First, try aggressive patching - USE EuclideanDistance64 (not 32!)
+    try:
+        import sklearn.metrics._dist_metrics as dist_metrics
+        # Patch EuclideanDistance if missing - prioritize 64-bit version
+        if not hasattr(dist_metrics, 'EuclideanDistance'):
+            print("Attempting to patch EuclideanDistance...")
+            # Try option 1: Use EuclideanDistance64 (model uses 64-bit)
+            if hasattr(dist_metrics, 'EuclideanDistance64'):
+                EuclideanDistance64 = dist_metrics.EuclideanDistance64
+                dist_metrics.EuclideanDistance = EuclideanDistance64
+                setattr(dist_metrics, 'EuclideanDistance', EuclideanDistance64)
+                # Update in sys.modules - CRITICAL for unpickling
+                mod_name = 'sklearn.metrics._dist_metrics'
+                if mod_name in sys.modules:
+                    setattr(sys.modules[mod_name], 'EuclideanDistance', EuclideanDistance64)
+                if hasattr(dist_metrics, '__dict__'):
+                    dist_metrics.__dict__['EuclideanDistance'] = EuclideanDistance64
+                print("[OK] Patched EuclideanDistance using EuclideanDistance64")
+            # Fallback: Use EuclideanDistance32
+            elif hasattr(dist_metrics, 'EuclideanDistance32'):
+                EuclideanDistance32 = dist_metrics.EuclideanDistance32
+                dist_metrics.EuclideanDistance = EuclideanDistance32
+                setattr(dist_metrics, 'EuclideanDistance', EuclideanDistance32)
+                mod_name = 'sklearn.metrics._dist_metrics'
+                if mod_name in sys.modules:
+                    setattr(sys.modules[mod_name], 'EuclideanDistance', EuclideanDistance32)
+                if hasattr(dist_metrics, '__dict__'):
+                    dist_metrics.__dict__['EuclideanDistance'] = EuclideanDistance32
+                print("[OK] Patched EuclideanDistance using EuclideanDistance32")
+        # Ensure patch is in sys.modules
+        if 'sklearn.metrics._dist_metrics' in sys.modules and hasattr(dist_metrics, 'EuclideanDistance'):
+            if not hasattr(sys.modules['sklearn.metrics._dist_metrics'], 'EuclideanDistance'):
+                setattr(sys.modules['sklearn.metrics._dist_metrics'], 'EuclideanDistance', dist_metrics.EuclideanDistance)
+    except Exception as patch_error:
+        print(f"Warning: Could not apply pre-patch: {patch_error}")
+        import traceback
+        traceback.print_exc()
+    # Now try to load the model
+    try:
+        print("Loading model...")
+        import joblib
+        # Try standard loading first
+        try:
+            model = joblib.load('best_lung_cancer_model.joblib')
+            scaler = joblib.load('scaler.joblib')
+            print("[OK] Model and scaler loaded successfully!")
+        except (AttributeError, ModuleNotFoundError, KeyError) as e:
+            if 'EuclideanDistance' in str(e) or 'EuclideanDistance' in repr(e):
+                print("Compatibility issue detected. Trying alternative loading method...")
+                # Try using the model_loader
+                try:
+                    from model_loader import load_sklearn_model_safe
+                    model, scaler = load_sklearn_model_safe('best_lung_cancer_model.joblib', 'scaler.joblib')
+                    print("[OK] Model and scaler loaded successfully using compatibility loader!")
+                except Exception as e2:
+                    print(f"Compatibility loader also failed: {e2}")
+                    raise e  # Raise original error
+            else:
+                raise
+        # Print model info if available
+        if hasattr(model, 'feature_names_in_'):
+            print(f"Model expects {len(model.feature_names_in_)} features")
+            print(f"Features: {list(model.feature_names_in_)}")
+        if hasattr(model, 'classes_'):
+            print(f"Model classes: {model.classes_}")
+        if scaler and hasattr(scaler, 'n_features_in_'):
+            print(f"Scaler expects {scaler.n_features_in_} features")
+    except Exception as e:
+        error_msg = str(e)
+        print("\n" + "="*70)
+        print("MODEL LOADING ERROR")
+        print("="*70)
+        print(f"\nError: {error_msg}")
+        print("\nTroubleshooting steps:")
+        print("\n1. Try installing a compatible scikit-learn version:")
+        print("   pip uninstall scikit-learn")
+        print("   pip install scikit-learn==1.2.2")
+        print("\n2. If that doesn't work, try using Python 3.10 or 3.11")
+        print("   (Python 3.12 may have compatibility issues)")
+        print("\n3. Alternative: Install scikit-learn with pre-built wheels:")
+        print("   pip install --only-binary :all: scikit-learn==1.2.2")
+        print("\n4. Check that both model files exist:")
+        print("   - best_lung_cancer_model.joblib")
+        print("   - scaler.joblib")
+        print("="*70 + "\n")
+        import traceback
+        traceback.print_exc()
+        model = None
+        scaler = None
+except Exception as e:
+    print(f"Critical error during initialization: {e}")
+    import traceback
+    traceback.print_exc()
+    model = None
+    scaler = None
 # ============================================================================
 # Pydantic Models for Request/Response Validation
 # ============================================================================
+class PredictionRequest(BaseModel):
     """
+    Request model for lung cancer prediction.
     """
+    gender: str = Field(..., description="Patient gender", examples=["M"])
+    age: float = Field(..., ge=1, le=150, description="Patient age", examples=[65])
+    smoking: str = Field(..., description="Smoking status", examples=["YES"])
+    yellow_fingers: str = Field(..., description="Yellow fingers symptom", examples=["NO"])
+    anxiety: str = Field(..., description="Anxiety symptom", examples=["NO"])
+    peer_pressure: str = Field(..., description="Peer pressure", examples=["NO"])
+    chronic_disease: str = Field(..., description="Chronic disease", examples=["YES"])
+    fatigue: str = Field(..., description="Fatigue symptom", examples=["YES"])
+    allergy: str = Field(..., description="Allergy", examples=["NO"])
+    wheezing: str = Field(..., description="Wheezing symptom", examples=["YES"])
+    alcohol: str = Field(..., description="Alcohol consumption", examples=["NO"])
+    coughing: str = Field(..., description="Coughing symptom", examples=["YES"])
+    shortness_of_breath: str = Field(..., description="Shortness of breath", examples=["YES"])
+    swallowing_difficulty: str = Field(..., description="Swallowing difficulty", examples=["NO"])
+    chest_pain: str = Field(..., description="Chest pain symptom", examples=["YES"])
+    @field_validator('gender')
     @classmethod
+    def validate_gender(cls, v: str) -> str:
+        """Validate gender is M or F."""
+        v = v.upper()
+        if v not in ['M', 'F']:
+            raise ValueError('gender must be "M" or "F"')
+        return v
+    @field_validator('smoking', 'yellow_fingers', 'anxiety', 'peer_pressure',
+                     'chronic_disease', 'fatigue', 'allergy', 'wheezing',
+                     'alcohol', 'coughing', 'shortness_of_breath',
+                     'swallowing_difficulty', 'chest_pain')
     @classmethod
+    def validate_yes_no(cls, v: str) -> str:
+        """Validate YES/NO fields."""
+        v = v.upper()
+        if v not in ['YES', 'NO']:
+            raise ValueError('must be "YES" or "NO"')
         return v
+class PredictionResponse(BaseModel):
     """
+    Response model for prediction.
     """
+    success: bool = Field(..., description="Indicates if prediction was successful")
+    prediction: str = Field(..., description="Prediction result: YES or NO")
+    probability: float = Field(..., description="Confidence percentage")
+    message: str = Field(..., description="Human-readable message")
 class StatusResponse(BaseModel):
     """
     Response model for status endpoint.
     """
+    status: str = Field(..., description="API status message")
 # ============================================================================
 @app.get(
     "/",
     summary="API Root",
+    description="Root endpoint with API information",
     tags=["Info"]
 )
 async def root():
+    """Root endpoint that provides API information."""
     return {
+        "message": "Welcome to the Lung Cancer Prediction API",
         "version": "1.0.0",
         "docs": "/docs",
         "redoc": "/redoc",
         "endpoints": {
             "GET /status": "Check API status",
+            "POST /predict": "Predict lung cancer risk"
         }
     }
     "/status",
     response_model=StatusResponse,
     summary="Check API Status",
+    description="Returns the current status of the API and model loading status",
     tags=["Health"]
 )
 async def get_status():
     Health check endpoint.
     Returns:
+        StatusResponse: Status message indicating if API and model are ready
     """
+    if model is None or scaler is None:
+        raise HTTPException(
+            status_code=status.HTTP_503_SERVICE_UNAVAILABLE,
+            detail="Model or scaler not loaded"
+        )
+    return StatusResponse(status="API is running and model is loaded")
 @app.post(
+    "/predict",
+    response_model=PredictionResponse,
+    summary="Predict Lung Cancer Risk",
+    description="Predict lung cancer risk based on patient symptoms and characteristics",
+    tags=["Prediction"]
 )
+async def predict(data: PredictionRequest):
     """
+    Predict lung cancer risk based on patient data.
     Args:
+        data: PredictionRequest containing patient information
     Returns:
+        PredictionResponse: Prediction result with confidence score
     Raises:
+        HTTPException: 500 if the model or scaler is not loaded, 400 if prediction fails
     """
+    if model is None or scaler is None:
+        raise HTTPException(
+            status_code=status.HTTP_500_INTERNAL_SERVER_ERROR,
+            detail="Model or scaler not loaded. Please check server logs for details."
+        )
+    try:
+        # Convert YES/NO to numeric (YES=2, NO=1)
+        smoking = 2 if data.smoking == 'YES' else 1
+        yellow_fingers = 2 if data.yellow_fingers == 'YES' else 1
+        anxiety = 2 if data.anxiety == 'YES' else 1
+        peer_pressure = 2 if data.peer_pressure == 'YES' else 1
+        chronic_disease = 2 if data.chronic_disease == 'YES' else 1
+        fatigue = 2 if data.fatigue == 'YES' else 1
+        allergy = 2 if data.allergy == 'YES' else 1
+        wheezing = 2 if data.wheezing == 'YES' else 1
+        alcohol = 2 if data.alcohol == 'YES' else 1
+        coughing = 2 if data.coughing == 'YES' else 1
+        shortness_of_breath = 2 if data.shortness_of_breath == 'YES' else 1
+        swallowing_difficulty = 2 if data.swallowing_difficulty == 'YES' else 1
+        chest_pain = 2 if data.chest_pain == 'YES' else 1
+        # Try different gender encodings
+        # Pattern 1: M=1, F=0 (binary)
+        gender_encoded = 1 if data.gender == 'M' else 0
+        # Create feature array
+        features_v1 = np.array([[
+            gender_encoded,  # Gender: M=1, F=0
+            data.age,
+            smoking,
+            yellow_fingers,
+            anxiety,
+            peer_pressure,
+            chronic_disease,
+            fatigue,
+            allergy,
+            wheezing,
+            alcohol,
+            coughing,
+            shortness_of_breath,
+            swallowing_difficulty,
+            chest_pain
+        ]], dtype=np.float64)
+        # Try alternative: gender as M=2, F=1
+        gender_encoded_v2 = 2 if data.gender == 'M' else 1
+        features_v2 = np.array([[
+            gender_encoded_v2,  # Gender: M=2, F=1
+            data.age,
+            smoking,
+            yellow_fingers,
+            anxiety,
+            peer_pressure,
+            chronic_disease,
+            fatigue,
+            allergy,
+            wheezing,
+            alcohol,
+            coughing,
+            shortness_of_breath,
+            swallowing_difficulty,
+            chest_pain
+        ]], dtype=np.float64)
+        # Try to make prediction with first encoding
+        try:
+            features_scaled = scaler.transform(features_v1)
+            prediction = model.predict(features_scaled)[0]
+            prediction_proba = model.predict_proba(features_scaled)[0]
+        except Exception:
+            # If that fails, try second encoding
+            try:
+                features_scaled = scaler.transform(features_v2)
+                prediction = model.predict(features_scaled)[0]
+                prediction_proba = model.predict_proba(features_scaled)[0]
+            except Exception as e:
+                raise HTTPException(
+                    status_code=status.HTTP_400_BAD_REQUEST,
+                    detail=f"Error processing features: {str(e)}"
+                )
+        # Get probability and result
+        # Model classes are [0, 1] where 0=NO, 1=YES
+        if prediction == 1:
+            result = "YES"
+            probability = prediction_proba[1] * 100 if len(prediction_proba) > 1 else (1 - prediction_proba[0]) * 100
+        else:
+            result = "NO"
+            probability = prediction_proba[0] * 100
+        return PredictionResponse(
+            success=True,
+            prediction=result,
+            probability=round(probability, 2),
+            message=f'Prediction: {result} (Confidence: {probability:.2f}%)'
+        )
+    except HTTPException:
+        raise
+    except Exception as e:
+        import traceback
+        error_details = traceback.format_exc()
+        print(f"Prediction error: {error_details}")
+        raise HTTPException(
+            status_code=status.HTTP_400_BAD_REQUEST,
+            detail=f'Prediction failed: {str(e)}'
+        )
 # ============================================================================
+# Exception Handlers
 # ============================================================================
 @app.exception_handler(HTTPException)
+async def http_exception_handler(request: Request, exc: HTTPException):
+    """Custom handler for HTTP exceptions."""
     return JSONResponse(
         status_code=exc.status_code,
         content={
 @app.exception_handler(RequestValidationError)
 async def validation_exception_handler(request: Request, exc: RequestValidationError):
+    """Custom handler for validation errors."""
     errors = exc.errors()
     return JSONResponse(
         status_code=status.HTTP_422_UNPROCESSABLE_ENTITY,
         content={
 # ============================================================================
 if __name__ == "__main__":
     # Get port from environment variable (for deployment) or default to 8000
     port = int(os.environ.get("PORT", 8000))
     # --reload enables auto-reload on code changes (development only)
     reload = os.environ.get("ENVIRONMENT", "development") == "development"
     uvicorn.run(
         port=port,
         reload=reload
     )
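
The prediction handler above picks the reported confidence from the probability row according to the predicted class. A minimal sketch of that selection in isolation, assuming the class order [0, 1] with 0=NO and 1=YES stated in the handler's comment; the helper name `label_and_confidence` is hypothetical and does not appear in main.py:

```python
from typing import Sequence, Tuple


def label_and_confidence(prediction: int, proba: Sequence[float]) -> Tuple[str, float]:
    """Map a binary prediction and its probability row to (label, confidence in %)."""
    # Assumption: proba is ordered like the model classes, [P(NO), P(YES)].
    if prediction == 1:
        # If only one column is present, fall back to the complement of P(NO).
        p_yes = proba[1] if len(proba) > 1 else 1 - proba[0]
        return "YES", round(p_yes * 100, 2)
    return "NO", round(proba[0] * 100, 2)


print(label_and_confidence(1, [0.25, 0.75]))
print(label_and_confidence(0, [0.75, 0.25]))
```

Keeping this mapping in a small pure function would make it possible to unit-test the confidence logic without loading the model.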

model_loader.py ADDED Viewed

	@@ -0,0 +1,141 @@

+"""
+Robust model loader with compatibility fixes for scikit-learn version mismatches.
+"""
+import joblib
+import pickle
+import sys
+import warnings
+class SklearnCompatibilityUnpickler(pickle.Unpickler):
+    """Custom unpickler that handles scikit-learn compatibility issues."""
+    def find_class(self, module, name):
+        # Handle EuclideanDistance compatibility issue
+        if module == 'sklearn.metrics._dist_metrics' and name == 'EuclideanDistance':
+            try:
+                # Try to import and patch the module
+                import sklearn.metrics._dist_metrics as dist_metrics
+                # Check if EuclideanDistance exists
+                if not hasattr(dist_metrics, 'EuclideanDistance'):
+                    # Try to create it from available classes
+                    if hasattr(dist_metrics, 'EuclideanDistance32'):
+                        # Create a class that acts like EuclideanDistance
+                        class EuclideanDistanceWrapper(dist_metrics.EuclideanDistance32):
+                            pass
+                        dist_metrics.EuclideanDistance = EuclideanDistanceWrapper
+                    elif hasattr(dist_metrics, 'EuclideanDistance64'):
+                        class EuclideanDistanceWrapper(dist_metrics.EuclideanDistance64):
+                            pass
+                        dist_metrics.EuclideanDistance = EuclideanDistanceWrapper
+                    else:
+                        # Last resort: try to find it in neighbors module
+                        try:
+                            from sklearn.neighbors._dist_metrics import EuclideanDistance as ED
+                            dist_metrics.EuclideanDistance = ED
+                        except ImportError:
+                            # Create a minimal stub class
+                            class EuclideanDistanceStub:
+                                def __init__(self, *args, **kwargs):
+                                    pass
+                            dist_metrics.EuclideanDistance = EuclideanDistanceStub
+                return getattr(dist_metrics, 'EuclideanDistance')
+            except Exception as e:
+                warnings.warn(f"Could not patch EuclideanDistance: {e}")
+                # Fallback: return a stub class
+                class EuclideanDistanceStub:
+                    def __init__(self, *args, **kwargs):
+                        pass
+                return EuclideanDistanceStub
+        # For all other classes, use default behavior
+        return super().find_class(module, name)
+def load_model_with_compatibility(model_path):
+    """
+    Load a joblib model with compatibility fixes.
+    Args:
+        model_path: Path to the .joblib model file
+    Returns:
+        Loaded model object
+    """
+    try:
+        # First, try to patch the module before loading
+        try:
+            import sklearn.metrics._dist_metrics as dist_metrics
+            if not hasattr(dist_metrics, 'EuclideanDistance'):
+                if hasattr(dist_metrics, 'EuclideanDistance32'):
+                    dist_metrics.EuclideanDistance = dist_metrics.EuclideanDistance32
+                elif hasattr(dist_metrics, 'EuclideanDistance64'):
+                    dist_metrics.EuclideanDistance = dist_metrics.EuclideanDistance64
+        except Exception:
+            # Best-effort patch only; continue with the normal load path
+            pass
+        # Try standard loading first
+        try:
+            return joblib.load(model_path)
+        except (AttributeError, ModuleNotFoundError) as e:
+            if 'EuclideanDistance' in str(e):
+                # Try with custom unpickler
+                warnings.warn("Using compatibility mode to load model...")
+                try:
+                    # Read the file as a plain pickle stream with the custom unpickler
+                    # (this bypasses joblib's own array and compression handling)
+                    with open(model_path, 'rb') as f:
+                        unpickler = SklearnCompatibilityUnpickler(f)
+                        try:
+                            return unpickler.load()
+                        except Exception:
+                            # If that doesn't work, try monkey-patching more aggressively
+                            # Re-import after patching
+                            import importlib
+                            import sklearn.metrics._dist_metrics
+                            importlib.reload(sklearn.metrics._dist_metrics)
+                            # Patch again after reload
+                            dist_metrics = sklearn.metrics._dist_metrics
+                            if not hasattr(dist_metrics, 'EuclideanDistance'):
+                                if hasattr(dist_metrics, 'EuclideanDistance32'):
+                                    # Create a proper alias
+                                    dist_metrics.EuclideanDistance = type('EuclideanDistance',
+                                                                         (dist_metrics.EuclideanDistance32,), {})
+                            # Try loading again
+                            return joblib.load(model_path)
+                except Exception as e2:
+                    raise RuntimeError(f"Failed to load model even with compatibility mode: {e2}") from e2
+            else:
+                raise
+    except Exception as e:
+        raise RuntimeError(f"Error loading model from {model_path}: {e}") from e
+def load_sklearn_model_safe(model_path, scaler_path=None):
+    """
+    Safely load sklearn model and scaler with compatibility fixes.
+    Args:
+        model_path: Path to model .joblib file
+        scaler_path: Path to scaler .joblib file (optional)
+    Returns:
+        Tuple of (model, scaler) or (model, None) if scaler_path not provided
+    """
+    model = load_model_with_compatibility(model_path)
+    scaler = None
+    if scaler_path:
+        try:
+            scaler = load_model_with_compatibility(scaler_path)
+        except Exception as e:
+            warnings.warn(f"Could not load scaler: {e}")
+    return model, scaler
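
The loader above relies on overriding `pickle.Unpickler.find_class` to redirect a class name that is missing in the installed library version. A minimal, self-contained sketch of that technique using only the standard library; `legacy_mod`, `OldPoint`, `NewPoint`, and `RemappingUnpickler` are hypothetical names for illustration, not part of scikit-learn or this commit. Files written by joblib may store NumPy arrays outside the plain pickle stream, so this sketch only demonstrates the name redirection and is not a drop-in way to read a .joblib file.

```python
import io
import pickle
import sys
import types


class NewPoint:
    """Stand-in for the class that exists in the currently installed version."""

    def __init__(self, x):
        self.x = x


class RemappingUnpickler(pickle.Unpickler):
    """Resolve a class reference that no longer exists to a replacement class."""

    # (module, name) recorded in the pickle -> class to use instead (hypothetical)
    REMAP = {("legacy_mod", "OldPoint"): NewPoint}

    def find_class(self, module, name):
        replacement = self.REMAP.get((module, name))
        if replacement is not None:
            return replacement
        # Everything else is resolved the default way
        return super().find_class(module, name)


# Build a pickle that references legacy_mod.OldPoint, then remove the module
# to simulate loading under a version where that class is gone.
legacy = types.ModuleType("legacy_mod")
OldPoint = type("OldPoint", (), {"__module__": "legacy_mod"})
legacy.OldPoint = OldPoint
sys.modules["legacy_mod"] = legacy
old_instance = OldPoint()
old_instance.x = 3
payload = pickle.dumps(old_instance)
del sys.modules["legacy_mod"]

restored = RemappingUnpickler(io.BytesIO(payload)).load()
print(type(restored).__name__, restored.x)  # prints "NewPoint 3"
```

Returning the replacement class directly from `find_class` avoids mutating the library module at import time, which is the design trade-off against the monkey-patching approach used in the loader.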

requirements.txt CHANGED Viewed

@@ -2,4 +2,8 @@
 fastapi>=0.104.0
 uvicorn[standard]>=0.24.0
 pydantic>=2.0.0
-email-validator>=2.0.0

 fastapi>=0.104.0
 uvicorn[standard]>=0.24.0
 pydantic>=2.0.0
+# Machine Learning dependencies
+scikit-learn>=1.2.0
+joblib>=1.3.0
+numpy>=1.24.0

scaler.joblib ADDED Viewed

Binary file (1.52 kB).