Spaces:
Sleeping
Sleeping
Commit Β·
6c9c901
1
Parent(s): 1de8f23
Docker
Browse filesThis view is limited to 50 files because it contains too many changes. Β See raw diff
- .dockerignore +66 -0
- .env +3 -0
- Dockerfile +42 -0
- HUGGINGFACE_DEPLOYMENT.md +83 -0
- Procfile +1 -0
- Procfile.railway +2 -0
- README.md +5 -7
- app/__init__.py +0 -0
- app/__pycache__/__init__.cpython-311.pyc +0 -0
- app/__pycache__/__init__.cpython-39.pyc +0 -0
- app/__pycache__/config.cpython-311.pyc +0 -0
- app/__pycache__/config.cpython-39.pyc +0 -0
- app/__pycache__/core_security.cpython-311.pyc +0 -0
- app/__pycache__/database.cpython-311.pyc +0 -0
- app/__pycache__/dependencies.cpython-311.pyc +0 -0
- app/__pycache__/main.cpython-311.pyc +0 -0
- app/__pycache__/main.cpython-39.pyc +0 -0
- app/__pycache__/schemas.cpython-311.pyc +0 -0
- app/config.py +27 -0
- app/core_security.py +27 -0
- app/database.py +55 -0
- app/dependencies.py +28 -0
- app/main.py +89 -0
- app/routers/__init__.py +0 -0
- app/routers/__pycache__/__init__.cpython-311.pyc +0 -0
- app/routers/__pycache__/auth.cpython-311.pyc +0 -0
- app/routers/__pycache__/incidents.cpython-311.pyc +0 -0
- app/routers/auth.py +69 -0
- app/routers/incidents.py +113 -0
- app/schemas.py +56 -0
- app/services/__init__.py +0 -0
- app/services/fallback_storage.py +0 -0
- app/services/incidents.py +32 -0
- app/services/ml_model.py +158 -0
- app/services/ml_model_training.py +277 -0
- app/services/nlp.py +105 -0
- app/services/users.py +53 -0
- eda.ipynb +265 -0
- incidents.csv +0 -0
- models/severity_model.pkl +3 -0
- models/threat_model.pkl +3 -0
- requirements-docker.txt +21 -0
- requirements-railway-light.txt +13 -0
- requirements-railway.txt +22 -0
- requirements-training.txt +15 -0
- requirements.txt +22 -0
- start-hf.sh +16 -0
- start.sh +13 -0
- tests/__pycache__/conftest.cpython-311-pytest-8.3.3.pyc +0 -0
- tests/__pycache__/test_auth.cpython-311-pytest-8.3.3.pyc +0 -0
.dockerignore
ADDED
|
@@ -0,0 +1,66 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
# Python
|
| 2 |
+
__pycache__/
|
| 3 |
+
*.py[cod]
|
| 4 |
+
*$py.class
|
| 5 |
+
*.so
|
| 6 |
+
.Python
|
| 7 |
+
build/
|
| 8 |
+
develop-eggs/
|
| 9 |
+
dist/
|
| 10 |
+
downloads/
|
| 11 |
+
eggs/
|
| 12 |
+
.eggs/
|
| 13 |
+
lib/
|
| 14 |
+
lib64/
|
| 15 |
+
parts/
|
| 16 |
+
sdist/
|
| 17 |
+
var/
|
| 18 |
+
wheels/
|
| 19 |
+
*.egg-info/
|
| 20 |
+
.installed.cfg
|
| 21 |
+
*.egg
|
| 22 |
+
|
| 23 |
+
# Virtual environments
|
| 24 |
+
venv/
|
| 25 |
+
env/
|
| 26 |
+
ENV/
|
| 27 |
+
|
| 28 |
+
# IDE
|
| 29 |
+
.vscode/
|
| 30 |
+
.idea/
|
| 31 |
+
*.swp
|
| 32 |
+
*.swo
|
| 33 |
+
|
| 34 |
+
# OS
|
| 35 |
+
.DS_Store
|
| 36 |
+
Thumbs.db
|
| 37 |
+
|
| 38 |
+
# Git
|
| 39 |
+
.git/
|
| 40 |
+
.gitignore
|
| 41 |
+
|
| 42 |
+
# Documentation
|
| 43 |
+
README.md
|
| 44 |
+
*.md
|
| 45 |
+
|
| 46 |
+
# Test files
|
| 47 |
+
tests/
|
| 48 |
+
test_*.py
|
| 49 |
+
*_test.py
|
| 50 |
+
|
| 51 |
+
# Jupyter notebooks
|
| 52 |
+
*.ipynb
|
| 53 |
+
.ipynb_checkpoints/
|
| 54 |
+
|
| 55 |
+
# Environment files (will be set in Hugging Face Spaces)
|
| 56 |
+
.env
|
| 57 |
+
.env.local
|
| 58 |
+
.env.production
|
| 59 |
+
|
| 60 |
+
# Logs
|
| 61 |
+
logs/
|
| 62 |
+
*.log
|
| 63 |
+
|
| 64 |
+
# Temporary files
|
| 65 |
+
tmp/
|
| 66 |
+
temp/
|
.env
ADDED
|
@@ -0,0 +1,3 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
JWT_SECRET_KEY=change_this_secret_in_production
|
| 2 |
+
MONGODB_URI=mongodb+srv://deepdblm_db_user:IqLKnKhwLLSOP1Ka@cluster0.0u1vpow.mongodb.net/?retryWrites=true&w=majority&appName=Cluster0
|
| 3 |
+
ALLOWED_ORIGINS=http://localhost:3000,http://localhost:5173,http://127.0.0.1:3000,http://127.0.0.1:5173,http://localhost:8080,https://marine-pollution-detection.onrender.com,https://marine-pollution-detection-production.up.railway.app,https://marine-pollution-detection.vercel.app
|
Dockerfile
ADDED
|
@@ -0,0 +1,42 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
# Use Python 3.11 slim image for better performance
|
| 2 |
+
FROM python:3.11-slim
|
| 3 |
+
|
| 4 |
+
# Set working directory
|
| 5 |
+
WORKDIR /app
|
| 6 |
+
|
| 7 |
+
# Set environment variables
|
| 8 |
+
ENV PYTHONDONTWRITEBYTECODE=1
|
| 9 |
+
ENV PYTHONUNBUFFERED=1
|
| 10 |
+
ENV PORT=7860
|
| 11 |
+
|
| 12 |
+
# Install system dependencies
|
| 13 |
+
RUN apt-get update && apt-get install -y \
|
| 14 |
+
build-essential \
|
| 15 |
+
curl \
|
| 16 |
+
&& rm -rf /var/lib/apt/lists/*
|
| 17 |
+
|
| 18 |
+
# Copy requirements first for better caching
|
| 19 |
+
COPY requirements-docker.txt requirements.txt
|
| 20 |
+
|
| 21 |
+
# Install Python dependencies
|
| 22 |
+
RUN pip install --no-cache-dir --upgrade pip
|
| 23 |
+
RUN pip install --no-cache-dir -r requirements.txt
|
| 24 |
+
|
| 25 |
+
# Copy the application code
|
| 26 |
+
COPY . .
|
| 27 |
+
|
| 28 |
+
# Make startup script executable
|
| 29 |
+
RUN chmod +x start-hf.sh
|
| 30 |
+
|
| 31 |
+
# Create models directory if it doesn't exist
|
| 32 |
+
RUN mkdir -p models
|
| 33 |
+
|
| 34 |
+
# Expose port 7860 (Hugging Face Spaces default)
|
| 35 |
+
EXPOSE 7860
|
| 36 |
+
|
| 37 |
+
# Health check
|
| 38 |
+
HEALTHCHECK --interval=30s --timeout=10s --start-period=5s --retries=3 \
|
| 39 |
+
CMD curl -f http://localhost:7860/health || exit 1
|
| 40 |
+
|
| 41 |
+
# Command to run the application
|
| 42 |
+
CMD ["./start-hf.sh"]
|
HUGGINGFACE_DEPLOYMENT.md
ADDED
|
@@ -0,0 +1,83 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
# π Hugging Face Spaces Deployment Guide
|
| 2 |
+
|
| 3 |
+
## π Files Created for Docker Deployment:
|
| 4 |
+
- `Dockerfile` - Main Docker configuration
|
| 5 |
+
- `requirements-docker.txt` - Optimized dependencies for Docker
|
| 6 |
+
- `.dockerignore` - Excludes unnecessary files from build
|
| 7 |
+
- `start-hf.sh` - Startup script for Hugging Face Spaces
|
| 8 |
+
- `README.md` - Hugging Face Spaces metadata
|
| 9 |
+
|
| 10 |
+
## π§ Deployment Steps:
|
| 11 |
+
|
| 12 |
+
### 1. **Create New Hugging Face Space**
|
| 13 |
+
1. Go to [Hugging Face Spaces](https://huggingface.co/spaces)
|
| 14 |
+
2. Click "Create new Space"
|
| 15 |
+
3. Choose:
|
| 16 |
+
- **Space name**: `marine-guard-api`
|
| 17 |
+
- **License**: MIT
|
| 18 |
+
- **SDK**: Docker
|
| 19 |
+
- **Hardware**: CPU basic (free tier)
|
| 20 |
+
|
| 21 |
+
### 2. **Upload Backend Files**
|
| 22 |
+
Upload these files to your Hugging Face Space:
|
| 23 |
+
```
|
| 24 |
+
Dockerfile
|
| 25 |
+
requirements-docker.txt
|
| 26 |
+
start-hf.sh
|
| 27 |
+
README.md
|
| 28 |
+
app/
|
| 29 |
+
βββ __init__.py
|
| 30 |
+
βββ main.py
|
| 31 |
+
βββ config.py
|
| 32 |
+
βββ database.py
|
| 33 |
+
βββ dependencies.py
|
| 34 |
+
βββ schemas.py
|
| 35 |
+
βββ core_security.py
|
| 36 |
+
βββ routers/
|
| 37 |
+
β βββ __init__.py
|
| 38 |
+
β βββ auth.py
|
| 39 |
+
β βββ incidents.py
|
| 40 |
+
βββ services/
|
| 41 |
+
βββ __init__.py
|
| 42 |
+
βββ incidents.py
|
| 43 |
+
βββ ml_model.py
|
| 44 |
+
βββ nlp.py
|
| 45 |
+
βββ users.py
|
| 46 |
+
models/
|
| 47 |
+
βββ threat_model.pkl
|
| 48 |
+
βββ severity_model.pkl
|
| 49 |
+
```
|
| 50 |
+
|
| 51 |
+
### 3. **Set Environment Variables in Hugging Face**
|
| 52 |
+
In your Space settings, add these environment variables:
|
| 53 |
+
```
|
| 54 |
+
MONGODB_URI=mongodb+srv://deepdblm_db_user:IqLKnKhwLLSOP1Ka@cluster0.0u1vpow.mongodb.net/?retryWrites=true&w=majority&appName=Cluster0
|
| 55 |
+
JWT_SECRET_KEY=your-secure-secret-key-here
|
| 56 |
+
ALLOWED_ORIGINS=https://marine-pollution-detection.vercel.app,http://localhost:3000
|
| 57 |
+
```
|
| 58 |
+
|
| 59 |
+
### 4. **Update Frontend Configuration**
|
| 60 |
+
Once deployed, update your frontend `.env` file:
|
| 61 |
+
```
|
| 62 |
+
VITE_API_BASE_URL=https://your-username-marine-guard-api.hf.space/api
|
| 63 |
+
```
|
| 64 |
+
|
| 65 |
+
## π― Expected URLs:
|
| 66 |
+
- **API Base**: `https://your-username-marine-guard-api.hf.space`
|
| 67 |
+
- **Health Check**: `https://your-username-marine-guard-api.hf.space/health`
|
| 68 |
+
- **Auth Endpoint**: `https://your-username-marine-guard-api.hf.space/api/auth/login`
|
| 69 |
+
|
| 70 |
+
## π Troubleshooting:
|
| 71 |
+
- Check the **Logs** tab in your Hugging Face Space for any errors
|
| 72 |
+
- Ensure all environment variables are set correctly
|
| 73 |
+
- The space will take 2-3 minutes to build and start
|
| 74 |
+
- Models will be loaded automatically on startup
|
| 75 |
+
|
| 76 |
+
## π Features Included:
|
| 77 |
+
- β
FastAPI backend with all endpoints
|
| 78 |
+
- β
MongoDB Atlas connection
|
| 79 |
+
- β
JWT authentication
|
| 80 |
+
- β
ML model inference (threat & severity classification)
|
| 81 |
+
- β
CORS configured for Vercel frontend
|
| 82 |
+
- β
Health check endpoint
|
| 83 |
+
- β
Automatic model loading on startup
|
Procfile
ADDED
|
@@ -0,0 +1 @@
|
|
|
|
|
|
|
| 1 |
+
web: uvicorn app.main:app --host 0.0.0.0 --port $PORT
|
Procfile.railway
ADDED
|
@@ -0,0 +1,2 @@
|
|
|
|
|
|
|
|
|
|
| 1 |
+
# Procfile for Railway - lightweight deployment without ML
|
| 2 |
+
web: pip install -r requirements-railway-light.txt && uvicorn app.main:app --host 0.0.0.0 --port $PORT
|
README.md
CHANGED
|
@@ -1,10 +1,8 @@
|
|
| 1 |
-
|
| 2 |
-
|
| 3 |
-
|
| 4 |
-
colorFrom: green
|
| 5 |
colorTo: green
|
| 6 |
sdk: docker
|
| 7 |
pinned: false
|
| 8 |
-
|
| 9 |
-
|
| 10 |
-
Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
|
|
|
|
| 1 |
+
title: Marine Guard API
|
| 2 |
+
emoji: π
|
| 3 |
+
colorFrom: blue
|
|
|
|
| 4 |
colorTo: green
|
| 5 |
sdk: docker
|
| 6 |
pinned: false
|
| 7 |
+
license: mit
|
| 8 |
+
app_port: 7860
|
|
|
app/__init__.py
ADDED
|
File without changes
|
app/__pycache__/__init__.cpython-311.pyc
ADDED
|
Binary file (142 Bytes). View file
|
|
|
app/__pycache__/__init__.cpython-39.pyc
ADDED
|
Binary file (124 Bytes). View file
|
|
|
app/__pycache__/config.cpython-311.pyc
ADDED
|
Binary file (1.5 kB). View file
|
|
|
app/__pycache__/config.cpython-39.pyc
ADDED
|
Binary file (1.07 kB). View file
|
|
|
app/__pycache__/core_security.cpython-311.pyc
ADDED
|
Binary file (1.84 kB). View file
|
|
|
app/__pycache__/database.cpython-311.pyc
ADDED
|
Binary file (2.49 kB). View file
|
|
|
app/__pycache__/dependencies.cpython-311.pyc
ADDED
|
Binary file (1.82 kB). View file
|
|
|
app/__pycache__/main.cpython-311.pyc
ADDED
|
Binary file (4.3 kB). View file
|
|
|
app/__pycache__/main.cpython-39.pyc
ADDED
|
Binary file (1.28 kB). View file
|
|
|
app/__pycache__/schemas.cpython-311.pyc
ADDED
|
Binary file (3.25 kB). View file
|
|
|
app/config.py
ADDED
|
@@ -0,0 +1,27 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
from functools import lru_cache
|
| 2 |
+
from typing import List, Union
|
| 3 |
+
import os
|
| 4 |
+
|
| 5 |
+
from pydantic_settings import BaseSettings
|
| 6 |
+
|
| 7 |
+
|
| 8 |
+
class Settings(BaseSettings):
|
| 9 |
+
app_name: str = "Marine Guard Backend"
|
| 10 |
+
mongodb_uri: str
|
| 11 |
+
database_name: str = "marine_guard"
|
| 12 |
+
jwt_secret_key: str
|
| 13 |
+
jwt_algorithm: str = "HS256"
|
| 14 |
+
access_token_expire_minutes: int = 60 * 24
|
| 15 |
+
allowed_origins: Union[str, List[str]] = "http://localhost:5173"
|
| 16 |
+
|
| 17 |
+
# Hugging Face Spaces specific settings
|
| 18 |
+
space_id: str = os.getenv("SPACE_ID", "")
|
| 19 |
+
|
| 20 |
+
class Config:
|
| 21 |
+
env_file = ".env"
|
| 22 |
+
extra = "allow" # Allow extra fields for Hugging Face environment variables
|
| 23 |
+
|
| 24 |
+
|
| 25 |
+
@lru_cache
|
| 26 |
+
def get_settings() -> Settings:
|
| 27 |
+
return Settings()
|
app/core_security.py
ADDED
|
@@ -0,0 +1,27 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
from datetime import datetime, timedelta, timezone
|
| 2 |
+
from typing import Optional
|
| 3 |
+
|
| 4 |
+
from jose import jwt
|
| 5 |
+
from passlib.context import CryptContext
|
| 6 |
+
|
| 7 |
+
from .config import get_settings
|
| 8 |
+
|
| 9 |
+
pwd_context = CryptContext(schemes=["bcrypt"], deprecated="auto")
|
| 10 |
+
|
| 11 |
+
|
| 12 |
+
def verify_password(plain_password: str, hashed_password: str) -> bool:
|
| 13 |
+
return pwd_context.verify(plain_password, hashed_password)
|
| 14 |
+
|
| 15 |
+
|
| 16 |
+
def get_password_hash(password: str) -> str:
|
| 17 |
+
return pwd_context.hash(password)
|
| 18 |
+
|
| 19 |
+
|
| 20 |
+
def create_access_token(subject: str, expires_delta: Optional[timedelta] = None) -> str:
|
| 21 |
+
settings = get_settings()
|
| 22 |
+
if expires_delta is None:
|
| 23 |
+
expires_delta = timedelta(minutes=settings.access_token_expire_minutes)
|
| 24 |
+
|
| 25 |
+
expire = datetime.now(timezone.utc) + expires_delta
|
| 26 |
+
to_encode = {"sub": subject, "exp": expire}
|
| 27 |
+
return jwt.encode(to_encode, settings.jwt_secret_key, algorithm=settings.jwt_algorithm)
|
app/database.py
ADDED
|
@@ -0,0 +1,55 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
from motor.motor_asyncio import AsyncIOMotorClient, AsyncIOMotorDatabase
|
| 2 |
+
from typing import Optional
|
| 3 |
+
import logging
|
| 4 |
+
|
| 5 |
+
from .config import get_settings
|
| 6 |
+
|
| 7 |
+
logger = logging.getLogger(__name__)
|
| 8 |
+
|
| 9 |
+
_client: Optional[AsyncIOMotorClient] = None
|
| 10 |
+
_db: Optional[AsyncIOMotorDatabase] = None
|
| 11 |
+
_connection_healthy = False
|
| 12 |
+
|
| 13 |
+
|
| 14 |
+
async def test_connection() -> bool:
|
| 15 |
+
"""Test if the database connection is healthy"""
|
| 16 |
+
global _connection_healthy
|
| 17 |
+
try:
|
| 18 |
+
client = get_client()
|
| 19 |
+
# Try to ping the database
|
| 20 |
+
await client.admin.command('ping')
|
| 21 |
+
_connection_healthy = True
|
| 22 |
+
logger.info("Database connection established successfully")
|
| 23 |
+
return True
|
| 24 |
+
except Exception as e:
|
| 25 |
+
_connection_healthy = False
|
| 26 |
+
logger.warning(f"Database connection failed: {e}")
|
| 27 |
+
return False
|
| 28 |
+
|
| 29 |
+
|
| 30 |
+
def get_client() -> AsyncIOMotorClient:
|
| 31 |
+
global _client
|
| 32 |
+
if _client is None:
|
| 33 |
+
settings = get_settings()
|
| 34 |
+
# Use the MongoDB URI exactly as provided, without additional parameters
|
| 35 |
+
_client = AsyncIOMotorClient(settings.mongodb_uri)
|
| 36 |
+
return _client
|
| 37 |
+
|
| 38 |
+
|
| 39 |
+
def get_database() -> AsyncIOMotorDatabase:
|
| 40 |
+
global _db
|
| 41 |
+
if _db is None:
|
| 42 |
+
client = get_client()
|
| 43 |
+
settings = get_settings()
|
| 44 |
+
_db = client[settings.database_name]
|
| 45 |
+
return _db
|
| 46 |
+
|
| 47 |
+
|
| 48 |
+
def get_collection(name: str):
|
| 49 |
+
db = get_database()
|
| 50 |
+
return db[name]
|
| 51 |
+
|
| 52 |
+
|
| 53 |
+
def is_database_available() -> bool:
|
| 54 |
+
"""Check if database is available for operations"""
|
| 55 |
+
return _connection_healthy
|
app/dependencies.py
ADDED
|
@@ -0,0 +1,28 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
from fastapi import Depends, HTTPException, status
|
| 2 |
+
from fastapi.security import HTTPAuthorizationCredentials, HTTPBearer
|
| 3 |
+
from jose import JWTError, jwt
|
| 4 |
+
|
| 5 |
+
from .config import get_settings
|
| 6 |
+
from .services.users import get_user_by_id, serialize_user
|
| 7 |
+
|
| 8 |
+
security = HTTPBearer()
|
| 9 |
+
|
| 10 |
+
|
| 11 |
+
async def get_current_user(credentials: HTTPAuthorizationCredentials = Depends(security)):
|
| 12 |
+
token = credentials.credentials
|
| 13 |
+
settings = get_settings()
|
| 14 |
+
|
| 15 |
+
try:
|
| 16 |
+
payload = jwt.decode(token, settings.jwt_secret_key, algorithms=[settings.jwt_algorithm])
|
| 17 |
+
except JWTError as exc: # pragma: no cover - error path
|
| 18 |
+
raise HTTPException(status_code=status.HTTP_401_UNAUTHORIZED, detail="Could not validate credentials") from exc
|
| 19 |
+
|
| 20 |
+
user_id: str = payload.get("sub")
|
| 21 |
+
if user_id is None:
|
| 22 |
+
raise HTTPException(status_code=status.HTTP_401_UNAUTHORIZED, detail="Invalid token payload")
|
| 23 |
+
|
| 24 |
+
user_doc = await get_user_by_id(user_id)
|
| 25 |
+
if user_doc is None:
|
| 26 |
+
raise HTTPException(status_code=status.HTTP_401_UNAUTHORIZED, detail="User not found")
|
| 27 |
+
|
| 28 |
+
return serialize_user(user_doc)
|
app/main.py
ADDED
|
@@ -0,0 +1,89 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
import logging
|
| 2 |
+
from contextlib import asynccontextmanager
|
| 3 |
+
|
| 4 |
+
from fastapi import FastAPI
|
| 5 |
+
from fastapi.middleware.cors import CORSMiddleware
|
| 6 |
+
|
| 7 |
+
from .config import get_settings
|
| 8 |
+
from .database import get_collection, test_connection
|
| 9 |
+
from .routers import auth, incidents
|
| 10 |
+
|
| 11 |
+
logger = logging.getLogger(__name__)
|
| 12 |
+
|
| 13 |
+
|
| 14 |
+
async def setup_database_indexes():
|
| 15 |
+
"""Set up database indexes with error handling."""
|
| 16 |
+
try:
|
| 17 |
+
# Test connection first
|
| 18 |
+
connection_ok = await test_connection()
|
| 19 |
+
if not connection_ok:
|
| 20 |
+
logger.warning("Database connection failed - skipping index creation")
|
| 21 |
+
return
|
| 22 |
+
|
| 23 |
+
# Create indexes
|
| 24 |
+
users = get_collection("users")
|
| 25 |
+
incidents_collection = get_collection("incidents")
|
| 26 |
+
|
| 27 |
+
await users.create_index("email", unique=True)
|
| 28 |
+
await incidents_collection.create_index("created_at")
|
| 29 |
+
logger.info("Database indexes created successfully")
|
| 30 |
+
except Exception as e:
|
| 31 |
+
logger.warning(f"Database setup failed: {e}")
|
| 32 |
+
logger.warning("Application will continue without database indexes")
|
| 33 |
+
|
| 34 |
+
|
| 35 |
+
@asynccontextmanager
|
| 36 |
+
async def lifespan(app: FastAPI):
|
| 37 |
+
# Startup
|
| 38 |
+
await setup_database_indexes()
|
| 39 |
+
yield
|
| 40 |
+
# Shutdown (if needed in future)
|
| 41 |
+
|
| 42 |
+
|
| 43 |
+
settings = get_settings()
|
| 44 |
+
|
| 45 |
+
app = FastAPI(title=settings.app_name, lifespan=lifespan)
|
| 46 |
+
|
| 47 |
+
allowed_origins = settings.allowed_origins
|
| 48 |
+
if isinstance(allowed_origins, str):
|
| 49 |
+
allowed_origins = [origin.strip() for origin in allowed_origins.split(",") if origin.strip()]
|
| 50 |
+
|
| 51 |
+
# Ensure we have the allowed origins for development and production
|
| 52 |
+
if not allowed_origins:
|
| 53 |
+
allowed_origins = ["*"] # Fallback to allow all if not configured
|
| 54 |
+
|
| 55 |
+
app.add_middleware(
|
| 56 |
+
CORSMiddleware,
|
| 57 |
+
allow_origins=allowed_origins,
|
| 58 |
+
allow_credentials=True,
|
| 59 |
+
allow_methods=["GET", "POST", "PUT", "DELETE", "OPTIONS"],
|
| 60 |
+
allow_headers=["*"],
|
| 61 |
+
expose_headers=["*"],
|
| 62 |
+
)
|
| 63 |
+
|
| 64 |
+
# Debug: Log the allowed origins in startup
|
| 65 |
+
logger.info(f"CORS allowed origins: {allowed_origins}")
|
| 66 |
+
|
| 67 |
+
|
| 68 |
+
@app.get("/health")
|
| 69 |
+
async def health_check():
|
| 70 |
+
connection_ok = await test_connection()
|
| 71 |
+
if connection_ok:
|
| 72 |
+
return {"status": "ok", "database": "connected"}
|
| 73 |
+
else:
|
| 74 |
+
return {"status": "degraded", "database": "disconnected"}
|
| 75 |
+
|
| 76 |
+
|
| 77 |
+
@app.get("/")
|
| 78 |
+
async def root():
|
| 79 |
+
return {"message": "Marine Guard API", "status": "running"}
|
| 80 |
+
|
| 81 |
+
|
| 82 |
+
@app.options("/{path:path}")
|
| 83 |
+
async def options_handler(path: str):
|
| 84 |
+
"""Handle CORS preflight requests."""
|
| 85 |
+
return {"message": "OK"}
|
| 86 |
+
|
| 87 |
+
|
| 88 |
+
app.include_router(auth.router, prefix="/api")
|
| 89 |
+
app.include_router(incidents.router, prefix="/api")
|
app/routers/__init__.py
ADDED
|
File without changes
|
app/routers/__pycache__/__init__.cpython-311.pyc
ADDED
|
Binary file (150 Bytes). View file
|
|
|
app/routers/__pycache__/auth.cpython-311.pyc
ADDED
|
Binary file (4.01 kB). View file
|
|
|
app/routers/__pycache__/incidents.cpython-311.pyc
ADDED
|
Binary file (4.73 kB). View file
|
|
|
app/routers/auth.py
ADDED
|
@@ -0,0 +1,69 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
from fastapi import APIRouter, Depends, HTTPException, status
|
| 2 |
+
|
| 3 |
+
from ..core_security import create_access_token, get_password_hash, verify_password
|
| 4 |
+
from ..schemas import TokenResponse, UserCreate, UserInDB, UserLogin
|
| 5 |
+
from ..services.users import create_user, get_user_by_email, serialize_user
|
| 6 |
+
from ..dependencies import get_current_user
|
| 7 |
+
|
| 8 |
+
router = APIRouter(prefix="/auth", tags=["auth"])
|
| 9 |
+
|
| 10 |
+
|
| 11 |
+
@router.post("/signup", response_model=TokenResponse)
|
| 12 |
+
async def signup(payload: UserCreate):
|
| 13 |
+
try:
|
| 14 |
+
existing_user = await get_user_by_email(payload.email)
|
| 15 |
+
if existing_user:
|
| 16 |
+
raise HTTPException(status_code=status.HTTP_400_BAD_REQUEST, detail="Email already registered")
|
| 17 |
+
|
| 18 |
+
user_doc = await create_user(
|
| 19 |
+
{
|
| 20 |
+
"email": payload.email,
|
| 21 |
+
"display_name": payload.display_name,
|
| 22 |
+
"organization": payload.organization,
|
| 23 |
+
"role": payload.role,
|
| 24 |
+
"password_hash": get_password_hash(payload.password),
|
| 25 |
+
}
|
| 26 |
+
)
|
| 27 |
+
|
| 28 |
+
user_data = serialize_user(user_doc)
|
| 29 |
+
token = create_access_token(user_data["id"])
|
| 30 |
+
|
| 31 |
+
return TokenResponse(access_token=token, user=UserInDB.model_validate(user_data))
|
| 32 |
+
except HTTPException:
|
| 33 |
+
raise
|
| 34 |
+
except Exception as e:
|
| 35 |
+
raise HTTPException(
|
| 36 |
+
status_code=status.HTTP_503_SERVICE_UNAVAILABLE,
|
| 37 |
+
detail="Database service temporarily unavailable"
|
| 38 |
+
)
|
| 39 |
+
|
| 40 |
+
|
| 41 |
+
@router.post("/login", response_model=TokenResponse)
|
| 42 |
+
async def login(payload: UserLogin):
|
| 43 |
+
try:
|
| 44 |
+
user_doc = await get_user_by_email(payload.email)
|
| 45 |
+
if not user_doc or not verify_password(payload.password, user_doc.get("password_hash", "")):
|
| 46 |
+
raise HTTPException(status_code=status.HTTP_401_UNAUTHORIZED, detail="Incorrect email or password")
|
| 47 |
+
|
| 48 |
+
user_data = serialize_user(user_doc)
|
| 49 |
+
token = create_access_token(user_data["id"])
|
| 50 |
+
|
| 51 |
+
return TokenResponse(access_token=token, user=UserInDB.model_validate(user_data))
|
| 52 |
+
except HTTPException:
|
| 53 |
+
raise
|
| 54 |
+
except Exception as e:
|
| 55 |
+
raise HTTPException(
|
| 56 |
+
status_code=status.HTTP_503_SERVICE_UNAVAILABLE,
|
| 57 |
+
detail="Database service temporarily unavailable"
|
| 58 |
+
)
|
| 59 |
+
|
| 60 |
+
|
| 61 |
+
@router.get("/me", response_model=UserInDB)
|
| 62 |
+
async def get_me(current_user: dict = Depends(get_current_user)):
|
| 63 |
+
try:
|
| 64 |
+
return UserInDB.model_validate(current_user)
|
| 65 |
+
except Exception as e:
|
| 66 |
+
raise HTTPException(
|
| 67 |
+
status_code=status.HTTP_503_SERVICE_UNAVAILABLE,
|
| 68 |
+
detail="Database service temporarily unavailable"
|
| 69 |
+
)
|
app/routers/incidents.py
ADDED
|
@@ -0,0 +1,113 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
from datetime import datetime
|
| 2 |
+
|
| 3 |
+
from fastapi import APIRouter, Depends, File, Form, HTTPException, status, UploadFile
|
| 4 |
+
|
| 5 |
+
from ..dependencies import get_current_user
|
| 6 |
+
from ..schemas import IncidentResponse
|
| 7 |
+
from ..services.incidents import save_incident_document, store_image
|
| 8 |
+
from ..services.nlp import classify_incident, get_model_info
|
| 9 |
+
from ..database import is_database_available
|
| 10 |
+
|
| 11 |
+
router = APIRouter(prefix="/incidents", tags=["incidents"])
|
| 12 |
+
|
| 13 |
+
|
| 14 |
+
@router.post("/classify", response_model=IncidentResponse)
|
| 15 |
+
async def classify_incident_report(
|
| 16 |
+
description: str = Form(...),
|
| 17 |
+
latitude: float = Form(...),
|
| 18 |
+
longitude: float = Form(...),
|
| 19 |
+
name: str = Form(""),
|
| 20 |
+
image: UploadFile | None = File(None),
|
| 21 |
+
current_user=Depends(get_current_user),
|
| 22 |
+
):
|
| 23 |
+
if not is_database_available():
|
| 24 |
+
raise HTTPException(
|
| 25 |
+
status_code=status.HTTP_503_SERVICE_UNAVAILABLE,
|
| 26 |
+
detail="Database service is currently unavailable. Please try again later."
|
| 27 |
+
)
|
| 28 |
+
|
| 29 |
+
try:
|
| 30 |
+
# Use the ML model for classification
|
| 31 |
+
classification_result = classify_incident(description, name)
|
| 32 |
+
|
| 33 |
+
if isinstance(classification_result, dict):
|
| 34 |
+
# ML model returned detailed results with confidence
|
| 35 |
+
incident_class = classification_result['threat']
|
| 36 |
+
severity = classification_result['severity']
|
| 37 |
+
confidence_scores = {
|
| 38 |
+
'threat_confidence': classification_result.get('threat_confidence'),
|
| 39 |
+
'severity_confidence': classification_result.get('severity_confidence')
|
| 40 |
+
}
|
| 41 |
+
else:
|
| 42 |
+
# Fallback classification returned simple tuple
|
| 43 |
+
incident_class, severity = classification_result
|
| 44 |
+
confidence_scores = None
|
| 45 |
+
|
| 46 |
+
image_path = await store_image(image)
|
| 47 |
+
|
| 48 |
+
document = {
|
| 49 |
+
"name": name,
|
| 50 |
+
"description": description,
|
| 51 |
+
"latitude": latitude,
|
| 52 |
+
"longitude": longitude,
|
| 53 |
+
"incident_class": incident_class,
|
| 54 |
+
"severity": severity,
|
| 55 |
+
"reporter_id": current_user["id"],
|
| 56 |
+
"image_path": image_path,
|
| 57 |
+
"created_at": datetime.utcnow(),
|
| 58 |
+
}
|
| 59 |
+
|
| 60 |
+
saved = await save_incident_document(document)
|
| 61 |
+
|
| 62 |
+
return IncidentResponse(
|
| 63 |
+
incident_class=incident_class,
|
| 64 |
+
severity=severity,
|
| 65 |
+
incident_id=str(saved["_id"]),
|
| 66 |
+
confidence_scores=confidence_scores,
|
| 67 |
+
)
|
| 68 |
+
except HTTPException:
|
| 69 |
+
raise
|
| 70 |
+
except Exception as e:
|
| 71 |
+
raise HTTPException(
|
| 72 |
+
status_code=status.HTTP_500_INTERNAL_SERVER_ERROR,
|
| 73 |
+
detail="Failed to process incident report"
|
| 74 |
+
)
|
| 75 |
+
|
| 76 |
+
|
| 77 |
+
@router.get("/model-info")
|
| 78 |
+
async def get_classification_model_info():
|
| 79 |
+
"""Get information about the current classification model"""
|
| 80 |
+
return get_model_info()
|
| 81 |
+
|
| 82 |
+
|
| 83 |
+
@router.post("/test-classify")
|
| 84 |
+
async def test_classification(
|
| 85 |
+
description: str = Form(...),
|
| 86 |
+
name: str = Form(""),
|
| 87 |
+
):
|
| 88 |
+
"""Test endpoint for classification without saving to database"""
|
| 89 |
+
try:
|
| 90 |
+
classification_result = classify_incident(description, name)
|
| 91 |
+
|
| 92 |
+
if isinstance(classification_result, dict):
|
| 93 |
+
return {
|
| 94 |
+
"threat": classification_result['threat'],
|
| 95 |
+
"severity": classification_result['severity'],
|
| 96 |
+
"confidence_scores": {
|
| 97 |
+
'threat_confidence': classification_result.get('threat_confidence'),
|
| 98 |
+
'severity_confidence': classification_result.get('severity_confidence')
|
| 99 |
+
},
|
| 100 |
+
"model_type": "machine_learning"
|
| 101 |
+
}
|
| 102 |
+
else:
|
| 103 |
+
threat, severity = classification_result
|
| 104 |
+
return {
|
| 105 |
+
"threat": threat,
|
| 106 |
+
"severity": severity,
|
| 107 |
+
"model_type": "rule_based_fallback"
|
| 108 |
+
}
|
| 109 |
+
except Exception as e:
|
| 110 |
+
raise HTTPException(
|
| 111 |
+
status_code=status.HTTP_500_INTERNAL_SERVER_ERROR,
|
| 112 |
+
detail=f"Classification failed: {str(e)}"
|
| 113 |
+
)
|
app/schemas.py
ADDED
|
@@ -0,0 +1,56 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
from typing import Optional
from datetime import datetime
from pydantic import BaseModel, EmailStr, Field


class UserCreate(BaseModel):
    """Request payload for user registration."""

    email: EmailStr
    # Minimum length is the only password policy enforced at the schema level.
    password: str = Field(min_length=6)
    display_name: str
    organization: Optional[str] = None
    # Defaults to the least-privileged role.
    role: Optional[str] = Field(default="citizen")


class UserLogin(BaseModel):
    """Request payload for user login."""

    email: EmailStr
    password: str


class UserInDB(BaseModel):
    """User record as exposed by the API (no password hash)."""

    # Stringified Mongo ObjectId (see users.serialize_user).
    id: str
    email: EmailStr
    display_name: str
    organization: Optional[str]
    role: Optional[str]
    created_at: datetime


class TokenResponse(BaseModel):
    """Response body for a successful authentication."""

    access_token: str
    token_type: str = "bearer"
    user: UserInDB


class IncidentCreate(BaseModel):
    """Request payload for reporting a new incident."""

    description: str
    latitude: float
    longitude: float


class IncidentInDB(BaseModel):
    """Incident record as stored/returned by the API."""

    # Stringified Mongo ObjectId.
    id: str
    description: str
    latitude: float
    longitude: float
    # Classifier output, e.g. "Oil" / "Chemical" / "Other".
    incident_class: str
    # "low" | "medium" | "high"
    severity: str
    created_at: datetime
    reporter_id: Optional[str]
    # Relative path under the uploads directory, when an image was attached.
    image_path: Optional[str]


class IncidentResponse(BaseModel):
    """Classification result returned right after an incident is created."""

    incident_class: str
    severity: str
    incident_id: str
    # Present only when the ML model (not the rule-based fallback) classified.
    confidence_scores: Optional[dict] = None
|
app/services/__init__.py
ADDED
|
File without changes
|
app/services/fallback_storage.py
ADDED
|
File without changes
|
app/services/incidents.py
ADDED
|
@@ -0,0 +1,32 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
from pathlib import Path
from typing import Optional
from uuid import uuid4

from ..database import get_collection

# Mongo collection name for incident reports.
INCIDENTS_COLLECTION = "incidents"
# Image files are stored under the package-level "uploads" directory.
UPLOAD_DIR = Path(__file__).resolve().parent.parent / "uploads"
UPLOAD_DIR.mkdir(exist_ok=True)


async def save_incident_document(document: dict) -> dict:
    """Insert an incident document and return it with its new Mongo ``_id``.

    Note: mutates the caller's dict in place by attaching ``_id``.
    """
    collection = get_collection(INCIDENTS_COLLECTION)
    result = await collection.insert_one(document)
    document["_id"] = result.inserted_id
    return document


async def store_image(upload_file) -> Optional[str]:
    """Persist an uploaded image to disk and return its relative path.

    Returns None when no file was provided. Assumes ``upload_file`` is a
    Starlette/FastAPI UploadFile (has .filename, .read(), .close()) —
    confirm against callers.
    """
    if upload_file is None:
        return None

    # Re-create defensively in case the directory was removed at runtime.
    UPLOAD_DIR.mkdir(parents=True, exist_ok=True)
    # Random hex name avoids collisions; only the extension comes from the client.
    file_extension = Path(upload_file.filename).suffix
    filename = f"{uuid4().hex}{file_extension}"
    file_path = UPLOAD_DIR / filename

    contents = await upload_file.read()
    file_path.write_bytes(contents)

    await upload_file.close()
    # Stored/served as a path relative to the app root ("uploads/<name>").
    return str(Path("uploads") / filename)
|
app/services/ml_model.py
ADDED
|
@@ -0,0 +1,158 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
# Inference-only ML model service
# Models are pre-trained and saved as .pkl files

import numpy as np
import re
from pathlib import Path
import joblib
import logging

logger = logging.getLogger(__name__)

# Minimal sklearn imports for model loading.
# These names look unused, but the import doubles as an availability probe;
# presumably they also guarantee the pickled pipelines' classes are importable
# at joblib.load time — confirm.
try:
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.ensemble import RandomForestClassifier
    SKLEARN_AVAILABLE = True
    logger.info("sklearn imported successfully")
except ImportError as e:
    SKLEARN_AVAILABLE = False
    logger.warning(f"sklearn not available: {e}. Using rule-based classification.")

# Get the directory where this file is located; models/ sits at the repo root.
BASE_DIR = Path(__file__).resolve().parent.parent.parent
MODEL_DIR = BASE_DIR / "models"
MODEL_DIR.mkdir(exist_ok=True)

class IncidentClassifier:
    """Inference-only classifier for incident threat type and severity.

    Loads pre-trained pipelines from ``models/*.pkl`` at construction time
    and silently degrades to keyword rules when the models (or sklearn
    itself) are unavailable.
    """

    def __init__(self):
        # Pickled model objects (text in, label out); None until loaded.
        self.threat_model = None
        self.severity_model = None
        # True only when both models were loaded successfully.
        self.is_trained = False

        # Try to load pre-trained models automatically
        try:
            if self.load_models():
                logger.info("Pre-trained models loaded successfully")
            else:
                logger.warning("No pre-trained models found. Classification will use fallback rules.")
        except Exception as e:
            logger.warning(f"Failed to load models on initialization: {e}")

    def preprocess_text(self, text):
        """Clean and preprocess text data"""
        if text is None or text == "":
            return ""

        # Convert to lowercase
        text = str(text).lower()

        # Remove special characters but keep spaces
        text = re.sub(r'[^a-zA-Z0-9\s]', ' ', text)

        # Remove extra whitespaces
        text = re.sub(r'\s+', ' ', text).strip()

        return text

    def load_models(self):
        """Load trained models from disk.

        Returns True only when BOTH pickled models were found and loaded;
        a partial pair is treated as missing.
        """
        if not SKLEARN_AVAILABLE:
            logger.warning("sklearn not available, cannot load models")
            return False

        try:
            threat_model_path = MODEL_DIR / "threat_model.pkl"
            severity_model_path = MODEL_DIR / "severity_model.pkl"

            if threat_model_path.exists() and severity_model_path.exists():
                self.threat_model = joblib.load(threat_model_path)
                self.severity_model = joblib.load(severity_model_path)
                self.is_trained = True
                logger.info("Models loaded successfully")
                return True
            else:
                logger.warning("Model files not found")
                return False
        except Exception as e:
            logger.error(f"Error loading models: {e}")
            return False

    def predict(self, description, name=""):
        """Predict threat type and severity for an incident.

        Returns a dict with 'threat', 'severity' and confidence scores;
        falls back to rule-based output on any failure or empty input.
        """
        if not self.is_trained:
            # Fallback to rule-based classification
            return self._rule_based_classification(description, name)

        try:
            # Combine name and description
            combined_text = f"{name} {description}".strip()
            preprocessed_text = self.preprocess_text(combined_text)

            if not preprocessed_text:
                return self._rule_based_classification(description, name)

            # Make predictions using loaded models
            threat_pred = self.threat_model.predict([preprocessed_text])[0]
            severity_pred = self.severity_model.predict([preprocessed_text])[0]

            # Get prediction probabilities for confidence scores
            threat_proba = self.threat_model.predict_proba([preprocessed_text])[0]
            severity_proba = self.severity_model.predict_proba([preprocessed_text])[0]

            # Get confidence scores (max probability)
            threat_confidence = float(np.max(threat_proba))
            severity_confidence = float(np.max(severity_proba))

            return {
                'threat': threat_pred,
                'severity': severity_pred,
                'threat_confidence': threat_confidence,
                'severity_confidence': severity_confidence
            }
        except Exception as e:
            logger.error(f"Prediction error: {e}")
            return self._rule_based_classification(description, name)

    def _rule_based_classification(self, description, name=""):
        """Rule-based classification when ML models are not available"""
        combined_text = f"{name} {description}".lower()

        # Threat classification — note: oil keywords are checked before
        # chemical ones here (the opposite of nlp._fallback_classify).
        if any(keyword in combined_text for keyword in ['oil', 'petroleum', 'crude', 'spill', 'tanker']):
            threat = 'Oil'
        elif any(keyword in combined_text for keyword in ['chemical', 'toxic', 'hazardous', 'acid', 'industrial']):
            threat = 'Chemical'
        else:
            threat = 'Other'

        # Severity classification
        high_indicators = ['major', 'massive', 'large', 'explosion', 'fire', 'emergency', 'critical', 'severe']
        medium_indicators = ['moderate', 'contained', 'limited', 'minor']

        if any(indicator in combined_text for indicator in high_indicators):
            severity = 'high'
        elif any(indicator in combined_text for indicator in medium_indicators):
            severity = 'medium'
        else:
            severity = 'low'

        # Return with confidence scores for consistency
        return {
            'threat': threat,
            'severity': severity,
            'threat_confidence': 0.8,  # Mock confidence for rule-based
            'severity_confidence': 0.7
        }
|
| 147 |
+
|
| 148 |
+
# Module-level singleton so every caller shares one loaded model instance.
incident_classifier = IncidentClassifier()


def get_classifier():
    """Return the shared IncidentClassifier singleton."""
    return incident_classifier


def predict_incident(description, name=""):
    """Classify one incident, delegating to the shared classifier."""
    return get_classifier().predict(description, name)
|
app/services/ml_model_training.py
ADDED
|
@@ -0,0 +1,277 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
# Core dependencies (always available)
import numpy as np
import pickle
import re
from pathlib import Path
import joblib
import logging

# Training dependencies (only imported when needed)
try:
    import pandas as pd
    from sklearn.model_selection import train_test_split
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.preprocessing import LabelEncoder
    from sklearn.metrics import classification_report, accuracy_score
    from sklearn.pipeline import Pipeline
    TRAINING_DEPENDENCIES_AVAILABLE = True
except ImportError:
    TRAINING_DEPENDENCIES_AVAILABLE = False
    # These will be None if training dependencies are not available
    pd = None
    train_test_split = None
    TfidfVectorizer = None
    RandomForestClassifier = None
    LabelEncoder = None
    classification_report = None
    accuracy_score = None
    Pipeline = None

logger = logging.getLogger(__name__)

# Get the directory where this file is located; models/ sits at the repo root.
BASE_DIR = Path(__file__).resolve().parent.parent.parent
MODEL_DIR = BASE_DIR / "models"
MODEL_DIR.mkdir(exist_ok=True)

class IncidentClassifier:
    """Trainable classifier for incident threat type and severity.

    Trains two TF-IDF + RandomForest pipelines from incidents.csv, saves
    them to models/*.pkl, and serves predictions from the loaded models.
    """

    def __init__(self):
        # sklearn Pipelines (text -> label); None until trained or loaded.
        self.threat_model = None
        self.severity_model = None
        # Declared but never assigned elsewhere in this class — presumably
        # leftovers from an earlier LabelEncoder-based design; confirm.
        self.threat_encoder = None
        self.severity_encoder = None
        self.is_trained = False

        # Try to load pre-trained models automatically
        try:
            if self.load_models():
                logger.info("Pre-trained models loaded successfully")
            else:
                logger.warning("No pre-trained models found. Classification will use fallback rules.")
        except Exception as e:
            logger.warning(f"Failed to load models on initialization: {e}")

    def preprocess_text(self, text):
        """Clean and preprocess text data"""
        # pd may be None when training deps are absent, hence the guard.
        if text is None or (pd and pd.isna(text)):
            return ""

        # Convert to lowercase
        text = str(text).lower()

        # Remove special characters but keep spaces
        text = re.sub(r'[^a-zA-Z0-9\s]', ' ', text)

        # Remove extra whitespaces
        text = re.sub(r'\s+', ' ', text).strip()

        return text

    def create_severity_labels(self, df):
        """Create severity labels based on description content and threat type"""
        severity_labels = []

        for _, row in df.iterrows():
            description = str(row['description']).lower()
            threat = row['threat']

            # High severity indicators
            high_indicators = [
                'major', 'massive', 'large scale', 'explosion', 'fire', 'fatality',
                'death', 'significant', 'extensive', 'severe', 'critical',
                'emergency', 'disaster', 'toxic', 'hazardous', 'dangerous',
                'thousands', 'gallons', 'barrels', 'tons'
            ]

            # Medium severity indicators
            medium_indicators = [
                'moderate', 'contained', 'limited', 'minor leak', 'small spill',
                'hundreds', 'investigation', 'response', 'cleanup'
            ]

            # Low severity indicators
            low_indicators = [
                'minor', 'small', 'trace', 'minimal', 'observation', 'potential',
                'suspected', 'no injuries', 'no damage', 'monitoring'
            ]

            # Count indicators
            high_count = sum(1 for indicator in high_indicators if indicator in description)
            medium_count = sum(1 for indicator in medium_indicators if indicator in description)
            low_count = sum(1 for indicator in low_indicators if indicator in description)

            # Classify based on threat type and indicators; order of these
            # branches is significant (Chemical is always high, for example).
            if threat == 'Chemical' or high_count >= 2:
                severity = 'high'
            elif threat == 'Oil' and (high_count >= 1 or medium_count >= 2):
                severity = 'medium'
            elif low_count >= 2 or 'minor' in description:
                severity = 'low'
            elif high_count >= 1:
                severity = 'high'
            elif medium_count >= 1:
                severity = 'medium'
            else:
                severity = 'low'

            severity_labels.append(severity)

        return severity_labels

    def train_models(self, csv_path=None):
        """Train both threat classification and severity assessment models.

        Raises ImportError when pandas/scikit-learn are not installed.
        Returns a dict of accuracies and label distributions.
        """
        if not TRAINING_DEPENDENCIES_AVAILABLE:
            logger.error("Training dependencies (pandas, scikit-learn) not available. Install with: pip install -r requirements-training.txt")
            raise ImportError("Training dependencies not available. This method requires pandas and scikit-learn.")

        try:
            if csv_path is None:
                csv_path = BASE_DIR / "incidents.csv"

            logger.info(f"Loading dataset from {csv_path}")
            df = pd.read_csv(csv_path)

            # Clean the data
            df = df.dropna(subset=['description', 'threat'])

            # Combine name and description for features
            df['combined_text'] = df['name'].fillna('') + ' ' + df['description'].fillna('')
            df['combined_text'] = df['combined_text'].apply(self.preprocess_text)

            # Create severity labels (heuristic — the dataset has no severity column)
            df['severity'] = self.create_severity_labels(df)

            # Prepare features
            X = df['combined_text']
            y_threat = df['threat']
            y_severity = df['severity']

            # Split the data (stratified on threat only)
            X_train, X_test, y_threat_train, y_threat_test, y_severity_train, y_severity_test = train_test_split(
                X, y_threat, y_severity, test_size=0.2, random_state=42, stratify=y_threat
            )

            # Train threat classification model
            logger.info("Training threat classification model...")
            self.threat_model = Pipeline([
                ('tfidf', TfidfVectorizer(max_features=5000, stop_words='english', ngram_range=(1, 2))),
                ('classifier', RandomForestClassifier(n_estimators=100, random_state=42))
            ])

            self.threat_model.fit(X_train, y_threat_train)

            # Train severity assessment model
            logger.info("Training severity assessment model...")
            self.severity_model = Pipeline([
                ('tfidf', TfidfVectorizer(max_features=5000, stop_words='english', ngram_range=(1, 2))),
                ('classifier', RandomForestClassifier(n_estimators=100, random_state=42))
            ])

            self.severity_model.fit(X_train, y_severity_train)

            # Evaluate models
            threat_pred = self.threat_model.predict(X_test)
            severity_pred = self.severity_model.predict(X_test)

            logger.info("Threat Classification Results:")
            logger.info(f"Accuracy: {accuracy_score(y_threat_test, threat_pred):.3f}")
            logger.info("\n" + classification_report(y_threat_test, threat_pred))

            logger.info("Severity Assessment Results:")
            logger.info(f"Accuracy: {accuracy_score(y_severity_test, severity_pred):.3f}")
            logger.info("\n" + classification_report(y_severity_test, severity_pred))

            # Save models
            self.save_models()
            self.is_trained = True

            logger.info("Models trained and saved successfully!")

            return {
                'threat_accuracy': accuracy_score(y_threat_test, threat_pred),
                'severity_accuracy': accuracy_score(y_severity_test, severity_pred),
                'threat_distribution': df['threat'].value_counts().to_dict(),
                'severity_distribution': df['severity'].value_counts().to_dict()
            }

        except Exception as e:
            logger.error(f"Error training models: {e}")
            raise

    def save_models(self):
        """Save trained models to disk"""
        try:
            joblib.dump(self.threat_model, MODEL_DIR / "threat_model.pkl")
            joblib.dump(self.severity_model, MODEL_DIR / "severity_model.pkl")
            logger.info("Models saved successfully")
        except Exception as e:
            logger.error(f"Error saving models: {e}")
            raise

    def load_models(self):
        """Load trained models from disk; True only when both files exist."""
        try:
            threat_model_path = MODEL_DIR / "threat_model.pkl"
            severity_model_path = MODEL_DIR / "severity_model.pkl"

            if threat_model_path.exists() and severity_model_path.exists():
                self.threat_model = joblib.load(threat_model_path)
                self.severity_model = joblib.load(severity_model_path)
                self.is_trained = True
                logger.info("Models loaded successfully")
                return True
            else:
                logger.warning("Model files not found")
                return False
        except Exception as e:
            logger.error(f"Error loading models: {e}")
            return False

    def predict(self, description, name=""):
        """Predict threat type and severity for a given incident description.

        Unlike the inference-only service, this raises ValueError (no
        rule-based fallback) when no models can be loaded.
        """
        if not self.is_trained:
            if not self.load_models():
                raise ValueError("Models not trained or loaded")

        # Preprocess input
        combined_text = self.preprocess_text(f"{name} {description}")

        # Make predictions
        threat_pred = self.threat_model.predict([combined_text])[0]
        severity_pred = self.severity_model.predict([combined_text])[0]

        # Get prediction probabilities for confidence scores
        threat_proba = self.threat_model.predict_proba([combined_text])[0]
        severity_proba = self.severity_model.predict_proba([combined_text])[0]

        threat_confidence = max(threat_proba)
        severity_confidence = max(severity_proba)

        return {
            'threat': threat_pred,
            'severity': severity_pred,
            'threat_confidence': float(threat_confidence),
            'severity_confidence': float(severity_confidence)
        }
|
| 257 |
+
|
| 258 |
+
# Shared module-level classifier so training and inference reuse one object.
incident_classifier = IncidentClassifier()


def get_classifier():
    """Return the module-wide IncidentClassifier singleton."""
    return incident_classifier


def train_models():
    """Train both models from the bundled incidents dataset.

    Returns False immediately when pandas/scikit-learn are missing.
    """
    if not TRAINING_DEPENDENCIES_AVAILABLE:
        logger.error("Training dependencies not available. Models should be pre-trained for deployment.")
        return False
    return get_classifier().train_models()


def predict_incident(description, name=""):
    """Classify one incident via the shared classifier."""
    return get_classifier().predict(description, name)
|
app/services/nlp.py
ADDED
|
@@ -0,0 +1,105 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
import logging
|
| 2 |
+
from typing import Tuple, Union, Dict
|
| 3 |
+
from .ml_model import predict_incident, get_classifier
|
| 4 |
+
|
| 5 |
+
logger = logging.getLogger(__name__)
|
| 6 |
+
|
| 7 |
+
def classify_incident(description: str, name: str = "") -> Union[Tuple[str, str], Dict]:
    """Classify an incident with the trained ML model when available.

    Args:
        description: The incident description.
        name: Optional incident name/title.

    Returns:
        A dict with predictions and confidence scores when the ML model is
        usable, otherwise the (threat, severity) tuple produced by the
        rule-based fallback.
    """
    try:
        ml = get_classifier()
        if not (ml.is_trained or ml.load_models()):
            logger.warning("ML model not available, using fallback classification")
            return _fallback_classify(description)
        # Full dict including confidence scores.
        return predict_incident(description, name)
    except Exception as e:
        logger.error(f"Error in ML classification: {e}")
        return _fallback_classify(description)
|
| 31 |
+
|
| 32 |
+
def _fallback_classify(description: str) -> Tuple[str, str]:
|
| 33 |
+
"""
|
| 34 |
+
Fallback rule-based classification when ML model is not available.
|
| 35 |
+
"""
|
| 36 |
+
description_lower = description.lower()
|
| 37 |
+
|
| 38 |
+
# Determine threat type
|
| 39 |
+
oil_keywords = ['oil', 'fuel', 'diesel', 'gasoline', 'petroleum', 'crude', 'spill']
|
| 40 |
+
chemical_keywords = ['chemical', 'acid', 'toxic', 'hazardous', 'styrene', 'acetic']
|
| 41 |
+
|
| 42 |
+
if any(keyword in description_lower for keyword in chemical_keywords):
|
| 43 |
+
threat = "Chemical"
|
| 44 |
+
elif any(keyword in description_lower for keyword in oil_keywords):
|
| 45 |
+
threat = "Oil"
|
| 46 |
+
else:
|
| 47 |
+
threat = "Other"
|
| 48 |
+
|
| 49 |
+
# Determine severity
|
| 50 |
+
high_severity_keywords = [
|
| 51 |
+
'major', 'massive', 'explosion', 'fire', 'fatality', 'death',
|
| 52 |
+
'significant', 'extensive', 'severe', 'critical', 'emergency',
|
| 53 |
+
'disaster', 'thousands', 'gallons', 'barrels'
|
| 54 |
+
]
|
| 55 |
+
|
| 56 |
+
medium_severity_keywords = [
|
| 57 |
+
'moderate', 'contained', 'limited', 'hundreds', 'investigation',
|
| 58 |
+
'response', 'cleanup', 'leak'
|
| 59 |
+
]
|
| 60 |
+
|
| 61 |
+
low_severity_keywords = [
|
| 62 |
+
'minor', 'small', 'trace', 'minimal', 'observation', 'potential',
|
| 63 |
+
'suspected', 'no injuries', 'no damage'
|
| 64 |
+
]
|
| 65 |
+
|
| 66 |
+
if any(keyword in description_lower for keyword in high_severity_keywords):
|
| 67 |
+
severity = "high"
|
| 68 |
+
elif any(keyword in description_lower for keyword in medium_severity_keywords):
|
| 69 |
+
severity = "medium"
|
| 70 |
+
elif any(keyword in description_lower for keyword in low_severity_keywords):
|
| 71 |
+
severity = "low"
|
| 72 |
+
else:
|
| 73 |
+
# Default based on threat type
|
| 74 |
+
if threat == "Chemical":
|
| 75 |
+
severity = "high"
|
| 76 |
+
elif threat == "Oil":
|
| 77 |
+
severity = "medium"
|
| 78 |
+
else:
|
| 79 |
+
severity = "low"
|
| 80 |
+
|
| 81 |
+
return threat, severity
|
| 82 |
+
|
| 83 |
+
def get_model_info():
    """Report whether ML classification or the rule-based fallback is active."""
    fallback_info = {
        "model_available": False,
        "model_type": "rule_based_fallback",
    }
    try:
        classifier = get_classifier()
        if classifier.is_trained or classifier.load_models():
            return {
                "model_available": True,
                "model_type": "machine_learning",
                "status": "active",
            }
        fallback_info["status"] = "fallback"
        return fallback_info
    except Exception as e:
        fallback_info["status"] = "error"
        fallback_info["error"] = str(e)
        return fallback_info
|
app/services/users.py
ADDED
|
@@ -0,0 +1,53 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
from datetime import datetime
|
| 2 |
+
from typing import Optional
|
| 3 |
+
|
| 4 |
+
from bson import ObjectId
|
| 5 |
+
|
| 6 |
+
from ..database import get_collection
|
| 7 |
+
|
| 8 |
+
|
| 9 |
+
USERS_COLLECTION = "users"
|
| 10 |
+
|
| 11 |
+
|
| 12 |
+
def serialize_user(document) -> Optional[dict]:
    """Convert a raw Mongo user document into an API-safe dict (or None)."""
    if not document:
        return None
    # Only _id needs stringifying; the remaining fields are copied verbatim.
    serialized = {"id": str(document.get("_id"))}
    for field in ("email", "display_name", "organization", "role", "created_at"):
        serialized[field] = document.get(field)
    return serialized
|
| 23 |
+
|
| 24 |
+
|
| 25 |
+
async def get_users_collection():
    """Return the Mongo users collection handle."""
    return get_collection(USERS_COLLECTION)


async def get_user_by_email(email: str) -> Optional[dict]:
    """Fetch one user document by exact email match, or None."""
    users = await get_users_collection()
    return await users.find_one({"email": email})


async def create_user(data: dict) -> dict:
    """Insert a new user document and return it with its new ``_id``.

    Stamps created_at/updated_at with naive-UTC datetime.utcnow();
    NOTE(review): consider timezone-aware datetimes — confirm downstream.
    """
    users = await get_users_collection()
    now = datetime.utcnow()
    payload = {
        **data,
        "created_at": now,
        "updated_at": now,
    }
    result = await users.insert_one(payload)
    payload["_id"] = result.inserted_id
    return payload


async def get_user_by_id(user_id: str) -> Optional[dict]:
    """Fetch one user by stringified ObjectId; None if invalid or missing."""
    users = await get_users_collection()
    try:
        oid = ObjectId(user_id)
    except Exception:
        # Malformed id strings are treated as "not found" rather than errors.
        return None
    return await users.find_one({"_id": oid})
|
eda.ipynb
ADDED
|
@@ -0,0 +1,265 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
{
|
| 2 |
+
"cells": [
|
| 3 |
+
{
|
| 4 |
+
"cell_type": "code",
|
| 5 |
+
"execution_count": 1,
|
| 6 |
+
"id": "fed833c7",
|
| 7 |
+
"metadata": {},
|
| 8 |
+
"outputs": [],
|
| 9 |
+
"source": [
|
| 10 |
+
"import pandas as pd \n",
|
| 11 |
+
"import numpy as np"
|
| 12 |
+
]
|
| 13 |
+
},
|
| 14 |
+
{
|
| 15 |
+
"cell_type": "code",
|
| 16 |
+
"execution_count": 2,
|
| 17 |
+
"id": "e5605712",
|
| 18 |
+
"metadata": {},
|
| 19 |
+
"outputs": [
|
| 20 |
+
{
|
| 21 |
+
"data": {
|
| 22 |
+
"text/html": [
|
| 23 |
+
"<div>\n",
|
| 24 |
+
"<style scoped>\n",
|
| 25 |
+
" .dataframe tbody tr th:only-of-type {\n",
|
| 26 |
+
" vertical-align: middle;\n",
|
| 27 |
+
" }\n",
|
| 28 |
+
"\n",
|
| 29 |
+
" .dataframe tbody tr th {\n",
|
| 30 |
+
" vertical-align: top;\n",
|
| 31 |
+
" }\n",
|
| 32 |
+
"\n",
|
| 33 |
+
" .dataframe thead th {\n",
|
| 34 |
+
" text-align: right;\n",
|
| 35 |
+
" }\n",
|
| 36 |
+
"</style>\n",
|
| 37 |
+
"<table border=\"1\" class=\"dataframe\">\n",
|
| 38 |
+
" <thead>\n",
|
| 39 |
+
" <tr style=\"text-align: right;\">\n",
|
| 40 |
+
" <th></th>\n",
|
| 41 |
+
" <th>id</th>\n",
|
| 42 |
+
" <th>open_date</th>\n",
|
| 43 |
+
" <th>name</th>\n",
|
| 44 |
+
" <th>location</th>\n",
|
| 45 |
+
" <th>lat</th>\n",
|
| 46 |
+
" <th>lon</th>\n",
|
| 47 |
+
" <th>threat</th>\n",
|
| 48 |
+
" <th>tags</th>\n",
|
| 49 |
+
" <th>commodity</th>\n",
|
| 50 |
+
" <th>measure_skim</th>\n",
|
| 51 |
+
" <th>measure_shore</th>\n",
|
| 52 |
+
" <th>measure_bio</th>\n",
|
| 53 |
+
" <th>measure_disperse</th>\n",
|
| 54 |
+
" <th>measure_burn</th>\n",
|
| 55 |
+
" <th>max_ptl_release_gallons</th>\n",
|
| 56 |
+
" <th>posts</th>\n",
|
| 57 |
+
" <th>description</th>\n",
|
| 58 |
+
" </tr>\n",
|
| 59 |
+
" </thead>\n",
|
| 60 |
+
" <tbody>\n",
|
| 61 |
+
" <tr>\n",
|
| 62 |
+
" <th>0</th>\n",
|
| 63 |
+
" <td>10431</td>\n",
|
| 64 |
+
" <td>2022-03-21</td>\n",
|
| 65 |
+
" <td>Tug Vessel Loses Power, Grounds, and Leaks Die...</td>\n",
|
| 66 |
+
" <td>Neva Strait, Sitka, AK</td>\n",
|
| 67 |
+
" <td>57.270000</td>\n",
|
| 68 |
+
" <td>-135.593330</td>\n",
|
| 69 |
+
" <td>Oil</td>\n",
|
| 70 |
+
" <td>NaN</td>\n",
|
| 71 |
+
" <td>NaN</td>\n",
|
| 72 |
+
" <td>NaN</td>\n",
|
| 73 |
+
" <td>NaN</td>\n",
|
| 74 |
+
" <td>NaN</td>\n",
|
| 75 |
+
" <td>NaN</td>\n",
|
| 76 |
+
" <td>NaN</td>\n",
|
| 77 |
+
" <td>NaN</td>\n",
|
| 78 |
+
" <td>0</td>\n",
|
| 79 |
+
" <td>At approximately 0400 on 21-Mar02922, the tug ...</td>\n",
|
| 80 |
+
" </tr>\n",
|
| 81 |
+
" <tr>\n",
|
| 82 |
+
" <th>1</th>\n",
|
| 83 |
+
" <td>10430</td>\n",
|
| 84 |
+
" <td>2022-03-17</td>\n",
|
| 85 |
+
" <td>Compromised Fuel Transfer Pipe Spills Oil into...</td>\n",
|
| 86 |
+
" <td>Oswego, NY</td>\n",
|
| 87 |
+
" <td>43.459410</td>\n",
|
| 88 |
+
" <td>-76.531650</td>\n",
|
| 89 |
+
" <td>Oil</td>\n",
|
| 90 |
+
" <td>NaN</td>\n",
|
| 91 |
+
" <td>NaN</td>\n",
|
| 92 |
+
" <td>NaN</td>\n",
|
| 93 |
+
" <td>NaN</td>\n",
|
| 94 |
+
" <td>NaN</td>\n",
|
| 95 |
+
" <td>NaN</td>\n",
|
| 96 |
+
" <td>NaN</td>\n",
|
| 97 |
+
" <td>NaN</td>\n",
|
| 98 |
+
" <td>0</td>\n",
|
| 99 |
+
" <td>On March 17, 2022, NOAA ERD was notified by Mi...</td>\n",
|
| 100 |
+
" </tr>\n",
|
| 101 |
+
" <tr>\n",
|
| 102 |
+
" <th>2</th>\n",
|
| 103 |
+
" <td>10429</td>\n",
|
| 104 |
+
" <td>2022-03-16</td>\n",
|
| 105 |
+
" <td>Floating Humpback Whale Carcass off of Carolin...</td>\n",
|
| 106 |
+
" <td>Carolina Beach, NC, USA</td>\n",
|
| 107 |
+
" <td>34.031323</td>\n",
|
| 108 |
+
" <td>-77.830343</td>\n",
|
| 109 |
+
" <td>Other</td>\n",
|
| 110 |
+
" <td>NaN</td>\n",
|
| 111 |
+
" <td>NaN</td>\n",
|
| 112 |
+
" <td>NaN</td>\n",
|
| 113 |
+
" <td>NaN</td>\n",
|
| 114 |
+
" <td>NaN</td>\n",
|
| 115 |
+
" <td>NaN</td>\n",
|
| 116 |
+
" <td>NaN</td>\n",
|
| 117 |
+
" <td>NaN</td>\n",
|
| 118 |
+
" <td>0</td>\n",
|
| 119 |
+
" <td>On March 16, 2022, the Gulf of Mexico Marine M...</td>\n",
|
| 120 |
+
" </tr>\n",
|
| 121 |
+
" <tr>\n",
|
| 122 |
+
" <th>3</th>\n",
|
| 123 |
+
" <td>10428</td>\n",
|
| 124 |
+
" <td>2022-03-15</td>\n",
|
| 125 |
+
" <td>Containership Grounded off Gibson Island in Ch...</td>\n",
|
| 126 |
+
" <td>Gibson Island, MD, USA</td>\n",
|
| 127 |
+
" <td>39.070000</td>\n",
|
| 128 |
+
" <td>-76.410000</td>\n",
|
| 129 |
+
" <td>Oil</td>\n",
|
| 130 |
+
" <td>NaN</td>\n",
|
| 131 |
+
" <td>NaN</td>\n",
|
| 132 |
+
" <td>NaN</td>\n",
|
| 133 |
+
" <td>NaN</td>\n",
|
| 134 |
+
" <td>NaN</td>\n",
|
| 135 |
+
" <td>NaN</td>\n",
|
| 136 |
+
" <td>NaN</td>\n",
|
| 137 |
+
" <td>NaN</td>\n",
|
| 138 |
+
" <td>2</td>\n",
|
| 139 |
+
" <td>On 15 March 2022, USCG Sector Maryland NCR not...</td>\n",
|
| 140 |
+
" </tr>\n",
|
| 141 |
+
" <tr>\n",
|
| 142 |
+
" <th>4</th>\n",
|
| 143 |
+
" <td>10426</td>\n",
|
| 144 |
+
" <td>2022-03-14</td>\n",
|
| 145 |
+
" <td>Oil Pipeline Discharge into Cahokia Canal, Edw...</td>\n",
|
| 146 |
+
" <td>Cahokia Canal, Edwardsville, IL</td>\n",
|
| 147 |
+
" <td>38.824034</td>\n",
|
| 148 |
+
" <td>-89.974600</td>\n",
|
| 149 |
+
" <td>Oil</td>\n",
|
| 150 |
+
" <td>NaN</td>\n",
|
| 151 |
+
" <td>NaN</td>\n",
|
| 152 |
+
" <td>NaN</td>\n",
|
| 153 |
+
" <td>NaN</td>\n",
|
| 154 |
+
" <td>NaN</td>\n",
|
| 155 |
+
" <td>NaN</td>\n",
|
| 156 |
+
" <td>NaN</td>\n",
|
| 157 |
+
" <td>NaN</td>\n",
|
| 158 |
+
" <td>0</td>\n",
|
| 159 |
+
" <td>On March 14, 2022, USEPA Region 5 contacted th...</td>\n",
|
| 160 |
+
" </tr>\n",
|
| 161 |
+
" </tbody>\n",
|
| 162 |
+
"</table>\n",
|
| 163 |
+
"</div>"
|
| 164 |
+
],
|
| 165 |
+
"text/plain": [
|
| 166 |
+
" id open_date name \\\n",
|
| 167 |
+
"0 10431 2022-03-21 Tug Vessel Loses Power, Grounds, and Leaks Die... \n",
|
| 168 |
+
"1 10430 2022-03-17 Compromised Fuel Transfer Pipe Spills Oil into... \n",
|
| 169 |
+
"2 10429 2022-03-16 Floating Humpback Whale Carcass off of Carolin... \n",
|
| 170 |
+
"3 10428 2022-03-15 Containership Grounded off Gibson Island in Ch... \n",
|
| 171 |
+
"4 10426 2022-03-14 Oil Pipeline Discharge into Cahokia Canal, Edw... \n",
|
| 172 |
+
"\n",
|
| 173 |
+
" location lat lon threat tags \\\n",
|
| 174 |
+
"0 Neva Strait, Sitka, AK 57.270000 -135.593330 Oil NaN \n",
|
| 175 |
+
"1 Oswego, NY 43.459410 -76.531650 Oil NaN \n",
|
| 176 |
+
"2 Carolina Beach, NC, USA 34.031323 -77.830343 Other NaN \n",
|
| 177 |
+
"3 Gibson Island, MD, USA 39.070000 -76.410000 Oil NaN \n",
|
| 178 |
+
"4 Cahokia Canal, Edwardsville, IL 38.824034 -89.974600 Oil NaN \n",
|
| 179 |
+
"\n",
|
| 180 |
+
" commodity measure_skim measure_shore measure_bio measure_disperse \\\n",
|
| 181 |
+
"0 NaN NaN NaN NaN NaN \n",
|
| 182 |
+
"1 NaN NaN NaN NaN NaN \n",
|
| 183 |
+
"2 NaN NaN NaN NaN NaN \n",
|
| 184 |
+
"3 NaN NaN NaN NaN NaN \n",
|
| 185 |
+
"4 NaN NaN NaN NaN NaN \n",
|
| 186 |
+
"\n",
|
| 187 |
+
" measure_burn max_ptl_release_gallons posts \\\n",
|
| 188 |
+
"0 NaN NaN 0 \n",
|
| 189 |
+
"1 NaN NaN 0 \n",
|
| 190 |
+
"2 NaN NaN 0 \n",
|
| 191 |
+
"3 NaN NaN 2 \n",
|
| 192 |
+
"4 NaN NaN 0 \n",
|
| 193 |
+
"\n",
|
| 194 |
+
" description \n",
|
| 195 |
+
"0 At approximately 0400 on 21-Mar02922, the tug ... \n",
|
| 196 |
+
"1 On March 17, 2022, NOAA ERD was notified by Mi... \n",
|
| 197 |
+
"2 On March 16, 2022, the Gulf of Mexico Marine M... \n",
|
| 198 |
+
"3 On 15 March 2022, USCG Sector Maryland NCR not... \n",
|
| 199 |
+
"4 On March 14, 2022, USEPA Region 5 contacted th... "
|
| 200 |
+
]
|
| 201 |
+
},
|
| 202 |
+
"execution_count": 2,
|
| 203 |
+
"metadata": {},
|
| 204 |
+
"output_type": "execute_result"
|
| 205 |
+
}
|
| 206 |
+
],
|
| 207 |
+
"source": [
|
| 208 |
+
"df = pd.read_csv('incidents.csv')\n",
|
| 209 |
+
"df.head()"
|
| 210 |
+
]
|
| 211 |
+
},
|
| 212 |
+
{
|
| 213 |
+
"cell_type": "code",
|
| 214 |
+
"execution_count": 3,
|
| 215 |
+
"id": "20914440",
|
| 216 |
+
"metadata": {},
|
| 217 |
+
"outputs": [
|
| 218 |
+
{
|
| 219 |
+
"data": {
|
| 220 |
+
"text/plain": [
|
| 221 |
+
"<Axes: xlabel='threat'>"
|
| 222 |
+
]
|
| 223 |
+
},
|
| 224 |
+
"execution_count": 3,
|
| 225 |
+
"metadata": {},
|
| 226 |
+
"output_type": "execute_result"
|
| 227 |
+
},
|
| 228 |
+
{
|
| 229 |
+
"data": {
|
| 230 |
+
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAjEAAAHjCAYAAADScU5NAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjkuMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8hTgPZAAAACXBIWXMAAA9hAAAPYQGoP6dpAAAnbElEQVR4nO3dfXRU9Z3H8c+QkJFgMuSBzCRlJEEiBYO2gCcEq0KBBASitl1w8aS4ZUHLU1NEC7XbUo4LllVAy8pStaVSrfZ0xdKVDQSRBwWEZI0goAc04UEyBEgyAYwTSGb/8HBPhwAaILnzy7xf58w5yb2/Cd9pp/DunXtvHMFgMCgAAADDdLB7AAAAgCtBxAAAACMRMQAAwEhEDAAAMBIRAwAAjETEAAAAIxExAADASEQMAAAwUrTdA7SWpqYmHT16VHFxcXI4HHaPAwAAvoZgMKhTp04pLS1NHTpc/lhLu42Yo0ePyuv12j0GAAC4AocPH1a3bt0uu6bdRkxcXJykL/9DiI+Pt3kaAADwddTV1cnr9Vr/jl9Ou42Y8x8hxcfHEzEAABjm65wKwom9AADASEQMAAAwEhEDAACMRMQAAAAjETEAAMBIRAwAADASEQMAAIxExAAAACMRMQAAwEhEDAAAMBIRAwAAjETEAAAAIxExAADASEQMAAAwUrTdA0BKn/2m3SO0CxVPjrJ7BABAG+JIDAAAMBIRAwAAjETEAAAAIxExAADASEQMAAAwEhEDAACMRMQAAAAjETEAAMBIRAwAADASEQMAAIxExAAAACMRMQAAwEhEDAAAMBIRAwAAjETEAAAAIxExAADASEQMAAAwEhEDAACMRMQAAAAjETEAAMBIRAwAADASEQMAAIxExAAAACMRMQAAwEhEDAAAMBIRAwAAjETEAAAAIxExAADASEQMAAAwEhEDAACMRMQAAAAjETEAAMBIRAwAADASEQMAAIxExAAAACMRMQAAwEhEDAAAMBIRAwAAjETEAAAAIxExAADASEQMAAAwEhEDAACMRMQAAAAjETEAAMBIRAwAADASEQMAAIxExAAAACO1KGIWLFig2267TXFxcUpJSdG9996rjz/+OGRNMBjU3LlzlZaWpk6dOmnw4MHas2dPyJpAIKDp06crOTlZnTt3Vn5+vo4cORKypqamRgUFBXK5XHK5XCooKFBtbe2VvUoAANDutChiNm3apKlTp2r79u0qLi7WuXPnlJubqzNnzlhrFi5cqEWLFmnp0qXauXOnPB6Phg8frlOnTllrCgsLtWrVKr366qt65513dPr0aY0ePVqNjY3WmvHjx6usrExFRUUqKipSWVmZCgoKrsFLBgAA7YEjGAwGr/TJx48fV0pKijZt2qQ777xTwWBQaWlpKiws1M9+9jNJXx51cbvd+s1vfqOHHnpIfr9fXbt21cqVKzVu3DhJ0tGjR+X1erVmzRrl5eVp37596tOnj7Zv367s7GxJ0vbt25WTk6OPPvpIvXr1+srZ6urq5HK55Pf7FR8ff6UvsU2kz37T7hHahYonR9k9AgDgKrXk3++rOifG7/dLkhITEyVJ5eXl8vl8ys3NtdY4nU7ddddd2rp1qySptLRUZ8+eDVmTlpamrKwsa822bdvkcrmsgJGkgQMHyuVyWWsuFAgEVFdXF/IAAADt1xVHTDAY1MyZM/Wd73xHWVlZkiSfzydJcrvdIWvdbre1z+fzKSYmRgkJCZddk5KS0uzPTElJsdZcaMGCBdb5My6XS16v90pfGgAAMMAVR8y0adO0a9cu/fnPf262z+FwhHwfDAabbbvQhWsutv5yP2fOnDny+/3W4/Dhw1/nZQAAAENdUcRMnz5dq1ev1ttvv61u3bpZ2z0ejyQ1O1pSVVVlHZ3xeDxqaGhQTU3NZdccO3as2Z97/PjxZkd5znM6nYqPjw95AACA9qtFERMMBjVt2jS9/vrr2rBhgzIyMkL2Z2RkyOPxqLi42NrW0N
CgTZs2adCgQZKk/v37q2PHjiFrKisr9eGHH1prcnJy5Pf7tWPHDmvNe++9J7/fb60BAACRLboli6dOnapXXnlFf/vb3xQXF2cdcXG5XOrUqZMcDocKCws1f/58ZWZmKjMzU/Pnz1dsbKzGjx9vrZ04caIeeeQRJSUlKTExUbNmzVLfvn01bNgwSVLv3r01YsQITZo0ScuXL5ckTZ48WaNHj/5aVyYBAID2r0URs2zZMknS4MGDQ7b/4Q9/0IMPPihJeuyxx1RfX68pU6aopqZG2dnZWrduneLi4qz1ixcvVnR0tMaOHav6+noNHTpUK1asUFRUlLXm5Zdf1owZM6yrmPLz87V06dIreY0AAKAduqr7xIQz7hMTebhPDACYr83uEwMAAGAXIgYAABiJiAEAAEYiYgAAgJGIGAAAYCQiBgAAGImIAQAARiJiAACAkYgYAABgJCIGAAAYiYgBAABGImIAAICRiBgAAGAkIgYAABiJiAEAAEYiYgAAgJGIGAAAYCQiBgAAGImIAQAARiJiAACAkYgYAABgJCIGAAAYiYgBAABGImIAAICRiBgAAGAkIgYAABiJiAEAAEYiYgAAgJGIGAAAYCQiBgAAGImIAQAARiJiAACAkYgYAABgJCIGAAAYiYgBAABGImIAAICRiBgAAGAkIgYAABiJiAEAAEYiYgAAgJGIGAAAYCQiBgAAGImIAQAARiJiAACAkYgYAABgJCIGAAAYiYgBAABGImIAAICRiBgAAGAkIgYAABiJiAEAAEYiYgAAgJGIGAAAYCQiBgAAGImIAQAARiJiAACAkYgYAABgJCIGAAAYiYgBAABGImIAAICRiBgAAGAkIgYAABiJiAEAAEYiYgAAgJFaHDGbN2/WmDFjlJaWJofDoTfeeCNk/4MPPiiHwxHyGDhwYMiaQCCg6dOnKzk5WZ07d1Z+fr6OHDkSsqampkYFBQVyuVxyuVwqKChQbW1ti18gAABon1ocMWfOnNGtt96qpUuXXnLNiBEjVFlZaT3WrFkTsr+wsFCrVq3Sq6++qnfeeUenT5/W6NGj1djYaK0ZP368ysrKVFRUpKKiIpWVlamgoKCl4wIAgHYquqVPGDlypEaOHHnZNU6nUx6P56L7/H6/XnzxRa1cuVLDhg2TJP3pT3+S1+vV+vXrlZeXp3379qmoqEjbt29Xdna2JOn5559XTk6OPv74Y/Xq1aulYwMAgHamVc6J2bhxo1JSUnTTTTdp0qRJqqqqsvaVlpbq7Nmzys3NtbalpaUpKytLW7dulSRt27ZNLpfLChhJGjhwoFwul7XmQoFAQHV1dSEPAADQfl3ziBk5cqRefvllbdiwQU8//bR27typ7373uwoEApIkn8+nmJgYJSQkhDzP7XbL5/NZa1JSUpr97JSUFGvNhRYsWGCdP+NyueT1eq/xKwMAAOGkxR8nfZVx48ZZX2dlZWnAgAHq3r273nzzTX3ve9+75POCwaAcDof1/T9+fak1/2jOnDmaOXOm9X1dXR0hAwBAO9bql1inpqaqe/fu2r9/vyTJ4/GooaFBNTU1IeuqqqrkdrutNceOHWv2s44fP26tuZDT6VR8fHzIAwAAtF+tHjEnT57U4cOHlZqaKknq37+/OnbsqOLiYmtNZWWlPvzwQw0aNEiSlJOTI7/frx07dlhr3nvvPfn9fmsNAACIbC3+OOn06dM6cOCA9X15ebnKysqUmJioxMREzZ07V9///veVmpqqiooK/fznP1dycrLuu+8+SZLL5dLEiRP1yCOPKCkpSYmJiZo1a5b69u1rXa3Uu3dvjRgxQpMmTdLy5cslSZMnT9bo0aO5MgkAAEi6gogpKSnRkCFDrO/Pn4cyYcIELVu2TLt379ZLL72k2tpapaamasiQIXrttdcUFxdnPWfx4sWKjo7W2LFjVV9fr6FDh2rFihWKioqy1rz88suaMWOGdRVTfn7+Ze9NAwAAIosjGAwG7R6iNdTV1cnlcsnv94
f9+THps9+0e4R2oeLJUXaPAAC4Si3595vfnQQAAIxExAAAACMRMQAAwEhEDAAAMBIRAwAAjETEAAAAIxExAADASEQMAAAwEhEDAACMRMQAAAAjETEAAMBIRAwAADASEQMAAIxExAAAACMRMQAAwEhEDAAAMBIRAwAAjETEAAAAIxExAADASEQMAAAwEhEDAACMRMQAAAAjETEAAMBIRAwAADASEQMAAIxExAAAACMRMQAAwEhEDAAAMBIRAwAAjETEAAAAIxExAADASEQMAAAwEhEDAACMRMQAAAAjETEAAMBIRAwAADASEQMAAIxExAAAACMRMQAAwEhEDAAAMBIRAwAAjETEAAAAIxExAADASEQMAAAwEhEDAACMRMQAAAAjETEAAMBIRAwAADASEQMAAIxExAAAACMRMQAAwEhEDAAAMBIRAwAAjETEAAAAIxExAADASEQMAAAwEhEDAACMRMQAAAAjETEAAMBIRAwAADASEQMAAIxExAAAACMRMQAAwEhEDAAAMFKLI2bz5s0aM2aM0tLS5HA49MYbb4TsDwaDmjt3rtLS0tSpUycNHjxYe/bsCVkTCAQ0ffp0JScnq3PnzsrPz9eRI0dC1tTU1KigoEAul0sul0sFBQWqra1t8QsEAADtU4sj5syZM7r11lu1dOnSi+5fuHChFi1apKVLl2rnzp3yeDwaPny4Tp06Za0pLCzUqlWr9Oqrr+qdd97R6dOnNXr0aDU2Nlprxo8fr7KyMhUVFamoqEhlZWUqKCi4gpcIAADaI0cwGAxe8ZMdDq1atUr33nuvpC+PwqSlpamwsFA/+9nPJH151MXtdus3v/mNHnroIfn9fnXt2lUrV67UuHHjJElHjx6V1+vVmjVrlJeXp3379qlPnz7avn27srOzJUnbt29XTk6OPvroI/Xq1esrZ6urq5PL5ZLf71d8fPyVvsQ2kT77TbtHaBcqnhxl9wgAgKvUkn+/r+k5MeXl5fL5fMrNzbW2OZ1O3XXXXdq6daskqbS0VGfPng1Zk5aWpqysLGvNtm3b5HK5rICRpIEDB8rlcllrLhQIBFRXVxfyAAAA7dc1jRifzydJcrvdIdvdbre1z+fzKSYmRgkJCZddk5KS0uznp6SkWGsutGDBAuv8GZfLJa/Xe9WvBwAAhK9WuTrJ4XCEfB8MBpttu9CFay62/nI/Z86cOfL7/dbj8OHDVzA5AAAwxTWNGI/HI0nNjpZUVVVZR2c8Ho8aGhpUU1Nz2TXHjh1r9vOPHz/e7CjPeU6nU/Hx8SEPAADQfl3TiMnIyJDH41FxcbG1raGhQZs2bdKgQYMkSf3791fHjh1D1lRWVurDDz+01uTk5Mjv92vHjh3Wmvfee09+v99aAwAAIlt0S59w+vRpHThwwPq+vLxcZWVlSkxM1A033KDCwkLNnz9fmZmZyszM1Pz58xUbG6vx48dLklwulyZOnKhHHnlESUlJSkxM1KxZs9S3b18NGzZMktS7d2+NGDFCkyZN0vLlyyVJkydP1ujRo7/WlUkAAKD9a3HElJSUaMiQIdb3M2fOlCRNmDBBK1as0GOPPab6+npNmTJFNTU1ys7O1rp16xQXF2c9Z/HixYqOjtbYsWNVX1+voUOHasWKFYqKirLWvPzyy5oxY4Z1FVN+fv4l700DAAAiz1XdJyaccZ+YyMN9YgDAfLbdJwYAAKCtEDEAAMBIRAwAADASEQMAAIxExAAAACMRMQAAwEhEDAAAMBIRAwAAjETEAAAAIxExAADASEQMAAAwEhEDAACMRMQAAAAjETEAAMBIRAwAADASEQMAAIxExAAAACMRMQAAwEhEDAAAMBIRAwAAjETEAAAAIxExAADASEQMAAAwEhEDAACMRMQAAAAjETEAAMBIRAwAADASEQMAAIxExAAAACMRMQAAwEhEDAAAMBIRAwAAjETEAAAAIxExAADASEQMAAAwEhEDAACMRMQAAAAjETEAAMBIRAwAADASEQMAAI
xExAAAACMRMQAAwEhEDAAAMBIRAwAAjETEAAAAIxExAADASEQMAAAwEhEDAACMRMQAAAAjETEAAMBIRAwAADASEQMAAIxExAAAACMRMQAAwEhEDAAAMBIRAwAAjETEAAAAIxExAADASEQMAAAwEhEDAACMRMQAAAAjETEAAMBIRAwAADASEQMAAIx0zSNm7ty5cjgcIQ+Px2PtDwaDmjt3rtLS0tSpUycNHjxYe/bsCfkZgUBA06dPV3Jysjp37qz8/HwdOXLkWo8KAAAM1ipHYm6++WZVVlZaj927d1v7Fi5cqEWLFmnp0qXauXOnPB6Phg8frlOnTllrCgsLtWrVKr366qt65513dPr0aY0ePVqNjY2tMS4AADBQdKv80OjokKMv5wWDQS1ZskSPP/64vve970mS/vjHP8rtduuVV17RQw89JL/frxdffFErV67UsGHDJEl/+tOf5PV6tX79euXl5bXGyAAAwDCtciRm//79SktLU0ZGhu6//359+umnkqTy8nL5fD7l5uZaa51Op+666y5t3bpVklRaWqqzZ8+GrElLS1NWVpa15mICgYDq6upCHgAAoP265hGTnZ2tl156SWvXrtXzzz8vn8+nQYMG6eTJk/L5fJIkt9sd8hy3223t8/l8iomJUUJCwiXXXMyCBQvkcrmsh9frvcavDAAAhJNrHjEjR47U97//ffXt21fDhg3Tm2++KenLj43OczgcIc8JBoPNtl3oq9bMmTNHfr/fehw+fPgqXgUAAAh3rX6JdefOndW3b1/t37/fOk/mwiMqVVVV1tEZj8ejhoYG1dTUXHLNxTidTsXHx4c8AABA+9XqERMIBLRv3z6lpqYqIyNDHo9HxcXF1v6GhgZt2rRJgwYNkiT1799fHTt2DFlTWVmpDz/80FoDAABwza9OmjVrlsaMGaMbbrhBVVVVeuKJJ1RXV6cJEybI4XCosLBQ8+fPV2ZmpjIzMzV//nzFxsZq/PjxkiSXy6WJEyfqkUceUVJSkhITEzVr1izr4ykAAACpFSLmyJEj+ud//medOHFCXbt21cCBA7V9+3Z1795dkvTYY4+pvr5eU6ZMUU1NjbKzs7Vu3TrFxcVZP2Px4sWKjo7W2LFjVV9fr6FDh2rFihWKioq61uMCAABDOYLBYNDuIVpDXV2dXC6X/H5/2J8fkz77TbtHaBcqnhxl9wgAgKvUkn+/+d1JAADASEQMAAAwEhEDAACMRMQAAAAjETEAAMBIRAwAADASEQMAAIxExAAAACMRMQAAwEhEDAAAMBIRAwAAjETEAAAAIxExAADASEQMAAAwEhEDAACMRMQAAAAjETEAAMBIRAwAADASEQMAAIxExAAAACMRMQAAwEhEDAAAMBIRAwAAjETEAAAAIxExAADASEQMAAAwEhEDAACMFG33AADCT/rsN+0eod2oeHKU3SMA7RZHYgAAgJGIGAAAYCQiBgAAGIlzYgAAYY/ztK6d9nSeFkdiAACAkYgYAABgJCIGAAAYiYgBAABGImIAAICRiBgAAGAkIgYAABiJiAEAAEYiYgAAgJGIGAAAYCQiBgAAGImIAQAARiJiAACAkYgYAABgJCIGAAAYiYgBAABGImIAAICRiBgAAGAkIgYAABiJiAEAAEYiYgAAgJGIGAAAYCQiBgAAGImIAQAARiJiAACAkYgYAABgJCIGAAAYiYgBAABGImIAAICRiBgAAGAkIgYAABiJiAEAAEYK+4h57rnnlJGRoeuuu079+/fXli1b7B4JAACEgbCOmNdee02FhYV6/PHH9f777+uOO+7QyJEjdejQIbtHAwAANgvriFm0aJEmTpyof/3Xf1Xv3r21ZMkSeb1eLVu2zO7RAACAzaLtHuBSGhoaVFpaqtmzZ4dsz83N1datW5utDwQCCgQC1vd+v1+SVFdX17qDXgNNgc/tHqFdMOG/a1Pwnrx2eF9eG7wnr51wf0+eny8YDH7l2rCNmBMnTqixsVFutztku9
vtls/na7Z+wYIF+vWvf91su9frbbUZEV5cS+yeAGiO9yXCjSnvyVOnTsnlcl12TdhGzHkOhyPk+2Aw2GybJM2ZM0czZ860vm9qalJ1dbWSkpIuuh5fX11dnbxerw4fPqz4+Hi7xwF4TyIs8b68NoLBoE6dOqW0tLSvXBu2EZOcnKyoqKhmR12qqqqaHZ2RJKfTKafTGbKtS5curTlixImPj+d/mAgrvCcRjnhfXr2vOgJzXtie2BsTE6P+/furuLg4ZHtxcbEGDRpk01QAACBchO2RGEmaOXOmCgoKNGDAAOXk5Oh3v/udDh06pIcfftju0QAAgM3COmLGjRunkydPat68eaqsrFRWVpbWrFmj7t272z1aRHE6nfrVr37V7OM6wC68JxGOeF+2PUfw61zDBAAAEGbC9pwYAACAyyFiAACAkYgYAABgJCIGAAAYiYgBAABGImIAAGihc+fO6Y9//ONFf5cf2g6XWMOyevXqr702Pz+/FScBvrRr166vvfaWW25pxUmA5mJjY7Vv3z7uXWYjIgaWDh2+3oE5h8OhxsbGVp4G+PI96XA4dKm/ps7v4z0JOwwZMkSFhYW655577B4lYoX1HXvRtpqamuweAQhRXl5u9wjAJU2ZMkUzZ87U4cOH1b9/f3Xu3DlkP0cHWx9HYgAAuAIXO3rN0cG2xZEYWJ599llNnjxZ1113nZ599tnLrp0xY0YbTQWE2rt3rw4dOqSGhoaQ7ZynhbbGkUL7cSQGloyMDJWUlCgpKUkZGRmXXOdwOPTpp5+24WSA9Omnn+q+++7T7t27Q86TcTgcksT/6wUiEJdYw1JeXq6kpCTr6/Lycu3YsUMlJSXW9+Xl5QQMbPGTn/xEGRkZOnbsmGJjY7Vnzx5t3rxZAwYM0MaNG+0eDxFq5cqVuv3225WWlqaDBw9KkpYsWaK//e1vNk8WGYgYNFNbW6upU6cqOTlZHo9HKSkpSk5O1rRp0+T3++0eDxFq27Ztmjdvnrp27aoOHTqoQ4cO+s53vqMFCxbw8SZssWzZMs2cOVN33323amtrraOBXbp00ZIlS+wdLkJwTgxCVFdXKycnR5999pkeeOAB9e7dW8FgUPv27dOKFSv01ltvaevWrUpISLB7VESYxsZGXX/99ZKk5ORkHT16VL169VL37t318ccf2zwdItFvf/tbPf/887r33nv15JNPWtsHDBigWbNm2ThZ5CBiEGLevHmKiYnRJ598Irfb3Wxfbm6u5s2bp8WLF9s0ISJVVlaWdu3apR49eig7O1sLFy5UTEyMfve736lHjx52j4cIVF5erm9/+9vNtjudTp05c8aGiSIPHychxBtvvKGnnnqqWcBIksfj0cKFC7Vq1SobJkOk+8UvfmHdy+iJJ57QwYMHdccdd2jNmjVfeTUd0BoyMjJUVlbWbPv//u//qk+fPm0/UATiSAxCVFZW6uabb77k/qysLH5XCGyRl5dnfd2jRw/t3btX1dXVSkhIsK5QAtrSo48+qqlTp+qLL75QMBjUjh079Oc//1kLFizQCy+8YPd4EYGIQYjk5GRVVFSoW7duF93/j1cwAW3J7/ersbFRiYmJ1rbExERVV1crOjpa8fHxNk6HSPQv//IvOnfunB577DF9/vnnGj9+vL7xjW/omWee0f3332/3eBGB+8QgxMSJE3XgwAEVFxcrJiYmZF8gEFBeXp5uvPFGvfjiizZNiEg1cuRIjRkzRlOmTAnZ/l//9V9avXq11qxZY9NkgHTixAk1NTUpJSXF7lEiChGDEEeOHNGAAQPkdDo1depUffOb35T05V1Sn3vuOQUCAZWUlMjr9do8KSJNYmKi3n33XfXu3Ttk+0cffaTbb79dJ0+etGkyAHbh4ySE6Natm7Zt26YpU6Zozpw5IXdFHT58uJYuXUrAwBaBQEDnzp1rtv3s2bOqr6+3YSJEumPHjmnWrFl66623VFVV1ey3rXMX6dbHkRhcUk1Njfbv3y9J6tmzZ8
i5CEBbGzx4sPr27avf/va3IdunTp2qXbt2acuWLTZNhkg1cuRIHTp0SNOmTVNqamqzE8zvuecemyaLHEQMACO8++67GjZsmG677TYNHTpUkvTWW29p586dWrdune644w6bJ0SkiYuL05YtW/Stb33L7lEiFveJAWCE22+/Xdu2bZPX69Vf/vIX/f3vf1fPnj21a9cuAga28Hq9zT5CQtviSAwAAFdg3bp1evrpp7V8+XKlp6fbPU5EImIAhK26ujrr/i91dXWXXct9YtAWLry54pkzZ3Tu3DnFxsaqY8eOIWurq6vberyIw9VJAMJWQkKCKisrlZKSoi5dulz0zrzBYFAOh4MrQdAm+O3U4YWIARC2NmzYYF0V9/bbb9s8DSBNmDDB7hHwD/g4CQCAKxAVFWUdKfxHJ0+eVEpKCkcH2wBHYgAY44svvtCuXbtUVVVl/Ubr8/Lz822aCpHqUscAAoFAs1/bgtZBxAAwQlFRkX74wx/qxIkTzfZxTgza0rPPPivpy/fdCy+8oOuvv97a19jYqM2bN1u/sgWti4+TABihZ8+eysvL0y9/+Uu53W67x0EEy8jIkCQdPHhQ3bp1U1RUlLUvJiZG6enpmjdvnrKzs+0aMWIQMQCMEB8fr/fff1833nij3aMAkqQhQ4bo9ddf17lz59ShQwclJSXZPVLE4Y69AIzwgx/8QBs3brR7DECSVFtbq969eyszM1Mej0cpKSlKTk7WtGnTVFtba/d4EYMjMQCM8Pnnn+uf/umf1LVrV/Xt27fZjcVmzJhh02SINNXV1crJydFnn32mBx54QL1791YwGNS+ffv0yiuvyOv1auvWrUpISLB71HaPiAFghBdeeEEPP/ywOnXqpKSkpJAb3zkcDn366ac2TodIUlhYqLfeekvr169vdn6Wz+dTbm6uhg4dqsWLF9s0YeQgYgAYwePxaMaMGZo9e7Y6dOCTcNgnPT1dy5cvV15e3kX3FxUV6eGHH1ZFRUXbDhaB+JsAgBEaGho0btw4Aga2q6ys1M0333zJ/VlZWfL5fG04UeTibwMARpgwYYJee+01u8cAlJycfNmjLOXl5Vyp1Ea42R0AIzQ2NmrhwoVau3atbrnllmYn9i5atMimyRBpRowYoccff1zFxcXN7swbCAT0b//2bxoxYoRN00UWzokBYIQhQ4Zccp/D4dCGDRvacBpEsiNHjmjAgAFyOp2aOnWqdXfevXv36rnnnlMgEFBJSYm8Xq/Nk7Z/RAwAAC1UXl6uKVOmaN26ddbvUHI4HBo+fLiWLl2qnj172jxhZCBiABjlwIED+uSTT3TnnXeqU6dOCgaDIZdbA22ppqZG+/fvl/Tlr8ZITEy0eaLIQsQAMMLJkyc1duxYvf3223I4HNq/f7969OihiRMnqkuXLnr66aftHhFAG+PqJABG+OlPf6qOHTvq0KFDio2NtbaPGzdORUVFNk4GwC5cnQTACOvWrdPatWvVrVu3kO2ZmZk6ePCgTVMBsBNHYgAY4cyZMyFHYM47ceKEnE6nDRMBsBsRA8AId955p1566SXre4fDoaamJv3Hf/zHZS+/BtB+cWIvACPs3btXgwcPVv/+/bVhwwbl5+drz549qq6u1rvvvqsbb7zR7hEBtDEiBoAxfD6fli1bptLSUjU1Nalfv36aOnWqUlNT7R4NgA2IGAAAYCSuTgJgjNraWu3YsUNVVVVqamoK2ffDH/7QpqkA2IUjMQCM8Pe//10PPPCAzpw5o7i4uJC79DocDlVXV9s4HQA7EDEAjHDTTTfp7rvv1vz58y96qTWAyEPEADBC586dtXv3bvXo0cPuUQCECe4TA8AIeXl5KikpsXsMAGGEE3sBhK3Vq1dbX48aNUqPPvqo9u7dq759+6pjx44ha/Pz89t6PAA24+MkAGGrQ4evd7DY4XCosbGxlacBEG6IGAAAYCTOiQEQ1jZs2KA+ffqorq6u2T6/36+bb75ZW7ZssWEyAHYjYgCEtS
VLlmjSpEmKj49vts/lcumhhx7SokWLbJgMgN2IGABh7YMPPtCIESMuuT83N1elpaVtOBGAcEHEAAhrx44da3Yl0j+Kjo7W8ePH23AiAOGCiAEQ1r7xjW9o9+7dl9y/a9cufos1EKGIGABh7e6779Yvf/lLffHFF8321dfX61e/+pVGjx5tw2QA7MYl1gDC2rFjx9SvXz9FRUVp2rRp6tWrlxwOh/bt26f//M//VGNjo/7v//5Pbrfb7lEBtDEiBkDYO3jwoH784x9r7dq1Ov9XlsPhUF5enp577jmlp6fbOyAAWxAxAIxRU1OjAwcOKBgMKjMzUwkJCXaPBMBGRAwAADASJ/YCAAAjETEAAMBIRAwAADASEQPAFhs3bpTD4VBtba3dowAwFBEDoE0MHjxYhYWFdo9hSU9P15IlS+weA8BVIGIAGOPs2bN2jwAgjBAxAFrdgw8+qE2bNumZZ56Rw+GQw+FQRUWFJKm0tFQDBgxQbGysBg0apI8//th63ty5c/Wtb31Lv//979WjRw85nU4Fg0H5/X5NnjxZKSkpio+P13e/+1198MEH1vM++eQT3XPPPXK73br++ut12223af369db+wYMH6+DBg/rpT39qzQPAPEQMgFb3zDPPKCcnR5MmTVJlZaUqKyvl9XolSY8//riefvpplZSUKDo6Wj/60Y9CnnvgwAH95S9/0X//93+rrKxMkjRq1Cj5fD6tWbNGpaWl6tevn4YOHarq6mpJ0unTp3X33Xdr/fr1ev/995WXl6cxY8bo0KFDkqTXX39d3bp107x586x5AJgn2u4BALR/LpdLMTExio2NlcfjkSR99NFHkqR///d/11133SVJmj17tkaNGqUvvvhC1113nSSpoaFBK1euVNeuXSVJGzZs0O7du1VVVSWn0ylJeuqpp/TGG2/or3/9qyZPnqxbb71Vt956q/XnP/HEE1q1apVWr16tadOmKTExUVFRUYqLi7PmAWAeIgaArW655Rbr69TUVElSVVWVbrjhBklS9+7drYCRvvz46fTp00pKSgr5OfX19frkk08kSWfOnNGvf/1r/c///I+OHj2qc+fOqb6+3joSA6B9IGIA2Kpjx47W1+fPTWlqarK2de7cOWR9U1OTUlNTtXHjxmY/q0uXLpKkRx99VGvXrtVTTz2lnj17qlOnTvrBD36ghoaGa/8CANiGiAHQJmJiYtTY2HjVP6dfv37y+XyKjo6+5G+v3rJlix588EHdd999kr48R+b8icTXeh4A9uHEXgBtIj09Xe+9954qKip04sSJkKMtLTFs2DDl5OTo3nvv1dq1a1VRUaGtW7fqF7/4hUpKSiRJPXv21Ouvv66ysjJ98MEHGj9+fLM/Lz09XZs3b9Znn32mEydOXPXrA9D2iBgAbWLWrFmKiopSnz591LVr1ys+P8XhcGjNmjW688479aMf/Ug33XST7r//flVUVMjtdkuSFi9erISEBA0aNEhjxoxRXl6e+vXrF/Jz5s2bp4qKCt14440h59wAMIcjGAwG7R4CAACgpTgSAwAAjETEAAAAIxExAADASEQMAAAwEhEDAACMRMQAAAAjETEAAMBIRAwAADASEQMAAIxExAAAACMRMQAAwEhEDAAAMNL/A7/2DeJ7Cz28AAAAAElFTkSuQmCC",
|
| 231 |
+
"text/plain": [
|
| 232 |
+
"<Figure size 640x480 with 1 Axes>"
|
| 233 |
+
]
|
| 234 |
+
},
|
| 235 |
+
"metadata": {},
|
| 236 |
+
"output_type": "display_data"
|
| 237 |
+
}
|
| 238 |
+
],
|
| 239 |
+
"source": [
|
| 240 |
+
"df['threat'].value_counts().plot(kind='bar')"
|
| 241 |
+
]
|
| 242 |
+
}
|
| 243 |
+
],
|
| 244 |
+
"metadata": {
|
| 245 |
+
"kernelspec": {
|
| 246 |
+
"display_name": "ai-gpu",
|
| 247 |
+
"language": "python",
|
| 248 |
+
"name": "python3"
|
| 249 |
+
},
|
| 250 |
+
"language_info": {
|
| 251 |
+
"codemirror_mode": {
|
| 252 |
+
"name": "ipython",
|
| 253 |
+
"version": 3
|
| 254 |
+
},
|
| 255 |
+
"file_extension": ".py",
|
| 256 |
+
"mimetype": "text/x-python",
|
| 257 |
+
"name": "python",
|
| 258 |
+
"nbconvert_exporter": "python",
|
| 259 |
+
"pygments_lexer": "ipython3",
|
| 260 |
+
"version": "3.9.21"
|
| 261 |
+
}
|
| 262 |
+
},
|
| 263 |
+
"nbformat": 4,
|
| 264 |
+
"nbformat_minor": 5
|
| 265 |
+
}
|
incidents.csv
ADDED
|
The diff for this file is too large to render.
See raw diff
|
|
|
models/severity_model.pkl
ADDED
|
@@ -0,0 +1,3 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:be23df7113f24fe40e5c4dcc6bbd64244573f5fbc7d5f48ca82dc3d290f3a8ba
|
| 3 |
+
size 9374205
|
models/threat_model.pkl
ADDED
|
@@ -0,0 +1,3 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:3f75285ae4727cfe85b1dd64c0af91a668c8884dc225e892cdfa75ba4c9ba0f7
|
| 3 |
+
size 4653005
|
requirements-docker.txt
ADDED
|
@@ -0,0 +1,21 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
# Core FastAPI dependencies
|
| 2 |
+
fastapi==0.115.2
|
| 3 |
+
uvicorn[standard]==0.32.1
|
| 4 |
+
motor==3.6.0
|
| 5 |
+
passlib[bcrypt]==1.7.4
|
| 6 |
+
python-jose==3.3.0
|
| 7 |
+
python-multipart==0.0.9
|
| 8 |
+
pydantic[email]==2.9.2
|
| 9 |
+
pydantic-settings==2.6.1
|
| 10 |
+
python-dotenv==1.0.1
|
| 11 |
+
|
| 12 |
+
# Testing dependencies (optional for production)
|
| 13 |
+
pytest==8.3.3
|
| 14 |
+
httpx==0.27.2
|
| 15 |
+
|
| 16 |
+
# ML inference dependencies (optimized for Docker)
|
| 17 |
+
numpy==1.26.4
|
| 18 |
+
joblib==1.4.2
|
| 19 |
+
scikit-learn==1.5.2
|
| 20 |
+
|
| 21 |
+
# Note: Using sklearn 1.5.2 for better Docker compatibility while maintaining model loading capability
|
requirements-railway-light.txt
ADDED
|
@@ -0,0 +1,13 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
# Railway deployment without ML dependencies (rule-based classification only)
|
| 2 |
+
fastapi==0.115.2
|
| 3 |
+
uvicorn[standard]==0.32.1
|
| 4 |
+
motor==3.6.0
|
| 5 |
+
passlib[bcrypt]==1.7.4
|
| 6 |
+
python-jose==3.3.0
|
| 7 |
+
python-multipart==0.0.9
|
| 8 |
+
pydantic[email]==2.9.2
|
| 9 |
+
pydantic-settings==2.6.1
|
| 10 |
+
python-dotenv==1.0.1
|
| 11 |
+
|
| 12 |
+
# No ML dependencies - will use rule-based classification
|
| 13 |
+
# This should make Railway deployment much lighter and faster
|
requirements-railway.txt
ADDED
|
@@ -0,0 +1,22 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
# Railway-optimized requirements - lighter build
|
| 2 |
+
# Core FastAPI dependencies
|
| 3 |
+
fastapi==0.115.2
|
| 4 |
+
uvicorn[standard]==0.32.1
|
| 5 |
+
motor==3.6.0
|
| 6 |
+
passlib[bcrypt]==1.7.4
|
| 7 |
+
python-jose==3.3.0
|
| 8 |
+
python-multipart==0.0.9
|
| 9 |
+
pydantic[email]==2.9.2
|
| 10 |
+
pydantic-settings==2.6.1
|
| 11 |
+
python-dotenv==1.0.1
|
| 12 |
+
|
| 13 |
+
# Testing dependencies (optional for production)
|
| 14 |
+
pytest==8.3.3
|
| 15 |
+
httpx==0.27.2
|
| 16 |
+
|
| 17 |
+
# Minimal ML dependencies for Railway (lighter build)
|
| 18 |
+
numpy==1.24.3
|
| 19 |
+
joblib==1.3.2
|
| 20 |
+
scikit-learn==1.3.2
|
| 21 |
+
|
| 22 |
+
# Note: Using lighter/older versions to reduce build time and memory usage on Railway
|
requirements-training.txt
ADDED
|
@@ -0,0 +1,15 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
# Training Requirements (for local model development only)
|
| 2 |
+
# Install with: pip install -r requirements-training.txt
|
| 3 |
+
|
| 4 |
+
# All production dependencies
|
| 5 |
+
-r requirements.txt
|
| 6 |
+
|
| 7 |
+
# Heavy ML training libraries (only needed for model training/EDA)
|
| 8 |
+
pandas==2.2.2
|
| 9 |
+
scikit-learn==1.5.1
|
| 10 |
+
matplotlib==3.8.0
|
| 11 |
+
seaborn==0.13.0
|
| 12 |
+
jupyter==1.0.0
|
| 13 |
+
|
| 14 |
+
# Additional analysis tools
|
| 15 |
+
plotly==5.17.0
|
requirements.txt
ADDED
|
@@ -0,0 +1,22 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
# Core FastAPI dependencies
|
| 2 |
+
fastapi==0.115.2
|
| 3 |
+
uvicorn[standard]==0.32.1
|
| 4 |
+
motor==3.6.0
|
| 5 |
+
passlib[bcrypt]==1.7.4
|
| 6 |
+
python-jose==3.3.0
|
| 7 |
+
python-multipart==0.0.9
|
| 8 |
+
pydantic[email]==2.9.2
|
| 9 |
+
pydantic-settings==2.6.1
|
| 10 |
+
python-dotenv==1.0.1
|
| 11 |
+
|
| 12 |
+
# Testing dependencies (optional for production)
|
| 13 |
+
pytest==8.3.3
|
| 14 |
+
httpx==0.27.2
|
| 15 |
+
|
| 16 |
+
# ML inference dependencies (required for loading pre-trained models)
|
| 17 |
+
numpy==1.26.4
|
| 18 |
+
joblib==1.4.2
|
| 19 |
+
scikit-learn==1.7.0
|
| 20 |
+
|
| 21 |
+
# Note: scikit-learn version 1.7.0 matches the version used for training
|
| 22 |
+
# This eliminates version warnings during model loading
|
start-hf.sh
ADDED
|
@@ -0,0 +1,16 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
#!/bin/bash
# Hugging Face Spaces startup script for the Marine Guard API.
#
# Optional environment variables:
#   PORT            - listen port (Spaces default: 7860)
#   MONGODB_URI     - MongoDB connection string (empty => app fallback)
#   JWT_SECRET_KEY  - token signing secret; MUST be overridden in production
#   ALLOWED_ORIGINS - CORS origins (default: "*")
set -euo pipefail

echo "π Starting Marine Guard API on Hugging Face Spaces..."

# Provide defaults so neither `set -u` nor the app sees unset variables.
export MONGODB_URI="${MONGODB_URI:-}"
export JWT_SECRET_KEY="${JWT_SECRET_KEY:-huggingface-default-secret-change-in-production}"
export ALLOWED_ORIGINS="${ALLOWED_ORIGINS:-*}"

# Log environment info
echo "π‘ Port: ${PORT:-7860}"
echo "π Allowed Origins: $ALLOWED_ORIGINS"

# exec replaces this shell so uvicorn receives platform signals (SIGTERM)
# directly and container shutdown is clean.
exec uvicorn app.main:app --host 0.0.0.0 --port "${PORT:-7860}" --workers 1
|
start.sh
ADDED
|
@@ -0,0 +1,13 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
#!/bin/bash
# Render startup script for the Marine Guard API.
#
# Optional environment variables:
#   PORT - listen port injected by the platform (default: 8000)
set -euo pipefail

echo "π Starting Marine Guard API..."

# Default the port when the platform does not inject one; quote the
# expansion so an odd value cannot word-split.
export PORT="${PORT:-8000}"

echo "π‘ Starting uvicorn on port $PORT..."

# exec so uvicorn replaces the shell and receives signals (SIGTERM) directly.
exec uvicorn app.main:app --host 0.0.0.0 --port "$PORT" --workers 1
|
tests/__pycache__/conftest.cpython-311-pytest-8.3.3.pyc
ADDED
|
Binary file (5.35 kB). View file
|
|
|
tests/__pycache__/test_auth.cpython-311-pytest-8.3.3.pyc
ADDED
|
Binary file (9.43 kB). View file
|
|
|