metadata
title: Porto Seguro Safe Driver Prediction
emoji: 🚀
colorFrom: red
colorTo: red
sdk: docker
app_port: 8501
tags:
  - streamlit
pinned: false
short_description: Streamlit template space

🏦 Porto Seguro – Safe Driver Prediction

This machine learning app predicts the probability that a driver will file an auto insurance claim.

📌 Problem Statement

Insurance companies need accurate risk estimation to price policies fairly.
In this Kaggle competition, the goal is to build a model that predicts whether a policyholder will file a claim in the next year.

Better predictions help:

  • reduce costs for safe drivers
  • price high-risk drivers correctly
  • improve accessibility of insurance

This is a binary classification problem with highly imbalanced data.

📊 Dataset Overview

The dataset contains anonymized features related to:

  • driver information (ind)
  • regional data (reg)
  • car characteristics (car)
  • calculated features (calc)
  • binary and categorical variables
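The feature families above can be recovered from the column names. The sketch below assumes the Kaggle naming pattern `ps_<group>_<number>` with an optional `_bin` or `_cat` suffix; the helper `group_features` and the sample column list are illustrative and not part of this repository.

```python
from collections import defaultdict

def group_features(columns):
    """Group column names by family (ind, reg, car, calc) and by value type."""
    by_group = defaultdict(list)
    by_type = defaultdict(list)
    for name in columns:
        parts = name.split("_")
        # Assumed pattern: ps_<group>_<number>[_bin|_cat]
        if len(parts) < 3 or parts[0] != "ps":
            continue  # skips non-feature columns such as id and target
        by_group[parts[1]].append(name)
        if name.endswith("_bin"):
            by_type["binary"].append(name)
        elif name.endswith("_cat"):
            by_type["categorical"].append(name)
        else:
            by_type["other"].append(name)  # continuous or ordinal
    return dict(by_group), dict(by_type)

# Illustrative column names only
columns = ["id", "target", "ps_ind_01", "ps_ind_06_bin", "ps_reg_01",
           "ps_car_01_cat", "ps_calc_15_bin"]
groups, types = group_features(columns)
print(groups)
print(types)
```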

Missing values are represented by -1.
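Because -1 is a sentinel rather than a real value, it is usually converted to an explicit missing marker before modelling. The following is a minimal standard-library sketch; the helpers `mark_missing` and `missing_fraction` and the toy rows are hypothetical and do not describe how this app actually cleans its data.

```python
def mark_missing(rows, sentinel=-1):
    """Return copies of the rows with the sentinel replaced by None."""
    return [{k: (None if v == sentinel else v) for k, v in row.items()}
            for row in rows]

def missing_fraction(rows):
    """Fraction of None values per column."""
    counts = {}
    for row in rows:
        for key, value in row.items():
            counts[key] = counts.get(key, 0) + (value is None)
    return {key: count / len(rows) for key, count in counts.items()}

# Toy rows with made-up values
rows = [
    {"ps_reg_03": 0.7, "ps_car_03_cat": -1},
    {"ps_reg_03": 0.9, "ps_car_03_cat": 1},
    {"ps_reg_03": -1, "ps_car_03_cat": -1},
    {"ps_reg_03": 0.6, "ps_car_03_cat": 0},
]
cleaned = mark_missing(rows)
fractions = missing_fraction(cleaned)
print(fractions)
```

With pandas, the equivalent replacement is typically `df.replace(-1, np.nan)`.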

Target:

  • target = 1 → claim filed
  • target = 0 → no claim

⚙️ Machine Learning Pipeline

  1. Data cleaning & handling missing values
  2. Feature selection
  3. Train-test split
  4. Model training
  5. Evaluation
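With heavily imbalanced targets, a plain random split can leave the test set with too few positive cases. One common remedy is a stratified split, sketched below with the standard library only. This README does not say whether the project stratifies its split, so treat this as an assumption; `stratified_split` is a hypothetical helper.

```python
import random

def stratified_split(labels, test_size=0.2, seed=0):
    """Split indices so each class keeps roughly the same share in both sets."""
    rng = random.Random(seed)
    train_idx, test_idx = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        # At least one sample per class goes to the test set
        n_test = max(1, round(len(idx) * test_size))
        test_idx.extend(idx[:n_test])
        train_idx.extend(idx[n_test:])
    return sorted(train_idx), sorted(test_idx)

# Toy labels: 10 claims among 100 policyholders
labels = [1] * 10 + [0] * 90
train_idx, test_idx = stratified_split(labels)
print(len(train_idx), len(test_idx))
print(sum(labels[i] for i in test_idx), "claims in the test set")
```

In scikit-learn, `train_test_split(X, y, stratify=y)` gives the same kind of split.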

🤖 Model

Algorithm used:

  • Logistic Regression / Random Forest / XGBoost (adjust to match the model actually used)

The model outputs the probability of a claim.

πŸ“ Evaluation Metric

Competition metric:

Normalized Gini Coefficient

Why Gini?

It measures how well the model ranks high-risk drivers above low-risk drivers.
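For a binary target, the normalized Gini coefficient is related to ROC AUC by Gini = 2 · AUC − 1, so a perfect ranking scores 1 and a random ranking scores about 0. The sketch below computes it with the standard library, using average ranks for tied scores. The function names are illustrative, and the competition's own scoring code may handle ties differently.

```python
def roc_auc(y_true, y_score):
    """AUC from the rank-sum (Mann-Whitney U) statistic, averaging tied ranks."""
    order = sorted(range(len(y_score)), key=lambda i: y_score[i])
    ranks = [0.0] * len(y_score)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover every score tied with the score at position i
        while j + 1 < len(order) and y_score[order[j + 1]] == y_score[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    rank_sum_pos = sum(r for r, y in zip(ranks, y_true) if y == 1)
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

def normalized_gini(y_true, y_score):
    """Normalized Gini for binary labels, via Gini = 2 * AUC - 1."""
    return 2 * roc_auc(y_true, y_score) - 1

# Toy example: one of the two claims is ranked below a non-claim
y_true = [0, 0, 1, 1]
y_score = [0.1, 0.4, 0.35, 0.8]
gini = normalized_gini(y_true, y_score)
print(gini)
```

Both functions assume that `y_true` contains at least one claim and one non-claim; otherwise the denominator is zero.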

🚀 Streamlit App

The app allows users to:

  • Enter driver & vehicle features
  • Get real-time claim probability prediction

Output

  • Claim probability
  • Risk interpretation
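The risk interpretation can be as simple as mapping the predicted probability to a label. The sketch below shows one way to do that. The cut-offs `low` and `high` are hypothetical placeholders, not the values used by this app, and should be chosen from the model's score distribution, since claim probabilities on imbalanced data tend to be small.

```python
def interpret_risk(probability, low=0.03, high=0.10):
    """Map a claim probability to a risk label using assumed cut-offs."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    if probability < low:
        return "Low risk"
    if probability < high:
        return "Medium risk"
    return "High risk"

for p in (0.01, 0.05, 0.25):
    print(f"{p:.2%} -> {interpret_risk(p)}")
```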