---
license: mit
tags:
- sklearn
- classification
- healthcare
- lung-cancer
- streamlit
library_name: scikit-learn
model_name: Datathon Lung Cancer Detector
datasets:
- custom
language: en
---

# 🫁 Datathon Lung Cancer Detector

This model predicts whether a patient is likely to have lung cancer based on clinical and behavioral risk factors. It was trained on a dataset of 309 entries with 15 input features and a binary diagnosis label.

---

## Input Features

| Feature | Type | Description |
|------------------------|----------|---------------------------------|
| `GENDER` | 0 = Female, 1 = Male | Biological sex |
| `AGE` | Integer | Patient age |
| `SMOKING` | 0/1 | Smoking habit |
| `YELLOW_FINGERS` | 0/1 | Stained fingers from smoking |
| `ANXIETY` | 0/1 | Anxiety symptoms |
| `PEER_PRESSURE` | 0/1 | Influence from peers |
| `CHRONIC DISEASE` | 0/1 | History of chronic illness |
| `FATIGUE` | 0/1 | Feeling of tiredness |
| `ALLERGY` | 0/1 | Known allergies |
| `WHEEZING` | 0/1 | Wheezing symptoms |
| `ALCOHOL CONSUMING` | 0/1 | Alcohol consumption |
| `COUGHING` | 0/1 | Persistent coughing |
| `SHORTNESS OF BREATH` | 0/1 | Difficulty breathing |
| `SWALLOWING DIFFICULTY` | 0/1 | Trouble swallowing |
| `CHEST PAIN` | 0/1 | Pain in chest area |
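
The table above can also be read as an input schema. As a minimal sketch, assuming encoded rows arrive as plain dictionaries, the hypothetical helper `validate_row` below checks that all 15 features are present, that binary features are 0 or 1, and that `AGE` is an integer no greater than 120 (the cap used by the Streamlit form; the lower bound of 0 is an assumption). The names `FEATURE_COLUMNS`, `BINARY_FEATURES`, and `validate_row` are illustrative and not part of the released model.

```python
# Hypothetical schema check for one encoded input row (illustrative only).
FEATURE_COLUMNS = [
    "GENDER", "AGE", "SMOKING", "YELLOW_FINGERS", "ANXIETY",
    "PEER_PRESSURE", "CHRONIC DISEASE", "FATIGUE", "ALLERGY",
    "WHEEZING", "ALCOHOL CONSUMING", "COUGHING",
    "SHORTNESS OF BREATH", "SWALLOWING DIFFICULTY", "CHEST PAIN",
]
# Every feature except AGE is binary (GENDER: 0 = Female, 1 = Male).
BINARY_FEATURES = [name for name in FEATURE_COLUMNS if name != "AGE"]


def validate_row(row):
    """Return the row's values in column order, or raise ValueError."""
    missing = [name for name in FEATURE_COLUMNS if name not in row]
    if missing:
        raise ValueError(f"missing features: {missing}")
    for name in BINARY_FEATURES:
        if row[name] not in (0, 1):
            raise ValueError(f"{name} must be 0 or 1, got {row[name]!r}")
    age = row["AGE"]
    # bool is a subclass of int, so reject it explicitly.
    if isinstance(age, bool) or not isinstance(age, int) or not 0 <= age <= 120:
        raise ValueError(f"AGE must be an integer in [0, 120], got {age!r}")
    return [row[name] for name in FEATURE_COLUMNS]


# Example: a 61-year-old female smoker with a persistent cough.
example = {name: 0 for name in BINARY_FEATURES}
example.update({"AGE": 61, "SMOKING": 1, "COUGHING": 1})
print(validate_row(example))
```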

---

## 🧠 Model Info

- **Algorithm**: XGBoost classifier (highest-scoring model)
- **Framework**: Scikit-learn
- **Target**: `DIAGNOSIS_LUNG_CANCER` (`YES` = Lung Cancer, `NO` = No Cancer)
- **Dataset Size**: 309 samples
- **Preprocessing**: Label encoding, binary encoding for yes/no inputs
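
The preprocessing step can be illustrated with a short sketch. It assumes raw answers arrive as `Yes`/`No` strings and gender as `F`/`M`, and that the target maps `YES` to 1 and `NO` to 0. That mapping is consistent with the `prediction == 1` check in the Streamlit app but is not stated in this card. The helper names `encode_features` and `encode_target` are hypothetical, and the original training code may differ.

```python
# Illustrative encoding only; the raw value formats below are assumptions.
GENDER_CODES = {"F": 0, "M": 1}      # feature table: 0 = Female, 1 = Male
YES_NO_CODES = {"no": 0, "yes": 1}   # binary encoding for yes/no inputs


def encode_yes_no(value):
    """Map a yes/no answer (any letter case) to 1/0."""
    return YES_NO_CODES[str(value).strip().lower()]


def encode_features(raw):
    """Encode one raw record (dict) into numeric model inputs."""
    encoded = {}
    for name, value in raw.items():
        if name == "GENDER":
            encoded[name] = GENDER_CODES[value]
        elif name == "AGE":
            encoded[name] = int(value)
        else:
            encoded[name] = encode_yes_no(value)
    return encoded


def encode_target(label):
    """Encode the diagnosis label: YES -> 1, NO -> 0 (assumed mapping)."""
    return encode_yes_no(label)


raw_record = {"GENDER": "M", "AGE": "58", "SMOKING": "Yes", "FATIGUE": "No"}
print(encode_features(raw_record))
print(encode_target("YES"))
```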

---

## Try It in Streamlit

This model is also available as a web app built with Streamlit, at https://datathonlungcancer.streamlit.app/

```python
import streamlit as st
import pandas as pd
import joblib

# Load the trained classifier saved alongside the app
model = joblib.load('model.pkl')

st.title('🫁 Lung Cancer Diagnosis')
st.write("Please fill out the following information to assess the likelihood of lung cancer.")

# Collect inputs; GENDER is already numeric (0 = Female, 1 = Male)
gender = st.selectbox('Gender', [0, 1], format_func=lambda x: "Female" if x == 0 else "Male")
age = st.number_input('Age', min_value=0, max_value=120, value=0)
smoking = st.selectbox('Smoking', ['Yes', 'No'])
yellow_fingers = st.selectbox('Yellow Fingers', ['Yes', 'No'])
anxiety = st.selectbox('Anxiety', ['Yes', 'No'])
peer_pressure = st.selectbox('Peer Pressure', ['Yes', 'No'])
chronic_disease = st.selectbox('Chronic Disease', ['Yes', 'No'])
fatigue = st.selectbox('Fatigue', ['Yes', 'No'])
allergy = st.selectbox('Allergy', ['Yes', 'No'])
wheezing = st.selectbox('Wheezing', ['Yes', 'No'])
alcohol = st.selectbox('Alcohol Consuming', ['Yes', 'No'])
coughing = st.selectbox('Coughing', ['Yes', 'No'])
shortness_of_breath = st.selectbox('Shortness of Breath', ['Yes', 'No'])
swallowing_difficulty = st.selectbox('Swallowing Difficulty', ['Yes', 'No'])
chest_pain = st.selectbox('Chest Pain', ['Yes', 'No'])


def binary_encode(value):
    """Map a 'Yes'/'No' answer to 1/0."""
    return 1 if value == 'Yes' else 0


# Build a single-row frame whose columns match the feature names and order above
data = pd.DataFrame([[gender, age,
                      binary_encode(smoking),
                      binary_encode(yellow_fingers),
                      binary_encode(anxiety),
                      binary_encode(peer_pressure),
                      binary_encode(chronic_disease),
                      binary_encode(fatigue),
                      binary_encode(allergy),
                      binary_encode(wheezing),
                      binary_encode(alcohol),
                      binary_encode(coughing),
                      binary_encode(shortness_of_breath),
                      binary_encode(swallowing_difficulty),
                      binary_encode(chest_pain)]],
                    columns=['GENDER', 'AGE', 'SMOKING', 'YELLOW_FINGERS', 'ANXIETY',
                             'PEER_PRESSURE', 'CHRONIC DISEASE', 'FATIGUE', 'ALLERGY',
                             'WHEEZING', 'ALCOHOL CONSUMING', 'COUGHING',
                             'SHORTNESS OF BREATH', 'SWALLOWING DIFFICULTY', 'CHEST PAIN'])

if st.button('Predict'):
    prediction = model.predict(data)[0]
    # Assumes the label encoding maps YES (lung cancer) to 1
    if prediction == 1:
        st.error("⚠️ High risk of lung cancer. Please consult a doctor.")
    else:
        st.success("✅ No Lung Cancer.")