AndrewMaru committed · Commit c2bde20 · verified · 1 Parent(s): 32a0028

Update README.md

  1. README.md +99 -0
README.md CHANGED
@@ -5,9 +5,108 @@ tags:
  - classification
  - healthcare
  - lung-cancer
+ - streamlit
  library_name: scikit-learn
  model_name: Datathon Lung Cancer Detector
  datasets:
  - custom
  language: en
  ---
+
+ # 🫁 Datathon Lung Cancer Detector
+
+ This model predicts whether a patient is likely to have lung cancer based on clinical and behavioral risk factors.
+ It was trained on a dataset of 309 entries with 15 input features and a binary diagnosis label.
+
+ ---
+
+ ## 📊 Input Features
+
+ | Feature                 | Type                 | Description                  |
+ |-------------------------|----------------------|------------------------------|
+ | `GENDER`                | 0 = Female, 1 = Male | Biological sex               |
+ | `AGE`                   | Integer              | Patient age in years         |
+ | `SMOKING`               | 0 = No, 1 = Yes      | Smoking habit                |
+ | `YELLOW_FINGERS`        | 0 = No, 1 = Yes      | Stained fingers from smoking |
+ | `ANXIETY`               | 0 = No, 1 = Yes      | Anxiety symptoms             |
+ | `PEER_PRESSURE`         | 0 = No, 1 = Yes      | Influence from peers         |
+ | `CHRONIC DISEASE`       | 0 = No, 1 = Yes      | History of chronic illness   |
+ | `FATIGUE`               | 0 = No, 1 = Yes      | Feeling of tiredness         |
+ | `ALLERGY`               | 0 = No, 1 = Yes      | Known allergies              |
+ | `WHEEZING`              | 0 = No, 1 = Yes      | Wheezing symptoms            |
+ | `ALCOHOL CONSUMING`     | 0 = No, 1 = Yes      | Alcohol consumption          |
+ | `COUGHING`              | 0 = No, 1 = Yes      | Persistent coughing          |
+ | `SHORTNESS OF BREATH`   | 0 = No, 1 = Yes      | Difficulty breathing         |
+ | `SWALLOWING DIFFICULTY` | 0 = No, 1 = Yes      | Trouble swallowing           |
+ | `CHEST PAIN`            | 0 = No, 1 = Yes      | Pain in chest area           |
+
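The feature table can be read as a small input schema. The following is a minimal sketch, not part of the committed README, of how one already-encoded patient record might be checked and put into column order before it reaches the model. The helper name `make_feature_row`, the `MAX_AGE` bound, and the assumption that the table order matches the training order are all illustrative.

```python
# Feature names copied from the table above; the ordering is an assumption.
FEATURE_COLUMNS = [
    "GENDER", "AGE", "SMOKING", "YELLOW_FINGERS", "ANXIETY",
    "PEER_PRESSURE", "CHRONIC DISEASE", "FATIGUE", "ALLERGY",
    "WHEEZING", "ALCOHOL CONSUMING", "COUGHING",
    "SHORTNESS OF BREATH", "SWALLOWING DIFFICULTY", "CHEST PAIN",
]
MAX_AGE = 120  # assumed upper bound, chosen for illustration


def make_feature_row(record):
    """Hypothetical helper: validate one encoded record, return values in column order."""
    missing = [name for name in FEATURE_COLUMNS if name not in record]
    if missing:
        raise ValueError(f"missing features: {missing}")
    row = []
    for name in FEATURE_COLUMNS:
        value = record[name]
        if name == "AGE":
            # AGE is the only non-binary feature in the table.
            if not isinstance(value, int) or not 0 <= value <= MAX_AGE:
                raise ValueError(f"AGE must be an integer in [0, {MAX_AGE}]")
        elif value not in (0, 1):
            raise ValueError(f"{name} must be 0 or 1")
        row.append(value)
    return row


# Start from an all-zero record and set a few fields.
example = dict.fromkeys(FEATURE_COLUMNS, 0)
example.update({"GENDER": 1, "AGE": 62, "SMOKING": 1, "COUGHING": 1})
row = make_feature_row(example)
```

The resulting list could be wrapped in a single-row `pandas.DataFrame` with `columns=FEATURE_COLUMNS` before prediction.
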
+ ---
+
+ ## 🧠 Model Info
+
+ - **Algorithm**: XGBoost classifier (highest-scoring model)
+ - **Framework**: scikit-learn
+ - **Target**: `DIAGNOSIS_LUNG_CANCER` (`YES` = Lung Cancer, `NO` = No Cancer)
+ - **Dataset Size**: 309 samples
+ - **Preprocessing**: Label encoding; yes/no inputs are binary-encoded
+
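The preprocessing step listed above can be sketched in plain Python. This is an illustration only, not the author's training code: the raw value spellings (`"M"`/`"F"`, `"YES"`/`"NO"`) and the helper name `encode_record` are assumptions. The mapping itself follows the feature table (0 = Female, 1 = Male; 0 = No, 1 = Yes), which is also what an alphabetically sorted label encoder would produce for these labels.

```python
# Assumed raw spellings; the encoded values follow the feature table.
GENDER_MAP = {"F": 0, "FEMALE": 0, "M": 1, "MALE": 1}
YES_NO_MAP = {"NO": 0, "YES": 1}


def encode_record(raw):
    """Hypothetical helper: turn one raw record into the numeric encoding."""
    encoded = {}
    for key, value in raw.items():
        if key == "AGE":
            encoded[key] = int(value)
        elif key == "GENDER":
            encoded[key] = GENDER_MAP[str(value).strip().upper()]
        else:
            # Every remaining field, including the target, is a yes/no value.
            encoded[key] = YES_NO_MAP[str(value).strip().upper()]
    return encoded


raw_record = {
    "GENDER": "M",
    "AGE": "62",
    "SMOKING": "Yes",
    "COUGHING": "no",
    "DIAGNOSIS_LUNG_CANCER": "YES",
}
encoded_record = encode_record(raw_record)
```

An unknown spelling raises `KeyError`, which surfaces bad rows early instead of silently encoding them as 0.
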
+ ---
+
+ ## 🚀 Try It in Streamlit
+
+ This model is also available as a web app built with Streamlit: https://datathonlungcancer-fazneuznw5uwkemskdn9kn.streamlit.app/
+
+ ```python
+ import streamlit as st
+ import pandas as pd
+ import joblib
+
+ # Load the trained classifier saved with joblib.
+ model = joblib.load('model.pkl')
+
+ st.title('🫁 Lung Cancer Diagnosis')
+ st.write("Please fill out the following information to assess the likelihood of lung cancer.")
+
+ # Collect the 15 input features from the user.
+ gender = st.selectbox('Gender', [0, 1], format_func=lambda x: "Female" if x == 0 else "Male")
+ age = st.number_input('Age', min_value=0, max_value=120, value=0)
+ smoking = st.selectbox('Smoking', ['Yes', 'No'])
+ yellow_fingers = st.selectbox('Yellow Fingers', ['Yes', 'No'])
+ anxiety = st.selectbox('Anxiety', ['Yes', 'No'])
+ peer_pressure = st.selectbox('Peer Pressure', ['Yes', 'No'])
+ chronic_disease = st.selectbox('Chronic Disease', ['Yes', 'No'])
+ fatigue = st.selectbox('Fatigue', ['Yes', 'No'])
+ allergy = st.selectbox('Allergy', ['Yes', 'No'])
+ wheezing = st.selectbox('Wheezing', ['Yes', 'No'])
+ alcohol = st.selectbox('Alcohol Consuming', ['Yes', 'No'])
+ coughing = st.selectbox('Coughing', ['Yes', 'No'])
+ shortness_of_breath = st.selectbox('Shortness of Breath', ['Yes', 'No'])
+ swallowing_difficulty = st.selectbox('Swallowing Difficulty', ['Yes', 'No'])
+ chest_pain = st.selectbox('Chest Pain', ['Yes', 'No'])
+
+ def binary_encode(value):
+     """Map a 'Yes'/'No' selection to 1/0."""
+     return 1 if value == 'Yes' else 0
+
+ # Assemble one row with the 15 feature columns.
+ data = pd.DataFrame([[gender, age,
+                       binary_encode(smoking),
+                       binary_encode(yellow_fingers),
+                       binary_encode(anxiety),
+                       binary_encode(peer_pressure),
+                       binary_encode(chronic_disease),
+                       binary_encode(fatigue),
+                       binary_encode(allergy),
+                       binary_encode(wheezing),
+                       binary_encode(alcohol),
+                       binary_encode(coughing),
+                       binary_encode(shortness_of_breath),
+                       binary_encode(swallowing_difficulty),
+                       binary_encode(chest_pain)]],
+                     columns=['GENDER', 'AGE', 'SMOKING', 'YELLOW_FINGERS', 'ANXIETY',
+                              'PEER_PRESSURE', 'CHRONIC DISEASE', 'FATIGUE', 'ALLERGY',
+                              'WHEEZING', 'ALCOHOL CONSUMING', 'COUGHING',
+                              'SHORTNESS OF BREATH', 'SWALLOWING DIFFICULTY', 'CHEST PAIN'])
+
+ # Run the model only when the user clicks the button.
+ if st.button('Predict'):
+     prediction = model.predict(data)[0]
+     if prediction == 1:
+         st.error("⚠️ High risk of lung cancer. Please consult a doctor.")
+     else:
+         st.success("✅ No Lung Cancer.")
+ ```