Ramyamaheswari committed on
Commit
1a25497
·
verified ·
1 Parent(s): 9390be0

Create app.py

Files changed (1)
  1. app.py +87 -0
app.py ADDED
@@ -0,0 +1,87 @@
+ import streamlit as st
+ import pandas as pd
+ from sklearn.datasets import load_iris
+ from sklearn.model_selection import train_test_split
+ from sklearn.tree import DecisionTreeClassifier, plot_tree
+ from sklearn.preprocessing import StandardScaler
+ from sklearn.metrics import classification_report, accuracy_score
+ import matplotlib.pyplot as plt
+
+ st.set_page_config(page_title="Explore Decision Tree Algorithm", layout="wide")
+ st.title("🌳 Decision Tree Classifier Demystified")
+
+ st.markdown("""
+ ## 🧠 What is a Decision Tree?
+
+ A Decision Tree is a flowchart-like tree structure where each internal node represents a test on a feature,
+ each branch represents an outcome of that test, and each leaf node represents a class label.
+
+ Think of it like *20 Questions*, but for data.
+
+ ---
+ ## ⚙️ How Decision Trees Work
+
+ 1. Split the dataset based on feature values.
+ 2. Choose the best feature using criteria like **Gini Index**, **Entropy**, or **Information Gain**.
+ 3. Repeat recursively until leaf nodes are pure or max depth is reached.
+
+ Decision Trees are:
+ - Easy to understand and interpret
+ - Able to handle both numerical and categorical data
+ - Prone to overfitting if not pruned
+
+ ---
+ """)
+
+ st.subheader("🌼 Try a Decision Tree on the Iris Dataset")
+
+ iris = load_iris()
+ df = pd.DataFrame(iris.data, columns=iris.feature_names)
+ df['target'] = iris.target
+
+ st.dataframe(df.head(), use_container_width=True)
+
+ criterion = st.radio("Select the Splitting Criterion", ["gini", "entropy"])
+ max_depth = st.slider("Select Max Depth of Tree", 1, 10, value=3)
+
+ X = df.drop('target', axis=1)
+ y = df['target']
+
+ # Standardize features (not required for trees, which are scale-invariant; kept for demonstration)
+ scaler = StandardScaler()
+ X_scaled = scaler.fit_transform(X)
+
+ X_train, X_test, y_train, y_test = train_test_split(X_scaled, y, test_size=0.2, random_state=42)
+
+ model = DecisionTreeClassifier(criterion=criterion, max_depth=max_depth, random_state=42)
+ model.fit(X_train, y_train)
+ y_pred = model.predict(X_test)
+
+ acc = accuracy_score(y_test, y_pred)
+ st.success(f"✅ Model Accuracy: {acc*100:.2f}%")
+
+ st.markdown("### 📊 Classification Report")
+ st.text(classification_report(y_test, y_pred, target_names=iris.target_names))
+
+ st.markdown("### 🌳 Visualizing the Decision Tree")
+ fig, ax = plt.subplots(figsize=(10, 6))
+ plot_tree(model, filled=True, feature_names=iris.feature_names, class_names=iris.target_names, fontsize=10, ax=ax)
+ st.pyplot(fig)
+
+ st.markdown("""
+ ---
+ ## 💡 Highlights of Decision Trees
+ - Visual and easy to explain.
+ - No need for feature scaling.
+ - Can model non-linear relationships.
+ - Can easily overfit: use pruning or set max depth.
+
+ ## 🔧 When to Use Decision Trees?
+ Use them when:
+ - You need a quick, explainable model.
+ - Feature relationships are non-linear.
+ - Interpretability is more important than raw performance.
+
+ ---
+ 🎯 *Tip:* Watch out for overfitting. Decision Trees love to memorize the training data if left unchecked.
+ """)