Spaces: sree4411/ML_ALGORITHMS

sree4411 committed on Apr 8, 2025

Commit bc67bfd (verified) · 1 Parent(s): 0ff8008

Update pages/Linear Regression.py

Files changed (1)

pages/Linear Regression.py +88 -128

pages/Linear Regression.py CHANGED

@@ -1,142 +1,102 @@
 import streamlit as st
-import pandas as pd
-import numpy as np
-from sklearn.model_selection import train_test_split
-from sklearn.linear_model import LinearRegression
-from sklearn.metrics import mean_squared_error, r2_score
-import matplotlib.pyplot as plt
-import seaborn as sns
-st.set_page_config(page_title="Explore Linear Regression", layout="wide")
-st.title("📈 Linear Regression Explained")
-# Tabs
-with st.sidebar:
-    st.header("📊 Data Options")
-    uploaded_file = st.file_uploader("Upload your CSV file", type=["csv"])
-    if uploaded_file is None:
-        st.warning("Using default dataset (Boston Housing dataset replacement). Upload your own for custom results.")
-if uploaded_file:
-    df = pd.read_csv(uploaded_file)
-else:
-    from sklearn.datasets import fetch_california_housing
-    data = fetch_california_housing()
-    df = pd.DataFrame(data.data, columns=data.feature_names)
-    df['target'] = data.target
-# Tabs
-tab1, tab2, tab3 = st.tabs(["📖 About Linear Regression", "⚙️ Train Model", "📈 Visualize"])
-with tab1:
-    st.title("📈 Linear Regression - Intuition & Explanation")
-    st.markdown("""
-    Linear Regression is a **supervised machine learning algorithm** used to predict a continuous target variable based on one or more input features.
-    It tries to **fit a straight line** (or hyperplane) through the data that minimizes the error between actual and predicted values.
-    """)
-    st.subheader("🔹 Simple Linear Regression Formula")
-    st.latex(r'''
-    y = \beta_0 + \beta_1 x + \epsilon
-    ''')
-    st.markdown("""
-    Where:
-    - \( y \): Predicted value
-    - \( x \): Input feature
-    - \( \beta_0 \): Intercept
-    - \( \beta_1 \): Slope of the line
-    - \( \epsilon \): Error term
-    """)
-    st.subheader("🔹 Multiple Linear Regression")
-    st.latex(r'''
-    y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_n x_n + \epsilon
-    ''')
     st.markdown("""
-    This is used when we have more than one independent variable.
     """)
-    st.subheader("🎯 Objective of Linear Regression")
-    st.markdown("To find the best-fit line by minimizing the **sum of squared errors (SSE)**.")
-    st.latex(r'''
-    SSE = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2
-    ''')
-    st.subheader("📘 Cost Function (Mean Squared Error)")
-    st.latex(r'''
-    J(\beta) = \frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2
-    ''')
     st.markdown("""
-    - The algorithm tries to find values of \( \beta \) (coefficients) that **minimize this cost function**.
     """)
-    st.subheader("📌 Assumptions of Linear Regression")
     st.markdown("""
-    - **Linearity**: Relationship between input and output is linear
-    - **Independence**: Observations are independent
-    - **Homoscedasticity**: Constant variance of errors
-    - **Normality of errors**
-    - **No multicollinearity** (for multiple regression)
     """)
-    st.subheader("💡 When to Use Linear Regression?")
     st.markdown("""
-    - To predict continuous numeric values (e.g., price, salary, marks)
-    - To analyze how inputs are related to output
-    - Easy to implement and interpret
     """)
-with tab2:
-    st.subheader("⚙️ Train Linear Regression Model")
-    target_col = st.selectbox("Select Target Variable", df.columns)
-    feature_cols = st.multiselect("Select Feature Columns", [col for col in df.columns if col != target_col])
-    if feature_cols and target_col:
-        X = df[feature_cols]
-        y = df[target_col]
-        X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
-        model = LinearRegression()
-        model.fit(X_train, y_train)
-        y_pred = model.predict(X_test)
-        st.success(f"Model Trained Successfully! ✅")
-        st.metric("R² Score", f"{r2_score(y_test, y_pred):.4f}")
-        st.metric("MSE", f"{mean_squared_error(y_test, y_pred):.4f}")
-        st.markdown("### Coefficients")
-        coef_df = pd.DataFrame({"Feature": feature_cols, "Coefficient": model.coef_})
-        st.dataframe(coef_df)
-with tab3:
-    st.subheader("📈 Actual vs Predicted Plot")
-    if feature_cols and target_col:
-        fig, ax = plt.subplots()
-        sns.scatterplot(x=y_test, y=y_pred, ax=ax)
-        ax.plot([y_test.min(), y_test.max()], [y_test.min(), y_test.max()], 'r--')
-        ax.set_xlabel("Actual")
-        ax.set_ylabel("Predicted")
-        ax.set_title("Actual vs Predicted")
-        st.pyplot(fig)
-    st.markdown("---")
-    st.markdown("### 💡 Tip:")
-    st.info("If predictions look scattered from the red line, try using non-linear models or transform your features.")

 import streamlit as st
+st.set_page_config(page_title="Linear Regression", page_icon="📈", layout="wide")
+# Page Title
+st.markdown("<h1>📈 Linear Regression</h1>", unsafe_allow_html=True)
+# Introduction
+st.markdown("### 🧠 What is Linear Regression?")
+st.markdown("""
+Linear Regression is a **supervised learning** algorithm used for predicting a **continuous output**.
+It models the relationship between one or more **independent variables (features)** and a **dependent variable (target)** by fitting a linear equation to the data.
+""")
+# How It Works
+st.markdown("### ⚙️ How Linear Regression Works")
+with st.expander("Mathematical Model"):
     st.markdown("""
+    Linear regression fits a line defined by the equation:
     """)
+    st.latex(r"y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_n x_n + \epsilon")
     st.markdown("""
+    - $\\beta_0$ is the **intercept**
+    - $\\beta_1, ..., \\beta_n$ are the **coefficients (weights)**
+    - $x_1, ..., x_n$ are the **feature values**
+    - $\\epsilon$ is the **error term**
     """)
+with st.expander("Goal of the Algorithm"):
     st.markdown("""
+    Minimize the **residual sum of squares** between actual and predicted values.
     """)
+    st.latex(r"RSS = \sum_{i=1}^n (y_i - \hat{y}_i)^2")
+with st.expander("Training Process"):
     st.markdown("""
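+    - Find optimal weights (coefficients) using:
+        - **Ordinary Least Squares (OLS)**: solved in closed form via the normal equations
+        - **Gradient Descent**: iteratively updates the weights to reduce the same squared-error cost
+    - Both approaches minimize the squared differences between actual and predicted values.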
     """)
+# Types
+st.markdown("### 📚 Types of Linear Regression")
+st.markdown("""
+- **Simple Linear Regression**: One independent variable
+- **Multiple Linear Regression**: More than one independent variable
+- **Polynomial Regression**: Non-linear relationship modeled with polynomial terms
+- **Ridge, Lasso Regression**: Regularized versions to prevent overfitting
+""")
+# Assumptions
+st.markdown("### ✅ Assumptions of Linear Regression")
+st.markdown("""
+- **Linearity**: Relationship between input and output is linear
+- **Independence**: Observations are independent
+- **Homoscedasticity**: Constant variance of residuals
+- **Normality**: Residuals are normally distributed
+- **No multicollinearity**: Independent variables aren't too correlated
+""")
+# Metrics
+st.markdown("### 📏 Evaluation Metrics")
+st.markdown("""
+- **Mean Absolute Error (MAE)**
+- **Mean Squared Error (MSE)**
+- **Root Mean Squared Error (RMSE)**
+- **R² Score (Coefficient of Determination)**
+""")
+# Visualization and Prediction Explanation
+st.markdown("### 📉 Visual Representation")
+st.markdown("""
+Linear regression fits a **line** (or hyperplane) to the data such that the **sum of squared errors** between actual and predicted values is minimized.
+""")
+# When to Use
+st.markdown("### 🎯 When to Use Linear Regression")
+st.markdown("""
+- When the relationship between input and output is **approximately linear**
+- When you want **interpretability** (coefficients show effect size)
+- When **speed** is important and data is not too large
+""")
+# Regularization
+st.markdown("### 🔒 Regularization Techniques")
+st.markdown("""
+To prevent overfitting, especially in multiple linear regression:
+- **Ridge Regression** (L2 penalty): Shrinks coefficients
+- **Lasso Regression** (L1 penalty): Can set some coefficients to 0
+- **Elastic Net**: Combines Ridge and Lasso
+""")
+# Final Note
+st.markdown("### ✅ Summary")
+st.markdown("""
+Linear Regression is:
+- Simple to implement and interpret
+- Fast and effective when the relationship between features and target is approximately linear
+- Not ideal for complex or non-linear relationships
+
+👉 Use diagnostic plots to validate model quality, and regularization to limit overfitting.
+""")
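
The page added in this commit describes the least-squares objective and lists MAE, MSE, RMSE and R², but the demo code that used to train a model was removed. As a rough illustration only, and not part of the commit, the fit and the metrics could be sketched with NumPy as below. The helper names `fit_ols`, `predict` and `regression_metrics` and the synthetic data are hypothetical, chosen for this example, and `X` is assumed to be a 2-D array of shape `(n_samples, n_features)`.

```python
import numpy as np


def fit_ols(X, y):
    """Return (intercept, coefficients) that minimize the residual sum of squares."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # Prepend a column of ones so beta_0 is estimated together with the weights.
    design = np.column_stack([np.ones(len(X)), X])
    # lstsq solves the least-squares problem without forming (X^T X)^-1 explicitly.
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    return beta[0], beta[1:]


def predict(X, intercept, coef):
    """Apply y_hat = beta_0 + X @ beta to a 2-D feature array."""
    return intercept + np.asarray(X, dtype=float) @ coef


def regression_metrics(y_true, y_pred):
    """Compute RSS and the evaluation metrics listed on the page."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    residuals = y_true - y_pred
    rss = float(np.sum(residuals ** 2))
    tss = float(np.sum((y_true - y_true.mean()) ** 2))
    mse = rss / len(y_true)
    return {
        "RSS": rss,
        "MAE": float(np.mean(np.abs(residuals))),
        "MSE": mse,
        "RMSE": mse ** 0.5,
        "R2": 1.0 - rss / tss,
    }


# Synthetic data (an assumption for this sketch): y = 1 + 2*x1 - 3*x2 + small noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = 1.0 + 2.0 * X[:, 0] - 3.0 * X[:, 1] + rng.normal(scale=0.1, size=200)

intercept, coef = fit_ols(X, y)
metrics = regression_metrics(y, predict(X, intercept, coef))
print(intercept, coef, metrics)
```

With low-noise data like this, the recovered intercept and coefficients should land close to the values used to generate `y`. On real data the metrics would normally be computed on a held-out test split rather than on the training set, as the removed `train_test_split` code did.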