trohith89 committed on
Commit afc36dd · verified · 1 Parent(s): 3d738d0

Update pages/3_EDA_and_Feature_Engineering.py

pages/3_EDA_and_Feature_Engineering.py CHANGED
@@ -48,57 +48,13 @@ st.markdown(
 # Title of the Streamlit app
 st.title("Exploratory Data Analysis (EDA) on Agoda Hotel Dataset")
 
-# Introduction and Aim
-st.header("Aim of the EDA")
-st.write("""
-The main objective of this EDA is to analyze Agoda's hotel dataset to identify key factors influencing hotel pricing strategies and customer booking preferences.
-The analysis will focus on uncovering patterns, trends, and relationships in hotel ratings, pricing structures, discounts, and free services.
-By leveraging these insights, Agoda can optimize its pricing strategy, predict booking preferences, and enhance revenue generation while maintaining customer satisfaction.
+st.markdown("""
+This page provides advanced Exploratory Data Analysis (EDA) and Feature Engineering using the dataset loaded in memory.
+---
 """)
 
-# Description of the Data
-st.header("Description of the Data")
-st.write("""
-**Overall Summary:** We are analyzing the Agoda dataset by performing EDA and Statistical Tests on the data that has already been cleaned through data wrangling to address any messiness or missing information.
-
-**Table - Agoda_df:** The cleaned dataset consists of over 3,500 hotel listings, which will be used as test subjects for the hotel pricing period.
-**Dataset Details:**
-The dataset contains information about 3,219 hotel room listings with 12 features, each detailing aspects of the listing. Below is the description of each column:
-| Column Name | Description |
-|-----------------|---------------------------------------------------------------------------|
-| hotel_name | Name of the hotel. |
-| rating | Average customer rating of the hotel (float, range 1-5). |
-| location | Address or locality of the hotel. |
-| review_text | Customer feedback or comments about the hotel. |
-| reviews | Total number of customer reviews for the hotel. |
-| cashback | Cashback amount offered for the booking. |
-| discount | Discount percentage applied to the room price. |
-| free_services | Free services provided (e.g., breakfast, Wi-Fi). |
-| cancellation | Cancellation policy for the booking (e.g., free, non-refundable). |
-| price | Price of the room after discounts and cashback (float). |
-| state | The state where the hotel is located. |
-| category | Target variable representing the room type or category (e.g., budget, luxury). |
-""")
-
-# Table-wise EDA & Necessary Tests
-st.header("Table-wise EDA and Necessary Statistical Tests")
-st.write("""
-**Agoda_df:** Cleaned dataset with hotel details and key features like ratings, price, reviews, cashback, discounts, and free services.
-The EDA will involve the following steps:
-- **Summary Statistics:** Analyze the central tendency, spread, and shape of the distribution of each feature.
-- **Data Distribution:** Visualize the distribution of key features like price, ratings, reviews, cashback, etc.
-- **Correlation Analysis:** Analyze relationships between numeric features like price, ratings, reviews, cashback, etc.
-- **Categorical Data Analysis:** Explore categorical variables like hotel category, cancellation policy, state, and location using frequency tables and visualizations.
-- **Missing Value Analysis:** Ensure no missing values remain, and check the need for imputations.
-- **Outlier Detection:** Identify any outliers that may skew the analysis or predictions.
-- **Statistical Tests:** Apply appropriate statistical tests to identify significant differences or relationships (e.g., t-tests for comparing means, chi-squared for categorical variables).
-""")
-
-# Placeholder for further detailed code or visualizations
-st.write("Further steps will include generating visualizations and statistical tests to explore relationships between features in more detail.")
-
 # Access dataset from session state
-data= st.session_state.get("dataset")
+df= st.session_state.get("dataset")
 
 if data is not None:
     df = st.session_state['df']
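
Note on the last changed line: the new side assigns the session-state value to `df`, while the unchanged context line still tests `data`. Unless `data` is defined elsewhere in the file (not visible in this hunk), that check would raise `NameError`. Below is a minimal sketch of a guard that uses one name throughout. The helper `load_dataset` and the placeholder row are hypothetical and not part of the commit. A plain dict stands in for `st.session_state` so the snippet runs without Streamlit installed.

```python
from collections.abc import Mapping
from typing import Any, Optional


def load_dataset(session_state: Mapping[str, Any], key: str = "dataset") -> Optional[Any]:
    """Return the object stored under `key`, or None if the key is absent.

    Hypothetical helper for illustration; in the page itself the argument
    would be `st.session_state`, which supports the same `.get` call.
    """
    return session_state.get(key)


# Stand-in for st.session_state; the row below is made-up placeholder data.
fake_state = {"dataset": [{"hotel_name": "Example Inn", "price": 1200.0}]}

# Use the same name for the assignment and the None check.
df = load_dataset(fake_state)
if df is not None:
    status = f"loaded {len(df)} row(s)"
else:
    status = "no dataset in session state"
print(status)
```

In the Streamlit page, the equivalent would be to test `df is not None` directly after `df = st.session_state.get("dataset")`, assuming the `"dataset"` key is the intended source.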