import streamlit as st
import pandas as pd

st.markdown(
    """
    <style>
    /* General page settings */
    body {
        background-color: #f0f0f5; /* Light gray background */
        color: #333333;  /* Dark text for good contrast */
        font-family: 'Arial', sans-serif; /* Clean, modern font */
    }

    /* Title and Header Styling */
    h1, h2, h3 {
        color: black;  
        font-weight: bold;
    }
    /* Style for subheaders */
    h3 {
        color: red;
        font-family: 'Roboto', sans-serif;
        font-weight: 500;
        margin-top: 20px;
    }
    .custom-subheader {
        color: #00FFFF;
        font-family: 'Roboto', sans-serif;
        font-weight: 600;
        margin-bottom: 15px;
    }
    /* Paragraph styling */
    p {
        font-family: 'Georgia', serif;
        line-height: 1.8;
        color: black;
        margin-bottom: 20px;
    }
    /* List styling with diamond bullets */
    .icon-bullet {
        list-style-type: none;
        padding-left: 20px;
    }
    .icon-bullet li {
        font-family: 'Georgia', serif;
        font-size: 1.1em;
        margin-bottom: 10px;
        color: #FFFFF0;
    }
    .icon-bullet li::before {
        content: "◆";
        padding-right: 10px;
        color: #b3b3ff;
    }
    /* Sidebar styling */
    .sidebar .sidebar-content {
        background-color: #ffffff;
        border-radius: 10px;
        padding: 15px;
    }
    .sidebar h2 {
        color: #495057;
    }
    /* Custom button style */
    .streamlit-button {
        background-color: #00FFFF;
        color: #000000;
        font-weight: bold;
    }
    </style>
    """, unsafe_allow_html=True)

st.title("📂Handling Excel files📂")
st.markdown('''
- Excel is a widely used application for organizing, storing, and analyzing data in tabular format.
- It is a spreadsheet tool that lets users work with rows, columns, and cells to manage numerical or textual data.
- Excel files are typically saved with the `.xls` or `.xlsx` extension.''')

st.header('**How to Read These Files:**')
st.subheader('''**Using Python Libraries:**''')
st.code('''import pandas as pd

# Reading an Excel file
df = pd.read_excel('file.xlsx')
print(df.head())''')
st.header('**Issues in Excel:**')
st.markdown('''
1. **File Format Issues:**
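- `.xls` is the legacy binary format and `.xlsx` is the newer XML-based format, so pandas needs a different engine for each (`xlrd` for `.xls`, `openpyxl` for `.xlsx`).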
2. **Corrupted Files:**
- Files may get corrupted during transfer or storage, making them unreadable.
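3. **Encoding Issues:**
- Data with special characters or non-`UTF-8` encoding can cause errors, most often when a sheet has been exported to CSV.''')
st.write('**Solution:**')
st.code('''# pd.read_excel() takes no encoding argument; set it when reading a CSV export
df = pd.read_csv('file.csv', encoding='utf-8')''')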
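st.markdown('''
Issues 1 and 2 above are listed without a solution. As a minimal sketch using only the standard library, a file's real format can be checked from its content rather than its extension, and a damaged `.xlsx` container can be detected before pandas tries to parse it. The helper names `detect_excel_format` and `is_readable_xlsx` are hypothetical, and the check assumes the typical workbook layout with an `xl/workbook.xml` entry.

```python
import zipfile

# Signature of the OLE2 container used by legacy .xls files
OLE2_MAGIC = bytes.fromhex("d0cf11e0a1b11ae1")


def detect_excel_format(path):
    """Return 'xlsx', 'xls', or 'unknown' based on file content."""
    with open(path, "rb") as handle:
        header = handle.read(8)
    if header == OLE2_MAGIC:
        return "xls"
    if zipfile.is_zipfile(path):
        with zipfile.ZipFile(path) as archive:
            # An .xlsx workbook is a ZIP archive that normally holds xl/workbook.xml
            if "xl/workbook.xml" in archive.namelist():
                return "xlsx"
    return "unknown"


def is_readable_xlsx(path):
    """Return True if the ZIP container opens and every member passes its CRC check."""
    try:
        with zipfile.ZipFile(path) as archive:
            return archive.testzip() is None
    except (zipfile.BadZipFile, OSError):
        return False
```

A file reported as `xls` can then be read with `engine='xlrd'`, and one reported as `xlsx` with `engine='openpyxl'`.
''')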
st.markdown('''
4. **Missing Values:**
- Cells with missing (`NaN`) values may disrupt data processing.''')
st.write('**Solution:**')
st.code('''# Option 1: replace missing values with a default
df = df.fillna(0)
# Option 2: drop rows that contain missing values
df = df.dropna()''')
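st.markdown('''
Filling or dropping every missing cell at once is rarely what is wanted. The sketch below, built on a small made-up DataFrame standing in for a sheet loaded with `pd.read_excel()`, first counts the gaps per column and then treats each column differently. The column names `name` and `score` are assumptions for illustration only.

```python
import pandas as pd

# Hypothetical data standing in for a sheet loaded with pd.read_excel()
df = pd.DataFrame({"name": ["a", "b", None], "score": [1.0, None, 3.0]})

# Count missing cells per column before deciding how to handle them
missing_per_column = df.isna().sum()

# Fill only the numeric column with a default value
filled = df.fillna({"score": 0})

# Drop only the rows that lack a name
complete = df.dropna(subset=["name"])
```

Both calls return new DataFrames, so the original `df` stays available for comparison.
''')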
st.markdown('''
5. **Large File Size:**
- Handling very large Excel files can result in memory issues.''')
st.write('**Solution:**')
st.code('''# read_excel() has no chunksize option; limit the rows and columns loaded instead
df = pd.read_excel('large_file.xlsx', nrows=10000, usecols='A:D')
print(df.head())''')
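st.markdown('''
One way to process a large workbook in batches is to stream its rows with openpyxl's read-only mode and build a DataFrame per batch. This is a sketch under assumptions: the file is an `.xlsx` workbook, openpyxl is installed, the first row of the active sheet is the header, and `iter_excel_chunks` is a hypothetical helper name.

```python
from itertools import islice

import pandas as pd
from openpyxl import load_workbook


def iter_excel_chunks(path, chunk_size=10000):
    """Yield DataFrames of at most chunk_size rows from the active sheet."""
    workbook = load_workbook(path, read_only=True)  # rows are streamed, not loaded at once
    try:
        rows = workbook.active.iter_rows(values_only=True)
        header = next(rows, None)  # first row is assumed to hold column names
        if header is None:
            return
        while True:
            batch = list(islice(rows, chunk_size))
            if not batch:
                break
            yield pd.DataFrame(batch, columns=list(header))
    finally:
        workbook.close()
```

Each chunk can be aggregated or written out before the next one is read, so only `chunk_size` rows are held as a DataFrame at a time.
''')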
st.markdown('''
6. **Multiple Sheets:**
- A workbook may contain multiple sheets, making it harder to extract the relevant data.''')
st.write('**Solution:**')
st.code('''# A list of sheets returns a dict of DataFrames keyed by sheet index
sheets = pd.read_excel('file.xlsx', sheet_name=[0, 1, 2])''')
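st.markdown('''
When the sheet positions are not known in advance, the sheet names can be listed first and only the relevant ones parsed. The sketch below uses `pd.ExcelFile`, which opens the workbook once; `load_sheets` and its `wanted` parameter are hypothetical names introduced for illustration.

```python
import pandas as pd


def load_sheets(path, wanted=None):
    """Return {sheet name: DataFrame}; wanted=None loads every sheet."""
    with pd.ExcelFile(path) as workbook:
        names = [
            name for name in workbook.sheet_names
            if wanted is None or name in wanted
        ]
        return {name: workbook.parse(name) for name in names}
```

Passing `sheet_name=None` to `pd.read_excel()` also returns every sheet as a dict, but it does not allow filtering by name before the data is parsed.
''')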