pilotstuki Claude committed
Commit ddfcf3e · 1 Parent(s): 002262c

Add Hugging Face Spaces deployment configuration


- Add Dockerfile for Docker-based deployment on HF Spaces
- Add .dockerignore to optimize Docker build
- Update README.md with HF Space metadata (YAML frontmatter)
- Keep README_GITHUB.md for detailed GitHub documentation
- Configure app to run on port 7860 (HF Spaces standard)

Ready for deployment to: https://huggingface.co/spaces/pilotstuki/odatalogparser

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

Files changed (4)
  1. .dockerignore +42 -0
  2. Dockerfile +33 -0
  3. README.md +38 -182
  4. README_GITHUB.md +208 -0
.dockerignore ADDED
@@ -0,0 +1,42 @@
+ # Git
+ .git
+ .gitignore
+
+ # Python cache
+ __pycache__/
+ *.py[cod]
+ *$py.class
+ *.so
+
+ # Virtual environments
+ env/
+ venv/
+ ENV/
+
+ # Log files (sample data - users upload their own)
+ *.log
+
+ # PDF files (sample reports)
+ *.pdf
+
+ # IDE
+ .vscode/
+ .idea/
+ *.swp
+ *.swo
+ *~
+ .claude/
+
+ # OS
+ .DS_Store
+ Thumbs.db
+
+ # Test files
+ test_*.py
+ *_test.py
+
+ # Scripts
+ run.sh
+
+ # Documentation (GitHub-specific)
+ README_GITHUB.md
Dockerfile ADDED
@@ -0,0 +1,33 @@
+ # Dockerfile for Hugging Face Spaces - Streamlit App
+ FROM python:3.10-slim
+
+ # Set working directory
+ WORKDIR /app
+
+ # Install system dependencies
+ RUN apt-get update && apt-get install -y \
+     build-essential \
+     curl \
+     && rm -rf /var/lib/apt/lists/*
+
+ # Copy requirements file
+ COPY requirements.txt .
+
+ # Install Python dependencies
+ RUN pip install --no-cache-dir -r requirements.txt
+
+ # Copy application files
+ COPY app.py .
+ COPY log_parser.py .
+
+ # Expose port 7860 (Hugging Face Spaces default)
+ EXPOSE 7860
+
+ # Set environment variables for Streamlit
+ ENV STREAMLIT_SERVER_PORT=7860
+ ENV STREAMLIT_SERVER_ADDRESS=0.0.0.0
+ ENV STREAMLIT_SERVER_HEADLESS=true
+ ENV STREAMLIT_BROWSER_GATHER_USAGE_STATS=false
+
+ # Run the application
+ CMD ["streamlit", "run", "app.py", "--server.port=7860", "--server.address=0.0.0.0"]
README.md CHANGED
@@ -1,208 +1,64 @@
  # IIS Log Performance Analyzer

  High-performance web application for analyzing large IIS log files (200MB-1GB+). Built with Streamlit and Polars for fast, efficient processing.

  **GitHub Repository**: [https://github.com/pilot-stuk/odata_log_parser](https://github.com/pilot-stuk/odata_log_parser)

- **Live Demo**: Deploy on [Streamlit Cloud](https://streamlit.io/cloud)
-
  ## Features

- - **Fast Processing**: Uses Polars library for 10-100x faster parsing compared to pandas
- - **Large File Support**: Efficiently handles files up to 1GB+
- - **Comprehensive Metrics**:
-   - Total requests (before/after filtering)
-   - Error rates and breakdown by status code
-   - Response time statistics (min/max/avg)
-   - Slow request detection (configurable threshold)
-   - Peak RPS (Requests Per Second) with timestamp
-   - Top methods by request count and response time
- - **Multi-File Analysis**: Upload and compare multiple log files side-by-side
- - **Interactive Visualizations**: Charts and graphs using Plotly
- - **Smart Filtering**: Automatically excludes monitoring requests (Zabbix HEAD) and 401 unauthorized
-
- ## Requirements
-
- - Python 3.8+
- - See `requirements.txt` for package dependencies
-
- ## Installation
-
- ### Local Installation
-
- 1. Clone the repository:
- ```bash
- git clone https://github.com/pilot-stuk/odata_log_parser.git
- cd odata_log_parser
- ```
-
- 2. Install dependencies:
- ```bash
- pip install -r requirements.txt
- ```
-
- ### Deploy to Streamlit Cloud
-
- 1. Fork or clone this repository to your GitHub account
- 2. Go to [share.streamlit.io](https://share.streamlit.io/)
- 3. Sign in with your GitHub account
- 4. Click "New app"
- 5. Select your repository: `pilot-stuk/odata_log_parser`
- 6. Set the main file path: `app.py`
- 7. Click "Deploy"
-
- The app will be live at: `https://share.streamlit.io/pilot-stuki/odata_log_parser/main/app.py`

- ## Usage

- ### Run the Streamlit App
-
- ```bash
- streamlit run app.py
- ```
-
- The application will open in your browser at `http://localhost:8501`
-
- ### Upload Log Files
-
- 1. Click "Browse files" in the sidebar
- 2. Select one or more IIS log files (.log or .txt)
- 3. View the analysis results
-
- ### Configuration Options
-
- - **Upload Mode**: Single or Multiple files
- - **Top N Methods**: Number of top methods to display (3-20)
- - **Slow Request Threshold**: Configure what constitutes a "slow" request (default: 3000ms)

  ## Log Format

- This tool supports **IIS W3C Extended Log Format** with the following fields:
-
- ```
- date time s-ip cs-method cs-uri-stem cs-uri-query s-port cs-username c-ip
- cs(User-Agent) cs(Referer) sc-status sc-substatus sc-win32-status time-taken
- ```
-
- Example log line:
  ```
- 2025-09-22 00:00:46 10.21.31.42 GET /Service/Contact/Get sessionid='xxx' 443 - 212.233.92.232 - - 200 0 0 24
  ```

  ## Filtering Rules

- The analyzer applies the following filters automatically:
-
- 1. **Monitoring Exclusion**: Lines containing both `HEAD` method and `Zabbix` are excluded
- 2. **401 Handling**: 401 Unauthorized responses are excluded from error counts (considered authentication attempts, not system errors)
- 3. **Error Definition**: Errors are HTTP status codes ≠ 200 and ≠ 401

- ## Metrics Explained

- | Metric | Description |
- |--------|-------------|
- | **Total Requests (before filtering)** | Raw number of log entries |
- | **Excluded Requests** | Lines filtered out (HEAD+Zabbix + 401) |
- | **Processed Requests** | Valid requests included in analysis |
- | **Errors** | Requests with status ≠ 200 and ≠ 401 |
- | **Slow Requests** | Requests exceeding threshold (default: 3000ms) |
- | **Peak RPS** | Maximum requests per second observed |
- | **Avg/Max/Min Response Time** | Response time statistics in milliseconds |

  ## Performance

- - **Small files** (<50MB): Process in seconds
- - **Medium files** (50-200MB): Process in 10-30 seconds
- - **Large files** (200MB-1GB): Process in 1-3 minutes
-
- Performance depends on:
- - File size
- - Number of log entries
- - System CPU and RAM
- - Disk I/O speed
-
- ## Architecture
-
- ```
- app.py           # Streamlit UI application
- log_parser.py    # Core parsing and analysis logic using Polars
- requirements.txt # Python dependencies
- README.md        # This file
- ```
-
- ### Key Components
-
- - **IISLogParser**: Parses IIS W3C log format into Polars DataFrame
- - **LogAnalyzer**: Calculates metrics and statistics
- - **Streamlit UI**: Interactive web interface with visualizations
-
- ## Use Cases
-
- - **Performance Analysis**: Identify slow endpoints and response time patterns
- - **Error Investigation**: Track error rates and problematic methods
- - **Capacity Planning**: Analyze peak load and RPS patterns
- - **Service Comparison**: Compare performance across multiple services
- - **Incident Review**: Analyze logs from specific time periods
-
- ## Troubleshooting
-
- ### Large File Upload Issues
-
- If Streamlit has trouble with very large files (>500MB):
-
- 1. Increase Streamlit's upload size limit:
- ```bash
- streamlit run app.py --server.maxUploadSize=1024
- ```
-
- 2. Or modify `.streamlit/config.toml`:
- ```toml
- [server]
- maxUploadSize = 1024
- ```
-
- ### Memory Issues
-
- For files >1GB, you may need to:
- - Increase available system memory
- - Process files in smaller chunks
- - Use the CLI version (can be developed if needed)
-
- ### Performance Tips
-
- - Close other memory-intensive applications
- - Process files one at a time for very large files
- - Use SSD for faster I/O
- - Ensure adequate RAM (8GB+ recommended for 1GB files)
-
- ## Future Enhancements
-
- Potential features for future versions:
- - CLI tool for batch processing
- - Export results to PDF/Excel
- - Real-time log monitoring
- - Custom metric definitions
- - Time range filtering
- - IP address analysis
- - Session tracking
-
- ## Example Output
-
- The application generates:
-
- 1. **Summary Table**: Key metrics for each log file
- 2. **Top Methods Chart**: Most frequently called endpoints
- 3. **Response Time Distribution**: Histogram of response times
- 4. **Error Breakdown**: Pie chart of error types
- 5. **Service Comparison**: Side-by-side comparison for multiple files

  ## License

- This tool is provided as-is for log analysis purposes.
-
- ## Support
-
- For issues or questions:
- 1. Check log file format matches IIS W3C Extended format
- 2. Verify all required fields are present
- 3. Ensure Python and dependencies are correctly installed
+ ---
+ title: IIS Log Performance Analyzer
+ emoji: 📊
+ colorFrom: blue
+ colorTo: purple
+ sdk: docker
+ pinned: false
+ license: mit
+ app_port: 7860
+ ---
+
  # IIS Log Performance Analyzer

  High-performance web application for analyzing large IIS log files (200MB-1GB+). Built with Streamlit and Polars for fast, efficient processing.

  **GitHub Repository**: [https://github.com/pilot-stuk/odata_log_parser](https://github.com/pilot-stuk/odata_log_parser)

  ## Features

+ - **Fast Processing**: Uses Polars library for 10-100x faster parsing compared to pandas
+ - 📦 **Large File Support**: Efficiently handles files up to 1GB+
+ - 📊 **Comprehensive Metrics**: RPS, response times, error rates, and more
+ - 🔍 **Detailed Analysis**: Top methods, error breakdown, time distribution
+ - 📈 **Visual Reports**: Interactive charts with Plotly
+ - 🔄 **Multi-file Support**: Compare multiple services side-by-side

+ ## How to Use

+ 1. Upload one or more IIS log files (W3C Extended format)
+ 2. View comprehensive performance metrics
+ 3. Analyze errors, slow requests, and response time distribution
+ 4. Compare multiple services side-by-side

  ## Log Format

+ Supports **IIS W3C Extended Log Format** with fields:
  ```
+ date time s-ip cs-method cs-uri-stem cs-uri-query s-port cs-username
+ c-ip cs(User-Agent) cs(Referer) sc-status sc-substatus sc-win32-status time-taken
  ```

  ## Filtering Rules

+ - Excludes monitoring requests (HEAD + Zabbix)
+ - 401 Unauthorized responses excluded from error counts
+ - Errors defined as status codes ≠ 200 and ≠ 401
+ - Slow requests: response time > 3000ms (configurable)

+ ## Technology Stack

+ - **Frontend**: Streamlit
+ - **Data Processing**: Polars
+ - **Visualizations**: Plotly
+ - **Deployment**: Docker on Hugging Face Spaces

  ## Performance

+ - Small files (<50MB): Process in seconds
+ - Medium files (50-200MB): Process in 10-30 seconds
+ - Large files (200MB-1GB): Process in 1-3 minutes

  ## License

+ MIT License - See GitHub repository for details
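
The filtering rules in the new README can be sketched in plain Python (the app itself uses Polars; the field names follow the W3C columns, and this is an illustrative sketch, not the repository's actual `log_parser.py`):

```python
# Sketch of the README's filtering rules:
#  - drop HEAD requests from Zabbix monitoring,
#  - drop 401s (auth attempts, not errors),
#  - errors are any remaining status != 200 and != 401,
#  - slow requests exceed a configurable time-taken threshold (ms).

def is_excluded(entry: dict) -> bool:
    """Monitoring noise (HEAD + Zabbix) and 401 authentication attempts."""
    if entry["cs-method"] == "HEAD" and "Zabbix" in entry.get("cs(User-Agent)", ""):
        return True
    return entry["sc-status"] == 401

def is_error(entry: dict) -> bool:
    """Errors are HTTP status codes other than 200 and 401."""
    return entry["sc-status"] not in (200, 401)

def is_slow(entry: dict, threshold_ms: int = 3000) -> bool:
    """Slow requests exceed the threshold (default 3000 ms)."""
    return entry["time-taken"] > threshold_ms

entries = [
    {"cs-method": "GET",  "cs(User-Agent)": "-",      "sc-status": 200, "time-taken": 24},
    {"cs-method": "HEAD", "cs(User-Agent)": "Zabbix", "sc-status": 200, "time-taken": 1},
    {"cs-method": "GET",  "cs(User-Agent)": "-",      "sc-status": 500, "time-taken": 4200},
    {"cs-method": "POST", "cs(User-Agent)": "-",      "sc-status": 401, "time-taken": 10},
]
processed = [e for e in entries if not is_excluded(e)]
errors = [e for e in processed if is_error(e)]
slow = [e for e in processed if is_slow(e)]
print(len(processed), len(errors), len(slow))  # 2 1 1
```

In Polars the same predicates would be vectorized column expressions rather than per-row checks, which is where the README's 10-100x speedup over pandas comes from.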
README_GITHUB.md ADDED
@@ -0,0 +1,208 @@
+ # IIS Log Performance Analyzer
+
+ High-performance web application for analyzing large IIS log files (200MB-1GB+). Built with Streamlit and Polars for fast, efficient processing.
+
+ **GitHub Repository**: [https://github.com/pilot-stuk/odata_log_parser](https://github.com/pilot-stuk/odata_log_parser)
+
+ **Live Demo**: Deploy on [Streamlit Cloud](https://streamlit.io/cloud)
+
+ ## Features
+
+ - **Fast Processing**: Uses Polars library for 10-100x faster parsing compared to pandas
+ - **Large File Support**: Efficiently handles files up to 1GB+
+ - **Comprehensive Metrics**:
+   - Total requests (before/after filtering)
+   - Error rates and breakdown by status code
+   - Response time statistics (min/max/avg)
+   - Slow request detection (configurable threshold)
+   - Peak RPS (Requests Per Second) with timestamp
+   - Top methods by request count and response time
+ - **Multi-File Analysis**: Upload and compare multiple log files side-by-side
+ - **Interactive Visualizations**: Charts and graphs using Plotly
+ - **Smart Filtering**: Automatically excludes monitoring requests (Zabbix HEAD) and 401 unauthorized
+
+ ## Requirements
+
+ - Python 3.8+
+ - See `requirements.txt` for package dependencies
+
+ ## Installation
+
+ ### Local Installation
+
+ 1. Clone the repository:
+ ```bash
+ git clone https://github.com/pilot-stuk/odata_log_parser.git
+ cd odata_log_parser
+ ```
+
+ 2. Install dependencies:
+ ```bash
+ pip install -r requirements.txt
+ ```
+
+ ### Deploy to Streamlit Cloud
+
+ 1. Fork or clone this repository to your GitHub account
+ 2. Go to [share.streamlit.io](https://share.streamlit.io/)
+ 3. Sign in with your GitHub account
+ 4. Click "New app"
+ 5. Select your repository: `pilot-stuk/odata_log_parser`
+ 6. Set the main file path: `app.py`
+ 7. Click "Deploy"
+
+ The app will be live at: `https://share.streamlit.io/pilot-stuki/odata_log_parser/main/app.py`
+
+ ## Usage
+
+ ### Run the Streamlit App
+
+ ```bash
+ streamlit run app.py
+ ```
+
+ The application will open in your browser at `http://localhost:8501`
+
+ ### Upload Log Files
+
+ 1. Click "Browse files" in the sidebar
+ 2. Select one or more IIS log files (.log or .txt)
+ 3. View the analysis results
+
+ ### Configuration Options
+
+ - **Upload Mode**: Single or Multiple files
+ - **Top N Methods**: Number of top methods to display (3-20)
+ - **Slow Request Threshold**: Configure what constitutes a "slow" request (default: 3000ms)
+
+ ## Log Format
+
+ This tool supports **IIS W3C Extended Log Format** with the following fields:
+
+ ```
+ date time s-ip cs-method cs-uri-stem cs-uri-query s-port cs-username c-ip
+ cs(User-Agent) cs(Referer) sc-status sc-substatus sc-win32-status time-taken
+ ```
+
+ Example log line:
+ ```
+ 2025-09-22 00:00:46 10.21.31.42 GET /Service/Contact/Get sessionid='xxx' 443 - 212.233.92.232 - - 200 0 0 24
+ ```
+
+ ## Filtering Rules
+
+ The analyzer applies the following filters automatically:
+
+ 1. **Monitoring Exclusion**: Lines containing both `HEAD` method and `Zabbix` are excluded
+ 2. **401 Handling**: 401 Unauthorized responses are excluded from error counts (considered authentication attempts, not system errors)
+ 3. **Error Definition**: Errors are HTTP status codes ≠ 200 and ≠ 401
+
+ ## Metrics Explained
+
+ | Metric | Description |
+ |--------|-------------|
+ | **Total Requests (before filtering)** | Raw number of log entries |
+ | **Excluded Requests** | Lines filtered out (HEAD+Zabbix + 401) |
+ | **Processed Requests** | Valid requests included in analysis |
+ | **Errors** | Requests with status ≠ 200 and ≠ 401 |
+ | **Slow Requests** | Requests exceeding threshold (default: 3000ms) |
+ | **Peak RPS** | Maximum requests per second observed |
+ | **Avg/Max/Min Response Time** | Response time statistics in milliseconds |
+
+ ## Performance
+
+ - **Small files** (<50MB): Process in seconds
+ - **Medium files** (50-200MB): Process in 10-30 seconds
+ - **Large files** (200MB-1GB): Process in 1-3 minutes
+
+ Performance depends on:
+ - File size
+ - Number of log entries
+ - System CPU and RAM
+ - Disk I/O speed
+
+ ## Architecture
+
+ ```
+ app.py           # Streamlit UI application
+ log_parser.py    # Core parsing and analysis logic using Polars
+ requirements.txt # Python dependencies
+ README.md        # This file
+ ```
+
+ ### Key Components
+
+ - **IISLogParser**: Parses IIS W3C log format into Polars DataFrame
+ - **LogAnalyzer**: Calculates metrics and statistics
+ - **Streamlit UI**: Interactive web interface with visualizations
+
+ ## Use Cases
+
+ - **Performance Analysis**: Identify slow endpoints and response time patterns
+ - **Error Investigation**: Track error rates and problematic methods
+ - **Capacity Planning**: Analyze peak load and RPS patterns
+ - **Service Comparison**: Compare performance across multiple services
+ - **Incident Review**: Analyze logs from specific time periods
+
+ ## Troubleshooting
+
+ ### Large File Upload Issues
+
+ If Streamlit has trouble with very large files (>500MB):
+
+ 1. Increase Streamlit's upload size limit:
+ ```bash
+ streamlit run app.py --server.maxUploadSize=1024
+ ```
+
+ 2. Or modify `.streamlit/config.toml`:
+ ```toml
+ [server]
+ maxUploadSize = 1024
+ ```
+
+ ### Memory Issues
+
+ For files >1GB, you may need to:
+ - Increase available system memory
+ - Process files in smaller chunks
+ - Use the CLI version (can be developed if needed)
+
+ ### Performance Tips
+
+ - Close other memory-intensive applications
+ - Process files one at a time for very large files
+ - Use SSD for faster I/O
+ - Ensure adequate RAM (8GB+ recommended for 1GB files)
+
+ ## Future Enhancements
+
+ Potential features for future versions:
+ - CLI tool for batch processing
+ - Export results to PDF/Excel
+ - Real-time log monitoring
+ - Custom metric definitions
+ - Time range filtering
+ - IP address analysis
+ - Session tracking
+
+ ## Example Output
+
+ The application generates:
+
+ 1. **Summary Table**: Key metrics for each log file
+ 2. **Top Methods Chart**: Most frequently called endpoints
+ 3. **Response Time Distribution**: Histogram of response times
+ 4. **Error Breakdown**: Pie chart of error types
+ 5. **Service Comparison**: Side-by-side comparison for multiple files
+
+ ## License
+
+ This tool is provided as-is for log analysis purposes.
+
+ ## Support
+
+ For issues or questions:
+ 1. Check log file format matches IIS W3C Extended format
+ 2. Verify all required fields are present
+ 3. Ensure Python and dependencies are correctly installed
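
The "Log Format" section maps one whitespace-delimited token per W3C field. A minimal illustrative parse of the README's example line (not the repository's actual parser, which builds a Polars DataFrame):

```python
# Field names taken from the README's "Log Format" section,
# in the order they appear in each log line.
FIELDS = [
    "date", "time", "s-ip", "cs-method", "cs-uri-stem", "cs-uri-query",
    "s-port", "cs-username", "c-ip", "cs(User-Agent)", "cs(Referer)",
    "sc-status", "sc-substatus", "sc-win32-status", "time-taken",
]

line = ("2025-09-22 00:00:46 10.21.31.42 GET /Service/Contact/Get "
        "sessionid='xxx' 443 - 212.233.92.232 - - 200 0 0 24")

# One token per field; IIS encodes spaces inside field values,
# so a plain whitespace split is the standard approach.
record = dict(zip(FIELDS, line.split()))
print(record["cs-method"], record["sc-status"], record["time-taken"])  # GET 200 24
```

Absent values appear as `-` (here `cs-username`, `cs(User-Agent)`, and `cs(Referer)`), and `time-taken` is in milliseconds, which is what the slow-request threshold compares against.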