# IIS Log Performance Analyzer

High-performance web application for analyzing large IIS log files (200MB-1GB+). Built with Streamlit and Polars for fast, efficient processing.

**GitHub Repository**: [https://github.com/pilot-stuk/odata_log_parser](https://github.com/pilot-stuk/odata_log_parser)

**Live Demo**: Deploy on [Streamlit Cloud](https://streamlit.io/cloud)

## Features

- **Fast Processing**: Uses the Polars library for 10-100x faster parsing compared to pandas
- **Large File Support**: Efficiently handles files up to 1GB+
- **Comprehensive Metrics**:
  - Total requests (before/after filtering)
  - Error rates and breakdown by status code
  - Response time statistics (min/max/avg)
  - Slow request detection (configurable threshold)
  - Peak RPS (Requests Per Second) with timestamp
  - Top methods by request count and response time
- **Multi-File Analysis**: Upload and compare multiple log files side by side
- **Interactive Visualizations**: Charts and graphs using Plotly
- **Smart Filtering**: Automatically excludes monitoring requests (Zabbix HEAD probes) and 401 Unauthorized responses

## Requirements

- Python 3.8+
- See `requirements.txt` for package dependencies

## Installation

### Local Installation

1. Clone the repository:

   ```bash
   git clone https://github.com/pilot-stuk/odata_log_parser.git
   cd odata_log_parser
   ```

2. Install dependencies:

   ```bash
   pip install -r requirements.txt
   ```

### Deploy to Streamlit Cloud

1. Fork or clone this repository to your GitHub account
2. Go to [share.streamlit.io](https://share.streamlit.io/)
3. Sign in with your GitHub account
4. Click "New app"
5. Select your repository: `pilot-stuk/odata_log_parser`
6. Set the main file path: `app.py`
7. Click "Deploy"

The app will be live at: `https://share.streamlit.io/pilot-stuk/odata_log_parser/main/app.py`

## Usage

### Run the Streamlit App

```bash
streamlit run app.py
```

The application will open in your browser at `http://localhost:8501`.

### Upload Log Files

1. Click "Browse files" in the sidebar
2. Select one or more IIS log files (.log or .txt)
3. View the analysis results

### Configuration Options

- **Upload Mode**: Single or Multiple files
- **Top N Methods**: Number of top methods to display (3-20)
- **Slow Request Threshold**: Configure what constitutes a "slow" request (default: 3000ms)

## Log Format

This tool supports the **IIS W3C Extended Log Format** with the following fields:

```
date time s-ip cs-method cs-uri-stem cs-uri-query s-port cs-username c-ip cs(User-Agent) cs(Referer) sc-status sc-substatus sc-win32-status time-taken
```

Example log line:

```
2025-09-22 00:00:46 10.21.31.42 GET /Service/Contact/Get sessionid='xxx' 443 - 212.233.92.232 - - 200 0 0 24
```

## Filtering Rules

The analyzer applies the following filters automatically:

1. **Monitoring Exclusion**: Lines containing both the `HEAD` method and `Zabbix` are excluded
2. **401 Handling**: 401 Unauthorized responses are excluded from error counts (considered authentication attempts, not system errors)
3. **Error Definition**: Errors are HTTP status codes ≠ 200 and ≠ 401

## Metrics Explained

| Metric | Description |
|--------|-------------|
| **Total Requests (before filtering)** | Raw number of log entries |
| **Excluded Requests** | Lines filtered out (HEAD+Zabbix + 401) |
| **Processed Requests** | Valid requests included in analysis |
| **Errors** | Requests with status ≠ 200 and ≠ 401 |
| **Slow Requests** | Requests exceeding the threshold (default: 3000ms) |
| **Peak RPS** | Maximum requests per second observed |
| **Avg/Max/Min Response Time** | Response time statistics in milliseconds |

## Performance

- **Small files** (<50MB): processed in seconds
- **Medium files** (50-200MB): processed in 10-30 seconds
- **Large files** (200MB-1GB): processed in 1-3 minutes

Performance depends on:

- File size
- Number of log entries
- System CPU and RAM
- Disk I/O speed

## Architecture

```
app.py            # Streamlit UI application
log_parser.py     # Core parsing and analysis logic using Polars
requirements.txt  # Python dependencies
README.md         # This file
```

### Key Components

- **IISLogParser**: Parses the IIS W3C log format into a Polars DataFrame
- **LogAnalyzer**: Calculates metrics and statistics
- **Streamlit UI**: Interactive web interface with visualizations

## Use Cases

- **Performance Analysis**: Identify slow endpoints and response time patterns
- **Error Investigation**: Track error rates and problematic methods
- **Capacity Planning**: Analyze peak load and RPS patterns
- **Service Comparison**: Compare performance across multiple services
- **Incident Review**: Analyze logs from specific time periods

## Troubleshooting

### Large File Upload Issues

If Streamlit has trouble with very large files (>500MB):

1. Increase Streamlit's upload size limit:

   ```bash
   streamlit run app.py --server.maxUploadSize=1024
   ```

2. Or modify `.streamlit/config.toml`:

   ```toml
   [server]
   maxUploadSize = 1024
   ```

### Memory Issues

For files >1GB, you may need to:

- Increase available system memory
- Process files in smaller chunks
- Use the CLI version (can be developed if needed)

### Performance Tips

- Close other memory-intensive applications
- Process very large files one at a time
- Use an SSD for faster I/O
- Ensure adequate RAM (8GB+ recommended for 1GB files)

## Future Enhancements

Potential features for future versions:

- CLI tool for batch processing
- Export results to PDF/Excel
- Real-time log monitoring
- Custom metric definitions
- Time range filtering
- IP address analysis
- Session tracking

## Example Output

The application generates:

1. **Summary Table**: Key metrics for each log file
2. **Top Methods Chart**: Most frequently called endpoints
3. **Response Time Distribution**: Histogram of response times
4. **Error Breakdown**: Pie chart of error types
5. **Service Comparison**: Side-by-side comparison for multiple files

## License

This tool is provided as-is for log analysis purposes.

## Support

For issues or questions:
1. Check that the log file format matches the IIS W3C Extended format
2. Verify all required fields are present
3. Ensure Python and dependencies are correctly installed
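As an illustration of the Log Format and Filtering Rules sections above, here is a minimal pure-Python sketch (the app itself uses Polars; `parse_line` and `keep` are hypothetical helper names, not the project's actual API):

```python
# Field order from the W3C Extended format described in "Log Format".
FIELDS = [
    "date", "time", "s-ip", "cs-method", "cs-uri-stem", "cs-uri-query",
    "s-port", "cs-username", "c-ip", "cs(User-Agent)", "cs(Referer)",
    "sc-status", "sc-substatus", "sc-win32-status", "time-taken",
]

def parse_line(line: str) -> dict:
    """Split one space-delimited IIS log line into named fields."""
    return dict(zip(FIELDS, line.split()))

def keep(entry: dict) -> bool:
    """Filtering rules: drop Zabbix HEAD probes and 401 responses."""
    if entry["cs-method"] == "HEAD" and "Zabbix" in entry["cs(User-Agent)"]:
        return False
    if entry["sc-status"] == "401":
        return False
    return True

line = ("2025-09-22 00:00:46 10.21.31.42 GET /Service/Contact/Get "
        "sessionid='xxx' 443 - 212.233.92.232 - - 200 0 0 24")
entry = parse_line(line)
print(entry["cs-method"], entry["sc-status"], entry["time-taken"])  # GET 200 24
print(keep(entry))  # True
```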
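The error, slow-request, and Peak RPS metrics from the Metrics Explained table can likewise be sketched with plain counting (the sample entries and tuple layout are illustrative, not the app's internal representation):

```python
from collections import Counter

# Hypothetical parsed entries: (second-resolution timestamp, status, time-taken ms).
entries = [
    ("00:00:46", 200, 24),
    ("00:00:46", 200, 3500),
    ("00:00:46", 500, 120),
    ("00:00:47", 401, 15),
    ("00:00:47", 200, 80),
]

SLOW_MS = 3000  # default "Slow Request Threshold"

# Errors: status != 200 and != 401 (401 counts as an auth attempt, not an error).
errors = sum(1 for _, status, _ in entries if status not in (200, 401))

# Slow requests: anything over the configurable threshold.
slow = sum(1 for _, _, ms in entries if ms > SLOW_MS)

# Peak RPS: bucket requests by whole second, take the busiest bucket.
second, peak_rps = Counter(ts for ts, _, _ in entries).most_common(1)[0]

print(errors, slow, peak_rps, second)  # 1 1 3 00:00:46
```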