# IIS Log Performance Analyzer

High-performance web application for analyzing large IIS log files (200MB-1GB+). Built with Streamlit and Polars for fast, efficient processing.

**GitHub Repository**: [https://github.com/pilot-stuk/odata_log_parser](https://github.com/pilot-stuk/odata_log_parser)

**Live Demo**: Deploy on [Streamlit Cloud](https://streamlit.io/cloud)

## Features

- **Fast Processing**: Uses the Polars library for 10-100x faster parsing compared to pandas
- **Large File Support**: Efficiently handles files up to 1GB+
- **Comprehensive Metrics**:
  - Total requests (before/after filtering)
  - Error rates and breakdown by status code
  - Response time statistics (min/max/avg)
  - Slow request detection (configurable threshold)
  - Peak RPS (Requests Per Second) with timestamp
  - Top methods by request count and response time
- **Multi-File Analysis**: Upload and compare multiple log files side by side
- **Interactive Visualizations**: Charts and graphs using Plotly
- **Smart Filtering**: Automatically excludes monitoring requests (Zabbix HEAD probes) and 401 Unauthorized responses

## Requirements

- Python 3.8+
- See `requirements.txt` for package dependencies

## Installation

### Local Installation

1. Clone the repository:

   ```bash
   git clone https://github.com/pilot-stuk/odata_log_parser.git
   cd odata_log_parser
   ```

2. Install dependencies:

   ```bash
   pip install -r requirements.txt
   ```

### Deploy to Streamlit Cloud

1. Fork or clone this repository to your GitHub account
2. Go to [share.streamlit.io](https://share.streamlit.io/)
3. Sign in with your GitHub account
4. Click "New app"
5. Select your repository: `pilot-stuk/odata_log_parser`
6. Set the main file path: `app.py`
7. Click "Deploy"

The app will be live at: `https://share.streamlit.io/pilot-stuk/odata_log_parser/main/app.py`

## Usage

### Run the Streamlit App

```bash
streamlit run app.py
```

The application will open in your browser at `http://localhost:8501`.

### Upload Log Files

1. Click "Browse files" in the sidebar
2. Select one or more IIS log files (.log or .txt)
3. View the analysis results

### Configuration Options

- **Upload Mode**: Single or Multiple files
- **Top N Methods**: Number of top methods to display (3-20)
- **Slow Request Threshold**: Configure what constitutes a "slow" request (default: 3000ms)

## Log Format

This tool supports the **IIS W3C Extended Log Format** with the following fields:

```
date time s-ip cs-method cs-uri-stem cs-uri-query s-port cs-username c-ip cs(User-Agent) cs(Referer) sc-status sc-substatus sc-win32-status time-taken
```

Example log line:

```
2025-09-22 00:00:46 10.21.31.42 GET /Service/Contact/Get sessionid='xxx' 443 - 212.233.92.232 - - 200 0 0 24
```

## Filtering Rules

The analyzer applies the following filters automatically:

1. **Monitoring Exclusion**: Lines containing both the `HEAD` method and `Zabbix` are excluded
2. **401 Handling**: 401 Unauthorized responses are excluded from error counts (considered authentication attempts, not system errors)
3. **Error Definition**: Errors are HTTP status codes ≠ 200 and ≠ 401

## Metrics Explained

| Metric | Description |
|--------|-------------|
| **Total Requests (before filtering)** | Raw number of log entries |
| **Excluded Requests** | Lines filtered out (HEAD+Zabbix + 401) |
| **Processed Requests** | Valid requests included in analysis |
| **Errors** | Requests with status ≠ 200 and ≠ 401 |
| **Slow Requests** | Requests exceeding the threshold (default: 3000ms) |
| **Peak RPS** | Maximum requests per second observed |
| **Avg/Max/Min Response Time** | Response time statistics in milliseconds |

## Performance

- **Small files** (<50MB): processed in seconds
- **Medium files** (50-200MB): processed in 10-30 seconds
- **Large files** (200MB-1GB): processed in 1-3 minutes

Performance depends on:

- File size
- Number of log entries
- System CPU and RAM
- Disk I/O speed

## Architecture

```
app.py            # Streamlit UI application
log_parser.py     # Core parsing and analysis logic using Polars
requirements.txt  # Python dependencies
README.md         # This file
```

### Key Components

- **IISLogParser**: Parses the IIS W3C log format into a Polars DataFrame
- **LogAnalyzer**: Calculates metrics and statistics
- **Streamlit UI**: Interactive web interface with visualizations

## Use Cases

- **Performance Analysis**: Identify slow endpoints and response time patterns
- **Error Investigation**: Track error rates and problematic methods
- **Capacity Planning**: Analyze peak load and RPS patterns
- **Service Comparison**: Compare performance across multiple services
- **Incident Review**: Analyze logs from specific time periods

## Troubleshooting

### Large File Upload Issues

If Streamlit has trouble with very large files (>500MB):

1. Increase Streamlit's upload size limit:

   ```bash
   streamlit run app.py --server.maxUploadSize=1024
   ```

2. Or modify `.streamlit/config.toml`:

   ```toml
   [server]
   maxUploadSize = 1024
   ```

### Memory Issues

For files >1GB, you may need to:

- Increase available system memory
- Process files in smaller chunks
- Use the CLI version (can be developed if needed)

### Performance Tips

- Close other memory-intensive applications
- Process very large files one at a time
- Use an SSD for faster I/O
- Ensure adequate RAM (8GB+ recommended for 1GB files)

## Future Enhancements

Potential features for future versions:

- CLI tool for batch processing
- Export results to PDF/Excel
- Real-time log monitoring
- Custom metric definitions
- Time range filtering
- IP address analysis
- Session tracking

## Example Output

The application generates:

1. **Summary Table**: Key metrics for each log file
2. **Top Methods Chart**: Most frequently called endpoints
3. **Response Time Distribution**: Histogram of response times
4. **Error Breakdown**: Pie chart of error types
5. **Service Comparison**: Side-by-side comparison for multiple files

## License

This tool is provided as-is for log analysis purposes.

## Support

For issues or questions:
1. Check that the log file format matches the IIS W3C Extended format
2. Verify all required fields are present
3. Ensure Python and dependencies are correctly installed
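As an illustration of the Log Format and Filtering Rules sections above, here is a minimal pure-Python sketch (the app itself uses Polars; `parse_line` and `keep` are hypothetical helper names, not the project's actual API):

```python
# Field order from the W3C Extended format described in "Log Format".
FIELDS = [
    "date", "time", "s-ip", "cs-method", "cs-uri-stem", "cs-uri-query",
    "s-port", "cs-username", "c-ip", "cs(User-Agent)", "cs(Referer)",
    "sc-status", "sc-substatus", "sc-win32-status", "time-taken",
]

def parse_line(line: str) -> dict:
    """Split one space-delimited IIS log line into named fields."""
    return dict(zip(FIELDS, line.split()))

def keep(entry: dict) -> bool:
    """Filtering rules: drop Zabbix HEAD probes and 401 responses."""
    if entry["cs-method"] == "HEAD" and "Zabbix" in entry["cs(User-Agent)"]:
        return False
    if entry["sc-status"] == "401":
        return False
    return True

line = ("2025-09-22 00:00:46 10.21.31.42 GET /Service/Contact/Get "
        "sessionid='xxx' 443 - 212.233.92.232 - - 200 0 0 24")
entry = parse_line(line)
print(entry["cs-method"], entry["sc-status"], entry["time-taken"])  # GET 200 24
print(keep(entry))  # True
```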
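The error, slow-request, and Peak RPS metrics from the Metrics Explained table can likewise be sketched with plain counting (the sample entries and tuple layout are illustrative, not the app's internal representation):

```python
from collections import Counter

# Hypothetical parsed entries: (second-resolution timestamp, status, time-taken ms).
entries = [
    ("00:00:46", 200, 24),
    ("00:00:46", 200, 3500),
    ("00:00:46", 500, 120),
    ("00:00:47", 401, 15),
    ("00:00:47", 200, 80),
]

SLOW_MS = 3000  # default "Slow Request Threshold"

# Errors: status != 200 and != 401 (401 counts as an auth attempt, not an error).
errors = sum(1 for _, status, _ in entries if status not in (200, 401))

# Slow requests: anything over the configurable threshold.
slow = sum(1 for _, _, ms in entries if ms > SLOW_MS)

# Peak RPS: bucket requests by whole second, take the busiest bucket.
second, peak_rps = Counter(ts for ts, _, _ in entries).most_common(1)[0]

print(errors, slow, peak_rps, second)  # 1 1 3 00:00:46
```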