pilotstuki Claude committed on Oct 15, 2025

Commit ddfcf3e (1 parent: 002262c)

Add Hugging Face Spaces deployment configuration

- Add Dockerfile for Docker-based deployment on HF Spaces
- Add .dockerignore to optimize Docker build
- Update README.md with HF Space metadata (YAML frontmatter)
- Keep README_GITHUB.md for detailed GitHub documentation
- Configure app to run on port 7860 (HF Spaces standard)

Ready for deployment to: https://huggingface.co/spaces/pilotstuki/odatalogparser

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

Files changed (4)

.dockerignore +42 -0
Dockerfile +33 -0
README.md +38 -182
README_GITHUB.md +208 -0

.dockerignore ADDED

@@ -0,0 +1,42 @@

+# Git
+.git
+.gitignore
+# Python cache
+__pycache__/
+*.py[cod]
+*$py.class
+*.so
+# Virtual environments
+env/
+venv/
+ENV/
+# Log files (sample data - users upload their own)
+*.log
+# PDF files (sample reports)
+*.pdf
+# IDE
+.vscode/
+.idea/
+*.swp
+*.swo
+*~
+.claude/
+# OS
+.DS_Store
+Thumbs.db
+# Test files
+test_*.py
+*_test.py
+# Scripts
+run.sh
+# Documentation (GitHub-specific)
+README_GITHUB.md

Dockerfile ADDED

@@ -0,0 +1,33 @@

+# Dockerfile for Hugging Face Spaces - Streamlit App
+FROM python:3.10-slim
+# Set working directory
+WORKDIR /app
+# Install system dependencies
+RUN apt-get update && apt-get install -y \
+    build-essential \
+    curl \
+    && rm -rf /var/lib/apt/lists/*
+# Copy requirements file
+COPY requirements.txt .
+# Install Python dependencies
+RUN pip install --no-cache-dir -r requirements.txt
+# Copy application files
+COPY app.py .
+COPY log_parser.py .
+# Expose port 7860 (Hugging Face Spaces default)
+EXPOSE 7860
+# Set environment variables for Streamlit
+ENV STREAMLIT_SERVER_PORT=7860
+ENV STREAMLIT_SERVER_ADDRESS=0.0.0.0
+ENV STREAMLIT_SERVER_HEADLESS=true
+ENV STREAMLIT_BROWSER_GATHER_USAGE_STATS=false
+# Run the application
+CMD ["streamlit", "run", "app.py", "--server.port=7860", "--server.address=0.0.0.0"]
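
The `ENV` lines in this Dockerfile rely on Streamlit's convention of reading each config option from an environment variable named `STREAMLIT_` plus the option name in upper snake case. As a minimal sketch of that mapping (the helper `streamlit_env_name` is hypothetical and is not part of Streamlit or of this commit):

```python
import re


def streamlit_env_name(option: str) -> str:
    """Map a Streamlit config option such as 'server.port' to its env var name.

    Hypothetical helper for illustration only; not a Streamlit API.
    """
    # Insert "_" at each lower-to-upper boundary: gatherUsageStats -> gather_Usage_Stats
    snake = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "_", option)
    # The section separator (".") also becomes "_", then everything is upper-cased.
    return "STREAMLIT_" + snake.replace(".", "_").upper()


if __name__ == "__main__":
    for opt in (
        "server.port",
        "server.address",
        "server.headless",
        "browser.gatherUsageStats",
    ):
        print(f"{opt} -> {streamlit_env_name(opt)}")
```

Note that the `CMD` also passes `--server.port=7860` and `--server.address=0.0.0.0`, so those two options are set twice here; either mechanism on its own should be sufficient.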

README.md CHANGED

@@ -1,208 +1,64 @@
 # IIS Log Performance Analyzer
 High-performance web application for analyzing large IIS log files (200MB-1GB+). Built with Streamlit and Polars for fast, efficient processing.
 **GitHub Repository**: [https://github.com/pilot-stuk/odata_log_parser](https://github.com/pilot-stuk/odata_log_parser)
-**Live Demo**: Deploy on [Streamlit Cloud](https://streamlit.io/cloud)
 ## Features
-- **Fast Processing**: Uses Polars library for 10-100x faster parsing compared to pandas
-- **Large File Support**: Efficiently handles files up to 1GB+
-- **Comprehensive Metrics**:
-  - Total requests (before/after filtering)
-  - Error rates and breakdown by status code
-  - Response time statistics (min/max/avg)
-  - Slow request detection (configurable threshold)
-  - Peak RPS (Requests Per Second) with timestamp
-  - Top methods by request count and response time
-- **Multi-File Analysis**: Upload and compare multiple log files side-by-side
-- **Interactive Visualizations**: Charts and graphs using Plotly
-- **Smart Filtering**: Automatically excludes monitoring requests (Zabbix HEAD) and 401 unauthorized
-## Requirements
-- Python 3.8+
-- See `requirements.txt` for package dependencies
-## Installation
-### Local Installation
-1. Clone the repository:
-```bash
-git clone https://github.com/pilot-stuk/odata_log_parser.git
-cd odata_log_parser
-```
-2. Install dependencies:
-```bash
-pip install -r requirements.txt
-```
-### Deploy to Streamlit Cloud
-1. Fork or clone this repository to your GitHub account
-2. Go to [share.streamlit.io](https://share.streamlit.io/)
-3. Sign in with your GitHub account
-4. Click "New app"
-5. Select your repository: `pilot-stuk/odata_log_parser`
-6. Set the main file path: `app.py`
-7. Click "Deploy"
-The app will be live at: `https://share.streamlit.io/pilot-stuki/odata_log_parser/main/app.py`
-## Usage
-### Run the Streamlit App
-```bash
-streamlit run app.py
-```
-The application will open in your browser at `http://localhost:8501`
-### Upload Log Files
-1. Click "Browse files" in the sidebar
-2. Select one or more IIS log files (.log or .txt)
-3. View the analysis results
-### Configuration Options
-- **Upload Mode**: Single or Multiple files
-- **Top N Methods**: Number of top methods to display (3-20)
-- **Slow Request Threshold**: Configure what constitutes a "slow" request (default: 3000ms)
 ## Log Format
-This tool supports **IIS W3C Extended Log Format** with the following fields:
-```
-date time s-ip cs-method cs-uri-stem cs-uri-query s-port cs-username c-ip
-cs(User-Agent) cs(Referer) sc-status sc-substatus sc-win32-status time-taken
-```
-Example log line:
 ```
-2025-09-22 00:00:46 10.21.31.42 GET /Service/Contact/Get sessionid='xxx' 443 - 212.233.92.232 - - 200 0 0 24
 ```
 ## Filtering Rules
-The analyzer applies the following filters automatically:
-1. **Monitoring Exclusion**: Lines containing both `HEAD` method and `Zabbix` are excluded
-2. **401 Handling**: 401 Unauthorized responses are excluded from error counts (considered authentication attempts, not system errors)
-3. **Error Definition**: Errors are HTTP status codes ≠ 200 and ≠ 401
-## Metrics Explained
-| Metric | Description |
-|--------|-------------|
-| **Total Requests (before filtering)** | Raw number of log entries |
-| **Excluded Requests** | Lines filtered out (HEAD+Zabbix + 401) |
-| **Processed Requests** | Valid requests included in analysis |
-| **Errors** | Requests with status ≠ 200 and ≠ 401 |
-| **Slow Requests** | Requests exceeding threshold (default: 3000ms) |
-| **Peak RPS** | Maximum requests per second observed |
-| **Avg/Max/Min Response Time** | Response time statistics in milliseconds |
 ## Performance
-- **Small files** (<50MB): Process in seconds
-- **Medium files** (50-200MB): Process in 10-30 seconds
-- **Large files** (200MB-1GB): Process in 1-3 minutes
-Performance depends on:
-- File size
-- Number of log entries
-- System CPU and RAM
-- Disk I/O speed
-## Architecture
-```
-app.py              # Streamlit UI application
-log_parser.py       # Core parsing and analysis logic using Polars
-requirements.txt    # Python dependencies
-README.md          # This file
-```
-### Key Components
-- **IISLogParser**: Parses IIS W3C log format into Polars DataFrame
-- **LogAnalyzer**: Calculates metrics and statistics
-- **Streamlit UI**: Interactive web interface with visualizations
-## Use Cases
-- **Performance Analysis**: Identify slow endpoints and response time patterns
-- **Error Investigation**: Track error rates and problematic methods
-- **Capacity Planning**: Analyze peak load and RPS patterns
-- **Service Comparison**: Compare performance across multiple services
-- **Incident Review**: Analyze logs from specific time periods
-## Troubleshooting
-### Large File Upload Issues
-If Streamlit has trouble with very large files (>500MB):
-1. Increase Streamlit's upload size limit:
-```bash
-streamlit run app.py --server.maxUploadSize=1024
-```
-2. Or modify `.streamlit/config.toml`:
-```toml
-[server]
-maxUploadSize = 1024
-```
-### Memory Issues
-For files >1GB, you may need to:
-- Increase available system memory
-- Process files in smaller chunks
-- Use the CLI version (can be developed if needed)
-### Performance Tips
-- Close other memory-intensive applications
-- Process files one at a time for very large files
-- Use SSD for faster I/O
-- Ensure adequate RAM (8GB+ recommended for 1GB files)
-## Future Enhancements
-Potential features for future versions:
-- CLI tool for batch processing
-- Export results to PDF/Excel
-- Real-time log monitoring
-- Custom metric definitions
-- Time range filtering
-- IP address analysis
-- Session tracking
-## Example Output
-The application generates:
-1. **Summary Table**: Key metrics for each log file
-2. **Top Methods Chart**: Most frequently called endpoints
-3. **Response Time Distribution**: Histogram of response times
-4. **Error Breakdown**: Pie chart of error types
-5. **Service Comparison**: Side-by-side comparison for multiple files
 ## License
-This tool is provided as-is for log analysis purposes.
-## Support
-For issues or questions:
-1. Check log file format matches IIS W3C Extended format
-2. Verify all required fields are present
-3. Ensure Python and dependencies are correctly installed

+---
+title: IIS Log Performance Analyzer
+emoji: 📊
+colorFrom: blue
+colorTo: purple
+sdk: docker
+pinned: false
+license: mit
+app_port: 7860
+---
 # IIS Log Performance Analyzer
 High-performance web application for analyzing large IIS log files (200MB-1GB+). Built with Streamlit and Polars for fast, efficient processing.
 **GitHub Repository**: [https://github.com/pilot-stuk/odata_log_parser](https://github.com/pilot-stuk/odata_log_parser)
 ## Features
+- ⚡ **Fast Processing**: Uses Polars library for 10-100x faster parsing compared to pandas
+- 📦 **Large File Support**: Efficiently handles files up to 1GB+
+- 📊 **Comprehensive Metrics**: RPS, response times, error rates, and more
+- 🔍 **Detailed Analysis**: Top methods, error breakdown, time distribution
+- 📈 **Visual Reports**: Interactive charts with Plotly
+- 🔄 **Multi-file Support**: Compare multiple services side-by-side
+## How to Use
+1. Upload one or more IIS log files (W3C Extended format)
+2. View comprehensive performance metrics
+3. Analyze errors, slow requests, and response time distribution
+4. Compare multiple services side-by-side
 ## Log Format
+Supports **IIS W3C Extended Log Format** with fields:
 ```
+date time s-ip cs-method cs-uri-stem cs-uri-query s-port cs-username
+c-ip cs(User-Agent) cs(Referer) sc-status sc-substatus sc-win32-status time-taken
 ```
 ## Filtering Rules
+- Excludes monitoring requests (HEAD + Zabbix)
+- 401 Unauthorized responses excluded from error counts
+- Errors defined as status codes ≠ 200 and ≠ 401
+- Slow requests: response time > 3000ms (configurable)
+## Technology Stack
+- **Frontend**: Streamlit
+- **Data Processing**: Polars
+- **Visualizations**: Plotly
+- **Deployment**: Docker on Hugging Face Spaces
 ## Performance
+- Small files (<50MB): Process in seconds
+- Medium files (50-200MB): Process in 10-30 seconds
+- Large files (200MB-1GB): Process in 1-3 minutes
 ## License
+MIT License - See GitHub repository for details
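
The "Filtering Rules" in this README can be read as a small decision procedure. The sketch below uses only the standard library and is not the project's `log_parser.py` (which this commit does not show and which is described as Polars-based). The names `parse_line` and `classify` are hypothetical, and two details are assumptions: "Zabbix" is looked for in the `cs(User-Agent)` field, and 401 responses are dropped from analysis entirely rather than only from the error count.

```python
from typing import Optional

# Field order taken from the README's "Log Format" section.
FIELDS = [
    "date", "time", "s-ip", "cs-method", "cs-uri-stem", "cs-uri-query",
    "s-port", "cs-username", "c-ip", "cs(User-Agent)", "cs(Referer)",
    "sc-status", "sc-substatus", "sc-win32-status", "time-taken",
]


def parse_line(line: str) -> Optional[dict]:
    """Split one W3C log line into a dict; None for directives or malformed lines."""
    if line.startswith("#"):  # directive lines such as "#Fields: ..."
        return None
    parts = line.split()
    if len(parts) != len(FIELDS):
        return None
    return dict(zip(FIELDS, parts))


def classify(rec: dict, slow_ms: int = 3000) -> set:
    """Return the labels that apply to a record under the README's rules."""
    # Assumption: the Zabbix marker appears in the user-agent field.
    if rec["cs-method"] == "HEAD" and "Zabbix" in rec["cs(User-Agent)"]:
        return {"excluded"}
    # Assumption: 401 responses are excluded from analysis altogether.
    if rec["sc-status"] == "401":
        return {"excluded"}
    labels = set()
    if rec["sc-status"] != "200":  # errors: status other than 200 (401 handled above)
        labels.add("error")
    if int(rec["time-taken"]) > slow_ms:  # slow: strictly above the threshold
        labels.add("slow")
    return labels


if __name__ == "__main__":
    sample = ("2025-09-22 00:00:46 10.21.31.42 GET /Service/Contact/Get "
              "sessionid='xxx' 443 - 212.233.92.232 - - 200 0 0 24")
    print(classify(parse_line(sample)))
```

Returning a set rather than a single label lets a request count as both an error and a slow request, which the README's wording does not rule out.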

README_GITHUB.md ADDED

@@ -0,0 +1,208 @@

+# IIS Log Performance Analyzer
+High-performance web application for analyzing large IIS log files (200MB-1GB+). Built with Streamlit and Polars for fast, efficient processing.
+**GitHub Repository**: [https://github.com/pilot-stuk/odata_log_parser](https://github.com/pilot-stuk/odata_log_parser)
+**Live Demo**: Deploy on [Streamlit Cloud](https://streamlit.io/cloud)
+## Features
+- **Fast Processing**: Uses Polars library for 10-100x faster parsing compared to pandas
+- **Large File Support**: Efficiently handles files up to 1GB+
+- **Comprehensive Metrics**:
+  - Total requests (before/after filtering)
+  - Error rates and breakdown by status code
+  - Response time statistics (min/max/avg)
+  - Slow request detection (configurable threshold)
+  - Peak RPS (Requests Per Second) with timestamp
+  - Top methods by request count and response time
+- **Multi-File Analysis**: Upload and compare multiple log files side-by-side
+- **Interactive Visualizations**: Charts and graphs using Plotly
+- **Smart Filtering**: Automatically excludes monitoring requests (Zabbix HEAD) and 401 unauthorized
+## Requirements
+- Python 3.8+
+- See `requirements.txt` for package dependencies
+## Installation
+### Local Installation
+1. Clone the repository:
+```bash
+git clone https://github.com/pilot-stuk/odata_log_parser.git
+cd odata_log_parser
+```
+2. Install dependencies:
+```bash
+pip install -r requirements.txt
+```
+### Deploy to Streamlit Cloud
+1. Fork or clone this repository to your GitHub account
+2. Go to [share.streamlit.io](https://share.streamlit.io/)
+3. Sign in with your GitHub account
+4. Click "New app"
+5. Select your repository: `pilot-stuk/odata_log_parser`
+6. Set the main file path: `app.py`
+7. Click "Deploy"
+The app will be live at: `https://share.streamlit.io/pilot-stuki/odata_log_parser/main/app.py`
+## Usage
+### Run the Streamlit App
+```bash
+streamlit run app.py
+```
+The application will open in your browser at `http://localhost:8501`
+### Upload Log Files
+1. Click "Browse files" in the sidebar
+2. Select one or more IIS log files (.log or .txt)
+3. View the analysis results
+### Configuration Options
+- **Upload Mode**: Single or Multiple files
+- **Top N Methods**: Number of top methods to display (3-20)
+- **Slow Request Threshold**: Configure what constitutes a "slow" request (default: 3000ms)
+## Log Format
+This tool supports **IIS W3C Extended Log Format** with the following fields:
+```
+date time s-ip cs-method cs-uri-stem cs-uri-query s-port cs-username c-ip
+cs(User-Agent) cs(Referer) sc-status sc-substatus sc-win32-status time-taken
+```
+Example log line:
+```
+2025-09-22 00:00:46 10.21.31.42 GET /Service/Contact/Get sessionid='xxx' 443 - 212.233.92.232 - - 200 0 0 24
+```
+## Filtering Rules
+The analyzer applies the following filters automatically:
+1. **Monitoring Exclusion**: Lines containing both `HEAD` method and `Zabbix` are excluded
+2. **401 Handling**: 401 Unauthorized responses are excluded from error counts (considered authentication attempts, not system errors)
+3. **Error Definition**: Errors are HTTP status codes ≠ 200 and ≠ 401
+## Metrics Explained
+| Metric | Description |
+|--------|-------------|
+| **Total Requests (before filtering)** | Raw number of log entries |
+| **Excluded Requests** | Lines filtered out (HEAD+Zabbix + 401) |
+| **Processed Requests** | Valid requests included in analysis |
+| **Errors** | Requests with status ≠ 200 and ≠ 401 |
+| **Slow Requests** | Requests exceeding threshold (default: 3000ms) |
+| **Peak RPS** | Maximum requests per second observed |
+| **Avg/Max/Min Response Time** | Response time statistics in milliseconds |
+## Performance
+- **Small files** (<50MB): Process in seconds
+- **Medium files** (50-200MB): Process in 10-30 seconds
+- **Large files** (200MB-1GB): Process in 1-3 minutes
+Performance depends on:
+- File size
+- Number of log entries
+- System CPU and RAM
+- Disk I/O speed
+## Architecture
+```
+app.py              # Streamlit UI application
+log_parser.py       # Core parsing and analysis logic using Polars
+requirements.txt    # Python dependencies
+README.md          # This file
+```
+### Key Components
+- **IISLogParser**: Parses IIS W3C log format into Polars DataFrame
+- **LogAnalyzer**: Calculates metrics and statistics
+- **Streamlit UI**: Interactive web interface with visualizations
+## Use Cases
+- **Performance Analysis**: Identify slow endpoints and response time patterns
+- **Error Investigation**: Track error rates and problematic methods
+- **Capacity Planning**: Analyze peak load and RPS patterns
+- **Service Comparison**: Compare performance across multiple services
+- **Incident Review**: Analyze logs from specific time periods
+## Troubleshooting
+### Large File Upload Issues
+If Streamlit has trouble with very large files (>500MB):
+1. Increase Streamlit's upload size limit:
+```bash
+streamlit run app.py --server.maxUploadSize=1024
+```
+2. Or modify `.streamlit/config.toml`:
+```toml
+[server]
+maxUploadSize = 1024
+```
+### Memory Issues
+For files >1GB, you may need to:
+- Increase available system memory
+- Process files in smaller chunks
+- Use the CLI version (can be developed if needed)
+### Performance Tips
+- Close other memory-intensive applications
+- Process files one at a time for very large files
+- Use SSD for faster I/O
+- Ensure adequate RAM (8GB+ recommended for 1GB files)
+## Future Enhancements
+Potential features for future versions:
+- CLI tool for batch processing
+- Export results to PDF/Excel
+- Real-time log monitoring
+- Custom metric definitions
+- Time range filtering
+- IP address analysis
+- Session tracking
+## Example Output
+The application generates:
+1. **Summary Table**: Key metrics for each log file
+2. **Top Methods Chart**: Most frequently called endpoints
+3. **Response Time Distribution**: Histogram of response times
+4. **Error Breakdown**: Pie chart of error types
+5. **Service Comparison**: Side-by-side comparison for multiple files
+## License
+This tool is provided as-is for log analysis purposes.
+## Support
+For issues or questions:
+1. Check log file format matches IIS W3C Extended format
+2. Verify all required fields are present
+3. Ensure Python and dependencies are correctly installed
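
The "Metrics Explained" table in README_GITHUB.md lists Peak RPS and min/max/avg response time. As a rough illustration of how such figures can be derived, here is a standard-library sketch. The function name `summarize` and its input shape, `(date, time, time_taken_ms)` tuples, are assumptions for this example; the project's `LogAnalyzer` is not shown in this commit and may compute them differently.

```python
from collections import Counter
from statistics import mean


def summarize(records):
    """Compute peak RPS and response-time stats.

    `records` is an iterable of (date, time, time_taken_ms) tuples, where
    date and time are the strings from the log line (one-second resolution).
    Returns None when there are no records.
    """
    records = list(records)
    if not records:
        return None

    # Count requests per wall-clock second; the busiest second is the peak RPS.
    per_second = Counter((d, t) for d, t, _ in records)
    (peak_date, peak_time), peak_rps = max(per_second.items(), key=lambda kv: kv[1])

    times = [ms for _, _, ms in records]
    return {
        "peak_rps": peak_rps,
        "peak_at": f"{peak_date} {peak_time}",
        "min_ms": min(times),
        "max_ms": max(times),
        "avg_ms": mean(times),
    }


if __name__ == "__main__":
    rows = [
        ("2025-09-22", "00:00:46", 24),
        ("2025-09-22", "00:00:46", 36),
        ("2025-09-22", "00:00:47", 3000),
    ]
    print(summarize(rows))
```

Grouping by the literal `date` and `time` strings works because IIS W3C logs record timestamps at one-second resolution; if several seconds tie for the peak, this sketch reports the earliest one seen.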