🧠 AuditAI - Enhanced Agentic AI Website Auditor (Gradio Edition)

AuditAI is an agentic, AI-powered web application built with Gradio. It audits a website for SEO, performance, accessibility, security, mobile responsiveness, and broken links, then produces AI-generated insights and a PDF report.


🆕 What's New in Gradio Edition

Enhanced Features:

  • ✅ Accessibility Checker - WCAG 2.1 compliance analysis
  • ✅ Mobile Responsiveness Analyzer - Viewport, responsive images, touch targets
  • ✅ Broken Link Detection - Parallel link checking with detailed reports
  • ✅ PDF Report Generation - Professional downloadable audit reports
  • ✅ Historical Tracking - Track score improvements over time
  • ✅ Trend Analysis - Visualize performance changes across audits
  • ✅ Enhanced UI - Modern Gradio tabbed interface with better UX

Original Features (Retained):

  • πŸ” Website scanning (load time, HTTPS, page size, links, headings)
  • πŸ€– Agentic AI analysis with Google Gemini 1.5 Flash
  • πŸ“Š Interactive visualizations (gauges, radar charts, bar charts)
  • ⬇️ Downloadable optimized HTML
  • πŸ’‘ AI-powered suggestions and fix snippets

🚀 Quick Start

1️⃣ Install Dependencies

pip install -r requirements.txt

2️⃣ Set Up Gemini API Key

Create a .env file in the project root:

GEMINI_API_KEY=your_gemini_api_key_here

3️⃣ Run the Gradio App

python app_gradio.py

The app launches at http://localhost:7860, Gradio's default port, and also prints a public shareable link.

4️⃣ Run the Original Streamlit App (Optional)

streamlit run app.py

📋 New Feature Details

♿ Accessibility Checker (accessibility_checker.py)

Analyzes WCAG 2.1 compliance:

  • Missing alt text on images
  • Proper heading hierarchy (H1-H6)
  • Form labels and ARIA landmarks
  • Link text quality
  • Language attributes
  • Skip navigation links
  • Video captions
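
A few of these checks can be sketched with the standard library alone. This is an illustration, not the contents of accessibility_checker.py: the project lists BeautifulSoup4 for HTML parsing, while this sketch swaps in html.parser so it runs without extra packages. The names AccessibilityScan and find_issues are hypothetical.

```python
from html.parser import HTMLParser


class AccessibilityScan(HTMLParser):
    """Collects a few WCAG-related signals while parsing HTML."""

    def __init__(self):
        super().__init__()
        self.images_missing_alt = 0
        self.has_lang = False
        self.heading_levels = []

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        # Only a completely absent alt attribute is flagged;
        # alt="" is valid for purely decorative images.
        if tag == "img" and "alt" not in attr_map:
            self.images_missing_alt += 1
        elif tag == "html" and attr_map.get("lang"):
            self.has_lang = True
        elif len(tag) == 2 and tag[0] == "h" and tag[1] in "123456":
            self.heading_levels.append(int(tag[1]))


def find_issues(html):
    """Return a list of human-readable accessibility issues."""
    scan = AccessibilityScan()
    scan.feed(html)
    issues = []
    if scan.images_missing_alt:
        issues.append(f"{scan.images_missing_alt} image(s) missing alt text")
    if not scan.has_lang:
        issues.append("<html> element has no lang attribute")
    # A jump such as h1 -> h3 skips a level in the heading hierarchy.
    for prev, cur in zip(scan.heading_levels, scan.heading_levels[1:]):
        if cur > prev + 1:
            issues.append(f"heading level jumps from h{prev} to h{cur}")
    return issues
```

Calling find_issues(page_html) returns an empty list when none of the three checks fires. The remaining items (form labels, ARIA landmarks, skip links, captions) would follow the same pattern.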

📱 Mobile Responsiveness (mobile_checker.py)

Checks mobile-friendliness:

  • Viewport meta tag validation
  • Responsive images (srcset/sizes)
  • Page size optimization for mobile
  • Flash content detection
  • Fixed-width elements
  • Touch target sizes
  • Media queries analysis
  • Relative font sizing
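
As a rough sketch of the first two checks (viewport meta tag and srcset usage), the following uses the standard-library html.parser rather than the project's BeautifulSoup4. The names MobileScan and mobile_report and the shape of the returned dict are assumptions made for illustration, not the actual API of mobile_checker.py.

```python
from html.parser import HTMLParser


class MobileScan(HTMLParser):
    """Records the viewport meta tag and counts responsive images."""

    def __init__(self):
        super().__init__()
        self.viewport = None
        self.total_images = 0
        self.responsive_images = 0

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        if tag == "meta" and (attr_map.get("name") or "").lower() == "viewport":
            self.viewport = attr_map.get("content") or ""
        elif tag == "img":
            self.total_images += 1
            if attr_map.get("srcset"):
                self.responsive_images += 1


def mobile_report(html):
    """Summarize viewport and responsive-image findings for one page."""
    scan = MobileScan()
    scan.feed(html)
    # Normalize spacing and case before looking for width=device-width.
    viewport = (scan.viewport or "").replace(" ", "").lower()
    return {
        "has_viewport": scan.viewport is not None,
        "uses_device_width": "width=device-width" in viewport,
        "images": scan.total_images,
        "images_with_srcset": scan.responsive_images,
    }
```

Checks that depend on CSS (media queries, fixed widths, font units, touch target sizes) need stylesheet inspection and are out of scope for this sketch.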

🔗 Broken Link Detector (link_checker.py)

Identifies broken links:

  • Parallel processing for speed (10 concurrent workers)
  • Checks up to 50 links per audit
  • HTTP status code validation
  • Internal vs external link tracking
  • Detailed error reporting
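
The parallel approach can be sketched with concurrent.futures, using the 10-worker and 50-link limits listed above. This is a minimal illustration under assumptions: the project lists Requests as its HTTP client, while this sketch uses urllib so it has no third-party dependencies, and the function names fetch_status and check_links are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from urllib.error import HTTPError
from urllib.request import Request, urlopen

MAX_LINKS = 50    # per-audit cap listed above
MAX_WORKERS = 10  # concurrent workers listed above


def fetch_status(url, timeout=5):
    """Return the HTTP status code for url, or an error string."""
    try:
        request = Request(url, method="HEAD",
                          headers={"User-Agent": "link-check-sketch"})
        with urlopen(request, timeout=timeout) as response:
            return response.status
    except HTTPError as err:
        return err.code  # 4xx/5xx responses arrive as exceptions
    except (OSError, ValueError) as err:
        return f"error: {err}"  # DNS failure, timeout, malformed URL


def check_links(links, fetch=fetch_status):
    """Check up to MAX_LINKS unique links in parallel; return broken ones."""
    unique = list(dict.fromkeys(links))[:MAX_LINKS]  # de-duplicate, keep order
    with ThreadPoolExecutor(max_workers=MAX_WORKERS) as pool:
        statuses = list(pool.map(fetch, unique))
    return {
        url: status
        for url, status in zip(unique, statuses)
        if not isinstance(status, int) or status >= 400
    }
```

Passing fetch as a parameter keeps the parallel logic testable without network access. Some servers reject HEAD requests, so a real checker may want to retry with GET before reporting a link as broken.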

📄 PDF Report Generator (report_generator.py)

Creates professional reports:

  • Multi-page comprehensive audit summary
  • Color-coded scores and metrics
  • All detected issues organized by category
  • AI recommendations
  • Broken link details
  • Timestamp and metadata

📈 Historical Tracking (history_tracker.py)

Tracks performance over time:

  • JSON-based storage (last 100 audits)
  • Per-site history retrieval
  • Trend data for visualizations
  • Score comparison across audits
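
A JSON-backed history with a 100-entry cap might look like the sketch below. The file layout, field names (url, score, timestamp), and function names are assumptions for illustration; the real history_tracker.py may store more fields or use a different structure.

```python
import json
from datetime import datetime, timezone
from pathlib import Path

MAX_ENTRIES = 100  # "last 100 audits"


def load_history(path):
    """Return all stored audits, oldest first (empty if no file yet)."""
    path = Path(path)
    if not path.exists():
        return []
    return json.loads(path.read_text(encoding="utf-8"))


def record_audit(path, url, score):
    """Append one audit and keep only the newest MAX_ENTRIES."""
    history = load_history(path)
    history.append({
        "url": url,
        "score": score,
        "timestamp": datetime.now(timezone.utc).isoformat(),
    })
    history = history[-MAX_ENTRIES:]  # drop the oldest entries beyond the cap
    Path(path).write_text(json.dumps(history, indent=2), encoding="utf-8")
    return history


def site_trend(path, url):
    """Scores for one site, oldest first; a trend needs at least two points."""
    return [entry["score"] for entry in load_history(path) if entry["url"] == url]
```

Because the cap applies to the whole file rather than per site, frequent audits of one site can push another site's history out of the window.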

🎨 Gradio UI Structure

The new interface uses 5 tabs:

  1. 📊 Overview - Summary, scores, gauge & radar charts
  2. 📈 Metrics & Trends - Technical metrics and historical trends
  3. ⚠️ Issues - AI, accessibility, mobile, and broken link issues
  4. ✅ Recommendations - AI-powered suggestions
  5. 📄 PDF Report - Download comprehensive report

📊 Scoring System

Overall Score Calculation (0-100)

Based on:

  • HTTPS (15 points)
  • Load time (5-15 points)
  • Title presence (10 points)
  • Meta description (10 points)
  • H1 tags (5-10 points)
  • Images with alt text (up to 10 points)
  • Links & scripts (up to 10 points)
  • Paragraph content (up to 10 points)
  • HTTP status (10 points)
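
The maximum points above add up to exactly 100. The sketch below encodes that budget and sums per-component points, clamping each to its listed range. It does not reproduce how scoring.py decides the points for each component, which this README does not specify. Treating "5-15" and "5-10" as a floor of 5 is an assumption, and the key names are hypothetical.

```python
# Maximum points per component, taken from the list above.
MAX_POINTS = {
    "https": 15,
    "load_time": 15,
    "title": 10,
    "meta_description": 10,
    "h1": 10,
    "image_alt": 10,
    "links_scripts": 10,
    "paragraphs": 10,
    "http_status": 10,
}

# Components whose listed range starts at 5 rather than 0 (assumed floor).
MIN_POINTS = {"load_time": 5, "h1": 5}


def overall_score(earned):
    """Sum per-component points, clamping each to its listed range."""
    total = 0
    for name, maximum in MAX_POINTS.items():
        minimum = MIN_POINTS.get(name, 0)
        total += min(maximum, max(minimum, earned.get(name, 0)))
    return total
```

For example, overall_score({"https": 15, "title": 10}) adds the two floors for load time and H1 tags on top of the 25 points passed in.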

Individual Scores

  • SEO Score: 100 - (images_without_alt × 5)
  • Performance Score: 100 - (load_time × 10)
  • Accessibility Score: WCAG compliance based (0-100)
  • Security Score: 100 if HTTPS, else 50
  • Mobile Score: Mobile-friendliness based (0-100)
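
The three formula-based scores translate directly into code. In this sketch the clamp to the 0-100 range is an added assumption (the formulas as written go negative for more than 20 images without alt text or load times above 10), and load_time is assumed to be measured in seconds.

```python
def clamp(value, low=0, high=100):
    """Keep a score inside the 0-100 range (assumed, not stated above)."""
    return max(low, min(high, value))


def seo_score(images_without_alt):
    # 5 points off for every image that lacks alt text
    return clamp(100 - images_without_alt * 5)


def performance_score(load_time_seconds):
    # 10 points off per second of load time (unit assumed to be seconds)
    return clamp(100 - load_time_seconds * 10)


def security_score(uses_https):
    return 100 if uses_https else 50
```

For a page with 3 images missing alt text that loads in 2.5 seconds over HTTPS, these give 85, 75.0, and 100.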

🔧 Tech Stack

Core Technologies

  • Python 3.9+
  • Gradio 4.x - Modern web UI framework
  • Google Gemini API - Gemini 1.5 Flash for AI analysis
  • BeautifulSoup4 - HTML parsing
  • Requests - HTTP client

Visualization & Reports

  • Plotly - Interactive charts (gauges, radar, bar)
  • Matplotlib - Word clouds
  • Pandas - Data manipulation
  • FPDF - PDF generation

Other

  • python-dotenv - Environment variables
  • concurrent.futures - Parallel link checking

πŸ“ Project Structure

AuditAI-main/
├── app.py                      # Original Streamlit app
├── app_gradio.py               # NEW: Gradio app
├── scanner.py                  # Website scanner
├── ai_analyzer.py              # AI analysis (Google Gemini integration)
├── scoring.py                  # Score calculation
├── dashboard.py                # Streamlit dashboard
├── utils.py                    # Utility functions
├── accessibility_checker.py    # NEW: Accessibility analysis
├── mobile_checker.py           # NEW: Mobile responsiveness
├── link_checker.py             # NEW: Broken link detection
├── report_generator.py         # NEW: PDF generation
├── history_tracker.py          # NEW: Historical tracking
├── requirements.txt            # Dependencies
├── README.md                   # Original readme
├── README_GRADIO.md            # This file
└── .env                        # API keys (create this)

🎯 Usage Guide

  1. Enter URL: Input the website URL (e.g., https://example.com)
  2. Choose Options: Check/uncheck "Check for Broken Links" (optional, slower)
  3. Click Audit: Start the comprehensive analysis
  4. View Results:
    • Overview tab shows summary and scores
    • Issues tab lists all detected problems
    • Recommendations tab shows AI suggestions
    • PDF tab provides downloadable report
  5. Track Progress: Re-audit the same site to see trend improvements

⚡ Performance Notes

  • Broken Link Checking: Uses parallel processing (10 workers) but can take 30-60s for 50 links
  • AI Analysis: Runs on Google Gemini 1.5 Flash; duration depends on the Gemini API's response time
  • PDF Generation: Instant (<1s)
  • Historical Trends: Only show after 2+ audits of the same site

🔒 Environment Variables

Required in .env file:

GEMINI_API_KEY=your-gemini-key-here
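
One way to catch a missing or placeholder key early is a small startup check like the sketch below. It assumes the .env file has already been loaded into the process environment (for example with python-dotenv's load_dotenv()). The helper name require_gemini_key is hypothetical and is not part of the project.

```python
import os


def require_gemini_key(env=os.environ):
    """Return the Gemini API key, or fail fast with a clear message."""
    key = (env.get("GEMINI_API_KEY") or "").strip()
    # Both placeholder values shown in this README start with "your".
    if not key or key.startswith("your"):
        raise RuntimeError(
            "GEMINI_API_KEY is missing or still set to the placeholder; "
            "add a real key to the .env file."
        )
    return key
```

Calling this once at startup surfaces a configuration problem before the first audit reaches the AI analysis step.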

🆚 Gradio vs Streamlit

Why Gradio?

  • ✅ Easier deployment (built-in sharing)
  • ✅ Better tab organization
  • ✅ Cleaner API for complex workflows
  • ✅ Automatic shareable links
  • ✅ Better mobile experience

Keeping Streamlit?

Both versions are maintained. Use:

  • app_gradio.py for the enhanced version
  • app.py for the original Streamlit version

πŸ‘¨β€πŸ’» Author

Mirza Yasir Abdullah Baig


πŸ“ License

Educational purposes. Not for commercial use without permission.


πŸ› Troubleshooting

Issue: Gemini API errors
Solution: Verify that GEMINI_API_KEY in .env is set to a valid key. You can create one at https://aistudio.google.com/app/apikey

Issue: Broken link checking takes too long
Solution: Uncheck the "Check for Broken Links" option

Issue: PDF generation fails
Solution: Ensure fpdf is installed: pip install fpdf

Issue: No trend data shown
Solution: Audit the same site multiple times to build history


🚀 Future Enhancements

  • Multi-page website crawling
  • Competitor comparison
  • Lighthouse integration
  • Email report scheduling
  • Database storage (replace JSON)
  • Custom scoring weights
  • Screenshot capture
  • Security header analysis

📸 Screenshots

Coming soon! Run the app to see the beautiful new Gradio interface.


Enjoy auditing! 🎉