Spaces: lljz66/uptime

HuB committed 24 days ago

Commit cc4181a (1 parent: f71e0c8)

Expand API endpoints and add comprehensive documentation (README_FULL.md)

Files changed (3)

README.md +23 -1
README_FULL.md +60 -0
api/routes.py +44 -6

README.md CHANGED

@@ -53,5 +53,27 @@ To enable full functionality, set the following in your `.env`:
 - `GOOGLE_API_KEY`: For Safe Browsing checks.
 - `FAST_CHECK_INTERVAL`: Frequency of uptime checks (default: 60s).
+## API Reference
+WebGuard provides a comprehensive REST API for triggering scans and retrieving monitoring data.
+### 1. Unified Full Scan
+`POST /api/scan/full`
+Runs all available checks (SSL, DNS, HTTP, Port, Ping, Blacklist) and provides an AI-powered analysis summary.
+### 2. Protocol-Specific Scans
+- `POST /api/scan/ssl`: Targeted SSL certificate audit.
+- `POST /api/scan/dns`: Deep DNS record verification.
+- `POST /api/scan/ping`: Latency and reachability check.
+- `POST /api/scan/port`: Common service discovery.
+- `POST /api/scan/http`: Web server and redirect analysis.
+- `POST /api/scan/blacklist`: Reputation and phishing database lookup.
+### 3. Monitoring Management
+- `GET /api/urls`: List all currently monitored URLs.
+- `POST /api/monitor/add?url={url}`: Start continuous monitoring for a site.
+- `POST /api/monitor/remove?url={url}`: Stop monitoring.
+- `GET /api/results/{url}`: Retrieve historical scan data for a specific site.
 ## Deployment
-WebGuard is pre-configured for deployment on Hugging Face Spaces using the provided `Dockerfile`. It follows the standard port (7860) and user (1000) requirements for secure execution.
+WebGuard is pre-configured for deployment on Hugging Face Spaces using the provided `Dockerfile`. It follows the standard port (7860) and user (1000) requirements for secure execution.
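
Reviewer note: the endpoints added to the README can be exercised with a small client. The sketch below only builds the requests and does not send them. It assumes the app is reachable at `http://localhost:7860` (the port comes from the README, the host is a guess), that the router is mounted under `/api` as the README paths suggest, and that the JSON body mirrors `ScanRequest` with the `url` and `monitor` fields used in `api/routes.py`. The helper names `build_scan_request` and `build_monitor_request` are hypothetical.

```python
import json
from urllib.parse import quote
from urllib.request import Request

# Assumption: local deployment on the Hugging Face Spaces port from the README.
BASE_URL = "http://localhost:7860"

# Scan kinds listed in the README's API Reference.
SCAN_KINDS = {"full", "ssl", "dns", "ping", "port", "http", "blacklist"}


def build_scan_request(kind: str, target: str, monitor: bool = False) -> Request:
    """Build (but do not send) a POST request for /api/scan/<kind>."""
    if kind not in SCAN_KINDS:
        raise ValueError(f"unknown scan kind: {kind!r}")
    # Assumption: the body mirrors ScanRequest (`url`, `monitor`).
    body = json.dumps({"url": target, "monitor": monitor}).encode("utf-8")
    return Request(
        f"{BASE_URL}/api/scan/{kind}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def build_monitor_request(action: str, target: str) -> Request:
    """Build a POST request for /api/monitor/add or /api/monitor/remove."""
    if action not in {"add", "remove"}:
        raise ValueError(f"unknown monitor action: {action!r}")
    # Percent-encode the target so "://", "?" and "=" stay inside one query value.
    encoded = quote(target, safe="")
    return Request(f"{BASE_URL}/api/monitor/{action}?url={encoded}", method="POST")
```

Passing either result to `urllib.request.urlopen` would send it. The percent-encoding matters for the monitor endpoints because the monitored URL travels as a query parameter.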

README_FULL.md ADDED

@@ -0,0 +1,60 @@

+# WebGuard: Full Project Architecture & Technical Specification
+## 1. Project Overview
+WebGuard is a modular, asynchronous monitoring engine designed to provide high-fidelity insights into web infrastructure. Unlike traditional uptime monitors that only perform simple HTTP "pings," WebGuard audits the entire stack—from low-level ICMP reachability to high-level visual consistency.
+## 2. Technical Stack & Tool Selection
+### Backend: FastAPI
+- **Why**: Chosen for its native support for `asyncio`, which is critical for a monitoring tool that performs hundreds of concurrent network requests. It provides automatic OpenAPI documentation and high performance.
+- **Workflow**: Serves as the API gateway and orchestrates the check lifecycle via a Pydantic-validated request/response model.
+### Automation: Playwright (Chromium)
+- **Why**: Industry-leading browser automation tool. Unlike Selenium, it is faster, more reliable, and handles modern SPAs (Single Page Applications) natively.
+- **Role**: Powers the `crawler` for subpage auditing, `visual_regression` for layout shift detection, and `screenshot` for visual evidence of outages.
+### Scheduler: APScheduler
+- **Why**: A flexible, in-process scheduler that doesn't require an external message broker like Redis or RabbitMQ.
+- **Workflows**:
+  - **Fast (60s)**: Lightweight checks (Ping, HTTP, SSL).
+  - **Medium (300s)**: Heuristic checks (Blacklists, Headers, DNS).
+  - **Heavy (900s)**: Resource-intensive checks (Browser crawling, Visual regression).
+### Database: SQLite & SQLAlchemy
+- **Why**: SQLite is a zero-config, serverless database ideal for self-hosted apps on platforms like Hugging Face Spaces. SQLAlchemy provides a robust ORM to handle complex queries for historical data analysis.
+### AI Engine: Anthropic Claude (API)
+- **Why**: Superior reasoning capabilities for technical debugging.
+- **Role**: Analyzes raw JSON results from all scanners and provides a human-readable "semantic" explanation of what exactly is failing (e.g., "The site is up, but your SSL certificate is mismatched for the WWW subdomain").
+## 3. Directory Structure
+```text
+uptime/
+├── ai/             # AI Analysis logic (Anthropic integration)
+├── api/            # FastAPI routes and Pydantic schemas
+├── browser/        # Playwright-based browser automation (Crawler, Screenshots)
+├── checkers/       # Protocol-specific scanner modules (SSL, DNS, Port, etc.)
+├── config/         # Application settings and logging configuration
+├── frontend/       # Static HTML/JS for the dashboard
+├── scheduler/      # Background task management and monitoring cycles
+├── storage/        # Database models, migrations, and initialization
+├── app.py          # Main entry point (FastAPI initialization)
+└── Dockerfile      # Containerization for Hugging Face Spaces
+```
+## 4. How It Works (The Lifecycle)
+1. **Request**: A user or the scheduler triggers a scan for a URL.
+2. **Orchestration**: The `runner.py` in the scheduler module picks up the request. It dynamically imports and executes the relevant modules from the `checkers/` and `browser/` directories.
+3. **Execution**: Scanners run in parallel using `asyncio.gather` to maximize performance.
+4. **Logging**: Every step is captured by the centralized logging system, providing a real-time audit trail in `webguard.log`.
+5. **Persistence**: Results are normalized and stored in the SQLite database.
+6. **AI Analysis**: If issues are detected, the raw data is sent to the AI Agent. The agent returns a root-cause summary.
+7. **Response**: The API returns the combined results, including the AI's semantic interpretation.
+## 5. Security & Deployment Standards
+WebGuard is built with security-first principles:
+- **Isolation**: Runs as a non-privileged user (UID 1000) inside Docker.
+- **Data Privacy**: All monitoring data stays on your local storage; only sanitized scan results are sent to the optional AI provider.
+- **Reliability**: Uses PhishTank and Google Safe Browsing to provide a "Shield" for your brand reputation.
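
Reviewer note: the lifecycle in section 4 of `README_FULL.md` (parallel execution with `asyncio.gather`, normalization, then AI analysis only when issues are detected) can be sketched as follows. This is not the project's `runner.py`, which is not part of this diff. The checker functions, their return shape, and the `needs_ai_analysis` flag are hypothetical stand-ins; only the `results` key matches what `api/routes.py` reads from `run_all_now`.

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict

Checker = Callable[[str], Awaitable[Dict[str, Any]]]


async def check_ping(url: str) -> Dict[str, Any]:
    """Hypothetical stand-in for a lightweight checker that succeeds."""
    await asyncio.sleep(0)  # yield to the event loop, as real network I/O would
    return {"ok": True, "latency_ms": 12}


async def check_ssl(url: str) -> Dict[str, Any]:
    """Hypothetical stand-in for a checker that fails."""
    await asyncio.sleep(0)
    raise RuntimeError("certificate hostname mismatch")


CHECKERS: Dict[str, Checker] = {"ping": check_ping, "ssl": check_ssl}


async def run_all(url: str, checkers: Dict[str, Checker] = CHECKERS) -> Dict[str, Any]:
    """Run every checker concurrently and normalize the outcomes."""
    names = list(checkers)
    # return_exceptions=True keeps one failing checker from cancelling the rest.
    outcomes = await asyncio.gather(
        *(checkers[name](url) for name in names), return_exceptions=True
    )
    results: Dict[str, Any] = {}
    for name, outcome in zip(names, outcomes):
        if isinstance(outcome, BaseException):
            results[name] = {"ok": False, "error": str(outcome)}
        else:
            results[name] = outcome
    # Only failed checks would be forwarded to the AI step (step 6).
    needs_ai = any(not r.get("ok", False) for r in results.values())
    return {"url": url, "results": results, "needs_ai_analysis": needs_ai}
```

Calling `asyncio.run(run_all("https://example.com"))` yields one entry per checker. The failing SSL stub shows up as a structured error instead of aborting the whole scan.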

api/routes.py CHANGED

@@ -1,16 +1,24 @@
 from fastapi import APIRouter, HTTPException
-from fastapi.responses import FileResponse
 from api.schemas import ScanRequest, ScanResponse
 from scheduler.runner import run_all_now, add_url, remove_url, _monitored_urls
-from storage.db import get_latest_results, get_all_urls, save_alert
-import os
+from storage.db import get_latest_results, get_all_urls
+import checkers.ssl_check as ssl_check
+import checkers.dns_check as dns_check
+import checkers.http_check as http_check
+import checkers.ping_check as ping_check
+import checkers.port_check as port_check
+import checkers.blacklist_check as blacklist_check
+import checkers.multi_scanner as multi_scanner
+from config.logging import get_logger
+logger = get_logger("api.routes")
 router = APIRouter()
-@router.post("/scan", response_model=ScanResponse)
-async def scan(req: ScanRequest):
+@router.post("/scan/full", response_model=ScanResponse)
+async def scan_full(req: ScanRequest):
+    """Run all available checks and return a unified report."""
     try:
+        logger.info(f"Full scan requested for: {req.url}")
         result = await run_all_now(req.url)
         if req.monitor:
             add_url(req.url)
@@ -20,8 +28,38 @@ async def scan(req: ScanRequest):
             results=result["results"],
         )
     except Exception as e:
+        logger.error(f"Full scan failed: {e}")
         raise HTTPException(status_code=500, detail=str(e))
+@router.post("/scan/ssl")
+async def scan_ssl(req: ScanRequest):
+    return await ssl_check.run(req.url)
+@router.post("/scan/dns")
+async def scan_dns(req: ScanRequest):
+    return await dns_check.run(req.url)
+@router.post("/scan/http")
+async def scan_http(req: ScanRequest):
+    return await http_check.run(req.url)
+@router.post("/scan/ping")
+async def scan_ping(req: ScanRequest):
+    return await ping_check.run(req.url)
+@router.post("/scan/port")
+async def scan_port(req: ScanRequest):
+    return await port_check.run(req.url)
+@router.post("/scan/blacklist")
+async def scan_blacklist(req: ScanRequest):
+    return await blacklist_check.run(req.url)
+@router.post("/scan", response_model=ScanResponse)
+async def scan_legacy(req: ScanRequest):
+    # Keep original /scan for backward compatibility
+    return await scan_full(req)
 @router.get("/results/{url:path}")
 async def get_results(url: str, limit: int = 20):
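
Reviewer note: `scan_full` logs and converts failures into an HTTP 500, while the six per-protocol handlers await `<checker>.run(req.url)` directly. If the same behavior is wanted for all of them, one option is a shared wrapper. The sketch below is dependency-free so it can run on its own: `ScanError` stands in for `fastapi.HTTPException`, and `run_checked`, `fake_ping_run`, and `fake_dns_run` are hypothetical names. The only thing taken from the diff is that each checker exposes an awaitable `run(url)`.

```python
import asyncio
import logging
from typing import Any, Awaitable, Callable

logger = logging.getLogger("api.routes")


class ScanError(Exception):
    """Stand-in for fastapi.HTTPException(status_code=..., detail=...)."""

    def __init__(self, status_code: int, detail: str) -> None:
        super().__init__(detail)
        self.status_code = status_code
        self.detail = detail


async def run_checked(
    name: str, run: Callable[[str], Awaitable[Any]], url: str
) -> Any:
    """Await one checker, logging the request and mapping failures to a 500."""
    try:
        logger.info("%s scan requested for: %s", name, url)
        return await run(url)
    except Exception as exc:
        logger.error("%s scan failed: %s", name, exc)
        # Chain the original exception so the traceback keeps the root cause.
        raise ScanError(500, str(exc)) from exc


async def fake_ping_run(url: str) -> dict:
    """Hypothetical checker that succeeds."""
    return {"ok": True}


async def fake_dns_run(url: str) -> dict:
    """Hypothetical checker that fails."""
    raise TimeoutError("resolver timed out")
```

In the real router, a handler body could then reduce to something like `return await run_checked("ssl", ssl_check.run, req.url)`, with `ScanError` replaced by `HTTPException`. Whether a raw exception message belongs in the response `detail` is a separate question for the author.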