title: Vulnerability Scanner
emoji: π’
colorFrom: gray
colorTo: blue
sdk: gradio
sdk_version: 5.47.0
app_file: app.py
pinned: false
license: mit
short_description: AI-powered tool that analyzes GitHub repositories
🛡️ AI-Powered GitHub Vulnerability Scanner
Track Tag: mcp-in-action-track-enterprise
An autonomous AI agent system that performs comprehensive security analysis of GitHub repositories using Model Context Protocol (MCP) tools and agentic RAG. This intelligent agent autonomously plans, retrieves, and executes vulnerability assessments by combining GitHub data access, CVE knowledge bases, and advanced language models.
⚠️ Important Notice: This tool is designed for legitimate security research and vulnerability assessment purposes only. Do not use this scanner for malicious activities, unauthorized access, or any illegal purposes. Always ensure you have proper authorization before scanning repositories.
🎥 Demo Video
Watch Demo Video (1-5 minutes showing the autonomous agent in action)
📱 Social Media
Project Announcement on X/LinkedIn
🤖 Autonomous Agent Capabilities
This project showcases advanced autonomous agent behavior with:
Planning & Reasoning
- Intelligent Query Understanding: Agent analyzes user requests and automatically plans multi-step security assessments
- Context-Aware Decisions: Dynamically selects appropriate MCP tools based on repository structure and file types
- Adaptive Analysis: Adjusts scanning depth and focus based on discovered vulnerabilities
Tool Orchestration
- MCP Tool Integration: Seamlessly uses 7 GitHub MCP tools for repository exploration and file retrieval
- CVE Knowledge Base Search: Autonomously queries 10,000+ real-world vulnerability records from the CIRCL/vulnerability dataset on Hugging Face
- Web Scraping: Automatically fetches CVE details from NVD webpages for enhanced context
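As a rough illustration of the NVD lookup step, the sketch below pulls CVE identifiers out of free text and builds the matching NVD detail page URL. The function names are hypothetical and are not taken from server.py or client.py; only the CVE ID format and the public NVD URL layout are relied on.

```python
import re

# CVE identifiers look like CVE-<year>-<sequence>, where the sequence has 4+ digits.
CVE_ID_PATTERN = re.compile(r"\bCVE-(\d{4})-(\d{4,})\b", re.IGNORECASE)

# Prefix of the public NVD detail pages. How the scanner fetches them is an assumption.
NVD_DETAIL_URL = "https://nvd.nist.gov/vuln/detail/"


def extract_cve_ids(text: str) -> list[str]:
    """Return the unique CVE IDs found in text, upper-cased, in first-seen order."""
    seen: dict[str, None] = {}
    for match in CVE_ID_PATTERN.finditer(text):
        seen.setdefault(match.group(0).upper(), None)
    return list(seen)


def nvd_url(cve_id: str) -> str:
    """Build the NVD detail page URL for one CVE ID (hypothetical helper)."""
    if not CVE_ID_PATTERN.fullmatch(cve_id):
        raise ValueError(f"not a CVE ID: {cve_id!r}")
    return NVD_DETAIL_URL + cve_id.upper()
```

The resulting URL could then be fetched with requests and parsed with beautifulsoup4, both of which appear in the dependency list.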
Agentic RAG System
- Retrieval-Augmented Generation: BM25-based retrieval finds relevant vulnerability patterns from CVE database
- Evidence-Based Analysis: Correlates code patterns with real CVE examples for accurate detection
- Context Engineering: Combines code analysis with historical vulnerability data for informed assessments
- Multi-Source Synthesis: Integrates GitHub content, CVE records, and NVD data into comprehensive reports
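To make the BM25 retrieval step concrete, here is a minimal pure-Python sketch of BM25 scoring over a few vulnerability descriptions. It is not the project's code: the project lists rank-bm25 as a dependency and presumably uses that instead, and the sample records, the tokenizer, and the k1 and b values are all assumptions chosen for illustration.

```python
import math
import re
from collections import Counter

# Invented sample records for illustration only; not taken from the CVE dataset.
SAMPLE_RECORDS = [
    "SQL injection in login form allows attackers to bypass authentication",
    "Cross-site scripting in comment field allows script execution",
    "Path traversal in file download endpoint exposes arbitrary files",
]


def tokenize(text: str) -> list[str]:
    """Lower-case the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


class BM25Index:
    """Tiny BM25 (Okapi) index using the non-negative IDF variant."""

    def __init__(self, documents: list[str], k1: float = 1.5, b: float = 0.75):
        self.documents = documents
        self.k1 = k1
        self.b = b
        self.term_counts = [Counter(tokenize(doc)) for doc in documents]
        self.doc_lengths = [sum(counts.values()) for counts in self.term_counts]
        # Fall back to 1.0 so empty corpora do not cause a division by zero.
        self.avg_length = (sum(self.doc_lengths) / max(len(documents), 1)) or 1.0
        # Number of documents that contain each term.
        self.doc_freq: Counter = Counter()
        for counts in self.term_counts:
            self.doc_freq.update(counts.keys())

    def _idf(self, term: str) -> float:
        n = self.doc_freq.get(term, 0)
        total = len(self.documents)
        return math.log((total - n + 0.5) / (n + 0.5) + 1.0)

    def score(self, query: str, doc_index: int) -> float:
        counts = self.term_counts[doc_index]
        length_norm = 1 - self.b + self.b * self.doc_lengths[doc_index] / self.avg_length
        total = 0.0
        for term in tokenize(query):
            tf = counts.get(term, 0)
            if tf:
                total += self._idf(term) * tf * (self.k1 + 1) / (tf + self.k1 * length_norm)
        return total

    def top_k(self, query: str, k: int = 3) -> list[tuple[float, str]]:
        """Return up to k (score, document) pairs with a positive score, best first."""
        scored = [(self.score(query, i), doc) for i, doc in enumerate(self.documents)]
        matched = [pair for pair in scored if pair[0] > 0]
        return sorted(matched, key=lambda pair: pair[0], reverse=True)[:k]
```

A query such as `BM25Index(SAMPLE_RECORDS).top_k("sql injection through string concatenation")` ranks the SQL injection record first, and the retrieved text could then be passed to the language model as supporting evidence.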
Execution & Reporting
- Autonomous Scanning: Agent independently navigates repositories, analyzes files, and identifies vulnerabilities
- Structured Output: Generates professional security reports with severity ratings, CWE classifications, and remediation advice
- Interactive Follow-up: Maintains conversation context for clarifying questions and deeper analysis
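The structured output described above might be modeled along the following lines. This is a sketch under assumptions: the `Finding` fields, the severity ordering, and the text layout are hypothetical and do not reflect the report format the agent actually produces.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    """One reported issue (hypothetical schema)."""
    title: str
    severity: str      # "Critical", "High", "Medium", or "Low"
    cwe_id: str        # for example "CWE-89" (SQL injection)
    file_path: str
    line: int
    remediation: str


# Lower number sorts first, so the most severe findings lead the report.
SEVERITY_ORDER = {"Critical": 0, "High": 1, "Medium": 2, "Low": 3}


def render_report(repo: str, findings: list[Finding]) -> str:
    """Render findings as plain text, most severe first."""
    lines = [f"Security report for {repo}", f"Findings: {len(findings)}", ""]
    ordered = sorted(
        findings,
        key=lambda f: SEVERITY_ORDER.get(f.severity, len(SEVERITY_ORDER)),
    )
    for number, finding in enumerate(ordered, start=1):
        lines.append(f"{number}. [{finding.severity}] {finding.title} ({finding.cwe_id})")
        lines.append(f"   Location: {finding.file_path}:{finding.line}")
        lines.append(f"   Fix: {finding.remediation}")
    return "\n".join(lines)
```

Keeping findings as data rather than free text makes it easier to answer follow-up questions about a specific file or severity level.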
Project Links
- Source Code: GitHub Repository
- MCP Server: Hugging Face Space
- Live Demo: Vulnerability Scanner Client
✨ Key Features
🤖 Autonomous Agent System
- Intelligent Planning: Agent autonomously plans vulnerability assessment strategies
- Multi-Tool Orchestration: Coordinates 7 MCP tools for comprehensive repository analysis
- Agentic RAG: Retrieves and applies knowledge from 10,000+ CVE records
- Context Engineering: Maintains conversation state and builds analysis context progressively
Advanced Detection
- AI-Powered Analysis: Uses Hugging Face Inference API with advanced language models
- CVE Knowledge Base: Leverages the CIRCL/vulnerability dataset with CWE classifications and CVSS scores
- Multi-Language Support: Analyzes Python, JavaScript, TypeScript, PHP, Java, C/C++, Go, Ruby, and more
- Pattern Recognition: Identifies vulnerability patterns from historical security data
Professional Reporting
- Comprehensive Reports: Detailed findings with CVE references, severity ratings, and code snippets
- Remediation Guidance: Specific fix recommendations for each identified vulnerability
- CWE Mapping: Links vulnerabilities to Common Weakness Enumeration codes
- CVSS Scoring: Provides severity assessment based on industry standards
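For reference, the CVSS v3.x qualitative severity scale maps a base score to a rating as sketched below. The function name is hypothetical; whether the scanner derives ratings this way or reads them from the dataset is not stated in this README.

```python
def cvss_severity(score: float) -> str:
    """Map a CVSS v3.x base score (0.0 to 10.0) to its qualitative rating."""
    if not 0.0 <= score <= 10.0:
        raise ValueError(f"CVSS score out of range: {score}")
    if score == 0.0:
        return "None"
    if score < 4.0:
        return "Low"       # 0.1 to 3.9
    if score < 7.0:
        return "Medium"    # 4.0 to 6.9
    if score < 9.0:
        return "High"      # 7.0 to 8.9
    return "Critical"      # 9.0 to 10.0
```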
🎨 Modern Interface
- Gradio Web UI: User-friendly chat interface for interacting with the agent
- Real-Time Analysis: Immediate feedback as the agent explores and analyzes code
- Interactive Chat: Ask follow-up questions and request deeper analysis
- Secure API Key Handling: Keys entered in UI, never stored permanently
🏗️ System Architecture
MCP-Based Agent Workflow
User Request → AI Agent (Planning) → MCP Tools (GitHub Data) → CVE RAG (Knowledge Retrieval) →
AI Analysis (Reasoning) → Security Report (Execution) → User Response
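The workflow above can be read as a chain of stages that each enrich a shared context. The sketch below shows that shape only: every stage is a hypothetical stub with hard-coded data, standing in for the real MCP tool calls, CVE retrieval, and model inference, none of which are reproduced here.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class ScanContext:
    """State passed from stage to stage (hypothetical structure)."""
    request: str
    files: dict[str, str] = field(default_factory=dict)
    cve_matches: list[str] = field(default_factory=list)
    findings: list[str] = field(default_factory=list)
    report: str = ""


Stage = Callable[[ScanContext], ScanContext]


def fetch_files(ctx: ScanContext) -> ScanContext:
    # Stub for the MCP tools step: returns one fixed file instead of calling GitHub.
    ctx.files["app.py"] = 'query = "SELECT * FROM users WHERE id=" + user_id'
    return ctx


def retrieve_cves(ctx: ScanContext) -> ScanContext:
    # Stub for the CVE RAG step: a placeholder string, not a real CVE record.
    ctx.cve_matches.append("placeholder record about SQL injection")
    return ctx


def analyze(ctx: ScanContext) -> ScanContext:
    # Stub for the AI analysis step: a naive string check, not a language model.
    for path, source in ctx.files.items():
        if "SELECT" in source and "+" in source:
            ctx.findings.append(f"{path}: possible SQL injection")
    return ctx


def write_report(ctx: ScanContext) -> ScanContext:
    ctx.report = "\n".join(ctx.findings) or "No findings"
    return ctx


def run_pipeline(ctx: ScanContext, stages: list[Stage]) -> ScanContext:
    """Run each stage in order, feeding the context forward."""
    for stage in stages:
        ctx = stage(ctx)
    return ctx
```

In the actual system the agent decides which tools to call and in what order, so a fixed stage list like this is a simplification.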
Core Components
GitHub MCP Server (server.py):
- 7 MCP tools for GitHub API access
- Repository information retrieval
- File content extraction and directory scanning
- CVE Knowledge Base with BM25 retrieval
- NVD API integration for official CVE details
AI Agent Client (client.py):
- Autonomous planning and reasoning engine
- MCP tool orchestration
- Agentic RAG with CVE database integration
- Web scraping for enhanced research
- Gradio interface for user interaction
Knowledge Base:
- Hugging Face dataset: CIRCL/vulnerability
- 10,000+ CVE records with descriptions
- CWE codes and CVSS severity scores
- Vulnerability summaries and technical details
Quick Start
Prerequisites
- Python 3.11+
- Hugging Face account and API token (Get one)
- GitHub personal access token (optional, for private repos - Get one)
Installation
# Clone the repository
git clone https://github.com/banno-0720/vulnerability-scanner.git
cd vulnerability-scanner
# Create virtual environment
python -m venv venv
venv\Scripts\activate # Windows
# source venv/bin/activate # Linux/Mac
# Install dependencies
pip install -r requirements.txt
# Authenticate with Hugging Face (for CVE dataset)
huggingface-cli login
# Optional: Set GitHub token in .env
echo "GITHUB_TOKEN=your_token_here" > .env
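One way the optional GITHUB_TOKEN could be picked up at runtime is sketched below. The helper names are hypothetical and the project may well use a library such as python-dotenv instead; the request headers follow GitHub's documented REST API conventions.

```python
import os
from pathlib import Path
from typing import Optional


def parse_env(text: str) -> dict[str, str]:
    """Parse KEY=VALUE lines, skipping blanks and comments and stripping quotes."""
    values: dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def find_github_token(env_path: str = ".env") -> Optional[str]:
    """Prefer the process environment, then fall back to the .env file."""
    token = os.environ.get("GITHUB_TOKEN")
    if token:
        return token
    path = Path(env_path)
    if path.is_file():
        return parse_env(path.read_text(encoding="utf-8")).get("GITHUB_TOKEN")
    return None


def github_headers(token: Optional[str]) -> dict[str, str]:
    """Build GitHub REST API headers; the token is optional for public repositories."""
    headers = {"Accept": "application/vnd.github+json"}
    if token:
        headers["Authorization"] = f"Bearer {token}"
    return headers
```

Stripping surrounding quotes matters because some shells write the quotes from the `echo` command into the file verbatim.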
Usage
# Start the vulnerability scanner
python client.py
# Open browser to http://localhost:7861
# Enter your Hugging Face API key
# Paste a GitHub file URL and watch the agent work!
Example Analysis
Try these test files:
- https://github.com/ayushmittal62/vunreability_scanner_testing/blob/master/python/database.py
- https://github.com/ayushmittal62/vunreability_scanner_testing/blob/master/database/schema.sql
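A GitHub file URL of this form can be split into owner, repository, ref, and path before any content is fetched. The sketch below is an assumption about how that might be done, with hypothetical names; it also assumes the branch name contains no slash, since a URL alone cannot disambiguate `feature/x` from a directory.

```python
from typing import NamedTuple
from urllib.parse import urlparse


class GitHubFileRef(NamedTuple):
    owner: str
    repo: str
    ref: str    # branch, tag, or commit SHA
    path: str   # file path inside the repository


def parse_blob_url(url: str) -> GitHubFileRef:
    """Split https://github.com/<owner>/<repo>/blob/<ref>/<path> into its parts."""
    parsed = urlparse(url)
    if parsed.netloc != "github.com":
        raise ValueError(f"not a github.com URL: {url}")
    parts = parsed.path.strip("/").split("/")
    if len(parts) < 5 or parts[2] != "blob":
        raise ValueError(f"not a GitHub file URL: {url}")
    owner, repo, _, ref = parts[:4]
    return GitHubFileRef(owner, repo, ref, "/".join(parts[4:]))


def raw_url(file_ref: GitHubFileRef) -> str:
    """Build the raw.githubusercontent.com URL that serves the file contents."""
    return (
        "https://raw.githubusercontent.com/"
        f"{file_ref.owner}/{file_ref.repo}/{file_ref.ref}/{file_ref.path}"
    )
```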
Vulnerability Detection Categories
The autonomous agent identifies:
- Injection Vulnerabilities: SQL injection, command injection, code injection
- Cross-Site Scripting (XSS): Reflected, stored, and DOM-based XSS
- Security Misconfigurations: Hardcoded secrets, weak crypto, insecure configs
- Access Control Issues: Broken authentication, session flaws, authorization bypasses
- Data Exposure: Sensitive data in logs, information disclosure
- Input Validation: Path traversal, file upload issues, unvalidated inputs
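As an example of the first category, the sketch below contrasts a query built by string interpolation with a parameterized one, using an in-memory SQLite database. The table, data, and function names are invented for illustration and are not taken from the test repository.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Create a throwaway in-memory database with two sample users."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
    conn.executemany(
        "INSERT INTO users VALUES (?, ?)",
        [("alice", "a-secret"), ("bob", "b-secret")],
    )
    return conn


def find_user_unsafe(conn: sqlite3.Connection, name: str) -> list[tuple]:
    # VULNERABLE: user input is spliced into the SQL text, so quotes in
    # `name` can change the structure of the query.
    query = f"SELECT name, secret FROM users WHERE name = '{name}'"
    return conn.execute(query).fetchall()


def find_user_safe(conn: sqlite3.Connection, name: str) -> list[tuple]:
    # SAFE: the value is bound as a parameter and is never parsed as SQL.
    return conn.execute(
        "SELECT name, secret FROM users WHERE name = ?", (name,)
    ).fetchall()
```

With the input `nobody' OR '1'='1`, the unsafe version returns every row in the table, while the parameterized version returns nothing. This is the kind of pattern a report would typically map to CWE-89.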
📦 Dependencies
Core Agent Framework
- gradio[oauth,mcp]==5.45.0 - Web interface with MCP support
- smolagents[mcp]>=0.1.0 - AI agent framework for autonomous behavior
- mcp==1.10.1 - Model Context Protocol implementation
Agentic RAG Stack
- datasets>=2.0.0 - Hugging Face datasets for CVE data
- langchain>=0.1.0 - LLM application framework
- sentence-transformers>=2.2.0 - Semantic embeddings
- rank-bm25>=0.2.2 - BM25 retrieval algorithm
Tool Integration
- requests>=2.28.0 - HTTP client for APIs
- beautifulsoup4>=4.12.0 - Web scraping
- markdownify>=0.11.6 - HTML to Markdown conversion
Hackathon Submission - Track 2: MCP in Action
Category: Enterprise Applications
Track Tag: mcp-in-action-track-enterprise
Why This Qualifies for Track 2
✅ Autonomous Agent Behavior:
- Planning: Agent analyzes requests and plans multi-step security assessments
- Reasoning: Makes context-aware decisions about tool selection and analysis depth
- Execution: Autonomously navigates repositories, analyzes code, and generates reports
✅ MCP Tools Integration:
- Uses 7 GitHub MCP tools for repository data access
- Seamlessly orchestrates multiple tools in a single analysis workflow
✅ Advanced Agent Features:
- Context Engineering: Maintains conversation state and builds progressive analysis context
- Agentic RAG: Retrieves relevant knowledge from 10,000+ CVE records using the BM25 algorithm
- Multi-Source Synthesis: Combines GitHub data, CVE database, and NVD information
✅ Clear Enterprise Value:
- Automated security assessment for development teams
- Identifies vulnerabilities before production deployment
- Provides actionable remediation guidance
- Reduces manual security review time
✅ Gradio Application: ✅
👥 Team Members
- Himanshu Goyal - @HimanshuGoyal2004
- Ayush Mittal - @baction
Security & Ethics
- Authorized Use Only: Only scan repositories you have permission to analyze
- API Key Security: Keys entered in UI, never stored permanently
- Rate Limiting: Respectful of API quotas and rate limits
- Responsible Disclosure: Use findings responsibly for legitimate security improvements
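Respecting GitHub's rate limits can be as simple as checking the rate-limit headers on each response before sending the next request. The sketch below is an assumption about how that could look, not the project's implementation; the header names are the ones GitHub's REST API returns, and the function name is hypothetical.

```python
import time
from typing import Mapping, Optional


def seconds_until_reset(headers: Mapping[str, str], now: Optional[float] = None) -> float:
    """Return how long to wait before the next GitHub API call.

    Returns 0.0 while quota remains or when the headers are absent.
    """
    lowered = {key.lower(): value for key, value in headers.items()}
    remaining = int(lowered.get("x-ratelimit-remaining", "1"))
    if remaining > 0:
        return 0.0
    # x-ratelimit-reset is the reset time in UTC epoch seconds.
    reset_at = float(lowered.get("x-ratelimit-reset", "0"))
    current = time.time() if now is None else now
    return max(0.0, reset_at - current)
```

A caller would pass `response.headers` from a requests response and sleep for the returned number of seconds when it is positive.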
🤝 Contributing
Contributions welcome! Please:
- Fork the repository
- Create a feature branch
- Make your changes with tests
- Submit a pull request
License
MIT License - See LICENSE.md for details
Acknowledgments
- Anthropic - For the Model Context Protocol and the hackathon
- Hugging Face - For Spaces, Inference API, and CVE dataset
- Gradio - For the excellent web framework with MCP support
- CIRCL - For maintaining the comprehensive vulnerability dataset
Built with ❤️ for the MCP 1st Anniversary Hackathon - Showcasing autonomous AI agents with MCP tools and Agentic RAG