Nikhil Pravin Pise
Medium-MCP Architecture
System Overview
┌─────────────────────────────────────────────────────────────┐
│                        Entry Points                         │
├─────────────────┬─────────────────┬─────────────────────────┤
│    server.py    │     app.py      │           CLI           │
│  (MCP Server)   │   (Gradio UI)   │       (Commands)        │
└────────┬────────┴────────┬────────┴────────┬────────────────┘
         │                 │                 │
         ▼                 ▼                 ▼
┌─────────────────────────────────────────────────────────────┐
│                       ScraperService                        │
│  ┌───────────────────────────────────────────────────────┐  │
│  │                      Tier Chain                       │  │
│  │  ┌────────┐  ┌────────┐  ┌────────┐  ┌────────┐       │  │
│  │  │ Cache  │─▶│GraphQL │─▶│ HTTPX  │─▶│Browser │─▶...  │  │
│  │  │  (T0)  │  │ (T10)  │  │ (T20)  │  │ (T30)  │       │  │
│  │  └────────┘  └────────┘  └────────┘  └────────┘       │  │
│  └───────────────────────────────────────────────────────┘  │
└─────────────────────────────────────────────────────────────┘
         │                 │                 │
         ▼                 ▼                 ▼
  ┌─────────────┐   ┌─────────────┐   ┌─────────────┐
  │  database   │   │  extractor  │   │   parser    │
  │  (SQLite)   │   │ (Apollo/LD) │   │   (HTML)    │
  └─────────────┘   └─────────────┘   └─────────────┘
Core Modules
Entry Points
| Module | Purpose |
|--------|---------|
| server.py | MCP server (FastMCP) |
| app.py | Gradio web UI |
Service Layer
| Module | Purpose |
|--------|---------|
| src/service.py | ScraperService orchestration |
| src/tiers/ | Tier chain pattern |
Data Layer
| Module | Purpose |
|--------|---------|
| src/database.py | SQLite caching |
| src/extractor.py | Content extraction |
| src/parser.py | HTML parsing |
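The caching role of the data layer can be illustrated with a minimal sketch. It assumes a single table keyed by post ID with a JSON payload and a time-to-live; the class name `ArticleCache`, the table schema, and the TTL parameter are all hypothetical and are not taken from src/database.py.

```python
import json
import sqlite3
import time
from typing import Optional


class ArticleCache:
    """Hypothetical SQLite-backed cache keyed by post ID (schema is assumed)."""

    def __init__(self, path: str = ":memory:", ttl_seconds: float = 3600.0):
        self.ttl_seconds = ttl_seconds
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS articles ("
            "post_id TEXT PRIMARY KEY, "
            "payload TEXT NOT NULL, "
            "stored_at REAL NOT NULL)"
        )

    def put(self, post_id: str, article: dict) -> None:
        """Insert or overwrite the cached article for post_id."""
        with self.conn:  # commits on success, rolls back on error
            self.conn.execute(
                "INSERT OR REPLACE INTO articles VALUES (?, ?, ?)",
                (post_id, json.dumps(article), time.time()),
            )

    def get(self, post_id: str) -> Optional[dict]:
        """Return the cached article, or None if missing or older than the TTL."""
        row = self.conn.execute(
            "SELECT payload, stored_at FROM articles WHERE post_id = ?",
            (post_id,),
        ).fetchone()
        if row is None or time.time() - row[1] > self.ttl_seconds:
            return None
        return json.loads(row[0])


cache = ArticleCache()
cache.put("1a2b3c4d5e6f", {"title": "Hello"})
print(cache.get("1a2b3c4d5e6f"))
```

An in-memory database keeps the sketch self-contained; a file path would make the cache persist across runs.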
Infrastructure
| Module | Purpose |
|--------|---------|
| src/http_pool.py | Connection pooling |
| src/resilience.py | Circuit breaker |
| src/validation.py | Input validation |
| src/security.py | Rate limiting |
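For readers unfamiliar with the circuit breaker pattern named above, here is a minimal sketch of the general technique. It is not the code in src/resilience.py: the class name, the `failure_threshold` and `reset_timeout` parameters, and the injectable `clock` are assumptions made for illustration.

```python
import time
from typing import Any, Callable, Optional


class CircuitBreaker:
    """Hypothetical breaker: opens after N consecutive failures, retries after a timeout."""

    def __init__(
        self,
        failure_threshold: int = 3,
        reset_timeout: float = 30.0,
        clock: Callable[[], float] = time.monotonic,
    ):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self._clock = clock
        self._failures = 0
        self._opened_at: Optional[float] = None

    @property
    def is_open(self) -> bool:
        """True while calls should be rejected without being attempted."""
        if self._opened_at is None:
            return False
        # After the timeout the breaker is half-open: one trial call is allowed.
        return self._clock() - self._opened_at < self.reset_timeout

    def call(self, func: Callable[..., Any], *args: Any, **kwargs: Any) -> Any:
        if self.is_open:
            raise RuntimeError("circuit open; call skipped")
        try:
            result = func(*args, **kwargs)
        except Exception:
            self._failures += 1
            if self._failures >= self.failure_threshold:
                self._opened_at = self._clock()  # (re)open the circuit
            raise
        # Any success closes the circuit and clears the failure count.
        self._failures = 0
        self._opened_at = None
        return result


breaker = CircuitBreaker(failure_threshold=2, reset_timeout=5.0)
print(breaker.call(lambda: "ok"))
```

Wrapping each remote tier in its own breaker would let a repeatedly failing upstream be skipped quickly, though whether the project does this per tier is not stated here.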
Tier Chain
| Priority | Tier | Description |
|----------|------|-------------|
| 0 | Cache | SQLite lookup |
| 10 | GraphQL | Medium API |
| 20 | HTTPX | Fast HTTP |
| 30 | Browser | Playwright |
| 40 | Wayback | Archive.org |
Each tier returns TierResult(success, data, error).
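The chain can be sketched as follows. Only the `TierResult(success, data, error)` field names come from the text above; the `Tier` wrapper, the `run_tier_chain` function, and the two stand-in tiers are hypothetical and exist only to show the "try in priority order, stop at first success" idea.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional


@dataclass
class TierResult:
    """Outcome of one tier attempt; field names follow the text above."""
    success: bool
    data: Optional[Any] = None
    error: Optional[str] = None


@dataclass
class Tier:
    """Hypothetical tier wrapper: a priority, a name, and a fetch callable."""
    priority: int
    name: str
    fetch: Callable[[str], TierResult]


def run_tier_chain(tiers: list[Tier], post_id: str) -> TierResult:
    """Try tiers in ascending priority and return the first success."""
    errors = []
    for tier in sorted(tiers, key=lambda t: t.priority):
        try:
            result = tier.fetch(post_id)
        except Exception as exc:  # a crashing tier should not stop the chain
            result = TierResult(success=False, error=str(exc))
        if result.success:
            return result
        errors.append(f"{tier.name}: {result.error}")
    # Every tier failed: report all collected errors in one result.
    return TierResult(success=False, error="; ".join(errors))


def cache_miss(post_id: str) -> TierResult:
    return TierResult(success=False, error="not cached")


def graphql_ok(post_id: str) -> TierResult:
    return TierResult(success=True, data={"id": post_id, "source": "graphql"})


chain = [Tier(10, "GraphQL", graphql_ok), Tier(0, "Cache", cache_miss)]
result = run_tier_chain(chain, "1a2b3c4d5e6f")
print(result.success, result.data["source"])
```

Sorting by priority inside the runner means tiers can be registered in any order, and spacing priorities by 10 leaves room to slot a new tier between existing ones.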
Data Flow
URL → validate → resolve_id → tier_chain → extract → cache → render
- Validate: Check URL format
- Resolve: Extract post ID
- Tier Chain: Try each tier until success
- Extract: Parse Apollo/JSON-LD
- Cache: Store in SQLite
- Render: Output markdown/HTML
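The steps above can be sketched end to end. This is an illustration under stated assumptions, not the project's code: the post ID is assumed to be the trailing run of 8 to 12 hex characters in the URL path, `fetch` stands in for the tier chain plus extraction, a plain dict stands in for the SQLite cache, and all function names other than `validate` and `resolve_id` are hypothetical.

```python
import re
from typing import Callable, Optional
from urllib.parse import urlparse

# Assumption: the post ID is the trailing run of 8-12 hex characters in the path.
POST_ID_RE = re.compile(r"([0-9a-f]{8,12})$")


def validate(url: str) -> bool:
    """Accept only http(s) URLs that have a host and a non-empty path."""
    parsed = urlparse(url)
    has_path = parsed.path not in ("", "/")
    return parsed.scheme in ("http", "https") and bool(parsed.netloc) and has_path


def resolve_id(url: str) -> Optional[str]:
    """Return the trailing hex ID from the URL path, or None if absent."""
    path = urlparse(url).path.rstrip("/")
    match = POST_ID_RE.search(path)
    return match.group(1) if match else None


def render_markdown(article: dict) -> str:
    """Render an extracted article as a minimal markdown document."""
    return f"# {article['title']}\n\n{article['body']}"


def process(url: str, fetch: Callable[[str], dict], cache: dict) -> str:
    """Run validate -> resolve_id -> fetch -> cache -> render for one URL."""
    if not validate(url):
        raise ValueError(f"invalid URL: {url!r}")
    post_id = resolve_id(url)
    if post_id is None:
        raise ValueError(f"no post ID found in {url!r}")
    article = fetch(post_id)   # stands in for tier_chain + extract
    cache[post_id] = article   # stands in for the SQLite write
    return render_markdown(article)


def fake_fetch(post_id: str) -> dict:
    return {"title": "Hello", "body": f"Post {post_id}"}


store: dict = {}
print(process("https://medium.com/@user/some-title-1a2b3c4d5e6f", fake_fetch, store))
```

Failing fast on validation and ID resolution keeps malformed input from ever reaching the network tiers.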
Package Structure
Medium-MCP/
├── src/
│   ├── tiers/          # Tier implementations
│   ├── validation.py   # Input validation
│   ├── security.py     # Rate limiting
│   ├── exceptions.py   # Error handling
│   └── metrics.py      # Performance metrics
├── mcp/
│   ├── schemas.py      # Pydantic models
│   └── tools/          # Tool implementations
├── ui/
│   └── styles/         # CSS modules
├── tests/
│   ├── unit/
│   ├── integration/
│   └── e2e/
└── docs/