# Medium-MCP Architecture

## System Overview

```
┌─────────────────────────────────────────────────────────────┐
│                        Entry Points                         │
├─────────────────┬─────────────────┬─────────────────────────┤
│   server.py     │     app.py      │           CLI           │
│  (MCP Server)   │   (Gradio UI)   │       (Commands)        │
└────────┬────────┴────────┬────────┴────────┬────────────────┘
         │                 │                 │
         ▼                 ▼                 ▼
┌─────────────────────────────────────────────────────────────┐
│                       ScraperService                        │
│  ┌───────────────────────────────────────────────────────┐  │
│  │                      Tier Chain                       │  │
│  │  ┌────────┐ ┌────────┐ ┌────────┐ ┌────────┐          │  │
│  │  │ Cache  │→│GraphQL │→│ HTTPX  │→│Browser │→...      │  │
│  │  │  (T0)  │ │ (T10)  │ │ (T20)  │ │ (T30)  │          │  │
│  │  └────────┘ └────────┘ └────────┘ └────────┘          │  │
│  └───────────────────────────────────────────────────────┘  │
└─────────────────────────────────────────────────────────────┘
         │                 │                 │
         ▼                 ▼                 ▼
  ┌─────────────┐   ┌─────────────┐   ┌─────────────┐
  │  database   │   │  extractor  │   │   parser    │
  │  (SQLite)   │   │ (Apollo/LD) │   │   (HTML)    │
  └─────────────┘   └─────────────┘   └─────────────┘
```

---

## Core Modules

### Entry Points

| Module | Purpose |
|--------|---------|
| `server.py` | MCP server (FastMCP) |
| `app.py` | Gradio web UI |

### Service Layer

| Module | Purpose |
|--------|---------|
| `src/service.py` | ScraperService orchestration |
| `src/tiers/` | Tier chain pattern |

### Data Layer

| Module | Purpose |
|--------|---------|
| `src/database.py` | SQLite caching |
| `src/extractor.py` | Content extraction |
| `src/parser.py` | HTML parsing |
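
To make the SQLite caching role concrete, here is a minimal sketch of a post cache built on the standard-library `sqlite3` module. It is not taken from `src/database.py`: the `ArticleCache` class, the `articles` table layout, and the TTL behaviour are all assumptions made for illustration.

```python
import json
import sqlite3
import time
from typing import Any, Optional


class ArticleCache:
    """Hypothetical cache keyed by post ID; the schema is an assumption."""

    def __init__(self, path: str = ":memory:", ttl_seconds: float = 3600.0) -> None:
        self._conn = sqlite3.connect(path)
        self._ttl = ttl_seconds
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS articles ("
            "post_id TEXT PRIMARY KEY, "
            "payload TEXT NOT NULL, "
            "stored_at REAL NOT NULL)"
        )

    def put(self, post_id: str, payload: dict[str, Any]) -> None:
        # The connection context manager commits on success, rolls back on error.
        with self._conn:
            self._conn.execute(
                "INSERT OR REPLACE INTO articles VALUES (?, ?, ?)",
                (post_id, json.dumps(payload), time.time()),
            )

    def get(self, post_id: str) -> Optional[dict[str, Any]]:
        row = self._conn.execute(
            "SELECT payload, stored_at FROM articles WHERE post_id = ?",
            (post_id,),
        ).fetchone()
        # Treat missing and expired entries the same way: a cache miss.
        if row is None or time.time() - row[1] > self._ttl:
            return None
        return json.loads(row[0])
```

Returning `None` for both missing and stale rows lets a cache tier report a plain miss and hand over to the next tier.
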

### Infrastructure

| Module | Purpose |
|--------|---------|
| `src/http_pool.py` | Connection pooling |
| `src/resilience.py` | Circuit breaker |
| `src/validation.py` | Input validation |
| `src/security.py` | Rate limiting |
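
For readers unfamiliar with the circuit breaker pattern named above, the following is a generic sketch of the idea, not the contents of `src/resilience.py`. The class name, the thresholds, and the injectable `clock` parameter are assumptions chosen to keep the example small and testable.

```python
import time
from typing import Callable, Optional


class CircuitBreaker:
    """Generic circuit breaker sketch (hypothetical API)."""

    def __init__(
        self,
        failure_threshold: int = 3,
        reset_after: float = 30.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.failure_threshold = failure_threshold
        self.reset_after = reset_after
        self._clock = clock
        self._failures = 0
        self._opened_at: Optional[float] = None

    def allow(self) -> bool:
        """Return True if a call may be attempted right now."""
        if self._opened_at is None:
            return True  # closed: calls flow normally
        # Open: block calls until the cool-down has elapsed, then allow a trial.
        return self._clock() - self._opened_at >= self.reset_after

    def record_success(self) -> None:
        # A successful call closes the circuit and clears the failure count.
        self._failures = 0
        self._opened_at = None

    def record_failure(self) -> None:
        self._failures += 1
        if self._failures >= self.failure_threshold:
            # Open (or re-open) the circuit and restart the cool-down.
            self._opened_at = self._clock()
```

A caller would check `allow()` before hitting an upstream service, then report the outcome with `record_success()` or `record_failure()`.
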

---

## Tier Chain

| Priority | Tier | Description |
|----------|------|-------------|
| 0 | Cache | SQLite lookup |
| 10 | GraphQL | Medium API |
| 20 | HTTPX | Fast HTTP |
| 30 | Browser | Playwright |
| 40 | Wayback | Archive.org |

Each tier returns `TierResult(success, data, error)`.
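
The chain can be pictured as a loop over tiers in ascending priority that stops at the first success. The sketch below assumes each tier exposes a `name`, a `priority`, and an async `fetch` method. Only `TierResult(success, data, error)` and the priority values come from this document; `Tier`, `StaticTier`, and `run_chain` are hypothetical names.

```python
import asyncio
from dataclasses import dataclass
from typing import Any, Optional, Protocol


@dataclass
class TierResult:
    success: bool
    data: Optional[Any] = None
    error: Optional[str] = None


class Tier(Protocol):
    """Assumed tier interface; the real one in src/tiers/ may differ."""
    name: str
    priority: int

    async def fetch(self, url: str) -> TierResult: ...


@dataclass
class StaticTier:
    """Stand-in tier that returns a canned result (illustration only)."""
    name: str
    priority: int
    result: TierResult

    async def fetch(self, url: str) -> TierResult:
        return self.result


async def run_chain(tiers: list[Tier], url: str) -> TierResult:
    """Try tiers from lowest to highest priority; return the first success."""
    errors: list[str] = []
    for tier in sorted(tiers, key=lambda t: t.priority):
        try:
            result = await tier.fetch(url)
        except Exception as exc:  # a crashing tier should not abort the chain
            result = TierResult(False, error=str(exc))
        if result.success:
            return result
        errors.append(f"{tier.name}: {result.error}")
    return TierResult(False, error="; ".join(errors) or "no tiers configured")


if __name__ == "__main__":
    demo_tiers = [
        StaticTier("HTTPX", 20, TierResult(True, data="<html>...</html>")),
        StaticTier("Cache", 0, TierResult(False, error="miss")),
    ]
    print(asyncio.run(run_chain(demo_tiers, "https://medium.com/p/1a2b3c4d5e6f")))
```

Collecting the per-tier errors means a total failure still reports why each tier gave up.
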

---

## Data Flow

```
URL → validate → resolve_id → tier_chain → extract → cache → render
```

1. **Validate**: Check URL format
2. **Resolve**: Extract post ID
3. **Tier Chain**: Try each tier until success
4. **Extract**: Parse Apollo/JSON-LD
5. **Cache**: Store in SQLite
6. **Render**: Output markdown/HTML
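
Steps 1 and 2 could be sketched as follows. The function names mirror the flow above, but the bodies are assumptions: the sketch treats any http(s) URL with a host as valid, and it assumes the post ID is the trailing 8 to 12 character hex token of the last path segment, which may not cover every URL shape the project handles.

```python
import re
from typing import Optional
from urllib.parse import urlparse

# Assumption: the ID is a lowercase hex token at the very end of the slug,
# either on its own ("/p/1a2b3c4d5e6f") or after a hyphen ("my-post-1a2b3c4d5e6f").
_POST_ID_RE = re.compile(r"(?:^|-)([0-9a-f]{8,12})$")


def validate(url: str) -> bool:
    """Accept only absolute http(s) URLs with a host."""
    parsed = urlparse(url)
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


def resolve_id(url: str) -> Optional[str]:
    """Return the trailing hex post ID, or None if the URL has none."""
    path = urlparse(url).path.rstrip("/")
    last_segment = path.rsplit("/", 1)[-1]
    match = _POST_ID_RE.search(last_segment)
    return match.group(1) if match else None
```

Keeping validation separate from ID resolution lets malformed input be rejected before any tier is contacted.
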

---

## Package Structure

```
Medium-MCP/
├── src/
│   ├── tiers/           # Tier implementations
│   ├── validation.py    # Input validation
│   ├── security.py      # Rate limiting
│   ├── exceptions.py    # Error handling
│   └── metrics.py       # Performance metrics
├── mcp/
│   ├── schemas.py       # Pydantic models
│   └── tools/           # Tool implementations
├── ui/
│   └── styles/          # CSS modules
├── tests/
│   ├── unit/
│   ├── integration/
│   └── e2e/
└── docs/
```