# Medium-MCP Architecture
## System Overview
```
┌─────────────────────────────────────────────────────────────┐
│                        Entry Points                         │
├─────────────────┬─────────────────┬─────────────────────────┤
│    server.py    │     app.py      │           CLI           │
│  (MCP Server)   │   (Gradio UI)   │       (Commands)        │
└────────┬────────┴────────┬────────┴────────┬────────────────┘
         │                 │                 │
         ▼                 ▼                 ▼
┌─────────────────────────────────────────────────────────────┐
│                       ScraperService                        │
│  ┌───────────────────────────────────────────────────────┐  │
│  │                      Tier Chain                       │  │
│  │  ┌────────┐ ┌────────┐ ┌────────┐ ┌────────┐          │  │
│  │  │ Cache  │→│GraphQL │→│ HTTPX  │→│Browser │→...      │  │
│  │  │  (T0)  │ │ (T10)  │ │ (T20)  │ │ (T30)  │          │  │
│  │  └────────┘ └────────┘ └────────┘ └────────┘          │  │
│  └───────────────────────────────────────────────────────┘  │
└─────────────────────────────────────────────────────────────┘
         │                 │                 │
         ▼                 ▼                 ▼
  ┌─────────────┐   ┌─────────────┐   ┌─────────────┐
  │  database   │   │  extractor  │   │   parser    │
  │  (SQLite)   │   │ (Apollo/LD) │   │   (HTML)    │
  └─────────────┘   └─────────────┘   └─────────────┘
```
---
## Core Modules
### Entry Points
| Module | Purpose |
|--------|---------|
| `server.py` | MCP server (FastMCP) |
| `app.py` | Gradio web UI |
### Service Layer
| Module | Purpose |
|--------|---------|
| `src/service.py` | ScraperService orchestration |
| `src/tiers/` | Tier chain pattern |
### Data Layer
| Module | Purpose |
|--------|---------|
| `src/database.py` | SQLite caching |
| `src/extractor.py` | Content extraction |
| `src/parser.py` | HTML parsing |
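
To make the caching role of the data layer concrete, here is a minimal sketch of a SQLite-backed article cache using only the standard library. The class name `ArticleCache`, the `articles` table, and the choice to store each article as a JSON string keyed by post ID are all assumptions for illustration; the actual schema in `src/database.py` may differ.

```python
import json
import sqlite3
from typing import Optional


class ArticleCache:
    """Hypothetical key-value cache: post ID -> JSON-encoded article."""

    def __init__(self, path: str = ":memory:") -> None:
        self._conn = sqlite3.connect(path)
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS articles ("
            "post_id TEXT PRIMARY KEY, payload TEXT NOT NULL)"
        )

    def get(self, post_id: str) -> Optional[dict]:
        """Return the cached article, or None on a cache miss."""
        row = self._conn.execute(
            "SELECT payload FROM articles WHERE post_id = ?", (post_id,)
        ).fetchone()
        return json.loads(row[0]) if row else None

    def put(self, post_id: str, article: dict) -> None:
        """Insert or overwrite the cached article for this post ID."""
        with self._conn:  # commits on success, rolls back on error
            self._conn.execute(
                "INSERT OR REPLACE INTO articles (post_id, payload) VALUES (?, ?)",
                (post_id, json.dumps(article)),
            )


cache = ArticleCache()
cache.put("abc123def456", {"title": "Example"})
print(cache.get("abc123def456"))
```

Parameterized queries (`?` placeholders) keep untrusted post IDs out of the SQL text.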
### Infrastructure
| Module | Purpose |
|--------|---------|
| `src/http_pool.py` | Connection pooling |
| `src/resilience.py` | Circuit breaker |
| `src/validation.py` | Input validation |
| `src/security.py` | Rate limiting |
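
For readers unfamiliar with the circuit breaker pattern named above, the following is a minimal sketch of the general technique, not the code in `src/resilience.py`. The class name, the `max_failures` and `reset_after` parameters, and the injectable `clock` are assumptions chosen to keep the example small and testable.

```python
import time
from typing import Callable, Optional


class CircuitBreaker:
    """Opens after `max_failures` consecutive failures, then permits a
    trial call once `reset_after` seconds have passed."""

    def __init__(
        self,
        max_failures: int = 3,
        reset_after: float = 30.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.max_failures = max_failures
        self.reset_after = reset_after
        self._clock = clock
        self._failures = 0
        self._opened_at: Optional[float] = None

    def allow(self) -> bool:
        """Return True if a call may be attempted right now."""
        if self._opened_at is None:
            return True
        # Half-open: allow a trial call after the cool-down period.
        return self._clock() - self._opened_at >= self.reset_after

    def record_success(self) -> None:
        """Close the circuit and reset the failure count."""
        self._failures = 0
        self._opened_at = None

    def record_failure(self) -> None:
        """Count a failure and open the circuit at the threshold."""
        self._failures += 1
        if self._failures >= self.max_failures:
            self._opened_at = self._clock()


now = [0.0]  # fake clock so the example does not need to sleep
breaker = CircuitBreaker(max_failures=2, reset_after=10.0, clock=lambda: now[0])
breaker.record_failure()
breaker.record_failure()
print(breaker.allow())  # circuit is open, so the call is refused
```

A caller would check `allow()` before hitting an upstream service and report the outcome with `record_success()` or `record_failure()`.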
---
## Tier Chain
| Priority | Tier | Description |
|----------|------|-------------|
| 0 | Cache | SQLite lookup |
| 10 | GraphQL | Medium API |
| 20 | HTTPX | Fast HTTP |
| 30 | Browser | Playwright |
| 40 | Wayback | Archive.org |

Tiers run in ascending priority order. Each tier returns `TierResult(success, data, error)`, and the chain stops at the first successful result.

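
The chain can be sketched as a loop over tiers sorted by priority. Only the `TierResult(success, data, error)` shape and the priority values come from the table above; the `Tier` wrapper, `run_chain`, and the stub tier functions are hypothetical names, and the real classes under `src/tiers/` may be structured differently.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class TierResult:
    """Outcome of one tier attempt."""
    success: bool
    data: Optional[dict] = None
    error: Optional[str] = None


@dataclass
class Tier:
    """Hypothetical tier wrapper: a name, a priority, and a fetch callable."""
    name: str
    priority: int
    fetch: Callable[[str], TierResult]


def run_chain(tiers: List[Tier], url: str) -> TierResult:
    """Try tiers in ascending priority order; return the first success."""
    errors = []
    for tier in sorted(tiers, key=lambda t: t.priority):
        try:
            result = tier.fetch(url)
        except Exception as exc:  # a crashing tier should not stop the chain
            result = TierResult(success=False, error=str(exc))
        if result.success:
            return result
        errors.append(f"{tier.name}: {result.error}")
    # Every tier failed: report all collected errors.
    return TierResult(success=False, error="; ".join(errors))


# Stub tiers standing in for the real implementations.
def cache_miss(url: str) -> TierResult:
    return TierResult(success=False, error="not cached")


def graphql_ok(url: str) -> TierResult:
    return TierResult(success=True, data={"url": url, "source": "graphql"})


tiers = [Tier("GraphQL", 10, graphql_ok), Tier("Cache", 0, cache_miss)]
result = run_chain(tiers, "https://medium.com/@user/some-post-abc123def456")
print(result.success, result.data["source"])
```

Sorting by priority at call time means tiers can be registered in any order, and leaving gaps between priorities (0, 10, 20, ...) leaves room to slot in a new tier later.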
---
## Data Flow
```
URL → validate → resolve_id → tier_chain → extract → cache → render
```
1. **Validate**: Check URL format
2. **Resolve**: Extract post ID
3. **Tier Chain**: Try each tier until success
4. **Extract**: Parse Apollo/JSON-LD
5. **Cache**: Store in SQLite
6. **Render**: Output markdown/HTML
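
The first two steps can be sketched with the standard library. This assumes the post ID is a trailing run of 8 to 12 lowercase hex characters in the last URL path segment, which matches common Medium URLs but is not confirmed by this document; the function bodies and the `_POST_ID` pattern are illustrative and are not the contents of `src/validation.py`.

```python
import re
from typing import Optional
from urllib.parse import urlparse

# Assumption: the post ID is the trailing hex token of the last path segment.
_POST_ID = re.compile(r"(?:^|-)([0-9a-f]{8,12})$")


def validate(url: str) -> bool:
    """Accept only absolute http(s) URLs that include a host."""
    parts = urlparse(url)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def resolve_id(url: str) -> Optional[str]:
    """Extract the post ID from the last path segment, or None if absent."""
    segment = urlparse(url).path.rstrip("/").rsplit("/", 1)[-1]
    match = _POST_ID.search(segment)
    return match.group(1) if match else None


url = "https://medium.com/@user/some-post-abc123def456"
if validate(url):
    print(resolve_id(url))
```

The resolved ID is a natural cache key for the Cache tier, since the same post can be reached through several URL variants.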
---
## Package Structure
```
Medium-MCP/
├── src/
│   ├── tiers/           # Tier implementations
│   ├── validation.py    # Input validation
│   ├── security.py      # Rate limiting
│   ├── exceptions.py    # Error handling
│   └── metrics.py       # Performance metrics
├── mcp/
│   ├── schemas.py       # Pydantic models
│   └── tools/           # Tool implementations
├── ui/
│   └── styles/          # CSS modules
├── tests/
│   ├── unit/
│   ├── integration/
│   └── e2e/
└── docs/
```