---
title: Markdown Layout Extractor
emoji: 📄
colorFrom: red
colorTo: yellow
sdk: docker
app_port: 7860
pinned: false
---

# PDF to Markdown MCP

Python 3.12 · uv · FastMCP · Mistral AI · Starlette · Uvicorn · Loguru

An MCP (Model Context Protocol) server that converts PDFs and documents into Markdown using **Mistral OCR**.

## Features

- **`pdf_to_markdown`**: Convert any publicly accessible PDF/document URL to merged Markdown.
- **`pdf_to_structured_markdown`**: Convert and get per-page structured output (page index, individual markdown, merged result).
- CORS-enabled SSE transport: connect from any MCP client or inspector.
- `/health` endpoint for liveness probing.
- Structured, colorized logging via Loguru.

## Project Structure

```
pdf_to_md_mcp/
├── main.py                    # Entry point (uvicorn runner)
├── pyproject.toml
├── sample.env                 # Secrets reference (copy to .env)
├── development.yml            # Non-secret config (server, CORS, OCR model)
└── app/
    ├── server.py              # ASGI app factory (MCP + CORS + health)
    ├── core/
    │   ├── config.py          # Pydantic settings (loads .env + development.yml)
    │   ├── logger.py          # Loguru logger
    │   ├── lifespan.py        # AppContext + Mistral client lifecycle
    │   └── exceptions.py      # Domain exceptions
    ├── services/
    │   └── ocr_service.py     # Mistral OCR business logic
    ├── tools/
    │   └── markdown_tools.py  # @mcp.tool() definitions
    └── utils/
        ├── response.py        # create_response() helper
        └── validators.py      # URL validation
```

## Setup

```bash
# Install uv if not already installed
curl -LsSf https://astral.sh/uv/install.sh | sh

# Install dependencies
uv sync

# Configure secrets
cp sample.env .env
# Edit .env and set MISTRAL_API_KEY
# Non-secret config (server, CORS, OCR model) lives in development.yml
```

## Run

```bash
uv run main.py
```

Server starts at `http://127.0.0.1:8000` by default.

| Endpoint | Description |
| --- | --- |
| `GET /health` | Liveness probe |
| `GET /sse` | MCP SSE transport |
| `POST /messages/` | MCP message handler |

## MCP Tools

### `pdf_to_markdown`

Convert a document URL to merged Markdown (all pages concatenated).


**Input**

| Parameter | Type | Description |
| --- | --- | --- |
| `document_url` | `string` | Publicly accessible URL of a PDF or image document |

**Returns**: `string`

```
# Introduction

This paper presents...

## Section 2

...
```

---

### `pdf_to_structured_markdown`

Convert a document URL and get per-page structured output alongside the merged result.

**Input**

| Parameter | Type | Description |
| --- | --- | --- |
| `document_url` | `string` | Publicly accessible URL of a PDF or image document |

**Returns**: `object`

```json
{
  "page_count": 3,
  "pages": [
    { "index": 0, "markdown": "# Page 1\n..." },
    { "index": 1, "markdown": "## Page 2\n..." },
    { "index": 2, "markdown": "### Page 3\n..." }
  ],
  "markdown": "# Page 1\n...\n\n## Page 2\n...\n\n### Page 3\n..."
}
```

## Debugging with MCP Inspector

```bash
npx -y @modelcontextprotocol/inspector
```

Connect to `http://127.0.0.1:8000/sse` locally or your Railway URL in production.

## Deploy to Railway

### 1. Push to GitHub

```bash
git init
git add .
git commit -m "initial commit"
gh repo create pdf-to-md-mcp --public --source=. --push
```

### 2. Create a Railway project

Go to [railway.app](https://railway.app) → **New Project** → **Deploy from GitHub repo** → select your repo.

Railway detects the `railway.json` and uses `uv run main.py` as the start command automatically.

### 3. Set environment variables

In Railway → your service → **Variables**, add:

| Variable | Value |
|---|---|
| `MISTRAL_API_KEY` | your Mistral API key |
| `HOST` | `0.0.0.0` |

> `PORT` is injected automatically by Railway; do **not** set it manually.
> All other config (`MISTRAL_OCR_MODEL`, `LOG_LEVEL`, etc.) is read from `development.yml`.

### 4. Deploy

Railway triggers a deploy on every push to your default branch.
Once live, your public SSE URL will be:

```
https://.up.railway.app/sse
```

Use that URL in any MCP client or pass it to the inspector:

```bash
npx -y @modelcontextprotocol/inspector
# connect to: https://.up.railway.app/sse
```

### Why it works

- Railway injects `PORT` as an env var; pydantic-settings reads env vars before `development.yml`, so it's picked up automatically.
- `HOST=0.0.0.0` (set via Railway Variables) overrides the local `127.0.0.1` default so the container is reachable.
- `proxy_headers=True` in `main.py` makes uvicorn trust Railway's `X-Forwarded-*` headers.
- `/health` is set as Railway's healthcheck path in `railway.json`.

## Configuration

Configuration is split across two files to separate secrets from non-sensitive settings.

### `.env`: Secrets only

```dotenv
MISTRAL_API_KEY=your_mistral_api_key_here
```

### `development.yml`: Non-secret config

```yaml
# Mistral
MISTRAL_OCR_MODEL: mistral-ocr-latest
MISTRAL_TABLE_FORMAT: markdown

# Server
APP_NAME: "Markdown & Layout Extractor"
HOST: "127.0.0.1"
PORT: 8000
LOG_LEVEL: INFO

# CORS
CORS_ALLOW_ORIGINS:
  - "*"
CORS_ALLOW_METHODS:
  - "*"
CORS_ALLOW_HEADERS:
  - "*"
```

**Priority (highest → lowest):** environment variables → `.env` → `development.yml`

### All settings

| Variable | File | Default | Description |
| --- | --- | --- | --- |
| `MISTRAL_API_KEY` | `.env` | **required** | Mistral AI API key |
| `MISTRAL_OCR_MODEL` | `development.yml` | `mistral-ocr-latest` | OCR model identifier |
| `MISTRAL_TABLE_FORMAT` | `development.yml` | `markdown` | Table output format |
| `APP_NAME` | `development.yml` | `Markdown & Layout Extractor` | MCP server name |
| `HOST` | `development.yml` | `127.0.0.1` | Bind address |
| `PORT` | `development.yml` | `8000` | Bind port |
| `LOG_LEVEL` | `development.yml` | `INFO` | Log level (`DEBUG`, `INFO`, `WARNING`, `ERROR`) |
| `CORS_ALLOW_ORIGINS` | `development.yml` | `["*"]` | Allowed CORS origins |
| `CORS_ALLOW_METHODS` | `development.yml` | `["*"]` | Allowed HTTP methods |
| `CORS_ALLOW_HEADERS` | `development.yml` | `["*"]` | Allowed HTTP headers |

## Design Notes

- **Single Starlette app**: `sse_app()` is the sole ASGI application; the health route and CORS middleware are injected directly onto it to prevent double-middleware stacking (which causes the `http.response.start` crash).
- **Separation of concerns**: Tools are thin wrappers around `OCRService`; business logic is independently testable.
- **Lifespan-managed client**: The Mistral client is initialized once at startup and shared across all tool calls.
- **Loguru logging**: Structured, colorized logs across all layers via Loguru.
- **Pydantic Settings**: Type-safe, `.env`-driven configuration with an LRU-cached singleton.
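
The configuration priority (environment variables → `.env` → `development.yml`) and the LRU-cached singleton can be illustrated with a minimal standard-library sketch. This is not the project's `config.py`, which uses pydantic-settings; the names `DOTENV_VALUES`, `YAML_VALUES`, `resolve_settings`, and `get_settings` are hypothetical, and the two dictionaries stand in for the already-parsed contents of `.env` and `development.yml`.

```python
import os
from collections import ChainMap
from functools import lru_cache

# Hypothetical stand-ins for the parsed contents of `.env` and `development.yml`.
DOTENV_VALUES = {"MISTRAL_API_KEY": "your_mistral_api_key_here"}
YAML_VALUES = {"HOST": "127.0.0.1", "PORT": "8000", "LOG_LEVEL": "INFO"}
KNOWN_KEYS = set(DOTENV_VALUES) | set(YAML_VALUES)


def resolve_settings(env, dotenv, yaml_values):
    """Merge the three sources; mappings listed first win on key conflicts."""
    return dict(ChainMap(env, dotenv, yaml_values))


@lru_cache(maxsize=1)
def get_settings():
    """Build the settings once; later calls return the same cached object."""
    env = {k: v for k, v in os.environ.items() if k in KNOWN_KEYS}
    return resolve_settings(env, DOTENV_VALUES, YAML_VALUES)


# Simulate a platform injecting PORT and HOST as environment variables.
merged = resolve_settings(
    {"PORT": "9000", "HOST": "0.0.0.0"}, DOTENV_VALUES, YAML_VALUES
)
print(merged["HOST"], merged["PORT"], merged["LOG_LEVEL"])
```

In this sketch the injected `HOST` and `PORT` override the file defaults, while `LOG_LEVEL` falls through to the YAML value, which mirrors the behaviour described under "Why it works". Caching `get_settings()` means every caller sees one shared settings object rather than re-reading the sources on each call.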