---
title: Marker PDF
emoji: 📄
colorFrom: yellow
colorTo: green
sdk: gradio
sdk_version: 5.29.0
app_file: app.py
pinned: false
license: apache-2.0
short_description: Extracts structured scientific content (Markdown) from PDFs
---

# Marker PDF → Markdown

This Space demonstrates [Marker](https://github.com/VikParuchuri/marker), an open-source PDF parser developed by Vik Paruchuri, by using it to extract Markdown content from PDF articles.

You can either:

- Upload a `.pdf` file
- Or paste a direct link to a PDF (e.g., from arXiv or OpenAccess journals)

The output is rendered as Markdown and includes the title, abstract, sections, and references when the PDF is parsed correctly.

---

## How it works

This demo calls the `marker.pdf.process_pdf` Python interface rather than the command-line tools, and runs purely on CPU for maximum compatibility with free-tier Spaces.

The interface is built with Gradio and accepts both file uploads and remote URLs.

---

## Limitations

- Some PDFs with unusual structures (e.g., scanned images or complex layouts) may fail or produce incomplete output.

---

## License

Apache 2.0. See the original license from [Marker on GitHub](https://github.com/VikParuchuri/marker).
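
---

## Example: resolving the input source

The demo accepts either an uploaded file or a remote URL. The sketch below shows one way such input handling could look, using only the Python standard library. It is not taken from this Space's `app.py`: the function names `resolve_source` and `fetch_pdf`, the rule that an upload takes precedence over a URL, and the 30-second timeout are all assumptions made for illustration. The Marker conversion step itself is deliberately left out.

```python
import os
import tempfile
import urllib.request
from urllib.parse import urlparse


def resolve_source(uploaded_path, url):
    """Hypothetical helper: decide which input to convert.

    Assumption: an uploaded file takes precedence over a pasted URL.
    Returns a ("file", path) or ("url", url) tuple.
    """
    if uploaded_path:
        if not uploaded_path.lower().endswith(".pdf"):
            raise ValueError("Uploaded file must be a .pdf")
        return ("file", uploaded_path)

    url = (url or "").strip()
    if url:
        parsed = urlparse(url)
        # Only plain web links are accepted in this sketch.
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            raise ValueError("URL must start with http:// or https://")
        return ("url", url)

    raise ValueError("Provide a PDF file or a URL")


def fetch_pdf(url, timeout=30):
    """Hypothetical helper: download a remote PDF to a temporary file."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        data = response.read()
    # PDF files normally begin with "%PDF-"; this rejects HTML error pages.
    if not data.startswith(b"%PDF-"):
        raise ValueError("URL did not return a PDF")
    fd, path = tempfile.mkstemp(suffix=".pdf")
    with os.fdopen(fd, "wb") as handle:
        handle.write(data)
    return path
```

A Gradio callback could call `resolve_source` first, call `fetch_pdf` only for the `"url"` case, and then pass the resulting local path to the conversion step.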