metadata
title: Marker PDF
emoji: 📄
colorFrom: yellow
colorTo: green
sdk: gradio
sdk_version: 5.29.0
app_file: app.py
pinned: false
license: apache-2.0
short_description: Extracts structured scientific content (Markdown) from PDFs

Marker PDF → Markdown

This Space demonstrates Marker, an open-source document parser developed by Vik Paruchuri (Datalab), by extracting Markdown content from PDF articles.

You can either:

  • Upload a .pdf file
  • Or paste a direct link to a PDF (e.g., from arXiv or OpenAccess journals)

The output is rendered as Markdown and, when the PDF parses cleanly, includes the title, abstract, sections, and references.


How it works

This demo uses the marker.pdf.process_pdf Python interface instead of command-line tools, running purely on CPU for maximum compatibility with free-tier Spaces.

The interface is built with Gradio and supports both file uploads and remote URL access.
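
As a rough illustration of the two input paths described above, the sketch below turns either an uploaded file or a pasted link into a local PDF path before handing it to a converter. It is not the code in app.py: the names resolve_pdf, pdf_to_markdown, download, and the convert callable are hypothetical, and the Marker call is deliberately passed in as a parameter because its exact signature is not documented here.

```python
import tempfile
from pathlib import Path
from typing import Callable, Optional
from urllib.parse import urlparse
from urllib.request import urlopen

PDF_MAGIC = b"%PDF-"  # bytes that normally open a PDF file


def download(url: str) -> bytes:
    """Fetch the raw bytes behind a URL (needs network access)."""
    with urlopen(url, timeout=30) as response:
        return response.read()


def resolve_pdf(upload_path: Optional[str] = None,
                url: Optional[str] = None,
                fetch: Callable[[str], bytes] = download) -> Path:
    """Return a local PDF path; an uploaded file wins over a URL."""
    if upload_path:
        path = Path(upload_path)
    elif url and url.strip():
        url = url.strip()
        if urlparse(url).scheme not in ("http", "https"):
            raise ValueError("Only http(s) links are accepted")
        # Save the download to a temporary .pdf file for the converter.
        with tempfile.NamedTemporaryFile(suffix=".pdf", delete=False) as tmp:
            tmp.write(fetch(url))
        path = Path(tmp.name)
    else:
        raise ValueError("Upload a PDF or paste a link to one")

    # Simple sanity check: reject inputs that do not start like a PDF.
    with path.open("rb") as handle:
        if handle.read(len(PDF_MAGIC)) != PDF_MAGIC:
            raise ValueError("Input does not look like a PDF")
    return path


def pdf_to_markdown(upload_path: Optional[str],
                    url: Optional[str],
                    convert: Callable[[str], str],
                    fetch: Callable[[str], bytes] = download) -> str:
    """Resolve the input, then call `convert` (a stand-in for Marker)."""
    return convert(str(resolve_pdf(upload_path, url, fetch)))
```

In a Gradio app, a function like pdf_to_markdown could plausibly be wired to a file input and a text box with a Markdown output; the header check gives users an early, readable error instead of a parser failure.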


Limitations

  • Some PDFs with unusual structures (e.g., scanned images or complex layouts) may fail or produce incomplete output.

License

Apache 2.0. See the original Marker license on GitHub.