| title: Marker PDF | |
| emoji: π | |
| colorFrom: yellow | |
| colorTo: green | |
| sdk: gradio | |
| sdk_version: 5.29.0 | |
| app_file: app.py | |
| pinned: false | |
| license: apache-2.0 | |
| short_description: Extracts structured scientific content (Markdown) from PDFs | |
| # Marker PDF → Markdown | |
| This Space demonstrates [Marker](https://github.com/VikParuchuri/marker), a document conversion tool developed by Vik Paruchuri, by extracting Markdown content from PDF articles. | |
| You can either: | |
| - Upload a `.pdf` file | |
| - Or paste a direct link to a PDF (e.g., from arXiv or open-access journals) | |
| The output is rendered as Markdown and includes the title, abstract, sections, and references when the PDF parses correctly. | |
| --- | |
| ## How it works | |
| This demo uses the `marker.pdf.process_pdf` Python interface instead of command-line tools, running purely on CPU for maximum compatibility with free-tier Spaces. | |
| The interface is built with Gradio and supports both file uploads and remote URL access. | |
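
As a rough illustration of the upload-or-URL flow described above, here is a minimal sketch. It is not the Space's actual `app.py`: the helper names (`is_http_url`, `resolve_pdf_path`, `pdf_to_markdown`) are hypothetical, and the Marker call is passed in as a `convert` callable so that no particular Marker API is assumed.

```python
import tempfile
import urllib.request
from typing import Callable, Optional
from urllib.parse import urlparse


def is_http_url(url: str) -> bool:
    """Cheap sanity check: accept only http(s) URLs that name a host."""
    parsed = urlparse(url.strip())
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


def resolve_pdf_path(uploaded_path: Optional[str], url: Optional[str]) -> str:
    """Return a local PDF path, preferring an uploaded file over a URL."""
    if uploaded_path:
        return uploaded_path
    if url and is_http_url(url):
        # Download the remote file and keep it on disk for the converter.
        with urllib.request.urlopen(url.strip(), timeout=30) as response:
            data = response.read()
        # PDF files begin with the magic bytes "%PDF".
        if not data.startswith(b"%PDF"):
            raise ValueError("The URL did not return a PDF file.")
        with tempfile.NamedTemporaryFile(suffix=".pdf", delete=False) as tmp:
            tmp.write(data)
        return tmp.name
    raise ValueError("Provide a PDF upload or a valid http(s) URL.")


def pdf_to_markdown(
    uploaded_path: Optional[str],
    url: Optional[str],
    convert: Callable[[str], str],
) -> str:
    """Resolve the input, run the (hypothetical) converter, report errors as Markdown."""
    try:
        return convert(resolve_pdf_path(uploaded_path, url))
    except Exception as exc:
        return f"**Error:** {exc}"
```

In a Gradio app, a function like `pdf_to_markdown` (with `convert` bound to whatever Marker entry point the Space uses) could serve as the callback for a file input and a text box, with the returned string shown in a Markdown output component. Returning errors as Markdown keeps failures visible in the same output area.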
| --- | |
| ## Limitations | |
| - Some PDFs with unusual structures (e.g., scanned images or complex layouts) may fail or produce incomplete output. | |
| --- | |
| ## License | |
| Apache 2.0. See the original license from [Marker on GitHub](https://github.com/VikParuchuri/marker). | |