
Collab Editor - Project Specification

1. Overview

A collaborative, real-time scientific article editor deployed as a Hugging Face Space. Users write rich content (math, citations, custom components) in a TipTap-based editor synced via Yjs/Hocuspocus, then publish a self-contained static HTML article. An AI assistant helps with writing and editing.

Stack: React 18 + TipTap 3 + Yjs (frontend) / Express + Hocuspocus + Node 20 (backend) / Docker on HF Spaces. No CSS-in-JS - all styling via vanilla CSS custom properties.

Relationship to research-article-template: the CSS foundation, design tokens, and visual language come from the research-article-template project. The editor imports the template's CSS files (_variables.css, _reset.css, _base.css, _layout.css, component partials) and the publisher injects them inline into published HTML. The published output is designed to look identical to articles built with the Astro-based template.


2. Architecture

graph TB
  subgraph browser [Browser]
    SPA["React SPA<br/>TipTap + Yjs"]
  end

  subgraph server [Node Backend - port 8080]
    Express["Express HTTP"]
    Hocuspocus["Hocuspocus<br/>WebSocket"]
    Publisher["Publisher Pipeline<br/>HTML + PDF"]
    Agent["AI Agent<br/>OpenRouter"]
    Auth["HF OAuth"]
  end

  subgraph storage [Persistence]
    LocalFS["Local FS<br/>data/*.yjs"]
    HFDataset["HF Dataset<br/>articles/ published/"]
  end

  SPA -->|"WebSocket /collab"| Hocuspocus
  SPA -->|"REST /api/*"| Express
  Hocuspocus -->|"Database ext"| LocalFS
  LocalFS -->|"schedulePush"| HFDataset
  HFDataset -->|"pullDocument"| LocalFS
  Express --> Publisher
  Express --> Agent
  Express --> Auth
  Publisher --> LocalFS
  Publisher -->|"uploadPublishedAssets"| HFDataset

Single process in production: the backend serves the Vite-built frontend, all REST APIs, the WebSocket collab channel, and static published articles. No reverse proxy needed.


3. Data Model (Yjs Shared Types)

The entire collaborative state lives in a single Y.Doc:

  • Y.XmlFragment("default") - TipTap document content (ProseMirror nodes synced via Collaboration extension)
  • Y.Map("frontmatter") - scalar metadata: title, subtitle, description, published, doi, template, licence
  • Y.Array("frontmatter.authors") - { name, url?, affiliations: number[] }[]
  • Y.Array("frontmatter.affiliations") - { name, url? }[]
  • Y.Map("citations") - CSL-JSON entries keyed by citation ID
  • Y.Map("settings") - citationStyle, primaryHue, and future editor preferences
  • Y.Map("comments") - comment threads keyed by commentId, each with author/text/resolved

All types are concurrently editable by multiple users and persist to data/default.yjs.
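
To make the author and affiliation shapes above concrete, here is a minimal TypeScript sketch. The interface names and the resolveAffiliations helper are hypothetical, and the sketch assumes that affiliations: number[] holds 1-based positions into the affiliations array, which this spec does not state.

```typescript
// Hypothetical record shapes mirroring the entries of the two frontmatter arrays.
interface Affiliation {
  name: string;
  url?: string;
}

interface Author {
  name: string;
  url?: string;
  affiliations: number[];
}

// Assumption: affiliation numbers are 1-based positions in the affiliations array.
// Numbers that point past the end of the array are silently skipped.
function resolveAffiliations(author: Author, all: Affiliation[]): Affiliation[] {
  return author.affiliations
    .map((n) => all[n - 1])
    .filter((a): a is Affiliation => a !== undefined);
}

const affiliations: Affiliation[] = [
  { name: "Hugging Face", url: "https://huggingface.co" },
  { name: "Example University" },
];
const author: Author = { name: "Ada", affiliations: [2, 5] };
const resolved = resolveAffiliations(author, affiliations);
```

Because authors and affiliations live in separate Y.Arrays, a concurrent edit can remove an affiliation while an author still references it, so a tolerant lookup like this one is probably safer than indexing directly.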


4. Backend Components

4.1 HTTP Routes

| Method | Path | Auth | Purpose |
|--------|------|------|---------|
| GET | /oauth/authorize | Public | Redirect to HF OAuth |
| GET | /auth/callback | Public (CSRF state) | Exchange code, set cookie, redirect to /editor |
| GET | /api/auth/status | Cookie | Return { authenticated, canEdit, user } |
| POST | /api/chat | None | Stream AI agent responses (OpenRouter) |
| POST | /api/publish | OAuth (canEdit) | Run publish pipeline, generate HTML/PDF |
| POST | /api/admin/reset-document | OAuth (canEdit) | Delete local .yjs, close connections |
| POST | /api/upload | None (uses cookie for HF) | Upload image (multipart, max 10MB) |
| POST | /api/citations/resolve | None | Resolve DOI/URL to CSL-JSON |
| POST | /api/citations/format | None | Format entries to HTML bibliography |
| POST | /api/citations/import-bib | None | Parse BibTeX to CSL-JSON |
| GET | /editor | OAuth (canEdit) | Serve SPA (or login page) |
| GET | * | Public | Serve published article (or login page) |

4.2 WebSocket Collaboration

  • Upgrade on /collab only; all other paths rejected
  • Single document: DEFAULT_DOC_NAME = "default"
  • Hocuspocus onAuthenticate: validates OAuth token if enabled, checks canEdit
  • Database extension: fetch reads local .yjs or pulls from HF; store writes local + schedules HF push (10s debounce)

4.3 HF Storage

  • Dataset ID: HF_DATASET_ID or {SPACE_ID}-data
  • Token: HF_TOKEN (env) or cached OAuth token from last authenticated user
  • Documents: articles/<name>.yjs - debounced push on every Hocuspocus store
  • Published assets: published/<name>/{index.html, article.pdf, thumb.jpg, meta.json}
  • Images: images/<uuid-filename> with public resolve URL
  • flushAll() on SIGTERM/SIGINT to push pending changes
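
The debounced push and the shutdown flush described above can be sketched as follows. This is a simplified illustration, not the project's implementation: createPushScheduler, the Timers interface and the synchronous push callback are all assumptions (a real upload would be asynchronous and flushAll would need to await it).

```typescript
type PushFn = (docName: string) => void;

// Minimal timer interface so the sketch can be driven by a fake clock in tests.
interface Timers {
  set(fn: () => void, ms: number): unknown;
  clear(handle: unknown): void;
}

const realTimers: Timers = {
  set: (fn, ms) => setTimeout(fn, ms),
  clear: (handle) => clearTimeout(handle as ReturnType<typeof setTimeout>),
};

function createPushScheduler(push: PushFn, delayMs = 10_000, timers: Timers = realTimers) {
  const pending = new Map<string, unknown>();

  return {
    // Called on every store: restart the countdown for this document.
    schedulePush(docName: string): void {
      if (pending.has(docName)) timers.clear(pending.get(docName));
      const handle = timers.set(() => {
        pending.delete(docName);
        push(docName);
      }, delayMs);
      pending.set(docName, handle);
    },

    // Called on SIGTERM/SIGINT: push everything still waiting, immediately.
    flushAll(): void {
      for (const [docName, handle] of pending) {
        timers.clear(handle);
        push(docName);
      }
      pending.clear();
    },
  };
}
```

With this shape, a burst of edits inside the 10-second window produces a single push, and a shutdown handler only has to call flushAll() before the process exits.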

4.4 Publisher Pipeline

flowchart LR
  YDoc["Y.Doc (.yjs)"] --> Extract["extractFromYDoc<br/>frontmatter + JSON"]
  Extract --> GenHTML["generateHTML<br/>@tiptap/html"]
  GenHTML --> PostProc["postProcess<br/>accordion, biblio,<br/>mermaid, htmlEmbed"]
  PostProc --> Render["renderArticleHTML<br/>full HTML page"]
  CSS["loadCSS<br/>template styles"] --> Render
  Render --> LocalWrite["Write local<br/>index.html"]
  Render --> PDF["Playwright<br/>PDF + thumbnail"]
  LocalWrite --> HFUpload["uploadPublishedAssets<br/>HF dataset"]

  • CSS loading: reads template CSS files, resolves @custom-media queries via resolveCustomMedia(), splits into variables/reset/base/layout/components/article/print
  • Post-processing: accordion divs to <details>, bibliography injection, mermaid to <pre>, htmlEmbed to <iframe>
  • HTML output: self-contained page with inline CSS, CDN assets (KaTeX, highlight.js, Mermaid), theme toggle (SVG sun/moon), TOC generation (scroll-based, collapsible), lightbox, footer with citation/BibTeX/DOI
  • PDF: optional Playwright Chromium headless (1200x630 thumbnail + full PDF)
  • Server extensions: mirror of frontend TipTap extensions for server-side HTML generation

4.5 Auth

  • Enabled when SPACE_ID + OAUTH_CLIENT_ID are set
  • OAuth 2.0 flow with HF as provider; cookie hf_access_token (httpOnly, secure, sameSite: none)
  • resolveUser: whoAmI via @huggingface/hub, then checkWriteAccess (Space owner or org member with write/admin role)
  • In-memory state map with 10-min TTL for CSRF protection
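
As an illustration of the in-memory CSRF state map with a 10-minute TTL, here is a minimal sketch. The OAuthStateStore class, its method names and the injectable clock are hypothetical; only the TTL value comes from this spec.

```typescript
import { randomBytes } from "node:crypto";

const STATE_TTL_MS = 10 * 60 * 1000; // 10-minute TTL, as stated above

class OAuthStateStore {
  private readonly issuedAt = new Map<string, number>();
  private readonly now: () => number;

  // The clock is injectable so that expiry can be tested without waiting.
  constructor(now: () => number = Date.now) {
    this.now = now;
  }

  // Create an unguessable state value to send along with the authorize redirect.
  issue(): string {
    this.prune();
    const state = randomBytes(16).toString("hex");
    this.issuedAt.set(state, this.now());
    return state;
  }

  // Validate the state returned to the callback; each value is single-use.
  consume(state: string): boolean {
    const issued = this.issuedAt.get(state);
    if (issued === undefined) return false;
    this.issuedAt.delete(state);
    return this.now() - issued <= STATE_TTL_MS;
  }

  // Drop expired entries so abandoned logins do not accumulate.
  private prune(): void {
    const cutoff = this.now() - STATE_TTL_MS;
    for (const [state, issued] of this.issuedAt) {
      if (issued < cutoff) this.issuedAt.delete(state);
    }
  }
}
```

In this sketch /oauth/authorize would call issue() and /auth/callback would reject the request unless consume(state) returns true. Since the map lives in process memory, pending logins are lost on a container restart.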

4.6 AI Agent

  • Provider: OpenRouter (OPENROUTER_API_KEY), default model anthropic/claude-sonnet-4
  • Streaming via Vercel AI SDK streamText
  • Context: document text, current selection, frontmatter (sent by frontend with each message)
  • Tools (declarative, executed client-side by the frontend):
    • replaceSelection - replace selected text
    • insertAtCursor - insert at cursor position
    • applyDiff - search/replace in document
    • updateFrontmatter - modify metadata fields
    • addAuthor / removeAuthor - manage author list
  • Agent edits are grouped into a single Yjs UndoManager batch, so one Cmd+Z reverts the whole agent change
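
The client-side execution of declarative tool calls can be pictured with the sketch below. It models the document as a plain string rather than a ProseMirror document, and the argument shapes (text, search, replace) are hypothetical, since this spec names the tools but not their parameters.

```typescript
// Hypothetical argument shapes for three of the tools listed above.
type ToolCall =
  | { tool: "replaceSelection"; text: string }
  | { tool: "insertAtCursor"; text: string }
  | { tool: "applyDiff"; search: string; replace: string };

// Simplified editor state: plain text plus a selection range.
interface EditorState {
  text: string;
  selection: { from: number; to: number };
}

function applyToolCall(state: EditorState, call: ToolCall): EditorState {
  const { text, selection } = state;
  switch (call.tool) {
    case "replaceSelection": {
      const next = text.slice(0, selection.from) + call.text + text.slice(selection.to);
      const caret = selection.from + call.text.length;
      return { text: next, selection: { from: caret, to: caret } };
    }
    case "insertAtCursor": {
      const next = text.slice(0, selection.to) + call.text + text.slice(selection.to);
      const caret = selection.to + call.text.length;
      return { text: next, selection: { from: caret, to: caret } };
    }
    case "applyDiff": {
      const at = text.indexOf(call.search);
      if (at === -1) return state; // search text not found: leave the document unchanged
      const next = text.slice(0, at) + call.replace + text.slice(at + call.search.length);
      return { text: next, selection }; // selection is kept as-is in this simplified model
    }
  }
}

// Applying all tool calls of one agent turn in sequence, as a single batch.
const before: EditorState = { text: "Hello wrld", selection: { from: 0, to: 5 } };
const calls: ToolCall[] = [
  { tool: "applyDiff", search: "wrld", replace: "world" },
  { tool: "replaceSelection", text: "Howdy" },
];
const after = calls.reduce(applyToolCall, before);
```

In the editor itself, the equivalent of the reduce step would run inside one UndoManager batch so that the whole turn undoes together.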

4.7 Citations

  • Uses @citation-js/core with bibtex, doi, csl plugins
  • Resolve: DOI URL or identifier to CSL-JSON entries
  • Format: entries + style + locale to HTML bibliography
  • Import: BibTeX string to CSL-JSON

5. Frontend Components

5.1 App Shell

  • No router - single view with conditional rendering
  • Theme: CSS custom properties with dynamic primary color from settings.primaryHue (OKLCH color model, synced via Yjs settings)
  • Layout: top bar (undo/redo, settings, publish, user chip) + 3-column CSS grid (TOC / editor / comments)
  • Chat: floating button bottom-left, ChatPanel overlay
  • Modals: comment dialog, settings drawer, publish confirmation

5.2 Editor

  • Creates Y.Doc + HocuspocusProvider (WebSocket to /collab)
  • Seeding: after provider synced event only, if Y.XmlFragment("default") is empty, inserts DEFAULT_CONTENT + seedFrontmatter + SEED_CITATIONS
  • Yjs Maps: citations, settings, comments, frontmatter (via dedicated stores)
  • Image handling: paste/drop with upload to /api/upload
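
The "seed only after synced" rule matters because a freshly created local Y.Doc is always empty until the server state arrives, so checking for emptiness earlier would insert the default content on every connection. A minimal sketch of the guard, using structural stand-ins instead of the real HocuspocusProvider and Y.XmlFragment types (seedWhenSynced and both interfaces are hypothetical):

```typescript
// Structural stand-ins for the provider and fragment, so the sketch has no dependencies.
interface SyncedSource {
  on(event: "synced", handler: () => void): void;
}

interface FragmentLike {
  readonly length: number;
}

function seedWhenSynced(provider: SyncedSource, fragment: FragmentLike, seed: () => void): void {
  let checked = false;
  provider.on("synced", () => {
    // Decide once, on the first sync: reconnects must never re-seed the document.
    if (checked) return;
    checked = true;
    // Seed only if the server had no content for this document.
    if (fragment.length === 0) seed();
  });
}
```

Here seed stands for the step that inserts DEFAULT_CONTENT, seedFrontmatter and SEED_CITATIONS.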

5.3 TipTap Extensions

Built-in (configured):

  • StarterKit (no codeBlock, no undo), CodeBlockLowlight (all languages), Placeholder, Collaboration, CollaborationCursorV3, Mathematics (KaTeX), Image, Table/Row/Cell/Header

Custom:

  • CollaborationUndo - bridges Yjs UndoManager for agent batch edits
  • Comment - inline mark with commentId + resolved
  • SlashCommands - / trigger with suggestion popup
  • ImageUpload - drag-drop upload node with progress
  • Citation - inline atomic node (key + label), links to citationsMap
  • Bibliography - block node with rendered HTML from citations
  • Glossary - inline atomic (term + definition tooltip)
  • Footnote - inline atomic (content shown in footer)
  • Stack + StackColumn - multi-column layout (2/3/4 cols)

5.4 Component System

Registry-based system for MDX-like custom components:

| Component | Kind | Purpose |
|-----------|------|---------|
| accordion | wrapper | Collapsible section (details/summary) |
| note | wrapper | Info/warning/danger/success callout |
| quoteBlock | wrapper | Styled blockquote |
| wide | wrapper | Content wider than column |
| fullWidth | wrapper | Full viewport width |
| sidenote | wrapper | Marginal note |
| reference | wrapper | Reference container |
| htmlEmbed | atomic | External HTML embed (iframe) |
| hfUser | atomic | HF user card |
| rawHtml | atomic | Raw HTML injection |
| mermaid | atomic | Mermaid diagram (live preview) |

  • Factory: createComponentExtension(def) generates TipTap nodes from registry definitions (handles both wrapper and atomic kinds)
  • NodeViews: WrapperView (editable content area + chrome), AtomicView (placeholder + field editor), MermaidView (textarea + SVG preview)
  • Slash menu integration: each component generates a slash menu item via getComponentSlashItems()
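
One way to picture the registry-driven approach is sketched below. The definition shape, createComponentSpec and componentSlashItems are hypothetical stand-ins for the real factory and slash-menu helper, and the returned objects only imitate the parts of a node spec that differ between wrapper and atomic kinds.

```typescript
type ComponentKind = "wrapper" | "atomic";

// Hypothetical definition shape; the real registry may carry more fields.
interface ComponentDef {
  name: string;
  kind: ComponentKind;
  label: string;
}

const registry: ComponentDef[] = [
  { name: "accordion", kind: "wrapper", label: "Collapsible section" },
  { name: "note", kind: "wrapper", label: "Callout" },
  { name: "mermaid", kind: "atomic", label: "Mermaid diagram" },
];

interface NodeSpecSketch {
  name: string;
  group: "block";
  atom: boolean;
  content?: string;
}

// Wrapper components hold editable block content; atomic ones are leaf nodes.
function createComponentSpec(def: ComponentDef): NodeSpecSketch {
  return def.kind === "wrapper"
    ? { name: def.name, group: "block", atom: false, content: "block+" }
    : { name: def.name, group: "block", atom: true };
}

interface SlashItem {
  title: string;
  command: string;
}

// One slash menu entry per registered component.
function componentSlashItems(defs: ComponentDef[] = registry): SlashItem[] {
  return defs.map((def) => ({ title: def.label, command: `insert:${def.name}` }));
}

const specs = registry.map(createComponentSpec);
const slashItems = componentSlashItems();
```

The point of the design is that adding a component means adding one registry entry; the node definition and the slash menu item are both derived from it.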

5.5 Frontmatter System

  • FrontmatterStore: wraps Y.Map + Y.Array for real-time collaborative metadata editing
  • useFrontmatter hook: React state synced with Yjs observations
  • FrontmatterHero: WYSIWYG editable hero section (title, subtitle, authors, affiliations, date, DOI)
  • SettingsDrawer: template variant, SEO, banner, citation style, primary color hue slider, PDF/TOC/licence toggles
  • HueSlider: OKLCH hue picker (0-360) with live preview, synced to settingsMap.primaryHue
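
The hue-driven primary color could be derived as in the sketch below. Only the 0-360 hue range, the OKLCH model and the --primary-color token name come from this spec; the lightness and chroma constants and the primaryColorFromHue name are assumptions.

```typescript
// Assumed lightness and chroma; the spec only fixes the hue range (0-360).
const PRIMARY_LIGHTNESS = 0.65;
const PRIMARY_CHROMA = 0.15;

function primaryColorFromHue(hue: number): string {
  const h = ((hue % 360) + 360) % 360; // wrap any number into [0, 360)
  return `oklch(${PRIMARY_LIGHTNESS} ${PRIMARY_CHROMA} ${h})`;
}

// In the browser the result would be applied to the root element, for example:
//   document.documentElement.style.setProperty("--primary-color", primaryColorFromHue(hue));
const sampleColor = primaryColorFromHue(210);
```

Keeping lightness and chroma fixed while only the hue varies is what lets a single slider recolor the theme without changing contrast much.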

5.6 Other UI

  • TableOfContents: extracts headings from TipTap doc, scroll-based active state, collapsible sub-sections
  • ChatPanel: message list + quick actions on selection + input with streaming
  • CommentPopover: positioned comment popover anchored to the active thread (resolve/delete inline)
  • BubbleToolbar: floating toolbar on text selection (bold, italic, link, comment, etc.)
  • BlockHandle: drag handle for block-level nodes
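
Heading extraction for the table of contents can be sketched against the JSON form of a TipTap document. The buildToc function and the TocEntry shape are hypothetical, and this version only looks at top-level headings (headings nested inside wrapper components are ignored).

```typescript
// Subset of the TipTap/ProseMirror JSON shape needed for heading extraction.
interface PMNode {
  type: string;
  attrs?: { level?: number };
  text?: string;
  content?: PMNode[];
}

interface TocEntry {
  level: number;
  text: string;
  children: TocEntry[];
}

// Concatenate the text of a node and all of its descendants.
function textOf(node: PMNode): string {
  return node.text ?? (node.content ?? []).map(textOf).join("");
}

function buildToc(doc: PMNode): TocEntry[] {
  const root: TocEntry[] = [];
  const stack: TocEntry[] = []; // currently open ancestors, shallowest first
  for (const node of doc.content ?? []) {
    if (node.type !== "heading") continue;
    const entry: TocEntry = { level: node.attrs?.level ?? 1, text: textOf(node), children: [] };
    // Close every open entry that is not a strict ancestor of this heading.
    let parent = stack[stack.length - 1];
    while (parent && parent.level >= entry.level) {
      stack.pop();
      parent = stack[stack.length - 1];
    }
    (parent ? parent.children : root).push(entry);
    stack.push(entry);
  }
  return root;
}
```

The nested children arrays are what a collapsible sub-section UI would render from.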

5.7 CSS Architecture

styles/
  _variables.css       # Template tokens: --primary-color, breakpoints, @custom-media
  _reset.css           # Scoped reset for article content
  _base.css            # Typography, scoped to article content
  _layout.css          # 3-column grid, .wide/.full-width helpers
  _print.css           # Print styles
  _ui.css              # Editor chrome: buttons, dialogs, drawers, spinner
  tokens.css           # Design tokens (light/dark): text, bg, accent, code, danger, shadows
  article.css          # .tiptap content styles (shared editor/published)
  toc.css              # Editor TOC overrides
  editing.css          # Editor-only: layout, cursors, slash menu
  _publisher.css       # Published-only: theme toggle, wide/fullWidth, footer, lightbox
  components/
    _code.css          # Code blocks + syntax highlighting
    _table.css         # Tables
    _tag.css           # Tags
    _card.css          # Cards
    _mermaid.css       # Mermaid diagrams
    _embed.css         # Embed containers
    _embed-studio.css  # Embed studio overlay
    _hero.css          # Hero section (from template)
    _toc.css           # Base TOC styles (from template)
    _button.css        # Buttons (template)
    _form.css          # Form elements (template)
    _footer.css        # Footer (template)

The publisher reads these same CSS files server-side and injects them inline into published HTML, using resolveCustomMedia() to expand @custom-media queries into standard @media rules.
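
The @custom-media expansion can be illustrated with a small string transform. This is a sketch of the general technique, not the project's resolveCustomMedia(): the function name expandCustomMedia and the --bp-md example name are hypothetical, and it assumes definitions and their uses are present in the same concatenated CSS string.

```typescript
function expandCustomMedia(css: string): string {
  const definitions = new Map<string, string>();

  // Collect and strip declarations such as: @custom-media --bp-md (max-width: 768px);
  const withoutDefinitions = css.replace(
    /@custom-media\s+(--[\w-]+)\s+([^;]+);\s*/g,
    (_match: string, name: string, query: string) => {
      definitions.set(name, query.trim());
      return "";
    },
  );

  // Rewrite references inside @media preludes only, e.g. @media (--bp-md) { ... },
  // so that ordinary var(--token) usages elsewhere are never touched.
  return withoutDefinitions.replace(/@media[^{]+/g, (prelude: string) =>
    prelude.replace(/\((--[\w-]+)\)/g, (ref: string, name: string) => definitions.get(name) ?? ref),
  );
}

const expanded = expandCustomMedia(
  [
    "@custom-media --bp-md (max-width: 768px);",
    ".toc { display: block; }",
    "@media (--bp-md) { .toc { display: none; } }",
  ].join("\n"),
);
```

Expanding at publish time is needed because browsers do not support @custom-media natively, and the published page has no build step of its own.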


6. Deployment

6.1 Docker Build (3-stage)

  1. frontend-build: npm install + npm run build (Vite)
  2. backend-build: npm install + npx tsc
  3. runtime: node:20-slim + Chromium system deps + npm install --omit=dev + Playwright Chromium + copy frontend-dist/ + copy frontend/src/styles/ to frontend-styles/

CMD: node dist/server.js on port 8080.

6.2 HF Space Configuration (README.md frontmatter)

  • SDK: docker, port 8080
  • OAuth: hf_oauth: true, scopes: manage-repos
  • Two git remotes: space (tfrere/collab-editor, dev) and prod (tfrere/research-article-template-editor, production)

6.3 Environment Variables

| Variable | Required | Purpose |
|----------|----------|---------|
| PORT | No (default 8080) | HTTP listen port |
| NODE_ENV | No | production switches to frontend-dist path |
| SPACE_ID | For OAuth/HF | HF Space identifier, enables OAuth + dataset |
| SPACE_HOST | For OAuth | HTTPS callback URL host |
| OAUTH_CLIENT_ID | For OAuth | HF OAuth client |
| OAUTH_CLIENT_SECRET | For OAuth | HF OAuth secret |
| OAUTH_SCOPES | No (default openid profile) | OAuth scopes |
| HF_DATASET_ID | No | Override dataset name (default: {SPACE_ID}-data) |
| HF_TOKEN | No | Fallback Hub token for HF API |
| OPENROUTER_API_KEY | For AI chat | OpenRouter API key |
| OPENROUTER_MODEL | No | Default AI model |
| ENABLE_PDF | No (default true) | Toggle PDF/thumbnail generation |
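
The defaults in the table can be summarized as a small config loader. This is a sketch under assumptions: loadConfig and AppConfig are hypothetical names, and it assumes ENABLE_PDF is disabled by the literal string "false", which this spec does not specify.

```typescript
type Env = Record<string, string | undefined>;

interface AppConfig {
  port: number;
  oauthEnabled: boolean;
  oauthScopes: string;
  datasetId: string | null;
  enablePdf: boolean;
}

function loadConfig(env: Env): AppConfig {
  const spaceId = env.SPACE_ID;
  return {
    port: Number(env.PORT ?? 8080),
    // OAuth is enabled only when both SPACE_ID and OAUTH_CLIENT_ID are set.
    oauthEnabled: Boolean(spaceId && env.OAUTH_CLIENT_ID),
    oauthScopes: env.OAUTH_SCOPES ?? "openid profile",
    // HF_DATASET_ID overrides the derived {SPACE_ID}-data name.
    datasetId: env.HF_DATASET_ID ?? (spaceId ? `${spaceId}-data` : null),
    // Assumption: only the literal string "false" turns PDF generation off.
    enablePdf: env.ENABLE_PDF !== "false",
  };
}

// On the server this would typically be called as loadConfig(process.env).
const localDefaults = loadConfig({});
```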

6.4 Local Development

# Terminal 1 - Backend
cd backend && npm install && npm run dev
# Starts on http://localhost:8080

# Terminal 2 - Frontend
cd frontend && npm install && npm run dev
# Starts on http://localhost:5678 (proxies /api and /collab to :8080)

Create a .env file in backend/ with at minimum OPENROUTER_API_KEY for AI chat. Without SPACE_ID, OAuth is disabled and all users can edit.
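
For reference, the dev-server proxying described above would typically be expressed in a Vite config along these lines. This is an assumed configuration fragment, not a copy of the project's vite.config.ts; only the port numbers and the /api and /collab paths come from this spec.

```typescript
// vite.config.ts (sketch): forward API and WebSocket traffic to the backend in development.
import { defineConfig } from "vite";

export default defineConfig({
  server: {
    port: 5678,
    proxy: {
      // REST calls go to the Express backend.
      "/api": "http://localhost:8080",
      // The collab channel is a WebSocket, so the proxy must upgrade the connection.
      "/collab": { target: "ws://localhost:8080", ws: true },
    },
  },
});
```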


7. Key Data Flows

7.1 Collaborative Editing

sequenceDiagram
  participant ClientA as Client A
  participant Server as Hocuspocus
  participant ClientB as Client B
  participant Disk as Local FS
  participant HF as HF Dataset

  ClientA->>Server: WebSocket connect /collab
  Server->>Disk: Database.fetch (load .yjs)
  Server-->>ClientA: sync Y.Doc state
  ClientA->>Server: Y.Doc update (edit)
  Server->>ClientB: broadcast update
  Server->>Disk: Database.store (write .yjs)
  Disk-->>HF: schedulePush (10s debounce)

7.2 Publish Flow

sequenceDiagram
  participant User as Editor UI
  participant API as POST /api/publish
  participant HP as Hocuspocus
  participant Pub as Publisher
  participant FS as Local FS
  participant HF as HF Dataset

  User->>API: Click Publish
  API->>HP: openDirectConnection
  HP-->>API: Y.Doc snapshot
  API->>FS: Write .yjs snapshot
  API->>Pub: publishDocument()
  Pub->>Pub: extractFromYDoc + loadCSS
  Pub->>Pub: renderArticleHTML + PDF
  Pub->>FS: Write index.html locally
  Pub->>HF: uploadPublishedAssets
  Pub-->>API: { htmlUrl, pdfUrl, success }
  API-->>User: Publish result

7.3 Published Article Lifecycle (Container Restarts)

HF Spaces containers are ephemeral. The local filesystem is wiped on every restart (git push, Space rebuild, idle timeout). The published article survives via this restore flow:

sequenceDiagram
  participant Container as New Container
  participant FS as Local FS
  participant HF as HF Dataset
  participant Visitor as GET /

  Container->>Container: Server starts
  Container->>HF: ensurePublishedRestored()
  HF-->>FS: Pull index.html, PDF, meta.json
  Note over FS: data/published/default/index.html

  Visitor->>Container: GET /
  Container->>FS: Check published path
  FS-->>Container: index.html exists
  Container-->>Visitor: Serve published article

On publish, HTML is always written locally first (so GET / serves the new version immediately), then uploaded to HF dataset for persistence across restarts.

7.4 AI Agent Chat

sequenceDiagram
  participant User as Chat Panel
  participant Hook as useAgentChat
  participant API as POST /api/chat
  participant LLM as OpenRouter

  User->>Hook: sendMessage(text)
  Hook->>Hook: Build context (doc, selection, frontmatter)
  Hook->>API: { messages, context }
  API->>LLM: streamText (system prompt + tools)
  LLM-->>API: Stream (text + tool_calls)
  API-->>Hook: SSE stream
  Hook->>Hook: Execute tool calls client-side
  Note over Hook: replaceSelection, applyDiff,<br/>updateFrontmatter, etc.
  Hook->>Hook: UndoManager batch for Cmd+Z

8. Current Limitations and Known Issues

  • Test suite in progress: P0 tests being added (see docs/TESTS.md)
  • Single document: only "default" document supported; no multi-doc
  • Single-user token: the most recent OAuth token is cached globally and reused for all HF API calls, whichever user triggers them
  • No rate limiting on /api/chat or /api/citations/*
  • XSS surface: meta.licence and biblioHtml are inserted into published HTML without escaping
  • WS debug logging: every WebSocket message is logged in production
  • No .env.example file: outside this spec, environment variables are documented only in code
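
For the plain-text field in the XSS item above, a standard HTML-escaping helper would close the gap. The sketch below is illustrative: escapeHtml and renderLicence are hypothetical names and the footer markup is invented for the example. It does not apply to biblioHtml, which is markup by design and would need an HTML sanitizer rather than escaping.

```typescript
const HTML_ESCAPES: Record<string, string> = {
  "&": "&amp;",
  "<": "&lt;",
  ">": "&gt;",
  '"': "&quot;",
  "'": "&#39;",
};

// Replace the five characters that can break out of HTML text or attribute context.
function escapeHtml(value: string): string {
  return value.replace(/[&<>"']/g, (ch) => HTML_ESCAPES[ch] ?? ch);
}

// Hypothetical footer fragment showing where the escaped licence would be interpolated.
function renderLicence(licence: string): string {
  return `<p class="licence">${escapeHtml(licence)}</p>`;
}

const safeFooter = renderLicence("CC-BY <script>alert(1)</script>");
```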