title: Powerpoint AI
emoji: 🎨
colorFrom: green
colorTo: yellow
sdk: docker
app_port: 7860
pinned: true

Powerpoint AI

Powerpoint AI is a Next.js presentation generator that turns a prompt into editable slides, lets the user refine them in a Google Slides-like editor, and exports the final deck as a .pptx file. The project is built around Hugging Face login and inference, a template-driven slide system, and a hybrid editor that supports both generated slide specs and draggable canvas elements.

This README is written as a learning guide. It explains:

  1. What the app does end-to-end.
  2. How to run it locally and deploy it to a Hugging Face Space.
  3. How the request flow moves from the homepage to the editor.
  4. What each important file does and how the files connect.
  5. How to rebuild the project manually if you want to learn by coding it yourself.

What The App Does

At a high level, the app works like this:

  1. The user opens / and enters a presentation prompt.
  2. The home page verifies that the user is logged in with Hugging Face.
  3. The prompt and selected template are stored in browser session/local storage.
  4. The app routes the user to /editor.
  5. The editor calls /api/presentations/generate with the prompt and model.
  6. The backend requests structured JSON slides from Hugging Face Inference.
  7. The response is normalized into the app's SlideSpec shape.
  8. The editor renders those slides through the template/theme system.
  9. The user edits text, layout, images, and formatting inside the editor.
  10. The user exports the deck as an editable PowerPoint file.

Stack

  • Next.js App Router
  • React + TypeScript
  • Tailwind CSS
  • Hugging Face OAuth + Inference
  • Unsplash image search/download integration
  • pptxgenjs for editable PPTX export

Prerequisites

You need:

  • Node.js 20+
  • npm
  • A Hugging Face account
  • A manually created Hugging Face OAuth connected app
  • Optional: an Unsplash access key if you want real image search results

Environment Variables

Set these in .env for local development or in Hugging Face Space secrets/variables for deployment.

| Variable | Required | Purpose |
| --- | --- | --- |
| HUGGINGFACE_CLIENT_ID | Yes, for login | Client ID from your manual Hugging Face connected app |
| HF_TOKEN | Optional | Fallback Hugging Face token if the user is not logged in |
| HF_API_KEY | Optional | Alternate fallback token name supported by the generation route |
| UNSPLASH_ACCESS_KEY | Optional | Enables live Unsplash image search/results |
| NEXT_PUBLIC_UNSPLASH_ACCESS_KEY | Optional | Alternate client-visible Unsplash key fallback used by some routes |

Notes:

  • This project is currently set up for a manual connected app, not Hugging Face Spaces native OAuth.
  • Do not add hf_oauth: true if you want to keep using your existing connected app.
  • The allowed redirect URI in your Hugging Face connected app should be your actual app origin, for example:
    • http://localhost:3000/
    • https://reubencf-powerpoint-ai.hf.space/

Local Development

  1. Install dependencies:

    npm install
    
  2. Create .env and add the variables you need:

    HUGGINGFACE_CLIENT_ID=your_client_id
    HF_TOKEN=your_optional_fallback_token
    UNSPLASH_ACCESS_KEY=your_optional_unsplash_key
    
  3. Start the dev server:

    npm run dev
    
  4. Open:

    http://localhost:3000
    
  5. Log in with Hugging Face, enter a prompt, and verify that generation reaches /editor.

Hugging Face Space Deployment

This repo is configured as a Docker Space using the frontmatter at the top of this file.

To deploy:

  1. Create a Hugging Face Docker Space.
  2. Push this repo to the Space.
  3. Add the required Space secrets/variables:
    • HUGGINGFACE_CLIENT_ID
    • optionally HF_TOKEN
    • optionally UNSPLASH_ACCESS_KEY
  4. In the settings of your manual Hugging Face connected app, add the Space origin as an allowed redirect URI.
  5. Redeploy the Space.

Important:

  • The client login flow currently reads HUGGINGFACE_CLIENT_ID from the /api/auth/client-id route.
  • The login flow dynamically uses window.location.origin as the redirect target.
  • On Hugging Face Spaces, the homepage detects iframe/Space hosting and escapes the container before login if needed.
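
The last two points can be sketched as a pair of small helpers. This is only an illustration, not the project's actual code: WindowLike, isEmbeddedInFrame, and loginRedirectUri are hypothetical names, and the trailing slash on the redirect URI simply mirrors the example redirect URIs listed under Environment Variables.

```typescript
// Hypothetical minimal view of the browser window, so the helpers can be
// exercised outside a browser. In the app you would pass the real `window`.
interface WindowLike {
  self: unknown;
  top: unknown;
  location: { origin: string };
}

// A page is embedded (for example inside the Spaces iframe) when its own
// window object is not the top-level window.
function isEmbeddedInFrame(win: WindowLike): boolean {
  try {
    return win.self !== win.top;
  } catch {
    // Defensive assumption: if the check itself fails, treat the page as embedded.
    return true;
  }
}

// Assumption: the redirect URI is the current origin plus a trailing slash.
function loginRedirectUri(win: WindowLike): string {
  return `${win.location.origin}/`;
}
```

If isEmbeddedInFrame returns true, the homepage would open the app at its own origin in a top-level tab before starting the OAuth redirect.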

End-To-End Architecture Flow

1. Route entrypoints

  • / loads the homepage prompt builder.
  • /editor loads the editor only if a generation session was created from the homepage.
  • /template exists as a template preview/browsing route.
  • /api/* provides auth, generation, AI edit, image search, and image proxy services.

2. Homepage to editor handoff

The homepage stores these keys before routing to the editor:

  • generationPrompt
  • generationModel
  • isGenerating
  • editorAccess
  • ppt_theme in local storage

The editor reads those values on mount to decide whether to:

  • redirect back to /
  • restore the chosen theme
  • call the generation API automatically
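
A minimal sketch of this handoff, under stated assumptions: the key names come from the list above, but the stored values ("true" flags), the choice of session storage for everything except ppt_theme, and the names KeyValueStore, storeHandoff, and planEditorBoot are hypothetical. The storage objects are passed in so the logic can run outside a browser.

```typescript
// Subset of the Web Storage API used here; window.sessionStorage and
// window.localStorage both satisfy it.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

interface HandoffInput {
  prompt: string;
  model: string;
  templateId: string;
}

interface EditorBootPlan {
  redirectHome: boolean;
  theme: string | null;
  generate: { prompt: string; model: string | null } | null;
}

// Homepage side: write the handoff keys before routing to /editor.
function storeHandoff(session: KeyValueStore, local: KeyValueStore, input: HandoffInput): void {
  session.setItem("generationPrompt", input.prompt);
  session.setItem("generationModel", input.model);
  session.setItem("isGenerating", "true"); // assumption: flags are stored as "true"
  session.setItem("editorAccess", "true");
  local.setItem("ppt_theme", input.templateId);
}

// Editor side: decide what to do on mount.
function planEditorBoot(session: KeyValueStore, local: KeyValueStore): EditorBootPlan {
  if (session.getItem("editorAccess") !== "true") {
    return { redirectHome: true, theme: null, generate: null };
  }
  const theme = local.getItem("ppt_theme");
  const prompt = session.getItem("generationPrompt");
  if (prompt === null || prompt.trim() === "" || session.getItem("isGenerating") !== "true") {
    return { redirectHome: false, theme, generate: null };
  }
  return {
    redirectHome: false,
    theme,
    generate: { prompt, model: session.getItem("generationModel") },
  };
}
```

Keeping the decision in one pure function makes the three outcomes (redirect, restore theme, auto-generate) easy to test without mounting the editor.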

3. Generation backend flow

/api/presentations/generate does the following:

  1. Validates the prompt.
  2. Resolves the Hugging Face token from headers, cookies, or env fallback.
  3. Builds the system prompt with lib/slide-prompt.ts.
  4. Calls the approved model through lib/hf-client.ts.
  5. Parses loose model JSON defensively.
  6. Normalizes the payload into the app's slide shape.
  7. Optionally enriches image slides with Unsplash data.
  8. Returns normalized slide data for the editor.
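
Steps 5 and 6 are where most model-output problems surface, so a rough sketch may help. Everything here is an assumption for illustration: the function names, the slides/layout/title/bullets field names, and the "title-and-body" default are hypothetical and may not match the real route.

```typescript
interface NormalizedSlide {
  layout: string;
  title: string;
  bullets: string[];
}

function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Pull a JSON object out of model output that may be wrapped in markdown
// fences or surrounded by prose.
function parseLooseJson(raw: string): unknown {
  // Remove triple-backtick fence markers, with or without a "json" tag.
  const fenceMarker = new RegExp("`{3}(?:json)?", "gi");
  const unfenced = raw.replace(fenceMarker, "").trim();
  const start = unfenced.indexOf("{");
  const end = unfenced.lastIndexOf("}");
  if (start === -1 || end <= start) {
    throw new Error("No JSON object found in model output");
  }
  const candidate = unfenced.slice(start, end + 1);
  try {
    return JSON.parse(candidate);
  } catch {
    // Fallback for a common defect: trailing commas before } or ].
    // This is a heuristic and could alter string values containing ",}" or ",]".
    return JSON.parse(candidate.replace(/,\s*([}\]])/g, "$1"));
  }
}

// Coerce the parsed payload into a predictable shape, dropping anything
// that is not a usable slide object.
function normalizeSlides(payload: unknown): NormalizedSlide[] {
  const slides = isRecord(payload) && Array.isArray(payload["slides"]) ? payload["slides"] : [];
  return slides.filter(isRecord).map((slide) => ({
    layout: typeof slide["layout"] === "string" ? slide["layout"] : "title-and-body",
    title: typeof slide["title"] === "string" ? slide["title"].trim() : "",
    bullets: Array.isArray(slide["bullets"])
      ? slide["bullets"].filter((b: unknown): b is string => typeof b === "string")
      : [],
  }));
}
```

The design choice is to fail loudly only when no JSON object can be found at all, and otherwise to degrade to defaults so the editor always receives well-formed slides.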

4. Editor rendering flow

The editor supports two rendering modes:

  • Template mode: uses SlideSpec and SlideFactory to render theme-specific layouts.
  • Canvas mode: uses positioned EditorElement objects for draggable/editable items.

For generated presentations, the app first normalizes API data into SlideSpec, then builds canvas/editor elements where needed so the legacy editing features still work.

5. Export flow

  • The top bar export button calls hooks/useExport.ts.
  • useExport delegates to lib/editable-pptx-export.ts.
  • The exporter converts the current editor state into an editable .pptx.

How The Editor Works Internally

The editor is intentionally state-heavy because it coordinates:

  • the list of slides
  • the current slide index
  • selected element state
  • inline text editing state
  • zoom state
  • undo/redo history
  • AI text editing
  • image replacement/search
  • template-rendered slides and older element-based slides

The two most important ideas are:

  1. SlideSpec is the declarative template-friendly shape.
  2. SlideModel / EditorElement is the interactive editor canvas shape.

That split is why some code looks duplicated: the app preserves a template rendering pipeline while also keeping direct canvas editing features.
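
To make the split concrete, here is a simplified sketch of the two shapes and a one-way mapping between them. The field names, the 1280x720 canvas, the box positions, and specToModel are all hypothetical; the real types live in the project's type and helper files and are likely richer.

```typescript
// Declarative, template-friendly shape: a layout ID plus named text fields.
interface SlideSpec {
  layoutId: string;
  fields: Record<string, string>;
}

// Interactive canvas shape: absolutely positioned elements.
interface EditorElement {
  id: string;
  kind: "text" | "image";
  x: number;
  y: number;
  width: number;
  height: number;
  content: string;
}

interface SlideModel {
  elements: EditorElement[];
}

// Assumed default boxes on a 1280x720 canvas.
const TITLE_BOX = { x: 80, y: 60, width: 1120, height: 120 };
const BODY_BOX = { x: 80, y: 220, width: 1120, height: 400 };

// Build canvas elements from a spec so drag/edit features have something to act on.
function specToModel(spec: SlideSpec, slideIndex: number): SlideModel {
  const elements: EditorElement[] = [];
  const title = spec.fields["title"];
  const body = spec.fields["body"];
  if (title) {
    elements.push({ id: `slide-${slideIndex}-title`, kind: "text", ...TITLE_BOX, content: title });
  }
  if (body) {
    elements.push({ id: `slide-${slideIndex}-body`, kind: "text", ...BODY_BOX, content: body });
  }
  return { elements };
}
```

The spec stays the source of truth for themed rendering, while the derived elements carry the geometry that dragging and resizing need.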

How Templates And Themes Are Organized

There are three template families right now:

  • neobrutalism
  • galeryn
  • noisy

Each template has:

  • a declarative definition in data/templates/*.ts
  • theme-specific React layouts in components/slides/<theme>/layouts.tsx
  • fallback generic layout components in components/slides/*.tsx

The registry in data/templates/index.ts describes:

  • template IDs
  • layout IDs
  • field definitions
  • style tokens
  • default slide data

The SlideFactory uses that registry plus the selected template ID to render the correct slide component.
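
The lookup-with-fallback idea can be sketched without React. In this illustration renderers return strings instead of components, and the registry contents, layout IDs, and resolveRenderer are hypothetical stand-ins for what data/templates/index.ts and SlideFactory actually do.

```typescript
type SlideFields = Record<string, string>;
type SlideRenderer = (fields: SlideFields) => string;

// Generic fallback layouts, keyed by layout ID.
const fallbackLayouts = new Map<string, SlideRenderer>();
fallbackLayouts.set("title", (f: SlideFields) => `[generic title] ${f["title"] ?? ""}`);
fallbackLayouts.set("title-and-body", (f: SlideFields) => `[generic body] ${f["title"] ?? ""}`);

// Theme-specific layouts, keyed by template ID and then layout ID.
const neobrutalismLayouts = new Map<string, SlideRenderer>();
neobrutalismLayouts.set("title", (f: SlideFields) => `[neobrutalism title] ${f["title"] ?? ""}`);

const templateRegistry = new Map<string, Map<string, SlideRenderer>>();
templateRegistry.set("neobrutalism", neobrutalismLayouts);
templateRegistry.set("galeryn", new Map<string, SlideRenderer>()); // no themed layouts in this sketch

// Prefer the themed layout; fall back to the generic one; fail on unknown layout IDs.
function resolveRenderer(templateId: string, layoutId: string): SlideRenderer {
  const themed = templateRegistry.get(templateId)?.get(layoutId);
  if (themed) return themed;
  const generic = fallbackLayouts.get(layoutId);
  if (generic) return generic;
  throw new Error(`Unknown layout: ${layoutId}`);
}
```

With this structure, adding a theme means registering only the layouts it overrides; everything else keeps rendering through the generic components.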

How AI Text Editing, Image Search, And Export Connect

AI text editing

  • The editor opens components/editor/AIToolsDialog.tsx.
  • That dialog calls /api/ai-edit-text, implemented in app/api/ai-edit-text/route.ts.
  • The route uses the Hugging Face provider layer to rewrite selected text.

Image search

  • The editor opens components/UnsplashImageSearch.tsx.
  • That component calls /api/search-images (app/api/search-images/route.ts).
  • /api/unsplash-download (app/api/unsplash-download/route.ts) handles Unsplash download tracking.
  • /api/image-proxy (app/api/image-proxy/route.ts) helps keep remote images safe for capture/export flows.

Export

  • The editor export button calls hooks/useExport.ts.
  • useExport calls lib/editable-pptx-export.ts.
  • The export library reads current slides, theme data, and layout information to create a real PowerPoint file.

File Map

This section is intentionally explicit so a new contributor can jump from route to component to helper without guessing.

App routes and app shell

  • app/layout.tsx
    • Root HTML/layout wrapper.
    • Adds global fonts, metadata, and the ThemeProvider.
  • app/globals.css
    • Global styles and utility-level CSS used across pages/components.
  • app/page.tsx
    • Root route entrypoint.
    • Re-exports components/home/HomePage.tsx.
  • app/editor/page.tsx
    • Guards /editor with sessionStorage.editorAccess.
    • Renders GoogleSlidesEditor.
  • app/template/page.tsx
    • Template preview/browsing route.

API routes

  • app/api/presentations/generate/route.ts
    • Main slide generation endpoint.
    • Calls HFClient, buildSlidePrompt, and Unsplash helpers.
  • app/api/ai-edit-text/route.ts
    • AI text rewriting endpoint for the editor.
  • app/api/search-images/route.ts
    • Searches Unsplash images for the editor modal.
  • app/api/unsplash-download/route.ts
    • Handles Unsplash download tracking.
  • app/api/image-proxy/route.ts
    • Proxies remote images for safe browser capture/export use.
  • app/api/auth/client-id/route.ts
    • Returns HUGGINGFACE_CLIENT_ID to the client login flow.

Homepage and top-level UI

  • components/home/HomePage.tsx
    • Landing page.
    • Restores auth, handles login, stores generation state, and routes to /editor.
  • components/ThemeProvider.tsx
    • Theme context wrapper built on next-themes.
  • components/UnsplashImageSearch.tsx
    • Search/select modal for remote images.

Editor components

  • components/editor/GoogleSlidesEditor.tsx
    • Main workspace.
    • Owns generation bootstrap, editor state, slide switching, toolbar wiring, and export entry.
  • components/editor/BottomToolbar.tsx
    • Floating toolbar for layout, typography, AI tools, zoom, and slide actions.
  • components/editor/AIToolsDialog.tsx
    • Text-editing dialog for AI-powered text changes.

Slide rendering

  • components/slides/SlideFactory.tsx
    • Central dispatcher that chooses the correct themed or fallback slide layout.
  • components/slides/TitleSlideLayout.tsx
    • Generic fallback layout for title/subtitle slides.
  • components/slides/AgendaSlideLayout.tsx
    • Generic fallback layout for agenda slides.
  • components/slides/TitleAndBodyLayout.tsx
    • Generic fallback layout for title/body slides.
  • components/slides/ThreeColumnLayout.tsx
    • Generic fallback layout for three-column slides.
  • components/slides/ImageAndTextLayout.tsx
    • Generic fallback layout for image/text slides.
  • components/slides/ReferenceLayout.tsx
    • Generic fallback layout for reference/list slides.
  • components/slides/ThankYouLayout.tsx
    • Generic fallback layout for ending slides.
  • components/slides/neobrutalism/layouts.tsx
    • Theme-specific layouts for the Neo-Brutalism template.
  • components/slides/galeryn/layouts.tsx
    • Theme-specific layouts for the Galeryn template.
  • components/slides/noisy/layouts.tsx
    • Theme-specific layouts for the Noisy template.
  • components/slides/shared/PersistedDraggableSurface.tsx
    • Shared editable/draggable slide-surface logic used by themed layouts.

Template registry

  • data/templates/index.ts
    • Template registry and template-related type definitions.
  • data/templates/neo-brutalism.ts
    • Neo-Brutalism template declaration.
  • data/templates/galeryn.ts
    • Galeryn template declaration.
  • data/templates/noisy.ts
    • Noisy template declaration.

Hooks

  • hooks/useExport.ts
    • Small export wrapper used by the editor.
  • hooks/useKeyboardShortcuts.ts
    • Keyboard shortcut behavior for editor actions.
  • hooks/useSlideHistory.ts
    • Undo/redo history storage for slides.

Library helpers

  • lib/ai-models.ts
    • Approved model constants and model validation.
  • lib/hf-client.ts
    • Hugging Face inference wrapper and provider-specific error handling.
  • lib/slide-prompt.ts
    • System prompt builder and layout normalization helpers.
  • lib/generated-presentation.ts
    • Typed generated-slide response helpers and SlideSpec mapping.
  • lib/template-options.ts
    • Template picker labels and normalization of saved IDs.
  • lib/theme-system.ts
    • Theme-level types/helpers used by the rendering system.
  • lib/editor-types.ts
    • Shared editor model types.
  • lib/editor-themes.ts
    • Theme/color/font definitions used by the editor and export code.
  • lib/layout-templates.ts
    • Default canvas layout creation helpers.
  • lib/editable-pptx-export.ts
    • Editable PowerPoint export implementation.
  • lib/capture-element.ts
    • DOM capture helper used by image/export flows.
  • lib/utils.ts
    • Small shared utility helpers.

Shared types

  • types/index.ts
    • Project-wide shared types that do not belong to a single feature module.

Config

  • next.config.ts
    • Next.js configuration.
  • package.json
    • Scripts, dependencies, and project metadata.
  • Dockerfile
    • Runtime image used by the Hugging Face Docker Space.

How To Manually Build This Project Yourself

If you want to learn by recreating the app manually, build it in this order:

Step 1: Create the app shell

  1. Start a Next.js app with TypeScript and App Router.
  2. Add app/layout.tsx and app/globals.css.
  3. Add a ThemeProvider so dark/light mode is available globally.

Step 2: Build the homepage

  1. Create a landing page with:
    • a prompt textarea
    • a template picker
    • a submit button
  2. Add client-side storage for:
    • the prompt
    • the selected template
    • a session flag that allows /editor
  3. Add Hugging Face login/logout state restoration.

Step 3: Add Hugging Face login

  1. Create a manual connected app in Hugging Face settings.
  2. Add HUGGINGFACE_CLIENT_ID to env/secrets.
  3. Create app/api/auth/client-id/route.ts.
  4. Use oauthLoginUrl and oauthHandleRedirectIfPresent on the homepage.
  5. Store the returned access token in local storage for later API calls.
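
One way steps 3 to 5 could fit together is sketched below. The two OAuth helpers are injected rather than imported so the flow can be exercised with fakes; in the app they would be oauthLoginUrl and oauthHandleRedirectIfPresent from @huggingface/hub. The storage key, the requested scopes, and the names LoginDeps and completeOrStartLogin are assumptions, not the project's actual values.

```typescript
// Simplified result shape; the real helper returns a richer object.
interface OAuthResultLike {
  accessToken: string;
}

interface LoginDeps {
  oauthLoginUrl(opts: { clientId: string; redirectUrl: string; scopes: string }): Promise<string>;
  oauthHandleRedirectIfPresent(): Promise<OAuthResultLike | false>;
  fetchClientId(): Promise<string>; // e.g. a GET to /api/auth/client-id
  store: { setItem(key: string, value: string): void }; // e.g. window.localStorage
  navigate(url: string): void; // e.g. assign window.location.href
  origin: string; // e.g. window.location.origin
}

const TOKEN_STORAGE_KEY = "hf_access_token"; // hypothetical key name

async function completeOrStartLogin(deps: LoginDeps): Promise<"logged-in" | "redirecting"> {
  // If we just came back from Hugging Face, finish the exchange and keep the token.
  const result = await deps.oauthHandleRedirectIfPresent();
  if (result) {
    deps.store.setItem(TOKEN_STORAGE_KEY, result.accessToken);
    return "logged-in";
  }
  // Otherwise build the authorize URL and send the browser there.
  const clientId = await deps.fetchClientId();
  const url = await deps.oauthLoginUrl({
    clientId,
    redirectUrl: `${deps.origin}/`,
    scopes: "openid profile inference-api", // assumption: scopes this app would request
  });
  deps.navigate(url);
  return "redirecting";
}
```

On the homepage this would run once on mount to complete a pending redirect, and again when the user clicks the login button.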

Step 4: Create the generation API

  1. Create app/api/presentations/generate/route.ts.
  2. Accept a prompt and optional model input.
  3. Resolve the Hugging Face token from headers/cookies/env.
  4. Build a strict system prompt that asks for presentation JSON.
  5. Call the model through a small Hugging Face client wrapper.
  6. Parse and normalize the model response into your own slide format.

Step 5: Define the slide data model

  1. Create a template-side shape like SlideSpec.
  2. Create an editor-side shape like SlideModel / EditorElement.
  3. Add normalization helpers that map generated API data into those shapes.

Step 6: Build the template system

  1. Define templates in data/templates/*.ts.
  2. Add a registry in data/templates/index.ts.
  3. Create a SlideFactory that renders a slide by template ID + layout ID.
  4. Add generic fallback layouts so rendering still works if a template-specific layout is missing.

Step 7: Build the editor

  1. Create an editor page that reads the generation session.
  2. Call the generation API on first load.
  3. Store slides, current selection, zoom, and undo/redo history.
  4. Render:
    • a slide thumbnail rail
    • a main canvas
    • a toolbar
    • dialogs/modals for AI editing and image search
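
For the undo/redo part of step 3, a snapshot-based history is one common approach. The class below is a generic sketch rather than the contents of hooks/useSlideHistory.ts; the name SlideHistoryStack and the default limit of 50 are assumptions, and it expects callers to push new immutable snapshots instead of mutating the current one.

```typescript
class SlideHistoryStack<T> {
  private past: T[] = [];
  private future: T[] = [];
  private present: T;
  private readonly limit: number;

  constructor(initial: T, limit: number = 50) {
    this.present = initial;
    this.limit = limit;
  }

  get current(): T {
    return this.present;
  }

  get canUndo(): boolean {
    return this.past.length > 0;
  }

  get canRedo(): boolean {
    return this.future.length > 0;
  }

  // Record a new state. Any redo branch is discarded, and the oldest
  // snapshot is dropped once the limit is exceeded.
  push(next: T): void {
    this.past.push(this.present);
    if (this.past.length > this.limit) this.past.shift();
    this.present = next;
    this.future = [];
  }

  // Step back one state; a no-op when there is nothing to undo.
  undo(): T {
    if (this.past.length > 0) {
      this.future.push(this.present);
      this.present = this.past.pop() as T;
    }
    return this.present;
  }

  // Step forward one state; a no-op when there is nothing to redo.
  redo(): T {
    if (this.future.length > 0) {
      this.past.push(this.present);
      this.present = this.future.pop() as T;
    }
    return this.present;
  }
}
```

In a React hook, the stack would be kept in a ref and the current value mirrored into state so that undo and redo trigger a re-render.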

Step 8: Add AI text editing and images

  1. Create app/api/ai-edit-text/route.ts.
  2. Add a dialog to send selected text for rewrite.
  3. Create Unsplash search and download routes.
  4. Add an image picker modal in the editor.
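
For step 3, the server-side routes mostly need to build authenticated requests to the Unsplash API. The sketch below only constructs the request description so it stays independent of any HTTP client. The helper names and the default page size are hypothetical, and the key precedence is an assumption; the search endpoint and the Client-ID authorization header follow Unsplash's public API.

```typescript
interface UnsplashRequest {
  url: string;
  headers: Record<string, string>;
}

// Assumed precedence: the server-only key first, then the client-visible fallback.
function resolveUnsplashKey(env: Record<string, string | undefined>): string | null {
  return env["UNSPLASH_ACCESS_KEY"] || env["NEXT_PUBLIC_UNSPLASH_ACCESS_KEY"] || null;
}

function unsplashHeaders(accessKey: string): Record<string, string> {
  return { Authorization: `Client-ID ${accessKey}`, "Accept-Version": "v1" };
}

// Request for the photo search endpoint.
function buildUnsplashSearch(query: string, accessKey: string, perPage: number = 12): UnsplashRequest {
  const q = encodeURIComponent(query.trim());
  return {
    url: `https://api.unsplash.com/search/photos?query=${q}&per_page=${perPage}`,
    headers: unsplashHeaders(accessKey),
  };
}

// Request that reports a download. `downloadLocation` is the
// links.download_location value returned with each photo.
function buildUnsplashDownloadPing(downloadLocation: string, accessKey: string): UnsplashRequest {
  return { url: downloadLocation, headers: unsplashHeaders(accessKey) };
}
```

A route would pass the resulting url and headers to fetch and return a trimmed list of image URLs and attribution data to the picker modal. When resolveUnsplashKey returns null, the route can respond with an empty result set or placeholder images.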

Step 9: Add export

  1. Create a hook for export actions.
  2. Convert the current slide state into pptxgenjs calls.
  3. Expose a top-bar export button in the editor.
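
The core of step 2 is a unit conversion: pptxgenjs positions objects in inches, while editor elements are usually stored in pixels. Here is a small sketch of that mapping, assuming a 1280x720 editor canvas and the 10 x 5.625 inch default 16:9 slide; the names CanvasBox, InchBox, and toInchBox are hypothetical.

```typescript
interface CanvasBox {
  x: number;
  y: number;
  width: number;
  height: number;
}

// Option names (x, y, w, h) match what pptxgenjs expects for positioning.
interface InchBox {
  x: number;
  y: number;
  w: number;
  h: number;
}

const CANVAS_WIDTH_PX = 1280; // assumption: editor canvas width
const SLIDE_WIDTH_IN = 10; // width of the 16:9 layout in inches

// Round to three decimals to avoid long floating-point tails in the output.
function roundInches(value: number): number {
  return Math.round(value * 1000) / 1000;
}

// Scale a pixel box on the editor canvas to inches on the slide.
function toInchBox(box: CanvasBox, canvasWidthPx: number = CANVAS_WIDTH_PX): InchBox {
  const scale = SLIDE_WIDTH_IN / canvasWidthPx;
  return {
    x: roundInches(box.x * scale),
    y: roundInches(box.y * scale),
    w: roundInches(box.width * scale),
    h: roundInches(box.height * scale),
  };
}
```

In the exporter, each text element would then become something like slide.addText(content, { ...toInchBox(element), fontSize: 18 }) on a slide created with pptx.addSlide(), followed by pptx.writeFile({ fileName: "deck.pptx" }) once all slides are added. Text stays editable in PowerPoint because it is written as text boxes, not as a captured image.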

Step 10: Polish and verify

  1. Add keyboard shortcuts and history.
  2. Add route guards so /editor is only reached through the expected flow.
  3. Verify login, generation, editing, image selection, and export.

Contributor Notes

  • Prefer the template registry and SlideSpec path when adding new slide styles.
  • Keep behavior changes separate from cleanup/documentation changes.
  • If you change storage keys or generation payload shapes, update both the homepage/editor handoff and this README.
  • If you add a new route or editor subsystem, add it to the file map above so new contributors can follow the flow quickly.