---
title: Doc Alive - RAG to Image
emoji: 📦🎨
colorFrom: blue
colorTo: pink
sdk: gradio
sdk_version: "5.44.1"
app_file: app.py
pinned: false
---

# 📦→🧠→🎨 Doc Alive: RAG-to-Image with OpenAI

This project turns documents into **illustrations** with the help of RAG (Retrieval-Augmented Generation), LLM prompt engineering, and OpenAI’s image generation.

Upload a `.txt`, `.md`, or `.pdf` file, describe your goal, and the app will:

1. **Extract text** from your file
2. **Retrieve key excerpts** using embeddings
3. **Ask an LLM** to craft a structured image generation spec
4. **Generate an illustration** with OpenAI’s image model

---

## 🚀 Demo

This app runs on **Hugging Face Spaces** using **Gradio**.

---

## 🔑 API Key

You must provide your own **OpenAI API key** to use this demo.

- Enter your key in the input box (starts with `sk-...`)
- The key is **not stored**; it is held in memory only for your current session

---

## 📂 Project Structure

    ├─ app.py             # Gradio UI entry
    ├─ requirements.txt   # Dependencies
    ├─ rag/               # Text extraction + retrieval
    ├─ llm/               # Structured LLM call helper
    ├─ generation/        # Image generation helper

---

## 🛠 Tech Stack

- [Gradio](https://www.gradio.app/) – UI framework
- [OpenAI](https://platform.openai.com/) – LLM + image generation
- [RAG (text-embedding-3-small)](https://platform.openai.com/docs/guides/embeddings) – semantic retrieval

---

## ⚠️ Notes

- The OpenAI API key is required for both embeddings and image generation
- We do **not** log or save your key
- OpenAI bills all API calls made with your key according to your usage
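
---

## 🔍 Retrieval Sketch

The "retrieve key excerpts using embeddings" step can be illustrated with a minimal sketch. It is not the code in `rag/`: the names `chunk_text`, `retrieve`, `toy_embed`, and `make_openai_embedder` are hypothetical, the chunking rule (paragraphs capped at 500 characters) is an assumption, and ranking by cosine similarity is one common choice rather than a confirmed detail of this app. The embedder is passed in as a function so the ranking logic can be tried offline without an API key.

```python
import math
from typing import Callable, List, Sequence, Tuple

# An embedder maps a batch of texts to one vector per text.
Embedder = Callable[[List[str]], List[List[float]]]


def chunk_text(text: str, max_chars: int = 500) -> List[str]:
    """Split text on blank lines, hard-cutting paragraphs longer than max_chars."""
    chunks: List[str] = []
    for para in text.split("\n\n"):
        para = para.strip()
        while para:
            chunks.append(para[:max_chars])
            para = para[max_chars:].strip()
    return chunks


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(goal: str, chunks: Sequence[str], embed: Embedder,
             top_k: int = 3) -> List[Tuple[str, float]]:
    """Return the top_k chunks most similar to the goal, with their scores."""
    if not chunks:
        return []
    vectors = embed([goal] + list(chunks))  # one batch: goal first, then chunks
    query, docs = vectors[0], vectors[1:]
    scored = [(chunk, cosine(query, vec)) for chunk, vec in zip(chunks, docs)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


def toy_embed(texts: List[str]) -> List[List[float]]:
    """Offline stand-in embedder: counts of a tiny fixed vocabulary."""
    vocab = ["cat", "dog", "rocket", "moon"]
    return [[float(t.lower().count(word)) for word in vocab] for t in texts]


def make_openai_embedder(api_key: str,
                         model: str = "text-embedding-3-small") -> Embedder:
    """Build an embedder backed by the OpenAI embeddings endpoint.

    The key lives only in this closure; nothing is written to disk.
    """
    from openai import OpenAI  # imported lazily so the offline path needs no SDK

    client = OpenAI(api_key=api_key)

    def embed(texts: List[str]) -> List[List[float]]:
        response = client.embeddings.create(model=model, input=texts)
        return [item.embedding for item in response.data]

    return embed
```

With a real key, the same ranking would run as `retrieve(goal, chunk_text(document_text), make_openai_embedder(key))`, and the returned excerpts would then be passed to the LLM step that writes the image spec.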