---
title: Doc Alive - RAG to Image
emoji: 🎨
colorFrom: blue
colorTo: pink
sdk: gradio
sdk_version: "5.44.1"
app_file: app.py
pinned: false
---
# Doc Alive: RAG-to-Image with OpenAI
This project turns documents into **illustrations** using RAG (Retrieval-Augmented Generation), LLM prompt engineering, and OpenAI's image generation.
Upload a `.txt`, `.md`, or `.pdf` file, describe your goal, and the app will:
1. **Extract text** from your file
2. **Retrieve key excerpts** using embeddings
3. **Ask an LLM** to craft a structured image generation spec
4. **Generate an illustration** with OpenAI's image model
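
The retrieval step (step 2) can be sketched as follows. This is a minimal illustration, not the code in `rag/`: the chunk sizes, the function names `chunk_text` and `retrieve`, and the injected `embed` callable are all assumptions made for the example. In the app, `embed` would wrap the OpenAI embeddings call; here any function that maps a list of strings to a list of vectors will do.

```python
import math
from typing import Callable, List, Sequence

Vector = Sequence[float]


def chunk_text(text: str, size: int = 500, overlap: int = 50) -> List[str]:
    """Split text into overlapping character windows (sizes are illustrative)."""
    if size <= overlap:
        raise ValueError("size must be larger than overlap")
    step = size - overlap
    chunks = [text[i:i + size] for i in range(0, len(text), step)]
    return [c for c in chunks if c.strip()]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity; returns 0.0 when either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(
    query: str,
    chunks: Sequence[str],
    embed: Callable[[List[str]], List[Vector]],  # hypothetical embedding hook
    k: int = 3,
) -> List[str]:
    """Return the k chunks whose embeddings are closest to the query's."""
    if not chunks:
        return []
    vectors = embed([query] + list(chunks))
    query_vec, chunk_vecs = vectors[0], vectors[1:]
    ranked = sorted(
        zip(chunks, chunk_vecs),
        key=lambda pair: cosine(query_vec, pair[1]),
        reverse=True,
    )
    return [chunk for chunk, _ in ranked[:k]]
```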
---
## Demo
This app runs on **Hugging Face Spaces** using **Gradio**.
---
## API Key
You must provide your own **OpenAI API key** to use this demo.
- Enter your key in the input box (starts with `sk-...`)
- The key is **not stored**; it is held in memory only for your current session
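
One way to keep the key in memory only is to build a client per request from the value typed into the input box, rather than writing it to disk or to an environment variable. The sketch below assumes the official `openai` Python package; the helper names `normalize_api_key` and `make_client` are hypothetical and may not match what `app.py` does.

```python
def normalize_api_key(raw: str) -> str:
    """Trim whitespace and check the `sk-` prefix described above."""
    key = raw.strip()
    if not key.startswith("sk-") or len(key) <= len("sk-"):
        raise ValueError("Expected an OpenAI API key starting with 'sk-'")
    return key


def make_client(raw_key: str):
    """Create a client for this request only; nothing is persisted."""
    # Imported lazily so the validation helper works without the package.
    from openai import OpenAI

    return OpenAI(api_key=normalize_api_key(raw_key))
```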
---
## Project Structure
    ├─ app.py            # Gradio UI entry
    ├─ requirements.txt  # Dependencies
    ├─ rag/              # Text extraction + retrieval
    ├─ llm/              # Structured LLM call helper
    └─ generation/       # Image generation helper
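
To illustrate what a "structured" spec passed from `llm/` to `generation/` might look like, here is a small sketch. The field names (`subject`, `style`, `palette`) and the `ImageSpec` and `parse_spec` names are assumptions for the example; the actual schema used by the project is not documented here.

```python
import json
from dataclasses import dataclass, field
from typing import List


@dataclass
class ImageSpec:
    """Hypothetical structured spec returned by the LLM step."""
    subject: str
    style: str = "flat illustration"
    palette: List[str] = field(default_factory=list)

    def to_prompt(self) -> str:
        """Flatten the spec into a single prompt string for the image model."""
        parts = [self.subject, f"in the style of {self.style}"]
        if self.palette:
            parts.append("color palette: " + ", ".join(self.palette))
        return ", ".join(parts)


def parse_spec(raw: str) -> ImageSpec:
    """Parse the LLM's JSON reply, rejecting replies without a subject."""
    data = json.loads(raw)
    if not isinstance(data, dict) or not str(data.get("subject", "")).strip():
        raise ValueError("spec must be a JSON object with a non-empty 'subject'")
    return ImageSpec(
        subject=str(data["subject"]).strip(),
        style=str(data.get("style") or "flat illustration"),
        palette=[str(color) for color in data.get("palette") or []],
    )
```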
---
## Tech Stack
- [Gradio](https://www.gradio.app/): UI framework
- [OpenAI](https://platform.openai.com/): LLM + image generation
- [RAG (text-embedding-3-small)](https://platform.openai.com/docs/guides/embeddings): semantic retrieval
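
As a rough sketch of the embedding call behind the retrieval step, the helper below requests vectors from `text-embedding-3-small` through the `openai` client's `embeddings.create` method. The function name `embed_texts` is hypothetical, and the client is passed in so the helper can be exercised with a stand-in object.

```python
from typing import List, Sequence

EMBEDDING_MODEL = "text-embedding-3-small"


def embed_texts(client, texts: Sequence[str]) -> List[List[float]]:
    """Embed texts with an `openai.OpenAI`-style client, preserving input order."""
    if not texts:
        return []
    response = client.embeddings.create(model=EMBEDDING_MODEL, input=list(texts))
    # Each returned item carries the index of the input it belongs to.
    ordered = sorted(response.data, key=lambda item: item.index)
    return [list(item.embedding) for item in ordered]
```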
---
## ⚠️ Notes
- The OpenAI API key is required for both embeddings and image generation
- We do **not** log or save your key
- OpenAI bills all API calls made with your key to your own account
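
For the image-generation call mentioned in these notes, a minimal sketch using the `openai` client's `images.generate` method is shown below. The model name `gpt-image-1`, the size, and the function name `generate_image_bytes` are assumptions; the README does not say which image model the app uses. The sketch expects a base64 payload in the response, which some models return only when asked to.

```python
import base64


def generate_image_bytes(
    client,
    prompt: str,
    model: str = "gpt-image-1",  # assumed model, not confirmed by the README
    size: str = "1024x1024",
) -> bytes:
    """Request one image and return its decoded bytes."""
    result = client.images.generate(model=model, prompt=prompt, size=size)
    payload = result.data[0].b64_json
    if not payload:
        raise RuntimeError("Response did not include a base64 image payload")
    return base64.b64decode(payload)
```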