---
title: Doc Alive - RAG to Image
emoji: 🎨
colorFrom: blue
colorTo: pink
sdk: gradio
sdk_version: "5.44.1"
app_file: app.py
pinned: false
---

# Doc Alive: RAG-to-Image with OpenAI

This project turns documents into **illustrations** using RAG (Retrieval-Augmented Generation), LLM prompt engineering, and OpenAI's image generation.

Upload a `.txt`, `.md`, or `.pdf` file, describe your goal, and the app will:

1. **Extract text** from your file
2. **Retrieve key excerpts** using embeddings
3. **Ask an LLM** to craft a structured image generation spec
4. **Generate an illustration** with OpenAI's image model

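
The retrieval step (step 2 above) can be illustrated with a minimal sketch. It is not the code in `rag/`: the function names, chunk sizes, and the keyword-counting `toy_embed` are assumptions made so the example runs offline. In the real app, an embedding model such as `text-embedding-3-small` would take the place of `toy_embed`.

```python
import math
from typing import Callable, List, Sequence, Tuple


def chunk_text(text: str, size: int = 400, overlap: int = 50) -> List[str]:
    """Split text into overlapping character windows (sizes are illustrative)."""
    if size <= overlap:
        raise ValueError("size must be larger than overlap")
    step = size - overlap
    windows = (text[i:i + size] for i in range(0, len(text), step))
    return [w for w in windows if w.strip()]


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 when either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(
    goal: str,
    chunks: Sequence[str],
    embed: Callable[[str], Sequence[float]],
    k: int = 3,
) -> List[Tuple[float, str]]:
    """Rank chunks by similarity to the user's goal and keep the best k."""
    goal_vec = embed(goal)
    scored = [(cosine(goal_vec, embed(chunk)), chunk) for chunk in chunks]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)[:k]


def toy_embed(text: str) -> List[float]:
    """Hypothetical stand-in embedding: keyword counts, not a real model."""
    vocab = ("dragon", "castle", "river")
    words = text.lower().split()
    return [float(words.count(word)) for word in vocab]


if __name__ == "__main__":
    excerpts = ["the dragon sleeps", "a river runs", "the castle gate"]
    for score, chunk in top_k("dragon castle", excerpts, toy_embed, k=2):
        print(f"{score:.2f}  {chunk}")
```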
---

## Demo

This app runs on **Hugging Face Spaces** using **Gradio**.

---

## API Key

You must provide your own **OpenAI API key** to use this demo.

- Enter your key in the input box (it starts with `sk-...`)
- The key is **not stored**; it is only held in memory for your current session

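
One way to keep a key in memory only is to pass it as a plain function argument and build a fresh client per request. The sketch below assumes that approach; `normalize_api_key` and `make_client` are hypothetical names, not functions from this repository. In a Gradio UI, the input box could be a `gr.Textbox(type="password")` so the key is masked on screen.

```python
def normalize_api_key(raw: str) -> str:
    """Trim whitespace and check the expected 'sk-' prefix (hypothetical helper)."""
    key = (raw or "").strip()
    if not key.startswith("sk-"):
        raise ValueError("Expected an OpenAI API key starting with 'sk-'")
    return key


def make_client(raw_key: str):
    """Build a per-request client; the key is never written to disk or a global."""
    # Imported lazily so the validator above works without the SDK installed.
    from openai import OpenAI

    return OpenAI(api_key=normalize_api_key(raw_key))
```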
---

## Project Structure

    ├── app.py             # Gradio UI entry
    ├── requirements.txt   # Dependencies
    ├── rag/               # Text extraction + retrieval
    ├── llm/               # Structured LLM call helper
    └── generation/        # Image generation helper

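
The hand-off between `llm/` (which produces a structured spec) and `generation/` (which consumes it) might look roughly like the sketch below. The `ImageSpec` class and its fields (`subject`, `style`, `palette`, `details`) are illustrative assumptions; the actual spec format used by this project is not documented here.

```python
import json
from dataclasses import dataclass, field
from typing import List


@dataclass
class ImageSpec:
    """Hypothetical structured spec an LLM could be asked to return as JSON."""

    subject: str
    style: str = "flat illustration"
    palette: List[str] = field(default_factory=list)
    details: List[str] = field(default_factory=list)

    @classmethod
    def from_json(cls, raw: str) -> "ImageSpec":
        """Parse the LLM's JSON reply; only 'subject' is required."""
        data = json.loads(raw)
        return cls(
            subject=data["subject"],
            style=data.get("style", "flat illustration"),
            palette=list(data.get("palette", [])),
            details=list(data.get("details", [])),
        )

    def to_prompt(self) -> str:
        """Flatten the spec into a single text prompt for an image model."""
        parts = [f"{self.style} of {self.subject}"]
        if self.details:
            parts.append("Details: " + "; ".join(self.details))
        if self.palette:
            parts.append("Palette: " + ", ".join(self.palette))
        return ". ".join(parts)


if __name__ == "__main__":
    reply = '{"subject": "a lighthouse", "palette": ["navy", "gold"]}'
    print(ImageSpec.from_json(reply).to_prompt())
```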
---

## Tech Stack

- [Gradio](https://www.gradio.app/): UI framework
- [OpenAI](https://platform.openai.com/): LLM + image generation
- [RAG (text-embedding-3-small)](https://platform.openai.com/docs/guides/embeddings): semantic retrieval

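
For reference, requesting embeddings with the OpenAI Python SDK can be sketched as follows. `embed_texts` is a hypothetical wrapper, not a function from this repository, and `FakeClient` is an offline stand-in that mimics only the shape of the response so the example runs without a key. With a real `openai.OpenAI` client, the same call goes to `client.embeddings.create`.

```python
from types import SimpleNamespace
from typing import List, Sequence


def embed_texts(
    client,
    texts: Sequence[str],
    model: str = "text-embedding-3-small",
) -> List[List[float]]:
    """Return one embedding vector per input text, in input order."""
    response = client.embeddings.create(model=model, input=list(texts))
    return [item.embedding for item in response.data]


class FakeEmbeddings:
    """Offline stand-in: mimics the response shape, not real embeddings."""

    def create(self, model: str, input: List[str]):
        data = [SimpleNamespace(embedding=[float(len(t)), 1.0]) for t in input]
        return SimpleNamespace(data=data)


class FakeClient:
    """Hypothetical test double exposing the same attribute path as the SDK client."""

    embeddings = FakeEmbeddings()


if __name__ == "__main__":
    print(embed_texts(FakeClient(), ["ab", "abcd"]))
```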
---

## Notes

- The OpenAI API key is required for both embeddings and image generation
- We do **not** log or save your key
- OpenAI bills API calls to your account according to your usage