---
title: Doc Alive - RAG to Image
emoji: 📦🎨
colorFrom: blue
colorTo: pink
sdk: gradio
sdk_version: "5.44.1"
app_file: app.py
pinned: false
---

# 📦→🧠→🎨 Doc Alive: RAG-to-Image with OpenAI

This project turns documents into **illustrations** using RAG (Retrieval-Augmented Generation), LLM prompt engineering, and OpenAI's image generation.

Upload a `.txt`, `.md`, or `.pdf` file, describe your goal, and the app will:
1. **Extract text** from your file  
2. **Retrieve key excerpts** using embeddings  
3. **Ask an LLM** to craft a structured image generation spec  
4. **Generate an illustration** with OpenAI's image model  
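
The retrieval step (step 2) can be sketched roughly as follows. This is a minimal illustration, not the code in `rag/`: the function names `chunk_text`, `cosine`, and `retrieve` are hypothetical, and `toy_embed` is a stand-in word-count embedder used only so the example runs offline. In the app, the embedding function would instead call OpenAI's `text-embedding-3-small` model.

```python
import math
from typing import Callable, List, Sequence, Tuple

Vector = Sequence[float]


def chunk_text(text: str, max_chars: int = 500) -> List[str]:
    """Group paragraphs into chunks of roughly max_chars characters.

    A single paragraph longer than max_chars is kept whole.
    """
    chunks: List[str] = []
    current = ""
    for para in (p.strip() for p in text.split("\n\n")):
        if not para:
            continue
        if current and len(current) + len(para) + 2 > max_chars:
            chunks.append(current)
            current = para
        else:
            current = f"{current}\n\n{para}" if current else para
    if current:
        chunks.append(current)
    return chunks


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(
    query: str,
    chunks: List[str],
    embed: Callable[[List[str]], List[Vector]],
    k: int = 3,
) -> List[Tuple[float, str]]:
    """Return the k chunks most similar to the query as (score, chunk) pairs."""
    if not chunks:
        return []
    # Embed chunks and query in one call; the query vector is the last one.
    vectors = embed(chunks + [query])
    query_vec = vectors[-1]
    scored = [(cosine(vec, query_vec), chunk) for vec, chunk in zip(vectors[:-1], chunks)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


def toy_embed(texts: List[str]) -> List[Vector]:
    """Stand-in embedder (assumption): counts a few fixed words per text."""
    vocab = ["cat", "dog", "rocket"]
    return [[float(t.lower().count(w)) for w in vocab] for t in texts]


if __name__ == "__main__":
    doc = "The cat sat.\n\nA dog barked at the dog.\n\nThe rocket launched."
    for score, chunk in retrieve("dog", chunk_text(doc, max_chars=30), toy_embed, k=1):
        print(f"{score:.2f}  {chunk}")
```

Passing the embedder in as a parameter keeps the ranking logic independent of the API, so it can be exercised without a key.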

---

## 🚀 Demo

This app runs on **Hugging Face Spaces** using **Gradio**.  

---

## 🔑 API Key

You must provide your own **OpenAI API key** to use this demo.  
- Enter your key in the input box (starts with `sk-...`)  
- The key is **not stored**; it is held in memory only for your current session  
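
A minimal sketch of how a key entered in the input box might be checked before use, assuming only the `sk-` prefix mentioned above. The helpers `normalize_api_key` and `mask_key` are hypothetical and are not taken from `app.py`. The key is only passed around as a function argument and never written to disk or to a log.

```python
def normalize_api_key(raw: str) -> str:
    """Trim whitespace and check the 'sk-' prefix; raise ValueError otherwise."""
    key = (raw or "").strip()
    if not key.startswith("sk-"):
        raise ValueError("Expected an OpenAI API key starting with 'sk-'.")
    return key


def mask_key(key: str) -> str:
    """Return a redacted form that is safe to show in a UI message."""
    return f"{key[:3]}...{key[-4:]}" if len(key) > 7 else "***"


if __name__ == "__main__":
    key = normalize_api_key("  sk-abcdefgh1234  ")
    print(mask_key(key))
```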

---

## 📂 Project Structure


├─ app.py # Gradio UI entry  
├─ requirements.txt # Dependencies  
├─ rag/ # Text extraction + retrieval  
├─ llm/ # Structured LLM call helper  
├─ generation/ # Image generation helper  


---

## 🛠 Tech Stack

- [Gradio](https://www.gradio.app/) – UI framework  
- [OpenAI](https://platform.openai.com/) – LLM + image generation  
- [RAG (text-embedding-3-small)](https://platform.openai.com/docs/guides/embeddings) – semantic retrieval  

---

## ⚠️ Notes

- The OpenAI API key is required for both embeddings and image generation  
- We do **not** log or save your key  
- OpenAI bills all API calls made with your key to your account, so using this app incurs charges