	---
	title: Doc Alive - RAG to Image
	emoji: 📦🎨
	colorFrom: blue
	colorTo: pink
	sdk: gradio
	sdk_version: "5.44.1"
	app_file: app.py
	pinned: false
	---

	# 📦→🧠→🎨 Doc Alive: RAG-to-Image with OpenAI

	This project turns documents into illustrations with the help of RAG (Retrieval-Augmented Generation), LLM prompt engineering, and OpenAI’s image generation.

	Upload a `.txt`, `.md`, or `.pdf` file, describe your goal, and the app will:
	1. Extract text from your file
	2. Retrieve key excerpts using embeddings
	3. Ask an LLM to craft a structured image generation spec
	4. Generate an illustration with OpenAI’s image model
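
Steps 2 and 3 can be sketched roughly as follows. This is a minimal illustration, not the code that lives in `rag/` or `llm/`: the function names (`chunk_text`, `top_k_chunks`, `build_spec_prompt`), the chunk sizes, the prompt wording, and the JSON keys are all assumptions. Embedding vectors are passed in as plain lists so the snippet runs without an API call.

```python
import math


def chunk_text(text: str, size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into overlapping character windows (sizes are illustrative)."""
    if size <= overlap:
        raise ValueError("size must be larger than overlap")
    step = size - overlap
    windows = (text[i:i + size] for i in range(0, len(text), step))
    return [w for w in windows if w.strip()]


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k_chunks(query_vec, chunk_vecs, chunks, k: int = 3) -> list[str]:
    """Return the k chunks whose embeddings are most similar to the query."""
    scored = sorted(
        zip(chunk_vecs, chunks),
        key=lambda pair: cosine(query_vec, pair[0]),
        reverse=True,
    )
    return [chunk for _, chunk in scored[:k]]


def build_spec_prompt(goal: str, excerpts: list[str]) -> str:
    """Assemble the text sent to the LLM (wording and keys are placeholders)."""
    numbered = "\n".join(f"[{i}] {e.strip()}" for i, e in enumerate(excerpts, start=1))
    return (
        f"Goal: {goal}\n\n"
        f"Excerpts:\n{numbered}\n\n"
        "Describe one illustration as JSON with keys: subject, style, composition."
    )
```

In a real run, `query_vec` would be the embedding of the user's goal and `chunk_vecs` the embeddings of the document chunks; the retrieved excerpts then feed the prompt for step 3.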

	---

	## 🚀 Demo

	This app runs on Hugging Face Spaces using Gradio.

	---

	## 🔑 API Key

	You must provide your own OpenAI API key to use this demo.
	- Enter your key (it starts with `sk-...`) in the input box
	- The key is not stored; it is held in memory only for your current session
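
One way to keep the key in memory only is to build a fresh client for each request and never write the key anywhere. The sketch below assumes the `openai` Python package (version 1 or later); `looks_like_openai_key` and `make_client` are hypothetical helper names, not functions from this repository.

```python
def looks_like_openai_key(key: str) -> bool:
    """Cheap format check only; it cannot tell whether the key is valid."""
    key = key.strip()
    return key.startswith("sk-") and len(key) > len("sk-")


def make_client(api_key: str):
    """Build a client for one request; the key lives only in this object."""
    if not looks_like_openai_key(api_key):
        raise ValueError("Expected an OpenAI API key starting with 'sk-'")
    # Imported lazily so the format check works without the package installed.
    from openai import OpenAI
    return OpenAI(api_key=api_key.strip())
```

Because nothing here touches disk, environment variables, or logs, the key disappears when the client object is garbage-collected.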

	---

	## 📂 Project Structure


	├─ app.py            # Gradio UI entry
	├─ requirements.txt  # Dependencies
	├─ rag/              # Text extraction + retrieval
	├─ llm/              # Structured LLM call helper
	└─ generation/       # Image generation helper


	---

	## 🛠 Tech Stack

	- [Gradio](https://www.gradio.app/) – UI framework
	- [OpenAI](https://platform.openai.com/) – LLM + image generation
	- [RAG (text-embedding-3-small)](https://platform.openai.com/docs/guides/embeddings) – semantic retrieval
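
As a rough sketch of how the OpenAI pieces might be called, assuming the `openai` Python package (version 1 or later): `embed_texts` and `generate_image` are hypothetical wrappers, not the helpers in `rag/` or `generation/`. The README does not name the image model, so it is a required argument here, and the default size is only an example.

```python
def embed_texts(client, texts: list[str], model: str = "text-embedding-3-small") -> list[list[float]]:
    """Return one embedding vector per input text, in input order."""
    if not texts:
        return []  # skip the API call for empty input
    resp = client.embeddings.create(model=model, input=texts)
    # Each item carries the index of the input it belongs to.
    ordered = sorted(resp.data, key=lambda item: item.index)
    return [item.embedding for item in ordered]


def generate_image(client, prompt: str, model: str, size: str = "1024x1024"):
    """Request one image and return the first result item.

    Depending on the model, the item exposes either `url` or `b64_json`.
    """
    resp = client.images.generate(model=model, prompt=prompt, size=size)
    return resp.data[0]
```

Passing the client in as an argument keeps both helpers free of any stored credentials and makes them easy to exercise with a stub client.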

	---

	## ⚠️ Notes

	- The OpenAI API key is required for every step that calls OpenAI: embeddings, the LLM call, and image generation
	- We do not log or save your key
	- OpenAI bills the API calls made with your key to your account, so using this demo may incur charges