Anshu13 committed on
Commit fab575d · verified · 1 Parent(s): f66cb61

Update README.md

Files changed (1)
  1. README.md +10 -172
README.md CHANGED
@@ -1,173 +1,11 @@
- # 🧠 Prompt Engine
-
- A powerful AI-based system that converts **text, image, and audio inputs** into **high-quality, structured prompts** for generative AI models like Stable Diffusion, Midjourney, and DALL·E.
-
  ---
-
- ## 🚀 Features
-
- * ✍️ **Text → Prompt**
- Refines and extends simple prompts into detailed, high-quality prompts.
-
- * 🖼️ **Image + Text → Prompt**
- Understands an image and user intent to generate a descriptive prompt.
-
- * 🎧 **Audio → Prompt**
- Converts speech into text and then generates a refined prompt.
-
- * 🧠 **Multimodal AI (Janus-Pro-1B)**
- Uses a vision-language model for intelligent prompt generation.
-
- * 🎨 **Gradio UI**
- Interactive web interface for easy usage.
-
- ---
-
- ## 🧩 Architecture
-
- ```
- Input (Text / Image / Audio)
-
- Preprocessing Layer
- (Whisper for audio)
-
- Instruction Builder (Prompt Engineering)
-
- Janus-Pro-1B Model
-
- Post-processing (clean output)
-
- Final AI Prompt
- ```
-
- ---
-
- ## 🛠️ Tech Stack
-
- * **Python**
- * **HuggingFace Transformers**
- * **DeepSeek Janus-Pro-1B**
- * **OpenAI Whisper (Speech-to-Text)**
- * **Gradio (UI)**
- * **PyTorch**
-
- ---
-
- ## 📦 Installation
-
- ### 1. Clone the repository
-
- ```bash
- git clone https://github.com/your-username/prompt-generator.git
- cd prompt-generator
- ```
-
- ---
-
- ### 2. Install dependencies
-
- ```bash
- pip install -r requirements.txt
- ```
-
- ---
-
- ### 3. Run the application
-
- ```bash
- python app.py
- ```
-
- ---
-
- ## 🧪 Usage
-
- 1. Open the Gradio UI in your browser
- 2. Select input type:
-
- * Text
- * Image + Text
- * Audio
- 3. Provide input
- 4. Click **Generate Prompt 🚀**
- 5. Get your refined AI prompt
-
- ---
-
- ## 🧠 Example
-
- ### Input:
-
- ```
- boy in forest
- ```
-
- ### Output:
-
- ```
- A cinematic scene of a young boy standing in a dense forest, soft sunlight filtering through tall trees, atmospheric fog, ultra-detailed, 4k, depth of field, masterpiece
- ```
-
- ---
-
- ## 📁 Project Structure
-
- ```
- project/
-
- ├── app.py
- ├── requirements.txt
- └── README.md
- ```
-
- ---
-
- ## ⚙️ Core Functions
-
- * `text_to_prompt()`
- * `image_text_to_prompt()`
- * `audio_to_prompt()`
- * `generate_universal_prompt()`
-
- ---
-
- ## ⚠️ Limitations
-
- * Requires GPU for best performance
- * Video input not supported (yet)
- * Output quality depends on prompt instruction
-
- ---
-
- ## 🔮 Future Improvements
-
- * 🎥 Video input support
- * 🎨 Style selection (anime, cinematic, realistic)
- * 📊 Prompt scoring system
- * ☁️ Deployment on HuggingFace Spaces
-
- ---
-
- ## 🤝 Contributing
-
- Pull requests are welcome!
- For major changes, please open an issue first.
-
- ---
-
- ## 📜 License
-
- This project is open-source under the MIT License.
-
- ---
-
- ## 👨‍💻 Author
-
- **Anshu Singh**
-
- ---
-
- ## ⭐ If you like this project
-
- Give it a ⭐ on GitHub!
-
  ---
+ title: Multimodal AI Prompt Generator
+ emoji: 🧠
+ colorFrom: purple
+ colorTo: blue
+ sdk: gradio
+ sdk_version: "4.44.0"
+ python_version: "3.10"
+ app_file: app.py
+ pinned: false
+ ---