ryanshelley committed
Commit e5597bf · verified · 1 Parent(s): 08e8d11

Create app.py

Files changed (1)
  1. app.py +469 -0
app.py ADDED
@@ -0,0 +1,469 @@
+ import gradio as gr
+ import os
+ from typing import Dict, List, Tuple
+ from openai import OpenAI
+ import time
+
+ class GlossaryGenerator:
+     def __init__(self):
+         self.template = """
+ **Glossary Page Template**
+ Use this template to create individual glossary pages for specific terms. Fill in each section with relevant information.
+
+ **[TERM NAME]**
+
+ **1. Introduction / Brief Definition (AI Overview)**
+ * **Purpose:** Provide the absolute clearest, most concise, and direct answer to "What is [TERM NAME]?" This should be a short, no-fluff definition, similar to an AI overview or a quick dictionary entry.
+ * **Content:** Start immediately with the core definition. Get straight to the point.
+
+ **2. Detailed Explanation**
+ * **Purpose:** Expand on the brief definition, offering a comprehensive explanation of the term. This section should follow a "reverse pyramid" structure, meaning the most critical information is presented first, followed by supporting details.
+ * **Content:**
+ * **Elaborate on the core concept:** Build upon the initial definition, providing more depth and context.
+ * **Explore related questions (PAA / Query Fans):** Anticipate what users might ask next or what related topics they might search for. Integrate answers to "People Also Ask" (PAA) type questions or expand into "query fan" concepts that naturally branch off the main term.
+ * Provide context, background, or the purpose of the term.
+ * Include key characteristics, functions, or processes associated with the term.
+ * Use examples to illustrate the concept clearly.
+ * Break down complex ideas into simpler parts.
+
+ **3. Key Concepts / Components (Optional)**
+ * **Purpose:** If the term has distinct sub-sections, components, or related key ideas that warrant separate discussion, list and explain them here.
+ * **Content:**
+ * Use bullet points or sub-headings for each key concept.
+ * Briefly define and explain each component.
+
+ **4. Importance / Application (Optional)**
+ * **Purpose:** Explain why the term is significant, its impact, or how it is applied in real-world scenarios.
+ * **Content:** Discuss the relevance, benefits, challenges, or practical uses of the term.
+
+ **5. Related Terms / Concepts**
+ * **Purpose:** Link to other relevant terms within your glossary or related concepts that readers might find useful for further understanding.
+ * **Content:**
+ * List terms that are closely associated or often discussed alongside the current term.
+
+ **6. Sources / References**
+ * **Purpose:** Cite the sources from which the information was gathered. This adds credibility and allows readers to explore further.
+ * **Content:**
+ * List URLs, book titles, or other references.
+ """
+
+         # Initialize OpenAI client
+         self.client = None
+         self._setup_openai()
+
+     def _setup_openai(self):
+         """Initialize OpenAI client with API key"""
+         api_key = os.getenv("OPENAI_API_KEY")
+         if api_key:
+             try:
+                 self.client = OpenAI(api_key=api_key)
+                 # Test the connection
+                 self.client.models.list()
+                 print("✅ OpenAI client initialized successfully")
+             except Exception as e:
+                 print(f"❌ Error initializing OpenAI client: {e}")
+                 self.client = None
+         else:
+             print("⚠️ OPENAI_API_KEY not found in environment variables")
+             self.client = None
+
+     def _call_openai(self, prompt: str, max_tokens: int = 2000) -> str:
+         """Make a call to OpenAI GPT-4"""
+         if not self.client:
+             return "❌ OpenAI API key not configured. Please add your OPENAI_API_KEY to the environment variables in Hugging Face Spaces settings."
+
+         try:
+             response = self.client.chat.completions.create(
+                 model="gpt-4",
+                 messages=[
+                     {"role": "system", "content": "You are a professional content writer specializing in creating high-quality glossary entries. You follow templates precisely and create comprehensive, well-structured content."},
+                     {"role": "user", "content": prompt}
+                 ],
+                 max_tokens=max_tokens,
+                 temperature=0.7,
+                 top_p=1,
+                 frequency_penalty=0,
+                 presence_penalty=0
+             )
+             return response.choices[0].message.content.strip()
+
+         except Exception as e:
+             return f"❌ Error calling OpenAI API: {str(e)}"
+
+     def generate_new_content(self, term: str, context: str = "", target_audience: str = "general") -> str:
+         """Generate new glossary content for a given term"""
+
+         prompt = f"""
+ Create a comprehensive glossary entry for the term "{term}" following this EXACT template structure:
+
+ {self.template}
+
+ **Requirements:**
+ - Replace [TERM NAME] with "{term}"
+ - Target Audience: {target_audience}
+ - Additional Context: {context if context else "No additional context provided"}
+ - Fill in ALL sections with relevant, accurate information
+ - Use the "reverse pyramid" structure - most important info first
+ - Include relevant PAA (People Also Ask) questions in section 2
+ - Remove optional sections only if truly not applicable
+ - Maintain clear, concise language
+ - Provide at least 3 related terms in section 5
+ - Include credible sources/references in section 6
+
+ **Focus Areas:**
+ - Make the brief definition crystal clear and direct
+ - Expand thoroughly in the detailed explanation
+ - Include practical examples and use cases
+ - Address common questions people might have
+ - Ensure professional, authoritative tone
+
+ Generate the complete glossary entry now:
+ """
+
+         return self._call_openai(prompt, max_tokens=2500)
+
+     def update_existing_content(self, term: str, existing_content: str, update_instructions: str = "") -> Tuple[str, str]:
+         """Analyze existing content and provide update recommendations"""
+
+         # First, analyze the content
+         analysis_prompt = f"""
+ Analyze this existing glossary content for "{term}" against the template standard and provide specific improvement recommendations.
+
+ **EXISTING CONTENT:**
+ {existing_content}
+
+ **TEMPLATE STANDARD:**
+ {self.template}
+
+ **UPDATE INSTRUCTIONS:** {update_instructions if update_instructions else "General content improvement"}
+
+ **Provide a detailed analysis covering:**
+
+ 1. **STRUCTURAL ANALYSIS:**
+ - Does it follow the template structure?
+ - Which sections are missing or incomplete?
+ - Is the reverse pyramid structure implemented?
+
+ 2. **CONTENT QUALITY ASSESSMENT:**
+ - Clarity and conciseness of the brief definition
+ - Depth and comprehensiveness of detailed explanation
+ - Relevance and usefulness of examples
+ - Quality of related terms and references
+
+ 3. **SPECIFIC RECOMMENDATIONS (prioritized):**
+ - HIGH PRIORITY: Critical improvements needed
+ - MEDIUM PRIORITY: Important enhancements
+ - LOW PRIORITY: Nice-to-have improvements
+
+ 4. **SEO & USER EXPERIENCE:**
+ - Missing PAA questions to address
+ - Keyword opportunities
+ - Cross-linking possibilities
+ - Readability improvements
+
+ 5. **SOURCES & CREDIBILITY:**
+ - Quality of current references
+ - Missing authoritative sources
+ - Fact-checking requirements
+
+ Format as a professional content analysis report.
+ """
+
+         recommendations = self._call_openai(analysis_prompt, max_tokens=1500)
+
+         # Then generate updated content
+         update_prompt = f"""
+ Create an improved version of the glossary entry for "{term}" based on the analysis and recommendations.
+
+ **ORIGINAL CONTENT:**
+ {existing_content}
+
+ **ANALYSIS & RECOMMENDATIONS:**
+ {recommendations}
+
+ **TEMPLATE TO FOLLOW:**
+ {self.template}
+
+ **UPDATE INSTRUCTIONS:** {update_instructions if update_instructions else "Apply the key recommendations from the analysis"}
+
+ **Create the improved glossary entry that:**
+ 1. Follows the template structure exactly
+ 2. Implements the high and medium priority recommendations
+ 3. Maintains the best elements from the original
+ 4. Adds missing sections or information
+ 5. Improves clarity, structure, and usefulness
+ 6. Includes better examples and explanations
+ 7. Enhances SEO and user experience
+
+ Generate the complete, improved glossary entry:
+ """
+
+         updated_content = self._call_openai(update_prompt, max_tokens=2500)
+
+         return recommendations, updated_content
+
+     def create_outline_brief(self, topic: str, scope: str = "comprehensive") -> str:
+         """Create an outline or brief for new glossary content"""
+
+         prompt = f"""
+ Create a comprehensive content brief for developing a glossary focused on "{topic}".
+
+ **Scope:** {scope}
+ **Template Standard:** Follow the 6-section template structure provided
+
+ **Create a detailed brief covering:**
+
+ **1. TOPIC OVERVIEW & STRATEGY**
+ - Comprehensive topic definition and boundaries
+ - Target audience analysis and segmentation
+ - Content complexity and depth recommendations
+ - Competitive landscape and differentiation opportunities
+
+ **2. TERM IDENTIFICATION & PRIORITIZATION**
+ - **Primary Terms (10-15 key terms):** Most important, high-search volume terms
+ - **Secondary Terms (8-12 supporting terms):** Important supporting concepts
+ - **Long-tail Terms (5-10 specific terms):** Niche but valuable terms
+ - **Priority Matrix:** High/Medium/Low priority for each term with reasoning
+
+ **3. CONTENT ARCHITECTURE**
+ - Template section recommendations for each term type
+ - Suggested content depth and length for each priority level
+ - Cross-linking strategy between terms
+ - Information hierarchy and user journey mapping
+
+ **4. RESEARCH & DEVELOPMENT PLAN**
+ - **Primary Sources:** Authoritative websites, publications, studies
+ - **Expert Sources:** Industry leaders, academic researchers, practitioners
+ - **User Research:** Common questions, search patterns, knowledge gaps
+ - **Competitive Analysis:** What others are doing well/poorly
+
+ **5. SEO & DISCOVERABILITY STRATEGY**
+ - **Primary Keywords:** Main search terms for each priority level
+ - **Long-tail Keywords:** Specific phrases users search for
+ - **PAA Questions:** "People Also Ask" questions to address
+ - **Content Gap Analysis:** Opportunities competitors are missing
+ - **Internal Linking Strategy:** How terms connect to each other
+
+ **6. PRODUCTION ROADMAP**
+ - **Phase 1:** High-priority terms (timeline and resource allocation)
+ - **Phase 2:** Secondary terms and enhancements
+ - **Phase 3:** Long-tail terms and optimization
+ - **Resource Requirements:** Estimated hours per term type
+ - **Quality Assurance:** Review process and standards
+ - **Maintenance Plan:** Update frequency and monitoring
+
+ **7. SUCCESS METRICS & KPIs**
+ - Content quality indicators
+ - User engagement metrics
+ - SEO performance targets
+ - Conversion and utility measurements
+
+ Create a comprehensive, actionable brief that will guide the entire glossary development process.
+ """
+
+         return self._call_openai(prompt, max_tokens=3000)
+
+ def create_gradio_interface():
+     """Create the Gradio interface for the glossary generator"""
+
+     generator = GlossaryGenerator()
+
+     def generate_new_wrapper(term, context, audience):
+         if not term.strip():
+             return "Please enter a term to generate content for."
+         return generator.generate_new_content(term, context, audience)
+
+     def update_existing_wrapper(term, existing_content, update_instructions):
+         if not term.strip() or not existing_content.strip():
+             return "Please provide both term and existing content.", ""
+         recommendations, updated_content = generator.update_existing_content(term, existing_content, update_instructions)
+         return recommendations, updated_content
+
+     def create_outline_wrapper(topic, scope):
+         if not topic.strip():
+             return "Please enter a topic for the outline."
+         return generator.create_outline_brief(topic, scope)
+
+     # Create the Gradio interface
+     with gr.Blocks(title="Glossary Content Generator", theme=gr.themes.Soft()) as demo:
+         gr.Markdown("""
+ # 📚 Glossary Content Generator
+
+ **Powered by OpenAI GPT-4** - Professional glossary content creation and optimization tool.
+
+ > 🔑 **Setup Required:** Add your `OPENAI_API_KEY` in the Hugging Face Spaces settings under "Repository secrets"
+ """)
+
+         # Add API key status indicator
+         api_status = "✅ OpenAI Connected" if generator.client else "❌ OpenAI API Key Required"
+         gr.Markdown(f"**Status:** {api_status}")
+
+         with gr.Tabs():
+             # Tab 1: Generate New Content
+             with gr.TabItem("🆕 Generate New Content"):
+                 gr.Markdown("### Create a new glossary entry from scratch using GPT-4")
+
+                 with gr.Row():
+                     with gr.Column(scale=1):
+                         new_term = gr.Textbox(
+                             label="Term to Define",
+                             placeholder="e.g., Machine Learning, CPQ, SEO, API",
+                             lines=1
+                         )
+                         new_context = gr.Textbox(
+                             label="Additional Context (Optional)",
+                             placeholder="Provide industry context, specific use cases, or background information",
+                             lines=3
+                         )
+                         new_audience = gr.Dropdown(
+                             label="Target Audience",
+                             choices=["general", "technical", "business", "beginner", "expert"],
+                             value="general"
+                         )
+                         generate_btn = gr.Button("🚀 Generate Content", variant="primary", size="lg")
+
+                     with gr.Column(scale=2):
+                         new_output = gr.Textbox(
+                             label="Generated Glossary Entry",
+                             lines=25,
+                             max_lines=30,
+                             show_copy_button=True
+                         )
+
+                 generate_btn.click(
+                     generate_new_wrapper,
+                     inputs=[new_term, new_context, new_audience],
+                     outputs=[new_output]
+                 )
+
+                 # Add examples
+                 gr.Markdown("**💡 Example Terms:** API, Machine Learning, Blockchain, SaaS, Customer Journey, A/B Testing")
+
+             # Tab 2: Update Existing Content
+             with gr.TabItem("🔄 Update Existing Content"):
+                 gr.Markdown("### Analyze and improve existing glossary entries with AI-powered recommendations")
+
+                 with gr.Row():
+                     with gr.Column(scale=1):
+                         update_term = gr.Textbox(
+                             label="Term Name",
+                             placeholder="Name of the term being updated",
+                             lines=1
+                         )
+                         existing_content = gr.Textbox(
+                             label="Existing Content",
+                             placeholder="Paste your current glossary entry here",
+                             lines=10
+                         )
+                         update_instructions = gr.Textbox(
+                             label="Update Instructions (Optional)",
+                             placeholder="e.g., 'Add more technical details', 'Include recent developments', 'Improve SEO focus'",
+                             lines=3
+                         )
+                         update_btn = gr.Button("🔍 Analyze & Update", variant="primary", size="lg")
+
+                     with gr.Column(scale=2):
+                         with gr.Row():
+                             recommendations_output = gr.Textbox(
+                                 label="📊 Analysis & Recommendations",
+                                 lines=12,
+                                 max_lines=15,
+                                 show_copy_button=True
+                             )
+                         with gr.Row():
+                             updated_content_output = gr.Textbox(
+                                 label="✨ Updated Content",
+                                 lines=12,
+                                 max_lines=15,
+                                 show_copy_button=True
+                             )
+
+                 update_btn.click(
+                     update_existing_wrapper,
+                     inputs=[update_term, existing_content, update_instructions],
+                     outputs=[recommendations_output, updated_content_output]
+                 )
+
+             # Tab 3: Create Outline/Brief
+             with gr.TabItem("📋 Create Content Brief"):
+                 gr.Markdown("### Generate a comprehensive strategy brief for glossary development")
+
+                 with gr.Row():
+                     with gr.Column(scale=1):
+                         outline_topic = gr.Textbox(
+                             label="Topic/Subject Area",
+                             placeholder="e.g., Digital Marketing, Cloud Computing, Artificial Intelligence, E-commerce",
+                             lines=1
+                         )
+                         outline_scope = gr.Dropdown(
+                             label="Scope & Depth",
+                             choices=["comprehensive", "focused", "basic", "advanced", "specialized"],
+                             value="comprehensive"
+                         )
+                         outline_btn = gr.Button("📋 Create Strategic Brief", variant="primary", size="lg")
+
+                     with gr.Column(scale=2):
+                         outline_output = gr.Textbox(
+                             label="📈 Content Strategy Brief",
+                             lines=25,
+                             max_lines=30,
+                             show_copy_button=True
+                         )
+
+                 outline_btn.click(
+                     create_outline_wrapper,
+                     inputs=[outline_topic, outline_scope],
+                     outputs=[outline_output]
+                 )
+
+                 gr.Markdown("**💡 Example Topics:** Digital Marketing, FinTech, SaaS Operations, Data Science, Cybersecurity")
+
+             # Tab 4: Template Reference
+             with gr.TabItem("📄 Template Reference"):
+                 gr.Markdown("### Official Glossary Template Structure")
+                 template_display = gr.Textbox(
+                     label="Template Guidelines",
+                     value=generator.template,
+                     lines=35,
+                     max_lines=40,
+                     interactive=False,
+                     show_copy_button=True
+                 )
+
+         gr.Markdown("""
+ ---
+ ## 🔧 Setup Instructions for Hugging Face Spaces:
+
+ 1. **Add OpenAI API Key:**
+ - Go to your Space settings
+ - Navigate to "Repository secrets"
+ - Add: `OPENAI_API_KEY` = `your-openai-api-key-here`
+ - Restart the Space
+
+ 2. **Get OpenAI API Key:**
+ - Visit [platform.openai.com](https://platform.openai.com)
+ - Create account and navigate to API keys
+ - Generate new secret key
+ - Add billing information (GPT-4 requires paid account)
+
+ ## ✨ Features:
+ - 🤖 **GPT-4 Powered**: High-quality, professional content generation
+ - 📝 **Template Consistency**: Follows your exact 6-section structure
+ - 🔍 **Content Analysis**: Detailed improvement recommendations
+ - 📊 **Strategic Planning**: Comprehensive content briefs and roadmaps
+ - 🎯 **SEO Optimized**: Includes PAA questions and keyword strategies
+ - 📋 **Copy-Friendly**: Easy copy buttons for all outputs
+
+ **Cost Estimate**: ~$0.02-0.10 per generation (depending on content length)
+ """)
+
+     return demo
+
+ # Launch the application
+ if __name__ == "__main__":
+     app = create_gradio_interface()
+     app.launch(
+         share=False,
+         server_name="0.0.0.0",
+         server_port=7860
+     )
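
The `_call_openai` method in this commit returns a message string in all three cases: no client configured, a successful completion, and an API exception. A minimal sketch of how that control flow could be exercised without network access follows. It is not part of the commit: `StubClient`, `FailingClient`, and `call_model` are hypothetical names introduced here for illustration, the stub only imitates the attribute shape `client.chat.completions.create(...)` used in `app.py`, and the error strings are simplified stand-ins for the ones in the file.

```python
from types import SimpleNamespace


class StubClient:
    """Hypothetical stand-in that mimics the shape client.chat.completions.create."""

    def __init__(self, reply):
        self._reply = reply
        self.calls = []  # records the keyword arguments of every call
        self.chat = SimpleNamespace(
            completions=SimpleNamespace(create=self._create)
        )

    def _create(self, **kwargs):
        self.calls.append(kwargs)
        message = SimpleNamespace(content=self._reply)
        return SimpleNamespace(choices=[SimpleNamespace(message=message)])


class FailingClient(StubClient):
    """Hypothetical stub whose call always raises, to reach the except branch."""

    def _create(self, **kwargs):
        raise RuntimeError("simulated outage")


def call_model(client, prompt, max_tokens=2000):
    """Simplified mirror of the commit's _call_openai, with the client passed in."""
    if not client:
        return "OpenAI API key not configured."
    try:
        response = client.chat.completions.create(
            model="gpt-4",
            messages=[{"role": "user", "content": prompt}],
            max_tokens=max_tokens,
        )
        return response.choices[0].message.content.strip()
    except Exception as e:
        return f"Error calling OpenAI API: {e}"


stub = StubClient("  A glossary entry.  ")
result = call_model(stub, "Define API")               # success path, reply is stripped
missing = call_model(None, "Define API")              # no client configured
failed = call_model(FailingClient(""), "Define API")  # exception path
```

Passing the client as a parameter, rather than reading it from `self.client`, is a design choice made only for this sketch so that each branch can be checked in isolation.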