import gradio as gr
import google.generativeai as genai
import textstat
import re
import os
from datetime import datetime
import tempfile

class PRArticleGenerator:
    def __init__(self):
        self.pr_prompt = """
🔧 SYSTEM ROLE:
You are an elite PR copywriter and strategic storyteller, trusted by Forbes-level publishers, corporate PR firms, and influential entrepreneurs. Your role is to transform raw business content, personal bios, investment stories, or news leads into a **professionally polished, elite-grade PR article** optimized for SEO, brand perception, and high-impact online publication.

🎯 OBJECTIVE:
- Transform the user's input into a **trustworthy, media-ready PR article with strategic depth and journalistic integrity**.
- Use a **neutral-positive, authoritative, and sophisticated tone**—like a feature in Bloomberg, Forbes, or Business Insider.
- Follow a **Rank Math-optimized structure** to improve search visibility and engagement, ensuring narrative flow and impactful messaging.

🧠 CONTENT STRATEGY:
Mimic top-tier PR publications using this structure:
1.  **Headline (`<h1>`)**: Strong, keyword-rich, highly engaging, and precisely reflects the core news, aiming for virality without sensationalism.
2.  **Intro Paragraph**: News-style lead. Who, what, when, where, why — concise, compelling, and immediately establishes the article's significance.
3.  **Market/Impact Context (`<h2>`)**: Provide a robust analysis of the market landscape, competitive positioning, and the long-term implications of this news. Connect the development to broader industry trends or societal shifts, illustrating its strategic importance and why it truly matters.
4.  **About the Subject (`<h2>`)**: Craft a compelling narrative around the individual or company, emphasizing their unique vision, core values, and the distinctive philosophy driving their success. Go beyond mere credentials to illustrate their leadership impact and market influence.
5.  **Portfolio Highlights or Track Record (`<h2>`)**: A concise bullet-point or paragraph list of relevant, high-impact deals, significant actions, or verifiable achievements that underscore credibility and success. Quantify where possible.
6.  **Vision, Quote, or Strategy Statement (`<h2>`)**: Generate a visionary, impactful quote that not only articulates future plans but also conveys the underlying philosophy, strategic foresight, and commitment to long-term value creation. The quote should resonate with investors and industry leaders.
7.  **Key Differentiators/Unique Value Proposition (`<h2>` - Optional but Recommended if applicable):** Briefly articulate what truly sets the subject or company apart – be it their innovative technology, groundbreaking business model, distinctive cultural approach, or specialized market niche.
8.  **Closing Paragraph (`<h2>`)**: Conclude with a powerful summary that underscores the enduring significance of the announcement, reinforcing the company's trajectory, its commitment to innovation, and its anticipated long-term impact on the industry or broader ecosystem. This should be a memorable final statement.

✅ SEO RULES (Rank Math-Ready & Elite Optimization):
- Target **one primary keyword** (from the input or suggest one, ensuring it's highly relevant).
- Insert the keyword strategically and naturally in:
  - `<h1>` title
  - First 100 words (prominently)
  - At least one `<h2>` subheading
  - Last paragraph
  - Aim for a natural keyword density (approx. 0.5-1.5%).
- Use semantic synonyms and Latent Semantic Indexing (LSI) terms naturally throughout the text.
- Apply proper HTML tags (`<h2>`, `<p>`, `<ul>`, `<strong>`) for optimal WordPress compatibility and readability.
- Ensure 600+ words for standard articles (unless the input content is extremely brief), prioritizing depth over brevity.
- NEVER keyword-stuff. Prioritize clarity, professionalism, semantic variation, and reader value over keyword density.

🌍 TARGET AUDIENCE:
- Sophisticated business readers, institutional investors, top-tier journalists, industry analysts, and strategic stakeholders.
- Tone: Strategic, editorial, refined, and highly credible — absolutely NOT hypey, salesy, or overly promotional. Maintain a detached, objective yet positive journalistic voice.

🎨 FORMATTING:
- Structure article in clean **WordPress-compatible HTML**.
- Use `<h2>` subheadings for each primary section as outlined above.
- Employ bullet lists (`<ul>` or `<ol>`) for portfolio highlights or key achievements for easy scannability.
- Insert **1 to 2 vision-driven, high-quality quotes** (real or fictional if not provided, ensuring they sound authentic to a CEO/leader).
- Ensure readability for a broad, intelligent audience.
- Ready to paste into any CMS or press distribution tool.
- Return ONLY the HTML content without any markdown code blocks or backticks.

📥 INPUT FORMAT:
User may provide raw facts, a detailed bio, a business update, or a rough news draft. The AI should infer the central narrative and elaborate it into a full PR article.

📤 OUTPUT FORMAT:
- Return a fully structured PR article with HTML formatting.
- At the END of the article, include a separate section with SEO metadata formatted as an HTML list:
  <h3>SEO Metadata</h3>
  <ul>
  <li><strong>Meta Title:</strong> [Your suggested title here]</li>
  <li><strong>Meta Description:</strong> [Your suggested description here]</li>
  <li><strong>Target Keyword:</strong> [Your suggested keyword here]</li>
  </ul>
- Do NOT wrap the output in ```html or ``` markdown blocks.

Please transform the following content into a professional PR-style article:
"""
        self.target_keyword = None # To store the extracted keyword for scoring

    def clean_html_output(self, text):
        """Remove markdown code blocks and clean HTML output"""
        # Remove markdown code blocks (```html, ```HTML, ```, etc.)
        cleaned = re.sub(r'```[a-zA-Z]*\n?', '', text)
        cleaned = re.sub(r'\n?```', '', cleaned)
        
        # Remove any remaining backticks at start/end
        cleaned = cleaned.strip('`')
        
        # Clean up extra whitespace
        cleaned = re.sub(r'\n{3,}', '\n\n', cleaned)
        cleaned = cleaned.strip()
        
        return cleaned

    def setup_gemini(self, api_key):
        """Configure Gemini API"""
        try:
            genai.configure(api_key=api_key)
            model = genai.GenerativeModel('gemini-1.5-pro')
            return model, None
        except Exception as e:
            return None, f"API Configuration Error: {str(e)}"

    def calculate_readability_score(self, text):
        """Calculate readability score using textstat and custom PR metrics"""
        try:
            # Remove HTML tags for analysis
            clean_text = re.sub(r'<[^>]+>', '', text)
            
            # Basic readability metrics
            flesch_score = textstat.flesch_reading_ease(clean_text)
            avg_sentence_length = textstat.avg_sentence_length(clean_text)
            syllable_count = textstat.avg_syllables_per_word(clean_text)
            
            # Custom PR writing metrics
            pr_score = self.calculate_pr_score(text, clean_text)
            
            # Map Flesch Reading Ease onto a 0-10 scale: 30 or below -> 0, 70 or above -> 10
            # (60-70 is generally a good range for PR copy)
            readability_base = max(0, min(10, (flesch_score - 30) / 4))
            
            # Combined score (60% readability, 40% PR best practices)
            final_score = (readability_base * 0.6) + (pr_score * 0.4)
            
            return round(final_score, 1), self.get_improvement_suggestions(flesch_score, avg_sentence_length, syllable_count, text, clean_text)
            
        except Exception as e:
            return 5.0, f"Error calculating readability: {str(e)}"

    def calculate_pr_score(self, html_text, clean_text):
        """Calculate score based on PR writing best practices for Elite PR"""
        score = 0
        max_score = 10 # Upper bound for the PR score; the category maxima below sum to 9.5
        
        # 1. HTML Structure & Formatting (Max 3 points)
        if re.search(r'<h1\b[^>]*>', html_text, re.IGNORECASE): # H1 presence (attributes allowed)
            score += 0.5
        h2_count = len(re.findall(r'<h2\b[^>]*>', html_text, re.IGNORECASE))
        if h2_count >= 6: # All six H2 sections from the prompt (the SEO Metadata block uses <h3>)
            score += 1.5
        elif h2_count >= 4:
            score += 1
        if re.search(r'<ul>|<ol>', html_text, re.IGNORECASE): # Lists for scannability
            score += 0.5
        if re.search(r'<strong>|<b>', html_text, re.IGNORECASE): # Strong text for emphasis
            score += 0.5

        # 2. Content Depth & Word Count (Max 2 points)
        word_count = len(clean_text.split())
        if word_count >= 800: # Elite articles often longer for depth
            score += 2
        elif word_count >= 600:
            score += 1.5
        elif word_count >= 400:
            score += 1
            
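        # 3. Credibility & Authority (Max 2 points)
        # Check for quotes (professional credibility) - aim for 2+ distinct quotes
        # Count straight and curly double quotes, since the model may emit either
        quote_marks = sum(clean_text.count(q) for q in ('"', '\u201c', '\u201d'))
        if quote_marks >= 4: # At least two full quotes (4 quote marks)
            score += 1
        elif quote_marks >= 2:
            score += 0.5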
            
        # Check for numbers/statistics/financial data (authority building)
        # Looks for percentages, dollar amounts, and large number terms
        if re.search(r'\d+%|\$\s*\d+(?:,\d{3})*(?:\.\d{2})?|[\d,]+\s*(?:million|billion|thousand|trillion)\b', clean_text, re.IGNORECASE):
            score += 1

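        # 4. Keyword Integration (Max 1.5 points)
        if self.target_keyword:
            keyword_lower = self.target_keyword.lower()
            # Exclude the trailing "SEO Metadata" block: it always repeats the keyword,
            # so the "last 200 words" check would otherwise pass unconditionally
            body_text = re.split(r'SEO Metadata', clean_text, flags=re.IGNORECASE)[0]
            body_words = body_text.lower().split()
            # Check H1
            if re.search(r'<h1\b[^>]*>.*?' + re.escape(keyword_lower) + r'.*?</h1>', html_text, re.IGNORECASE):
                score += 0.5
            # Check first 200 words (words, not characters)
            if keyword_lower in ' '.join(body_words[:200]):
                score += 0.5
            # Check last 200 words (words, not characters)
            if keyword_lower in ' '.join(body_words[-200:]):
                score += 0.5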

        # 5. Tone & Professionalism (Max 1 point)
        # Check for absence of hype words (negative scoring for presence)
        hype_words = ['unbelievable', 'game-changing', 'revolutionary', 'groundbreaking', 'amazing', 'stunning', 'incredible', 'breakthrough']
        if not any(word in clean_text.lower() for word in hype_words):
            score += 0.5 # Reward for mature tone
        
        # Check for presence of strategic/journalistic language
        strategic_indicators = ['strategic', 'implications', 'visionary', 'innovation', 'leadership', 'market dynamics', 'ecosystem', 'trajectory', 'future-forward', 'robust analysis', 'pivotal moment', 'transformative', 'significant investment']
        if sum(1 for indicator in strategic_indicators if indicator in clean_text.lower()) >= 5: # At least 5 strategic terms
            score += 0.5
            
        return min(max_score, score)

    def get_improvement_suggestions(self, flesch_score, avg_sentence_length, syllable_count, html_text, clean_text):
        """Generate specific improvement suggestions for Elite PR"""
        suggestions = []
        
        # Readability suggestions
        if flesch_score < 60: # Aim for 60-70 for good readability
            suggestions.append("• Improve Flesch Reading Ease: Simplify sentence structures, reduce jargon, and break down complex ideas. Aim for a score of 60-70.")
        if avg_sentence_length > 22: # Keep sentences concise for impact
            suggestions.append("• Reduce average sentence length to enhance flow and clarity. Aim for 18-22 words per sentence.")
        if syllable_count > 1.6: # Use more common words
            suggestions.append("• Opt for more accessible vocabulary where appropriate, without sacrificing professionalism.")
        
        # Structure & Content suggestions
        h2_count = len(re.findall(r'<h2\b[^>]*>', html_text, re.IGNORECASE))
        if h2_count < 6: # The prompt defines six H2 sections; the SEO Metadata block uses <h3>
            suggestions.append(f"• Incorporate more descriptive H2 subheadings (currently {h2_count}) to segment the article and improve SEO/scannability. Aim for 6-8 sections.")
        if not re.search(r'<ul>|<ol>', html_text, re.IGNORECASE):
            suggestions.append("• Utilize bullet points or numbered lists to highlight key achievements, making content more digestible.")
        if len(clean_text.split()) < 600:
            suggestions.append("• Expand the article's content to reach at least 600-800 words for comprehensive coverage and better SEO depth.")
        
        # Credibility & Impact suggestions
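        # Count straight and curly double quotes, since the model may emit either
        if sum(clean_text.count(q) for q in ('"', '\u201c', '\u201d')) < 4: # Less than two full quotes
            suggestions.append("• Include at least two insightful and forward-looking quotes to enhance credibility and provide a strong leadership voice.")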
        if not re.search(r'\d+%|\$\s*\d+(?:,\d{3})*(?:\.\d{2})?|[\d,]+\s*(?:million|billion|thousand|trillion)\b', clean_text, re.IGNORECASE):
            suggestions.append("• Integrate specific metrics, financial figures, or quantifiable achievements to add authority and impact.")
        
        # SEO & Tone suggestions
        if self.target_keyword:
            keyword_lower = self.target_keyword.lower()
            # Exclude the trailing "SEO Metadata" block: it always repeats the keyword,
            # so the concluding-paragraph check would otherwise never fire
            body_text = re.split(r'SEO Metadata', clean_text, flags=re.IGNORECASE)[0]
            body_words = body_text.lower().split()
            if not re.search(r'<h1\b[^>]*>.*?' + re.escape(keyword_lower) + r'.*?</h1>', html_text, re.IGNORECASE):
                suggestions.append(f"• Ensure the target keyword '{self.target_keyword}' is present in the H1 title for optimal SEO.")
            if keyword_lower not in ' '.join(body_words[:200]):
                suggestions.append(f"• Place the target keyword '{self.target_keyword}' within the first 200 words of the article.")
            if keyword_lower not in ' '.join(body_words[-200:]):
                suggestions.append(f"• Include the target keyword '{self.target_keyword}' in the concluding paragraph.")
        
        hype_words = ['unbelievable', 'game-changing', 'revolutionary', 'groundbreaking', 'amazing', 'stunning', 'incredible', 'breakthrough']
        if any(word in clean_text.lower() for word in hype_words):
            suggestions.append("• Review language for overly promotional or 'hypey' terms. Elite PR uses a more measured, authoritative tone.")

        strategic_indicators = ['strategic', 'implications', 'visionary', 'innovation', 'leadership', 'market dynamics', 'ecosystem', 'trajectory', 'future-forward', 'robust analysis', 'pivotal moment', 'transformative', 'significant investment']
        if sum(1 for indicator in strategic_indicators if indicator in clean_text.lower()) < 5:
            suggestions.append("• Incorporate more strategic and analytical language to convey depth, expertise, and a broader market perspective.")

        if not suggestions:
            suggestions.append("• Your article is exceptionally well-crafted, adhering to elite PR standards! No major improvements suggested.")
            
        return "\n".join(suggestions)

    def generate_article(self, api_key, content, max_length=None):
        """Generate PR-style article using Gemini API"""
        if not api_key.strip():
            return "❌ Error: Please enter your Gemini API key", "", None, "", ""
            
        if not content.strip():
            return "❌ Error: Please enter content to transform", "", None, "", ""

        model, error = self.setup_gemini(api_key)
        if error:
            return f"❌ {error}", "", None, "", ""

        try:
            # Generate article
            full_prompt = self.pr_prompt + "\n\nCONTENT TO TRANSFORM:\n" + content
            response = model.generate_content(full_prompt)
            
            if not response.text:
                return "❌ Error: No response generated from API", "", None, "", ""
                
            # Clean the HTML output to remove markdown code blocks
            article_html = self.clean_html_output(response.text.strip())
            
            # Extract SEO metadata to pass to readability score
            # Regex to find the target keyword within the SEO Metadata list
            target_keyword_match = re.search(r'<li><strong>Target Keyword:</strong>\s*(.*?)</li>', article_html, re.IGNORECASE)

            # Store target keyword for scoring
            self.target_keyword = target_keyword_match.group(1).strip() if target_keyword_match else None

            # Calculate readability and suggestions
            readability_score, suggestions = self.calculate_readability_score(article_html)
            
            # Prepare readability display
            score_color = "error" if readability_score < 6.0 else ("warning" if readability_score < 8.0 else "success")
            readability_display = f"<div class='score-display {score_color}'>Elite PR Score: {readability_score}/10</div>"
            
            # Create downloadable file with a timestamped, recognizable name
            timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
            
            # delete=False keeps the file on disk so Gradio can serve it after this method returns
            temp_file = tempfile.NamedTemporaryFile(
                mode='w', prefix=f"pr_article_{timestamp}_", suffix='.html',
                delete=False, encoding='utf-8'
            )
            temp_file.write(f"""<!DOCTYPE html>
<html>
<head>
    <meta charset="UTF-8">
    <title>PR Article - Generated {datetime.now().strftime("%Y-%m-%d")}</title>
    <style>
        body {{ font-family: Arial, sans-serif; max-width: 800px; margin: 0 auto; padding: 20px; }}
        h1 {{ color: #2c3e50; }}
        h2 {{ color: #34495e; border-bottom: 2px solid #3498db; padding-bottom: 5px; }}
        p {{ line-height: 1.6; }}
        ul {{ margin: 15px 0; }}
        li {{ margin: 5px 0; }}
    </style>
</head>
<body>
{article_html}
</body>
</html>""")
            temp_file.close()
            
            success_msg = "✅ Article generated successfully! See Elite PR Score and suggestions below."
            
            return success_msg, article_html, temp_file.name, readability_display, suggestions
            
        except Exception as e:
            return f"❌ Generation Error: {str(e)}", "", None, "", ""

def create_interface():
    generator = PRArticleGenerator()
    
    # Custom CSS for better styling
    css = """
    .container { max-width: 1200px; margin: 0 auto; }
    .header { text-align: center; margin-bottom: 30px; }
    .score-display { 
        font-size: 18px; 
        font-weight: bold; 
        padding: 10px; 
        border-radius: 5px; 
        margin: 10px 0;
        text-align: center;
    }
    .error { background-color: #fcecec; color: #e74c3c; border: 1px solid #e74c3c; }
    .success { background-color: #e8fce8; color: #27ae60; border: 1px solid #27ae60; }
    .warning { background-color: #fff9e6; color: #f39c12; border: 1px solid #f39c12; }
    .suggestions-box {
        background-color: #f8f8f8;
        border: 1px solid #ddd;
        padding: 15px;
        border-radius: 8px;
        margin-top: 20px;
    }
    .suggestions-box h3 {
        color: #34495e;
        margin-top: 0;
    }
    """
    
    with gr.Blocks(css=css, title="Elite PR Article Generator", theme=gr.themes.Soft()) as interface:
        gr.HTML("""
        <div class="header">
            <h1>🎯 Elite PR Article Generator</h1>
            <p>Transform any content into professional, high-impact PR-style articles optimized for SEO and top-tier media consumption.</p>
        </div>
        """)
        
        with gr.Row():
            with gr.Column(scale=1):
                gr.HTML("<h3>📝 Input Configuration</h3>")
                
                api_key_input = gr.Textbox(
                    label="🔑 Gemini API Key",
                    placeholder="Enter your Gemini API key here...",
                    type="password",
                    lines=1
                )
                
                content_input = gr.Textbox(
                    label="📄 Content to Transform",
                    placeholder="Paste your raw content, news, facts, or story here...\n\nExample:\n- Company announcement detailing a funding round, acquisition, or strategic partnership.\n- In-depth executive biography highlighting leadership philosophy and career milestones.\n- Detailed product launch outlining market problem, solution, and future roadmap.\n- Business news covering a major achievement or industry impact.",
                    lines=15,
                    max_lines=20
                )
                
                generate_btn = gr.Button(
                    "🚀 Generate Elite PR Article", 
                    variant="primary",
                    size="lg"
                )
                
            with gr.Column(scale=1):
                gr.HTML("<h3>📊 Output & Performance</h3>")
                
                status_output = gr.Textbox(
                    label="📋 Status",
                    lines=2,
                    interactive=False
                )
                
                readability_score_output = gr.HTML(
                    value="", # Placeholder for the score display
                    label="Elite PR Score"
                )
                
                suggestions_output = gr.Textbox(
                    label="💡 Improvement Suggestions",
                    lines=8,
                    interactive=False,
                    elem_classes="suggestions-box"
                )
                
                download_file = gr.File(
                    label="📥 Download HTML Article",
                    interactive=False
                )
        
        with gr.Row():
            with gr.Column():
                gr.HTML("<h3>🎨 Generated Article Preview</h3>")
                article_output = gr.HTML(
                    label="Generated PR Article",
                    show_label=False
                )
        
        # Event handlers
        generate_btn.click(
            fn=generator.generate_article,
            inputs=[api_key_input, content_input],
            outputs=[status_output, article_output, download_file, readability_score_output, suggestions_output]
        )
        
        # Add examples
        gr.HTML("""
        <div style="margin-top: 30px; padding: 20px; border-radius: 10px;">
            <h3>💡 Example Input Content:</h3>
            <p><strong>Business Acquisition:</strong> "Global tech leader InnovateCo today announced its strategic acquisition of cutting-edge AI startup, Nexus Solutions, for $150 million in cash. The acquisition aims to integrate Nexus's patented machine learning algorithms into InnovateCo's enterprise software suite, enhancing predictive analytics capabilities for Fortune 500 clients. InnovateCo CEO, Dr. Anya Sharma, stated, 'This acquisition is a pivotal moment for us, accelerating our AI roadmap and solidifying our leadership in intelligent automation.' Nexus Solutions, founded by visionary AI engineer Mark Chen, is renowned for its breakthroughs in unsupervised learning and has consistently been recognized for its innovation in the AI landscape."</p>
            <p><strong>Product Launch:</strong> "Sustainable Energy Corp. unveils 'EcoGrid Pro,' a revolutionary smart grid management software designed to optimize renewable energy distribution by 40%. The system leverages AI to predict energy demand fluctuations and seamlessly integrate solar and wind power into existing grids. Early pilot programs with major utility providers in California and Germany demonstrated a 25% reduction in energy waste. CEO Elena Petrova highlights, 'EcoGrid Pro isn't just a product; it's a commitment to a more resilient and sustainable energy future.' The company plans a global rollout in Q3 2025."</p>
            <p><strong>Executive Profile:</strong> "Dr. David Kim, newly appointed Chief Innovation Officer at BioSynth Labs, brings over two decades of experience in pharmaceutical R&D, including leading three blockbuster drug development programs at PharmaGiant Inc. A graduate of Stanford University with a Ph.D. in Molecular Biology, Dr. Kim is known for his pioneering work in CRISPR gene-editing technologies. His vision for BioSynth Labs focuses on accelerating therapeutic breakthroughs for neurodegenerative diseases and establishing new industry benchmarks for drug discovery. He believes, 'True innovation stems from a relentless pursuit of curiosity and a collaborative spirit to challenge existing paradigms.'"</p>
        </div>
        """)
    
    return interface

if __name__ == "__main__":
    # Dependencies (gradio, google-generativeai, textstat) must be installed before launch,
    # e.g. via requirements.txt. The module-level imports above would already have failed
    # before any install step placed here could run.
    
    app = create_interface()
    app.launch(
        # You can add launch parameters here, e.g., share=True for a public link
    )