SMGP - Social Media Content Generator

Project Overview

SMGP (Social Media Post Generator) is an AI-powered tool that automates the creation of social media content, including both text and visual elements. The system uses a multi-agent architecture to coordinate four specialized tasks:

  • Content Generation: Creates platform-specific social media post text
  • Visual Generation: Produces AI-generated images based on design specifications
  • Web Content Extraction: Extracts text and theme colors from websites
  • Design Specification Generation: Creates detailed JSON specifications for visual content
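
To make the idea of a JSON design specification concrete, here is a minimal sketch of what such a structure could look like. The field names (`platform`, `headline`, `width`, `height`, `palette`) are assumptions chosen for illustration, not the project's actual schema, and the sketch uses standard-library dataclasses to stay dependency-free.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class DesignSpec:
    """Hypothetical design specification; all field names are illustrative."""

    platform: str
    headline: str
    width: int
    height: int
    palette: list[str] = field(default_factory=list)

    def to_json(self) -> str:
        # Serialize the spec to a JSON string.
        return json.dumps(asdict(self), indent=2)


def parse_design_spec(raw: str) -> DesignSpec:
    """Parse a JSON string into a DesignSpec with a minimal sanity check."""
    spec = DesignSpec(**json.loads(raw))
    if spec.width <= 0 or spec.height <= 0:
        raise ValueError("width and height must be positive")
    return spec


spec = DesignSpec(
    platform="linkedin",
    headline="Launch day",
    width=1200,
    height=627,
    palette=["#0A66C2", "#FFFFFF"],
)
round_tripped = parse_design_spec(spec.to_json())
```

A structured spec like this can be handed from one agent to another as plain JSON and checked before any image generation is attempted.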

The project leverages various AI models (primarily Google's Gemini) and integrates multiple tools and services for web scraping, image generation, and content creation.

Architecture

The system is built around a multi-agent architecture:

  1. Main Agent - Coordinates the overall social media post generation workflow
  2. Content Agent - Generates engaging text content tailored to specific platforms
  3. Media Agent - Creates visual content based on design briefs and specifications
  4. Web Inspector Agent - Extracts text, colors, and metadata from websites
  5. Browser Agent - Performs advanced web analysis and data extraction

Key Files

  • src/main.py - Entry point that orchestrates the agent workflow
  • src/_agents.py - Contains implementations of all specialized agents and tools
  • src/model.py - Configuration for AI models and API clients
  • src/utils/ - Utility functions (the directory appears in the project structure; its contents are not described here)

Dependencies

The project uses several key libraries:

  • browser-use - For web browsing and content extraction
  • google-genai - Google's Generative AI for content creation
  • fal-client - For AI image generation services
  • langchain - For LLM integration and tools
  • langfuse - For observability and tracing
  • openai-agents - For multi-agent coordination
  • duckduckgo-search - For web search capabilities

Building and Running

Prerequisites

  • Python 3.11+
  • Node.js and npx (required for browser tools)
  • API keys for the services listed under Setup (Google, Gemini, OpenRouter, OpenAI)

Setup

  1. Install dependencies with uv or pip:

    uv sync  # or pip install -r requirements.txt if using pip
    
  2. Set up environment variables in .env file:

    GOOGLE_API_KEY=your_google_api_key
    GEMINI_API_KEY=your_gemini_api_key
    OPENROUTER_API_KEY=your_openrouter_api_key
    OPENAI_API_KEY=your_openai_api_key
    
  3. Run the main application:

    python -m src.main
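
Before running the application, it can help to confirm that the keys from step 2 are visible to the process. The sketch below is a hypothetical helper, not part of the project. It assumes all four variables are required and that they have already been loaded into the environment (a `.env` file is not read automatically by Python itself).

```python
import os

# Variable names taken from the .env example in step 2.
# Treating all four as required is an assumption of this sketch.
REQUIRED_KEYS = (
    "GOOGLE_API_KEY",
    "GEMINI_API_KEY",
    "OPENROUTER_API_KEY",
    "OPENAI_API_KEY",
)


def missing_keys(env=None) -> list[str]:
    """Return the required variable names that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_KEYS if not env.get(name)]


# Example with an explicit mapping instead of the real environment.
example_missing = missing_keys({"GOOGLE_API_KEY": "abc"})
```

Calling `missing_keys()` with no argument checks the real process environment and returns an empty list when everything is set.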
    

Development Conventions

  • The code uses Pydantic for data validation and type safety
  • Multi-agent coordination is handled through the Runner class
  • Asynchronous programming is used throughout for better performance
  • Tools are defined using the @function_tool decorator
  • Langfuse is used for tracing and observability
  • Session data is stored in SQLite for conversation persistence
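
As an illustration of the last point, the following is a minimal sketch of conversation persistence in SQLite using only the standard library. The `SessionStore` class, its table layout, and its method names are hypothetical. The project's actual session handling is not shown in this document and may rely on its agent framework instead.

```python
import json
import sqlite3


class SessionStore:
    """Hypothetical helper that stores conversation items per session."""

    def __init__(self, path: str = ":memory:") -> None:
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS messages ("
            "id INTEGER PRIMARY KEY AUTOINCREMENT, "
            "session_id TEXT NOT NULL, "
            "payload TEXT NOT NULL)"
        )

    def add(self, session_id: str, role: str, content: str) -> None:
        payload = json.dumps({"role": role, "content": content})
        with self.conn:  # commits on success, rolls back on error
            self.conn.execute(
                "INSERT INTO messages (session_id, payload) VALUES (?, ?)",
                (session_id, payload),
            )

    def history(self, session_id: str) -> list[dict]:
        # Return messages for one session in insertion order.
        rows = self.conn.execute(
            "SELECT payload FROM messages WHERE session_id = ? ORDER BY id",
            (session_id,),
        )
        return [json.loads(row[0]) for row in rows]


store = SessionStore()  # in-memory database for the example
store.add("demo", "user", "Write a LinkedIn post about our launch")
store.add("demo", "assistant", "Here is a draft...")
```

Passing a file path instead of the default `":memory:"` keeps the history across runs.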

Usage

The main workflow involves:

  1. Providing a user input/brief for the social media post
  2. Using the Web Inspector to extract content from websites (if needed)
  3. Generating content with the Content Agent
  4. Creating visual assets with the Media Agent
  5. Combining content and visuals into a complete social media post

The system is particularly effective at creating LinkedIn posts with content extracted from websites, generating both the text and complementary visual elements.

Key Features

  • Platform-specific content generation (LinkedIn, Twitter/X, Instagram, Facebook)
  • Automated image generation with detailed design specifications
  • Website content extraction with theme color detection
  • Multi-modal AI integration for rich content creation
  • Browser automation for advanced web data extraction
  • Structured JSON design specification generation
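
As a rough illustration of theme color detection, the sketch below counts hex color codes in a CSS string and returns the most frequent ones. This is a simple heuristic written for this document, not the project's extraction logic. It ignores `rgb()` and named colors, and it can misread an ID selector made only of hex letters (such as `#bad`) as a color.

```python
import re
from collections import Counter

# Matches 6-digit or 3-digit hex colors such as #0A66C2 or #333.
HEX_COLOR = re.compile(r"#(?:[0-9a-fA-F]{6}|[0-9a-fA-F]{3})\b")


def normalize(color: str) -> str:
    """Expand #abc to #aabbcc and lowercase the result."""
    digits = color[1:].lower()
    if len(digits) == 3:
        digits = "".join(ch * 2 for ch in digits)
    return "#" + digits


def theme_colors(css: str, top: int = 3) -> list[str]:
    """Return up to `top` hex colors, most frequent first."""
    counts = Counter(normalize(match) for match in HEX_COLOR.findall(css))
    return [color for color, _ in counts.most_common(top)]


css = (
    "body{color:#333} a{color:#0A66C2} h1{color:#0a66c2} "
    "p{color:#333333} nav{background:#0A66C2}"
)
dominant = theme_colors(css)
```

Normalizing before counting means `#333` and `#333333`, or `#0A66C2` and `#0a66c2`, are treated as the same color.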