# 📘 SEO Keyword Analyzer API - Complete Development Guide

## 1. Project Overview

This project is an **AI-Powered Microservice** built with **FastAPI**. It acts as an SEO consultant: it accepts a topic (e.g., "Digital Marketing") and generates a strategy that includes:

- High-volume Keywords
- Viral Hashtags
- Competition Analysis
- Strategic Tips

**Key Feature:** This project runs the **Qwen2.5-0.5B-Instruct** model **LOCALLY** inside the container.

- **Zero External Dependencies**: It does NOT call an external API. The model lives inside the app.
- **100% Free**: No rate limits, no credit usage.
- **Privacy**: Data never leaves your container.

---

## 2. Technology Stack Used

We used the following technologies to build this application from scratch:

| Component | Technology | Purpose |
| :--- | :--- | :--- |
| **Framework** | **FastAPI** | High-performance web framework for building the API endpoints. |
| **AI Model** | **Qwen2.5-0.5B-Instruct** | The "Nano" model. Ultra-lightweight (0.5B parameters) for maximum speed and zero timeouts. |
| **Connection** | **Local Inference (CPU)** | No API calls. The model lives inside your app. Zero external dependencies. |
| **Container** | **Docker + PyTorch** | Includes Torch/Transformers so the AI engine runs self-contained. |
| **Deployment** | **Hugging Face Spaces** | The cloud platform hosting the Docker container. |

---

## 3. Directory Structure Explanation

Here is how the project files are organized:

```
SEO_Analyzer_FastAPI/
├── main.py              # 🚦 Entry Point: Defines the API routes & server.
├── requirements.txt     # 📦 Dependencies: Lists libraries (torch, transformers, fastapi).
├── Dockerfile           # 🐳 Deployment: Instructions to build the Linux container.
├── models/
│   └── schemas.py       # 📝 Data Models: Pydantic classes to validate input/output.
└── services/
    └── analyzer.py      # 🧠 The Brain: Loads the Local Model and handles inference.
```

---

## 4. How It Was Built (A to Z)

### Step 1: Defining the Data Structure (`models/schemas.py`)

Before writing any logic, we defined what the "Input" and "Output" should look like using **Pydantic**.

- **Input**: A simple JSON object `{"content": "..."}`.
- **Output**: A strict JSON schema ensuring the UI always receives `core_keywords`, `hashtags`, `relevance` scores, etc.

### Step 2: Building the Logic Core (`services/analyzer.py`)

This is the heart of the "Local AI" engine:

1. **Loading**: On startup, we use `transformers.pipeline` to download and load `Qwen2.5-0.5B-Instruct` (approx. 1 GB).
2. **Inference**: When a request comes in, the **CPU** runs the model's calculations to generate text.
3. **Optimization**: We pass `torch_dtype=torch.bfloat16` so the model runs faster and uses less RAM.
4. **Temperature Control**: We set `temperature=0.3` so the output is more deterministic and more likely to be valid JSON.

### Step 3: Creating the API Endpoints (`main.py`)

We created a FastAPI app with two routes:

- `GET /`: A health check.
- `POST /analyze-seo`: The main worker. It includes a **Safety Net** that auto-fills missing data if the AI makes a mistake.

### Step 4: Dockerization (`Dockerfile`)

To make this run in the cloud:

- **Base Image**: `python:3.9`
- **Dependency**: We install `torch` (PyTorch) so the model can run inference inside the container.
- **Port**: Exposes port **7860** for Hugging Face Spaces.

---

## 5. How It Works (The Flow)

1. **User Action**: The user sends a request to `POST /analyze-seo` with the body `{"content": "dropshipping"}`.
2. **API Layer**: FastAPI receives and validates it.
3. **Local Inference**:
   - The server passes the text to the loaded Qwen model.
   - The **CPU** generates the response token by token.
   - This takes ~10-20 seconds.
4. **Parsing & Repair**: The app cleans the JSON and fixes any syntax errors automatically.
5. **Response**: The user receives the data.

---

## 6. How to Run Locally

1. **Install Requirements**:
   ```bash
   pip install -r requirements.txt
   ```
2. **No Keys Needed**: You do NOT need an API key. It runs locally.
3. **Run the Server**:
   ```bash
   python -m uvicorn main:app --reload
   ```
   *Note: The first run will download the model (approx. 1 GB).*
4. **Access Documentation**: Open `http://localhost:8000/docs`.

---

## 7. Configuration Limitations

- **CPU Speed**: Since it runs on a free CPU, we limit generation to **30 keywords** to ensure it finishes quickly.
- **Model Choice**: We used the **0.5B (Nano)** model because it is small enough to fit comfortably in the free-tier RAM while remaining fast.

---

**Developed by Ihtesham | Powered by Open Source AI**

---

## 8. Local Hardware Recommendations (RTX 4050)

If you run this application on an **RTX 4050 (6GB VRAM) + 16GB RAM**:

| Model | Size | VRAM Usage | Speed (Tokens/s) | Recommendation |
| :--- | :--- | :--- | :--- | :--- |
| **Qwen-0.5B** | 0.5B | ~0.8 GB | **100+** (Instant) | ⚡ Overkill Speed. Low memory usage. |
| **Qwen-1.5B** | 1.5B | ~2.5 GB | **70+** (Very Fast) | ✅ **Perfect Balance.** Best for 6GB cards. |
| **Qwen-7B (4-bit)** | 7B | ~5.5 GB | **30+** (Fast) | 🧠 **Smartest.** Maxes out your VRAM. |

**Conclusion**: Your RTX 4050 is roughly **10x more powerful** than the Free Tier CPU. You should upgrade to the **1.5B Model** locally for better intelligence without sacrificing speed.
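
For readers who want to see what the "Parsing & Repair" step and the **Safety Net** from sections 4 and 5 might look like, here is a minimal sketch using only the Python standard library. It is an illustration under stated assumptions, not the project's actual code: the helper names `extract_json_object` and `apply_safety_net` and the `DEFAULTS` table are hypothetical, only two of the output fields are shown, and the real logic in `main.py` and `services/analyzer.py` may differ.

```python
import json
import re

# Hypothetical defaults for two of the output fields named in this guide.
# The real schema lives in models/schemas.py and has more fields.
DEFAULTS = {
    "core_keywords": [],
    "hashtags": [],
}


def extract_json_object(raw: str) -> dict:
    """Pull the outermost {...} span out of raw model text and parse it."""
    start = raw.find("{")
    end = raw.rfind("}")
    if start == -1 or end <= start:
        return {}
    candidate = raw[start:end + 1]
    # Repair one common slip: a trailing comma before } or ].
    # (Naive: it would also touch such a pattern inside a string value.)
    candidate = re.sub(r",\s*([}\]])", r"\1", candidate)
    try:
        parsed = json.loads(candidate)
    except json.JSONDecodeError:
        return {}
    return parsed if isinstance(parsed, dict) else {}


def apply_safety_net(data: dict) -> dict:
    """Fill any missing or wrongly typed field with a fresh default."""
    result = {}
    for key, default in DEFAULTS.items():
        value = data.get(key)
        result[key] = value if isinstance(value, type(default)) else list(default)
    return result


# Usage: model output wrapped in chatter, with trailing commas and a missing field.
raw = 'Sure! Here is the plan:\n{"core_keywords": ["dropshipping store", "suppliers",], }'
result = apply_safety_net(extract_json_object(raw))
print(result)
```

Returning an empty dict on any parse failure keeps the endpoint from raising: the safety net then fills every field with its default, so the client always receives a response in the expected shape.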