# 📘 SEO Keyword Analyzer API - Complete Development Guide

## 1. Project Overview
This project is an **AI-Powered Microservice** built with **FastAPI**. It serves as an intelligent SEO consultant that accepts a topic (e.g., "Digital Marketing") and generates a comprehensive strategy including:
- High-volume Keywords
- Viral Hashtags
- Competition Analysis
- Strategic Tips

**Key Feature:** This project runs the **Qwen2.5-0.5B-Instruct** model **LOCALLY** inside the container.
- **Zero External Dependencies**: It does NOT use an external API. The brain lives inside the app.
- **100% Free**: No rate limits, no credit usage.
- **Privacy**: Data never leaves your container.

---

## 2. Technology Stack used
We used the following technologies to build this application from scratch:

| Component | Technology | Purpose |
| :--- | :--- | :--- |
| **Framework** | **FastAPI** | High-performance web framework for building the API endpoints. |
| **AI Model** | **Qwen2.5-0.5B-Instruct** | The "Nano" model. Ultra-lightweight (0.5B) for maximum speed and zero timeouts. |
| **Connection** | **Local Inference (CPU)** | No API calls. The model lives inside your app. Zero external dependencies. |
| **Container** | **Docker + PyTorch** | Includes Torch/Transformers to run the AI engine self-contained. |
| **Deployment** | **Hugging Face Spaces** | The cloud platform hosting the Docker container. |

---

## 3. Directory Structure Explaination
Here is how the project files are organized:

```
SEO_Analyzer_FastAPI/
├── main.py                 # 🚦 Entry Point: Defines the API routes & server.
├── requirements.txt        # 📦 Dependencies: Lists libraries (torch, transformers, fastapi).
├── Dockerfile              # 🐳 Deployment: Instructions to build the Linux container.
├── models/
│   └── schemas.py          # 📝 Data Models: Pydantic classes to validate input/output.
└── services/
    └── analyzer.py         # 🧠 The Brain: Loads the Local Model and handles inference.
```

---

## 4. How It Was Built (A to Z)

### Step 1: Defining the Data Structure (`models/schemas.py`)
Before writing code, we defined what the "Input" and "Output" should look like using **Pydantic**.
- **Input**: A simple JSON object `{"content": "..."}`.
- **Output**: A strict JSON schema ensuring the UI always receives `core_keywords`, `hashtags`, `relevance` scores, etc.

### Step 2: Building the Logic Core (`services/analyzer.py`)
This is the heart of the "Local AI" engine:
1.  **Loading**: On startup, we use `transformers.pipeline` to download `Qwen2.5-0.5B` (approx 1GB).
2.  **Inference**: When a request comes in, the **CPU** runs the mathematical calculations to generate text.
3.  **Optimization**: We use `torch_dtype=bfloat16` to make it run faster and use less RAM.
4.  **Temperature Control**: We set `temperature=0.3` to make the AI strict and reliable for JSON.

### Step 3: Creating the API Endpoints (`main.py`)
We created a FastAPI app with two routes:
- `GET /`: A health check.
- `POST /analyze-seo`: The main worker. It includes a **Safety Net** that auto-fills missing data if the AI makes a mistake.

### Step 4: Dockerization (`Dockerfile`)
To make this run on the cloud:
- **Base Image**: `python:3.9`
- **Dependency**: We install `torch` (PyTorch) so the AI can run mathematically.
- **Port**: Exposes port **7860** for Hugging Face Spaces.

---

## 5. How It Works (The Flow)

1.  **User Action**: Sends a request: `POST {"content": "dropshipping"}`.
2.  **API Layer**: FastAPI receives it.
3.  **Local Inference**:
    - The server passes the text to the loaded Qwen model.
    - The **CPU** generates the response token-by-token.
    - This takes ~10-20 seconds.
4.  **Parsing & Repair**: The app cleans the JSON and fixes any syntax errors automatically.
5.  **Response**: The user receives the data.

---

## 6. How to Run Locally

1.  **Install Requirements**:
    ```bash
    pip install -r requirements.txt
    ```

2.  **No Keys Needed**: You do NOT need an API key. It runs locally.

3.  **Run the Server**:
    ```bash
    python -m uvicorn main:app --reload
    ```
    *Note: The first run will download the model (1GB).*

4.  **Access Documentation**:
    Open `http://localhost:8000/docs`.

---

## 7. Configuration Limitations
- **CPU Speed**: Since it runs on a free CPU, we limit generation to **30 keywords** to ensure it finishes quickly.
- **Model Choice**: We used the **0.5B (Nano)** model because it is the only modern LLM that fits comfortably in the free tier RAM while remaining fast.

---
**Developed by Ihtesham | Powered by Open Source AI**

---

## 8. Local Hardware Recommendations (RTX 4050)
If you run this application on an **RTX 4050 (6GB VRAM) + 16GB RAM**:

| Model | Size | VRAM Usage | Speed (Tokens/s) | Recommendation |
| :--- | :--- | :--- | :--- | :--- |
| **Qwen-0.5B** | 0.5B | ~0.8 GB | **100+** (Instant) | ⚡ Overkill Speed. Low Memory usage. |
| **Qwen-1.5B** | 1.5B | ~2.5 GB | **70+** (Very Fast) | ✅ **Perfect Balance.** Best for 6GB cards. |
| **Qwen-7B (4-bit)** | 7B | ~5.5 GB | **30+** (Fast) | 🧠 **Smartest.** Maxes out your VRAM. |

**Conclusion**: Your RTX 4050 is **10x more powerful** than the Free Tier CPU. You should upgrade to the **1.5B Model** locally for better intelligence without sacrificing speed.