Spaces:
Sleeping
π SEO Keyword Analyzer API - Complete Development Guide
1. Project Overview
This project is an AI-Powered Microservice built with FastAPI. It serves as an intelligent SEO consultant that accepts a topic (e.g., "Digital Marketing") and generates a comprehensive strategy including:
- High-volume Keywords
- Viral Hashtags
- Competition Analysis
- Strategic Tips
Key Feature: This project runs the Qwen2.5-0.5B-Instruct model LOCALLY inside the container.
- Zero External Dependencies: It does NOT use an external API. The brain lives inside the app.
- 100% Free: No rate limits, no credit usage.
- Privacy: Data never leaves your container.
2. Technology Stack used
We used the following technologies to build this application from scratch:
| Component | Technology | Purpose |
|---|---|---|
| Framework | FastAPI | High-performance web framework for building the API endpoints. |
| AI Model | Qwen2.5-0.5B-Instruct | The "Nano" model. Ultra-lightweight (0.5B) for maximum speed and zero timeouts. |
| Connection | Local Inference (CPU) | No API calls. The model lives inside your app. Zero external dependencies. |
| Container | Docker + PyTorch | Includes Torch/Transformers to run the AI engine self-contained. |
| Deployment | Hugging Face Spaces | The cloud platform hosting the Docker container. |
3. Directory Structure Explaination
Here is how the project files are organized:
SEO_Analyzer_FastAPI/
βββ main.py # π¦ Entry Point: Defines the API routes & server.
βββ requirements.txt # π¦ Dependencies: Lists libraries (torch, transformers, fastapi).
βββ Dockerfile # π³ Deployment: Instructions to build the Linux container.
βββ models/
β βββ schemas.py # π Data Models: Pydantic classes to validate input/output.
βββ services/
βββ analyzer.py # π§ The Brain: Loads the Local Model and handles inference.
4. How It Was Built (A to Z)
Step 1: Defining the Data Structure (models/schemas.py)
Before writing code, we defined what the "Input" and "Output" should look like using Pydantic.
- Input: A simple JSON object
{"content": "..."}. - Output: A strict JSON schema ensuring the UI always receives
core_keywords,hashtags,relevancescores, etc.
Step 2: Building the Logic Core (services/analyzer.py)
This is the heart of the "Local AI" engine:
- Loading: On startup, we use
transformers.pipelineto downloadQwen2.5-0.5B(approx 1GB). - Inference: When a request comes in, the CPU runs the mathematical calculations to generate text.
- Optimization: We use
torch_dtype=bfloat16to make it run faster and use less RAM. - Temperature Control: We set
temperature=0.3to make the AI strict and reliable for JSON.
Step 3: Creating the API Endpoints (main.py)
We created a FastAPI app with two routes:
GET /: A health check.POST /analyze-seo: The main worker. It includes a Safety Net that auto-fills missing data if the AI makes a mistake.
Step 4: Dockerization (Dockerfile)
To make this run on the cloud:
- Base Image:
python:3.9 - Dependency: We install
torch(PyTorch) so the AI can run mathematically. - Port: Exposes port 7860 for Hugging Face Spaces.
5. How It Works (The Flow)
- User Action: Sends a request:
POST {"content": "dropshipping"}. - API Layer: FastAPI receives it.
- Local Inference:
- The server passes the text to the loaded Qwen model.
- The CPU generates the response token-by-token.
- This takes ~10-20 seconds.
- Parsing & Repair: The app cleans the JSON and fixes any syntax errors automatically.
- Response: The user receives the data.
6. How to Run Locally
Install Requirements:
pip install -r requirements.txtNo Keys Needed: You do NOT need an API key. It runs locally.
Run the Server:
python -m uvicorn main:app --reloadNote: The first run will download the model (1GB).
Access Documentation: Open
http://localhost:8000/docs.
7. Configuration Limitations
- CPU Speed: Since it runs on a free CPU, we limit generation to 30 keywords to ensure it finishes quickly.
- Model Choice: We used the 0.5B (Nano) model because it is the only modern LLM that fits comfortably in the free tier RAM while remaining fast.
Developed by Ihtesham | Powered by Open Source AI
8. Local Hardware Recommendations (RTX 4050)
If you run this application on an RTX 4050 (6GB VRAM) + 16GB RAM:
| Model | Size | VRAM Usage | Speed (Tokens/s) | Recommendation |
|---|---|---|---|---|
| Qwen-0.5B | 0.5B | ~0.8 GB | 100+ (Instant) | β‘ Overkill Speed. Low Memory usage. |
| Qwen-1.5B | 1.5B | ~2.5 GB | 70+ (Very Fast) | β Perfect Balance. Best for 6GB cards. |
| Qwen-7B (4-bit) | 7B | ~5.5 GB | 30+ (Fast) | π§ Smartest. Maxes out your VRAM. |
Conclusion: Your RTX 4050 is 10x more powerful than the Free Tier CPU. You should upgrade to the 1.5B Model locally for better intelligence without sacrificing speed.