Technical-SEO-Detector: Technical SEO Issue Detection
Type: Commercial | Domain: SEO, Technical SEO
Hugging Face: syeedalireza/technical-seo-detector
Classify or detect technical SEO issues: redirect chains, thin content, duplicate signals, etc., from URL and page data.
Author
Alireza Aminzadeh
- Hugging Face: syeedalireza
- LinkedIn: alirezaaminzadeh
- Email: alireza.aminzadeh@hotmail.com
Problem
Automated detection of technical issues (redirect chains, thin content, canonical/duplicate problems) speeds up audits.
Approach
- Input: URL, redirect_chain_length, content_length, status_code, canonical_match, etc.
- Output: Issue type (redirect_chain, thin_content, duplicate, ok) or multi-label.
- Models: XGBoost/LightGBM on tabular features; optional text classifier for βthinβ from content snippet.
Tech Stack
| Category | Tools |
|---|---|
| ML | scikit-learn, XGBoost |
| Data | pandas, NumPy |
| Evaluation | sklearn metrics |
Setup
pip install -r requirements.txt
Usage
python train.py
python inference.py --input data/crawl_export.csv --output issues.csv
Project structure
07_technical-seo-detector/
βββ config.py
βββ train.py # XGBoost classifier for issue_type
βββ inference.py # Predict issue type for crawl export
βββ requirements.txt
βββ .env.example
βββ data/
β βββ crawl_export.csv # Sample: features + issue_type
βββ models/
Data
- Sample data (included):
data/crawl_export.csvβ columns:redirect_chain_length,content_length,status_code,word_count,internal_links,issue_type(e.g.ok,redirect_chain,thin_content). - Set
DATA_PATHin.envif using another file.
License
MIT.
Inference Providers NEW
This model isn't deployed by any Inference Provider. π Ask for provider support