Technical-SEO-Detector: Technical SEO Issue Detection

Type: Commercial | Domain: SEO, Technical SEO
Hugging Face: syeedalireza/technical-seo-detector

Classify or detect technical SEO issues: redirect chains, thin content, duplicate signals, etc., from URL and page data.

Author

Alireza Aminzadeh

Problem

Automated detection of technical issues (redirect chains, thin content, canonical/duplicate problems) speeds up audits.

Approach

  • Input: URL, redirect_chain_length, content_length, status_code, canonical_match, etc.
  • Output: Issue type (redirect_chain, thin_content, duplicate, ok) or multi-label.
  • Models: XGBoost/LightGBM on tabular features; optional text classifier for β€œthin” from content snippet.

Tech Stack

Category Tools
ML scikit-learn, XGBoost
Data pandas, NumPy
Evaluation sklearn metrics

Setup

pip install -r requirements.txt

Usage

python train.py
python inference.py --input data/crawl_export.csv --output issues.csv

Project structure

07_technical-seo-detector/
β”œβ”€β”€ config.py
β”œβ”€β”€ train.py           # XGBoost classifier for issue_type
β”œβ”€β”€ inference.py       # Predict issue type for crawl export
β”œβ”€β”€ requirements.txt
β”œβ”€β”€ .env.example
β”œβ”€β”€ data/
β”‚   └── crawl_export.csv   # Sample: features + issue_type
└── models/

Data

  • Sample data (included): data/crawl_export.csv β€” columns: redirect_chain_length, content_length, status_code, word_count, internal_links, issue_type (e.g. ok, redirect_chain, thin_content).
  • Set DATA_PATH in .env if using another file.

License

MIT.

Downloads last month

-

Downloads are not tracked for this model. How to track
Inference Providers NEW
This model isn't deployed by any Inference Provider. πŸ™‹ Ask for provider support

Space using syeedalireza/technical-seo-detector 1