---
language: en
license: mit
tags:
- sdf
- classification
- qwen2.5
- gguf
- content-type
- web-content
base_model: Qwen/Qwen2.5-1.5B-Instruct
pipeline_tag: text-generation
---

# SDF Classify

Content type classifier for the [SDF Protocol](https://sdfprotocol.org). Fine-tuned from Qwen2.5-1.5B-Instruct using QLoRA.

## Purpose

Classifies web content into SDF's hierarchical type system: 10 parent types and 50+ subtypes (e.g., `article.news`, `commerce.product`, `documentation.api_docs`).

## Training

- **Base model**: Qwen2.5-1.5B-Instruct
- **Method**: QLoRA (rank 32, alpha 64, dropout 0.05)
- **Training data**: 2,335 classified web documents
- **Accuracy**: 95.2% exact type match

## Files

| File | Size | Description |
|------|------|-------------|
| `sdf-classify-Qwen2.5-1.5B-Instruct-Q4_K_M.gguf` | 941 MB | Quantized (Q4_K_M), recommended for deployment |
| `sdf-classify-Qwen2.5-1.5B-Instruct-f16.gguf` | 2.9 GB | Full precision (f16) |
| `Modelfile` | - | Ollama import configuration |

## Usage with Ollama

```bash
# Download the Q4_K_M file, then:
ollama create sdf-classify -f Modelfile
```

## Part of SDF Protocol

- **Protocol**: [sdfprotocol.org](https://sdfprotocol.org)
- **Specification**: [github.com/sdfprotocol/sdf](https://github.com/sdfprotocol/sdf)
- **Whitepaper**: [DOI 10.5281/zenodo.18559223](https://doi.org/10.5281/zenodo.18559223)
- **Extractor model**: [pranab2050/sdf-extract](https://huggingface.co/pranab2050/sdf-extract)

## Citation

```bibtex
@article{sarkar2026sdf,
  title={Convert Once, Consume Many: SDF for Cacheable, Typed Semantic Extraction from Web Pages},
  author={Sarkar, Pranab},
  year={2026},
  doi={10.5281/zenodo.18559223},
  publisher={Zenodo}
}
```
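
## Example: splitting a type label

The hierarchical labels mentioned under Purpose (for example `article.news`) combine a parent type and a subtype separated by a dot. The sketch below shows one way a consumer might split such a label after classification. It assumes the model's reply has already been reduced to a single dotted label string; the names `SdfType` and `parse_sdf_type` are hypothetical helpers, not part of the SDF tooling, and the lowercasing step is an assumption rather than something the specification is known to require.

```python
from typing import NamedTuple, Optional


class SdfType(NamedTuple):
    """Hypothetical container for a parsed label (not an SDF API)."""
    parent: str
    subtype: Optional[str]


def parse_sdf_type(label: str) -> SdfType:
    """Split a dotted label such as 'article.news' into parent and subtype.

    Assumption: labels have the form 'parent' or 'parent.subtype'.
    """
    # Tolerate surrounding whitespace or backticks in the raw reply.
    cleaned = label.strip().strip("`").lower()
    if not cleaned:
        raise ValueError("empty type label")

    # Split on the first dot only; the remainder is treated as the subtype.
    parent, sep, subtype = cleaned.partition(".")
    if not parent or (sep and not subtype):
        raise ValueError(f"malformed type label: {label!r}")

    return SdfType(parent, subtype or None)


if __name__ == "__main__":
    for raw in ("article.news", "commerce.product", "documentation.api_docs"):
        print(parse_sdf_type(raw))
```

The helper checks only the shape of the label. Validating against the actual list of parent types and subtypes would need the taxonomy from the specification, which this card does not reproduce.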