metadata
language: en
license: mit
tags:
  - sdf
  - classification
  - qwen2.5
  - gguf
  - content-type
  - web-content
base_model: Qwen/Qwen2.5-1.5B-Instruct
pipeline_tag: text-generation

SDF Classify

A content-type classifier for the SDF Protocol, fine-tuned from Qwen2.5-1.5B-Instruct with QLoRA.

Purpose

Classifies web content into SDF's hierarchical type system: 10 parent types and 50+ subtypes (e.g., article.news, commerce.product, documentation.api_docs).
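The dotted labels above can be split into a parent type and a subtype programmatically. The following is a minimal sketch, assuming every label has the form `parent.subtype` with the subtype optional; `SdfType` and `parse_type` are hypothetical names for illustration, not part of the SDF tooling.

```python
from typing import NamedTuple, Optional


class SdfType(NamedTuple):
    """Hypothetical container for one classifier label."""
    parent: str
    subtype: Optional[str]  # None when the label carries only a parent type


def parse_type(label: str) -> SdfType:
    """Split a dotted label such as 'article.news' into parent and subtype."""
    cleaned = label.strip().lower()
    if not cleaned:
        raise ValueError("empty type label")
    # partition() splits on the first dot only, so the subtype keeps any
    # further dots intact.
    parent, _, subtype = cleaned.partition(".")
    return SdfType(parent, subtype or None)


# The three example labels quoted in this README.
for label in ["article.news", "commerce.product", "documentation.api_docs"]:
    print(parse_type(label))
```

Normalizing case and whitespace before splitting is a defensive choice for handling generated text; the model's actual output format may not need it.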

Training

  • Base model: Qwen2.5-1.5B-Instruct
  • Method: QLoRA (rank 32, alpha 64, dropout 0.05)
  • Training data: 2,335 classified web documents
  • Accuracy: 95.2% exact type match
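
As a rough illustration of the metric, "exact type match" can be read as the share of documents whose predicted label equals the reference label in full, parent and subtype together. The sketch below assumes that reading and uses made-up labels (`article.blog` is a hypothetical subtype); it is not the author's evaluation script, and the parent-only score is included purely for contrast.

```python
def exact_match_accuracy(predicted, reference):
    """Fraction of labels that match the reference exactly (parent and subtype)."""
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")
    if not reference:
        return 0.0
    hits = sum(p == r for p, r in zip(predicted, reference))
    return hits / len(reference)


def parent_match_accuracy(predicted, reference):
    """Looser score: only the part before the first dot has to match."""
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")
    if not reference:
        return 0.0
    hits = sum(p.split(".")[0] == r.split(".")[0] for p, r in zip(predicted, reference))
    return hits / len(reference)


# Illustrative labels only; 'article.blog' is a hypothetical subtype.
predicted = ["article.news", "commerce.product", "article.blog", "documentation.api_docs"]
reference = ["article.news", "commerce.product", "article.news", "documentation.api_docs"]

print(exact_match_accuracy(predicted, reference))   # 3 of 4 labels match in full
print(parent_match_accuracy(predicted, reference))  # all 4 parent types match
```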

Files

| File | Size | Description |
|---|---|---|
| sdf-classify-Qwen2.5-1.5B-Instruct-Q4_K_M.gguf | 941 MB | Quantized (Q4_K_M), recommended for deployment |
| sdf-classify-Qwen2.5-1.5B-Instruct-f16.gguf | 2.9 GB | Unquantized (f16) |
| Modelfile | | Ollama import configuration |

Usage with Ollama

# Download the Q4_K_M file, then:
ollama create sdf-classify -f Modelfile
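
Once the model has been created, it can be queried through Ollama's local REST API. The following is a minimal sketch, assuming the Ollama server is running on its default port and that the raw page text can be sent as the prompt; the prompt format the model actually expects depends on the template in the Modelfile, and `build_request` and `classify` are hypothetical helper names.

```python
import json
import urllib.request

# Ollama's default local endpoint for single-turn generation.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(text: str, model: str = "sdf-classify") -> urllib.request.Request:
    """Build a non-streaming generate request for the local Ollama server."""
    # Assumption: the page text is passed through unchanged as the prompt.
    payload = {"model": model, "prompt": text, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def classify(text: str) -> str:
    """Send the text to the model and return its raw reply, stripped."""
    with urllib.request.urlopen(build_request(text), timeout=60) as resp:
        body = json.load(resp)
    # With "stream": False, the generated text arrives in the "response" field.
    return body["response"].strip()
```

With the server running, `classify(page_text)` should return whatever the model generates for that input, which is expected to be a type label, though the exact output format is not specified here.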

Part of SDF Protocol

Citation

@article{sarkar2026sdf,
  title={Convert Once, Consume Many: SDF for Cacheable, Typed Semantic Extraction from Web Pages},
  author={Sarkar, Pranab},
  year={2026},
  doi={10.5281/zenodo.18559223},
  publisher={Zenodo}
}