---
language: en
license: mit
tags:
- sdf
- extraction
- smollm3
- gguf
- structured-data
- web-content
base_model: HuggingFaceTB/SmolLM3-3B
pipeline_tag: text-generation
---
# SDF Extract
Structured data extractor for the [SDF Protocol](https://sdfprotocol.org). Fine-tuned from SmolLM3-3B using QLoRA.
## Purpose
Extracts structured semantic data from web content: entities, claims, relationships, summaries, and type-specific fields. Takes the type classification from [sdf-classify](https://huggingface.co/pranab2050/sdf-classify) as input to condition extraction on the content type.
## Training
- **Base model**: HuggingFaceTB/SmolLM3-3B
- **Method**: QLoRA (rank 32, alpha 64, dropout 0.05)
- **Training data**: 2,335 extracted web documents
- **Accuracy**: 90% exact extraction match across all field types
## Files
| File | Size | Description |
|------|------|-------------|
| `sdf-extract-SmolLM3-3B-Q4_K_M.gguf` | 1.8 GB | Quantized (Q4_K_M), recommended for deployment |
| `sdf-extract-SmolLM3-3B-f16.gguf` | 5.8 GB | Full precision (f16) |
| `Modelfile` | – | Ollama import configuration |
## Usage with Ollama
```bash
# Download sdf-extract-SmolLM3-3B-Q4_K_M.gguf and the Modelfile into the same directory, then:
ollama create sdf-extract -f Modelfile
```
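Once the model is imported, it can be served through Ollama's local REST API. The sketch below is a minimal example, assuming a default Ollama server on `localhost:11434`; the prompt template (prefixing the content with its type from sdf-classify) is an illustrative assumption, not the documented format for this model.

```python
import json
import urllib.request

OLLAMA_HOST = "http://localhost:11434"  # default Ollama endpoint

def build_request(content: str, content_type: str, model: str = "sdf-extract") -> dict:
    # Hypothetical prompt shape: condition extraction on the classifier's type label.
    # The model card does not specify the exact template, so adjust as needed.
    prompt = f"Content type: {content_type}\n\n{content}"
    return {"model": model, "prompt": prompt, "stream": False, "format": "json"}

def extract(content: str, content_type: str) -> str:
    # POST to Ollama's /api/generate and return the model's raw response text.
    payload = json.dumps(build_request(content, content_type)).encode("utf-8")
    req = urllib.request.Request(
        f"{OLLAMA_HOST}/api/generate",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

if __name__ == "__main__":
    # Requires a running Ollama server with the sdf-extract model imported.
    print(extract("ACME Corp announced a new product today.", "news_article"))
```

Setting `"format": "json"` asks Ollama to constrain the output to valid JSON, which suits structured extraction; drop it if the model's native output format differs.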
## Part of SDF Protocol
- **Protocol**: [sdfprotocol.org](https://sdfprotocol.org)
- **Specification**: [github.com/sdfprotocol/sdf](https://github.com/sdfprotocol/sdf)
- **Whitepaper**: [DOI 10.5281/zenodo.18559223](https://doi.org/10.5281/zenodo.18559223)
- **Classifier model**: [pranab2050/sdf-classify](https://huggingface.co/pranab2050/sdf-classify)
## Citation
```bibtex
@article{sarkar2026sdf,
  title={Convert Once, Consume Many: SDF for Cacheable, Typed Semantic Extraction from Web Pages},
  author={Sarkar, Pranab},
  year={2026},
  doi={10.5281/zenodo.18559223},
  publisher={Zenodo}
}
```