Update README.md
README.md
@@ -40,6 +40,29 @@ The result is a highly accurate extraction model that:
_(The model will output a valid JSON array of schedule objects containing fields such as `opens_at`, `closes_at`, `freq`, `interval`, `byday`, etc.)_

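As a rough illustration of how the output described above might be consumed, the sketch below turns one schedule object into an RFC 5545 `RRULE` value. Only the field names `freq`, `interval`, and `byday` come from the README; the value formats (an upper-case frequency such as `WEEKLY`, a comma-separated or list-valued `byday`), the sample payload, and the helper name `to_rrule` are assumptions made for this example.

```python
import json


def to_rrule(schedule: dict) -> str:
    """Build an RRULE value from one schedule object (hypothetical helper)."""
    parts = [f"FREQ={schedule['freq'].upper()}"]

    # INTERVAL defaults to 1 in RFC 5545, so it is only emitted when it differs.
    interval = schedule.get("interval")
    if interval and int(interval) != 1:
        parts.append(f"INTERVAL={int(interval)}")

    # Assumed: byday is either "MO,TU" or ["MO", "TU"].
    byday = schedule.get("byday")
    if byday:
        if isinstance(byday, str):
            byday = byday.split(",")
        parts.append("BYDAY=" + ",".join(day.strip().upper() for day in byday))

    return ";".join(parts)


# Assumed sample of what the model might return for "Mon-Fri 9am-5pm".
raw_output = (
    '[{"opens_at": "09:00", "closes_at": "17:00", '
    '"freq": "WEEKLY", "interval": 1, "byday": "MO,TU,WE,TH,FR"}]'
)
schedules = json.loads(raw_output)
rrules = [to_rrule(s) for s in schedules]
print(rrules)  # ['FREQ=WEEKLY;BYDAY=MO,TU,WE,TH,FR']
```

The opening and closing times are left out of the rule string on purpose: in iCalendar they would normally live in `DTSTART`/`DTEND` rather than in the `RRULE` itself.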
## Available Model Files

This repository includes both the full-precision safetensors weights and a quantized GGUF for flexible usage:

| File(s) | Format | Precision | Size | Use Case |
|---|---|---|---|---|
| `model-0000[1-4]-of-00004.safetensors` | SafeTensors | BF16 | ~21 GB | Further fine-tuning / research |
| `model-q4_k_m.gguf` | GGUF | Q4_K_M | ~4.7 GB | Inference / deployment |

### Full-Precision Weights (SafeTensors)

The four `.safetensors` shards contain the complete model weights at **16-bit (F16) precision**, the native precision at which the model was trained and fine-tuned. F16 was chosen over BF16 to keep the door open for future reinforcement learning techniques, should I get the time to revisit this project when we look to deploy this model, or a smaller QAT fine-tuned 4B-parameter model, at scale. Either way, the safetensors are provided for researchers and practitioners who want to:

- Continue fine-tuning on additional or domain-specific schedule data.
- Experiment with alternative quantization schemes.
- Run evaluations at full precision.

### Quantized GGUF (Q4_K_M)

The `model-q4_k_m.gguf` file is the recommended file for most inference use cases. It was produced using **Quantization-Aware Training (QAT)**, a technique that simulates the effects of quantization *during* training so that the model can adapt its weights to minimize accuracy loss before the final quantization step is applied. This is in contrast to post-training quantization (PTQ), which quantizes a fully trained model without any opportunity for weight adaptation.

The practical result is that the Q4_K_M model retains the vast majority of the full-precision model's accuracy at a fraction of the memory footprint (~4.7 GB versus ~21 GB), making it well-suited for local inference and production deployment. For a deeper technical explanation of how QAT enables low-precision accuracy recovery, see [NVIDIA's overview here](https://developer.nvidia.com/blog/how-quantization-aware-training-enables-low-precision-accuracy-recovery/).

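To make "simulating quantization during training" more concrete, here is a minimal sketch of the fake-quantization step that QAT setups commonly place in the forward pass: weights are rounded to a low-bit grid and mapped straight back to floats, so the loss sees the rounding error while the stored weights remain in floating point. This is a generic per-tensor, symmetric 4-bit illustration and an assumption on our part, not the recipe used for this model; the Q4_K_M format in llama.cpp uses a block-wise scheme that differs from this simple grid.

```python
def fake_quantize(weights, bits=4):
    """Round weights to a symmetric signed integer grid, then dequantize.

    Illustrative only: one scale for the whole list, no blocks, no zero point.
    """
    qmax = 2 ** (bits - 1) - 1                 # 7 for signed 4-bit
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / qmax if max_abs > 0 else 1.0

    dequantized = []
    for w in weights:
        q = round(w / scale)                   # nearest grid index
        q = max(-qmax - 1, min(qmax, q))       # clamp to the signed range
        dequantized.append(q * scale)          # back to float for the forward pass
    return dequantized


weights = [0.7, -0.35, 0.1, 0.0]
simulated = fake_quantize(weights)
errors = [abs(w - s) for w, s in zip(weights, simulated)]
print(simulated)
print(errors)
```

During QAT the rounding step is typically treated as the identity in the backward pass (a straight-through estimator), so gradients still reach the underlying float weights. With PTQ, by contrast, the same rounding is applied once after training and the error is never seen by the optimizer.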
## Training Data

The model was fine-tuned on examples of 211 schedule data, consisting of unstructured schedule strings annotated with their corresponding valid iCalendar-compliant (RFC 5545) RRULEs.

@@ -48,3 +71,11 @@ The model was fine-tuned on examples of 211 schedule data consisting of unstruct
- Designed primarily for English-language schedule descriptions.
- Output should still be validated with a JSON parser to ensure strict downstream compatibility, although the model is trained to emit valid JSON natively.

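The validation step recommended above could look roughly like the following. This is a sketch using only the Python standard library; the function name `parse_schedules` and the choice of `freq` as the one required key are assumptions for illustration, since the README does not specify which fields are mandatory.

```python
import json

# Assumption: treat "freq" as the minimum a usable schedule object must carry.
REQUIRED_KEYS = {"freq"}


def parse_schedules(raw: str) -> list:
    """Parse model output and check that it is a JSON array of objects."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"model output is not valid JSON: {exc}") from exc

    if not isinstance(data, list):
        raise ValueError("expected a JSON array of schedule objects")

    for index, item in enumerate(data):
        if not isinstance(item, dict):
            raise ValueError(f"item {index} is not a JSON object")
        missing = REQUIRED_KEYS - item.keys()
        if missing:
            raise ValueError(f"item {index} is missing keys: {sorted(missing)}")

    return data


schedules = parse_schedules('[{"freq": "WEEKLY", "byday": "MO"}]')
print(len(schedules), "schedule object(s) accepted")
```

Raising a single exception type keeps the calling code simple: any `ValueError` means the generation should be retried or routed to manual review rather than passed downstream.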
## Future Work & Significance

At 8B parameters, this model carries capabilities well beyond what schedule extraction requires. Its generalized architecture handles the task reliably, but we believe similar extraction quality is achievable with a much smaller model, potentially in the 1–4B range, trained on the same labeled dataset. A purpose-built smaller model would dramatically reduce inference cost, latency, and memory footprint, which matters at the scale at which 211 networks operate.

Looking ahead, the ~10k labeled examples used here could be expanded to an estimated 80k+ training examples, and the same fine-tuning methodology could be applied to other structured fields across 211 datasets beyond schedules.

More broadly, this model serves as a working proof of concept for a meaningful hypothesis: that the semantically rich, human-stewarded data maintained by 211 networks is not limited by its lack of machine-readable structure. With targeted fine-tuning, AI can bridge that gap, preserving the accuracy and nuance of human curation while producing the structured outputs that governments, hospitals, researchers, and software providers need to build effective solutions for society's biggest problems. The bottleneck is not the data. The bottleneck is tooling, and that bottleneck is solvable.