Instructions to use MasterControlAIML/DeepSeek-R1-Qwen2.5-3b-LLM-Judge-Reward-JSON-Unstructured-To-Structured-Lora-gguf with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use MasterControlAIML/DeepSeek-R1-Qwen2.5-3b-LLM-Judge-Reward-JSON-Unstructured-To-Structured-Lora-gguf with Transformers:
# Load model directly
from transformers import AutoModel
model = AutoModel.from_pretrained("MasterControlAIML/DeepSeek-R1-Qwen2.5-3b-LLM-Judge-Reward-JSON-Unstructured-To-Structured-Lora-gguf", dtype="auto")
- llama-cpp-python
How to use MasterControlAIML/DeepSeek-R1-Qwen2.5-3b-LLM-Judge-Reward-JSON-Unstructured-To-Structured-Lora-gguf with llama-cpp-python:
# !pip install llama-cpp-python
from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id="MasterControlAIML/DeepSeek-R1-Qwen2.5-3b-LLM-Judge-Reward-JSON-Unstructured-To-Structured-Lora-gguf",
    filename="unsloth.Q4_K_M.gguf",
)
llm.create_chat_completion(
    # messages must be a list of role/content dicts; replace the placeholder content with your own prompt
    messages=[{"role": "user", "content": "No input example has been defined for this model task."}]
)
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- llama.cpp
How to use MasterControlAIML/DeepSeek-R1-Qwen2.5-3b-LLM-Judge-Reward-JSON-Unstructured-To-Structured-Lora-gguf with llama.cpp:
Install from brew
brew install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf MasterControlAIML/DeepSeek-R1-Qwen2.5-3b-LLM-Judge-Reward-JSON-Unstructured-To-Structured-Lora-gguf:Q4_K_M
# Run inference directly in the terminal:
llama-cli -hf MasterControlAIML/DeepSeek-R1-Qwen2.5-3b-LLM-Judge-Reward-JSON-Unstructured-To-Structured-Lora-gguf:Q4_K_M
Install from WinGet (Windows)
winget install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf MasterControlAIML/DeepSeek-R1-Qwen2.5-3b-LLM-Judge-Reward-JSON-Unstructured-To-Structured-Lora-gguf:Q4_K_M
# Run inference directly in the terminal:
llama-cli -hf MasterControlAIML/DeepSeek-R1-Qwen2.5-3b-LLM-Judge-Reward-JSON-Unstructured-To-Structured-Lora-gguf:Q4_K_M
Use pre-built binary
# Download pre-built binary from:
# https://github.com/ggerganov/llama.cpp/releases
# Start a local OpenAI-compatible server with a web UI:
./llama-server -hf MasterControlAIML/DeepSeek-R1-Qwen2.5-3b-LLM-Judge-Reward-JSON-Unstructured-To-Structured-Lora-gguf:Q4_K_M
# Run inference directly in the terminal:
./llama-cli -hf MasterControlAIML/DeepSeek-R1-Qwen2.5-3b-LLM-Judge-Reward-JSON-Unstructured-To-Structured-Lora-gguf:Q4_K_M
Build from source code
git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
cmake -B build
cmake --build build -j --target llama-server llama-cli
# Start a local OpenAI-compatible server with a web UI:
./build/bin/llama-server -hf MasterControlAIML/DeepSeek-R1-Qwen2.5-3b-LLM-Judge-Reward-JSON-Unstructured-To-Structured-Lora-gguf:Q4_K_M
# Run inference directly in the terminal:
./build/bin/llama-cli -hf MasterControlAIML/DeepSeek-R1-Qwen2.5-3b-LLM-Judge-Reward-JSON-Unstructured-To-Structured-Lora-gguf:Q4_K_M
Use Docker
docker model run hf.co/MasterControlAIML/DeepSeek-R1-Qwen2.5-3b-LLM-Judge-Reward-JSON-Unstructured-To-Structured-Lora-gguf:Q4_K_M
- LM Studio
- Jan
- Ollama
How to use MasterControlAIML/DeepSeek-R1-Qwen2.5-3b-LLM-Judge-Reward-JSON-Unstructured-To-Structured-Lora-gguf with Ollama:
ollama run hf.co/MasterControlAIML/DeepSeek-R1-Qwen2.5-3b-LLM-Judge-Reward-JSON-Unstructured-To-Structured-Lora-gguf:Q4_K_M
- Unsloth Studio
How to use MasterControlAIML/DeepSeek-R1-Qwen2.5-3b-LLM-Judge-Reward-JSON-Unstructured-To-Structured-Lora-gguf with Unsloth Studio:
Install Unsloth Studio (macOS, Linux, WSL)
curl -fsSL https://unsloth.ai/install.sh | sh
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for MasterControlAIML/DeepSeek-R1-Qwen2.5-3b-LLM-Judge-Reward-JSON-Unstructured-To-Structured-Lora-gguf to start chatting
Install Unsloth Studio (Windows)
irm https://unsloth.ai/install.ps1 | iex
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for MasterControlAIML/DeepSeek-R1-Qwen2.5-3b-LLM-Judge-Reward-JSON-Unstructured-To-Structured-Lora-gguf to start chatting
Using HuggingFace Spaces for Unsloth
# No setup required
# Open https://huggingface.co/spaces/unsloth/studio in your browser
# Search for MasterControlAIML/DeepSeek-R1-Qwen2.5-3b-LLM-Judge-Reward-JSON-Unstructured-To-Structured-Lora-gguf to start chatting
- Pi
How to use MasterControlAIML/DeepSeek-R1-Qwen2.5-3b-LLM-Judge-Reward-JSON-Unstructured-To-Structured-Lora-gguf with Pi:
Start the llama.cpp server
# Install llama.cpp:
brew install llama.cpp
# Start a local OpenAI-compatible server:
llama-server -hf MasterControlAIML/DeepSeek-R1-Qwen2.5-3b-LLM-Judge-Reward-JSON-Unstructured-To-Structured-Lora-gguf:Q4_K_M
Configure the model in Pi
# Install Pi:
npm install -g @mariozechner/pi-coding-agent
# Add to ~/.pi/agent/models.json:
{
  "providers": {
    "llama-cpp": {
      "baseUrl": "http://localhost:8080/v1",
      "api": "openai-completions",
      "apiKey": "none",
      "models": [
        { "id": "MasterControlAIML/DeepSeek-R1-Qwen2.5-3b-LLM-Judge-Reward-JSON-Unstructured-To-Structured-Lora-gguf:Q4_K_M" }
      ]
    }
  }
}
Run Pi
# Start Pi in your project directory:
pi
- Hermes Agent
How to use MasterControlAIML/DeepSeek-R1-Qwen2.5-3b-LLM-Judge-Reward-JSON-Unstructured-To-Structured-Lora-gguf with Hermes Agent:
Start the llama.cpp server
# Install llama.cpp:
brew install llama.cpp
# Start a local OpenAI-compatible server:
llama-server -hf MasterControlAIML/DeepSeek-R1-Qwen2.5-3b-LLM-Judge-Reward-JSON-Unstructured-To-Structured-Lora-gguf:Q4_K_M
Configure Hermes
# Install Hermes:
curl -fsSL https://hermes-agent.nousresearch.com/install.sh | bash
hermes setup
# Point Hermes at the local server:
hermes config set model.provider custom
hermes config set model.base_url http://127.0.0.1:8080/v1
hermes config set model.default MasterControlAIML/DeepSeek-R1-Qwen2.5-3b-LLM-Judge-Reward-JSON-Unstructured-To-Structured-Lora-gguf:Q4_K_M
Run Hermes
hermes
- Docker Model Runner
How to use MasterControlAIML/DeepSeek-R1-Qwen2.5-3b-LLM-Judge-Reward-JSON-Unstructured-To-Structured-Lora-gguf with Docker Model Runner:
docker model run hf.co/MasterControlAIML/DeepSeek-R1-Qwen2.5-3b-LLM-Judge-Reward-JSON-Unstructured-To-Structured-Lora-gguf:Q4_K_M
- Lemonade
How to use MasterControlAIML/DeepSeek-R1-Qwen2.5-3b-LLM-Judge-Reward-JSON-Unstructured-To-Structured-Lora-gguf with Lemonade:
Pull the model
# Download Lemonade from https://lemonade-server.ai/
lemonade pull MasterControlAIML/DeepSeek-R1-Qwen2.5-3b-LLM-Judge-Reward-JSON-Unstructured-To-Structured-Lora-gguf:Q4_K_M
Run and chat with the model
lemonade run user.DeepSeek-R1-Qwen2.5-3b-LLM-Judge-Reward-JSON-Unstructured-To-Structured-Lora-gguf-Q4_K_M
List all available models
lemonade list
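The `llama-server` commands above start an OpenAI-compatible HTTP server. As a minimal sketch of how a script might talk to it, the snippet below builds a chat-completions request using only the Python standard library. It assumes the server listens on `http://localhost:8080/v1` (the base URL used in the Pi configuration above); the helper names `build_chat_request` and `send` and the sampling parameters are illustrative choices, not part of the model card.

```python
import json
import urllib.request

# Assumption: llama-server is reachable at the same base URL that the Pi
# configuration above uses.
BASE_URL = "http://localhost:8080/v1"


def build_chat_request(prompt: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build (but do not send) a POST request for the chat-completions endpoint."""
    payload = {
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.0,  # illustrative: deterministic output for extraction
        "max_tokens": 512,   # illustrative limit
    }
    return urllib.request.Request(
        url=f"{base_url}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request) -> str:
    """Send the request and return the assistant message text."""
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]


# Usage (requires a running llama-server):
#   print(send(build_chat_request("Convert this audit note to JSON: ...")))
```

Separating request construction from sending keeps the payload easy to inspect or log before any network call is made.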
Uploaded Model
| Field | Value |
|---|---|
| Developed by | MasterControlAIML |
| License | Apache 2.0 |
| Finetuned from | unsloth/Qwen2.5-3B-Instruct |
| Training Framework | Unsloth × Hugging Face TRL |
What's New?
*The protein-shake sequel to MasterControlAIML/DeepSeek-R1-Qwen2.5-1.5b-SFT-R1-JSON-Unstructured-To-Structured, now with more neurons, zero SFT, and a league of reward functions.*
| Upgrade | Explanation |
|---|---|
| Bigger Backbone | 1.5 B → 3 B Qwen 2.5 for bigger reasoning muscles. |
| Pure RL | No supervised fine-tuning: the policy is learned only from reward signals (GRPO). |
| LM-as-Judge | Separate LLM rates each candidate for correctness, JSON validity, style… |
| 2× Faster Train | Unsloth's flash-attention & fused ops = less VRAM, more speed. |
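The table mentions reward signals that score JSON validity, but the card does not publish the reward functions themselves. Purely as an illustration of what a rule-based validity reward could look like, here is a hypothetical sketch: the function name, the 0.5/0.5 weighting, and the key-coverage heuristic are all assumptions, not the authors' implementation.

```python
import json
from typing import Iterable


def json_validity_reward(completion: str, required_keys: Iterable[str]) -> float:
    """Hypothetical reward: 0.0 for output that is not a JSON object, otherwise
    0.5 plus up to 0.5 for the fraction of required top-level keys present."""
    try:
        obj = json.loads(completion)
    except json.JSONDecodeError:
        return 0.0
    if not isinstance(obj, dict):
        return 0.0
    keys = list(required_keys)
    if not keys:
        return 1.0
    present = sum(1 for key in keys if key in obj)
    return 0.5 + 0.5 * present / len(keys)


# Usage with the top-level keys of the Document schema:
#   json_validity_reward(candidate_json, ["documentTitle", "documentDate", "sections"])
```

A graded score like this gives the policy a signal even when the output parses but is incomplete, whereas a strict 0/1 check would not.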
Intended Use
- Convert messy prose, logs, or audit notes into a pristine JSON document that follows a complex, nested schema.
- Drop-in replacement for any pipeline using the older DeepSeek-R1 1.5 B structurer: just swap the checkpoint and enjoy the headroom.
How to Use (Reasoning + JSON)
The snippet below:
- Primes the model with the exact Pydantic schema, so it outputs the right keys.
- Makes the model think step-by-step (reasoning) but still wraps the final JSON in an easy-to-parse container.
- Uses the correct repo name: MasterControlAIML/DeepSeek-R1-Qwen2.5-3b-LLM-Judge-Reward-JSON-Unstructured-To-Structured-Lora.
# -----------------------------------------------------------------------------
# QUICK-START
# Structured-data extraction with reasoning + JSON output
# -----------------------------------------------------------------------------
from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline
import torch, json, textwrap, inspect
from pydantic import BaseModel
from typing import List, Optional

MODEL = "MasterControlAIML/DeepSeek-R1-Qwen2.5-3b-LLM-Judge-Reward-JSON-Unstructured-To-Structured-Lora"
# 1. Inline schema (keeps the LLM on-rails) ------------------------------------
class MultipleChoice(BaseModel):
    question: str
    options: List[str]
    selected: str

class FormField(BaseModel):
    fieldName: str
    value: str
    notes: Optional[str] = ""

class Calculation(BaseModel):
    formula: str
    result: str
    notes: Optional[str] = ""

class Metadata(BaseModel):
    reportDate: str
    auditorId: Optional[str] = None
    comments: Optional[str] = None

class Content(BaseModel):
    paragraphs: List[str]
    tables: List["Table"]          # assume Table defined elsewhere
    checkboxes: List["Checkbox"]   # assume Checkbox defined elsewhere
    multipleChoice: List[MultipleChoice]
    formFields: List[FormField]
    calculations: List[Calculation]
    metadata: Optional[Metadata] = Metadata(reportDate="")

class Section(BaseModel):
    id: str
    title: str
    content: Content

class Document(BaseModel):
    documentTitle: str
    documentDate: str
    sections: List[Section]

SCHEMA_TEXT = inspect.getsource(Document)
# 2. Build prompts ------------------------------------------------------------
SYSTEM_PROMPT = textwrap.dedent(f"""
You are an expert **data-extraction assistant**.
Extract structured info from unstructured text **exactly** following the Pydantic schema.
-- Schema --
{SCHEMA_TEXT}
------------
Rules:
1. Follow the schema for keys & nesting.
2. Copy values verbatim when possible.
3. If a field is missing, return null.
4. Output your step-by-step reasoning first.
5. Then return ONLY the JSON inside this wrapper:
final answer[ json object: {{ ... }} ]
Format:
<reasoning>…</reasoning>
<answer>
final answer[ json object: {{ … }} ]
</answer>
""").strip()

UNSTRUCTURED_TEXT = """
12 April 2025 - Onsite audit performed by Jane Smith.
Observations: Two fire extinguishers past expiry; emergency lights functional.
Calculations: Total extinguishers = 8, expired = 2 → 25 % overdue.
"""

USER_PROMPT = textwrap.dedent(f"""
### Task
Convert the following *hier* text to the schema.
### hier
{UNSTRUCTURED_TEXT}
""").strip()
# 3. Generate -----------------------------------------------------------------
tok = AutoTokenizer.from_pretrained(MODEL, use_fast=True)
model = AutoModelForCausalLM.from_pretrained(
    MODEL,
    device_map="auto",
    torch_dtype=torch.bfloat16
)
gen = pipeline("text-generation", model=model, tokenizer=tok,
               max_new_tokens=512, do_sample=False)
prompt = f"<|system|>\n{SYSTEM_PROMPT}\n<|user|>\n{USER_PROMPT}"
raw_out = gen(prompt)[0]["generated_text"]
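# 4. Slice out the JSON -------------------------------------------------------
# generated_text echoes the prompt, so search for the LAST wrapper occurrence.
start = raw_out.rfind("final answer[")
json_text = raw_out[start:].split("json object:", 1)[-1]
# Keep only the outermost {...}; this drops the closing "]" and "</answer>".
json_text = json_text[json_text.find("{"): json_text.rfind("}") + 1]
data = json.loads(json_text)  # raises if malformed
print(raw_out)                # reasoning + JSON
print("\nParsed object:\n", data)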
Why it Works
- Schema-priming ensures key-level fidelity: no "creative" field names.
- Chain-of-thought improves factual extraction (was rewarded during GRPO).
- The final answer[…] wrapper makes downstream parsing a one-liner.
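To make the parsing point concrete, the following sketch wraps the extraction in a reusable function. It assumes the model follows the `<answer>` / `final answer[ json object: {...} ]` format requested in the system prompt; the function name `extract_final_json` and the fallback behaviour are illustrative, not part of the original snippet.

```python
import json
import re

# Matches the <answer>...</answer> container requested in the system prompt.
ANSWER_RE = re.compile(r"<answer>(.*?)</answer>", re.DOTALL)


def extract_final_json(text: str):
    """Return the JSON object from the last answer wrapper in the model output."""
    # Prefer the last <answer> block so a prompt echoed in the output is skipped.
    blocks = ANSWER_RE.findall(text)
    candidate = blocks[-1] if blocks else text
    marker = candidate.rfind("json object:")
    if marker != -1:
        candidate = candidate[marker + len("json object:"):]
    # Keep only the outermost {...}, dropping the closing "]" of the wrapper.
    start, end = candidate.find("{"), candidate.rfind("}")
    if start == -1 or end < start:
        raise ValueError("no JSON object found in model output")
    return json.loads(candidate[start:end + 1])
```

The function raises `ValueError` when no object is present and lets `json.JSONDecodeError` propagate for malformed JSON, so callers can tell a missing answer apart from an invalid one.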
Training Recipe (Condensed)
| Setting | Value |
|---|---|
| Algorithm | GRPO (policy: LM; reward LM: Qwen2.5-7B w/ JSON-validator head) |
| Epochs | 3 (effective) |
| Batch | Grad-accum 8, bfloat16 |
| Optimizer | Fused AdamW |
| Throughput | ≈ 45 k tokens/s on 8×A100 |
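For readers unfamiliar with GRPO, its core idea is to score several completions per prompt and normalize each reward against its own group rather than against a learned value function. The sketch below shows that normalization in its commonly described form; the card does not state the group size or the exact normalization used in this run, so the function name, the population standard deviation, and the `eps` constant are assumptions.

```python
from statistics import mean, pstdev
from typing import List


def group_relative_advantages(rewards: List[float], eps: float = 1e-8) -> List[float]:
    """Normalize the rewards of one prompt's completions against the group mean.

    Completions that score above the group average receive a positive
    advantage, those below receive a negative one.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; eps avoids division by zero
    return [(r - mu) / (sigma + eps) for r in rewards]


# Usage: rewards assigned by the judge to four completions of the same prompt.
#   advantages = group_relative_advantages([1.0, 0.0, 0.5, 0.5])
```

When every completion in a group receives the same reward, all advantages are zero and that group contributes no gradient signal.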
Evaluation (WIP)
| Metric | Status |
|---|---|
| Exact-Match JSON Accuracy | Pending |
| Structural F1 | Pending |
| Valid-JSON Rate | Pending |
Stay tuned: numbers landing faster than you can say "schema validation."
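The card does not define these metrics, so the sketch below shows one plausible way to compute them. The definitions are assumptions: exact match compares parsed objects for equality, valid-JSON rate is the share of outputs that parse, and structural F1 is computed over sets of dotted key paths. All function names are hypothetical.

```python
import json
from typing import Any, Dict, List, Set

_INVALID = object()  # sentinel for outputs that fail to parse


def try_parse(text: str) -> Any:
    try:
        return json.loads(text)
    except (json.JSONDecodeError, TypeError):
        return _INVALID


def key_paths(obj: Any, prefix: str = "") -> Set[str]:
    """Collect dotted key paths; all list elements share the segment '[]'."""
    paths: Set[str] = set()
    if isinstance(obj, dict):
        for key, value in obj.items():
            path = f"{prefix}.{key}" if prefix else key
            paths.add(path)
            paths |= key_paths(value, path)
    elif isinstance(obj, list):
        for item in obj:
            paths |= key_paths(item, f"{prefix}[]")
    return paths


def structural_f1(pred: Any, gold: Any) -> float:
    p, g = key_paths(pred), key_paths(gold)
    if not p and not g:
        return 1.0
    overlap = len(p & g)
    if overlap == 0:
        return 0.0
    precision, recall = overlap / len(p), overlap / len(g)
    return 2 * precision * recall / (precision + recall)


def evaluate(predictions: List[str], references: List[dict]) -> Dict[str, float]:
    if not predictions or len(predictions) != len(references):
        raise ValueError("predictions and references must be non-empty and aligned")
    parsed = [try_parse(p) for p in predictions]
    n = len(predictions)
    pairs = list(zip(parsed, references))
    return {
        "valid_json_rate": sum(obj is not _INVALID for obj in parsed) / n,
        "exact_match": sum(obj is not _INVALID and obj == ref for obj, ref in pairs) / n,
        # Unparseable outputs contribute an F1 of 0.
        "structural_f1": sum(
            structural_f1(obj, ref) for obj, ref in pairs if obj is not _INVALID
        ) / n,
    }
```

Structural F1 here ignores values and looks only at which keys appear where, which complements exact match by giving partial credit for a correct skeleton.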
Citation
@misc{bhaviktheslider_2025_unsloth_qwen2.5_3b_grpo,
title = {An Unsloth-accelerated GRPO-trained Qwen 2.5-3B for JSON structuring},
author = {MasterControlAIML},
year = {2025},
howpublished = {\url{https://huggingface.co/MasterControlAIML/DeepSeek-R1-Qwen2.5-3b-LLM-Judge-Reward-JSON-Unstructured-To-Structured-Lora}}
}
May your JSON always parse and your losses always converge!