MasterControlAIML/DeepSeek-R1-Qwen2.5-3b-LLM-Judge-Reward-JSON-Unstructured-To-Structured-Lora-gguf

📦 Uploaded Model

| Field | Value |
|---|---|
| Developed by | MasterControlAIML |
| License | Apache 2.0 |
| Finetuned from | `unsloth/Qwen2.5-3B-Instruct` |
| Training Framework | Unsloth × Hugging Face TRL |

🚀 What’s New?

*The protein-shake sequel to MasterControlAIML/DeepSeek-R1-Qwen2.5-1.5b-SFT-R1-JSON-Unstructured-To-Structured—now with more neurons, zero SFT, and a league of reward functions.*

| Upgrade | Explanation |
|---|---|
| Bigger Backbone | 1.5 B → 3 B Qwen 2.5 for bigger reasoning muscles. |
| Pure RL | No supervised fine-tuning: policy learned only from reward signals (GRPO). |
| LM-as-Judge | Separate LLM rates each candidate for correctness, JSON validity, style… |
| 2× Faster Train | Unsloth’s flash-attention & fused ops = less VRAM, more speed. |
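
To make the "Pure RL" row concrete: GRPO samples several completions for the same prompt, scores each one, and normalizes the rewards within that group, so no separate value network is needed. The sketch below shows only that normalization step. The function name, the `eps` constant, and the example scores are illustrative assumptions, not taken from this model's training code, and implementations differ on details such as population versus sample standard deviation.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize one prompt's rewards to zero mean and unit spread (GRPO-style)."""
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; an assumption, see the note above
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical judge scores for four candidate outputs of the same prompt.
advantages = group_relative_advantages([0.2, 0.9, 0.5, 0.4])
```

Candidates that score above the group mean get a positive advantage and are reinforced; those below the mean are pushed down. When every candidate gets the same score, all advantages are zero and the group contributes no gradient.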

🛠️ Intended Use

- Convert messy prose, logs, or audit notes into a pristine JSON document that follows a complex, nested schema.
- Drop-in replacement for any pipeline using the older DeepSeek-R1 1.5 B structurer: just swap the checkpoint and enjoy the headroom.

🔧 How to Use (Reasoning + JSON)

The snippet below:

- Primes the model with the exact Pydantic schema, so it outputs the right keys.
- Makes the model think step-by-step (reasoning) but still wraps the final JSON in an easy-to-parse container.
- Uses the correct repo name: MasterControlAIML/DeepSeek-R1-Qwen2.5-3b-LLM-Judge-Reward-JSON-Unstructured-To-Structured-Lora.

# ─────────────────────────────────────────────────────────────────────────────
# QUICK-START
# Structured-data extraction with reasoning + JSON output
# ─────────────────────────────────────────────────────────────────────────────
from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline
import torch, json, textwrap, inspect
from pydantic import BaseModel
from typing import List, Optional

MODEL = "MasterControlAIML/DeepSeek-R1-Qwen2.5-3b-LLM-Judge-Reward-JSON-Unstructured-To-Structured-Lora"

# 1️⃣  Inline schema (keeps the LLM on-rails) ─────────────────────────────────
class MultipleChoice(BaseModel):
    question: str
    options: List[str]
    selected: str

class FormField(BaseModel):
    fieldName: str
    value: str
    notes: Optional[str] = ""

class Calculation(BaseModel):
    formula: str
    result: str
    notes: Optional[str] = ""

class Metadata(BaseModel):
    reportDate: str
    auditorId: Optional[str] = None
    comments: Optional[str] = None

class Content(BaseModel):
    paragraphs: List[str]
    tables: List["Table"]          # assume Table defined elsewhere
    checkboxes: List["Checkbox"]   #          〃
    multipleChoice: List[MultipleChoice]
    formFields: List[FormField]
    calculations: List[Calculation]
    metadata: Optional[Metadata] = Metadata(reportDate="")

class Section(BaseModel):
    id: str
    title: str
    content: Content

class Document(BaseModel):
    documentTitle: str
    documentDate: str
    sections: List[Section]

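# Include every model's source, not just Document, so the prompt shows the
# full nesting. inspect.getsource needs the classes to live in a file or
# notebook cell. The indent/strip keeps textwrap.dedent working below.
SCHEMA_TEXT = textwrap.indent(
    "\n".join(
        inspect.getsource(cls)
        for cls in (MultipleChoice, FormField, Calculation,
                    Metadata, Content, Section, Document)
    ),
    "    ",
).strip()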

# 2️⃣  Build prompts ──────────────────────────────────────────────────────────
SYSTEM_PROMPT = textwrap.dedent(f"""
    You are an expert **data-extraction assistant**.
    Extract structured info from unstructured text **exactly** following the Pydantic schema.

    ── Schema ──
    {SCHEMA_TEXT}
    ─────────────

    Rules:
      1. Follow the schema for keys & nesting.
      2. Copy values verbatim when possible.
      3. If a field is missing, return null.
      4. Output your step-by-step reasoning first.
      5. Then return ONLY the JSON inside this wrapper:
         final answer[ json object: {{ ... }} ]

    Format:
      <reasoning>…</reasoning>
      <answer>
      final answer[ json object: {{ … }} ]
      </answer>
""").strip()

UNSTRUCTURED_TEXT = """
    12 April 2025 – Onsite audit performed by Jane Smith.
    Observations: Two fire extinguishers past expiry; emergency lights functional.
    Calculations: Total extinguishers = 8, expired = 2 → 25 % overdue.
"""

USER_PROMPT = textwrap.dedent(f"""
    ### Task
    Convert the following *hier* text to the schema.

    ### hier
    {UNSTRUCTURED_TEXT}
""").strip()

# 3️⃣  Generate ───────────────────────────────────────────────────────────────
tok   = AutoTokenizer.from_pretrained(MODEL, use_fast=True)
model = AutoModelForCausalLM.from_pretrained(
    MODEL,
    device_map="auto",
    torch_dtype=torch.bfloat16
)
gen = pipeline("text-generation", model=model, tokenizer=tok,
               max_new_tokens=512, do_sample=False)

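# Build the prompt with the tokenizer's chat template rather than hand-written
# role tags (assumes this checkpoint ships the Qwen2.5 chat template).
messages = [
    {"role": "system", "content": SYSTEM_PROMPT},
    {"role": "user", "content": USER_PROMPT},
]
prompt = tok.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
# return_full_text=False keeps only the completion, so the prompt's own
# "final answer[" example cannot be mistaken for the model's answer.
raw_out = gen(prompt, return_full_text=False)[0]["generated_text"]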

# 4️⃣  Slice out the JSON ─────────────────────────────────────────────────────
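# Take the text after the LAST "json object:" marker (the prompt may repeat it),
# then let the decoder read exactly one JSON value and ignore whatever follows,
# such as the closing "]" and </answer>.
payload = raw_out.rsplit("json object:", 1)[-1]
data, _ = json.JSONDecoder().raw_decode(payload[payload.index("{"):])  # raises ValueError if missing or malformed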

print(raw_out)  # reasoning + JSON
print("\n✅ Parsed object:\n", data)

Why it Works 🧐

- Schema priming keeps the keys faithful to the schema: no “creative” field names.
- Step-by-step reasoning was rewarded during GRPO and is intended to improve factual extraction.
- The final answer[…] wrapper gives downstream code a fixed marker to parse from.
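
As an illustration of the last point, the wrapper can be parsed with the standard library alone. This is a minimal sketch, assuming the completion follows the `<reasoning>` / `<answer>` format requested in the system prompt. `parse_completion` and the sample text are hypothetical helpers written for this example, not part of the model's tooling.

```python
import json
import re

REASONING_RE = re.compile(r"<reasoning>(.*?)</reasoning>", re.DOTALL)
ANSWER_RE = re.compile(r"<answer>(.*?)</answer>", re.DOTALL)


def parse_completion(text):
    """Return (reasoning, data) from a completion in the card's output format."""
    reasoning = REASONING_RE.search(text)
    answer = ANSWER_RE.search(text)
    body = answer.group(1) if answer else text  # tolerate a missing </answer>
    _, marker, payload = body.rpartition("json object:")
    if not marker or "{" not in payload:
        raise ValueError("no 'json object:' payload found")
    # raw_decode reads one JSON value and ignores the trailing "]".
    data, _ = json.JSONDecoder().raw_decode(payload[payload.index("{"):])
    return (reasoning.group(1).strip() if reasoning else ""), data


# Hand-written sample completion, not real model output.
sample = """<reasoning>Two of eight extinguishers are expired.</reasoning>
<answer>
final answer[ json object: {"documentTitle": "Audit", "sections": []} ]
</answer>"""

reasoning, data = parse_completion(sample)
```

Using `raw_decode` instead of stripping brackets by hand avoids accidentally trimming a JSON value that itself ends in `]`, and a malformed payload still raises an error rather than passing silently.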

🏋️ Training Recipe (Condensed)

| Setting | Value |
|---|---|
| Algorithm | GRPO – policy ≈ LM, reward LM ≈ `Qwen2.5-7B` w/ JSON-validator head |
| Epochs | 3 (effective) |
| Batch | Grad-accum 8, bfloat16 |
| Optimizer | Fused AdamW |
| Throughput | ≈ 45 k tokens/s on 8×A100 |
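
The card does not publish its reward functions, so the following is only a sketch of what rule-based rewards for this task could look like alongside the LLM judge: one checks the `<reasoning>` / `<answer>` layout, the other checks that the wrapped payload parses as JSON. All function names and the equal weights are assumptions made for illustration.

```python
import json


def format_reward(completion):
    """1.0 if <reasoning>…</reasoning> precedes <answer>…</answer>, else 0.0."""
    r_open = completion.find("<reasoning>")
    r_close = completion.find("</reasoning>")
    a_open = completion.find("<answer>")
    a_close = completion.find("</answer>")
    return 1.0 if 0 <= r_open < r_close < a_open < a_close else 0.0


def json_validity_reward(completion):
    """1.0 if the text after the last 'json object:' starts with valid JSON."""
    _, marker, payload = completion.rpartition("json object:")
    start = payload.find("{")
    if not marker or start < 0:
        return 0.0
    try:
        json.JSONDecoder().raw_decode(payload[start:])
    except json.JSONDecodeError:
        return 0.0
    return 1.0


def total_reward(completion, weights=(0.5, 0.5)):
    """Weighted sum of the two checks; the weights are arbitrary placeholders."""
    return (weights[0] * format_reward(completion)
            + weights[1] * json_validity_reward(completion))
```

In a GRPO setup, scores like these would be computed for every sampled completion and combined with the judge model's rating before the group-wise normalization.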

📊 Evaluation (WIP)

| Metric | Status |
|---|---|
| Exact-Match JSON Accuracy | 🔜 |
| Structural F1 | 🔜 |
| Valid-JSON Rate | 🔜 |

Stay tuned—numbers landing faster than you can say “schema validation.” 🛰️
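
Until official numbers arrive, the three metrics can be approximated locally. The card does not define them, so the definitions below are assumptions: valid-JSON rate is the share of outputs that `json.loads` accepts, exact match compares the parsed object with the gold object, and structural F1 compares the sets of leaf paths while ignoring the values. `flatten`, `structural_f1`, and `evaluate` are hypothetical helpers.

```python
import json


def flatten(obj, prefix=""):
    """Yield (path, value) for every leaf of a nested JSON value."""
    if isinstance(obj, dict):
        for key, value in obj.items():
            yield from flatten(value, f"{prefix}.{key}" if prefix else key)
    elif isinstance(obj, list):
        for i, value in enumerate(obj):
            yield from flatten(value, f"{prefix}[{i}]")
    else:
        yield prefix, obj


def structural_f1(pred, gold):
    """F1 over leaf paths only; leaf values are not compared."""
    p = {path for path, _ in flatten(pred)}
    g = {path for path, _ in flatten(gold)}
    if not p or not g:
        return float(p == g)
    overlap = len(p & g)
    if overlap == 0:
        return 0.0
    precision, recall = overlap / len(p), overlap / len(g)
    return 2 * precision * recall / (precision + recall)


def evaluate(pairs):
    """pairs: non-empty list of (predicted JSON string, gold object)."""
    valid = exact = 0
    f1_sum = 0.0
    for text, gold in pairs:
        try:
            pred = json.loads(text)
        except json.JSONDecodeError:
            continue  # invalid JSON scores zero on every metric
        valid += 1
        exact += pred == gold
        f1_sum += structural_f1(pred, gold)
    n = len(pairs)
    return {
        "valid_json_rate": valid / n,
        "exact_match": exact / n,
        "structural_f1": f1_sum / n,
    }
```

Feed `evaluate` the JSON text already extracted from the `final answer[…]` wrapper, paired with hand-labeled gold objects for the same inputs.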

🤝 Citation

@misc{bhaviktheslider_2025_unsloth_qwen2.5_3b_grpo,
  title  = {An Unsloth-accelerated GRPO-trained Qwen 2.5-3B for JSON structuring},
  author = {MasterControlAIML},
  year   = {2025},
  howpublished = {\url{https://huggingface.co/MasterControlAIML/DeepSeek-R1-Qwen2.5-3b-LLM-Judge-Reward-JSON-Unstructured-To-Structured-Lora}}
}

May your JSON always parse and your losses always converge! 😎
