metadata
title: Structured Output Playground
emoji: 🔒
colorFrom: indigo
colorTo: green
sdk: gradio
sdk_version: 6.19.0
python_version: '3.12'
app_file: app.py
pinned: true
license: mit
short_description: Lock any LLM's output to a JSON schema

🔒 Structured Output Playground

Lock any LLM's output to a JSON schema. Paste free text, pick (or write) a schema, and a local model returns structured data that is guaranteed to conform — because the decoder is constrained to the schema at generation time, not asked nicely afterwards.

The point isn't the model. It's that schema-conformance is a property of the decoder. Right keys, right types, valid enums — every time.

The toggle is the demo

There's a Constraints ON / OFF switch.

  • ON — the JSON Schema becomes a grammar; the model can only emit tokens that keep the output valid and conformant. You always get the right shape, the right types, and valid enums.
  • OFF — the same model just tries. A good model often succeeds, but "often" isn't "always": watch it wrap the JSON in a markdown fence, or — more subtly — return valid JSON that violates the schema (a string where you asked for an integer, a value outside your enum). The Event example is built to show exactly this.
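The two OFF failure modes can be reproduced without a model. The sketch below is only an illustration: the schema is a hypothetical stand-in loosely modelled on the Event preset, the sample outputs are hand-written rather than captured from the model, and `check` is a deliberately tiny validator written for this example (the Space itself validates with jsonschema).

```python
import json

# Hypothetical event schema for illustration; not the Space's actual preset.
SCHEMA = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "attendees": {"type": "integer"},
        "format": {"type": "string", "enum": ["in-person", "online", "hybrid"]},
    },
    "required": ["name", "attendees", "format"],
}

# JSON Schema type names mapped to Python types (only the ones used above).
PY_TYPES = {"string": str, "integer": int}


def check(raw: str) -> list[str]:
    """Return a list of problems; an empty list means the output conforms."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return ["not parseable as JSON"]
    if not isinstance(data, dict):
        return ["top-level value is not an object"]
    problems = [f"missing key: {k}" for k in SCHEMA["required"] if k not in data]
    for key, rule in SCHEMA["properties"].items():
        if key not in data:
            continue
        value = data[key]
        # bool is a subclass of int in Python, so reject it explicitly.
        if not isinstance(value, PY_TYPES[rule["type"]]) or isinstance(value, bool):
            problems.append(f"{key}: expected {rule['type']}")
        elif "enum" in rule and value not in rule["enum"]:
            problems.append(f"{key}: not in enum")
    return problems


FENCE = "`" * 3  # built at runtime so this example can sit inside a fence
good = '{"name": "Launch party", "attendees": 40, "format": "hybrid"}'
fenced = f"{FENCE}json\n{good}\n{FENCE}"  # right data, wrapped in markdown
wrong_type = '{"name": "Launch party", "attendees": "forty", "format": "hybrid"}'
wrong_enum = '{"name": "Launch party", "attendees": 40, "format": "virtual"}'

for label, raw in [("good", good), ("fenced", fenced),
                   ("wrong_type", wrong_type), ("wrong_enum", wrong_enum)]:
    print(label, check(raw) or "conformant")
```

The last two samples are the subtle case: both parse as JSON, so a bare `json.loads` check would pass them, and only a check against the schema catches the problem.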

How it works

  • Model: Qwen2.5-3B-Instruct.
  • Inference: transformers on ZeroGPU (H200).
  • Constraint: Outlines turns the JSON Schema into a grammar, so only schema-valid token sequences are allowed.
  • Validation: every output is checked with jsonschema so you can see conformant vs. broken.
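To make "only schema-valid token sequences are allowed" concrete, here is a toy sketch of constrained decoding for a single enum field. Everything in it is an assumption made for illustration: the vocabulary, the `ranked_tokens` stand-in for a model, and the prefix check in `allowed`. It does not use Outlines and is not how Outlines is implemented internally; it only shows the general idea of filtering candidate tokens before one is picked.

```python
EOS = "<eos>"

# The only outputs the "grammar" accepts: enum values as JSON strings.
VALID = ['"in-person"', '"online"', '"hybrid"']

# Tiny hypothetical vocabulary; a real tokenizer has tens of thousands of tokens.
VOCAB = ['"', "on", "line", "hy", "brid", "in", "-", "person", "virtual", EOS]


def ranked_tokens(prefix: str) -> list[str]:
    """Toy stand-in for a language model: rank the whole vocabulary, best first."""
    if prefix == "":
        first = '"'
    elif prefix == '"':
        first = "virtual"  # the model's favourite value is not in the enum
    elif prefix.count('"') >= 2:
        first = EOS
    else:
        first = '"'
    return [first] + [t for t in VOCAB if t != first]


def allowed(prefix: str, token: str) -> bool:
    """A token is allowed if the output can still grow into a valid value."""
    if token == EOS:
        return prefix in VALID
    return any(v.startswith(prefix + token) for v in VALID)


def decode(constrained: bool, max_steps: int = 10) -> str:
    out = ""
    for _ in range(max_steps):
        candidates = ranked_tokens(out)
        if constrained:
            # The mask: drop every token that would leave the grammar.
            candidates = [t for t in candidates if allowed(out, t)]
        token = candidates[0]
        if token == EOS:
            break
        out += token
    return out


print("OFF:", decode(constrained=False))
print("ON: ", decode(constrained=True))
```

With the mask off, the toy model emits its preferred value, which is outside the enum. With the mask on, the same ranking is used but the disallowed tokens are never candidates, so the result is always one of the enum values.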

Four presets (contact, product, job posting, event) plus a Custom mode where you paste your own JSON Schema. All example texts are fictional.

About

Built by Ferr0 — infra-minded AI: local LLM inference, structured generation & tool-calling, offline RAG, defensive AI security. More at pixelium.win · GitHub.

License: MIT.