Model Card for schemaquake-lora

This model is a fine-tuned version of Qwen/Qwen2.5-0.5B-Instruct.

It was trained with GRPO, using TRL, on SchemaQuake, an OpenEnv-compatible environment for training LLM agents to handle silent schema and policy drift in professional workflows.

What is SchemaQuake?

SchemaQuake is a travel-booking environment designed to test whether an agent can notice when the world changes underneath it.

The agent receives a task such as:

Book a refundable flight from BLR to DEL under 8000 rupees.

During the episode, the environment may silently change:

  • a field name, such as price_rupees being renamed to ticket_price
  • a unit, such as prices switching from rupees to paise
  • a refundability field, such as a boolean being replaced by a refund tier
  • a policy document, such as a 24-hour cancellation window becoming 48 hours
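The drift types listed above can be illustrated with a small sketch. This is not the SchemaQuake implementation: the record layout, the drift names ("rename", "unit", "refund_tier"), the tier labels, and both helper functions are assumptions made only for illustration.

```python
from typing import Any

# Hypothetical flight record before drift; field names follow the examples above.
BASELINE = {"price_rupees": 7500, "refundable": True}


def apply_drift(record: dict[str, Any], kind: str) -> dict[str, Any]:
    """Return a drifted copy of `record` (illustrative drift kinds only)."""
    drifted = dict(record)
    if kind == "rename":
        # Field rename: price_rupees becomes ticket_price.
        drifted["ticket_price"] = drifted.pop("price_rupees")
    elif kind == "unit":
        # Unit change: the value is now in paise, but the field name is unchanged.
        drifted["price_rupees"] = drifted["price_rupees"] * 100
    elif kind == "refund_tier":
        # Representation change: boolean becomes a tier label (labels are assumed).
        drifted["refundable"] = "FULL" if drifted["refundable"] else "NONE"
    else:
        raise ValueError(f"unknown drift kind: {kind}")
    return drifted


def detect_drift(expected: dict[str, Any], observed: dict[str, Any]) -> list[str]:
    """Compare field names and value types; return human-readable findings."""
    findings = []
    for name in sorted(expected.keys() - observed.keys()):
        findings.append(f"missing field: {name}")
    for name in sorted(observed.keys() - expected.keys()):
        findings.append(f"unexpected field: {name}")
    for name in sorted(expected.keys() & observed.keys()):
        if type(expected[name]) is not type(observed[name]):
            findings.append(f"type changed: {name}")
    return findings


for kind in ("rename", "unit", "refund_tier"):
    print(kind, detect_drift(BASELINE, apply_drift(BASELINE, kind)))
```

In this sketch the rename and the boolean-to-tier change are visible from field names and types alone, while the unit change leaves the structure intact. That is why a purely structural check is not enough, and why inspecting the schema or policy matters for the agent.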

The model is rewarded for completing the user’s real task, detecting drift, inspecting schema or policy when needed, avoiding silent violations, and acting efficiently.
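One way to picture how these reward components might combine is the sketch below. The actual SchemaQuake reward function is not documented in this card, so the EpisodeSummary fields, the weights, and the step budget are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class EpisodeSummary:
    """Hypothetical per-episode signals matching the reward components above."""
    task_completed: bool
    drift_detected: bool
    inspected_when_needed: bool
    silent_violations: int
    steps_taken: int


def episode_reward(summary: EpisodeSummary, max_steps: int = 10) -> float:
    """Combine the components with illustrative weights (assumed, not from SchemaQuake)."""
    reward = 0.0
    reward += 1.0 if summary.task_completed else 0.0
    reward += 0.5 if summary.drift_detected else 0.0
    reward += 0.25 if summary.inspected_when_needed else 0.0
    # Each silent violation is penalized as heavily as task completion is rewarded.
    reward -= 1.0 * summary.silent_violations
    # Efficiency bonus shrinks linearly as the episode uses up its step budget.
    reward += 0.25 * max(0.0, 1.0 - summary.steps_taken / max_steps)
    return reward


careful = EpisodeSummary(True, True, True, silent_violations=0, steps_taken=5)
careless = EpisodeSummary(True, False, False, silent_violations=1, steps_taken=10)
print(episode_reward(careful), episode_reward(careless))
```

With these assumed weights, an agent that completes the booking but silently violates the budget or policy gains nothing over not acting, which reflects the emphasis on avoiding silent violations.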

Intended Use

This adapter is intended for use with the SchemaQuake hackathon environment.

It is not intended as a general-purpose travel-booking assistant. The main contribution is the environment and training pipeline for safer agent behavior under changing schemas and policies.

Quick start

This repo contains the trained adapter artifacts. Use it with the SchemaQuake Space or load it as an adapter on top of the base Qwen model.

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_model = "Qwen/Qwen2.5-0.5B-Instruct"
adapter = "realambuj2001/schemaquake1-lora"

# Load the base model first, then attach the LoRA adapter on top of it.
tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForCausalLM.from_pretrained(base_model, device_map="auto")
model = PeftModel.from_pretrained(model, adapter)

# Plain-text prompt describing the agent role, output format, and task.
prompt = """You are a SchemaQuake travel-booking agent.
Return JSON with an actions list.
Task: Book a refundable flight from BLR to DEL under 8000 rupees.
"""

# Tokenize, generate, and decode; the decoded text includes the prompt.
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
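Because the prompt asks for JSON with an actions list, the decoded text usually needs to be parsed before it can be used. The helper below is a minimal sketch of one way to do that. The function name extract_actions and the action fields in the sample string are hypothetical, since the exact action schema is not specified in this card.

```python
import json


def extract_actions(generated_text: str) -> list:
    """Return the "actions" list from the first JSON object that has one, else []."""
    decoder = json.JSONDecoder()
    start = generated_text.find("{")
    while start != -1:
        try:
            # raw_decode parses one JSON value starting at `start` and ignores trailing text.
            obj, _ = decoder.raw_decode(generated_text, start)
        except json.JSONDecodeError:
            obj = None
        if isinstance(obj, dict) and isinstance(obj.get("actions"), list):
            return obj["actions"]
        # Not a usable object here; try the next opening brace.
        start = generated_text.find("{", start + 1)
    return []


# Hypothetical model output: prompt echo, stray braces, then the JSON payload.
sample = 'Task: ... {not json} {"actions": [{"type": "search_flights", "origin": "BLR"}]} done'
print(extract_actions(sample))
```

Returning an empty list when no valid object is found lets the caller treat malformed output as "no actions" instead of raising an exception mid-episode.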