Model Card for schemaquake-lora
This model is a fine-tuned version of Qwen/Qwen2.5-0.5B-Instruct.
It was trained with TRL / GRPO on SchemaQuake, an OpenEnv-compatible environment for training LLM agents to handle silent schema and policy drift in professional workflows.
What is SchemaQuake?
SchemaQuake is a travel-booking environment designed to test whether an agent can notice when the world changes underneath it.
The agent receives a task such as:
Book a refundable flight from BLR to DEL under 8000 rupees.
During the episode, the environment may silently change:
- a field name, such as price_rupees to ticket_price
- a unit, such as rupees to paise
- a refundability field, such as boolean to refund tier
- a policy document, such as a 24-hour cancellation window to 48 hours
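As a rough illustration of the drift types listed above, the sketch below compares two observations of the same flight record and reports which keys disappeared or appeared. The record layout, the `flight_id` key, the example values, and the `detect_drift` helper are assumptions made for illustration; they are not taken from the SchemaQuake environment's code.

```python
from typing import Any

# Hypothetical flight records before and after a silent drift.
# Field names follow the examples in this card; values are made up.
before = {"flight_id": "F1", "price_rupees": 7500, "refundable": True}
after = {"flight_id": "F1", "ticket_price": 750000, "refund_tier": "full"}


def detect_drift(old: dict[str, Any], new: dict[str, Any]) -> dict[str, list[str]]:
    """Report keys that disappeared or appeared between two observations."""
    return {
        "missing": sorted(set(old) - set(new)),
        "added": sorted(set(new) - set(old)),
    }


report = detect_drift(before, after)
print(report)

# If the new price turns out to be in paise, convert it back to rupees
# before checking it against a budget stated in rupees.
PAISE_PER_RUPEE = 100
price_in_rupees = after["ticket_price"] / PAISE_PER_RUPEE
print(price_in_rupees <= 8000)
```

A key comparison like this only catches renames and type changes that alter the field name. A unit change under an unchanged field name would pass unnoticed, which is why inspecting the schema or policy matters.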
The model is rewarded for completing the user’s real task, detecting drift, inspecting schema or policy when needed, avoiding silent violations, and acting efficiently.
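One way to picture how these reward criteria might combine is the sketch below. The `EpisodeSummary` fields, the `sketch_reward` function, and every weight in it are hypothetical; the actual SchemaQuake reward function may be structured quite differently.

```python
from dataclasses import dataclass


@dataclass
class EpisodeSummary:
    """Hypothetical per-episode signals; not the environment's real schema."""
    task_completed: bool
    drift_detected: bool
    inspected_when_needed: bool
    silent_violations: int
    steps_taken: int


def sketch_reward(ep: EpisodeSummary, max_steps: int = 10) -> float:
    """Combine the criteria named above using illustrative, made-up weights."""
    reward = 0.0
    reward += 1.0 if ep.task_completed else 0.0
    reward += 0.5 if ep.drift_detected else 0.0
    reward += 0.25 if ep.inspected_when_needed else 0.0
    # Each silent violation costs as much as a completed task earns.
    reward -= 1.0 * ep.silent_violations
    # Small efficiency bonus for steps left unused.
    unused = max(0, max_steps - ep.steps_taken)
    reward += 0.25 * unused / max_steps
    return reward


careful = EpisodeSummary(True, True, True, silent_violations=0, steps_taken=5)
careless = EpisodeSummary(True, False, False, silent_violations=1, steps_taken=3)
print(sketch_reward(careful), sketch_reward(careless))
```

With these assumed weights, an agent that finishes quickly but commits a silent violation scores well below one that spends extra steps inspecting the schema.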
Intended Use
This adapter is intended for the SchemaQuake hackathon environment:
- Space: realambuj2001/schemaquake1
- Environment/results repo: realambuj2001/schemaquake1
- Task type: drift-aware professional agent workflows
- Training method: light SFT warm start + TRL GRPO with rollout rewards
It is not intended as a general-purpose travel-booking assistant. The main contribution is the environment and training pipeline for safer agent behavior under changing schemas and policies.
Quick start
This repo contains the trained adapter artifacts. Use it with the SchemaQuake Space or load it as an adapter on top of the base Qwen model.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
# Base model and the LoRA adapter trained on SchemaQuake.
base_model = "Qwen/Qwen2.5-0.5B-Instruct"
adapter = "realambuj2001/schemaquake1-lora"
# Load the tokenizer and base weights, then attach the adapter on top.
tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForCausalLM.from_pretrained(base_model, device_map="auto")
model = PeftModel.from_pretrained(model, adapter)
prompt = """You are a SchemaQuake travel-booking agent.
Return JSON with an actions list.
Task: Book a refundable flight from BLR to DEL under 8000 rupees.
"""
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
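Because the prompt asks for JSON with an actions list, and the decoded output also contains the prompt text, a small parser can help pull the actions out. The sketch below is one possible approach using only the standard library. The `extract_actions` helper and the `tool`/`args` action shape in the sample string are assumptions for illustration, not the format the adapter is known to emit.

```python
import json


def extract_actions(text: str) -> list:
    """Return the first "actions" list found in any JSON object in text."""
    decoder = json.JSONDecoder()
    for start, ch in enumerate(text):
        if ch != "{":
            continue
        try:
            # raw_decode parses one JSON value starting at the given index
            # and ignores whatever follows it.
            obj, _ = decoder.raw_decode(text, start)
        except json.JSONDecodeError:
            continue
        if isinstance(obj, dict) and isinstance(obj.get("actions"), list):
            return obj["actions"]
    return []


# Hypothetical model output: prompt text followed by a JSON answer.
sample = (
    "Task: Book a refundable flight from BLR to DEL under 8000 rupees.\n"
    '{"actions": [{"tool": "search_flights", '
    '"args": {"origin": "BLR", "destination": "DEL"}}]}'
)
actions = extract_actions(sample)
print(actions)
```

Returning an empty list when no valid object is found lets the caller treat malformed output as a no-op rather than an exception.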