+# OpenHands Critic 4B v1.0
+A 4B-parameter critic model for evaluating AI agent trajectories. It is trained with rubric supervision to predict task success.
+## Model Details
+- **Base Model**: Qwen3-4B
+- **Training**: Full-parameter fine-tuning with binary cross-entropy (BCE) loss
+- **Context Length**: Trained on 64K-token sequences; supports up to 256K tokens
+- **Task**: Multi-label classification (26 labels: 25 rubric features + 1 success prediction)
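The multi-label objective listed above can be illustrated with a small sketch. It assumes one independent sigmoid per label and a mean-reduced BCE; the model's actual classification head, label weighting, and reduction are not specified in this card, and the function names are hypothetical.

```python
import math

NUM_LABELS = 26      # 25 rubric features + 1 success prediction
SUCCESS_INDEX = 25   # assumed position of the success label (last of the 26)


def sigmoid(x: float) -> float:
    """Map a logit to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def bce_loss(logits: list[float], targets: list[float]) -> float:
    """Mean binary cross-entropy, treating every label as an independent yes/no task."""
    if len(logits) != len(targets):
        raise ValueError("logits and targets must have the same length")
    eps = 1e-12  # guards against log(0)
    total = 0.0
    for z, y in zip(logits, targets):
        p = sigmoid(z)
        total += -(y * math.log(p + eps) + (1.0 - y) * math.log(1.0 - p + eps))
    return total / len(logits)


# Toy example: all-zero logits give p = 0.5 for every label,
# so each label contributes ln(2) regardless of its target.
logits = [0.0] * NUM_LABELS
targets = [0.0] * NUM_LABELS
targets[SUCCESS_INDEX] = 1.0
loss = bce_loss(logits, targets)
```

Because each label has its own sigmoid, the 26 probabilities do not need to sum to 1, which is what separates this setup from single-label softmax classification.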
+## Paper
+This model is described in the paper: **"Rubric-Supervised Critics for Sparse Agent Feedback"**
+### Key Results (Mixed-Outcome Subset)
+- **+15.9 points**: Best@8 improvement over random selection (73.8% vs. 57.9%)
+- **0.83 MRR**: the correct trajectory is typically ranked first
+- **83% compute reduction**: adaptive rollout uses 1.36 attempts on average instead of 8
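As a rough illustration of the adaptive rollout idea behind the compute-reduction figure, the sketch below samples attempts one at a time and stops as soon as the critic's success score clears a threshold. The threshold value, the `run_attempt` and `score_success` callables, and the fallback to the best-scoring attempt are assumptions made for this sketch, not the paper's exact procedure.

```python
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def adaptive_rollout(
    run_attempt: Callable[[int], T],
    score_success: Callable[[T], float],
    max_attempts: int = 8,
    threshold: float = 0.5,  # hypothetical cutoff; tune on held-out data
) -> tuple[T, float, int]:
    """Return (chosen attempt, its score, number of attempts used)."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    best: Optional[T] = None
    best_score = float("-inf")
    for i in range(max_attempts):
        attempt = run_attempt(i)
        score = score_success(attempt)
        if score > best_score:
            best, best_score = attempt, score
        if score >= threshold:
            # Confident enough: stop early and skip the remaining attempts.
            return attempt, score, i + 1
    # No attempt cleared the threshold: fall back to the best one seen.
    return best, best_score, max_attempts


# Toy run: attempts are just indices, with made-up critic scores.
toy_scores = [0.2, 0.7, 0.9]
chosen, chosen_score, used = adaptive_rollout(
    run_attempt=lambda i: i,
    score_success=lambda i: toy_scores[i],
    max_attempts=3,
)

# Arithmetic behind the reported reduction: 1.36 attempts instead of 8.
compute_reduction = 1 - 1.36 / 8
```

In the toy run the second attempt clears the cutoff, so the third is never executed. Early exits of this kind are what bring the average attempt count below the fixed budget.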
+## Usage
+This model is designed for use with vLLM's classification API:
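+```python
+import requests
+
+# Assumption: the server exposes vLLM's `/classify` route and accepts chat-style
+# `messages`. The stock `openai` client has no `classifications` resource, so this
+# sketch posts the request directly.
+VLLM_SERVER_URL = "YOUR_VLLM_SERVER_URL"
+API_KEY = "YOUR_API_KEY"
+
+# Format your trajectory as a conversation
+messages = [
+    {"role": "system", "content": "You are evaluating an AI agent's task attempt..."},
+    {"role": "user", "content": "Task: ..."},
+    {"role": "assistant", "content": "Agent actions..."}
+]
+
+# Get classification scores
+response = requests.post(
+    f"{VLLM_SERVER_URL}/classify",
+    headers={"Authorization": f"Bearer {API_KEY}"},
+    json={"model": "openhands-critic-4b-v1.0", "messages": messages},
+    timeout=60,
+)
+response.raise_for_status()
+probs = response.json()["data"][0]["probs"]
+
+# The model outputs probabilities for 26 labels:
+# - Labels 0-24: Rubric features (behavioral indicators)
+# - Label 25: Success prediction (primary output for ranking)
+success_prob = probs[25]
+```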
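Once each candidate trajectory has a probability vector, selection reduces to sorting by the success label. The following is a minimal sketch, assuming each candidate's 26 probabilities are already available as a plain list; the `rank_by_success` and `reciprocal_rank` helpers and the toy scores are hypothetical.

```python
NUM_LABELS = 26
SUCCESS_INDEX = 25  # label 25 is the success prediction


def rank_by_success(prob_vectors: list[list[float]]) -> list[int]:
    """Return candidate indices sorted by success probability, best first."""
    for probs in prob_vectors:
        if len(probs) != NUM_LABELS:
            raise ValueError(f"expected {NUM_LABELS} probabilities per candidate")
    return sorted(
        range(len(prob_vectors)),
        key=lambda i: prob_vectors[i][SUCCESS_INDEX],
        reverse=True,
    )


def reciprocal_rank(ranking: list[int], is_correct: list[bool]) -> float:
    """1 / position of the first correct candidate in the ranking, or 0.0 if none."""
    for position, idx in enumerate(ranking, start=1):
        if is_correct[idx]:
            return 1.0 / position
    return 0.0


def toy_vector(success_prob: float) -> list[float]:
    """Build a placeholder 26-label vector; rubric features are zeroed out."""
    return [0.0] * 25 + [success_prob]


candidates = [toy_vector(0.31), toy_vector(0.88), toy_vector(0.54)]
ranking = rank_by_success(candidates)
best_index = ranking[0]

# Suppose only candidate 2 actually solved the task: it is ranked second.
rr = reciprocal_rank(ranking, [False, False, True])
```

Averaging `reciprocal_rank` over many tasks gives the mean reciprocal rank (MRR). Picking `ranking[0]` from 8 sampled attempts corresponds to a Best@8-style selection.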
+## Training Data
+Trained on 154K segments from:
+- Production agent conversations (150K segments)
+- SWE-Gym benchmark trajectories (4K segments)
+## License
+Please refer to the Qwen3 license for base model terms.
+## Citation
+```bibtex
+@article{openhands2025critic,
+  title={Rubric-Supervised Critics for Sparse Agent Feedback},
+  author={OpenHands Team},
+  year={2025}
+}
+```