title: Driver Recruit Environment
emoji: 🚛
colorFrom: blue
colorTo: green
sdk: docker
pinned: false
app_port: 8000
base_path: /web
tags:
  - openenv
  - reinforcement-learning
  - recruiting
  - multi-turn

🚛 Driver Recruit Environment

A multi-turn, tool-based RL environment for training LLMs to recruit truck drivers through a CRM system. Built on OpenEnv 0.2.1.

The agent must discover the driver's qualifications through conversation, record them in the CRM, get management approval, and complete the hire, all through structured tool calls over episodes of 15 to 40+ steps.

Pipeline

lead → contacted → interested → approval_pending → offer_sent → hired
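The stage order above can be written as an ordered list, which makes it easy to check a proposed `update_stage` call before sending it. This is a minimal sketch, not the environment's own validation logic: the helper names `next_stage` and `is_valid_transition` are hypothetical, and the rule that a stage may only advance one step at a time is an assumption.

```python
from typing import Optional

# Stage names copied from the pipeline above, in order.
PIPELINE = ["lead", "contacted", "interested", "approval_pending", "offer_sent", "hired"]


def next_stage(current: str) -> Optional[str]:
    """Return the stage that follows `current`, or None if `current` is final."""
    idx = PIPELINE.index(current)  # raises ValueError for an unknown stage
    return PIPELINE[idx + 1] if idx + 1 < len(PIPELINE) else None


def is_valid_transition(current: str, target: str) -> bool:
    """Assumption: only a single forward step is allowed."""
    return (
        current in PIPELINE
        and target in PIPELINE
        and next_stage(current) == target
    )


print(is_valid_transition("lead", "contacted"))
print(is_valid_transition("lead", "hired"))
```

An agent wrapper could run a check like this locally to avoid spending a step on a transition the environment is likely to reject.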

Tools

Tool	Actions	Purpose
crm	`read_candidate`, `update_stage`, `update_field`, `add_note`	Manage pipeline & record info
messaging	`send_message`, `read_reply`	Screen driver (18 topics)
approval	`request_approval`, `check_approval`	Get management sign-off
workflow	`wait`	Advance time for approval processing
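The tool table above maps directly onto a lookup structure, which is handy for rejecting malformed model output before it reaches the environment. The sketch below is illustrative only; `TOOL_ACTIONS` and `is_known_action` are hypothetical names and not part of the `recruitopenenv` package.

```python
# Tool and action names copied from the table above.
TOOL_ACTIONS = {
    "crm": {"read_candidate", "update_stage", "update_field", "add_note"},
    "messaging": {"send_message", "read_reply"},
    "approval": {"request_approval", "check_approval"},
    "workflow": {"wait"},
}


def is_known_action(tool: str, action: str) -> bool:
    """Return True if `action` is listed for `tool` in the table."""
    return action in TOOL_ACTIONS.get(tool, set())


print(is_known_action("crm", "add_note"))
print(is_known_action("workflow", "send_message"))
```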

Reward Signal

Successful hire (good job fit): +10 to +15 (base + CRM bonus)
Bad hire (poor match): -5
Ghosted (trust runs out): -4
Per-step: Small rewards for correct actions and small penalties for incorrect ones
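The terminal rewards listed above can be summarized in a small function. This is a sketch under stated assumptions, not the environment's reward code: it reads "+10 to +15 (base + CRM bonus)" as a base of 10 plus a bonus of up to 5, and it assumes the bonus scales linearly with a hypothetical `crm_completeness` fraction. Per-step rewards are left out because their values are not given here.

```python
def terminal_reward(outcome: str, crm_completeness: float = 0.0) -> float:
    """Hypothetical end-of-episode reward.

    outcome: "good_hire", "bad_hire", or "ghosted" (labels chosen for this sketch).
    crm_completeness: assumed fraction in [0, 1] of CRM fields recorded correctly.
    """
    if outcome == "good_hire":
        frac = min(max(crm_completeness, 0.0), 1.0)  # clamp to [0, 1]
        return 10.0 + 5.0 * frac  # assumed split: base 10, bonus up to 5
    if outcome == "bad_hire":
        return -5.0
    if outcome == "ghosted":
        return -4.0
    raise ValueError(f"unknown outcome: {outcome!r}")


for outcome in ("good_hire", "bad_hire", "ghosted"):
    print(outcome, terminal_reward(outcome, crm_completeness=0.5))
```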

What Makes This Hard

Long horizon: 15-40+ tool calls per episode
Information gathering: Must ask the right screening questions to match the driver to the right job
Trust dynamics: Each message costs trust; ask too many questions and the driver ghosts
Job matching: 6 jobs per episode (1-2 good, 1-2 traps with deal-breakers, 2-3 partial)
Procedural correctness: Must follow stage order, read pending replies before sending another message, and get approval before making an offer
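The trust trade-off can be pictured as a budget that each message draws down. The sketch below is purely illustrative: the starting trust, the flat per-message cost, and the `TrustBudget` class itself are all hypothetical, since the environment's actual trust values are not documented here.

```python
from dataclasses import dataclass


@dataclass
class TrustBudget:
    trust: float = 5.0             # hypothetical starting trust
    cost_per_message: float = 1.0  # hypothetical flat cost per message

    def send(self) -> bool:
        """Spend trust for one message; return False once trust has run out."""
        self.trust -= self.cost_per_message
        return self.trust > 0


budget = TrustBudget()
sent = 0
while budget.send():
    sent += 1
print(f"Messages sent before the driver ghosted: {sent}")
```

Under a model like this, every screening question has an opportunity cost, so a policy has to prioritize the topics most likely to separate good jobs from traps.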

Quick Start

from recruitopenenv import RecruitopenenvEnv, RecruitopenenvAction

env = RecruitopenenvEnv(base_url="YOUR_SPACE_URL")

result = env.reset(seed=42)
obs = result.observation
print(f"Driver: {obs.driver_name}, Stage: {obs.stage}")

# Read CRM
result = env.step(RecruitopenenvAction(tool="crm", action="read_candidate"))
print(result.observation.jobs_summary)

# Greet driver
result = env.step(RecruitopenenvAction(tool="messaging", action="send_message", topic="greeting"))
print(f"Reward: {result.reward}")

# Read reply
result = env.step(RecruitopenenvAction(tool="messaging", action="read_reply"))
print(result.observation.discovered_info)

env.close()

Training

Training uses GRPO/REINFORCE, with the model choosing which screening topics to ask about. See `train_grpo.py` for the full training script.

python train_grpo.py --model Qwen/Qwen2.5-3B-Instruct
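For readers unfamiliar with GRPO, its core step is to score each sampled episode relative to the other episodes in the same group rather than against a learned value function. The sketch below shows only that normalization, using the standard library; it is not taken from `train_grpo.py`, the function name is hypothetical, and the choice of population standard deviation is an assumption (some implementations use the sample standard deviation).

```python
from statistics import mean, pstdev
from typing import List


def group_relative_advantages(rewards: List[float], eps: float = 1e-6) -> List[float]:
    """Normalize episode rewards within one group: (r - mean) / (std + eps)."""
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; eps avoids division by zero
    return [(r - mu) / (sigma + eps) for r in rewards]


# Example group: one strong hire, one bad hire, one ghosting, one plain hire.
advantages = group_relative_advantages([12.0, -5.0, -4.0, 10.0])
print([round(a, 3) for a in advantages])
```

Episodes that beat the group average get a positive advantage and are reinforced; when every episode in a group earns the same reward, all advantages are zero and the group contributes no gradient.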

Deploying

# From the recruitopenenv/ directory
openenv push

Action Format

{"tool": "crm", "action": "read_candidate"}
{"tool": "messaging", "action": "send_message", "topic": "experience"}
{"tool": "messaging", "action": "read_reply"}
{"tool": "crm", "action": "update_field", "field": "cdl_class", "value": "A"}
{"tool": "crm", "action": "update_stage", "stage": "contacted"}
{"tool": "approval", "action": "request_approval", "job_id": 2}
{"tool": "workflow", "action": "wait"}
{"tool": "approval", "action": "check_approval"}
{"tool": "messaging", "action": "send_message", "topic": "offer", "job_id": 2}
{"tool": "crm", "action": "update_stage", "stage": "hired"}
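When the actions come from an LLM as JSON text, it helps to parse and sanity-check each line before building a `RecruitopenenvAction`. The sketch below is one way to do that. The `REQUIRED_KEYS` mapping is inferred from the examples above and is an assumption, not a documented schema; `parse_action` is a hypothetical helper.

```python
import json
from typing import Any, Dict

# Extra keys each action appears to need, inferred from the examples above
# (an assumption, not a documented schema).
REQUIRED_KEYS = {
    ("crm", "update_field"): {"field", "value"},
    ("crm", "update_stage"): {"stage"},
    ("messaging", "send_message"): {"topic"},
    ("approval", "request_approval"): {"job_id"},
}


def parse_action(line: str) -> Dict[str, Any]:
    """Parse one JSON action line and check that the expected keys are present."""
    action = json.loads(line)  # raises json.JSONDecodeError (a ValueError) on bad JSON
    if not isinstance(action, dict):
        raise ValueError("action must be a JSON object")
    missing = {"tool", "action"} - action.keys()
    if missing:
        raise ValueError(f"missing keys: {sorted(missing)}")
    extra = REQUIRED_KEYS.get((action["tool"], action["action"]), set())
    missing = extra - action.keys()
    if missing:
        raise ValueError(f"{action['action']} is missing keys: {sorted(missing)}")
    return action


print(parse_action('{"tool": "crm", "action": "update_stage", "stage": "contacted"}'))
```

If the keyword arguments of `RecruitopenenvAction` match the JSON keys, as the Quick Start suggests, the parsed dict could be passed on as `RecruitopenenvAction(**action)`.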

Observation Fields

Field	Description
`driver_name`	Driver's name
`crm_summary`	Full CRM record (empty until `read_candidate`)
`jobs_summary`	6 available job listings
`discovered_info`	Info from screening conversations
`stage`	Current pipeline stage
`feedback`	API response from last action
`pending_reply`	Whether a reply from the driver is waiting to be read

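To feed an observation to an LLM policy, the fields above usually need to be flattened into prompt text. The sketch below shows one possible layout. `ObservationSketch` is a stand-in written for this example, not the package's observation class, and the field types (strings plus a boolean `pending_reply`) are assumptions; `render_prompt` is a hypothetical helper.

```python
from dataclasses import dataclass


@dataclass
class ObservationSketch:
    """Stand-in with the documented field names; types are assumed."""
    driver_name: str
    crm_summary: str
    jobs_summary: str
    discovered_info: str
    stage: str
    feedback: str
    pending_reply: bool


def render_prompt(obs: ObservationSketch) -> str:
    """Flatten an observation into a short text block for the model."""
    lines = [
        f"Driver: {obs.driver_name}",
        f"Stage: {obs.stage}",
        f"Pending reply: {'yes' if obs.pending_reply else 'no'}",
        f"Last feedback: {obs.feedback}",
        # crm_summary stays empty until read_candidate has been called
        f"CRM: {obs.crm_summary or '(not read yet)'}",
        f"Jobs: {obs.jobs_summary}",
        f"Discovered: {obs.discovered_info}",
    ]
    return "\n".join(lines)


example = ObservationSketch("Sam", "", "job list goes here", "", "lead", "", False)
print(render_prompt(example))
```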
Screening Topics

greeting, call, experience, home_time, pay, equipment, route, deal_breakers, availability, violations, medical_card, references, pitch, offer, negotiate_pay, negotiate_home_time, signing_bonus, address_concern