cff-version: 1.2.0
message: "If you use ChargebackOps in your research, please cite it as below."
title: "ChargebackOps: A cost-asymmetric multi-round adversarial environment for training LLM agents on B2B dispute workflows"
abstract: |
  ChargebackOps is an OpenEnv-compatible reinforcement learning environment that
  simulates the merchant side of a credit-card chargeback dispute. The environment
  exposes a decision-theoretic primitive (multi-round adjudication with
  cost-asymmetric terminal economics, partial observability, and a procedurally
  constrained adversary) that is rare in current RL benchmarks and generalizes
  beyond chargebacks to insurance claims, tax audits, content-moderation appeals,
  and patent disputes. The repository ships an 8-dimension decomposable Rubric
  system, a parametric task generator, an ISO 20022 adapter, a Stripe sandbox
  connector, and a reproducible single-T4 SFT + GRPO training pipeline that
  documents and remedies a previously undescribed post-SFT GRPO collapse failure
  mode on token-deterministic tasks.
type: software
authors:
  - family-names: Dutta
    given-names: Mitudru
    email: mitudrudutta72@gmail.com
repository-code: "https://github.com/MitudruDutta/ChargeBackOps"
url: "https://huggingface.co/spaces/mitudrudutta/ChargeBackOps"
license: MIT
keywords:
  - reinforcement learning
  - large language models
  - multi-round adjudication
  - chargeback disputes
  - cost-asymmetric environments
  - GRPO
  - RLVR
  - OpenEnv
preferred-citation:
  type: software
  title: "ChargebackOps: A cost-asymmetric multi-round adversarial environment for training LLM agents on B2B dispute workflows"
  authors:
    - family-names: Dutta
      given-names: Mitudru
  url: "https://github.com/MitudruDutta/ChargeBackOps"
  year: 2026