TOMAGPT

A Qwen3-4B-Instruct-2507 model fine-tuned with GRPO (Group Relative Policy Optimization) to classify legal hearsay by decomposing it into three sub-elements under the U.S. Federal Rules of Evidence.

What It Does

TOMAGPT classifies whether a statement is hearsay by analyzing three sub-elements:

  1. Assertion -- Is the statement (or conduct) intended as an assertion?
  2. Out-of-court -- Was the statement made out of court?
  3. TOMA -- Is the statement offered to prove the truth of the matter asserted?

Hearsay = YES only if all three sub-elements are YES.
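The decision rule above is a simple conjunction of the three sub-elements, which can be sketched as:

```python
def classify_hearsay(an_assertion: bool, made_out_of_court: bool, is_for_toma: bool) -> bool:
    """A statement is hearsay only when all three sub-elements hold."""
    return an_assertion and made_out_of_court and is_for_toma
```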

Results

Evaluated on the LegalBench hearsay test set (94 examples):

Metric                   Base Model   TOMAGPT   Delta (pp)
Overall accuracy         71.3%        77.7%     +6.4
TOMA sub-element         78.0%        95.1%     +17.1
Assertion sub-element    90.2%        95.1%     +4.9
Non-verbal hearsay       33.3%        83.3%     +50.0
Standard hearsay         93.1%        100.0%    +6.9
Non-assertive conduct    89.5%        100.0%    +10.5

Training Details

  • Method: GRPO (Group Relative Policy Optimization)
  • Platform: Prime Intellect Lab
  • Environment: smolclaims/TOMAGPT (v0.3.0)
  • Base model: Qwen/Qwen3-4B-Instruct-2507
  • Training data: DoodDood/HearsayGRPOTrainingData2 (3,140 examples)
  • Steps: 500
  • Learning rate: 1e-5
  • Batch size: 128
  • Rollouts per example: 16
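As a rough illustration of the "group relative" part of GRPO (a sketch, not the training code used here): each of the 16 rollouts per example is scored, and its reward is normalized against its group's mean and standard deviation to form an advantage. Note that implementations vary in detail (e.g. sample vs. population std):

```python
import statistics

def group_relative_advantages(rewards):
    """Normalize each rollout's reward against its group:
    A_i = (r_i - mean(group)) / std(group)."""
    mu = statistics.mean(rewards)
    sigma = statistics.pstdev(rewards) or 1.0  # guard against uniform groups
    return [(r - mu) / sigma for r in rewards]
```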

LoRA Configuration

  • Rank (r): 16
  • Alpha: 32
  • Dropout: 0.0
  • Target modules: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
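Assuming the adapter was produced with the peft library (the card does not state the tooling), the settings above correspond roughly to this LoraConfig; task_type is an assumption:

```python
from peft import LoraConfig  # assumes peft is installed

lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.0,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    task_type="CAUSAL_LM",  # assumption; not stated in the card
)
```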

Reward Functions

Function              Weight   Description
assertion_reward      1.5      +1/-1 on assertion accuracy
out_of_court_reward   1.0      +1/-1 on out-of-court accuracy
toma_reward           2.0      +1/-1 on TOMA accuracy
consistency_penalty   1.0      -0.5 for contradictory outputs
format_compliance     1.0      -0.25 per missing field
constraint_penalty    1.0      -0.5 for logical violations
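A hypothetical sketch of how these components could combine into a scalar reward (the exact aggregation used during training is not documented here; names and weights are taken from the table above):

```python
# Weights from the reward-function table; the aggregation itself is assumed.
WEIGHTS = {
    "assertion_reward": 1.5,
    "out_of_court_reward": 1.0,
    "toma_reward": 2.0,
    "consistency_penalty": 1.0,
    "format_compliance": 1.0,
    "constraint_penalty": 1.0,
}

def total_reward(components: dict) -> float:
    """Weighted sum of raw reward components."""
    return sum(WEIGHTS[name] * value for name, value in components.items())
```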

Usage

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# Load the fine-tuned model in bfloat16 and map it across available devices
model = AutoModelForCausalLM.from_pretrained(
    "DoodDood/TOMAGPT", torch_dtype=torch.bfloat16, device_map="auto")
tokenizer = AutoTokenizer.from_pretrained("DoodDood/TOMAGPT")

system_prompt = (
    "You are a legal assistant identifying hearsay. Hearsay is defined as "
    "an out-of-court statement introduced to prove the truth of the matter "
    "asserted.\n\n"
    "Respond in EXACTLY this format (semicolon-separated):\n"
    "is_hearsay: YES/NO; an_assertion: YES/NO; made_out_of_court: YES/NO; "
    "is_for_toma: YES/NO"
)

scenario = "At trial, the prosecution presents testimony from a police officer who states that a bystander at the scene told him, 'The defendant ran the red light.'"

messages = [
    {"role": "system", "content": system_prompt},
    {"role": "user", "content": scenario}
]

text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer([text], return_tensors="pt").to(model.device)

with torch.no_grad():
    output = model.generate(**inputs, max_new_tokens=128, do_sample=False)

# Decode only the newly generated tokens, skipping the prompt
response = tokenizer.decode(output[0][len(inputs.input_ids[0]):], skip_special_tokens=True)
print(response)
# Expected: is_hearsay: YES; an_assertion: YES; made_out_of_court: YES; is_for_toma: YES
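A small helper to turn the semicolon-separated response into booleans (illustrative, not part of the model card):

```python
def parse_response(text: str) -> dict:
    """Parse 'is_hearsay: YES; an_assertion: YES; ...' into a dict of bools."""
    fields = {}
    for part in text.strip().split(";"):
        if ":" in part:
            key, _, value = part.partition(":")
            fields[key.strip()] = value.strip().upper() == "YES"
    return fields
```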
