---
base_model: unsloth/Qwen3-32B
language:
  - en
  - ja
library_name: transformers
pipeline_tag: text-generation
license: apache-2.0
---

# Preferred-MedRECT-32B

## Model Description

Preferred-MedRECT-32B is a fine-tuned model based on [Qwen/Qwen3-32B](https://huggingface.co/Qwen/Qwen3-32B), optimized for medical error detection and correction tasks using LoRA (Low-Rank Adaptation).

The model was trained on bilingual (Japanese/English) medical reasoning data with explicit reasoning processes, enabling it to detect errors, extract erroneous sentences, and provide corrections in clinical texts.

The model is released under the [Apache License 2.0](https://www.apache.org/licenses/LICENSE-2.0).
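A minimal inference sketch with the `transformers` text-generation pipeline is shown below. The repo id and the instruction wording are placeholders: the exact Hub path and the task prompt used for MedRECT are not specified here (see the paper for the benchmark prompts).

```python
MODEL_ID = "Preferred-MedRECT-32B"  # placeholder; replace with the actual Hub repo id


def build_messages(clinical_text: str) -> list[dict]:
    """Wrap a clinical text in a chat-style request to check for errors.

    The instruction wording is illustrative only, not the exact prompt
    used in the MedRECT benchmark.
    """
    return [
        {
            "role": "user",
            "content": (
                "Review the following clinical text. If it contains a medical "
                "error, identify the erroneous sentence and provide a "
                "correction; otherwise state that the text is correct.\n\n"
                + clinical_text
            ),
        }
    ]


if __name__ == "__main__":
    from transformers import pipeline

    generator = pipeline("text-generation", model=MODEL_ID, device_map="auto")
    out = generator(build_messages("The patient was prescribed amoxicillin for a viral infection."),
                    max_new_tokens=512)
    print(out[0]["generated_text"])
```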

## Model Performance

The table below shows cross-lingual performance comparison on MedRECT-ja (Japanese) and MedRECT-en (English) benchmarks. MedRECT evaluates models on three subtasks: error detection (F1), sentence extraction (Acc.), and error correction (EC Avg. Score).

| Model | MedRECT-ja Error Det. F1 | MedRECT-ja Sent. Ext. Acc. | MedRECT-ja EC Avg. Score | MedRECT-en Error Det. F1 | MedRECT-en Sent. Ext. Acc. | MedRECT-en EC Avg. Score |
|:------|:------------------------:|:--------------------------:|:------------------------:|:------------------------:|:--------------------------:|:------------------------:|
| Preferred-MedRECT-32B | **0.743** | **81.5%** | **0.627** | 0.728 | **90.9%** | **0.718** |
| Qwen3-32B (think) | 0.723 | 72.5% | 0.549 | 0.740 | 83.5% | 0.550 |
| gpt-oss-120b (medium) | 0.721 | 77.4% | 0.581 | 0.777 | 88.1% | 0.630 |
| gpt-oss-20b (medium) | 0.718 | 64.3% | 0.543 | 0.762 | 87.2% | 0.590 |
| GPT-4.1 | 0.658 | 52.6% | 0.655 | **0.789** | 72.8% | 0.710 |

## Training Details

- **Base Model**: unsloth/Qwen3-32B
- **Fine-tuning Method**: LoRA (Low-Rank Adaptation)
- **Training Data**:
  - Japanese: 5,538 samples from JMLE (2018-2023)
  - English: 2,439 samples from MEDEC MS Subset
  - All samples include reasoning processes generated by DeepSeek-R1-0528
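LoRA freezes the pretrained weights and learns a low-rank update `ΔW = B·A` with rank `r ≪ min(d, k)`, so only `r·(d + k)` parameters are trained per adapted matrix instead of `d·k`. A minimal NumPy sketch of the idea (dimensions, rank, and scaling are illustrative, not the hyperparameters used for this fine-tune):

```python
import numpy as np

d, k, r = 64, 64, 8                   # illustrative dims; r << min(d, k)
rng = np.random.default_rng(0)

W = rng.normal(size=(d, k))           # frozen pretrained weight
A = rng.normal(size=(r, k)) * 0.01    # trainable, small random init
B = np.zeros((d, r))                  # trainable, zero init -> ΔW starts at 0


def lora_forward(x, alpha=16.0):
    """y = x W^T + (alpha/r) * x (B A)^T : base path plus low-rank update."""
    return x @ W.T + (alpha / r) * (x @ (B @ A).T)


x = rng.normal(size=(1, k))
# With B zero-initialized, the adapter contributes nothing at the start,
# so the model's initial behavior matches the frozen base model:
assert np.allclose(lora_forward(x), x @ W.T)
```

Only `A` and `B` receive gradient updates during fine-tuning; here that is 1,024 trainable parameters per matrix versus 4,096 for full fine-tuning.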

## Limitations

The model was developed for research purposes and is not intended for clinical diagnosis.
Users are responsible for ensuring compliance with applicable laws and regulations.

## Contributors

Preferred Networks, Inc.
- Naoto Iwase
- Hiroki Okuyama
- Junichiro Iwasawa

## Publications

Detailed evaluation results are provided in the [research paper](https://arxiv.org/abs/2511.00421).

## Citations

```
@article{medrect2025,
      title={MedRECT: A Medical Reasoning Benchmark for Error Correction in Clinical Texts},
      author={Iwase, Naoto and Okuyama, Hiroki and Iwasawa, Junichiro},
      journal={arXiv preprint arXiv:2511.00421},
      year={2025}
}
```

## License

[Apache License 2.0](https://www.apache.org/licenses/LICENSE-2.0)