---
# For reference on model card metadata, see the spec: https://github.com/huggingface/hub-docs/blob/main/modelcard.md?plain=1
# Doc / guide: https://huggingface.co/docs/hub/model-cards
{}
---

# Model Card for FM-FCI/DateArith-VLSP2025

<!-- Provide a quick summary of what the model is/does. -->

This model card documents FM-FCI/DateArith-VLSP2025, a Vietnamese LLM fine-tuned for the date arithmetic task. It placed #1 on the date-arith task of the VLSP 2025 benchmark.
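
A minimal inference sketch, assuming the checkpoint is public on the Hugging Face Hub under the ID above and ships a chat template; the prompt ("What date is 15/03/2025 plus 45 days?") is illustrative, not an official format:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "FM-FCI/DateArith-VLSP2025"  # ID from this card; availability assumed
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)

# Vietnamese date-arithmetic question, as in the target task.
messages = [{"role": "user", "content": "Ngày 15/03/2025 cộng thêm 45 ngày là ngày nào?"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=64)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```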

## Model Details

### Model Description

<!-- Provide a longer summary of what this model is. -->

This work investigates two subtasks in temporal reasoning: (1) Date Arithmetic (date-arith) and (2) Duration Question Answering (durationQA). For date-arith, we focus on fine-tuning large language models (LLMs) to directly extract and compute answers. For durationQA, the challenge lies in identifying both explicit and implicit duration expressions in text and reasoning with world knowledge to assess their correctness. We explore multiple approaches, from naive supervised fine-tuning (SFT) to SFT augmented with reasoning-based synthetic data and Group Relative Policy Optimization (GRPO). Our findings highlight the critical role of carefully constructed data and appropriate training strategies in enabling effective temporal reasoning.

- **Developed by:** FPT Smart Cloud, FPT Corporation
- **Model type:** MoE
- **Language(s) (NLP):** Vietnamese (primary)
- **License:** ?
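
To make the GRPO stage mentioned in the description concrete, here is a hedged sketch of a rule-based reward function in the style of TRL's `GRPOTrainer`, which passes completions plus dataset columns to each reward function. The authors' actual reward design is not documented; the `answer` column name and the `dd/mm/yyyy` format are assumptions:

```python
import re

DATE_RE = re.compile(r"\b\d{1,2}/\d{1,2}/\d{4}\b")

def _norm(d):
    # Zero-pad so "5/3/2025" and "05/03/2025" compare equal.
    day, month, year = d.split("/")
    return f"{int(day):02d}/{int(month):02d}/{year}"

def date_match_reward(completions, answer, **kwargs):
    """Binary reward: 1.0 if the last dd/mm/yyyy date in a completion
    matches the gold answer, else 0.0. Assumes plain-string completions
    (TRL's "standard" format rather than conversational)."""
    rewards = []
    for completion, gold in zip(completions, answer):
        dates = DATE_RE.findall(completion)
        rewards.append(1.0 if dates and _norm(dates[-1]) == _norm(gold) else 0.0)
    return rewards
```

Such a function would be passed as `reward_funcs=[date_match_reward]` when constructing `GRPOTrainer`.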

### Model Sources

<!-- Provide the basic links for the model. -->

- **Repository:** https://github.com/duccd4/vlsp2025-temporal-qa
- **Paper:** Enabling Temporal Commonsense in Vietnamese LLMs – Date-Arith and DurationQA

## Training Details

### Training Data

The model was fine-tuned on 40,000 synthetic samples.
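
The card does not describe how these samples were produced; the following is a minimal sketch of how date-arithmetic instruction/answer pairs of this kind could be synthesized with Python's `datetime`. The Vietnamese question template and JSON schema are assumptions, not the authors' pipeline:

```python
import json
import random
from datetime import date, timedelta

def make_sample(rng):
    # Random base date plus a random day offset; the answer is exact by construction.
    base = date(2000, 1, 1) + timedelta(days=rng.randrange(0, 20000))
    offset = rng.randrange(-365, 366)
    target = base + timedelta(days=offset)
    verb = "cộng thêm" if offset >= 0 else "trừ đi"  # "plus" / "minus"
    question = (f"Ngày {base.strftime('%d/%m/%Y')} {verb} "
                f"{abs(offset)} ngày là ngày nào?")
    return {"instruction": question, "output": target.strftime("%d/%m/%Y")}

rng = random.Random(0)
with open("datearith_sft.jsonl", "w", encoding="utf-8") as f:
    for _ in range(40_000):
        f.write(json.dumps(make_sample(rng), ensure_ascii=False) + "\n")
```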

### Training Procedure

#### Training Hyperparameters

- **Precision:** BF16
- **Learning rate:** 5.0e-5
- **Batch size per device:** 16
- **Epochs:** 5
- **Cutoff length:** 2048 tokens
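
These values map onto a standard `transformers` setup; a minimal sketch, assuming the Hugging Face `Trainer` stack (the authors' actual training framework, scheduler, and gradient-accumulation settings are not documented):

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="datearith-sft",      # hypothetical output path
    bf16=True,                       # Precision: BF16
    learning_rate=5.0e-5,            # Learning rate
    per_device_train_batch_size=16,  # Batch size per device
    num_train_epochs=5,              # Epochs
)

# The 2048-token cutoff is applied at tokenization time, e.g.:
# tokenizer(text, truncation=True, max_length=2048)
```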

## Evaluation

### Testing Data

Evaluation uses the validation, public test, and private test sets provided by the VLSP 2025 organizers.

### Metrics

Accuracy
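
A minimal sketch of the metric as exact-match accuracy; this is an assumed scoring rule, and the organizers' official scorer may normalize answers differently:

```python
def accuracy(predictions, references):
    """Exact-match accuracy over predicted vs. gold answers."""
    assert len(predictions) == len(references)
    hits = sum(p.strip() == r.strip() for p, r in zip(predictions, references))
    return hits / len(references)

print(accuracy(["09/05/2025", "01/01/2000"], ["09/05/2025", "31/12/1999"]))  # 0.5
```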

### Results

Accuracy is 98% on the public test set and 99% on the private test set.

## Citation

Duc Dinh Chu\*, Thanh-Bac Nguyen Ba\*, Duy Dinh Le, Khanh Van Tran. *Enabling Temporal Commonsense in Vietnamese LLMs – Date-Arith and DurationQA.* (\*Equal contribution.)
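
**BibTeX** (a reconstructed entry; the year and venue are inferred from VLSP 2025):

```bibtex
@misc{chu2025datearith,
  title  = {Enabling Temporal Commonsense in Vietnamese LLMs -- Date-Arith and DurationQA},
  author = {Chu, Duc Dinh and Nguyen Ba, Thanh-Bac and Le, Duy Dinh and Tran, Khanh Van},
  year   = {2025},
  note   = {VLSP 2025; Duc Dinh Chu and Thanh-Bac Nguyen Ba contributed equally}
}
```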