# my_indo_qa_model
This model is a fine-tuned version of [google-bert/bert-base-multilingual-uncased](https://huggingface.co/google-bert/bert-base-multilingual-uncased) on an Indonesian question-answering dataset.
It achieves the following results on the evaluation set:
- Validation Loss: 1.5991
- F1 Score: 0.0873
- Precision: 0.0920
- Recall: 0.0831
## Model description
This is a fine-tuned BERT multilingual uncased model for the task of extractive question answering in the Indonesian language. It takes a question and a context as input, and predicts the start and end position of the answer span within the context.
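The start/end prediction described above can be sketched without the model itself: an extractive QA head produces one start score and one end score per token, and the answer is the highest-scoring valid span. The scores below are dummy values standing in for real logits, not outputs of this model.

```python
# Pick the highest-scoring span (start <= end, capped length) from
# per-token start/end scores, as an extractive QA head does.
def best_span(start_scores, end_scores, max_answer_len=30):
    best = (0, 0, float("-inf"))
    for s, s_score in enumerate(start_scores):
        for e in range(s, min(s + max_answer_len, len(end_scores))):
            score = s_score + end_scores[e]
            if score > best[2]:
                best = (s, e, score)
    return best[0], best[1]

# Dummy scores for a 4-token context: tokens 1..2 form the best span.
start = [0.1, 2.0, 0.3, 0.0]
end = [0.0, 0.5, 3.0, 0.2]
print(best_span(start, end))  # highest combined start+end score
```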
## Intended uses & limitations
**Intended uses:**
- Answering factual questions from Indonesian texts
- Educational and research purposes in NLP or linguistics
**Limitations:**
- The very low F1 score (0.0873) indicates the model currently struggles to extract correct answer spans and generalizes poorly
- Performance may degrade significantly with out-of-domain or noisy texts
- Not recommended for production use without further fine-tuning
## Training and evaluation data
The model was fine-tuned using a custom Indonesian QA dataset consisting of:
- 2,551 training samples
- 638 test samples
The dataset is structured in SQuAD format with context, question, and answer fields.
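A minimal record in this SQuAD-style layout looks like the following; the text is illustrative, not taken from the actual dataset. The `answer_start` character offset must point at the answer inside the context, which is easy to sanity-check:

```python
# Illustrative SQuAD-format record (context, question, answers).
sample = {
    "context": "Jakarta adalah ibu kota Indonesia.",
    "question": "Apa ibu kota Indonesia?",
    "answers": {"text": ["Jakarta"], "answer_start": [0]},
}

# answer_start is a character offset into the context; verify it.
start = sample["answers"]["answer_start"][0]
answer = sample["answers"]["text"][0]
assert sample["context"][start:start + len(answer)] == answer
```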
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training:
- Learning rate: 2e-05
- Train batch size: 16
- Eval batch size: 16
- Epochs: 3
- Optimizer: AdamW (betas=(0.9, 0.999), epsilon=1e-08)
- Scheduler: Linear
- Seed: 42
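The hyperparameters above map onto a Transformers `TrainingArguments` roughly as follows; this is a sketch, with the output directory as a placeholder and the dataset/`Trainer` wiring omitted. The AdamW betas and epsilon listed are the Transformers defaults, so they need no explicit arguments.

```python
from transformers import TrainingArguments

# Hyperparameters from the list above; "my_indo_qa_model" is a
# placeholder output directory. AdamW betas=(0.9, 0.999) and
# epsilon=1e-8 are the library defaults.
args = TrainingArguments(
    output_dir="my_indo_qa_model",
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    num_train_epochs=3,
    lr_scheduler_type="linear",
    seed=42,
)
```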
### Training results
| Training Loss | Epoch | Step | Validation Loss |
|---|---|---|---|
| No log | 1.0 | 160 | 1.8313 |
| No log | 2.0 | 320 | 1.5645 |
| No log | 3.0 | 480 | 1.5991 |
### Evaluation metrics
| Metric | Value |
|---|---|
| F1 Score | 0.0873 |
| Precision | 0.0920 |
| Recall | 0.0831 |
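For reference, SQuAD-style precision, recall, and F1 are computed from token overlap between the predicted and reference answers. A minimal sketch on a toy prediction/reference pair (simple whitespace tokenization, no normalization beyond lowercasing):

```python
from collections import Counter

# Token-overlap precision/recall/F1, as used for extractive QA.
def qa_scores(prediction, reference):
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    common = Counter(pred_tokens) & Counter(ref_tokens)
    overlap = sum(common.values())
    if overlap == 0:
        return 0.0, 0.0, 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# One shared token ("kota") out of 3 predicted and 2 reference tokens.
p, r, f1 = qa_scores("ibu kota Indonesia", "kota Jakarta")
print(p, r, f1)
```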
### Framework versions
- Transformers: 4.54.0
- PyTorch: 2.6.0+cu124
- Datasets: 4.0.0
- Tokenizers: 0.21.2