Update README.md
README.md CHANGED

---
library_name: transformers
base_model: dbmdz/bert-base-german-uncased
license: mit
language:
- de
model-index:
- name: LernnaviBERT
  results: []
---

# LernnaviBERT Model Card

LernnaviBERT is a fine-tuning of [German BERT](https://huggingface.co/dbmdz/bert-base-german-uncased) on educational text data from the Lernnavi Intelligent Tutoring System (ITS). It is trained on masked language modeling following the BERT training scheme.

### Model Sources

- **Repository:** [https://github.com/epfl-ml4ed/answer-forecasting](https://github.com/epfl-ml4ed/answer-forecasting)
- **Paper:** [https://arxiv.org/abs/2405.20079](https://arxiv.org/abs/2405.20079)

### Direct Use

Being a fine-tuning of a base BERT model, LernnaviBERT is suitable for all standard BERT use cases, especially in the educational domain and in the German language.
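
A minimal usage sketch with the `fill-mask` pipeline; note that the hub id `epfl-ml4ed/LernnaviBERT` is an assumption here, not something this card states:

```python
from transformers import pipeline

# Load LernnaviBERT for masked-token prediction.
# NOTE: the hub id below is assumed; adjust it to the actual checkpoint path.
fill_mask = pipeline("fill-mask", model="epfl-ml4ed/LernnaviBERT")

# German example sentence with one masked token.
for prediction in fill_mask("Die Schüler lösen eine [MASK] im Unterricht."):
    print(prediction["token_str"], round(prediction["score"], 3))
```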

### Downstream Use

LernnaviBERT has been fine-tuned for [MCQ answering](https://huggingface.co/epfl-ml4ed/MCQBert) and Student Answer Forecasting (like [MCQStudentBertCat](https://huggingface.co/epfl-ml4ed/MCQStudentBertCat) and [MCQStudentBertSum](https://huggingface.co/epfl-ml4ed/MCQStudentBertSum)), as described in [https://arxiv.org/abs/2405.20079](https://arxiv.org/abs/2405.20079).
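
For similar downstream work, the encoder can be loaded with a fresh task head and fine-tuned like any BERT checkpoint. The sketch below is illustrative only, not the paper's actual pipeline; the hub id and the two-label setup are assumptions:

```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "epfl-ml4ed/LernnaviBERT"  # assumed hub id

tokenizer = AutoTokenizer.from_pretrained(model_id)
# Attach an untrained classification head (e.g. answer correct/incorrect)
# on top of the pretrained encoder; num_labels=2 is an illustrative choice.
model = AutoModelForSequenceClassification.from_pretrained(model_id, num_labels=2)

inputs = tokenizer("Was ist die Hauptstadt der Schweiz? Bern.", return_tensors="pt")
logits = model(**inputs).logits  # shape (1, 2); meaningful only after fine-tuning
```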

## Training Details

The model was trained on text data from a real-world ITS, Lernnavi, on ~40k text pieces for 3 epochs with a batch size of 16, going from an initial perplexity of 1.21 on Lernnavi data to a final perplexity of 1.01.

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 16
- eval_batch_size: 16
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 3
- mixed_precision_training: Native AMP
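
These values map onto a standard `transformers` masked-LM setup roughly as follows. This is a sketch only, assuming the usual `Trainer` recipe; the exact training script and the Lernnavi data are not public:

```python
from transformers import (
    AutoModelForMaskedLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

tokenizer = AutoTokenizer.from_pretrained("dbmdz/bert-base-german-uncased")
model = AutoModelForMaskedLM.from_pretrained("dbmdz/bert-base-german-uncased")

# Mirror the hyperparameters listed above; the Adam betas/epsilon and the
# linear schedule are the transformers defaults.
args = TrainingArguments(
    output_dir="lernnavibert",
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    num_train_epochs=3,
    lr_scheduler_type="linear",
    fp16=True,  # corresponds to "Native AMP" mixed-precision training
)

# Standard BERT-style masking: 15% of input tokens are masked.
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm_probability=0.15)

# trainer = Trainer(model=model, args=args, data_collator=collator,
#                   train_dataset=..., eval_dataset=...)  # datasets elided
# trainer.train()
```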

### Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 0.0096        | 3.0   | 7215 | 0.0072          |
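
The reported perplexities appear consistent with these losses: for masked language modeling, perplexity is conventionally the exponential of the cross-entropy loss, so the final validation loss above lands near the quoted 1.01:

```python
import math

# Perplexity = exp(cross-entropy loss); 0.0072 is the final validation loss.
print(math.exp(0.0072))  # ~1.0072, i.e. ~1.01 after rounding
```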

## Citation

If you find this useful in your work, please cite our paper:

```
@misc{gado2024student,
      title={Student Answer Forecasting: Transformer-Driven Answer Choice Prediction for Language Learning},
      author={Elena Grazia Gado and Tommaso Martorella and Luca Zunino and Paola Mejia-Domenzain and Vinitra Swamy and Jibril Frej and Tanja Käser},
      year={2024},
      eprint={2405.20079},
      archivePrefix={arXiv},
}
```

```
Gado, E., Martorella, T., Zunino, L., Mejia-Domenzain, P., Swamy, V., Frej, J., Käser, T. (2024).
Student Answer Forecasting: Transformer-Driven Answer Choice Prediction for Language Learning.
In: Proceedings of the Conference on Educational Data Mining (EDM 2024).
```

### Framework versions

- Transformers 4.37.1