Inigohm123 committed on
Commit 2aa4a2e · verified · 1 Parent(s): b80e9ae

Update README.md

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -73,7 +73,7 @@ Normalized: informazio gehiago hitz puntu e hatxe u puntu eus web horrian
  ## Training

  ### Data preparation
- The training data was compiled by our research group from multiple heterogeneous sources and consists of approximately 9,784,905 sentences.
+ The training data was compiled by our research group from multiple heterogeneous sources and consists of approximately 9,784,905 sentences. This dataset is a subset of the data used in the training of the following machine translation model [mt-hitz-eu-es](https://huggingface.co/HiTZ/mt-hitz-eu-es)

  Prior to training, the data underwent preprocessing steps including cleaning, punctuation standardization, filtering, and the creation of aligned input–output sentence pairs for the capitalization and punctuation restoration task.