Upload SPLADE-PT-BR model v1.0.0
- README.md +5 -2
- model_metadata.json +2 -1
README.md
CHANGED
@@ -9,6 +9,7 @@ tags:
 - bert
 datasets:
 - unicamp-dl/mmarco
+- unicamp-dl/mrobust
 base_model: neuralmind/bert-base-portuguese-cased
 ---
 
@@ -36,8 +37,10 @@ SPLADE is a neural retrieval model that learns to expand queries and documents w
 
 ### Training Data
 
-- **
--
+- **Training Dataset**: mMARCO Portuguese (`unicamp-dl/mmarco`) - MS MARCO translated to Portuguese
+- Used for training with triplets (query, positive document, negative document)
+- **Validation Dataset**: mRobust (`unicamp-dl/mrobust`) - TREC Robust04 translated to Portuguese
+- Used for validation and evaluation during training
 - **Format**: Triplets (query, positive document, negative document)
 
 ### Training Configuration
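The README change describes the training data as triplets (query, positive document, negative document). As a rough illustration only, the sketch below models one such triplet in Python. The `Triplet` type, the `parse_triplet_line` helper, the tab-separated layout, and the sample sentences are all assumptions made for this example; the commit does not say how the triplets are stored or loaded.

```python
from typing import NamedTuple


class Triplet(NamedTuple):
    """One training example: a query with a relevant and a non-relevant document."""
    query: str
    positive: str
    negative: str


def parse_triplet_line(line: str) -> Triplet:
    """Parse one tab-separated line into a Triplet (hypothetical storage format)."""
    parts = line.rstrip("\n").split("\t")
    if len(parts) != 3:
        raise ValueError(f"expected 3 tab-separated fields, got {len(parts)}")
    return Triplet(*parts)


# Made-up sample line, not taken from mMARCO.
example = parse_triplet_line(
    "capital do Brasil\tBrasília é a capital do Brasil.\tLisboa é a capital de Portugal.\n"
)
print(example.query)
```

Keeping the three fields named makes it harder to swap the positive and negative documents by accident when batches are assembled.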
model_metadata.json
CHANGED
@@ -13,7 +13,8 @@
 },
 
 "training": {
-"
+"training_dataset": "mMARCO Portuguese (unicamp-dl/mmarco)",
+"validation_dataset": "mRobust (unicamp-dl/mrobust)",
 "num_iterations": 150000,
 "final_loss": 0.000047,
 "batch_size": 8,
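A minimal sketch of how the new `training_dataset` and `validation_dataset` fields might be read back, assuming `training` is a top-level key of `model_metadata.json`. The inline JSON copies only the fields visible in this diff; the rest of the file is not shown in the commit, and the `describe_training` helper is hypothetical.

```python
import json

# Fragment mirroring the "training" block visible in the diff.
# Assumption: "training" sits at the top level; other keys are omitted.
METADATA_JSON = """
{
  "training": {
    "training_dataset": "mMARCO Portuguese (unicamp-dl/mmarco)",
    "validation_dataset": "mRobust (unicamp-dl/mrobust)",
    "num_iterations": 150000,
    "final_loss": 0.000047,
    "batch_size": 8
  }
}
"""


def describe_training(metadata: dict) -> dict:
    """Collect the dataset names and basic run settings from the training block."""
    training = metadata.get("training", {})
    return {
        "train": training.get("training_dataset"),
        "validation": training.get("validation_dataset"),
        "iterations": training.get("num_iterations"),
        "batch_size": training.get("batch_size"),
    }


summary = describe_training(json.loads(METADATA_JSON))
print(summary)
```

Using `dict.get` means metadata files written before this commit, which lack the two dataset fields, yield `None` instead of raising a `KeyError`.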