Update README.md

README.md CHANGED
@@ -14,22 +14,24 @@ base_model:
 - deepset/gbert-base
 ---
 
-#
-
-## Model Details
-
-### Model Description
-- **Model Type:** PyLate model
-- **Base model:** [deepset/gbert-base](https://huggingface.co/deepset/gbert-base)
-- **Document Length:** 180 tokens
-- **Query Length:** 32 tokens
-- **Output Dimensionality:** 128 tokens
-- **Similarity Function:** MaxSim
-- **Training Dataset:** samheym/ger-dpr-collection
-- **Language:** de
-<!-- - **License:** Unknown -->
+# Model Overview
+
+GerColBERT is a ColBERT-based retrieval model trained on German text. It is designed for efficient late interaction-based retrieval while maintaining high-quality ranking performance.
+
+Training Configuration
+
+- Base Model: [deepset/gbert-base](https://huggingface.co/deepset/gbert-base)
+- Training Dataset: samheym/ger-dpr-collection
+- Dataset: 10% of randomly selected triples from the final dataset
+- Vector Length: 128
+- Maximum Document Length: 256 characters
+- Batch Size: 50
+- Training Steps: 80,000
+- Gradient Accumulation: 1 step
+- Learning Rate: 5 × 10⁻⁶
+- Optimizer: AdamW
+- In-Batch Negatives: Included
 
@@ -55,17 +57,7 @@ model = models.ColBERT(
 
 
 
-## Training Details
-
-### Framework Versions
-- Python: 3.12.3
-- Sentence Transformers: 3.4.1
-- PyLate: 1.1.4
-- Transformers: 4.48.2
-- PyTorch: 2.6.0+cu124
-- Accelerate: 1.4.0
-- Datasets: 2.21.0
-- Tokenizers: 0.21.0
 
 <!--
 ## Citation
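The removed model details list MaxSim as the similarity function, and the new overview keeps the same late-interaction design. As a quick illustration of what MaxSim computes, here is a minimal sketch in plain Python (toy 2-dimensional token vectors for readability; the actual model emits 128-dimensional vectors per token, per the Vector Length setting above — the `maxsim` helper name is ours, not part of PyLate):

```python
def maxsim(query_vecs, doc_vecs):
    """Late-interaction MaxSim score: for each query token embedding,
    take the maximum dot product over all document token embeddings,
    then sum those maxima across the query tokens."""
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))
    return sum(max(dot(q, d) for d in doc_vecs) for q in query_vecs)

# Toy example: two query-token vectors against two document-token vectors.
query = [[1.0, 0.0], [0.0, 1.0]]
doc = [[1.0, 0.0], [0.5, 0.5]]
print(maxsim(query, doc))  # 1.0 + 0.5 = 1.5
```

Because each query token matches its best document token independently, documents are scored without pooling token embeddings into a single vector, which is what allows the model to keep fine-grained ranking quality at retrieval time.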