diaslmb committed on
Commit 449adaa · verified · 1 Parent(s): 4ddc8b1

Update README.md

Files changed (1)
  1. README.md +28 -13
README.md CHANGED
@@ -49,6 +49,13 @@ We evaluated both the **base BGE-M3** and the **fine-tuned AVI-M3** on a held-ou

  ---

+ ### 📚 Dataset
+ - **Training set:** 1088 examples
+ - **Evaluation set:** 273 examples (~20% held-out split)
+ - **Task:** Query → Positive passage retrieval
+
+ ---
+
  ## 📊 Metrics Explained

  - **Accuracy:** proportion of queries where the top-1 retrieved document is correct.
@@ -57,22 +64,30 @@ We evaluated both the **base BGE-M3** and the **fine-tuned AVI-M3** on a held-ou

  ---

- ## 🛠️ Training
+ ### 💻 Hardware

- - **Training dataset:** Custom domain-specific dataset (1088 train, 273 eval)
- - **Evaluation dataset:** 273 examples (~20 % held-out split)
- - **Hardware:** 1× NVIDIA A40 48 GB
- - **Batch size:** (depends on your config)
- - **Optimizer:** AdamW
- - **Loss:** Contrastive / InfoNCE
- - **Framework:** FlagEmbedding
+ - **GPU:** 1× NVIDIA A40 (48 GB VRAM)
+ - **Precision:** FP16 with gradient checkpointing
+ - **Effective batch size:** 32 (8 × grad accumulation 4)

  ---

- ## 💡 Usage Example (Python)
+ ## 🛠️ Training

- ```python
- from FlagEmbedding import FlagModel

- model = FlagModel("shaipro/avi-m3", query_max_len=128, doc_max_len=512, use_fp16=True)
- embeddings = model.encode(["your query"], normalize_embeddings=True)
+ - **Evaluation dataset:** 273 examples (~20 % held-out split)
+ - **Epochs:** 10
+ - **Learning rate:** 2e-5
+ - **Per-device batch size:** 8
+ - **Gradient accumulation:** 4
+ - **Pooling method:** `cls`
+ - **Temperature:** 0.02
+ - **Loss:** `m3_kd_loss` (knowledge distillation + contrastive)
+ - **Knowledge distillation:** Enabled
+ - **Self-distillation:** Enabled
+ - **Unified fine-tuning:** Enabled
+ - **Encoder freezing:** Disabled
+ - **Optimizer:** AdamW
+ - **Scheduler:** Linear with 10% warmup
+
+ ---
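
Note on the new training figures: the README states an effective batch size of 32, 10 epochs, a 2e-5 learning rate, and a linear scheduler with 10% warmup. The sketch below works through what those numbers would imply for optimizer steps and the learning-rate curve. It is only an illustration under stated assumptions, not the training script used for this commit. The names `lr_at`, `steps_per_epoch`, and `warmup_steps` are hypothetical, and it assumes that a partial final batch still counts as a step and that warmup is the stated percentage of total optimizer steps. The actual run may count steps differently.

```python
import math

# Values quoted in the updated README.
train_examples = 1088
per_device_batch_size = 8
gradient_accumulation_steps = 4
num_gpus = 1              # "1x NVIDIA A40"
epochs = 10
base_lr = 2e-5
warmup_percent = 10       # "Linear with 10% warmup"

# Examples consumed per optimizer update.
effective_batch_size = per_device_batch_size * gradient_accumulation_steps * num_gpus

# Assumption: a partial final batch (if any) still triggers an update.
steps_per_epoch = math.ceil(train_examples / effective_batch_size)
total_steps = steps_per_epoch * epochs

# Assumption: warmup length is the stated percentage of total optimizer steps.
warmup_steps = math.ceil(total_steps * warmup_percent / 100)


def lr_at(step: int) -> float:
    """Linear warmup from 0 to base_lr, then linear decay back to 0."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


print(effective_batch_size, steps_per_epoch, total_steps, warmup_steps)
```

Under these assumptions the 1088 training examples divide evenly into 34 updates per epoch, so the 10% warmup would cover roughly the first epoch.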