phonghoccode committed on
Commit cb8f17a · verified · 1 Parent(s): 89e3b6e

Model save

Files changed (1): README.md +39 -117
README.md CHANGED
@@ -1,137 +1,59 @@
  ---
  tags:
- - sentence-transformers
- - cross-encoder
- - reranker
- pipeline_tag: text-ranking
- library_name: sentence-transformers
  ---
- # CrossEncoder

- This is a [Cross Encoder](https://www.sbert.net/docs/cross_encoder/usage/usage.html) model trained using the [sentence-transformers](https://www.SBERT.net) library. It computes scores for pairs of texts, which can be used for text reranking and semantic search.

- ## Model Details

- ### Model Description
- - **Model Type:** Cross Encoder
- <!-- - **Base model:** [Unknown](https://huggingface.co/unknown) -->
- - **Maximum Sequence Length:** 8192 tokens
- - **Number of Output Labels:** 1 label
- <!-- - **Training Dataset:** Unknown -->
- <!-- - **Language:** Unknown -->
- <!-- - **License:** Unknown -->

- ### Model Sources

- - **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
- - **Documentation:** [Cross Encoder Documentation](https://www.sbert.net/docs/cross_encoder/usage/usage.html)
- - **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
- - **Hugging Face:** [Cross Encoders on Hugging Face](https://huggingface.co/models?library=sentence-transformers&other=cross-encoder)

- ## Usage

- ### Direct Usage (Sentence Transformers)

- First install the Sentence Transformers library:

- ```bash
- pip install -U sentence-transformers
- ```

- Then you can load this model and run inference.
- ```python
- from sentence_transformers import CrossEncoder

- # Download from the 🤗 Hub
- model = CrossEncoder("phonghoccode/ALQAC_2025_Reranker_top20")
- # Get scores for pairs of texts
- pairs = [
-     ['How many calories in an egg', 'There are on average between 55 and 80 calories in an egg depending on its size.'],
-     ['How many calories in an egg', 'Egg whites are very low in calories, have no fat, no cholesterol, and are loaded with protein.'],
-     ['How many calories in an egg', 'Most of the calories in an egg come from the yellow yolk in the center.'],
- ]
- scores = model.predict(pairs)
- print(scores.shape)
- # (3,)

- # Or rank different texts based on similarity to a single text
- ranks = model.rank(
-     'How many calories in an egg',
-     [
-         'There are on average between 55 and 80 calories in an egg depending on its size.',
-         'Egg whites are very low in calories, have no fat, no cholesterol, and are loaded with protein.',
-         'Most of the calories in an egg come from the yellow yolk in the center.',
-     ]
- )
- # [{'corpus_id': ..., 'score': ...}, {'corpus_id': ..., 'score': ...}, ...]
- ```
- <!--
- ### Direct Usage (Transformers)

- <details><summary>Click to see the direct usage in Transformers</summary>

- </details>
- -->

- <!--
- ### Downstream Usage (Sentence Transformers)
-
- You can finetune this model on your own dataset.
-
- <details><summary>Click to expand</summary>
-
- </details>
- -->
-
- <!--
- ### Out-of-Scope Use
-
- *List how the model may foreseeably be misused and address what users ought not to do with the model.*
- -->
-
- <!--
- ## Bias, Risks and Limitations
-
- *What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
- -->
-
- <!--
- ### Recommendations
-
- *What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
- -->
-
- ## Training Details
-
- ### Framework Versions
- - Python: 3.10.12
- - Sentence Transformers: 5.0.0
- - Transformers: 4.53.2
- - PyTorch: 2.7.0+cu126
- - Accelerate: 1.8.1
- - Datasets: 4.0.0
- - Tokenizers: 0.21.2
-
- ## Citation
-
- ### BibTeX
-
- <!--
- ## Glossary
-
- *Clearly define terms in order to be accessible across audiences.*
- -->
-
- <!--
- ## Model Card Authors
-
- *Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
- -->
-
- <!--
- ## Model Card Contact
-
- *Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
- -->
 
  ---
+ library_name: transformers
+ license: apache-2.0
+ base_model: phonghoccode/ALQAC_2025_Reranker_top20_zalo
  tags:
+ - generated_from_trainer
+ model-index:
+ - name: ALQAC_2025_Reranker_top20
+   results: []
  ---

+ <!-- This model card has been generated automatically according to the information the Trainer had access to. You
+ should probably proofread and complete it, then remove this comment. -->

+ # ALQAC_2025_Reranker_top20

+ This model is a fine-tuned version of [phonghoccode/ALQAC_2025_Reranker_top20_zalo](https://huggingface.co/phonghoccode/ALQAC_2025_Reranker_top20_zalo) on an unknown dataset.

+ ## Model description

+ More information needed

+ ## Intended uses & limitations

+ More information needed

+ ## Training and evaluation data

+ More information needed

+ ## Training procedure

+ ### Training hyperparameters

+ The following hyperparameters were used during training:
+ - learning_rate: 6e-05
+ - train_batch_size: 4
+ - eval_batch_size: 8
+ - seed: 42
+ - distributed_type: multi-GPU
+ - num_devices: 2
+ - total_train_batch_size: 8
+ - total_eval_batch_size: 16
+ - optimizer: adamw_torch with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
+ - lr_scheduler_type: linear
+ - lr_scheduler_warmup_ratio: 0.1
+ - num_epochs: 3.0
+ - mixed_precision_training: Native AMP
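As a sanity check on the hyperparameters above, a minimal plain-Python sketch (not the Trainer's internals; the total step count is an assumed placeholder) of how a linear scheduler with `lr_scheduler_warmup_ratio: 0.1` ramps the learning rate up to the 6e-05 peak and then decays it to zero, plus the effective batch-size arithmetic behind `total_train_batch_size`:

```python
def linear_schedule_lr(step, total_steps, peak_lr=6e-5, warmup_ratio=0.1):
    """Linear warmup to peak_lr over the first warmup_ratio of steps,
    then linear decay to 0 over the remaining steps."""
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    return peak_lr * max(0.0, (total_steps - step) / max(1, total_steps - warmup_steps))

# Effective batch size: per-device train batch 4 across 2 GPUs
# matches the reported total_train_batch_size of 8.
total_train_batch_size = 4 * 2

# With a hypothetical 1000-step run, the peak is reached at step 100 (10%):
peak_step_lr = linear_schedule_lr(100, 1000)   # 6e-05
midpoint_lr = linear_schedule_lr(550, 1000)    # halfway through decay: 3e-05
```

This mirrors the shape of a standard linear-warmup/linear-decay schedule; the real per-step values depend on the actual number of optimization steps, which the card does not report.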

+ ### Training results

+ ### Framework versions

+ - Transformers 4.53.2
+ - Pytorch 2.7.0+cu126
+ - Datasets 4.0.0
+ - Tokenizers 0.21.2