Shitao committed
Commit c378214 · verified · 1 parent: 6441bc4

Upload folder using huggingface_hub

Files changed (1): README.md (+34 −8)
README.md CHANGED
@@ -1,19 +1,24 @@
 ---
-{}
+pipeline_tag: sentence-similarity
+tags:
+- sentence-transformers
+- feature-extraction
+- sentence-similarity
+license: mit
 ---
-# LLARA-7B-Passage
 
-This model is fine-tuned from LLaMA-2-7B using LoRA and the embedding size is 4096.
+For more details, please refer to our GitHub repo: https://github.com/FlagOpen/FlagEmbedding
 
-## Training Data
+# LLARA ([paper](https://arxiv.org/pdf/2312.15503))
 
-The model is fine-tuned on the training split of [MS MARCO Passage Ranking](https://microsoft.github.io/msmarco/Datasets) datasets for 1 epoch. Please check our paper for details.
+In this project, we introduce LLaRA:
+- EBAE: Embedding-Based Auto-Encoding.
+- EBAR: Embedding-Based Auto-Regression.
 
-## Usage
-
-Below is an example to encode a query and a passage, and then compute their similarity using their embedding.
+## Usage
 
 ```python
 import torch
 from transformers import AutoModel, AutoTokenizer, LlamaModel
@@ -92,6 +97,27 @@
 score = query_embedding @ passage_embeddings.T
 print(score)
 ```
+
+## Acknowledgement
+
+Thanks to the authors of open-sourced datasets, including MS MARCO and BEIR, among others.
+Thanks to open-sourced libraries like [Pyserini](https://github.com/castorini/pyserini).
+
+## Citation
+
+If you find this repository useful, please consider giving it a star :star: and a citation:
+
+```
+@misc{li2023making,
+      title={Making Large Language Models A Better Foundation For Dense Retrieval},
+      author={Chaofan Li and Zheng Liu and Shitao Xiao and Yingxia Shao},
+      year={2023},
+      eprint={2312.15503},
+      archivePrefix={arXiv},
+      primaryClass={cs.CL}
+}
+```
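The diff shows only the start and end of the README's usage snippet, so the scoring step it finishes with can be sketched in isolation. The embeddings below are random stand-ins for real model outputs (a minimal sketch, not the model's actual encoding code); the 4096-dimensional size is taken from the old README:

```python
import torch

# Random stand-ins for real model outputs; LLaRA embeddings are 4096-dim.
torch.manual_seed(0)
query_embedding = torch.nn.functional.normalize(torch.randn(1, 4096), dim=-1)
passage_embeddings = torch.nn.functional.normalize(torch.randn(2, 4096), dim=-1)

# Dot product of L2-normalized vectors is cosine similarity: one score per passage.
score = query_embedding @ passage_embeddings.T
print(score)  # tensor of shape (1, 2), values in [-1, 1]
```

Because the vectors are normalized, the matrix product yields cosine similarities, so a higher score indicates a closer query–passage match.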