Upload folder using huggingface_hub

Files changed (7)

README.md CHANGED

@@ -1,19 +1,24 @@
 ---
-{}
+pipeline_tag: sentence-similarity
+tags:
+- sentence-transformers
+- feature-extraction
+- sentence-similarity
+license: mit
 ---
-# LLARA-7B-BEIR
-This model is fine-tuned from LLaMA-2-7B using LoRA and the embedding size is 4096.
-## Training Data
-The model is fine-tuned on the training split of [MS MARCO Passage Ranking](https://microsoft.github.io/msmarco/Datasets) datasets for 1 epoch. Please check our paper for details.
-## Usage
-Below is an example to encode a query and a passage, and then compute their similarity using their embedding.
-```python
+For more details please refer to our github repo: https://github.com/FlagOpen/FlagEmbedding
+# LLARA ([paper](https://arxiv.org/pdf/2312.15503))
+In this project, we introduce LLaRA:
+- EBAE: Embedding-Based Auto-Encoding.
+- EBAR: Embedding-Based Auto-Regression.
+## Usage
+```
 import torch
 from transformers import AutoModel, AutoTokenizer, LlamaModel
@@ -64,8 +69,8 @@ def get_passage_inputs(passages, tokenizer, max_length=512):
         )
 # Load the tokenizer and model
-tokenizer = AutoTokenizer.from_pretrained('cfli/LLARA-beir')
-model = AutoModel.from_pretrained('cfli/LLARA-beir')
+tokenizer = AutoTokenizer.from_pretrained('BAAI/LLARA-beir')
+model = AutoModel.from_pretrained('BAAI/LLARA-beir')
 # Define query and passage inputs
 query = "What is llama?"
@@ -92,6 +97,27 @@ with torch.no_grad():
     score = query_embedding @ passage_embeddings.T
     print(score)
+```
+## Acknowledgement
+Thanks to the authors of open-sourced datasets, including MSMARCO, BEIR, etc.
+Thanks to the open-sourced libraries like [Pyserini](https://github.com/castorini/pyserini).
+## Citation
+If you find this repository useful, please consider giving a star :star: and citation
+```
+@misc{li2023making,
+      title={Making Large Language Models A Better Foundation For Dense Retrieval},
+      author={Chaofan Li and Zheng Liu and Shitao Xiao and Yingxia Shao},
+      year={2023},
+      eprint={2312.15503},
+      archivePrefix={arXiv},
+      primaryClass={cs.CL}
+}
 ```

model-00001-of-00006.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:01dd7ea5dfa7418b6f2d6a29c50d7a632a74459e5445f6d13d04b22f23201f13
+oid sha256:d6fccd8125fbe08012de41b19f254477b4fb7653016d54f63a4cf05d6344a058
 size 4840658560

model-00002-of-00006.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:bde0b61c7e53a84ddc7432a2049fb125191da2a7d5dbae7fe4a592f35e2224fc
+oid sha256:2ab32cbdc54b39e8f04666421c5e6ad181b59954cf257a9d131d55ebb16e1268
 size 4857206856

model-00003-of-00006.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:75c9c8c1a7ffdc359e7be1ec0e40c7598d2a2167c1ea1380ad67b5409c80c4f2
+oid sha256:9ca1bfeacefd35b9dd1c801c45fa0ad809071faccdd99253e4cf5255cf38ee9b
 size 4857206904

model-00004-of-00006.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:b99f7143e6bb96b6d56fbda293f97e4aa7554a13691949e5aa0d3ccdd422029f
+oid sha256:0b027c43b746337990b12d3f0f6f7b2c2dc0524cd632738eab22557bc0446be0
 size 4857206904

model-00005-of-00006.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:728bfd285dd8bf48458715de351982e1fa9449a8cf8111f2e9da05e567e3c346
+oid sha256:54890415310f56758951b79a88be1592fba4bd5168053f379c59c4a6e7f1bd25
 size 4857206904

model-00006-of-00006.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:d5034cc2eafecc0c004dad0880a7e7d75a61bd8ecdf2b1e4bfdc6bd1b8f9e8b7
+oid sha256:8f8fb6aa36b09ed23b828f92500c76dc784a22511661bd2200e76cf79d9c52a7
 size 2684734256
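
Each `.safetensors` entry above is a Git LFS pointer rather than the weights themselves: a `version` line, an `oid sha256:…` digest, and a `size` in bytes. In this commit every shard keeps its size while its digest changes, which suggests the contents were replaced by files of identical byte length. The following is a minimal sketch, not part of the commit, of how a downloaded shard could be checked against its pointer. The helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical and introduced only for illustration.

```python
import hashlib
from pathlib import Path


def parse_lfs_pointer(text: str) -> dict:
    """Parse the key/value lines of a Git LFS pointer file (hypothetical helper)."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    # The oid line has the form "<algorithm>:<hex digest>".
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer: dict, chunk_size: int = 1 << 20) -> bool:
    """Return True if the file has the size and digest recorded in the pointer."""
    path = Path(path)
    # Cheap check first: a size mismatch rules the file out without hashing.
    if path.stat().st_size != pointer["size"]:
        return False
    hasher = hashlib.new(pointer["algo"])
    with path.open("rb") as f:
        # Read in chunks so multi-gigabyte shards are not loaded into memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]


# Pointer contents copied from the new side of the last shard's diff.
pointer_text = """\
version https://git-lfs.github.com/spec/v1
oid sha256:8f8fb6aa36b09ed23b828f92500c76dc784a22511661bd2200e76cf79d9c52a7
size 2684734256
"""
pointer = parse_lfs_pointer(pointer_text)
print(pointer["algo"], pointer["size"])
```

Assuming the shard has been downloaded to a local path, `matches_pointer(local_path, pointer)` would report whether it corresponds to the new revision of the file.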