s8frbroy committed on
Commit 816e2c1 · verified · 1 Parent(s): 7b420a5

Update README.md

Files changed (1)
  1. README.md +42 -9
README.md CHANGED
@@ -22,6 +22,36 @@ It serves as the **key-side encoder** in a **dual-encoder (DPR-style)** retrieva
 
  ---
 
  ## 🧩 Model Overview
 
  | Property | Description |
@@ -63,15 +93,18 @@ Before training, a **domain adaptation stage** aligned each talk with its own pa
 
  ---
 
- ## 💡 Usage Example
 
- ```python
- from transformers import AutoTokenizer, AutoModel
- import torch
 
- model = AutoModel.from_pretrained("s8frbroy/talk2ref_ref_key_cited_paper_encoder")
- tokenizer = AutoTokenizer.from_pretrained("sentence-transformers/all-MiniLM-L6-v2")
 
- paper_text = "We propose a retrieval architecture for linking long spoken documents to their references..."
- inputs = tokenizer(paper_text, return_tensors="pt", truncation=True, padding=True, max_length=512)
- embeddings = model(**inputs).last_hidden_state.mean(dim=1) # mean pooling
 
  ---
 
+ ---
+
+ ## 🎯 Usage
+
+ Example with `transformers`:
+
+ ```python
+ from transformers import AutoModel, AutoTokenizer
+ import torch
+
+ # Load model and tokenizer (tokenizer name as in the earlier usage example)
+ model = AutoModel.from_pretrained("s8frbroy/talk2ref_ref_key_cited_paper_encoder")
+ tokenizer = AutoTokenizer.from_pretrained("sentence-transformers/all-MiniLM-L6-v2")
+
+ # Example input
+ title = "Attention Is All You Need"
+ year = 2017
+ abstract = "The Transformer model replaces recurrence with attention mechanisms for ..."
+
+ # Build input in Talk2Ref format
+ key_text = f"Title: {title}. Published in {year}. Abstract: {abstract}"
+
+ # Tokenize first: AutoModel expects tensors, not raw strings
+ inputs = tokenizer([key_text], return_tensors="pt", truncation=True, padding=True, max_length=512)
+
+ # Compute embedding (mean pooling over tokens, as in the earlier usage example)
+ with torch.no_grad():
+     embedding = model(**inputs).last_hidden_state.mean(dim=1)
+
+ print(embedding.shape)  # (1, hidden_dim)
+ ```
+
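When many cited papers have to be encoded, the `key_text` template from the usage example above can be wrapped in a small helper. This is a minimal sketch and not part of the commit: the function name `build_key_text` and the `papers` records are hypothetical and chosen for illustration; only the `Title: ... Published in ... Abstract: ...` template comes from the example.

```python
def build_key_text(title: str, year: int, abstract: str) -> str:
    """Format one paper with the template shown in the usage example."""
    return f"Title: {title}. Published in {year}. Abstract: {abstract}"


# Hypothetical records, for illustration only.
papers = [
    {
        "title": "Attention Is All You Need",
        "year": 2017,
        "abstract": "The Transformer model replaces recurrence with attention mechanisms for ...",
    },
]

# One formatted string per paper, in the same order as the input records.
key_texts = [build_key_text(p["title"], p["year"], p["abstract"]) for p in papers]
print(key_texts[0])
```

Keeping the formatting in one function helps ensure that every paper is presented to the encoder in the same layout.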
  ## 🧩 Model Overview
 
  | Property | Description |
 
 
  ---
 
 
+ ## Citation
 
+ If you use this dataset, please cite the following paper:
 
+ ```bibtex
+ @misc{broy2025talk2refdatasetreferenceprediction,
+   title         = {Talk2Ref: A Dataset for Reference Prediction from Scientific Talks},
+   author        = {Frederik Broy and Maike Züfle and Jan Niehues},
+   year          = {2025},
+   eprint        = {2510.24478},
+   archivePrefix = {arXiv},
+   primaryClass  = {cs.CL},
+   url           = {https://arxiv.org/abs/2510.24478}
+ }