s8frbroy committed · Commit bda8d80 · verified · 1 Parent(s): c5ec30f

Update README.md

Files changed (1): README.md (+10 -7)
README.md CHANGED
@@ -52,18 +52,21 @@ print(embedding.shape) # (1, hidden_dim)
 
 ## 🧩 Model Overview
 
-| Property | Description |
-|-----------|--------------|
-| **Architecture** | Sentence-BERT (all-MiniLM-L6-v2 backbone) |
-| **Pooling** | Weighted mean aggregation over transcript chunks |
-| **Max tokens per chunk** | 512 |
-| **Trained on** | Talk2Ref dataset transcripts of 6,279 scientific talks |
-| **Objective** | Contrastive learning (DPR-style) using binary similarity loss |
+| **Property** | **Description** |
+|:-------------|:----------------|
+| **Architecture** | Sentence-BERT (`all-MiniLM-L6-v2` backbone) |
+| **Pooling Strategy** | Weighted mean aggregation over transcript chunks |
+| **Max Tokens per Chunk** | 512 |
+| **Input Representation** | Transcript + talk title + publication year |
+| **Training Objective** | Contrastive learning (DPR-style) using binary similarity loss |
+| **Training Data** | [Talk2Ref dataset](https://huggingface.co/datasets/s8frbroy/talk2ref) – transcripts of 6,279 scientific talks |
 | **Task** | Encode scientific talks into a shared semantic space with their cited papers |
+| **Paired Model** | [Talk2Ref Cited Paper Encoder](https://huggingface.co/s8frbroy/talk2ref_ref_key_cited_paper_encoder) |
 
 ---
 
 
+
 ## Citation
 
 If you use this dataset, please cite the following paper: