s8frbroy committed on
Commit f937764 · verified · 1 Parent(s): bda8d80

Update README.md

Files changed (1)
  1. README.md +8 -10
README.md CHANGED
@@ -52,16 +52,14 @@ print(embedding.shape) # (1, hidden_dim)
 
  ## 🧩 Model Overview
 
- | **Property** | **Description** |
- |:-------------|:----------------|
- | **Architecture** | Sentence-BERT (`all-MiniLM-L6-v2` backbone) |
- | **Pooling Strategy** | Weighted mean aggregation over transcript chunks |
- | **Max Tokens per Chunk** | 512 |
- | **Input Representation** | Transcript + talk title + publication year |
- | **Training Objective** | Contrastive learning (DPR-style) using binary similarity loss |
- | **Training Data** | [Talk2Ref dataset](https://huggingface.co/datasets/s8frbroy/talk2ref) transcripts of 6,279 scientific talks |
- | **Task** | Encode scientific talks into a shared semantic space with their cited papers |
- | **Paired Model** | [Talk2Ref Cited Paper Encoder](https://huggingface.co/s8frbroy/talk2ref_ref_key_cited_paper_encoder) |
+ | Property | Description |
+ |-----------|-------------|
+ | **Architecture** | Sentence-BERT (all-MiniLM-L6-v2 backbone) |
+ | **Pooling** | Mean pooling |
+ | **Max sequence length** | 512 tokens |
+ | **Training data** | Talk2Ref dataset (≈ 43 k cited papers linked to 6 k talks) |
+ | **Objective** | Contrastive binary (DPR-style) loss |
+ | **Task** | Encode cited papers into a shared semantic space with talk transcripts |
 
  ---
 
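
The "Mean pooling" row in the updated table can be illustrated with a minimal sketch. It assumes token embeddings arrive as plain lists of floats alongside a 0/1 attention mask. The function name `mean_pool` is hypothetical and not taken from the repository, and the model's actual pooling code may differ.

```python
from typing import List


def mean_pool(token_embeddings: List[List[float]], attention_mask: List[int]) -> List[float]:
    """Average token vectors, skipping padded positions (mask == 0).

    Hypothetical helper for illustration only; not the repository's code.
    """
    if len(token_embeddings) != len(attention_mask):
        raise ValueError("embeddings and mask must have the same length")
    if not token_embeddings:
        raise ValueError("need at least one token")

    dim = len(token_embeddings[0])
    total = [0.0] * dim
    count = 0
    for vec, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vec):
                total[i] += value

    if count == 0:
        raise ValueError("attention mask selects no tokens")
    # Divide the summed vector by the number of real (unmasked) tokens.
    return [s / count for s in total]


if __name__ == "__main__":
    tokens = [[1.0, 2.0], [3.0, 4.0], [9.0, 9.0]]  # last row stands in for padding
    mask = [1, 1, 0]
    print(mean_pool(tokens, mask))
```

Masking before averaging keeps padding tokens from diluting the sentence vector, which is why the mask is applied to both the sum and the token count.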