JasonWang1 committed on
Commit 1db1628 · verified · 1 Parent(s): 7b4c60d

Update README.md

Files changed (1): README.md +62 -1
README.md CHANGED
@@ -11,4 +11,65 @@ tags:
---

# BiTimeBERT

BiTimeBERT is pretrained on the New York Times Annotated Corpus with two temporal objectives: TAMLM (Time-Aware Masked Language Modeling) and DD (Document Dating). The DD task uses month-level temporal granularity, classifying documents into 246 month labels spanning the corpus timeline; as a result, the `seq_relationship` head outputs 246-class temporal predictions.
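To make the 246-label space concrete: the NYT Annotated Corpus covers January 1987 through June 2007, which is exactly 246 calendar months. A hypothetical mapping from a DD class index to a calendar month, assuming class 0 corresponds to January 1987 (the actual label ordering in the checkpoint may differ):

```python
def month_label(class_index: int) -> str:
    """Map a DD class index to a YYYY-MM string, assuming class 0 = 1987-01."""
    year = 1987 + class_index // 12
    month = class_index % 12 + 1
    return f"{year}-{month:02d}"

print(month_label(0))    # → 1987-01
print(month_label(245))  # → 2007-06
```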

## 🎯 Model Details

| Property | Value |
|----------|-------|
| **Base Model** | `bert-base-cased` |
| **Pretraining Tasks** | TAMLM + DD |
| **Temporal Granularity** | Month-level |
| **DD Labels** | 246 month classes |
| **Training Corpus** | NYT Annotated Corpus |
| **Framework** | PyTorch / Transformers |
| **Language** | English |
+
28
+
29
+ ## 🚀 How to Load This Model
30
+
31
+ ### ⚠️ Important: Custom Loading Required
32
+
33
+ Due to the modified `seq_relationship` head (246-class vs. standard 2-class NSP), you **cannot** load this model with the default `from_pretrained()` alone. Follow one of the methods below:
34
+
35
+ ---
36
+

### Helper Function for Loading BiTimeBERT

```python
import torch
import torch.nn as nn
from transformers import BertForPreTraining, BertTokenizer, BertConfig
from huggingface_hub import hf_hub_download
import safetensors.torch as safetensors_lib


def load_bitimebert(model_id="JasonWang1/BiTimeBERT", device=None, num_temporal_labels=246):
    if device is None:
        device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

    # Load config and tokenizer
    config = BertConfig.from_pretrained(model_id)
    tokenizer = BertTokenizer.from_pretrained(model_id)

    # Load the model, ignoring the size mismatch on the DD head
    model = BertForPreTraining.from_pretrained(
        model_id,
        config=config,
        ignore_mismatched_sizes=True
    )

    # Replace the 2-class NSP head with a DD head of the correct dimension
    model.cls.seq_relationship = nn.Linear(config.hidden_size, num_temporal_labels)

    # Download the checkpoint and load the DD head weights from safetensors
    weights_path = hf_hub_download(repo_id=model_id, filename="model.safetensors")
    state_dict = safetensors_lib.load_file(weights_path, device='cpu')

    if 'cls.seq_relationship.weight' in state_dict:
        model.cls.seq_relationship.weight.data = state_dict['cls.seq_relationship.weight']
        model.cls.seq_relationship.bias.data = state_dict['cls.seq_relationship.bias']

    model.eval()
    return model.to(device), tokenizer


# ================= Usage =================
model, tokenizer = load_bitimebert("JasonWang1/BiTimeBERT")
```
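Once loaded, DD predictions come from the `seq_relationship_logits` field of `BertForPreTraining`'s output. A minimal sketch of reading them, using a tiny randomly initialized config as a stand-in so it runs without downloading the checkpoint (with the real model, use the `model` and `tokenizer` returned by the helper above):

```python
import torch
import torch.nn as nn
from transformers import BertForPreTraining, BertConfig

# Tiny stand-in model; replace with the model/tokenizer from the helper above.
config = BertConfig(hidden_size=32, num_hidden_layers=1, num_attention_heads=2,
                    intermediate_size=64, vocab_size=100)
model = BertForPreTraining(config)
model.cls.seq_relationship = nn.Linear(config.hidden_size, 246)  # 246-class DD head
model.eval()

input_ids = torch.tensor([[1, 2, 3, 4]])  # stand-in for tokenizer(...) output
with torch.no_grad():
    outputs = model(input_ids=input_ids)

# BertForPreTraining exposes the second head's output as seq_relationship_logits;
# in BiTimeBERT this is the 246-class temporal (DD) prediction.
dd_logits = outputs.seq_relationship_logits
predicted_month_class = dd_logits.argmax(dim=-1).item()
print(dd_logits.shape)  # → torch.Size([1, 246])
```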