ojhfklsjhl committed
Commit 8c12f3a · verified · 1 parent: edb3859

Update README.md

Files changed (1): README.md (+1, −8)
README.md CHANGED
@@ -101,14 +101,7 @@ print(f"Text: {text}")
   print(f"Embedding (first 10 dimensions): {cls_embedding[:10].tolist()}")
   ```
 
- ### A note about model choice
-
- Even though NoLBERT has the advantage of no lookahead and lookback bias, researchers should carefully consider their model choice on a case-by-case basis, especially for long texts.
-
- In particular, there is a bias–performance trade-off between NoLBERT or other custom small models (or simpler NLP methods, e.g., BoW, Word2Vec, etc.) versus large industrial-grade language models. On one hand, a BERT-like custom information-leakage-free model avoids temporal inconsistencies by design. On the other hand, these models lack the ability to process long texts due to limited context windows, and their output text representations are often of lower quality compared to large models trained on unconstrained data.
-
- The advantage of avoiding temporal biases is pronounced in tasks where models must predict outcomes that go beyond the information explicitly stated in the text, such as forecasting stock price reactions from earnings call transcripts, despite the tradeoff of having less precise text representations. However, for in-context information retrieval tasks such as summarization, classification, and other NLP tasks based on given precise guidelines, the risk of information leakage from the model’s out-of-context knowledge base is limited (with careful prompting and verification, or by using methods like RAG). Therefore, large, highly performant models may be preferable.
-
+ See [paper](https://arxiv.org/abs/2509.01110) for more details (NeurIPS 2025 Gen AI in Finance).
 
   ## Citation