Update README.md
README.md CHANGED

@@ -183,7 +183,7 @@ bert-chunker-3 (prob_threshold=0.50543) | N/A | 0 | 90.4 ± 28.7 | 3.3 ± 3.1 |
★ bert-chunker-3.5 | <= 200 | 0 | 90.4 ± 26.2 | 7.7 ± 5.7 | 29.2 ± 17.9 | 7.6 ± 5.7 | **O(N)** | **Yes**

## Future
This model is undertrained due to a lack of money and laziness. I observed that it is still undertrained for two reasons:

- - I
+ - I trained it on 110K windows of data for 2~3 epochs, and the eval loss kept decreasing.
- The outputs from two non-overlapping windows show poor comparability in their split-point probabilities, which undermines performance when max_tokens_per_chunk is large (the comparison is sketched below). I think more data will amplify the probability differences and enhance their comparability; some of my experiments corroborated this. So the next version will probably just use more data.

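The comparability problem in the last bullet shows up wherever split probabilities computed in separate windows are ranked against one another to enforce the token budget. Below is a minimal sketch of that mechanic, assuming a bert-chunker-style token-classification checkpoint loaded through Hugging Face transformers; the model id, window size, label convention, and threshold here are illustrative assumptions, not the official inference code.

```python
# A minimal sketch, NOT the official inference code: per-token split
# probabilities from a token-classification checkpoint, computed over
# non-overlapping windows, then ranked to enforce max_tokens_per_chunk.
import torch
from transformers import AutoModelForTokenClassification, AutoTokenizer

MODEL_ID = "tim1900/bert-chunker-3"  # assumed id; swap in the 3.5 checkpoint

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForTokenClassification.from_pretrained(MODEL_ID)
model.eval()

def split_probs(ids: torch.Tensor, window: int = 512) -> torch.Tensor:
    """P(split) for every token, one non-overlapping window at a time."""
    probs = []
    for start in range(0, len(ids), window):
        piece = ids[start:start + window].unsqueeze(0)
        with torch.no_grad():
            logits = model(input_ids=piece).logits  # (1, seq_len, 2)
        # Assumed label convention: class 1 means "a chunk may end here".
        probs.append(logits.softmax(-1)[0, :, 1])
    return torch.cat(probs)

def chunk(text: str, max_tokens_per_chunk: int = 200, threshold: float = 0.5):
    ids = tokenizer(text, add_special_tokens=False,
                    return_tensors="pt")["input_ids"][0]
    p = split_probs(ids)
    chunks, start = [], 0
    while start < len(ids):
        budget = p[start:start + max_tokens_per_chunk]
        confident = (budget > threshold).nonzero()
        # First confident split point if any; otherwise force a split at
        # the most probable token in the budget. That argmax compares
        # probabilities that may come from different windows -- exactly
        # where poor cross-window comparability hurts.
        cut = start + (confident[0].item() if len(confident) else int(budget.argmax()))
        chunks.append(tokenizer.decode(ids[start:cut + 1]))
        start = cut + 1
    return chunks
```

When no point inside the budget clears the threshold, the forced argmax is the step that suffers: if one window is systematically over- or under-confident, its tokens win or lose that comparison for calibration reasons rather than content reasons, which matches the symptom described above.
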
## Citation
```bibtex