Update README.md
## Future

This model is undertrained due to lack of money and laziness. I observed it is still undertrained for two reasons:

- I trained it on 110K windows of data for 2–3 epochs and the eval loss kept decreasing.
- The outputs from two non-overlapping windows show poor comparability in their split-point probabilities, which undermines performance when `max_tokens_per_chunk` is large. I think more data will amplify the probability differences and enhance their comparability. This was corroborated by some of my experiments.

So the next version will probably just be more data.
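The comparability issue above can be illustrated with a toy sketch. All probabilities below are made up for illustration; only `max_tokens_per_chunk` comes from the model's actual interface. When a chunk is allowed to span multiple windows, the split point is effectively chosen by comparing raw probabilities across windows, so a systematic scale difference between windows biases the choice:

```python
# Hypothetical split-point probabilities from two non-overlapping windows
# (made-up numbers). In a well-calibrated model they would share a common
# scale; here window_b's scores are systematically lower.
window_a = [0.05, 0.60, 0.10, 0.55]
window_b = [0.02, 0.30, 0.04, 0.28]

# When max_tokens_per_chunk spans both windows, the best split point is
# chosen over the concatenated scores.
scores = window_a + window_b
best = max(range(len(scores)), key=scores.__getitem__)

# The winner (index 1, from window_a) beats window_b's strongest candidate
# (index 5, prob 0.30) simply because window_b's scores sit on a lower
# scale, not because its split points are genuinely worse.
print(best)
```

With more training data the two windows' probability scales should converge, making such cross-window comparisons meaningful.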
## Citation

```bibtex
@article{bert-chunker,