Update README.md
README.md CHANGED

@@ -203,7 +203,7 @@ for i, (c, t) in enumerate(zip(chunks, token_pos)):
     print(c)
 ```
 ## Experimental
-The following script supports specifying max tokens per chunk
+The following script supports specifying a maximum number of tokens per chunk. If `max_tokens_per_chunk` is specified, the text is forced to split at the best available position seen so far whenever a chunk is about to exceed `max_tokens_per_chunk` and no token satisfies `prob_threshold`; if `max_tokens_per_chunk` is `None`, it behaves the same as above. This script can be seen as a new experimental version of the scripts above.
 ```python
 import torch
 from transformers import AutoTokenizer, BertForTokenClassification
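The forced-split rule described in the new paragraph can be sketched as follows. This is a minimal toy helper, assuming the model's per-token split probabilities are already at hand; `chunk_ends` and its signature are illustrative, not the script's actual API:

```python
def chunk_ends(split_probs, prob_threshold, max_tokens_per_chunk=None):
    """Return the end indices of chunks.

    split_probs[i] is the probability that a chunk may end after token i.
    A chunk normally ends at the first token whose probability reaches
    prob_threshold. If max_tokens_per_chunk is set and the chunk is about
    to exceed it with no token passing the threshold, the chunk is forced
    to end at the highest-probability position seen so far.
    """
    ends = []
    start = 0            # first token of the current chunk
    i = start
    best_pos, best_prob = start, -1.0
    while i < len(split_probs):
        p = split_probs[i]
        # Track the best fallback split position within the current chunk.
        if p > best_prob:
            best_pos, best_prob = i, p
        if p >= prob_threshold:
            # Normal split: this token satisfies the threshold.
            ends.append(i)
            start = i + 1
        elif max_tokens_per_chunk is not None and i - start + 1 >= max_tokens_per_chunk:
            # Forced split: no token satisfied prob_threshold, so fall
            # back to the best position seen so far in this chunk.
            ends.append(best_pos)
            start = best_pos + 1
        else:
            i += 1
            continue
        # A split happened: restart scanning from the new chunk start.
        i = start
        best_pos, best_prob = start, -1.0
    if start < len(split_probs):
        # Remaining tokens form the final chunk.
        ends.append(len(split_probs) - 1)
    return ends
```

With `max_tokens_per_chunk=None` only the threshold rule applies, matching the earlier scripts; with a limit set, an over-long chunk is cut at its most probable boundary even when that probability is below `prob_threshold`.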