Update README.md
README.md
CHANGED
@@ -186,8 +186,8 @@ Content type:Research
 Published on: 6 August 2024"
 """
 # Chunk the text. The prob_threshold should be between (0, 1). The lower it is, the more chunks will be generated.
-# Therefore adjust it to your need, when prob_threshold is small like 0.000001,
-# when it is set to 1,
+# Therefore adjust it to your need, when prob_threshold is small like 0.000001, each token is one chunk,
+# when it is set to 1, the whole text will be one chunk.
 chunks, token_pos = chunk_text(model, ad, tokenizer, prob_threshold=0.5)
 
 # print chunks
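
The effect of prob_threshold described in the comment above can be illustrated with a toy sketch. This is not the repository's chunk_text implementation: the helper split_by_threshold, the token list, and the per-token boundary probabilities are all hypothetical, and the sketch simply assumes a new chunk starts wherever a token's boundary probability exceeds the threshold.

```python
from typing import List


def split_by_threshold(
    tokens: List[str],
    boundary_probs: List[float],
    prob_threshold: float,
) -> List[List[str]]:
    """Hypothetical helper: start a new chunk at every token whose
    boundary probability is strictly greater than prob_threshold."""
    if len(tokens) != len(boundary_probs):
        raise ValueError("tokens and boundary_probs must have the same length")

    chunks: List[List[str]] = []
    for token, prob in zip(tokens, boundary_probs):
        # The first token always opens a chunk; later tokens open one
        # only when their boundary probability clears the threshold.
        if not chunks or prob > prob_threshold:
            chunks.append([token])
        else:
            chunks[-1].append(token)
    return chunks


# Made-up tokens and probabilities, for illustration only.
tokens = ["Deep", "learning", "works", ".", "Chunking", "helps", "."]
probs = [0.90, 0.01, 0.02, 0.01, 0.80, 0.03, 0.01]

for threshold in (0.000001, 0.5, 1.0):
    result = split_by_threshold(tokens, probs, threshold)
    print(threshold, len(result), result)
```

Under these assumptions, a very small threshold makes every token its own chunk, while a threshold of 1 leaves the whole text as a single chunk, which matches the behavior the updated comment describes.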