tim1900 committed on
Commit 88c98f1 · verified · 1 Parent(s): 546002b

Update README.md

Files changed (1)
  1. README.md +3 -1
README.md CHANGED
@@ -171,7 +171,9 @@ Python,但估计听说过这门语言的读者很少。
  部分,也拥有自己的全局命名空间。内置名称实际上也在模块里,即
  "builtins" 。
  '''
- # chunk the text. The prob_threshold should be between (0, 1). The lower it is, the more chunks will be generated.
+ # Chunk the text. The prob_threshold should be between (0, 1). The lower it is, the more chunks will be generated.
+ # Therefore adjust it to your need, when prob_threshold is small like 0.000001, one token is one chunk,
+ # when it is set to 1, no chunk will be generated.
  chunks, token_pos = chunk_text(model, doc, tokenizer, prob_threshold=0.5)

  # print chunks
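
The effect of `prob_threshold` described in the new comment can be illustrated with a toy sketch. This is not the repository's `chunk_text` implementation; the helper `chunk_starts` and the probability values below are hypothetical, and the sketch assumes that a chunk starts at every token whose predicted boundary probability is strictly greater than the threshold.

```python
from typing import List, Sequence


def chunk_starts(boundary_probs: Sequence[float], prob_threshold: float) -> List[int]:
    """Return the token indices where a new chunk would start.

    Hypothetical helper: assumes a chunk starts at every token whose
    boundary probability is strictly greater than prob_threshold.
    """
    if not 0.0 < prob_threshold <= 1.0:
        raise ValueError("prob_threshold must be in (0, 1]")
    return [i for i, p in enumerate(boundary_probs) if p > prob_threshold]


# Made-up per-token boundary probabilities, for illustration only.
probs = [0.9, 0.02, 0.6, 0.001, 0.3]

for threshold in (0.000001, 0.5, 1.0):
    starts = chunk_starts(probs, threshold)
    print(f"prob_threshold={threshold}: {len(starts)} chunk start(s) at {starts}")
```

Under this assumption, a tiny threshold marks every token as a chunk start, a mid-range threshold keeps only the confident boundaries, and a threshold of 1 marks none, since a probability can never exceed 1.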