roemmele Cursor committed on
Commit 9d0d6e1 · 1 Parent(s): 59a876a

Upload folder using huggingface_hub


Co-authored-by: Cursor <cursoragent@cursor.com>

rnnlm_model/__pycache__/__init__.cpython-311.pyc DELETED (binary, 576 Bytes)
rnnlm_model/__pycache__/__init__.cpython-312.pyc DELETED (binary, 530 Bytes)
rnnlm_model/__pycache__/__init__.cpython-38.pyc DELETED (binary, 508 Bytes)
rnnlm_model/__pycache__/configuration_rnnlm.cpython-311.pyc DELETED (binary, 2.11 kB)
rnnlm_model/__pycache__/configuration_rnnlm.cpython-312.pyc DELETED (binary, 1.87 kB)
rnnlm_model/__pycache__/configuration_rnnlm.cpython-38.pyc DELETED (binary, 1.46 kB)
rnnlm_model/__pycache__/modeling_rnnlm.cpython-311.pyc DELETED (binary, 17.4 kB)
rnnlm_model/__pycache__/modeling_rnnlm.cpython-312.pyc DELETED (binary, 16.6 kB)
rnnlm_model/__pycache__/modeling_rnnlm.cpython-38.pyc DELETED (binary, 9.25 kB)
rnnlm_model/__pycache__/pipeline_rnnlm.cpython-311.pyc DELETED (binary, 6.17 kB)
rnnlm_model/__pycache__/pipeline_rnnlm.cpython-312.pyc DELETED (binary, 5.38 kB)
rnnlm_model/__pycache__/pipeline_rnnlm.cpython-38.pyc DELETED (binary, 3.38 kB)
rnnlm_model/__pycache__/tokenization_rnnlm.cpython-311.pyc DELETED (binary, 17.4 kB)
rnnlm_model/__pycache__/tokenization_rnnlm.cpython-312.pyc DELETED (binary, 15.3 kB)
rnnlm_model/__pycache__/tokenization_rnnlm.cpython-38.pyc DELETED (binary, 9.78 kB)
rnnlm_model/__pycache__/tokenization_utils.cpython-311.pyc DELETED (binary, 24.6 kB)
rnnlm_model/__pycache__/tokenization_utils.cpython-312.pyc DELETED (binary, 18.1 kB)
rnnlm_model/__pycache__/tokenization_utils.cpython-38.pyc DELETED (binary, 11.8 kB)
rnnlm_model/tokenization_utils.py CHANGED
@@ -351,7 +351,16 @@ def filter_gen_seq(encoder, seq, n_sents=1, eos_tokens=[]):
         else:
             seq = getattr(doc, 'text', getattr(doc, 'string', str(doc)))
     else:
-        seq = " ".join(segment(encoder, seq)[:n_sents])
+        sentences = segment(encoder, seq)
+        n = n_sents
+        seq = ""
+        while n <= len(sentences):
+            seq = " ".join(sentences[:n]).strip()
+            if seq:
+                break
+            n += 1
+        if not seq and sentences:
+            seq = " ".join(sentences).strip()
     if leading_space and seq:
         seq = " " + seq.lstrip()
     return seq
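The change above replaces a straight `" ".join(segment(encoder, seq)[:n_sents])` with a loop that widens the sentence window until the joined result is non-empty, falling back to joining all sentences. The selection logic can be sketched in isolation as below; `take_nonempty_sentences` is a hypothetical helper name (not in the repo), standing in for the new inline code, and it takes an already-segmented list rather than calling `segment(encoder, seq)`:

```python
def take_nonempty_sentences(sentences, n_sents=1):
    """Join the first n_sents sentences; if the result is empty
    (e.g. leading sentences are blank/whitespace), grow the window
    one sentence at a time, then fall back to joining everything."""
    n = n_sents
    seq = ""
    while n <= len(sentences):
        seq = " ".join(sentences[:n]).strip()
        if seq:
            break
        n += 1
    # Fallback mirrors the patch: if every window came up empty,
    # join the full list (still empty if all sentences are blank).
    if not seq and sentences:
        seq = " ".join(sentences).strip()
    return seq


print(take_nonempty_sentences(["Hi.", "Bye."], n_sents=1))   # "Hi."
print(take_nonempty_sentences(["", "Hello."], n_sents=1))    # "Hello."
```

Under the old slicing behavior, the second call would return an empty string, since the first segment is blank; the loop is what guarantees a non-empty sequence whenever any sentence has content.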