SJ-Ray/Re-Punctuate

text2text-generation

SJ-Ray committed on Mar 16, 2022

Commit e80f685 · 1 Parent(s): c215d2a

Update README.md

Files changed (1)

README.md +25 -0

@@ -1,3 +1,28 @@
 ---
 license: apache-2.0
 ---

+<h2>Re-Punctuate:</h2>
+Re-Punctuate is a T5 model that attempts to correct capitalization and punctuation in sentences.
+<h3>DataSet:</h3>
+The DialogSum dataset (115,056 records) was used to fine-tune the model for punctuation and capitalization correction.
+<h3>Usage:</h3>
+<pre>
+from transformers import T5Tokenizer, TFT5ForConditionalGeneration
+
+# Load the tokenizer and TensorFlow model from the Hub (the repo id takes no trailing slash)
+tokenizer = T5Tokenizer.from_pretrained('SJ-Ray/Re-Punctuate')
+model = TFT5ForConditionalGeneration.from_pretrained('SJ-Ray/Re-Punctuate')
+
+input_text = 'the story of this brave brilliant athlete whose very being was questioned so publicly is one that still captures the imagination'
+# Prefix the input with the "punctuate: " task prompt before encoding
+inputs = tokenizer.encode("punctuate: " + input_text, return_tensors="tf")
+result = model.generate(inputs)
+# Decode the generated token ids back into text
+decoded_output = tokenizer.decode(result[0], skip_special_tokens=True)
+print(decoded_output)
+</pre>
+<h4> Example: </h4>
+<b>Input:</b>  the story of this brave brilliant athlete whose very being was questioned so publicly is one that still captures the imagination <br>
+<b>Output:</b> The story of this brave, brilliant athlete, whose very being was questioned so publicly, is one that still captures the imagination.
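
The usage snippet in the README handles one sentence at a time. For longer text, one option is to split the input into short word-bounded chunks and add the `punctuate: ` prefix to each. The following is a minimal sketch under assumptions: the README does not state an input length limit, so the 40-word default is arbitrary, and the helper names `chunk_words`, `build_prompts`, and `repunctuate` are hypothetical and not part of the model card.

```python
from typing import List

PREFIX = "punctuate: "  # task prefix taken from the README's usage snippet


def chunk_words(text: str, max_words: int = 40) -> List[str]:
    """Split text into chunks of at most max_words whitespace-separated words."""
    if max_words < 1:
        raise ValueError("max_words must be at least 1")
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def build_prompts(text: str, max_words: int = 40) -> List[str]:
    """Prefix each chunk so it matches the input format shown in the README."""
    return [PREFIX + chunk for chunk in chunk_words(text, max_words)]


def repunctuate(text, tokenizer, model, max_words=40):
    """Run each chunk through an already-loaded tokenizer and model.

    Assumes tokenizer and model were loaded as in the README usage snippet
    (T5Tokenizer and TFT5ForConditionalGeneration).
    """
    outputs = []
    for prompt in build_prompts(text, max_words):
        inputs = tokenizer.encode(prompt, return_tensors="tf")
        result = model.generate(inputs)
        outputs.append(tokenizer.decode(result[0], skip_special_tokens=True))
    return " ".join(outputs)
```

Splitting on a fixed word count can cut a sentence in half, so the punctuation at chunk boundaries may be unreliable; treat the chunk size as something to tune rather than a recommended value.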