---
license: apache-2.0
language:
- gu
---
# Model description
This model is the mT5-small version of Gujarati Grammarly. Given a Gujarati sentence, it generates a spelling-corrected version of that sentence. Only the checkpoints for this small version are open source.

# Example usage
```python
# Requires the `transformers` and `tensorflow` packages.
from transformers import AutoTokenizer, TFAutoModelForSeq2SeqLM

# Load the tokenizer and the TensorFlow seq2seq model.
model_checkpoint = "Jayveersinh-Raj/guj-grammar-small"
tokenizer = AutoTokenizer.from_pretrained(model_checkpoint)
model = TFAutoModelForSeq2SeqLM.from_pretrained(model_checkpoint)

# Example input sentence to correct.
sent = "સુંદરકાંડના પ્રારંભમાં હનૂમાન બળવાન તો છે પણ સાથે-સાથે બુદ્ધિમાન પણ છે તેની રોચક ધર્મકથા છૈ"
inputs = tokenizer.encode(sent, return_tensors="tf")

# Generate the correction with beam search, then decode it back to text.
output_ids = model.generate(inputs, max_length=128, num_beams=4, early_stopping=True)
output = tokenizer.decode(output_ids[0], skip_special_tokens=True)

print("Generated Correction:")
print(output)
```

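To correct several sentences in one call, the single-sentence example can be wrapped in a small helper. The following is a minimal sketch, not part of the published model: `load_corrector` and `correct_sentences` are hypothetical names, and the padding and truncation settings are assumptions rather than values confirmed by the model author. The `transformers` import is deferred so that the helper itself has no import-time dependency on TensorFlow.

```python
from typing import List, Sequence


def load_corrector(checkpoint: str = "Jayveersinh-Raj/guj-grammar-small"):
    """Load the tokenizer and TensorFlow model (hypothetical helper)."""
    # Imported lazily; requires `transformers` and `tensorflow` to be installed.
    from transformers import AutoTokenizer, TFAutoModelForSeq2SeqLM

    tokenizer = AutoTokenizer.from_pretrained(checkpoint)
    model = TFAutoModelForSeq2SeqLM.from_pretrained(checkpoint)
    return tokenizer, model


def correct_sentences(
    sentences: Sequence[str],
    tokenizer,
    model,
    max_length: int = 128,
    num_beams: int = 4,
) -> List[str]:
    """Return one generated correction per input sentence, in input order."""
    if not sentences:
        return []

    # Pad to the longest sentence in the batch (assumed setting) and
    # truncate anything longer than max_length tokens.
    batch = tokenizer(
        list(sentences),
        return_tensors="tf",
        padding=True,
        truncation=True,
        max_length=max_length,
    )

    # Pass the attention mask so padded positions are ignored.
    output_ids = model.generate(
        batch["input_ids"],
        attention_mask=batch["attention_mask"],
        max_length=max_length,
        num_beams=num_beams,
        early_stopping=True,
    )
    return tokenizer.batch_decode(output_ids, skip_special_tokens=True)
```

After `tokenizer, model = load_corrector()`, calling `correct_sentences([sent], tokenizer, model)` should return a one-element list holding the generated correction. The generation settings mirror the single-sentence example.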
# Notes
- Only Gujarati is supported for now.
- The dataset used is private.
- Only the TensorFlow model is available for now; PyTorch checkpoints should be available soon.
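
Until PyTorch checkpoints are published, one possible workaround is to let `transformers` convert the TensorFlow weights at load time through the `from_tf=True` argument of `from_pretrained`. This is an untested sketch: it assumes a `transformers` 4.x release with both `torch` and `tensorflow` installed, and that the conversion of this particular checkpoint succeeds. `load_as_pytorch` and `correct_with_pytorch` are hypothetical helper names.

```python
def load_as_pytorch(checkpoint: str = "Jayveersinh-Raj/guj-grammar-small"):
    """Load the TensorFlow weights into a PyTorch model (hypothetical helper)."""
    # Imported lazily; assumes `transformers`, `torch`, and `tensorflow` are installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(checkpoint)
    # from_tf=True asks transformers to convert the TensorFlow checkpoint on load.
    model = AutoModelForSeq2SeqLM.from_pretrained(checkpoint, from_tf=True)
    return tokenizer, model


def correct_with_pytorch(
    sentence: str,
    tokenizer,
    model,
    max_length: int = 128,
    num_beams: int = 4,
) -> str:
    """Generate a correction for one sentence using the PyTorch model."""
    # Return PyTorch tensors instead of TensorFlow tensors.
    inputs = tokenizer(sentence, return_tensors="pt")
    output_ids = model.generate(
        **inputs,
        max_length=max_length,
        num_beams=num_beams,
        early_stopping=True,
    )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

If the conversion works, `model.save_pretrained("some-local-dir")` would write PyTorch weights to a local directory (the directory name here is a placeholder) so that later loads no longer need TensorFlow.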