# Cheapity3 🐷

A GPT-3-like T5 model trained to generate text in multiple languages.

## Motivation

- GPT models are expensive to run.
- GPT models are monolingual.

## Solution

- Maybe Small Models aren't Terrible (*SMarT*).
- Plus, they are cheaper to run.

I fine-tuned T5 on multiple languages (🇬🇧 English, 🇩🇪 German, 🇫🇷 French) and on academic text snippets from domains like tech, law, finance, and science to generate text, just like GPT models do.

## Usage

- Provide some text, e.g. `"Italy, officially the Italian Republic is a country consisting of"`.
- Tell Cheapity3 how many words you want to generate, e.g. `15` -- yes, you can control the length.
- Cheapity3 reads your text and generates a continuation containing approximately 15 words.

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("flexudy/cheapity3")

# AutoModelWithLMHead is deprecated in transformers; AutoModelForSeq2SeqLM is the
# current class for encoder-decoder models such as T5.
model = AutoModelForSeq2SeqLM.from_pretrained("flexudy/cheapity3")

# The target length is encoded in the prompt itself: a "guess:" prefix plus one
# "_" placeholder per word to generate.
input_text = "guess: Italy, officially the Italian Republic is a country consisting of { _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ }"  # 15 words

# Call the tokenizer directly (tokenizer.encode returns only a tensor of ids) to
# get a dict with both input_ids and attention_mask.
inputs = tokenizer(input_text, return_tensors="pt", truncation=True, max_length=512)

input_ids = inputs["input_ids"]
attention_mask = inputs["attention_mask"]

outputs = model.generate(
    input_ids=input_ids,
    attention_mask=attention_mask,
    max_length=128,
    do_sample=True,
    num_return_sequences=4,
    repetition_penalty=2.5,
)

# Print the four sampled continuations:
for output in outputs:
    print(tokenizer.decode(output, skip_special_tokens=True, clean_up_tokenization_spaces=True))

# >
# >
# >
# >
```
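
Since the length control lives entirely in the prompt, it is easy to wrap in a small helper. The sketch below is my own illustration, not part of the model card or its API: the hypothetical `make_prompt` simply reproduces the `guess: ... { _ _ ... }` format used in the example above.

```python
def make_prompt(text: str, num_words: int) -> str:
    # Hypothetical helper (not part of the model's API): builds the prompt
    # format shown above -- a "guess:" prefix and one "_" placeholder per
    # word Cheapity3 should generate.
    placeholders = " ".join(["_"] * num_words)
    return f"guess: {text} {{ {placeholders} }}"


# make_prompt("Italy, officially the Italian Republic is a country consisting of", 15)
# reproduces the input_text from the snippet above.
```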