Sentdex/GPyT

Text Generation

text-generation-inference


Sentdex committed on Jul 16, 2021

Commit 35b3547 · 1 Parent(s): a94558c

Update README.md

Files changed (1)

README.md +24 -10


@@ -12,27 +12,41 @@ GPyT is a GPT2 model trained from scratch (not fine tuned) on Python code from G
 Newlines are replaced by `<N>`
 Input to the model is code, up to the context length of 1024, with newlines replaced by `<N>`
-Here's an example of a quick converter to take your multi-line code and replace the newlines:
 ```py
-inp = """def do_something():
-    print("Hello")
-"""
 newlinechar = "<N>"
 converted = inp.replace("\n", newlinechar)
-print("length:", len(converted))
-print(converted)
 ```
-This should give you something like:
-`def do_something():<N>    print("Hello")<N>`
-...which is what the model is expecting as input.
 Considerations:

 Newlines are replaced by `<N>`
 Input to the model is code, up to the context length of 1024, with newlines replaced by `<N>`
+Here's a quick example of using this model:
 ```py
+from transformers import AutoTokenizer, AutoModelWithLMHead
+tokenizer = AutoTokenizer.from_pretrained("Sentdex/GPyT")
+model = AutoModelWithLMHead.from_pretrained("Sentdex/GPyT")
+# copy and paste some code in here
+inp = """import"""
 newlinechar = "<N>"
 converted = inp.replace("\n", newlinechar)
+tokenized = tokenizer.encode(converted, return_tensors='pt').to("cuda")
+resp = model.generate(tokenized).to("cuda")
+decoded = tokenizer.decode(resp[0])
+reformatted = decoded.replace("<N>","\n")
+print(reformatted)
+```
+Should produce:
+```
+import numpy as np
+import pytest
+import pandas as pd<N
 ```
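
Editorial note on the example added in this commit: as written, it moves the tokenized input to `"cuda"` but leaves the model on the CPU, which would likely raise a device mismatch error. The sketch below is one possible way to arrange the same steps, not the author's code. The helper names (`to_model_input`, `from_model_output`, `complete`) and the `max_length=100` default are hypothetical, and `AutoModelForCausalLM` is swapped in for `AutoModelWithLMHead`, which newer `transformers` releases deprecate.

```python
NEWLINE_TOKEN = "<N>"  # newline placeholder described in the README


def to_model_input(code: str) -> str:
    """Replace real newlines with the <N> placeholder the model expects."""
    return code.replace("\n", NEWLINE_TOKEN)


def from_model_output(text: str) -> str:
    """Turn <N> placeholders in generated text back into newlines."""
    return text.replace(NEWLINE_TOKEN, "\n")


def complete(code: str, max_length: int = 100) -> str:
    """Hypothetical wrapper around the README steps (assumed defaults)."""
    # Imported lazily so the two helpers above work without these packages.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Fall back to the CPU when no GPU is available.
    device = "cuda" if torch.cuda.is_available() else "cpu"

    tokenizer = AutoTokenizer.from_pretrained("Sentdex/GPyT")
    # Model and input tensors must live on the same device.
    model = AutoModelForCausalLM.from_pretrained("Sentdex/GPyT").to(device)

    tokenized = tokenizer.encode(to_model_input(code), return_tensors="pt").to(device)
    resp = model.generate(tokenized, max_length=max_length)
    return from_model_output(tokenizer.decode(resp[0]))
```

Calling `complete("import")` downloads the model on first use. When generation stops partway through a placeholder, a fragment such as the trailing `<N` in the sample output above is left as-is, because only complete `<N>` tokens are converted back to newlines.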
 Considerations: