Update README.md
README.md
CHANGED
@@ -4,12 +4,15 @@ This model takes in a word as an input and splits it into syllables. I did this
 ## Calling the Model
 ```python
 from transformers import AutoTokenizer, T5ForConditionalGeneration
+
 model = T5ForConditionalGeneration.from_pretrained('imjeffhi/syllabizer')
 tokenizer = AutoTokenizer.from_pretrained('imjeffhi/syllabizer')
+
 def generate_output(word):
     tokens = tokenizer(word, return_tensors='pt')
     output = model.generate(**tokens, do_sample=False, max_length=30, early_stopping=True)[0]
     return tokenizer.decode(output, skip_special_tokens=True)
+
 syllables = generate_output('syllabizer')
 ```
 The model returns syllables in spaced format. See output below.
@@ -20,10 +23,13 @@ syl la biz er
 You can easily syllabize an entire sentence/paragraph and/or convert the output into a list of syllables with the following code:
 ```python
 from transformers import pipeline
+
 syllabizer_pipe = pipeline('text2text-generation', model = 'imjeffhi/syllabizer', tokenizer='imjeffhi/syllabizer')
+
 sentence = "A unit of spoken language consisting of a single uninterrupted sound formed by a vowel, diphthong, or syllabic consonant alone, or by any of these sounds preceded, followed, or surrounded by one or more consonants."
 words = sentence.split(" ")
 output = syllabizer_pipe(words, batch_size=len(words),do_sample=False, max_length=30, early_stopping=True)
+
 [{words[i]: gen_text['generated_text'].split(" ")} for i, gen_text in enumerate(output)]
 ```
 
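
The README text in this diff says the model returns syllables in spaced format (for example `syl la biz er`) and that the output can be converted into a list of syllables. A minimal sketch of that post-processing step, runnable without downloading the model, might look like the following. The helper names `clean_word`, `to_syllable_list`, and `pair_words_with_syllables` are hypothetical and not part of the repository. The punctuation stripping is an added assumption: `sentence.split(" ")` leaves tokens such as `vowel,` with punctuation attached, and it is not stated how the model handles those.

```python
import string
from typing import Dict, List


def clean_word(token: str) -> str:
    """Strip leading/trailing punctuation from a whitespace-split token.

    Assumed preprocessing; the README passes the raw tokens to the model.
    """
    return token.strip(string.punctuation)


def to_syllable_list(spaced: str) -> List[str]:
    """Split the model's space-separated output into a list of syllables."""
    return spaced.split()


def pair_words_with_syllables(words: List[str], generated: List[str]) -> List[Dict[str, List[str]]]:
    """Pair each input word with its syllable list, mirroring the README's list comprehension."""
    return [{word: to_syllable_list(text)} for word, text in zip(words, generated)]


# 'syl la biz er' is the sample output quoted in the diff's second hunk header.
print(to_syllable_list("syl la biz er"))  # ['syl', 'la', 'biz', 'er']
print(clean_word("vowel,"))               # vowel
```

With the pipeline from the README, the `generated` argument would presumably be built as `[o['generated_text'] for o in output]`, and `words` could be passed through `clean_word` before calling the pipeline if punctuation turns out to affect the results.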