Update README.md
First, you'll have to define this function, since the text preprocessing is custom and the standard `pipeline` approach won't suffice:
```python
def extract_keywords(
    text: str,
    model_id="auhide/keybert-bg",
    max_len: int = 300,
    ...
```
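The rest of the definition is elided above, but the `max_len` parameter suggests the custom preprocessing splits long inputs into chunks the model can handle. A minimal sketch of that idea, assuming word-level chunking (`chunk_words` is a hypothetical helper for illustration, not part of the README's code):

```python
def chunk_words(text: str, max_len: int = 300):
    # Hypothetical illustration: split text into chunks of at most
    # max_len whitespace-separated words each.
    words = text.split()
    return [" ".join(words[i:i + max_len]) for i in range(0, len(words), max_len)]


# 650 words with max_len=300 yields chunks of 300, 300, and 50 words.
chunks = chunk_words("дума " * 650, max_len=300)
print(len(chunks))  # 3
```

The actual preprocessing likely works at the tokenizer level rather than on raw words, which is exactly why the standard `pipeline` call isn't enough here.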

Choose a text and use the model on it. For example, I've chosen to use [this](https://novini.bg/biznes/biznes_tehnologii/781108) article.
Then, you can call `extract_keywords` on it and extract its keywords:
```python
from pprint import pprint

# Reading the text from a file, since it is an article and the text is large.
with open("input_text.txt", "r", encoding="utf-8") as f:
    text = f.read()

# You can change the threshold based on your needs.
keywords = extract_keywords(text, threshold=0.5)
print("Keywords:")
pprint(keywords)
```
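The `threshold` argument controls how confident the model must be before a token counts as a keyword. The selection step can be sketched as follows (an illustration only — `filter_by_threshold` and the example scores are hypothetical, not part of the model's API):

```python
def filter_by_threshold(token_scores, threshold=0.5):
    # Keep tokens whose keyword probability meets the threshold.
    # token_scores: list of (token, probability) pairs, e.g. the
    # per-token outputs of a token-classification model.
    return [tok for tok, p in token_scores if p >= threshold]


# Hypothetical per-token scores, for illustration only.
scores = [("киберсигурност", 0.91), ("и", 0.03), ("технологии", 0.78), ("на", 0.02)]
print(filter_by_threshold(scores, threshold=0.5))   # keeps both content words
print(filter_by_threshold(scores, threshold=0.85))  # keeps only the most confident one
```

Raising the threshold trades recall for precision: fewer keywords come back, but each one is a higher-confidence prediction.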