Update README.md, link demo in Spaces
--- a/README.md
+++ b/README.md
@@ -27,6 +27,8 @@ Inspired by [KeyBERT](https://github.com/MaartenGr/KeyBERT), KeyBERTVi implement
 This implementation took inspiration from the simple yet intuitive and powerful method of [KeyBERT](https://github.com/MaartenGr/KeyBERT/), applied for the Vietnamese language. PhoBERT are used to generate both document-level embeddings and word-level embeddings for extracted N-grams. Cosine similarity is then used to compute which N-grams are most similar to the document-level embedding, thus can be perceived as most representative of the document.
 Preprocessing catered to the Vietnamese language was applied.
 
+Test with your own documents at [KeyBERTVi Space](https://huggingface.co/spaces/tpha4308/keybertvi-app).
+
 <a name="gettingstarted"/></a>
 ## 2. Getting Started
 <a name="installation"/></a>
@@ -48,6 +50,8 @@ You can use existing pre-trained models in the repo or download your own and put
 torch.save(ner_model, f'{dir_path}/pretrained-models/ner-vietnamese-electra-base.pt')
 ```
 
+**Note:** `dir_path` is the absolute path to the repo.
+
 As [PhoBERT](https://huggingface.co/vinai/phobert-base) requires [VnCoreNLP](https://github.com/vncorenlp/VnCoreNLP) as part of pre-processing, the folder `pretrained-models/vncorenlp` is required. To download your own:
 ```bash
 pip install py_vncorenlp
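
For context, the README paragraph in the first hunk describes a ranking step: embed the document and each candidate N-gram, then rank the N-grams by cosine similarity to the document embedding. A minimal sketch of that step follows. It is an illustration under stated assumptions, not the repository's implementation: the `rank_keyphrases` helper is hypothetical, and the toy 3-dimensional vectors stand in for the PhoBERT embeddings the README mentions.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length, non-zero vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def rank_keyphrases(
    doc_embedding: Sequence[float],
    ngram_embeddings: Dict[str, Sequence[float]],
    top_n: int = 2,
) -> List[Tuple[str, float]]:
    """Hypothetical helper: return the top_n N-grams closest to the document."""
    scored = [
        (ngram, cosine_similarity(doc_embedding, vector))
        for ngram, vector in ngram_embeddings.items()
    ]
    # Highest similarity first: these N-grams are treated as most representative.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]


# Toy vectors (assumed values, standing in for real model embeddings).
doc_vec = [0.9, 0.1, 0.3]
candidates = {
    "candidate_a": [0.8, 0.2, 0.4],
    "candidate_b": [0.1, 0.9, 0.0],
    "candidate_c": [0.7, 0.0, 0.2],
}

top = rank_keyphrases(doc_vec, candidates, top_n=2)
for ngram, score in top:
    print(f"{ngram}: {score:.4f}")
```

In a real pipeline the vectors would come from the embedding model rather than being written by hand; the ranking logic itself does not depend on where they come from.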