Update README.md
---
tags:
- albert-persian
- persian-lm
license: apache-2.0
---

# ALBERT-Persian

A Lite BERT for Self-supervised Learning of Language Representations for the Persian Language

> You can call it Little BERT (برت_کوچولو).

## Introduction

ALBERT-Persian was trained on a massive amount of public corpora ([Persian Wikidumps](https://dumps.wikimedia.org/fawiki/), [MirasText](https://github.com/miras-tech/MirasText)) and six other text corpora manually crawled from various types of websites ([BigBang Page](https://bigbangpage.com/) `scientific`, [Chetor](https://www.chetor.com/) `lifestyle`, [Eligasht](https://www.eligasht.com/Blog/) `itinerary`, [Digikala](https://www.digikala.com/mag/) `digital magazine`, [Ted Talks](https://www.ted.com/talks) `general conversational`, Books `novels, storybooks, and short stories from the old to the contemporary era`).

Please follow the [ALBERT-Persian](https://github.com/m3hrdadfi/albert-persian) repo for the latest information about previous and current models.

## Intended uses & limitations

…fine-tuned versions on a task that interests you.

### How to use

- To use any ALBERT model, you have to install the `sentencepiece` package.
- In a notebook, run ``` !pip install -q sentencepiece ```

#### TensorFlow 2.0
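A minimal loading sketch with `transformers`, assuming the checkpoint name `m3hrdadfi/albert-fa-base-v2` as a stand-in borrowed from the ALBERT-Persian repo's naming (substitute the model ID shown on this card):

```python
# Sketch: load ALBERT-Persian with TensorFlow 2.0 via Hugging Face transformers.
# The checkpoint name "m3hrdadfi/albert-fa-base-v2" is an assumption taken from
# the ALBERT-Persian repo; replace it with this card's actual model ID.
# The ALBERT tokenizer requires the sentencepiece package (see above).
from transformers import AutoTokenizer, TFAutoModel

model_name = "m3hrdadfi/albert-fa-base-v2"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = TFAutoModel.from_pretrained(model_name)

# Tokenize a short Persian sentence into SentencePiece subword tokens.
tokens = tokenizer.tokenize("ما مدلی برای زبان فارسی آموزش دادیم.")
print(tokens)
```

The same pattern works for downstream checkpoints: point `model_name` at any fine-tuned variant and swap `TFAutoModel` for the matching task head class.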