Update README colab notebook
README.md
CHANGED
@@ -7,14 +7,16 @@ tags:
 - 文言文
 - ancient
 - classical
+- letter
+- 书信标题
 license: cc-by-nc-sa-4.0
 ---
 
 # BertForSequenceClassification model (Classical Chinese)
+[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1jVu2LrNwkLolItPALKGNjeT6iCfzF8Ic?usp=sharing/)
 
 This BertForSequenceClassification Classical Chinese model is intended to predict whether a Classical Chinese sentence is a letter title (书信标题) or not. The model is inherited from the BERT base Chinese model (MLM), fine-tuned on a large corpus of Classical Chinese (a 3 GB textual dataset), and then combined with the BertForSequenceClassification architecture to perform a binary classification task.
-#### Labels: 0 = non-letter, 1 = letter
+* Labels: 0 = non-letter, 1 = letter
 
 ## Model description
@@ -35,12 +37,13 @@ Note that this model is primarily aimed at predicting whether a Classical Chines
 
 Here is how to use this model to get the features of a given text in PyTorch:
 
-1. Import model
+1. Import model and packages
 ```python
 from transformers import BertTokenizer
 from transformers import BertForSequenceClassification
 import torch
 from numpy import exp
+import numpy as np
 
 tokenizer = BertTokenizer.from_pretrained('bert-base-chinese')
 model = BertForSequenceClassification.from_pretrained('cbdb/ClassicalChineseLetterClassification',
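The usage snippet in the diff is cut off mid-call, but the `from numpy import exp` import suggests the README goes on to convert the model's raw logits into class probabilities with a softmax. Below is a minimal, dependency-free sketch of that post-processing step; the `softmax` helper and the example logit values are illustrative assumptions, not taken from the README (it uses `math.exp` instead of `numpy.exp` so the sketch runs without NumPy):

```python
from math import exp  # the README imports numpy's exp; math.exp keeps this sketch dependency-free

def softmax(logits):
    """Numerically stable softmax over a list of raw logits (illustrative helper)."""
    m = max(logits)
    exps = [exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical raw logits from the classification head, ordered [non-letter, letter]
logits = [-1.2, 2.3]
probs = softmax(logits)
pred = max(range(len(probs)), key=lambda i: probs[i])  # 0 = non-letter, 1 = letter
```

In a full pipeline, `logits` would come from `model(**tokenizer(sentence, return_tensors='pt')).logits` for the model loaded above, and `pred` maps onto the labels stated in the README (0 = non-letter, 1 = letter).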