Update README.md
## Pre-process Text
To use SEC-BERT-SHAPE, you have to pre-process texts, replacing every numerical token with the corresponding shape pseudo-token from a list of 214 predefined shape pseudo-tokens. If a numerical token does not correspond to any shape pseudo-token, we replace it with the [NUM] pseudo-token.

Below is an example of how you can pre-process a simple sentence. This approach is deliberately simple; feel free to modify it as you see fit.
```python
# …
```
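The repository's own pre-processing code is elided in this view. As a rough sketch of the shape-conversion idea described above (the regex, the shape format, and the `KNOWN_SHAPES` subset are illustrative assumptions, not the model's actual 214-shape vocabulary):

```python
import re

# Illustrative subset of shape pseudo-tokens; SEC-BERT-SHAPE defines 214 of them.
KNOWN_SHAPES = {"[X]", "[XX]", "[XXX]", "[X.X]", "[XX.X]", "[XXX.X]",
                "[X,XXX]", "[XX,XXX]", "[XXX,XXX]"}

def convert_to_shape(token: str) -> str:
    """Map a numeric token to its shape pseudo-token, e.g. '53.2' -> '[XX.X]'."""
    # Match plain numerals, with optional thousands separators and decimal part.
    if re.fullmatch(r"\d{1,3}(,\d{3})*(\.\d+)?|\d+(\.\d+)?", token):
        shape = "[" + re.sub(r"\d", "X", token) + "]"
        # Shapes outside the predefined vocabulary fall back to [NUM].
        return shape if shape in KNOWN_SHAPES else "[NUM]"
    return token

tokens = "Total revenue grew 7.5 % to 1,280 million".split()
print([convert_to_shape(t) for t in tokens])
# ['Total', 'revenue', 'grew', '[X.X]', '%', 'to', '[X,XXX]', 'million']
```

A real pipeline would apply this after tokenization, so each numeric token is replaced before being fed to the SEC-BERT-SHAPE tokenizer.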
## Using SEC-BERT variants as Language Models
| Sample | Masked Token |
| --------------------------------------------------- | ------------ |
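The samples in the table above probe the models with a masked token. As a minimal sketch of how such a sample is built (the sentence below is illustrative, not taken from the paper; running the model itself downloads the weights from the Hugging Face Hub, so that part is shown commented out):

```python
def mask_word(sentence: str, word: str, mask_token: str = "[MASK]") -> str:
    """Replace the first occurrence of `word` with the BERT mask token."""
    return sentence.replace(word, mask_token, 1)

masked = mask_word("Total net sales decreased 2% during 2019.", "decreased")
print(masked)  # Total net sales [MASK] 2% during 2019.

# With the transformers library installed, the masked token can then be
# predicted (this downloads the model weights on first use):
#   from transformers import pipeline
#   fill_mask = pipeline("fill-mask", model="nlpaueb/sec-bert-base")
#   print(fill_mask(masked, top_k=3))
```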
The model has been officially released with the following article:<br>
Lefteris Loukas, Manos Fergadiotis, Ilias Chalkidis, Eirini Spyropoulou, Prodromos Malakasiotis, Ion Androutsopoulos and George Paliouras.<br>
In the Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (ACL 2022) (Long Papers), Dublin, Republic of Ireland, May 22 - 27, 2022.
```
@inproceedings{loukas-etal-2022-finer,
    title = "{FiNER: Financial Numeric Entity Recognition for XBRL Tagging}",
    author = "Loukas, Lefteris and
      Fergadiotis, Manos and
      Chalkidis, Ilias and
      Spyropoulou, Eirini and
      Malakasiotis, Prodromos and
      Androutsopoulos, Ion and
      Paliouras, George",
    booktitle = "Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics",
    month = may,
    year = "2022",
    publisher = "Association for Computational Linguistics",
}
```
## About Us
[AUEB's Natural Language Processing Group](http://nlp.cs.aueb.gr) develops algorithms, models, and systems that allow computers to process and generate natural language texts.