nlpaueb committed on
Commit 20ecec1 · 1 Parent(s): 86078dc

Update README.md

Files changed (1)
  1. README.md +2 -0
README.md CHANGED
@@ -47,6 +47,8 @@ tokenizer = AutoTokenizer.from_pretrained("nlpaueb/sec-bert-base")
  model = AutoModel.from_pretrained("nlpaueb/sec-bert-base")
  ```
 
+ ## Pre-process Text
+
  In order to use SEC-BERT-NUM, you have to pre-process texts replacing every numerical token with a corresponding shape pseudo token from a list of 214 predefined shape pseudo tokens. If the numerical token does not correspond to any shape pseudo token we replace it with the [NUM] pseudo-token.
  Below there is an example how you can pre-process a simple sentence. This approach is quite simple, feel free to modify it as you see fit.
 
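
The pre-processing rule described in the README text above (map each numerical token to a shape pseudo token, and fall back to `[NUM]` when no shape matches) can be sketched as follows. This is a minimal illustration, not the authors' implementation. It assumes that a shape is formed by replacing each digit with `X` and wrapping the result in brackets, that tokens are separated by whitespace, and that `KNOWN_SHAPES` is a tiny hypothetical stand-in for the real list of 214 predefined shapes, which is not reproduced here. The names `KNOWN_SHAPES`, `to_shape`, and `preprocess` are hypothetical.

```python
import re

# Hypothetical stand-in for the 214 predefined shape pseudo tokens;
# the real list is not reproduced here.
KNOWN_SHAPES = {"[X]", "[XX]", "[XXXX]", "[X.X]", "[XX.X]"}

# Assumed definition of a numerical token: digits, optionally mixed with
# commas and periods (e.g. "2016", "5.4", "123,456.78", ".5").
NUMERIC = re.compile(r"\d+[\d,.]*|[,.]\d+")


def to_shape(token):
    """Build a shape by replacing every digit with 'X' (assumed format)."""
    return "[" + re.sub(r"\d", "X", token) + "]"


def preprocess(text):
    """Replace numerical tokens with a known shape, or [NUM] as a fallback."""
    processed = []
    for token in text.split():  # whitespace tokenization, for simplicity
        if NUMERIC.fullmatch(token):
            shape = to_shape(token)
            processed.append(shape if shape in KNOWN_SHAPES else "[NUM]")
        else:
            processed.append(token)
    return " ".join(processed)


sentence = "Net sales were 5.4 billion in 2016 , up from 123,456.78"
print(preprocess(sentence))
```

Because the sketch splits on whitespace, tokens such as `2%` or `$5.4` are left untouched. A real pipeline would likely use a proper tokenizer to separate numbers from adjacent symbols before applying the replacement.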