nlpaueb committed on
Commit 1e5d4e8 · 1 Parent(s): df19cbe

Update README.md

Files changed (1)
  1. README.md +2 -2
README.md CHANGED
@@ -11,7 +11,7 @@ widget:
  - text: "Total net sales decreased [X]% or $[MASK] billion during [XXXX] compared to [XXXX]."
  - text: "Total net sales decreased [X]% or $[X.X] billion during [MASK] compared to [XXXX]."
  - text: "During [MASK], the Company repurchased $[XX.X] billion of its common stock and paid dividend equivalents of $[XX.X] billion."
- - text: "During 2019, the Company repurchased $[MASK] billion of its common stock and paid [MASK] equivalents of $[XX.X] billion."
  ---
 
  # SEC-BERT
@@ -48,7 +48,7 @@ model = AutoModel.from_pretrained("nlpaueb/sec-bert-base")
 
  ## Pre-process Text
 
- To use SEC-BERT-SHAPE, you have to pre-process texts replacing every numerical token with the corresponding shape pseudo-token from a list of 214 predefined shape pseudo-tokens. If the numerical token does not correspond to any shape pseudo token we replace it with the [NUM] pseudo-token.
  Below there is an example of how you can pre-process a simple sentence. This approach is quite simple; feel free to modify it as you see fit.
 
  ```python
 
  - text: "Total net sales decreased [X]% or $[MASK] billion during [XXXX] compared to [XXXX]."
  - text: "Total net sales decreased [X]% or $[X.X] billion during [MASK] compared to [XXXX]."
  - text: "During [MASK], the Company repurchased $[XX.X] billion of its common stock and paid dividend equivalents of $[XX.X] billion."
+ - text: "During 2019, the Company repurchased $[MASK] billion of its common stock and paid dividend equivalents of $[XX.X] billion."
  ---
 
  # SEC-BERT
 
 
  ## Pre-process Text
 
+ To use SEC-BERT-SHAPE, you have to pre-process texts replacing every numerical token with the corresponding shape pseudo-token, from a list of 214 predefined shape pseudo-tokens. If the numerical token does not correspond to any shape pseudo token we replace it with the [NUM] pseudo-token.
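As a rough illustration of the replacement rule described above, the sketch below maps each number to its shape (every digit becomes `X`) and falls back to `[NUM]` when the shape is not in the known list. It is not the README's own snippet: the regex-based number matching, the small `KNOWN_SHAPES` subset, and the helper names `to_shape` and `preprocess` are assumptions made for this example, and the real model uses a fixed list of 214 shapes.

```python
import re

# Hypothetical subset of the shape vocabulary, for illustration only.
# The actual model defines 214 shape pseudo-tokens.
KNOWN_SHAPES = {"[X]", "[XX]", "[XXXX]", "[X.X]", "[XX.X]"}

# Digits, optionally followed by groups such as ".5" or ",000".
NUMBER_RE = re.compile(r"\d+(?:[.,]\d+)*")


def to_shape(number: str) -> str:
    """Replace every digit with 'X'; use [NUM] if the shape is unknown."""
    shape = "[" + re.sub(r"\d", "X", number) + "]"
    return shape if shape in KNOWN_SHAPES else "[NUM]"


def preprocess(text: str) -> str:
    """Replace each number in the text with its shape pseudo-token."""
    return NUMBER_RE.sub(lambda m: to_shape(m.group(0)), text)


sentence = "During 2019, the Company repurchased $67.1 billion of its common stock."
print(preprocess(sentence))
```

A regex keeps the sketch dependency-free; a tokenizer-based approach would handle edge cases such as dates or identifiers more carefully.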
  Below there is an example of how you can pre-process a simple sentence. This approach is quite simple; feel free to modify it as you see fit.
 
  ```python