Update README.md
README.md
CHANGED
@@ -32,7 +32,7 @@ Introduced in:

 ## Model Description

-TechTokenBERT extends the vocabulary of BERT4Patent with one dedicated token per IPC code at group level (
+TechTokenBERT extends the vocabulary of BERT4Patent with one dedicated token per IPC code at group level (~8000 codes total). Fine-tuning uses masked-language-modelling on sequences of the form:

 ```
 [CLS] patent title [SEP] patent abstract [SEP] [TT_1] [TT_2] ... [TT_N] [SEP]