Update README.md
README.md
CHANGED
@@ -15,7 +15,8 @@ datasets:
 - custom
 ---
 
-# PIDIT:
+# PIDIT: Political Ideology Detection in Italian Texts
+A Multi-Task BERT + ALBERTO Model for Gender and Ideology Prediction 🇮🇹
 
 This `tf.keras` model combines two pre-trained encoders — `BERT` and `ALBERTO` — to perform multi-task classification on Italian-language texts.
 It is designed to predict:
@@ -72,7 +73,7 @@ bert_tokenizer = AutoTokenizer.from_pretrained("leeeov4/PIDIT/bert_tokenizer")
 alberto_tokenizer = AutoTokenizer.from_pretrained("leeeov4/PIDIT/alberto_tokenizer")
 ```
 
-##
+## Preprocessing Example
 
 ```python
 def preprocess_text(text, max_length=250):
@@ -91,7 +92,7 @@ def preprocess_text(text, max_length=250):
 ```
 
 
-##
+## Inference
 
 ```python
 text = "Questo è un esempio di testo italiano per testare il modello."