| | --- |
| | license: apache-2.0 |
| | language: |
| | - it |
| | metrics: |
| | - accuracy |
| | base_model: |
| | - DeepMount00/ModernBERT-base-ita |
| | pipeline_tag: text-classification |
| | --- |
| | |
| | # Text Quality Classifier (Binary) |
| |
|
| | This model aims to classify the general quality and educational content of a given text. It outputs one of two labels: `LABEL_0`, meaning **bad quality**, and `LABEL_1`, meaning **good quality**. |
| | It can be used to efficiently filter large quantities of raw text by quality, which is useful for building Italian pretraining datasets. |
| | The model tends to label as "good quality" Wikipedia-like texts: educational, well-structured, and clearly explained content. |
| |
|
| | ## How to get access |
| | This is a private model. To request access, tell us how you plan to use it by writing to <a href="mailto:redix.ai@redix.com">redix.ai@redix.com</a>. |
| |
|
| |
|
| | ## Eval |
| |
|
| | During evaluation, the model achieved the following metrics: |
| |
|
| | * **Eval Loss:** 0.3422 |
| | * **Accuracy:** 0.8607 |
| | * **F1-Score:** 0.8597 |
| |
|
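For readers who want to reproduce comparable numbers on their own labelled data, here is a minimal sketch of how accuracy and F1 can be computed from gold labels and predictions. The card does not say how F1 was averaged, so this sketch assumes binary F1 on the positive class (`LABEL_1`, encoded as `1`); the function names are hypothetical and the values it produces may not match the reported ones.

```python
from typing import Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of predictions that match the gold labels."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equally long")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def binary_f1(y_true: Sequence[int], y_pred: Sequence[int], positive: int = 1) -> float:
    """F1 of the positive class (assumption: 1 stands for LABEL_1, good quality)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```

Here `y_true` holds the gold labels and `y_pred` the predicted labels, both as integers `0` and `1`.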
| | ## How to use |
| |
|
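| | ```python |
| | from transformers import pipeline |
| | |
| | # The repository is private: authenticate with an account that has been granted access before loading it. |
| | MODEL = "ReDiX/text-quality-classifier-ita" |
| | pipe = pipeline("text-classification", model=MODEL, tokenizer=MODEL) |
| | |
| | # Classify one Italian text; the result is a list with one dict holding the predicted label and its score. |
| | example_text = "Questo è un testo di esempio in italiano per la classificazione." |
| | result = pipe(example_text) |
| | print(f"TEXT: '{example_text}'") |
| | print(f"RESULT: {result}")  # LABEL_0 = bad quality, LABEL_1 = good quality |
| | ``` |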
| |
|
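Since the model is meant for filtering large amounts of raw text, the following is a minimal sketch of a batch filter built around a classifier callable. The function name `filter_good_quality`, the `min_score` threshold, and the `batch_size` default are hypothetical choices made for illustration, not part of the model's API. The sketch assumes the classifier returns one dict per input text with `label` and `score` keys, which is the shape a `text-classification` pipeline produces for a list of strings.

```python
from typing import Any, Callable, Dict, Iterable, List

GOOD_LABEL = "LABEL_1"  # "good quality", as described in this model card


def filter_good_quality(
    texts: Iterable[str],
    classify: Callable[[List[str]], List[Dict[str, Any]]],
    min_score: float = 0.5,  # hypothetical threshold, tune it on your own data
    batch_size: int = 32,    # hypothetical batch size
) -> List[str]:
    """Return the texts predicted as GOOD_LABEL with a score of at least min_score."""
    kept: List[str] = []
    batch: List[str] = []

    def flush() -> None:
        # Classify the pending batch and keep the texts that pass the filter.
        if not batch:
            return
        predictions = classify(list(batch))
        for text, pred in zip(batch, predictions):
            if pred["label"] == GOOD_LABEL and pred["score"] >= min_score:
                kept.append(text)
        batch.clear()

    for text in texts:
        batch.append(text)
        if len(batch) == batch_size:
            flush()
    flush()  # handle the last, possibly partial, batch
    return kept
```

With the pipeline from the example above, a call could look like `filter_good_quality(corpus, lambda batch: pipe(batch, truncation=True))`, where `corpus` is any iterable of strings. Passing the classifier as a callable keeps the filter independent of how the model is loaded and makes it easy to try with a stub.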
| | ## Eval results |
| |
|
| |  |
| |
|