---
license: apache-2.0
datasets:
- ProfessorLeVesseur/EnglishTense
language:
- en
metrics:
- accuracy
- precision
- recall
- f1
base_model:
- google-bert/bert-base-uncased
pipeline_tag: text-classification
tags:
- time
- timeframe
- tense
- past
- current
- future
evaluation_results:
  accuracy: 1.00
  macro_precision: 1.00
  macro_recall: 1.00
  macro_f1: 1.00
  weighted_precision: 1.00
  weighted_recall: 1.00
  weighted_f1: 1.00
  class_precision:
    Future: 1.00
    Past: 1.00
    Present: 1.00
  class_recall:
    Future: 1.00
    Past: 1.00
    Present: 1.00
  class_f1:
    Future: 1.00
    Past: 1.00
    Present: 1.00
  support:
    Future: 727
    Past: 577
    Present: 694
---

## Model Overview ⏳🔮🔄

This model is a **text classification model** trained to **predict the tense of English sentences**: **Past**, **Present**, or **Future**. It is based on the `bert-base-uncased` architecture.

## Intended Use 🔍

This model can be used in applications such as:

- Identifying whether statements discuss past needs, motivations, products, and similar topics. ⏪
- Determining current events or situations in text. ⏺️
- Predicting future plans or intentions based on sentence structure. ⏩

### Example Sentences and Labels 📝

| Sentence | Label |
|--------------------------------------------------------------------------|---------|
| the fishermen had caught a variety of fish including bass and perch | Past |
| medical professionals are researching the impact of social determinants on health | Present |
| in the future robotic surgical systems will have been empowering surgeons to perform increasingly complex procedures | Future |

## Training Details 🏋️‍♂️

The model was fine-tuned on the `ProfessorLeVesseur/EnglishTense` dataset, which provides a diverse set of sentences labeled with their respective tenses. Training ran for three epochs with a learning rate of 5e-5.
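The card does not include the full training script, but the setup described above can be sketched with the `Trainer` API. This is a minimal sketch, not the authors' actual code: the dataset's column names (`text`, `label`), the split name, and all unmentioned hyperparameters (batch size, weight decay, etc.) are assumptions.

```python
# Hypothetical fine-tuning sketch for the setup described above.
# Only the learning rate, epoch count, base model, and dataset name
# come from this card; everything else is an assumption.
LEARNING_RATE = 5e-5
NUM_EPOCHS = 3
MODEL_NAME = "google-bert/bert-base-uncased"
DATASET_NAME = "ProfessorLeVesseur/EnglishTense"

def build_trainer():
    # Imports kept inside the function so the sketch can be read without
    # transformers/datasets installed
    from transformers import (AutoModelForSequenceClassification,
                              AutoTokenizer, Trainer, TrainingArguments)
    from datasets import load_dataset

    dataset = load_dataset(DATASET_NAME)
    tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
    # "text" is an assumed column name on this dataset
    tokenized = dataset.map(lambda b: tokenizer(b["text"], truncation=True),
                            batched=True)
    model = AutoModelForSequenceClassification.from_pretrained(
        MODEL_NAME, num_labels=3)  # Past / Present / Future
    args = TrainingArguments(output_dir="tense-classifier",
                             learning_rate=LEARNING_RATE,
                             num_train_epochs=NUM_EPOCHS)
    return Trainer(model=model, args=args,
                   train_dataset=tokenized["train"], tokenizer=tokenizer)

# build_trainer().train()  # uncomment to run the fine-tuning
```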

## Evaluation Results 📊

The model achieves an **accuracy of 1.00** on the test set, with **precision**, **recall**, and **F1-scores** also at **1.00 for every class**. Perfect scores suggest the dataset's tense cues are highly regular; see the Limitations section below for caveats on generalization.

### Classification Report ✅

| Class | Precision | Recall | F1-Score | Support |
|-------------|-----------|--------|----------|---------|
| Future | 1.00 | 1.00 | 1.00 | 727 |
| Past | 1.00 | 1.00 | 1.00 | 577 |
| Present | 1.00 | 1.00 | 1.00 | 694 |
| **Accuracy**| | | 1.00 | 1998 |
| **Macro Avg** | 1.00 | 1.00 | 1.00 | 1998 |
| **Weighted Avg** | 1.00 | 1.00 | 1.00 | 1998 |
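For reference, the macro average in the report is the unweighted mean of the per-class scores, while the weighted average weights each class by its support. A quick sketch of that arithmetic using the F1 and support values from the table:

```python
# Per-class F1 scores and support counts from the classification report above
f1 = {"Future": 1.00, "Past": 1.00, "Present": 1.00}
support = {"Future": 727, "Past": 577, "Present": 694}

total = sum(support.values())  # 1998 test examples

# Macro average: unweighted mean across classes
macro_f1 = sum(f1.values()) / len(f1)

# Weighted average: each class contributes in proportion to its support
weighted_f1 = sum(f1[c] * support[c] for c in f1) / total

print(total, macro_f1, weighted_f1)  # 1998 1.0 1.0
```

Because every class scores 1.00 here, the macro and weighted averages coincide; with uneven per-class scores they would differ.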

### Limitations ⚠️

While the model performs well on this dataset, it may not generalize to all types of English text, particularly text with ambiguous or complex sentence structures.

## How to Use 🚀

This model can be used for text classification tasks, either on individual text inputs or in batch via a DataFrame. Below are examples of both use cases.

### Classifying Input Text

To classify a single piece of text and retrieve the predicted label along with its confidence score:

```python
from transformers import pipeline  # Import the pipeline function from the transformers library

# Initialize a text classification pipeline using the fine-tuned model
classifier = pipeline(
    "text-classification",
    model="ProfessorLeVesseur/bert-base-cased-timeframe-classifier"  # Model from the Hugging Face Hub
)

# Classify the input text; the result is a list with one {"label": ..., "score": ...} dict
result = classifier("MTSS.ai is the future of education, call it education².")
print(result)
```
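The `score` in the pipeline's output is derived from the model's raw logits with a softmax. A minimal, self-contained sketch of that conversion (the logit values below are made up for illustration):

```python
import math

def softmax(logits):
    # Subtract the max for numerical stability, then normalize the exponentials
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical logits for the three classes (Past, Present, Future)
logits = [0.3, 0.5, 4.2]
scores = softmax(logits)

# The pipeline reports the winning class's softmax value as the confidence score
best = max(scores)
```

The scores always sum to 1 across the three classes, so a score near 1.0 means the other two tenses received negligible probability mass.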

### Classifying Text in a DataFrame

For batch processing, you can classify multiple text entries stored in a DataFrame. This example reads a CSV file and adds a new column with the predicted labels:

```python
from transformers import pipeline  # Import the pipeline function from the transformers library
import pandas as pd  # Import pandas for data manipulation

# Read the CSV file into a DataFrame
file_path = 'filename.csv'
df = pd.read_csv(file_path)

# Initialize the text classification pipeline
classifier = pipeline(
    "text-classification",
    model="ProfessorLeVesseur/bert-base-cased-timeframe-classifier"  # Model from the Hugging Face Hub
)

# Classify each entry in the "Text" column and store the predicted label in a new "label" column
df['label'] = df['Text'].apply(lambda text: classifier(text)[0]['label'])

# Display the first 5 rows of the DataFrame with the new "label" column
print(df.head(5))
```
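Row-by-row `apply` invokes the model once per sentence. The pipeline also accepts a list of texts, which lets it batch inputs for a noticeable speedup on large DataFrames: with the real pipeline you would call `classifier(df['Text'].tolist(), batch_size=32)`. The sketch below shows the pattern with a stand-in `classify_batch` function (an assumption for illustration, so the example stays self-contained):

```python
import pandas as pd

def classify_batch(texts):
    # Stand-in for the real pipeline call, e.g.
    #   classifier(texts, batch_size=32)
    # which returns one {"label": ..., "score": ...} dict per input text
    return [{"label": "Present", "score": 1.0} for _ in texts]

df = pd.DataFrame({"Text": ["she ran home", "he will arrive soon"]})

# One call for the whole column instead of one call per row
results = classify_batch(df["Text"].tolist())
df["label"] = [r["label"] for r in results]
print(df)
```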