---
license: apache-2.0
datasets:
- James-kramer/football_news
language:
- en
metrics:
- accuracy
---

# Model Card for DistilBERT NFL News Sentiment

This is a finetuned version of DistilBERT used for sentiment analysis of NFL news titles.

## Model Details

### Model Description

This model uses DistilBERT to classify NFL news article titles as positive or negative.

- **Developed by:** Devin DeCosmo
- **Model type:** Binary sentiment analysis
- **Language(s) (NLP):** English
- **License:** Apache 2.0
- **Finetuned from model:** DistilBERT

## Uses

This model is intended for sentiment analysis of NFL article titles, but could potentially be applied to other article titles.

### Direct Use

The direct use is to classify NFL article titles as positive or negative.

### Out-of-Scope Use

Until the dataset is expanded, the model is not suited to sentiment analysis of other types of articles, or to related tasks such as detecting other features like bias toward a team or player.

## Bias, Risks, and Limitations

The model was validated on a small set of only 100 original titles and trained largely on synthetic data. A dataset this small is liable to overfitting, and the model is not robust.

### Recommendations

Because of the small dataset, this model is not highly generalizable; validate it on your own data before relying on its predictions.

## How to Get Started with the Model

Use the code below to get started with the model; a minimal usage sketch is provided in the Usage Example section at the end of this card.

## Training Details

### Training Data

`James-kramer/football_news`

This is the training dataset. It consists of 100 original titles, used for validation, along with 1,000 synthetic titles used for training.

### Training Procedure

The model was finetuned from DistilBERT for binary classification with an 80% training split and 5 epochs. I initially trained for more epochs, but the model converged extremely quickly. A hedged outline of this procedure appears in the Training Sketch section at the end of this card.

## Evaluation

### Testing Data, Factors & Metrics

#### Testing Data

`James-kramer/football_news`

The testing data was the `original` split: the 100 original titles in the dataset.

#### Factors

Each title is labeled as positive ("1") or negative ("0").

#### Metrics

The evaluation metric was accuracy. I also considered testing time: this small language model ran extremely quickly, at roughly 102 evaluation steps per second.

### Results

After training on the initial dataset, the model reached 100% accuracy on the validation split. This is likely due to the simplicity of the task (binary classification) and to DistilBERT being well suited to tasks such as this.

#### Summary

The model reached high accuracy, but this performance cannot be expected to hold in general because the dataset was very small. Additional testing with more samples would be highly beneficial.
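
## Usage Example

The card does not state where the finetuned weights are published, so the repository id below is a placeholder assumption; substitute the actual model id. This is a minimal inference sketch using the `transformers` pipeline API, not a confirmed snippet from the author.

```python
from transformers import pipeline

# Placeholder repository id -- the card does not state where the
# finetuned weights are published; substitute the real model id.
MODEL_ID = "devindecosmo/nfl-news-sentiment"

# A text-classification pipeline loads the tokenizer and model together.
classifier = pipeline("text-classification", model=MODEL_ID)

titles = [
    "Quarterback throws five touchdowns in stunning comeback win",
    "Star running back out for the season with torn ACL",
]

for title, result in zip(titles, classifier(titles)):
    # Each result is a dict like {"label": "LABEL_1", "score": 0.98};
    # in this card's labeling, 1 = positive and 0 = negative.
    print(f"{result['label']} ({result['score']:.2f}) <- {title}")
```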
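
## Training Sketch

For context on the procedure described under Training Details, here is a minimal sketch using the Hugging Face `Trainer`. The base checkpoint (`distilbert-base-uncased`), the `title` and `label` column names, and the `train` split name are assumptions; only the dataset id, the `original` split, binary classification, the 80% training split, 5 epochs, and accuracy as the metric come from the card.

```python
import numpy as np
from datasets import load_dataset
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

# Assumed base checkpoint; the card only says "DistilBERT".
BASE = "distilbert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(BASE)

def tokenize(batch):
    # "title" is an assumed column name for the headline text; the labels
    # are assumed to live in a "label" column, as Trainer expects.
    return tokenizer(batch["title"], truncation=True, padding="max_length")

dataset = load_dataset("James-kramer/football_news").map(tokenize, batched=True)

# 80/20 split of the synthetic training data, per the card. The "train"
# split name is an assumption; "original" is named in the card.
split = dataset["train"].train_test_split(test_size=0.2, seed=42)

def compute_metrics(eval_pred):
    # Accuracy, the metric reported in the card.
    logits, labels = eval_pred
    return {"accuracy": (np.argmax(logits, axis=-1) == labels).mean()}

# Binary classification head on top of DistilBERT, as the card describes.
model = AutoModelForSequenceClassification.from_pretrained(BASE, num_labels=2)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="nfl-news-sentiment", num_train_epochs=5),
    train_dataset=split["train"],
    eval_dataset=split["test"],
    compute_metrics=compute_metrics,
)
trainer.train()

# Final check on the 100 original (non-synthetic) titles.
print(trainer.evaluate(eval_dataset=dataset["original"]))
```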