# Danish Sentiment Analysis
## Information
- Dataset : [DDSC/angry-tweets](https://huggingface.co/datasets/DDSC/angry-tweets)
- Base model : [Danish bert botxo](https://huggingface.co/Maltehb/danish-bert-botxo)
## Approach
- Preprocessing
  - Usernames and links are replaced with the placeholders @USER and [LINK]; these placeholders are then removed
  - Hashtags are removed, as they generally do not contribute to sentiment
  - Emojis are removed, as the models used in this notebook do not take emojis into consideration (replacing them with their meaning could also be tested)
  - Lowercasing
  - Stopword removal, using the Danish stopword list from NLTK
- Training with the HF Trainer
- Training with a PyTorch loop
- Uploading the model to the Hugging Face Hub
- FastAPI endpoint
- Packaging the API service as a Docker container
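The preprocessing steps listed above could look roughly like the sketch below. It is not the notebook's actual code: the function name `preprocess`, the regular expressions, and the tiny `EXAMPLE_STOPWORDS` set are assumptions made for illustration. It also assumes that a hashtag is dropped together with its text, and that the emoji ranges shown are good enough for tweets (they are not exhaustive). In practice the stopword set would come from NLTK's Danish list.

```python
import re

# Illustrative subset only (assumption). The real list would come from NLTK,
# e.g. nltk.corpus.stopwords.words("danish") after nltk.download("stopwords").
EXAMPLE_STOPWORDS = frozenset({"og", "i", "det", "at", "en", "er", "jeg", "på"})

# Placeholders that stand in for usernames and links.
PLACEHOLDER_RE = re.compile(r"@USER|\[LINK\]")

# A hashtag and the word attached to it (assumption: the whole tag is dropped).
HASHTAG_RE = re.compile(r"#\w+")

# Rough emoji coverage (assumption, not exhaustive).
EMOJI_RE = re.compile(
    "["
    "\U0001F1E6-\U0001F1FF"  # regional indicator symbols (flags)
    "\U0001F300-\U0001FAFF"  # pictographs, emoticons, transport, extended symbols
    "\u2600-\u27BF"          # miscellaneous symbols and dingbats
    "\uFE0F\u200D"           # variation selector and zero-width joiner
    "]+"
)


def preprocess(text, stopwords=EXAMPLE_STOPWORDS):
    """Apply the preprocessing steps in the order given in the list above."""
    text = PLACEHOLDER_RE.sub(" ", text)  # remove @USER and [LINK]
    text = HASHTAG_RE.sub(" ", text)      # remove hashtags
    text = EMOJI_RE.sub(" ", text)        # remove emojis
    text = text.lower()                   # lowercase
    # Whitespace tokenisation, then stopword removal.
    tokens = [tok for tok in text.split() if tok not in stopwords]
    return " ".join(tokens)


print(preprocess("@USER Det er en FANTASTISK film 😀 #biograf [LINK]"))
# prints: fantastisk film
```

The placeholders are removed before lowercasing so that the case-sensitive `@USER` and `[LINK]` patterns still match. Passing `stopwords=frozenset()` skips stopword removal, which makes it easy to compare model quality with and without that step.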