gundruke/da_bert_sentiment_analysis

Danish Sentiment Analysis

Information

Dataset: DDSC/angry-tweets
Base model: Danish BERT (BotXO)

Approach

Preprocessing
- Usernames and links are replaced with the placeholders @USER and [LINK], and those placeholders are then removed
- Hashtags are removed, as they generally do not contribute to sentiment
- Emoji are removed, as the models used in this notebook do not take emoji into consideration (replacing them with their meaning could also be tested)
- Text is lowercased
- Stopwords are removed, using the Danish stopword list from NLTK
Training with the Hugging Face Trainer
Training with a PyTorch loop
Uploading the model to the Hugging Face Hub
FastAPI endpoint
Packaging the API service as a Docker container
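
The preprocessing steps listed above could be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the function name `preprocess`, the regular expressions, and the small `DANISH_STOPWORDS` set are assumptions made for the example. In practice the stopword list would come from NLTK via `nltk.corpus.stopwords.words("danish")`, and the emoji pattern here only approximates the relevant Unicode ranges.

```python
import re

# Hypothetical stand-in for the NLTK Danish stopword list, kept tiny so the
# sketch runs without downloading NLTK data.
DANISH_STOPWORDS = frozenset(
    {"og", "i", "jeg", "det", "at", "en", "den", "til", "er", "på", "de", "med", "der"}
)

# Raw URLs plus the [LINK] placeholder.
_LINK_RE = re.compile(r"https?://\S+|\[LINK\]")
# Raw usernames plus the @USER placeholder.
_USER_RE = re.compile(r"@\w+")
# Hashtags, including Danish letters (\w is Unicode-aware in Python 3).
_HASHTAG_RE = re.compile(r"#\w+")
# Rough approximation of common emoji ranges (an assumption, not exhaustive).
_EMOJI_RE = re.compile("[\U0001F300-\U0001FAFF\u2600-\u27BF\uFE0F]")


def preprocess(text, stopwords=DANISH_STOPWORDS):
    """Apply the listed cleaning steps and return a whitespace-joined string."""
    text = _LINK_RE.sub(" ", text)      # drop links and [LINK]
    text = _USER_RE.sub(" ", text)      # drop usernames and @USER
    text = _HASHTAG_RE.sub(" ", text)   # drop hashtags
    text = _EMOJI_RE.sub(" ", text)     # drop emoji
    text = text.lower()                 # lowercase
    tokens = [tok for tok in text.split() if tok not in stopwords]
    return " ".join(tokens)


if __name__ == "__main__":
    print(preprocess("@USER Jeg er SÅ glad i dag 😀 #mandag [LINK]"))
    # prints: så glad dag
```

Passing the stopword set as a parameter makes it straightforward to swap in the full NLTK list, or to test the effect of skipping stopword removal altogether.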