gundruke/da_bert_sentiment_analysis


gundruke committed on Dec 12, 2023

Commit 3edf9c5 · 1 Parent(s): fd5fa68

Create README.md

Files changed (1)

README.md +19 -0

README.md ADDED

	@@ -0,0 +1,19 @@

+# Danish Sentiment Analysis
+## Information
+- Dataset: [DDSC/angry-tweets](https://huggingface.co/datasets/DDSC/angry-tweets)
+- Base model: [Danish BERT BotXO](https://huggingface.co/Maltehb/danish-bert-botxo)
+## Approach
+- Preprocessing
+  - Usernames and links are replaced with the placeholders @USER and [LINK], and those placeholders are then removed
+  - Hashtags are removed, as they generally do not contribute to sentiment
+  - Emojis are removed, as the models used in this notebook do not take emojis into consideration (replacing them with their meaning could also be tested)
+  - Text is lowercased
+  - Stopwords are removed, using the Danish stopword list from NLTK
+- Training with the Hugging Face `Trainer`
+- Training with a PyTorch loop
+- Uploading the model to the Hugging Face Hub
+- FastAPI endpoint
+- API service packaged as a Docker container
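
The preprocessing steps listed in the README could be sketched roughly as follows. This is a minimal illustration, not the author's notebook code: the function name `preprocess`, the regular expressions, the emoji ranges, and the small stopword sample are all assumptions made for the example. The README uses NLTK's Danish stopword list; a tiny hand-picked subset stands in for it here so the snippet runs without downloading any corpus. Removing the whole hashtag (including its word) is also an assumption, since the README does not say whether only the `#` sign is dropped.

```python
import re

# Illustrative subset only (assumption). The README uses NLTK's Danish list,
# which can be loaded with:
#   import nltk; nltk.download("stopwords")
#   stopwords = set(nltk.corpus.stopwords.words("danish"))
DANISH_STOPWORDS_SAMPLE = {"og", "i", "det", "at", "en", "er", "jeg", "den", "til"}

# Placeholders for usernames and links.
PLACEHOLDER_RE = re.compile(r"@USER|\[LINK\]")

# Assumption: the whole hashtag, including its word, is dropped.
HASHTAG_RE = re.compile(r"#\w+")

# Assumption: a coarse set of Unicode ranges covering most common emoji.
EMOJI_RE = re.compile(
    "["
    "\U0001F300-\U0001FAFF"  # pictographs, emoticons, transport, extended symbols
    "\u2600-\u27BF"          # miscellaneous symbols and dingbats
    "\uFE0F"                 # emoji variation selector
    "]+"
)


def preprocess(text, stopwords=DANISH_STOPWORDS_SAMPLE):
    """Apply the README's preprocessing steps in the order they are listed."""
    text = PLACEHOLDER_RE.sub(" ", text)   # remove @USER and [LINK]
    text = HASHTAG_RE.sub(" ", text)       # remove hashtags
    text = EMOJI_RE.sub(" ", text)         # remove emojis
    text = text.lower()                    # lowercase
    tokens = [tok for tok in text.split() if tok not in stopwords]
    return " ".join(tokens)                # stopwords removed, whitespace normalized


if __name__ == "__main__":
    print(preprocess("@USER Jeg er SÅ glad for det 😀 #fedt [LINK]"))
```

Passing the full NLTK set through the `stopwords` parameter would bring the sketch closer to what the README describes. Note that stopword lists often contain negations, which can matter for sentiment, so the list is worth inspecting before use.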