gundruke/da_bert_sentiment_analysis

Text Classification

text-embeddings-inference

	# Danish Sentiment Analysis
	## Information
	- Dataset: [DDSC/angry-tweets](https://huggingface.co/datasets/DDSC/angry-tweets)
	- Base model: [Danish BERT BotXO](https://huggingface.co/Maltehb/danish-bert-botxo)

	## Approach
	- Preprocessing
		- Usernames and links are replaced with `@USER` and `[LINK]` respectively; these placeholders are then removed
		- Hashtags are removed, as they generally do not contribute to sentiment
		- Emoji are removed, as the models used in this notebook do not take them into consideration (replacing them with their meaning could also be tested)
		- Text is lowercased
		- Stopwords are removed using the Danish stopword list from NLTK
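
The preprocessing steps above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the function name `preprocess`, the regular expressions, the whitespace tokenization, and the tiny `EXAMPLE_STOPWORDS` set are all assumptions made for the example. The real pipeline uses NLTK's Danish stopword list, and the emoji pattern here covers only the most common Unicode ranges.

```python
import re

# Placeholders for usernames and links, as named in the list above.
PLACEHOLDER_RE = re.compile(r"@USER|\[LINK\]")

# Assumption: a hashtag is '#' followed by word characters, and the whole tag is dropped.
HASHTAG_RE = re.compile(r"#\w+")

# Approximate emoji ranges; this is not an exhaustive definition of "emoji".
EMOJI_RE = re.compile(
    "["
    "\U0001F1E6-\U0001F1FF"  # regional indicator symbols (flags)
    "\U0001F300-\U0001FAFF"  # pictographs, emoticons, transport, supplemental symbols
    "\u2600-\u27BF"          # miscellaneous symbols and dingbats
    "\u200D\uFE0F"           # zero-width joiner and variation selector-16
    "]+"
)

# Illustrative stand-in only. With NLTK installed, the Danish list can be loaded via
#   nltk.download("stopwords"); nltk.corpus.stopwords.words("danish")
EXAMPLE_STOPWORDS = frozenset({"og", "i", "det", "at", "en", "er", "jeg", "ikke"})


def preprocess(text, stopwords=EXAMPLE_STOPWORDS):
    """Apply the listed cleaning steps in order and return the cleaned string."""
    # Placeholders are removed before lowercasing because they are upper-case.
    text = PLACEHOLDER_RE.sub(" ", text)
    text = HASHTAG_RE.sub(" ", text)
    text = EMOJI_RE.sub(" ", text)
    text = text.lower()
    # Whitespace tokenization; punctuation stays attached to its token.
    tokens = [token for token in text.split() if token not in stopwords]
    return " ".join(tokens)


print(preprocess("@USER Det er en dejlig dag 😀 #sommer [LINK]"))  # dejlig dag
```

Note that stopword lists typically include negations such as "ikke", so it may be worth checking how stopword removal affects negated sentences before relying on it.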

	- Training with the Hugging Face `Trainer`
	- Training with a custom PyTorch loop
	- Uploading the model to the Hugging Face Hub
	- Serving the model through a FastAPI endpoint
	- Packaging the API service as a Docker container
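
As a rough idea of what the FastAPI endpoint step could look like, the sketch below wraps a text-classification pipeline in a single route. It is not the author's service: the `/predict` path, the `PredictRequest` schema, the response fields, and the helper names `load_classifier`, `predict_one`, and `create_app` are all hypothetical. The model id is the repository name of this model card, and the code assumes `transformers`, `fastapi`, and `pydantic` are installed.

```python
from typing import Callable, Dict, List

# Repository id of this model card on the Hugging Face Hub.
MODEL_ID = "gundruke/da_bert_sentiment_analysis"

# A classifier maps a string to a list of {"label": ..., "score": ...} dicts,
# which is the shape a transformers text-classification pipeline returns.
Classifier = Callable[[str], List[Dict[str, object]]]


def load_classifier() -> Classifier:
    """Load the model from the Hub (needs `transformers` and network access)."""
    from transformers import pipeline  # lazy import keeps the helpers below lightweight

    return pipeline("text-classification", model=MODEL_ID)


def predict_one(classifier: Classifier, text: str) -> Dict[str, object]:
    """Run the classifier on one text and keep only the top prediction."""
    top = classifier(text)[0]
    return {"text": text, "label": top["label"], "score": float(top["score"])}


def create_app(classifier: Classifier):
    """Build a FastAPI app exposing a single hypothetical POST /predict route."""
    from fastapi import FastAPI
    from pydantic import BaseModel

    class PredictRequest(BaseModel):
        text: str

    app = FastAPI(title="Danish sentiment analysis")

    @app.post("/predict")
    def predict(request: PredictRequest):
        return predict_one(classifier, request.text)

    return app
```

One way to use this would be to put `app = create_app(load_classifier())` at the bottom of a module (say, a hypothetical `main.py`) and start it with `uvicorn main:app`. Passing the classifier into `create_app` rather than loading it at import time makes it easy to substitute a stub when testing the route.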