foongminwong committed on
Commit 7e9a5d0 · 1 Parent(s): 818bcad

Update README.md

Files changed (1)
  1. README.md +5 -5
README.md CHANGED
@@ -1,7 +1,7 @@
  ## Coding Challenge - Deep Learning for NLP (Foong)
 
  ### Description:
- This repository contains notebook using scikit-learn SVM to classify real & fake news.
+ This repository contains a Jupyter notebook using scikit-learn SVM to classify real & fake news.
 
  Dataset: https://www.kaggle.com/clmentbisaillon/fake-and-real-news-dataset
  Libraries used: Scikit-learn, NLTK, pandas, numpy, csv
@@ -9,7 +9,7 @@ Libraries used: Scikit-learn, NLTK, pandas, numpy, csv
  ### Write-up:
  The accuracy of the model is 0.995.
 
- There are a couple misclassified news articles and to improve the model's performance on these news articles, here're some suggestions:
- - Remove stop words: The news article title and text contain a lot of most commonly used words which should be removed as features. Tehrefore, more data cleaning should be eprformed prior to model building.
- - Try using neural network by setting batch size, apply dropout & finetuning it
- - Run cross validation
+ There are a couple of misclassified news articles and to improve the model's performance on these news articles, here're some suggestions:
+ - Remove stop words: The news article title and text contain a lot of commonly used words which should be removed as features. Therefore, more data cleaning should be performed prior to model building.
+ - Try using the neural network by setting batch size, apply dropout & finetuning it
+ - Run cross-validation
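
Two of the suggestions in the write-up, stop-word removal and cross-validation, can be illustrated with a minimal plain-Python sketch. This is not the notebook's code: the stop-word list, the sample headline, and the helper names `remove_stop_words` and `k_fold_indices` are all assumptions made up for illustration.

```python
# Hypothetical, tiny stop-word list for illustration only;
# a real list (e.g. NLTK's English stopwords corpus) is much larger.
STOP_WORDS = {"the", "a", "an", "of", "and", "to", "in", "is"}


def remove_stop_words(text):
    """Lowercase the text, split on whitespace, and drop stop words."""
    return [tok for tok in text.lower().split() if tok not in STOP_WORDS]


def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) pairs for k contiguous, near-equal folds."""
    # The first (n_samples % k) folds get one extra sample each.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size


# Made-up headline, not taken from the dataset.
tokens = remove_stop_words("The president is said to resign in a statement")
# Each sample index appears in exactly one test fold.
folds = list(k_fold_indices(10, 3))
```

In a scikit-learn workflow the same ideas are usually expressed with library calls, for example `TfidfVectorizer(stop_words="english")` for the filtering and `cross_val_score` for the fold-wise evaluation. Whether those fit the notebook's existing pipeline is not stated in this commit.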