Commit · 7e9a5d0
1 Parent(s): 818bcad
Update README.md
README.md CHANGED
@@ -1,7 +1,7 @@
 ## Coding Challenge - Deep Learning for NLP (Foong)
 
 ### Description:
-This repository contains notebook using scikit-learn SVM to classify real & fake news.
+This repository contains a Jupyter notebook using scikit-learn SVM to classify real & fake news.
 
 Dataset: https://www.kaggle.com/clmentbisaillon/fake-and-real-news-dataset
 Libraries used: Scikit-learn, NLTK, pandas, numpy, csv
@@ -9,7 +9,7 @@ Libraries used: Scikit-learn, NLTK, pandas, numpy, csv
 ### Write-up:
 The accuracy of the model is 0.995.
 
-There are a couple misclassified news articles and to improve the model's performance on these news articles, here're some suggestions:
-- Remove stop words: The news article title and text contain a lot of
-- Try using neural network by setting batch size, apply dropout & finetuning it
-- Run cross
+There are a couple of misclassified news articles and to improve the model's performance on these news articles, here're some suggestions:
+- Remove stop words: The news article title and text contain a lot of commonly used words which should be removed as features. Therefore, more data cleaning should be performed prior to model building.
+- Try using the neural network by setting batch size, apply dropout & finetuning it
+- Run cross-validation
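The stop-word and cross-validation suggestions in the write-up can be sketched in plain Python, without assuming anything about the notebook's actual code. The stop-word list below is a tiny hypothetical stand-in (with the libraries listed in the README, NLTK's English stop-word list would be the natural source), and `remove_stop_words` and `k_fold_indices` are illustrative names, not functions from this repository.

```python
import re

# Hypothetical, deliberately tiny stop-word list for illustration only;
# a real run would use a fuller list such as NLTK's English stop words.
STOP_WORDS = {"a", "an", "and", "the", "of", "to", "in", "is", "on", "for"}


def remove_stop_words(text, stop_words=STOP_WORDS):
    """Lowercase the text, split it into word tokens, and drop stop words."""
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    return [tok for tok in tokens if tok not in stop_words]


def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) pairs for k contiguous folds.

    No shuffling is done here; shuffle the data first if its order is not random.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    # Spread the remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size
```

Each article would be cleaned with `remove_stop_words` before feature extraction, and the classifier would then be trained and scored once per fold from `k_fold_indices`, averaging the per-fold accuracy. Averaging over folds gives a less optimistic estimate than a single train/test split. In practice scikit-learn's own cross-validation utilities would likely replace the hand-written splitter.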