Cluster the top 50 words together with their reversed forms: half of the occurrences of each word are replaced by its reverse (e.g., half of the occurrences of "canon" become "nonac"), which makes it possible to test clustering accuracy.
- Python & Jupyter Notebook
- NLTK, scikit-learn, and NumPy
- Clustering accuracy of 65%-70%
- Mark 10/13
*For more information, see Task 1 in coursework2-report.pdf (on GitHub)
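The reversed-word test described for Task 1 can be sketched roughly as follows. This is a minimal illustration, not the coursework code: the helper names `reverse_half` and `pair_accuracy` are hypothetical, reversing every second occurrence is just one way to transform half of them, and scoring a word as correct when it shares a cluster with its reverse is an assumed definition of accuracy (the report may define it differently).

```python
from collections import Counter


def reverse_half(tokens, targets):
    """Replace every second occurrence of each target word with its reverse."""
    seen = Counter()
    out = []
    for tok in tokens:
        if tok in targets:
            seen[tok] += 1
            # Even-numbered occurrences are reversed, so about half are transformed.
            out.append(tok[::-1] if seen[tok] % 2 == 0 else tok)
        else:
            out.append(tok)
    return out


def pair_accuracy(cluster_of, targets):
    """Assumed metric: share of target words clustered together with their reverse."""
    hits = sum(1 for w in targets if cluster_of[w] == cluster_of[w[::-1]])
    return hits / len(targets)


# Toy corpus: "canon" occurs four times, so two occurrences become "nonac".
tokens = "the canon of the canon was a canon in the canon".split()
transformed = reverse_half(tokens, {"canon"})

# Hypothetical cluster assignments, as a clustering step might produce them.
cluster_of = {"canon": 0, "nonac": 0, "river": 1, "revir": 2}
accuracy = pair_accuracy(cluster_of, ["canon", "river"])
```

Because a word and its reverse occur in the same kinds of contexts, a good clustering should put them in the same cluster, which is what gives the test a known right answer.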
Using the prelabelled corpus, build and train a neural network model to classify reviews as positive or negative. I created a bidirectional LSTM (Long Short-Term Memory) model to solve the task.
- Python & Jupyter Notebook
- NLTK, TensorFlow, and NumPy
- Accuracy of 73.4% with a low standard deviation of 0.017
- Mark 12/12
*For more information, see Task 2 in coursework2-report.pdf (on GitHub)
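For readers unfamiliar with the architecture, a bidirectional LSTM classifier for Task 2 might look something like the sketch below. It is not the model from the coursework: the vocabulary size, sequence length, and layer widths are placeholder assumptions, and the input is random token IDs rather than the review corpus.

```python
import numpy as np
import tensorflow as tf

VOCAB_SIZE = 10_000  # assumed vocabulary size, not taken from the report
MAX_LEN = 100        # assumed padded review length in tokens

model = tf.keras.Sequential([
    # Map each token ID to a dense vector.
    tf.keras.layers.Embedding(VOCAB_SIZE, 64),
    # Read the sequence forwards and backwards, then concatenate both final states.
    tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(32)),
    # Single sigmoid unit: probability that the review is positive.
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

# Forward pass on a dummy batch of four "reviews" to check the output shape.
batch = np.random.randint(0, VOCAB_SIZE, size=(4, MAX_LEN))
probs = model(batch).numpy()
```

Training would then call `model.fit` on the tokenised, padded reviews and their 0/1 labels.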