The aim of this fork is to improve the original starter project code for students taking Intro to Machine Learning on Udacity, using Python 3.8, conda for environment management, and Jupyter notebooks.
- Lesson 2: Naive Bayes
- Lesson 3: SVM
- Lesson 4: Decision Trees
- Lesson 5: Choose Your Own Algorithm
- Lesson 6: Datasets and Questions
- Lesson 7: Regressions
- Lesson 8: Outliers
- Lesson 9: Clustering
- Lesson 10: Feature Scaling
- Lesson 11: Text Learning
- Lesson 12: Feature Selection
- Lesson 13: PCA
- Lesson 14: Validation
- Lesson 15: Evaluation Metrics
- Lesson 17: Final Project
This repo uses a newer version of scikit-learn. To get the results expected by the course grader,
you need to use SVC with gamma='auto', because the default value of gamma has changed. From the sklearn.svm.SVC docs:
Changed in version 0.22: The default value of gamma changed from 'auto' to 'scale'.
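To see why the two settings can give different results, here is a minimal pure-Python sketch of the two formulas given in the sklearn.svm.SVC docs: 'auto' uses 1 / n_features and 'scale' uses 1 / (n_features * X.var()). The helper names gamma_auto and gamma_scale and the toy matrix X are illustrative only and are not part of this repo or of scikit-learn.

```python
from statistics import pvariance


def gamma_auto(X):
    """gamma='auto': 1 / n_features."""
    n_features = len(X[0])
    return 1.0 / n_features


def gamma_scale(X):
    """gamma='scale': 1 / (n_features * X.var()).

    The variance is the population variance over all entries of X.
    """
    n_features = len(X[0])
    flat = [value for row in X for value in row]
    return 1.0 / (n_features * pvariance(flat))


# Toy data (made up): 2 samples, 2 features.
X = [[0.0, 2.0], [4.0, 6.0]]

print(gamma_auto(X))   # 0.5
print(gamma_scale(X))  # 0.1
```

Note that, according to the same docs, gamma is the kernel coefficient for the 'rbf', 'poly' and 'sigmoid' kernels, so the setting should only affect results for those kernels.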
For example:
clf = SVC(kernel='linear', gamma='auto')

To get the correct results (those accepted by the grader), set sort_keys='../utils/python2_lesson06_keys.pkl' for the
feature_format function:
...
data = feature_format(dictionary, features_list, remove_any_zeroes=True, sort_keys='../utils/python2_lesson06_keys.pkl')
...

[...] This will open up a file in the tools folder with the Python 2 key order.
See this for a detailed explanation.
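The idea behind a pickled key order can be sketched as follows. Dictionaries iterate in a different order in Python 2 than in Python 3, so the rows built from a dictionary come out in a different order unless the key order is fixed explicitly. This is only an illustration under that assumption, not the repo's actual feature_format code: the helper ordered_keys, the toy dictionary, and the temporary keys.pkl file are all hypothetical.

```python
import os
import pickle
import tempfile


def ordered_keys(dictionary, sort_keys=False):
    """Hypothetical helper: choose the order in which keys are visited.

    - sort_keys is a str: treat it as a path to a pickled list of keys
    - sort_keys is otherwise truthy: alphabetical order
    - otherwise: the dictionary's own iteration order
    """
    if isinstance(sort_keys, str):
        with open(sort_keys, "rb") as f:
            return pickle.load(f)
    if sort_keys:
        return sorted(dictionary)
    return list(dictionary)


# Toy data standing in for the course dataset (values are made up).
data = {"alice": 1, "bob": 2, "carol": 3}

# Write a fixed key order to a temporary pickle file, then read it back.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "keys.pkl")
    with open(path, "wb") as f:
        pickle.dump(["carol", "alice", "bob"], f)
    from_file = ordered_keys(data, sort_keys=path)

print(from_file)                            # ['carol', 'alice', 'bob']
print(ordered_keys(data, sort_keys=True))   # ['alice', 'bob', 'carol']
```

Because the order of the rows determines which samples land in the training and test splits, reusing a stored key order is one way to reproduce results computed under a different Python version.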
$ git clone https://github.com/trsvchn/ud120-projects-py3-jupyter.git
$ cd ud120-projects-py3-jupyter
$ conda env create -f environment.yml
$ conda activate ud120
$ python ./utils/starter.py
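After activating the environment, a quick sanity check like the one below can confirm which Python and package versions are in use. This is a sketch only: the contents of environment.yml are not shown here, so the package names "scikit-learn" and "notebook" are assumptions, and describe_environment is a hypothetical helper rather than part of the repo.

```python
import sys
from importlib import metadata  # in the standard library since Python 3.8


def describe_environment(packages=("scikit-learn", "notebook")):
    """Return the running Python version and the installed versions of `packages`."""
    report = {"python": "{}.{}".format(*sys.version_info[:2])}
    for name in packages:
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = None  # not installed in the active environment
    return report


if __name__ == "__main__":
    for name, version in describe_environment().items():
        print(f"{name}: {version or 'not installed'}")
```

A scikit-learn version of 0.22 or later means the default gamma is 'scale', which is the case the gamma='auto' note above addresses.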