Import datasets, perform exploratory data analysis, scale data, and run models such as linear or logistic regression, decision trees, random forests, k-means, and support vector machines.
Import Modules
Install a required module on the system:  
  "pip3 install module-name" 
Process Data 
 process_data.py contains the following functions : 
  get_file_names_in_dir(dir_name) : print the names of the files to process in the directory  
  dataset_import(file_name, dataset_type) : import the dataset & print a description such as data size, rows, columns, unique and null values  
  dataset_EDA(data, pairplot_columns) : plot a pairplot and a heatmap  
  dataset_scrubbing(data, scrub_type, data_columns, fill_operation) : clean data by removing or filling missing values, handle categorical variables using one-hot encoding, remove entire columns  
  pre_model_algorithm(df, algorithm, target_column) : preprocess data using principal component analysis or k-means clustering 
  split_validation(dataset, features, target_column, test_split) : split the dataset into train & test sets, including the target column, with the desired split ratio 
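
The scrubbing and splitting steps listed above can be sketched roughly as follows. This is a minimal illustration, not the code in process_data.py: it assumes pandas and scikit-learn are installed, and the toy dataset, the column names, and the choice of a mean fill are all hypothetical.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical toy dataset standing in for an imported file.
data = pd.DataFrame({
    "age": [25, 32, None, 41, 38, 29, 50, 45],
    "city": ["NY", "LA", "NY", "SF", "LA", "SF", "NY", "LA"],
    "income": [40, 55, 48, 80, 62, 51, 90, 75],
})

# Fill missing numeric values with the column mean (one possible fill operation).
data["age"] = data["age"].fillna(data["age"].mean())

# One-hot encode the categorical column.
data = pd.get_dummies(data, columns=["city"])

# Split features and target into train and test sets with a 25% test ratio.
target_column = "income"
features = [c for c in data.columns if c != target_column]
X_train, X_test, y_train, y_test = train_test_split(
    data[features], data[target_column], test_size=0.25, random_state=0
)
print(X_train.shape, X_test.shape)
```

Fixing random_state keeps the split reproducible between runs, which makes model comparisons easier.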
Run Model 
 run_model.py contains the following models : 
  linear_regression(X_train, X_test, y_train, y_test, show_columns, target_column) : continuous predictions 
  logistic_regression(X_train, X_test, y_train, y_test, show_columns, target_column) : discrete predictions 
  decision_tree_classifier(X_train, X_test, y_train, y_test, show_columns, target_column) : both continuous & discrete predictions 
  random_forest_classifier(X_train, X_test, y_train, y_test, show_columns, target_column, num_estimators) : both continuous & discrete predictions 
  gradient_boosting(X_train, X_test, y_train, y_test, show_columns, target_column, gb_type) : regressor for continuous & classifier for discrete 
  k_neighbors_classifier(X_train, X_test, y_train, y_test, show_columns, target_column, k, scaled_features) : continuous, discrete, ordinal, categorical data predictions 
  support_vector_classifier(X_train, X_test, y_train, y_test, show_columns, target_column) : continuous data predictions   
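
As a rough illustration of the difference between continuous and discrete predictions, the sketch below fits one regressor and one classifier on synthetic data. It assumes scikit-learn is installed; fit_and_score is a hypothetical helper and does not reproduce the signatures or behavior of the functions in run_model.py.

```python
from sklearn.datasets import make_classification, make_regression
from sklearn.linear_model import LinearRegression, LogisticRegression
from sklearn.model_selection import train_test_split


def fit_and_score(model, X, y, test_split=0.25):
    """Hypothetical helper: split, fit, and return (predictions, test score)."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=test_split, random_state=0
    )
    model.fit(X_train, y_train)
    return model.predict(X_test), model.score(X_test, y_test)


# Continuous target: linear regression; score() returns R^2.
X_reg, y_reg = make_regression(n_samples=200, n_features=4, noise=5.0, random_state=0)
reg_pred, reg_r2 = fit_and_score(LinearRegression(), X_reg, y_reg)

# Discrete target: logistic regression; score() returns accuracy.
X_clf, y_clf = make_classification(n_samples=200, n_features=4, random_state=0)
clf_pred, clf_acc = fit_and_score(LogisticRegression(), X_clf, y_clf)

print(f"R^2: {reg_r2:.3f}, accuracy: {clf_acc:.3f}")
```

The same helper accepts any scikit-learn estimator, so a decision tree or random forest could be swapped in without changing the surrounding code.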
References