Investigate the structural information in job-level anomaly detection

* [x] Compare the MLP with GNNs (GCN, GAT, GraphSAGE), with and without the position matrix
* [x] Add early stopping to avoid overfitting
* [x] Run a hyperparameter search (HPS) on the base models to find the best performance each model can achieve
* [x] Remove the noisy data and retrain with HPS to measure how much is gained by using clean data only (in both train and test sets)
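
The early-stopping item above can be illustrated with a minimal, framework-agnostic sketch. It is not the code used in this project: the names `EarlyStopping`, `patience`, `min_delta`, and `run_with_early_stopping` are hypothetical, and the rule shown (stop once the validation loss has not improved for `patience` consecutive epochs) is just one common variant.

```python
class EarlyStopping:
    """Signal a stop when the validation loss stops improving (hypothetical helper)."""

    def __init__(self, patience=5, min_delta=0.0):
        self.patience = patience      # epochs to wait without improvement
        self.min_delta = min_delta    # minimum decrease that counts as improvement
        self.best_loss = float("inf")
        self.bad_epochs = 0

    def step(self, val_loss):
        """Record one epoch's validation loss; return True if training should stop."""
        if val_loss < self.best_loss - self.min_delta:
            self.best_loss = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


def run_with_early_stopping(val_losses, patience=3):
    """Replay a list of per-epoch validation losses (stand-in for a training loop).

    Returns (index of the last epoch run, best validation loss seen).
    """
    stopper = EarlyStopping(patience=patience)
    for epoch, loss in enumerate(val_losses):
        if stopper.step(loss):
            return epoch, stopper.best_loss
    return len(val_losses) - 1, stopper.best_loss
```

In a real training loop, `stopper.step(...)` would be called once per epoch with the measured validation loss, and the model weights from the best epoch would typically be saved and restored when training stops.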