cd work
git clone -b topnet_dev https://github.com/tcoulvert/SPAtop
python -m venv
pip install -e ./SPAtop

If the code does not run properly, package updates may have broken compatibility. In that case, install the environment from requirements.txt instead:
python -m venv
pip install -r requirements.txt

Copy the Delphes ROOT TTree datasets from:
- LPC EOS: /eos/uscms/store/user/tsievert/ttbar_hadronic/ttbar_hadronic_*.root, or
- non-LPC EOS: root://cmseos.fnal.gov//store/user/tsievert/ttbar_hadronic/ttbar_hadronic_*.root

to the data/delphes/v1/ttbar_hadronic directory.
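
On an LPC node, where EOS is visible as a regular directory, the copy step can be scripted. The following is a minimal sketch and not part of the repository: `copy_matching` is a hypothetical helper, and it assumes the source directory is readable through the local filesystem. For the `root://` URL, an XRootD client would be needed instead.

```python
import glob
import os
import shutil


def copy_matching(src_dir, pattern, dest_dir):
    """Copy every file in src_dir whose name matches pattern into dest_dir.

    Hypothetical helper; returns the sorted list of destination paths.
    """
    os.makedirs(dest_dir, exist_ok=True)
    copied = []
    for src in sorted(glob.glob(os.path.join(src_dir, pattern))):
        dest = os.path.join(dest_dir, os.path.basename(src))
        shutil.copy2(src, dest)  # copy2 also preserves timestamps
        copied.append(dest)
    return copied
```

For example, `copy_matching("/eos/uscms/store/user/tsievert/ttbar_hadronic", "ttbar_hadronic_*.root", "data/delphes/v1/ttbar_hadronic")` would mirror the LPC files into the expected directory.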
Convert to training and testing HDF5 files.
python -m src.data.delphes.convert_to_h5 data/delphes/v1/ttbar_hadronic/sample_*.root --out-file data/delphes/v1/ttbar_hadronic_training.h5
python -m src.data.delphes.convert_to_h5 data/delphes/v1/ttbar_hadronic/sample_*.root --out-file data/delphes/v1/ttbar_hadronic_testing.h5

!!! WARNING !!! : From this step on, this repo hasn't been updated, so don't expect things to work. When the repo is updated, this README will change to reflect that.
Override options file with --gpus 0 if no GPUs are available.
python -m spanet.train -of options_files/delphes/hhh_v2.json [--gpus 0]

Training via Kubernetes on the `cms-ml` namespace requires the following:
- `kubectl` configured to target the `cms-ml` namespace
- PersistentVolumeClaim named `spatopvol` containing training data (already created)
- Docker image `gitlab-registry.nrp-nautilus.io/jmduarte/hhh:latest` (to be updated)
Data should be placed under the PVC at:
spatopvol/data/delphes/v1/tt_training.h5
SPAtop training requires the following files, each located within the PVC:
| File | Description | Location |
|---|---|---|
| `tt_hadronic.yml` | Physics process settings | `/spatopvol/event_files/tt_hadronic.yml` |
| `spatop_v1.json` | SPANet model parameters | `/spatopvol/options_files/spatop_v1.json` |
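
Before launching a job, it can help to confirm that these files are actually present. The following is a minimal sketch, assuming the PVC is mounted at `/spatopvol` inside a pod (as the locations above suggest); `missing_files` and `REQUIRED_FILES` are hypothetical names, not part of the repository.

```python
from pathlib import Path

# Paths from this README, relative to the PVC mount point.
REQUIRED_FILES = [
    "event_files/tt_hadronic.yml",
    "options_files/spatop_v1.json",
    "data/delphes/v1/tt_training.h5",
]


def missing_files(pvc_root, required=REQUIRED_FILES):
    """Return the required relative paths that do not exist under pvc_root."""
    root = Path(pvc_root)
    return [rel for rel in required if not (root / rel).is_file()]
```

Calling `missing_files("/spatopvol")` from a pod that mounts the volume returns an empty list when everything is in place.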
Additionally, the Kubernetes job manifest is required:
`spatop-job-train.yml`: Defines the Job spec for launching the SPAtop training container. Place this file in the local working directory where you run `kubectl` commands.
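
The contents of `spatop-job-train.yml` are not reproduced in this README. As a rough illustration of what such a Job spec involves, the sketch below builds an equivalent manifest as a Python dict and serializes it to JSON, which `kubectl apply -f` also accepts. The container command, the `/spatopvol` mount path, the `backoffLimit` value, and the names `build_train_job` and `manifest_json` are assumptions; GPU resource requests are omitted.

```python
import json


def build_train_job(
    name="spatop-job-train",
    namespace="cms-ml",
    image="gitlab-registry.nrp-nautilus.io/jmduarte/hhh:latest",
    claim="spatopvol",
    mount_path="/spatopvol",  # assumed mount point of the PVC
):
    """Return a Kubernetes batch/v1 Job manifest as a plain dict."""
    container = {
        "name": "train",
        "image": image,
        # Assumed entry point, modelled on the local training command.
        "command": [
            "python", "-m", "spanet.train",
            "-of", f"{mount_path}/options_files/spatop_v1.json",
        ],
        "volumeMounts": [{"name": "data", "mountPath": mount_path}],
    }
    volume = {"name": "data", "persistentVolumeClaim": {"claimName": claim}}
    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "backoffLimit": 0,  # assumption: do not retry a failed training
            "template": {
                "spec": {
                    "restartPolicy": "Never",
                    "containers": [container],
                    "volumes": [volume],
                }
            },
        },
    }


def manifest_json():
    """Serialize the manifest; the result can be saved and passed to kubectl."""
    return json.dumps(build_train_job(), indent=2)
```

The real manifest may differ in any of these fields; treat this only as a map of which pieces (image, PVC mount, command) a training Job needs to tie together.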
To start the SPAtop training job, apply the Kubernetes manifest:
kubectl apply -f spatop-job-train.yml -n cms-ml

This command creates a Kubernetes Job that spawns one or more pods to perform the training.
- List Jobs and Pods
  kubectl get jobs -n cms-ml
  kubectl get pods -l job-name=spatop-job-train -n cms-ml
- Describe a Pod
  kubectl describe pod <pod-name> -n cms-ml
- Stream Logs
  kubectl logs -f <pod-name> -n cms-ml
Repeat these commands to track pod status, resource usage, and training progress.
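
Rather than re-running the commands by hand, the Job status can be read programmatically. The sketch below is one possible approach, not part of the repository: it calls `kubectl get job ... -o json` and classifies the Job from the `succeeded`, `active`, and `failed` counters of its status. The function names are hypothetical.

```python
import json
import subprocess


def fetch_job_json(name="spatop-job-train", namespace="cms-ml"):
    """Return the JSON description of a Job as printed by kubectl."""
    result = subprocess.run(
        ["kubectl", "get", "job", name, "-n", namespace, "-o", "json"],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout


def job_state(job_json):
    """Classify a Job as succeeded, running, failed, or pending."""
    status = json.loads(job_json).get("status", {})
    if status.get("succeeded", 0) > 0:
        return "succeeded"
    if status.get("active", 0) > 0:
        return "running"  # a pod is still active, possibly after a retry
    if status.get("failed", 0) > 0:
        return "failed"
    return "pending"
```

A polling loop could call `job_state(fetch_job_json())` every minute or so and stop once the result is `succeeded` or `failed`.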
Once training completes successfully, remove the job and its pods:
kubectl delete job spatop-job-train -n cms-ml

The following assumes the output log directory is spanet_output/version_0.
Add --gpu if a GPU is available.
python -m spanet.test spanet_output/version_0 -tf data/delphes/v2/hhh_testing.h5 [--gpu]
python -m src.models.test_baseline --test-file data/delphes/v2/hhh_testing.h5

The CMS dataset was updated to run with the v26 setup (nAK4 >= 4 and HLT selection). The update includes the option to apply the b-jet energy correction. By keeping events with at least 4 jets, the boosted training can be performed on a maximum number of events and topologies.
List of samples (setup currently validated using 2018):
/eos/user/m/mstamenk/CxAOD31run/hhh-6b/cms-samples-spanet/v26/GluGluToHHHTo6B_SM_spanet_v26_2016APV.root
/eos/user/m/mstamenk/CxAOD31run/hhh-6b/cms-samples-spanet/v26/GluGluToHHHTo6B_SM_spanet_v26_2016.root
/eos/user/m/mstamenk/CxAOD31run/hhh-6b/cms-samples-spanet/v26/GluGluToHHHTo6B_SM_spanet_v26_2017.root
/eos/user/m/mstamenk/CxAOD31run/hhh-6b/cms-samples-spanet/v26/GluGluToHHHTo6B_SM_spanet_v26_2018.root
To run the framework, first convert the samples (this allows using either jet pt or ptcorr, steerable from the configuration file):
mkdir -p data/cms/v26/
python -m src.data.cms.convert_to_h5 /eos/user/m/mstamenk/CxAOD31run/hhh-6b/cms-samples-spanet/v26/GluGluToHHHTo6B_SM_spanet_v26_2018.root --out-file data/cms/v26/hhh_training.h5
python -m src.data.cms.convert_to_h5 /eos/user/m/mstamenk/CxAOD31run/hhh-6b/cms-samples-spanet/v26/GluGluToHHHTo6B_SM_spanet_v26_2018.root --out-file data/cms/v26/hhh_testing.h5
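
The commands above convert only the 2018 sample. If the other years listed earlier are needed, the same conversion could be driven from a short script. This is a sketch under assumptions: the per-year output file name `hhh_<year>.h5` and the helper `convert_command` are hypothetical, and only 2018 is stated to be validated.

```python
SAMPLE_DIR = "/eos/user/m/mstamenk/CxAOD31run/hhh-6b/cms-samples-spanet/v26"
YEARS = ["2016APV", "2016", "2017", "2018"]


def convert_command(year, out_dir="data/cms/v26"):
    """Build the convert_to_h5 command line for one data-taking year."""
    sample = f"{SAMPLE_DIR}/GluGluToHHHTo6B_SM_spanet_v26_{year}.root"
    # Hypothetical per-year output name; the README writes hhh_training.h5.
    out_file = f"{out_dir}/hhh_{year}.h5"
    return [
        "python", "-m", "src.data.cms.convert_to_h5",
        sample, "--out-file", out_file,
    ]
```

Each command can then be executed with `subprocess.run(convert_command(year), check=True)` for `year` in `YEARS`.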
Then training can be done via:
python -m spanet.train -of options_files/cms/hhh_v26.json --gpus 1
Two config files exist for the event options:
event_files/cms/hhh.yaml # regular jet pT
event_files/cms/hhh_bregcorr.yaml # jet pT with b-jet energy correction scale factors applied
Note: to run the training with the b-jet energy correction applied, the log_normalize setting of the input variable was removed. Keeping it caused an 'Assignement collision' error.