PyTorch and MATLAB implementation of scalable nonsymmetric determinantal point process (NDPP) learning and MAP inference. For details, see our paper, Scalable Learning and MAP Inference for Nonsymmetric Determinantal Point Processes.
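As background on what the model learns: NDPP kernels are commonly parameterized with low-rank symmetric and skew-symmetric components, which presumably correspond to the --num_sym_embedding_dims and --num_nonsym_embedding_dims flags used below. The following PyTorch sketch is illustrative only; the decomposition L = V V^T + B (C - C^T) B^T, the variable names, and the helper function are assumptions for exposition, not this repository's actual API.

    import torch

    def subset_log_prob(V, B, C, items):
        """Unnormalized log-probability log det(L_Y) of an item subset Y under an
        assumed low-rank nonsymmetric kernel L = V V^T + B (C - C^T) B^T
        (illustrative parameterization; the repo's model may differ)."""
        V_Y = V[items]      # |Y| x d_sym rows of the symmetric embeddings
        B_Y = B[items]      # |Y| x d_nonsym rows of the nonsymmetric embeddings
        skew = C - C.T      # skew-symmetric core for the nonsymmetric part
        L_Y = V_Y @ V_Y.T + B_Y @ skew @ B_Y.T
        # det(L_Y) > 0 whenever the symmetric part V_Y V_Y^T is positive definite,
        # so the log-determinant is well defined here.
        return torch.logdet(L_Y)

    # Toy usage with random embeddings for a 1000-item catalog.
    N, d_sym, d_nonsym = 1000, 30, 30
    V = torch.randn(N, d_sym)
    B = torch.randn(N, d_nonsym)
    C = torch.randn(d_nonsym, d_nonsym)
    print(subset_log_prob(V, B, C, torch.tensor([3, 17, 256])))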
Install required Python packages:
$ pip install --user --requirement requirements.txt

Train and evaluate a scalable NDPP model using the Amazon apparel baby registry dataset:
$ cd src
$ python main.py --dataset_name basket_ids --input_file data/1_100_100_100_apparel_regs.csv --num_sym_embedding_dims 30 --num_nonsym_embedding_dims 30 --alpha 0 --batch_size 800

Train and evaluate a scalable NDPP using the UK retail dataset:
$ python main.py --dataset_name uk --num_sym_embedding_dims 100 --num_nonsym_embedding_dims 100 --alpha 0.001 --max_basket_size 100

To train and evaluate a scalable NDPP using the Instacart dataset, first download the Instacart dataset
from https://www.instacart.com/datasets/grocery-shopping-2017 and unpack it in the
data/instacart_2017_05_01 directory. Next, unzip the data/instacart_2017_05_01/meta_catalog.csv.zip file. Then run:
$ python main.py --dataset_name instacart --num_sym_embedding_dims 100 --num_nonsym_embedding_dims 100 --alpha 0.01 --max_basket_size 100 --batch_size 800

To train and evaluate a scalable NDPP using the Million Song dataset, first download the dataset
from http://millionsongdataset.com/sites/default/files/challenge/train_triplets.txt.zip and unpack the train_triplets.txt file into the data directory. Then run:
$ python main.py --dataset_name millionsong --num_sym_embedding_dims 150 --num_nonsym_embedding_dims 150 --alpha 0.01 --max_basket_size 150 --batch_size 400

For a full list of command-line options:
$ python main.py --help
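The repository also covers MAP inference, i.e., finding a high-probability item subset under the learned kernel. As a rough illustration of that objective only (a naive greedy baseline of our own, not the scalable MAP algorithm from the paper and not this repo's MATLAB code), a sketch in PyTorch might look like:

    import torch

    def greedy_map(L, k):
        """Naive greedy MAP search: grow a subset by adding, at each step, the
        item that yields the largest log det(L_Y). Brute-force baseline for
        illustration; far slower than the algorithm proposed in the paper."""
        N = L.shape[0]
        selected = []
        for _ in range(k):
            best_item, best_val = None, float("-inf")
            for i in range(N):
                if i in selected:
                    continue
                idx = torch.tensor(selected + [i])
                sign, logabsdet = torch.slogdet(L[idx][:, idx])
                if sign > 0 and logabsdet > best_val:
                    best_val, best_item = logabsdet, i
            if best_item is None:
                break
            selected.append(best_item)
        return selected

    # Toy usage: a small kernel with a positive semidefinite symmetric part plus
    # a skew-symmetric part, so all principal minors are nonnegative.
    N, d = 50, 10
    V = torch.randn(N, d)
    A = torch.randn(N, N)
    L = V @ V.T + (A - A.T)
    print(greedy_map(L, k=5))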