This is our implementation for the short paper "Semantic Product Search with Graph Transformers".
In this work, we propose a graph-based semantic product retrieval framework for known queries that combines sentence-transformer embeddings with graph neural networks (GNNs) to refine product representations for ranking.
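To make the idea concrete, here is a minimal sketch of this pipeline using sentence-transformers and PyTorch Geometric. The model name, the toy graph, and the choice of GCN layers are illustrative assumptions, not the repository's actual architecture:

```python
import torch
import torch.nn.functional as F
from sentence_transformers import SentenceTransformer
from torch_geometric.nn import GCNConv

# Encode product texts with a sentence transformer (model name is an example).
encoder = SentenceTransformer("all-MiniLM-L6-v2")
titles = ["red running shoes", "blue trail sneakers", "wooden desk lamp"]
x = torch.tensor(encoder.encode(titles))  # [num_products, embedding_dim]

# Toy product graph: one undirected edge between the two shoe products.
edge_index = torch.tensor([[0, 1],
                           [1, 0]], dtype=torch.long)  # [2, num_edges]

class Refiner(torch.nn.Module):
    """Two GCN layers that refine the initial text embeddings."""
    def __init__(self, dim):
        super().__init__()
        self.conv1 = GCNConv(dim, dim)
        self.conv2 = GCNConv(dim, dim)

    def forward(self, x, edge_index):
        h = F.relu(self.conv1(x, edge_index))
        return self.conv2(h, edge_index)

model = Refiner(x.size(1))
refined = model(x, edge_index)  # refined product representations

# Rank products for a query by cosine similarity to the refined embeddings.
query = torch.tensor(encoder.encode(["running shoes"]))
scores = F.cosine_similarity(query, refined)
print(scores.argsort(descending=True))  # product indices, best match first
```

In the repository, the graph construction and the GNN variant are configurable through the Experiment.py arguments described below.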
This repository provides an easy-to-use and scalable framework for exploring different Graph Neural Network (GNN) architectures for this problem.
We evaluate our models on two datasets, the Amazon ESCI dataset and the Wayfair WANDS dataset. To keep experiments tractable, we reduce both datasets to a subset of the data and use only English products and queries.
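As an illustration of this preprocessing, the sketch below filters the ESCI parquet files to English (US-locale) rows and draws a fixed-size subset. The column names follow the esci-data repository; the exact sampling logic in this repository may differ:

```python
import pandas as pd

examples = pd.read_parquet("data/esci-data/shopping_queries_dataset_examples.parquet")
products = pd.read_parquet("data/esci-data/shopping_queries_dataset_products.parquet")

# English subset: the ESCI 'product_locale' column is 'us', 'es', or 'jp'.
examples = examples[examples["product_locale"] == "us"]
products = products[products["product_locale"] == "us"]

# Draw a fixed number of judgments, e.g. the 10000-judgment setting.
subset = examples.sample(n=10000, random_state=0)
print(len(subset), "judgments,", subset["query_id"].nunique(), "distinct queries")
```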
To install the baseline packages, create a Python virtual environment. To use the entire repository, first install all requirements:
pip install -r requirements.txt
To use the same datasets, clone the two repositories:
git clone https://github.com/amazon-science/esci-data.git
git clone https://github.com/wayfair/WANDS.git
and then place the files in the following locations:
data/esci-data/shopping_queries_dataset_examples.parquet
data/esci-data/shopping_queries_dataset_products.parquet
data/wands-data/product.csv
data/wands-data/query.csv
data/wands-data/label.csv
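To verify the layout, a small hypothetical check (not part of the repository) can confirm that all expected files are in place:

```python
from pathlib import Path

expected = [
    "data/esci-data/shopping_queries_dataset_examples.parquet",
    "data/esci-data/shopping_queries_dataset_products.parquet",
    "data/wands-data/product.csv",
    "data/wands-data/query.csv",
    "data/wands-data/label.csv",
]
for path in expected:
    print(path, "->", "ok" if Path(path).is_file() else "MISSING")
```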
To use the model in gtpyg-gtconv.py, you also need to clone gt-pyg.
To run a single experiment, use the Experiment.py script:
python Experiment.py model dataset size test_subset --edges gc_random --batch_size 32 --add_edges 16 --loss_fct cosine_mse
With the following arguments:

- model: the model to use, from /scripts
- dataset: the dataset to use, either esci or wands
- size: the number of judgments, one of {10000, 50000, 100000}
- test_subset: the data subset index to use, between 0 and 9
- --edges gc_random: the edge construction rule to use
- --batch_size 32: the batch size
- --add_edges 16: the minimum number of edges per node (added if enough nodes share an attribute value)
- --loss_fct cosine_mse: the loss function used for training
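For reference, a minimal argparse sketch consistent with this interface might look as follows; the authoritative definitions live in Experiment.py and may differ in defaults and help text:

```python
import argparse

parser = argparse.ArgumentParser(description="Run one GNN ranking experiment")
parser.add_argument("model", help="model script from /scripts, e.g. gtpyg-gtconv")
parser.add_argument("dataset", choices=["esci", "wands"])
parser.add_argument("size", type=int, choices=[10000, 50000, 100000],
                    help="number of judgments")
parser.add_argument("test_subset", type=int, choices=range(10),
                    help="data subset index, 0-9")
parser.add_argument("--edges", default="gc_random", help="edge construction rule")
parser.add_argument("--batch_size", type=int, default=32)
parser.add_argument("--add_edges", type=int, default=16,
                    help="minimum edges per node (when enough nodes share an attribute value)")
parser.add_argument("--loss_fct", default="cosine_mse", help="training loss")
args = parser.parse_args()
```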
To run multiple experiments automatically and average their results, use ExperimentBatchTester.py.
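Conceptually, the batch tester loops over subsets and averages the resulting metric. A minimal sketch of that loop, assuming Experiment.py prints its final score on the last line of stdout (an assumption, not the script's documented contract):

```python
import statistics
import subprocess

scores = []
for subset in range(10):
    # Example configuration; any model/dataset/size combination works here.
    out = subprocess.run(
        ["python", "Experiment.py", "gtpyg-gtconv", "esci", "10000", str(subset),
         "--edges", "gc_random", "--batch_size", "32",
         "--add_edges", "16", "--loss_fct", "cosine_mse"],
        capture_output=True, text=True, check=True,
    )
    # Assumed: the last stdout line is the final score of the run.
    scores.append(float(out.stdout.strip().splitlines()[-1]))

print("mean score over subsets:", statistics.mean(scores))
```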