Implementation suggestions: How to use the "debias set" for different roles? #30

Hello! 👋 In the implementation suggestions, I'm curious about this bit:

> Pymetrics models are built for specific roles within specific companies. To achieve this customization, we collect data from top-performing incumbents in the target role. We then compare incumbents to a baseline sample of the over 1 million candidates who have applied to jobs through pymetrics. We also establish a special data set, which we call the debias set, which is sampled from a pool of 150,000 individuals who have voluntarily provided basic demographic information such as sex, ethnicity or age. From there, a wide variety of algorithms might be tested to create an initial machine learning model from the training data. The process itself is model agnostic. Multiple algorithms are fit in this process, and we are continuously testing new methods that might improve performance. The goal of the algorithm is to find the features that will most accurately and reliably separate the incumbent set from the baseline set. 

In particular, the selection method for the baseline sample and the "debias set" seems really important! Is this secret, or can you share the methodology you're using? I'm trying to understand how it fits in with the dataset that I would provide.
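
To make the question concrete, here is a rough sketch of how I currently imagine a debias set being drawn and used. This is only my guess, not your actual methodology: the stratified sampling, the `sample_debias_set`, `selection_rates` and `impact_ratio` helpers, the `sex` and `trait` fields, and the pass threshold are all assumptions I made up for illustration.

```python
import random
from collections import defaultdict


def sample_debias_set(pool, group_key, per_group, seed=0):
    """Draw up to `per_group` people from each demographic group.

    Assumption: stratified sampling. The quoted text only says the
    debias set is "sampled from a pool", not how.
    """
    rng = random.Random(seed)
    by_group = defaultdict(list)
    for person in pool:
        by_group[person[group_key]].append(person)
    sample = []
    for group in sorted(by_group):
        members = by_group[group]
        sample.extend(rng.sample(members, min(per_group, len(members))))
    return sample


def selection_rates(debias_set, group_key, score_fn, threshold):
    """Fraction of each group whose model score meets the threshold."""
    passed = defaultdict(int)
    total = defaultdict(int)
    for person in debias_set:
        group = person[group_key]
        total[group] += 1
        if score_fn(person) >= threshold:
            passed[group] += 1
    return {group: passed[group] / total[group] for group in total}


def impact_ratio(rates):
    """Lowest group selection rate divided by the highest."""
    highest = max(rates.values())
    return min(rates.values()) / highest if highest > 0 else 1.0


# Toy pool standing in for the people who volunteered demographic data.
pool = [
    {"sex": "F" if i % 2 else "M", "trait": (i * 37 % 100) / 100}
    for i in range(1000)
]
debias = sample_debias_set(pool, "sex", per_group=100)

# Hypothetical model: score is just the single made-up "trait" feature.
rates = selection_rates(debias, "sex", lambda p: p["trait"], threshold=0.5)
ratio = impact_ratio(rates)
```

If this is roughly the idea, then my question becomes: would the debias set be re-sampled for each role I provide incumbents for, or is one sample reused across roles?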

Thanks for sharing this work in the open, and for the explanations in the examples too. It's super interesting! 👍
