Implementation suggestions: How to use the "debias set" for different roles? #30

@kevinrobinson

Hello! 👋 In the implementation suggestions, I'm curious about this bit:

Pymetrics models are built for specific roles within specific companies. To achieve this customization, we collect data from top-performing incumbents in the target role. We then compare incumbents to a baseline sample of the over 1 million candidates who have applied to jobs through pymetrics. We also establish a special data set, which we call the debias set, which is sampled from a pool of 150,000 individuals who have voluntarily provided basic demographic information such as sex, ethnicity or age. From there, a wide variety of algorithms might be tested to create an initial machine learning model from the training data. The process itself is model agnostic. Multiple algorithms are fit in this process, and we are continuously testing new methods that might improve performance. The goal of the algorithm is to find the features that will most accurately and reliably separate the incumbent set from the baseline set.

In particular, the selection method for the baseline sample and the "debias set" seems really important! Is this secret or can you share the methodology you're using? I'm trying to understand how it fits in with the dataset that I would provide.
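
To make the question concrete, here is a rough sketch of how I currently picture the pieces fitting together. Everything in it is my own guess and not something the README confirms: the uniform random sampling, the function names (`sample_sets`, `pass_rates_by_group`, `impact_ratio`), the dict-based records, and the idea that the debias set is used to compare pass rates across demographic groups are all assumptions for illustration.

```python
import random


def sample_sets(applicant_pool, demographic_pool, n_baseline, n_debias, seed=0):
    """Draw a baseline sample and a debias set.

    Assumption: both are uniform random samples without replacement.
    The actual selection method is exactly what this issue is asking about.
    """
    rng = random.Random(seed)
    baseline = rng.sample(applicant_pool, n_baseline)
    debias = rng.sample(demographic_pool, n_debias)
    return baseline, debias


def pass_rates_by_group(debias_set, score_fn, threshold, group_key):
    """Share of each demographic group scoring at or above the threshold."""
    totals, passed = {}, {}
    for person in debias_set:
        group = person[group_key]
        totals[group] = totals.get(group, 0) + 1
        if score_fn(person) >= threshold:
            passed[group] = passed.get(group, 0) + 1
    return {g: passed.get(g, 0) / totals[g] for g in totals}


def impact_ratio(rates):
    """Lowest group pass rate divided by the highest (1.0 means parity)."""
    highest = max(rates.values())
    return min(rates.values()) / highest if highest > 0 else 1.0
```

If that picture is roughly right, then the part I can't fill in is what replaces the uniform sampling in `sample_sets`, and whether the debias set is re-drawn for each role or shared across models.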

Thanks for sharing this work in the open, and for the explanations in the examples too. It's super interesting! 👍
