-
Notifications
You must be signed in to change notification settings - Fork 209
Add relevance metrics including pruned tokens to MS Marco ranking track #525
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add relevance metrics including pruned tokens to MS Marco ranking track #525
Conversation
d33ff08 to
1574135
Compare
…e the recall against the original weighted terms query
…NDCG based on a smaller version of the queries used to test performance.
105b19a to
ac79e54
Compare
| "track_total_hits": false | ||
| }, | ||
| { | ||
| "name": "pruned-text-expansion-search-maxwand-disabled", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added operations here
| "track_total_hits": false | ||
| }, | ||
| { | ||
| "name": "pruned-weighted-terms-recall-10-10", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added operations here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This file was auto-formatted for correct json indentation. I have left comments starting at the two places where I made changes.
demjened
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM with a few minor comments
Mikep86
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking good! Left a few minor comments.
| def generate_pruned_query(field, query_expansion, boost=1.0): | ||
| return { | ||
| "query": { | ||
| "weighted_tokens": { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've tried running the modified track and noticed weighted_tokens query is not available in 8.11.x. Is this planned for 8.13.0? We need to apply the right versioning as per https://esrally.readthedocs.io/en/stable/track.html#custom-track-repositories. The most recent non-master branch in the track repo is 8.11. We need to bump up to 8.13 before merging this change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Correct, this is for 8.13.0.
gbanasiak
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Left small remark regarding _tools/requirements.txt file.
| pytrec_eval | ||
| numpy |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should the presence of this file be mentioned in README? Also, please align version pinning with dependencies section.
Mikep86
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
Adds recall metrics for query pruning using the weighted tokens query and precomputed ELSER tokens.