[NewOp] Add generate_challenging_qa_mapper based on MindGYM principles #703
Conversation
from data_juicer.utils.lazy_loader import LazyLoader
from data_juicer.utils.model_utils import get_model, prepare_model

torch = LazyLoader('torch', 'torch')
torch and vllm can be imported from model_utils instead of declaring another LazyLoader here.
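A minimal sketch of what this could look like, assuming model_utils already exposes lazy-loaded torch and vllm modules as the comment implies:

# Sketch of the reviewer's suggestion: reuse the lazy-loaded modules from
# model_utils instead of creating another LazyLoader in this operator.
from data_juicer.utils.model_utils import get_model, prepare_model, torch, vllm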
OP_NAME = 'generate_challenging_qa_mapper'


def retry_on_error(func, max_retries=5, delay=1):
Consider using an existing third-party retry library, or move retry_on_error into a shared util.
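For the first option, a hedged sketch using the third-party tenacity library (an assumption for illustration, not what this PR currently does; the wrapped function name is hypothetical):

# Sketch only: tenacity is one existing retry library that could replace the
# hand-rolled retry_on_error helper; otherwise move retry_on_error to a util.
from tenacity import retry, stop_after_attempt, wait_fixed

@retry(stop=stop_after_attempt(5), wait=wait_fixed(1))
def generate_qa(prompt):
    # call the model here; any raised exception triggers another attempt,
    # up to 5 tries with a 1-second wait between them
    ...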
Please merge the latest main branch and run pre-commit locally.
/gemini review
Code Review
This pull request introduces a new generate_challenging_qa_mapper operator, a significant feature for generating reasoning-focused QA pairs using a multi-turn conversation with a language model. The implementation is well-structured. However, I've identified a few issues that need attention. There's a critical error handling gap that could lead to a crash if the model's output is malformed. Additionally, the GPU configuration is hardcoded, which could cause runtime errors or suboptimal performance in different environments. I've also provided suggestions to improve code clarity and enhance the new test case with assertions.
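On the hardcoded GPU configuration, a rough sketch of how those settings might instead be surfaced as constructor arguments; the parameter names are illustrative, assume a vLLM-style backend (the qa[0].outputs[0].text access pattern below suggests one), and assume prepare_model forwards extra keyword arguments to that backend:

# Illustrative sketch: pass GPU-related settings in as arguments instead of
# hardcoding them inside the operator (names here are assumptions).
def __init__(self, hf_model, tensor_parallel_size=1,
             gpu_memory_utilization=0.9, *args, **kwargs):
    super().__init__(*args, **kwargs)
    self.model_key = prepare_model(
        model_type='vllm',
        pretrained_model_name_or_path=hf_model,
        tensor_parallel_size=tensor_parallel_size,
        gpu_memory_utilization=gpu_memory_utilization,
    )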
qa = self.extract_json(qa[0].outputs[0].text)
qa["thinking"] = multihop[0].outputs[0].text
If extract_json returns None because the model output is malformed, the subsequent line qa["thinking"] = ... will raise a TypeError, causing the process to crash. It's crucial to check if qa is None and handle this case gracefully, for instance, by raising a ValueError to trigger the retry mechanism with a more informative error message.
Suggested change:
- qa = self.extract_json(qa[0].outputs[0].text)
- qa["thinking"] = multihop[0].outputs[0].text
+ qa = self.extract_json(qa[0].outputs[0].text)
+ if qa is None:
+     raise ValueError("Failed to extract valid JSON from model output.")
+ qa["thinking"] = multihop[0].outputs[0].text
| """ | ||
| super().__init__(*args, **kwargs) | ||
| self.hf_model = hf_model | ||
| self.model_key = prepare_model(model_type="huggingface", pretrained_model_name_or_path=hf_model) |
result = op.process(deepcopy(sample))
print(f'Output results: {result}')
This test runs the operator but doesn't have any assertions to verify the output. A test should validate the behavior of the code. Please add assertions to check that the returned result dictionary contains the expected keys from the generated QA pair. This will make the test more meaningful and robust.
Suggested change:
- result = op.process(deepcopy(sample))
- print(f'Output results: {result}')
+ result = op.process(deepcopy(sample))
+ self.assertIn('background_document', result)
+ self.assertIn('reasoning_category', result)
+ self.assertIn('sub_questions', result)
+ self.assertIn('relationship_category', result)
+ self.assertIn('multihop_question', result)
+ self.assertIn('multihop_answer', result)
+ self.assertIn('thinking', result)
+ print(f'Output results: {result}')
I ran some tests locally and have a few points of feedback.

Errors found during testing:

Other suggestions for improvement:
Introduces a novel QA generation module based on self-challenging mechanisms, designed to autonomously synthesize high-quality reasoning-focused question-answer pairs, inspired by the MindGYM paper.
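For context, a rough usage sketch of the new operator as exercised by the test above; the class name, import path, model name, and sample format are assumptions for illustration, while the operator name and output keys come from this PR:

# Hypothetical usage sketch; names other than the output keys are assumed.
from copy import deepcopy

# assumed import path and class name for the new operator
from data_juicer.ops.mapper.generate_challenging_qa_mapper import GenerateChallengingQAMapper

op = GenerateChallengingQAMapper(hf_model='Qwen/Qwen2.5-7B-Instruct')  # assumed model
sample = {'text': 'A background document for the operator to challenge itself with.'}

result = op.process(deepcopy(sample))
# Keys expected in the generated QA pair, per the suggested test assertions:
#   background_document, reasoning_category, sub_questions,
#   relationship_category, multihop_question, multihop_answer, thinking
print(result['multihop_question'])
print(result['multihop_answer'])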