Conversation

@fverac
Contributor

@fverac fverac commented Oct 17, 2025

No description provided.

@MarcCote MarcCote force-pushed the fverac/swebench_self_test_agent branch from 2b609a7 to 63ce0f1 Compare October 22, 2025 00:39
@MarcCote
Collaborator

MarcCote commented Oct 22, 2025

Things to do:

  • fix failing tests
  • add missing tests for the debug_mode
  • decide what we do with pdb's entrypoint when debug_mode is False

@MarcCote MarcCote marked this pull request as ready for review October 22, 2025 03:41
@fverac
Contributor Author

fverac commented Oct 23, 2025

I feel like debug_mode will be easily confused with the other --debug flag. Should we call the mode something else, and/or add a docstring? Possible names:

  • pre_apply_test_patch
  • oracle_test_available
  • debug_oracle_tests

@MarcCote MarcCote requested review from MarcCote and matheper October 24, 2025 19:56
fverac and others added 11 commits October 25, 2025 03:20
Signed-off-by: fverac <fabiovera@microsoft.com>
Signed-off-by: fverac <fabiovera@microsoft.com>
Signed-off-by: fverac <fabiovera@microsoft.com>
Signed-off-by: fverac <fabiovera@microsoft.com>
Signed-off-by: fverac <fabiovera@microsoft.com>
Signed-off-by: fverac <fabiovera@microsoft.com>
Signed-off-by: fverac <fabiovera@microsoft.com>
@MarcCote MarcCote force-pushed the fverac/swebench_self_test_agent branch from 3efb55b to 9989196 Compare October 25, 2025 11:29
MarcCote and others added 2 commits October 27, 2025 08:37
Signed-off-by: fverac <fabiovera@microsoft.com>
@MarcCote
Collaborator

TODO:

  • Add integration tests that run the solution agent on SWE-Bench, SWE-Smith, and R2E-Gym
  • Add a unit test that checks that env.debug_entrypoint has -m pdb in it.
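
For the second item, a minimal sketch of what such a unit test might look like. The `FakeEnv` class, its `entrypoint` default, and the way `debug_entrypoint` is derived are all hypothetical stand-ins, not the project's actual environment API; only the attribute name `debug_entrypoint` and the `-m pdb` requirement come from this thread.

```python
import shlex
from dataclasses import dataclass


@dataclass
class FakeEnv:
    """Hypothetical stand-in for the real environment class (assumption)."""

    entrypoint: str = "python -m pytest tests/"

    @property
    def debug_entrypoint(self) -> str:
        # Assumption: the debug entrypoint inserts `-m pdb` right after
        # the interpreter; the real implementation may differ.
        interpreter, *rest = shlex.split(self.entrypoint)
        return shlex.join([interpreter, "-m", "pdb", *rest])


def has_pdb_module_flag(command: str) -> bool:
    """Return True if `-m pdb` appears as two adjacent tokens in the command."""
    tokens = shlex.split(command)
    return any(a == "-m" and b == "pdb" for a, b in zip(tokens, tokens[1:]))


def test_debug_entrypoint_has_pdb():
    env = FakeEnv()
    # The debug entrypoint must launch under pdb...
    assert has_pdb_module_flag(env.debug_entrypoint)
    # ...while the regular entrypoint must not.
    assert not has_pdb_module_flag(env.entrypoint)
```

Checking tokens rather than a raw substring avoids false positives such as a path that happens to contain `-m pdb`-like text; in the real test, `FakeEnv` would be replaced by the actual environment fixture.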


4 participants