Implement a workflow for AI-assisted Selenium test case creation. #21041
Automated Test Cases - Bad!
AI-generated test cases are noisy and generally just bad - this is not that. The context needed to pull in our whole client codebase, all the test functions, etc. would be huge. Repeatedly re-running tests while the agent guessed and checked would be painfully slow.
A semi-automated approach I looked into was recording a script and having an AI convert it into Selenium commands - it wasn't promising at all.
The selectors it chose weren't great, and most test cases can be bootstrapped and set up with the huge mountain of test helpers we already have - reducing all of that to bare sequences of selectors would result in a ton of duplication, unreadable code, and less robustness (our helpers have a lot of good retry logic, adaptive waiting, rich debug messages, etc.).
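To make the robustness point concrete, here is a minimal sketch of the kind of retry-and-wait wrapper a helper layer adds on top of a bare selector call. The name `retry_with_wait` and all of its parameters are hypothetical and chosen for illustration; this is not the actual helper code in our test framework.

```python
import time


def retry_with_wait(action, attempts=5, initial_delay=0.1, backoff=2.0, sleep=time.sleep):
    """Call action() until it succeeds or the attempts run out.

    Hypothetical helper for illustration only. The delay grows after each
    failure, and the final failure is raised with a descriptive message.
    """
    delay = initial_delay
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except Exception as exc:
            if attempt == attempts:
                # Surface how many times we tried and what the last error was.
                raise AssertionError(
                    f"action failed after {attempts} attempts: {exc!r}"
                ) from exc
            sleep(delay)
            delay *= backoff
```

A recorded script that replays raw selectors gets none of this for free; every click would need its own copy of the loop above, which is where the duplication comes from.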
AI Assistance in Building Test Cases - Good!
The semi-automatic approach that I think is more promising is to have the AI agent set up a rich environment for manually testing the UI and then provide a mechanism for turning that exploration directly into a test case. I've implemented that here using Claude slash commands.
This PR adds a Claude slash command
/setup-selenium-test-notebook <feature description OR GitHub PR>. It can take a description of the feature to test or just be given a PR. It will set up a Jupyter notebook with cells filled out for setting up the Selenium environment and talking with Galaxy. If the config file needed to talk with a standing development server of Galaxy isn't present, it tells the user how to set it up, and it tells the user how to run Jupyter. All of this is based on my prior work in #11177.
The agent will pull down the PR description and try to come up with an idea for how to test it. The manual testing instructions we already provide are great for this. It will also "research" the code base and find related tests and will provide potentially relevant code from existing tests as Markdown comments right in the notebook - so you have a good idea of what helpers and components are already implemented that might help with the task of testing the PR.
Some screenshots from the notebook it set up for me for testing #20886.
The code it generated for preconditions from existing helpers worked pretty well out of the box - anything that did not already exist in the test framework came in commented form. It didn't pretend to know things it didn't - it was kind of refreshing.
The agent seems smart enough to reason about when a managed history annotation is needed, how to deal with user login, etc.
Developing in Jupyter is nice because it can sustain a persistent connection to the browser automation application. You don't have to re-run the whole test - you can work a line or two at a time with cells, preserve progress, and re-run only what is needed as components are annotated, etc.
I think the screenshots are a cool part of the framework we have - and these will appear right inside the notebook.
After the notebook test case is ready to go, Claude seems pretty good at converting it directly to a test case. This can be done with '/extract-selenium-test '
I generated the test case for the test in #21040 and it worked on the first try for me (I did move it into an existing file because I thought that is where it belonged - so that was a manual step - but no big deal).