Skip to content

Conversation

@janhoy
Copy link
Contributor

@janhoy janhoy commented Oct 25, 2025

https://issues.apache.org/jira/browse/SOLR-17508

WIP Feel free to try and fix issues :)

Toy project to try to replicate the BATS tests for Windows, using the Pester test framework for Powershell (https://pester.dev). With aid of Claude Code, this branch was written in one go in literally 3 minutes.
It includes a github PR workflow test running windows, so we'll have a way to run it. Here's the prompt used:

You'll now get a greenfield task. In Solr we use BATS test framework for automated testing of start/stop 
solr and CLI tools. We do not have similar support for testing on Windows commands. You will integrate 
Pester (test/test_version.bats) tests.

The approach we'll take is to "port" each BATS test to windows, as the CLI commands are supposed to 
be identical, except the obvious differences between bash and Powershell.

As a start, I want you to port the solr/packaging/test/test_version.bats test.
- First, create a folder 'powershell-tests' under solr/packaging.
- Then, integrate pester in our unit test framework by peeking at how it's done in solr:packaging:integrationTests task
- Create a new task for this, it should only run when on windows.
- Next, port the bats test 'test_version.bats' over.

I don't have windows, only Mac, so give me a hint on how I can test this. I can spin up a Windows VM 
and check out this branch there and test, but is there a smarter way like Wine or something that runs on 
mac? 

@malliaridis malliaridis self-requested a review October 27, 2025 08:57
@janhoy janhoy marked this pull request as draft October 27, 2025 23:04
@epugh
Copy link
Contributor

epugh commented Oct 28, 2025

One thought is, and I think you maybe are thinking this way, is to make sure that we only port over tests that we think have a windows specific challenges. There are bats tests like the Tika extraction stuff or the OpenNLP that probably (!) shouldn't have any windows specific nuances that would suggest a pester equivalent...

@janhoy janhoy marked this pull request as ready for review October 28, 2025 13:02
Copy link
Contributor

@malliaridis malliaridis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When I tried to run

.\gradlew.bat --no-daemon -p solr/packaging pesterTests

I got multiple exceptions because pester was pre-installed with an older version (3.4.0). Pester getting started said to run:

Install-Module Pester -Force

as admin, and it required me to provide the parameter -SkipPublisherCheck as it refused to upgrade it.

After confirming the pester installation with Import-Module Pester -PassThru it did not throw errors anymore.

A hint of setting up and running tests may be useful, and we may also need to run pester installation on the CI before running the tests?

I also noticed that running the gradle task pesterTests it sometimes does not execute the tests. I believe it is due to some gradle caching, because when I update a line in Gradle, it always runs through the first time, but any follow-up execution skips the pester tests. Maybe this is a problem in CI too?


Write-Host "Exit Code: $LASTEXITCODE"
if ($outputStr.Length -gt 0) {
Write-Host "Output (first 500 chars): $($outputStr.Substring(0, [Math]::Min(500, $outputStr.Length)))"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This output (and a similar one in Zk.Tests.ps1) adds some verbosity tot he console during execution which I would prefer to enable manually if needed. Perhaps we could use the pester verbosity here and do something like:

if ($PesterPreference.Show -contains 'Detailed') { # gradle script sets verbosity to Normal
  # ...
}

This would also require an update in the gradle task to allow changing the verbosity via -Dpester.verbosity. Something like

def pesterVerbosity = (project.findProperty("pester.verbosity")
  ?: System.getProperty("pester.verbosity", "Normal"))

and in the pester config

// ...
"\$config.Output.Verbosity = '${pesterVerbosity}'; " +
// ...

Comment on lines +97 to +102
try {
$response = Invoke-WebRequest -Uri "http://localhost:$SOLR_PORT/solr/" -UseBasicParsing -ErrorAction SilentlyContinue
if ($response.StatusCode -eq 200) {
$solrRunning = $true
}
} catch {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This block always fails, leading to the 9 tests being skipped. Is Solr ever started during the test?

}

It "listing out files" {
if (-not $script:SolrRunning) { Set-ItResult -Skipped -Because "Solr is not running"; return }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could this line be configured in a BeforeEach block and reduce duplication maybe?

@malliaridis
Copy link
Contributor

Oh, and I forgot to leave a comment content-wise. It is a good start for introducing tests. I believe in the future we will add more windows-specific cases that will matter more. But for now, I believe it is fine with the included tests. Well done! 👍

@janhoy
Copy link
Contributor Author

janhoy commented Oct 28, 2025

One thought is, and I think you maybe are thinking this way, is to make sure that we only port over tests that we think have a windows specific challenges. There are bats tests like the Tika extraction stuff or the OpenNLP that probably (!) shouldn't have any windows specific nuances that would suggest a pester equivalent...

Yea, that could be an option. I did the small isolated "help", "version" and "zk" tests, and they are all happy so far, hopefully proving that the Windows scripts quack the same way as linux one (if the tests actually verifies enough of the output). We probably need some of the run-example ones.

Feel free to suggest a list of prioritized tools that gives best value, where we know there is/have been issues.

But I see no reason we could not add many tests on the windoes side as well.

@malliaridis I won't have time to persue this the coming days, so I hereby give permission to any committer who wants to hijack the PR and just commit directly to the PR branch. It's a POC and if you want to polish it to the point of a first merge, that would be great.

@malliaridis
Copy link
Contributor

malliaridis commented Oct 30, 2025

I have added a solr startup for the zk tests and fixed a a false-positive issue in two of the tests.

I spent about 6 hours to make these changes work, so my experience was not ideal.

The powershell tests are also not optimal and would require some further optimizations in the long run. Any std.err log, including warnings, are recognized as errors by the powershell, leading to some outputs to include a report of a NativeCommandError which matched the error condition and failed the test.

Because I am also short in time, I thought I share my discoveries for the next person working on this. In my opinion though we could merge it as is. I got passing tests on Windows. If pester is correctly installed in CI/CD I believe it would work and would be ready to be merged.

Copy link
Contributor Author

@janhoy janhoy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @malliaridis for chipping away on this. My overall comment is that we should not need 100 LOC to describe the same start-solr process as BATS does in 2 lines :)

} -ArgumentList $SolrCmd, $startArgs

# Wait a bit for Solr to start
Start-Sleep -Seconds 5
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this necessary? Our start script waits until solr is ready before bin\solr.cmd start exits?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If bin\solr.cmd start would properly exit after completion, it would be a one-liner I believe. But see other comment for details.

}
} catch {
$solrAlreadyRunning = $false
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A bit disaapointing that while the BATS setup_file() function is a tw-liner, Pester needs 100 lines for the same? Why? Guess we could pull some of the verbose check-if-solr-running and start-solr out in Util scripts?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moving this out makes sense, yeah. This can be done in one go together with the other topics I believe.


# Start Solr with embedded ZooKeeper using Start-Process to run asynchronously
$startArgs = @("start", "-p", "$SOLR_PORT")
Write-Host "Running: $SolrCmd $($startArgs -join ' ')"
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The bin/solr script looks for SOLR_PORT env.variable and uses it by default if -p is not set. Is not the same true for the bin\solr.cmd script? If so, why? Also, before the test start, gradle finds an available port so we know that nothing is running on SOLR_PORT. Why do we then need to check if solr is running? Are we afraid that one of the other Pester tests have had an unclean exit?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An unclean exit or a service running on a chosen port are possible but extremely rare cases the way the gradle task is written, so we could indeed reduce the additional -p input. But sometimes being explicit is preferred, at least that's what I've been told a couple of times in this project.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One of the assumptions for the BATS tests is that SOLR_PORT, SOLR2_PORT, ZK_PORT etc are set, and most tests just rely on picking up these (which is a feature in itself), but there are also other tests that explicitly tests the -p flag and the -z flag in varioud ways.

I'd say simplify. Keep it close to the BATS approach.

$elapsed += $waitInterval

try {
$response = Invoke-WebRequest -Uri "http://localhost:$SOLR_PORT/solr/" -UseBasicParsing -TimeoutSec 5 -ErrorAction Stop
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, why would this be neccessary on the Windows side when it is not on the linux side? If bin\solr.cmd start -c does not return until solr is ready and listening, we can shave off this?

https://github.com/apache/solr/blob/main/solr/bin/solr.cmd#L1140-L1148

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The main problem is that a simple one-liner for starting solr with

$startOutput = & $SolrCmd "start" "-p" $SOLR_PORT 2>&1

does not work in powershell, as the call does not exit (not sure why). So I had to start Solr as an asynchronous job, because Invoke Process would also interfere with the process hierarchy and lead to issues in StatusTool. So when starting Solr as an asynchronous managed process, it will need some time to start up. So we have to wait somehow for it to be ready.

We can get rid of the checking loop, but while testing it I had some executions where Solr was not ready yet and pester got stuck after the first failing test (without actually failing). So if there is a better / safer way, I would love to use that instead.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, thanks for explanation. Good candiate to hide behind a common utility function start_solr_and_wait(<port>)

# Stop Solr only if we started it
if ($script:SolrStartedByTests) {
Write-Host "Stopping Solr..."
$stopArgs = @("stop", "-p", $script:SOLR_PORT)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Linux test uses --all to stop all solr instances. And I assume that if a bin\solr.cmd stop --all command fails, it will fail the test, so probably no need to baby-sit the stop process here?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe we can simplify it to

AfterAll {
  & $SolrCmd "stop" "--all"
}

but the tests won't fail if solr was not running and no Solr instance was stopped with the command. Is that fine?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If solr is not running when it should, I suppose other parts of the test would have failed already, so yea.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants