Skip to content

Conversation

@cjen1-msft
Copy link
Contributor

Looking at https://github.com/microsoft/CCF/actions/runs/18711838663/job/53362454045 it claims that the ledger directory already existed.
Afaict this can only happen if either:

  • The previous delete of the workspace directory was non-atomic
  • There existed a node which was writing to it and was not stopped successfully
  • We have a bug in our startup behaviour

Of these the most likely is that some node was not properly killed at the end of a test, and one possible way would be if a network was started and not stopped.

In e2e_operations.py we have a couple of networks that get started within a test, and potentially are not closed.
(the one which is glaring is the run_recovery_unsealing_corrupt)

If we instead encapsulate all of these networks via the with keywork in python, then when they go out of scope they are always stopped (and there is logging of stacktraces etc).

This PR does so for all networks in e2e_operations.py
There are some other similar ones in - lts compatibility, but these seem less likely to be obviously correct.

I've also taken the opportunity to ensure that at the very least for these networks that they are not alising over the same directory.

Copilot AI review requested due to automatic review settings October 22, 2025 13:00
@cjen1-msft cjen1-msft requested a review from a team as a code owner October 22, 2025 13:00
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR addresses potential resource cleanup issues in end-to-end tests by ensuring all test networks are properly managed within Python context managers (with statements). This prevents scenarios where networks might not be stopped properly, leading to processes continuing to write to ledger directories and causing test failures.

Key changes:

  • Converted all standalone infra.network.Network() instantiations to use with infra.network.network() context managers
  • Added unique label suffixes to prevent directory aliasing between networks in the same test
  • Fixed deep copy usage to ensure proper argument isolation between networks

@cjen1-msft cjen1-msft self-assigned this Oct 22, 2025
@eddyashton eddyashton enabled auto-merge (squash) October 28, 2025 16:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants