@tstamler
Contributor

What?

Add a heartbeat mechanism to NIXL ETCD metadata to invalidate stale metadata if an agent dies.
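
To illustrate the intent, here is a minimal sketch of the lease timing, assuming the agent refreshes its lease once per heartbeat and the lease TTL is twice the heartbeat interval (the factor of 2 matches the `leasegrant` call in this PR). `LeaseModel` and its methods are hypothetical names for an in-memory model; this is not the etcd client API or the code in this change.

```cpp
#include <chrono>

// Hypothetical in-memory model of a TTL lease; not the etcd client API.
class LeaseModel {
public:
    using Clock = std::chrono::steady_clock;

    // TTL is twice the heartbeat interval, so one missed refresh is tolerated.
    LeaseModel(std::chrono::seconds heartbeat, Clock::time_point now)
        : ttl_(heartbeat * 2), expiry_(now + ttl_) {}

    // Called once per heartbeat by a live agent; pushes the expiry forward.
    // Returns false if the lease had already expired (the metadata is gone).
    bool keepAlive(Clock::time_point now) {
        if (isExpired(now)) {
            return false;
        }
        expiry_ = now + ttl_;
        return true;
    }

    // True once a full TTL has passed without a refresh.
    bool isExpired(Clock::time_point now) const { return now >= expiry_; }

private:
    std::chrono::seconds ttl_;
    Clock::time_point expiry_;
};
```

With a 5 second heartbeat the TTL is 10 seconds: an agent that dies stops calling `keepAlive`, and its metadata becomes stale at most 10 seconds after its last refresh.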

@copy-pr-bot

copy-pr-bot bot commented Nov 10, 2025

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@github-actions

👋 Hi tstamler! Thank you for contributing to ai-dynamo/nixl.

Your PR reviewers will review your contribution then trigger the CI to test your changes.

🚀

@tstamler tstamler marked this pull request as ready for review November 12, 2025 22:24
@tstamler tstamler requested a review from a team as a code owner November 12, 2025 22:24
@aranadive
Contributor

/ok to test c2065fd

@aranadive
Contributor

/build

aranadive previously approved these changes Nov 14, 2025
Contributor

@aranadive aranadive left a comment

LGTM.

@aranadive
Contributor

/ok to test 4e8f87d

@aranadive
Contributor

/build

@2dm
Contributor

2dm commented Nov 19, 2025

@tstamler I added the fixes we talked about.

@copy-pr-bot

copy-pr-bot bot commented Nov 19, 2025

/ok to test 4d6cdcd

@tstamler, there was an error processing your request: E2

See the following link for more information: https://docs.gha-runners.nvidia.com/cpr/e/2/

@tstamler
Contributor Author

/ok to test 5220637

@tstamler
Contributor Author

/build

@tstamler
Contributor Author

/build

@tstamler
Contributor Author

/ok to test 525a807

NIXL_DEBUG << "Using etcd namespace for agents: " << namespace_prefix;

etcd::Response response = etcd->leasegrant((heartbeat.count()) * 2);
lease_id = response.value().lease();
Contributor

Is this valid if the response is an error?
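
One possible shape for a guarded version, as a sketch only: check the response before reading the lease id, and let the caller handle the failure. This assumes the client's response type exposes an `is_ok()`-style check. `FakeResponse`, `FakeValue`, and `leaseIdFrom` are hypothetical stand-ins so the example compiles without the etcd client library; they are not the real types or the fix applied in this PR.

```cpp
#include <cstdint>
#include <optional>
#include <string>

// Hypothetical stand-ins for the client's response types.
struct FakeValue {
    std::int64_t lease_id = 0;
    std::int64_t lease() const { return lease_id; }
};

struct FakeResponse {
    int code = 0;  // 0 means success in this model
    std::string message;
    FakeValue val;

    bool is_ok() const { return code == 0; }
    const std::string &error_message() const { return message; }
    const FakeValue &value() const { return val; }
};

// Returns the lease id only when the grant succeeded. Callers must handle
// std::nullopt instead of registering metadata under an invalid lease.
std::optional<std::int64_t> leaseIdFrom(const FakeResponse &response) {
    if (!response.is_ok()) {
        return std::nullopt;
    }
    return response.value().lease();
}
```

On failure the caller could log `error_message()` and abort the metadata registration, rather than continuing with whatever lease id a failed response happens to carry.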

3 participants