@jerrychenhf jerrychenhf commented Dec 4, 2025

This PR handles the shutdown cases that arise when a process is killed or an exception is raised.
Users can run "kill pid" on any process and the Mooncake store shuts down gracefully. WARNING: "kill -9" kills the process forcefully and there is no way to handle it; in that case you need to restart all the services. The suggested way to shut down a service gracefully is to "kill" the pid of the api_server. This is enough to shut down all distributed processes started as part of the API server.
When an engine process shuts down, it cleans up the following:

  1. Mooncake store: Unregister the global memory
  2. Parallel environments: the PyTorch distributed groups
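
For illustration only, the pattern described above (catch the signal sent by a plain "kill", then run each cleanup step once) could be sketched as follows. This is not the code from this PR: `GracefulShutdown` and the step names are hypothetical, and the actual Mooncake and PyTorch cleanup calls are left as placeholders supplied by the caller.

```python
import signal
import sys
import threading


class GracefulShutdown:
    """Hypothetical helper: runs registered cleanup steps once, in order."""

    def __init__(self):
        self._steps = []
        # Acquired exactly once and never released, so it acts as an
        # atomic "already ran" flag that is safe inside a signal handler.
        self._once = threading.Lock()

    def register(self, name, fn):
        """Add a cleanup step; steps run in registration order."""
        self._steps.append((name, fn))

    def run(self):
        """Run all steps once; return the names of the steps that succeeded."""
        if not self._once.acquire(blocking=False):
            return []  # cleanup already ran (or is running)
        succeeded = []
        for name, fn in self._steps:
            try:
                fn()
                succeeded.append(name)
            except Exception as exc:
                # One failing step must not prevent the remaining ones.
                print(f"cleanup step {name!r} failed: {exc}", file=sys.stderr)
        return succeeded

    def install(self):
        """Run cleanup on SIGTERM (plain "kill") and SIGINT, then exit.

        SIGKILL ("kill -9") cannot be caught, so it is not handled here.
        """
        def _handler(signum, frame):
            self.run()
            sys.exit(0)

        signal.signal(signal.SIGTERM, _handler)
        signal.signal(signal.SIGINT, _handler)
```

An engine process would, under these assumptions, register one step per resource (for example a step that unregisters the Mooncake global memory and a step that tears down the PyTorch distributed groups) and call `install()` from its main thread at startup. The same `run()` method can also be called from an exception handler, since the once-guard makes repeated calls harmless.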

@jerrychenhf jerrychenhf merged commit e7bfbca into HabanaAI:deepseek_r1 Dec 12, 2025
1 of 2 checks passed