[WIP] Copy tinker to skyrl-train to eliminate cross-repo dependencies#1012
Draft
tyler-griggs wants to merge 1 commit intotyler/fix-tinker-placement-groupfrom
Draft
[WIP] Copy tinker to skyrl-train to eliminate cross-repo dependencies#1012tyler-griggs wants to merge 1 commit intotyler/fix-tinker-placement-groupfrom
tyler-griggs wants to merge 1 commit intotyler/fix-tinker-placement-groupfrom
Conversation
Copied tinker API, engine, and skyrl_train backend from skyrl-tx into
skyrl-train to test if eliminating the cross-repository dependency resolves
Ray worker environment issues.
Changes:
- Copied tinker/{api,engine,db_models,types,config}.py to skyrl_train/tinker/
- Copied backends/{backend,skyrl_train,utils}.py to skyrl_train/tinker/backends/
- Updated all imports: tx.tinker.* -> skyrl_train.tinker.*
- Updated imports: tx.utils.log -> loguru
- Inlined tx.utils.storage.download_file() to remove dependency
- Made ExternalInferenceClient import optional
- Updated engine subprocess to use sys.executable instead of uv run
- Added tinker optional dependencies to pyproject.toml
- Set RAY_RUNTIME_ENV_LOCAL_DEV_MODE=0 to disable auto-packaging
Known issue: Ray still auto-packages working directory causing workers to
reinstall all dependencies. Investigating why gsm8k example doesn't hit this.
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Experimental/WIP: Copied entire tinker implementation from skyrl-tx into skyrl-train to test if eliminating cross-repository dependencies resolves Ray worker environment issues.
This PR is stacked on #1010 (placement group fixes).
What was copied
api.py- FastAPI serverengine.py- Background enginedb_models.py- SQLModel database modelstypes.py- Type definitionsconfig.py- Engine configurationbackends/backend.py- Abstract backend interfacebackends/skyrl_train.py- SkyRL-Train backend (with placement group fixes from Fix placement group creation in SkyRL-Train backend #1010)backends/utils.py- UtilitiesImport updates
from tx.tinker.*→from skyrl_train.tinker.*from tx.utils.log→from logurutx.utils.storage.download_file()ExternalInferenceClientimport optionalDependencies added
Added
tinkeroptional dependency group topyproject.toml:Current Issue
Ray workers are still downloading all packages (vllm, torch, etc.) even though we're already in the correct environment. Investigating why the working gsm8k example doesn't hit this issue.
Logs show Ray is auto-packaging the working directory:
Next Steps
🤖 Co-Authored-By: Claude Sonnet 4.5