@gaogaotiantian
Contributor

What changes were proposed in this pull request?

Move pyproject.toml to repo root.

Why are the changes needed?

pyproject.toml is now the standard place to configure Python tooling, and many tools support it. They typically locate it by walking up the directory tree until they find it, and they treat that directory as the project root. With the file in dev/, tools run from the repo root or from python/ do not find it, so users cannot run them directly and have to go through our custom scripts. We should avoid custom scripts as much as possible.

After this change, users can run ruff check from either the repo root or python/ and get results that use our configuration.
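
A minimal sketch of the upward search described above, to show why the file's location matters. It is a simplification under stated assumptions, not the actual lookup code of ruff or any other tool; the function name `find_project_root` and its parameters are hypothetical.

```python
from pathlib import Path
from typing import Optional


def find_project_root(start: Path, marker: str = "pyproject.toml") -> Optional[Path]:
    """Return the nearest directory at or above `start` that contains `marker`.

    Hypothetical helper: real tools apply extra rules (for example, some
    require a matching [tool.*] table), which this sketch ignores.
    """
    start = start.resolve()
    # Check `start` itself first, then each ancestor up to the filesystem root.
    for directory in (start, *start.parents):
        if (directory / marker).is_file():
            return directory
    # No marker anywhere above `start`: the tool would fall back to defaults.
    return None
```

With the file in dev/, a search that starts in python/ only visits python/, the repo root, and the directories above it, so it never sees dev/pyproject.toml. With the file in the repo root, a search from any directory inside the repo reaches it.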

This will also be very useful when we introduce pytest, which also reads its configuration from pyproject.toml.

Does this PR introduce any user-facing change?

No.

How was this patch tested?

dev/lint-python passed, and CI should pass.

Was this patch authored or co-authored using generative AI tooling?

No

@github-actions

github-actions bot commented Jan 7, 2026

JIRA Issue Information

=== Improvement SPARK-54949 ===
Summary: Move pyproject.toml to root dir
Assignee: None
Status: Open
Affected: ["4.2.0"]


This comment was automatically generated by GitHub Actions

@github-actions github-actions bot added the BUILD label Jan 7, 2026
Member

@HyukjinKwon HyukjinKwon left a comment

I remember the mypy test failing when I tried this, but it's OK if it passes now.

@Yicong-Huang
Contributor

IMO it feels weird to have pyproject.toml in the root. PySpark is just one component of the entire repo. We can always add dev/ to the search path, right?

@gaogaotiantian
Contributor Author

> IMO it feels weird to have pyproject.toml in the root. PySpark is just one component of the entire repo. We can always add dev/ to the search path, right?

No, not all tools work that way.

If all of our Python code were in python/, we could put it there, but that's not the case: we also have Python files that we check in other directories such as dev/. It's quite common to put this kind of file in the repo root, simply because tools will then find it no matter where in the repo they are run. If we put it in python/, ruff check won't pick it up when we run it from the root.

I think the overall direction is to let users run tools as naturally as possible. For example, they should be able to just run pytest and pick up all the options we preset. That is not possible while pyproject.toml lives in dev/.

@gaogaotiantian
Contributor Author

@zhengruifeng or @cloud-fan do you have any objections to this?

@zhengruifeng
Contributor

We also have pom.xml in the root, even though not all modules are Scala-based,
so I think it is fine to add a Python-specific file to the root if it needs to take effect in two directories: /python and /dev.

@zhengruifeng
Contributor

Merged to master.

Yicong-Huang pushed a commit to Yicong-Huang/spark that referenced this pull request Jan 9, 2026

Closes apache#53716 from gaogaotiantian/move-pyproject.

Authored-by: Tian Gao <gaogaotiantian@hotmail.com>
Signed-off-by: Ruifeng Zheng <ruifengz@apache.org>