@natmaka natmaka commented Aug 28, 2025

My goal is to speed things up when the database is "huge" (total size way above the host's RAM size).

Approach number 1: giving the user a way to make pg_sample use the "SYSTEM" sampling method instead of "BERNOULLI". This patch does that. It seems OK, but I have not tested it thoroughly.
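For readers unfamiliar with the two sampling methods: in PostgreSQL, `TABLESAMPLE BERNOULLI` scans the whole table and selects individual rows, while `TABLESAMPLE SYSTEM` selects whole blocks at random, which reads far less data but yields more clustered samples. The following is only a minimal Python sketch of how the method could be made selectable when building the query. It is not the patch's code (pg_sample itself is written in Perl); the function name `sample_query` and its parameters are hypothetical, and the table name is assumed to be already quoted.

```python
def sample_query(table: str, percent: float, method: str = "BERNOULLI") -> str:
    """Build a TABLESAMPLE query (hypothetical helper, not pg_sample's API).

    `table` is assumed to be an already-quoted, schema-qualified name.
    `percent` is the approximate percentage of the table to sample.
    """
    method = method.upper()
    # PostgreSQL ships two built-in sampling methods.
    if method not in ("BERNOULLI", "SYSTEM"):
        raise ValueError(f"unsupported sampling method: {method}")
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    return f"SELECT * FROM {table} TABLESAMPLE {method} ({percent})"


if __name__ == "__main__":
    # SYSTEM picks random blocks; BERNOULLI picks random rows after a full scan.
    print(sample_query('"public"."orders"', 5, "system"))
    print(sample_query('"public"."orders"', 5))
```

Keeping BERNOULLI as the default in this sketch preserves the existing behaviour, so the faster but less uniform SYSTEM method would be strictly opt-in.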

Approach number 2: obtaining the number of tuples in a table from the statistics collected and stored during an ANALYZE pass, instead of the usual SELECT COUNT(*), which often implies reading the whole table. The patch lays the groundwork for this, but most of the work remains to be done. Let me know if it seems interesting to you.
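To illustrate the idea: PostgreSQL stores a row-count estimate in `pg_class.reltuples`, which ANALYZE (and VACUUM) keep up to date, so reading it avoids scanning the table. The sketch below is an assumption about how such a lookup could work, not what the patch implements. The function name `estimated_row_count` is hypothetical, `cursor` is assumed to be a DB-API cursor using the `%s` parameter style (as psycopg2 does), and the table name is assumed to be trusted and already quoted.

```python
ESTIMATE_SQL = "SELECT reltuples::bigint FROM pg_class WHERE oid = %s::regclass"


def estimated_row_count(cursor, qualified_table: str) -> int:
    """Return a row count for `qualified_table` (hypothetical helper).

    Uses the planner statistics when available and falls back to an exact
    COUNT(*) otherwise. `qualified_table` is assumed to be trusted input.
    """
    cursor.execute(ESTIMATE_SQL, (qualified_table,))
    row = cursor.fetchone()
    estimate = row[0] if row else None
    # reltuples is -1 on PostgreSQL 14+ (0 on older versions) when the table
    # has never been analyzed, so only a positive value is trusted here.
    if estimate is not None and estimate > 0:
        return int(estimate)
    # Fallback: exact count, which may read the whole table.
    cursor.execute(f"SELECT COUNT(*) FROM {qualified_table}")
    return int(cursor.fetchone()[0])
```

The result is only an estimate and can be stale if the table changed a lot since the last ANALYZE, which is probably acceptable for choosing a sample size but not for anything that needs an exact count.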

There are also various modifications made in "janitor" mode.

Owner

mla commented Aug 28, 2025

Awesome, thank you @natmaka! Will review this weekend.

2 participants