Huge tables (WIP) #67

natmaka · 2025-08-28T14:24:01Z

My goal is to speed things up when the database is "huge" (total size well above the host's RAM).

Approach number 1: giving the user a way to make pg_sample use the "SYSTEM" sampling method instead of "BERNOULLI". This patch does that. It seems OK, but I didn't test it thoroughly.
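For readers unfamiliar with the two methods: in PostgreSQL, `TABLESAMPLE BERNOULLI (p)` scans the whole table and keeps each row with probability p %, while `TABLESAMPLE SYSTEM (p)` picks whole blocks, which reads far less data on a huge table at the cost of a more clustered sample. The sketch below only illustrates the idea of a selectable, uppercased method with an enforced default. It is not the patch's code (pg_sample itself is written in Perl), and the names `build_sample_query`, `quote_ident`, and `ALLOWED_METHODS` are hypothetical.

```python
# Hypothetical helper, not taken from the patch: builds a TABLESAMPLE query.
ALLOWED_METHODS = ("BERNOULLI", "SYSTEM")  # PostgreSQL's built-in sampling methods


def quote_ident(name: str) -> str:
    """Quote a PostgreSQL identifier by doubling embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def build_sample_query(schema: str, table: str, percent: float,
                       method: str = "BERNOULLI") -> str:
    """Return a SELECT that samples roughly `percent` % of the table.

    The BERNOULLI default is an assumption based on the description above.
    """
    method = method.upper()  # accept "system" as well as "SYSTEM"
    if method not in ALLOWED_METHODS:
        raise ValueError(f"unsupported sampling method: {method}")
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    return (f"SELECT * FROM {quote_ident(schema)}.{quote_ident(table)} "
            f"TABLESAMPLE {method} ({percent})")


print(build_sample_query("public", "events", 1, method="system"))
```

Checking the method against a fixed list before interpolating it keeps a user-supplied option from ending up as arbitrary SQL.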

Approach number 2: obtaining the number of tuples in a table from the statistics collected and stored during an ANALYZE pass, instead of the usual SELECT COUNT(*), which often implies reading the whole table. The patch lays the groundwork for this, but most of the work remains to be done. Let me know if it seems interesting to you.
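As a rough illustration of this second approach: PostgreSQL stores an estimated row count in `pg_class.reltuples`, refreshed by ANALYZE and VACUUM, so reading it avoids a full scan. The sketch below is one possible shape for the fallback logic, not the patch's `APPROX_table_tuples_count` implementation. The names `ESTIMATE_SQL` and `row_count` are hypothetical, and treating any value of zero or below as "no usable estimate" is an assumption made here for safety.

```python
from typing import Callable, Optional

# Reads the planner's estimate; %s stands for a driver placeholder holding
# the schema-qualified table name (placeholder style depends on the driver).
ESTIMATE_SQL = "SELECT reltuples::bigint FROM pg_class WHERE oid = %s::regclass"


def row_count(reltuples: Optional[float],
              count_exact: Callable[[], int]) -> int:
    """Prefer the stored estimate; fall back to an exact count if needed.

    `reltuples` is the value fetched with ESTIMATE_SQL (None if no row).
    `count_exact` runs the expensive SELECT COUNT(*) and is only called
    when the estimate is missing or not positive, for example on a table
    that has never been analyzed.
    """
    if reltuples is not None and reltuples > 0:
        return int(reltuples)
    return count_exact()


# Usage with stand-in values instead of a live database connection.
print(row_count(1_250_000.0, lambda: 0))   # estimate available, no scan
print(row_count(-1.0, lambda: 42))         # no estimate, exact count used
```

Passing the exact count as a callable means the full-table scan is only paid for when the estimate cannot be used.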

There are also various modifications made in "janitor" mode.

mla · 2025-08-28T17:31:26Z

Awesome, thank you @natmaka! Will review this weekend.

natmaka added 6 commits August 26, 2025 17:58

sampling method, WIP

699f840

sampling method: parameter uppercased, default value enforced

998c416

minor

0e1b965

APPROX_table_tuples_count

365d29f

APPROX_table_tuples_count comments

09a3b92

APPROX_table_tuples_count usage, WIP

25da7ea
