
@aowenson-imm aowenson-imm commented Dec 19, 2025

Implements #662.

/* skip folder if size < minsize
* In some filesystems e.g. Ceph, folder size = 
* recursive sum of all contents. */

Massive performance improvement for me (Ceph filesystem, skipping folders < 1 GB); I have been using it occasionally for several weeks now. If your filesystem does not do recursive folder sizing (e.g. ext4), this has no effect. I'm not sure which filesystems other than Ceph implement this.
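For readers who want to see the idea outside the C code, here is a minimal Python sketch of the folder-skip logic. It is not the code in this PR: `walk_files`, `reported_size`, and the `dir_size` parameter are names made up for illustration, and it assumes the filesystem reports a recursive byte count in a directory's `st_size`.

```python
import os
from typing import Callable, Iterator


def reported_size(path: str) -> int:
    """Size as reported by lstat(). Assumption: on a filesystem with
    recursive folder sizing this is the recursive byte count for a folder."""
    return os.lstat(path).st_size


def walk_files(root: str, min_size: int,
               dir_size: Callable[[str], int] = reported_size) -> Iterator[str]:
    """Yield file paths under root, never descending into folders whose
    reported size is below min_size."""
    for dirpath, dirnames, filenames in os.walk(root):
        # Editing dirnames in place tells os.walk (top-down) which
        # subfolders to skip entirely.
        dirnames[:] = [
            d for d in dirnames
            if dir_size(os.path.join(dirpath, d)) >= min_size
        ]
        for name in filenames:
            yield os.path.join(dirpath, name)
```

The saving comes from pruning a whole subtree with a single `lstat()` call instead of visiting every file inside it.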

@vassilit vassilit self-requested a review December 24, 2025 15:36
Author

aowenson-imm commented Jan 7, 2026

Test failure:

        *_, footer = run_rmlint('--size 1024')
>       assert footer['duplicates'] == 4
E       assert 0 == 4

The cause is my folder-skip logic: on "non-recursive-size" filesystems (e.g. ext4) it skips all folders, because their size is < 1 KB.

Two possible solutions:

  1. Auto-detect whether the filesystem implements recursive size for folders (does a folder's size exceed its metadata size?)
  2. Add a new CLI argument to request applying size-skipping to folders

Which do you prefer?
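
To make option 1 concrete, here is one possible probe, sketched in Python. It is only an interpretation of "does folder size exceed its metadata size", not something proposed verbatim in this thread: `dir_reports_recursive_size` and its `payload` parameter are hypothetical, it needs write access to the target, and it assumes the directory's size is already up to date when it is read, which may not hold if a filesystem propagates recursive statistics lazily.

```python
import os
import tempfile


def dir_reports_recursive_size(parent: str, payload: int = 1 << 20) -> bool:
    """Heuristic probe (assumption, not rmlint behaviour): write `payload`
    bytes into a fresh folder under `parent` and check whether the folder's
    own st_size is at least that large."""
    with tempfile.TemporaryDirectory(dir=parent) as probe:
        with open(os.path.join(probe, "payload"), "wb") as fh:
            fh.write(b"\0" * payload)
            fh.flush()
            os.fsync(fh.fileno())
        # A folder holding a single entry normally reports far less than
        # the payload size unless folder sizes are recursive.
        return os.lstat(probe).st_size >= payload
```

A false negative here would silently disable the optimisation, and a false positive would reproduce the test failure above, which is an argument for the explicit option.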

Collaborator

vassilit commented Jan 7, 2026

rmlint already does filesystem detection for some options (support for ioctls such as FICLONERANGE, etc.).

However, I think that the st_size value of CephFS with the rbytes feature enabled is non-standard, so I would prefer a new CLI long option.
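
A rough Python sketch of what the opt-in approach implies for the skip decision follows. The flag name `--size-applies-to-dirs` is purely hypothetical (no option name has been agreed in this thread), the `--size` handling is simplified to a single byte count, and `should_skip_dir` is an illustrative helper, not a function in rmlint.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="sketch")
    # Simplified stand-in for a minimum-size option (bytes only).
    parser.add_argument("--size", type=int, default=0)
    # Hypothetical long option: off by default, so filesystems without
    # recursive folder sizes keep their current behaviour.
    parser.add_argument("--size-applies-to-dirs", action="store_true")
    return parser


def should_skip_dir(dir_st_size: int, opts: argparse.Namespace) -> bool:
    """Skip a folder only when the user opted in and its reported size
    is below the minimum."""
    return opts.size_applies_to_dirs and dir_st_size < opts.size
```

With the option off by default, a run like the failing test (`--size 1024` on a filesystem whose folders report a small size) would no longer skip any folders.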
