Speed up some functions #360

Merged
ludwiglierhammer merged 15 commits into glamod:main from ludwiglierhammer:more_speed
Jan 29, 2026

ludwiglierhammer commented Jan 26, 2026

  • cdm_mapper
  • common.select

ludwiglierhammer commented Jan 26, 2026

We benchmarked with a 2.1 GB test dataset.
The full pipeline was run on it: reading (mdf_reader), selecting (common.select), mapping (cdm_mapper) and writing to disk:

| Stage | chunksize | Duration | Max memory usage |
| --- | --- | --- | --- |
| Before | None | ~120 min | > 30 GB |
| Before | 200000 | ~40 min | > 30 GB |
| After | None | ~50 min | > 30 GB |
| After | 200000 | ~35 min | > 30 GB |

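This PR speeds up the code overall, most clearly without chunking (~120 min down to ~50 min); with a chunksize of 200000 the gain is smaller (~40 min down to ~35 min).
Unfortunately, maximum memory usage is not affected and stays above 30 GB in all four runs.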
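
For anyone wanting to reproduce numbers of this kind, here is a minimal sketch of timing a stage and recording its peak memory using only the standard library. It is not how the figures above were obtained (the comment does not say which tool was used); `measure` and `toy_stage` are hypothetical names, and `tracemalloc` only sees allocations made through Python's allocator, so it will generally report less than the process-level peak.

```python
import time
import tracemalloc


def measure(func, *args, **kwargs):
    """Run func and return (result, elapsed_seconds, peak_traced_bytes)."""
    tracemalloc.start()
    start = time.perf_counter()
    try:
        result = func(*args, **kwargs)
        seconds = time.perf_counter() - start
        # get_traced_memory() returns (current, peak) in bytes
        _, peak = tracemalloc.get_traced_memory()
    finally:
        # Always stop tracing, even if func raised
        tracemalloc.stop()
    return result, seconds, peak


def toy_stage(n):
    """Hypothetical stand-in for one pipeline stage (read, select, map, write)."""
    rows = [i * 2 for i in range(n)]
    return sum(rows)


if __name__ == "__main__":
    result, seconds, peak = measure(toy_stage, 1000)
    print(f"result={result} duration={seconds:.4f}s peak={peak / 1024:.1f} KiB")
```

Wrapping each stage separately would show which one is responsible for the peak, rather than only the total for the whole run.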

Memory rises (with chunking):

  • at the end of mapping
  • during selection (times 2 because of "data" and "mask")
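
To illustrate the second point, here is a small sketch of chunk-wise selection, assuming (the comment does not spell this out) that "data" and "mask" are two equally long structures that must be filtered with the same row selection. It uses plain Python lists rather than the project's actual data structures, and `iter_chunks` and `select_pairs` are hypothetical helpers, not functions from common.select.

```python
from itertools import islice


def iter_chunks(rows, chunksize):
    """Yield lists of at most chunksize items from any iterable."""
    iterator = iter(rows)
    while True:
        chunk = list(islice(iterator, chunksize))
        if not chunk:
            return
        yield chunk


def select_pairs(data, mask, predicate, chunksize):
    """Yield (selected_data, selected_mask) per chunk.

    predicate is evaluated on each data row; the same rows are then
    kept in both the data chunk and the mask chunk.
    """
    chunks = zip(iter_chunks(data, chunksize), iter_chunks(mask, chunksize))
    for data_chunk, mask_chunk in chunks:
        keep = [predicate(row) for row in data_chunk]
        # Two filtered structures are built per chunk, which is why the
        # footprint of selection is roughly twice that of the data alone.
        yield (
            [d for d, k in zip(data_chunk, keep) if k],
            [m for m, k in zip(mask_chunk, keep) if k],
        )


if __name__ == "__main__":
    data = [1, 2, 3, 4, 5]
    mask = [True, False, True, True, False]
    for selected_data, selected_mask in select_pairs(data, mask, lambda x: x > 2, 2):
        print(selected_data, selected_mask)
```

Because the function is a generator, a caller that writes each pair to disk before requesting the next one only ever holds chunk-sized objects; collecting all results in a list first would bring the full-size footprint back.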

@github-actions

Warning
This Pull Request is coming from a fork and must be manually tagged approved
in order to perform additional testing.

@ludwiglierhammer

@JanWillruth: From my perspective, this PR is ready to merge. We can focus on performance in another PR once #348 has been merged.

github-actions bot added the docs label Jan 29, 2026
ludwiglierhammer merged commit 2139796 into glamod:main Jan 29, 2026
10 of 11 checks passed