Skip to content

Load .jpk-qi-data metadata using jpk-reader-rs package#45

Open
bicarlsen wants to merge 2 commits intoAFM-analysis:masterfrom
bicarlsen:jpk_reader_rs
Open

Load .jpk-qi-data metadata using jpk-reader-rs package#45
bicarlsen wants to merge 2 commits intoAFM-analysis:masterfrom
bicarlsen:jpk_reader_rs

Conversation

@bicarlsen
Copy link
Contributor

Speeds up loading .jpk-qi-data files by using the jpk-reader-rs package to load metadata.

Addresses #41, reducing load times of standard files by ~10x. For the test file it reduced the load time from more than 120 minutes to about 15.

I kept as much of the original JPKReader as possible for this first pass. There is still quite some performance to be gained though, as the same test file loads in 20 seconds from the Rust crate.

While probably not yet ready to be merged, I'm hoping can use it so we can address any issues that arise and continue to imporve upon performance.

@bicarlsen
Copy link
Contributor Author

This also incorporates #44.

remove backslash from hour and minute while parsing time.

refactor

optimize get_metadata
@bicarlsen
Copy link
Contributor Author

bicarlsen commented Feb 4, 2026

This new version optimizes get_metadata. Load times are down to about 130 seconds (60x faster than original) on my test data.

Once this is more thouroughly tested, it should be good to go.

There may also be some optimizations that can be applied elsewhere:

  1. File list is sorted, so you can use bisect to find items in it.
  2. Shared properties is repeatedly sorted and walked. Instead this can be done once at initialization.
  3. hierarchy was rewalking the file list. It now just happens at initialization.

@bicarlsen bicarlsen marked this pull request as ready for review February 5, 2026 16:53
@PinkShnack
Copy link
Contributor

Thanks @bicarlsen for the effort, looks great.

  • Can you please add unit test(s) for loading such a file and checking that the metadata, data is as expected?
  • I understand the files are quite large, could you make a smaller file and add it as a test dataset in tests/data. Zipping it might also help to make it smaller. If the file is still quite big, we can discuss options, such as using git-lfs.

@@ -1,19 +1,18 @@
import numpy as np
import jpk_reader_rs as jpk_rs
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

jpk_rs isn't used here.

If you lint the package using flake8 (pip install flake8) (flake8 afmformats tests)

You'll see some small style issues

"igor2>=0.5.0", # Asylum Research .ibw file format
"jprops", # JPK file format
"jprops", # JPK file format
"jpk-reader-rs", # JPK QI data (`.jpk-qi-data`)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't install as is. IT requires Rust and Cargo to install, apparently. I get the issue locally and also the CI/CD has the same issue (see https://github.com/AFM-analysis/afmformats/actions/runs/21680235950/job/62512249369?pr=45)

self._hierarchy = "single"
elif self.files_contains("index/"):
self._hierarchy = "indexed"

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As above, if you run flake8, you'll see some lint issues e.g. W293 blank line contains whitespace

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants