Description
I would like to propose a small refactor to use the standard library caching helpers in places where the same work is done many times on the same objects.
selectolax often exposes:
- properties that compute values based on the underlying DOM node
- methods that walk the tree or compute derived data from the same arguments on the same object
These values usually do not change for the lifetime of a given node or parser instance, but they may be accessed repeatedly. Using the built-in caching tools avoids recomputing them on every access.
Proposal
- Use `@functools.cached_property` for suitable properties
  For properties that:
  - are computed once from the current internal state
  - do not change for the lifetime of the object
  - are reasonably expensive (they do more than a trivial constant-time field access)
  wrap them with `@functools.cached_property` so that the value is computed once per instance and then reused.
  This is especially useful for properties that walk the DOM, wrap C structures, or build Python-level objects from C data.
- Use `functools.cache` for suitable methods
  For methods that:
  - depend only on `self` and their explicit arguments
  - have no side effects
  - are likely to be called multiple times with the same arguments on the same object
  decorate them with `@functools.cache`, or refactor the core logic into a helper that is wrapped with `@functools.cache`.
  This gives simple memoization without needing a custom caching layer.
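To make the two patterns concrete, here is a minimal sketch on a plain Python class. `FakeNode`, its `text` property, its `count_tag` method, and the `walks` counter are hypothetical stand-ins for illustration only and are not part of selectolax's API. The sketch also assumes the class has a writable instance `__dict__`, which `functools.cached_property` requires, and that instances are hashable, which `functools.cache` requires for `self`.

```python
from functools import cache, cached_property


class FakeNode:
    """Hypothetical stand-in for a wrapped DOM node (not selectolax's API)."""

    def __init__(self, tag, children=(), text=""):
        self.tag = tag
        self.children = tuple(children)
        self._text = text
        self.walks = 0  # counts subtree walks, for demonstration only

    @cached_property
    def text(self):
        # Walks the subtree on first access only; the result is then stored
        # in the instance __dict__ and returned directly on later accesses.
        self.walks += 1
        return self._text + "".join(child.text for child in self.children)

    @cache
    def count_tag(self, tag):
        # Memoized on (self, tag). The cache lives on the function object and
        # holds a reference to every instance it has seen, so those instances
        # stay alive until the cache is cleared.
        own = 1 if self.tag == tag else 0
        return own + sum(child.count_tag(tag) for child in self.children)


root = FakeNode(
    "div",
    [FakeNode("p", text="a"), FakeNode("p", text="b")],
    text="x",
)
first = root.text   # computed by walking the subtree
second = root.text  # served from the per-instance cache
paragraphs = root.count_tag("p")
```

Because `functools.cache` keeps strong references to `self`, the "helper wrapped with `@functools.cache`" variant may be the safer choice when nodes are created in large numbers; that trade-off would need to be checked against the actual classes.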
Notes and constraints
- Caching should be added only where the underlying node or parser is effectively immutable or where cached results cannot become wrong if the tree changes.
- If there are places where the tree can be mutated after creation, we should either skip caching for those properties or document clearly that cached results are not updated after mutation.
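The staleness risk described in these notes can be sketched as follows. `MutableNode`, `child_tags`, and `append` are hypothetical names, not selectolax's API. The explicit invalidation inside `append` is one possible mitigation shown only for illustration; it is an assumption of this sketch rather than part of the proposal.

```python
from functools import cached_property


class MutableNode:
    """Hypothetical mutable node (not selectolax's API)."""

    def __init__(self, tag, children=()):
        self.tag = tag
        self.children = list(children)

    @cached_property
    def child_tags(self):
        return tuple(child.tag for child in self.children)

    def append(self, child):
        self.children.append(child)
        # Drop the cached value so the next access recomputes it.
        # pop() with a default is safe even if nothing was cached yet.
        self.__dict__.pop("child_tags", None)


node = MutableNode("ul", [MutableNode("li")])
before = node.child_tags            # computed and cached

node.append(MutableNode("a"))       # mutation that invalidates the cache
after_append = node.child_tags      # recomputed

node.children.append(MutableNode("b"))  # mutation that bypasses append()
stale = node.child_tags             # still the old cached tuple
```

The last access returns the cached tuple without the `b` child, which is the kind of behavior that would have to be either avoided by skipping the cache or documented.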
This refactor would keep the public API unchanged while improving performance and reducing repeated work in the high-level Python code that wraps the C layer.