-
Notifications
You must be signed in to change notification settings - Fork 15
Added a query function to get minimum distance and the indices of clo… #149
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
michaelkirk
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm new to this crate, so don't consider my review as blocking. Probably @kylebarron or someone else more familiar should make the final call.
| let node_size = self.node_size(); | ||
|
|
||
| // Use TinyVec to avoid heap allocations | ||
| let mut stack: TinyVec<[usize; 33]> = TinyVec::new(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since all pushes/pops happen in 3's maybe it makes more sense to have:
let mut stack: TinVec<[(usize, usize, usize); 11]> = TinyVec::new();
stack.push((0, indices.len() - 1, 0));
|
|
||
| // recursively search for items within radius in the kd-sorted arrays | ||
| while !stack.is_empty() { | ||
| let axis = stack.pop().unwrap_or(0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This unwrap_or "should never happen", right?
If you adopt the 3-tuple approach above, I think this could become:
while let Some((axis, right, left)) = stack.pop() {
...
}
And you could get rid of the unwrap_or
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking around I bit, it looks like this was copy/pasted from another method. Actually, a quick scan shows this traversal logic is being used in at least 2 other places. Is it reasonable to abstract the traversal logic from the query logic at this point?
| /// - qy: y value of query point | ||
| /// | ||
| /// Returns distance squared and indices of found items | ||
| fn query(&self, qx: N, qy: N) -> (N, Vec<u32>) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have several superficial suggestions, but they are admittedly all subjective matters of taste.
1. naming: query is vague - nearest would be more specific without being overly verbose.
2. Returning distance squared: I see how this could be helpful for performance in some cases, but I'd guess it's surprising. (The current implementation actually seems to already return distance not distance squared so maybe you agree and this was just an out of date doc).
3. Tuple order: I'd expect most people to be more interested in the Vec<u32> than the distance, so I'd flip the order of the return tuple.
So, all together, I'm suggesting something like this:
/// Returns indices of found items and their distance from the query point.
fn nearest(&self, qx: N, qy: N) -> (Vec<u32>, N) {
(nearest_distance_squared.0, nearest_distance_squared.1.sqrt())
}
/// Optional: if you think it's worth exposing the squared distance - give it a more obvious name:
///
/// Returns indices of found items and their distance squared from the query point.
fn nearest_distance_squared(&self, qx: N, qy: N) -> (Vec<u32>, N) {
...
}There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On the naming. I have been using KDTrees in Scipy and I have mimicked the naming there
| /// - qy: y value of query point | ||
| /// | ||
| /// Returns distance squared and indices of found items | ||
| fn query(&self, qx: N, qy: N) -> (N, Vec<u32>) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this u32? Aren't these array indices? I think if you use usize some of the unwrap's below can go away.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, my bad. I see that indices are either u16 or u32, so converting to u32 should never fail. For future cleanup (not in this PR), maybe Indices should have some kind of infallible method like get_u32 so we don't have so many unwraps in the calling code.
| stack.push(1 - axis); | ||
| } | ||
| } | ||
| let d = min_val.sqrt().unwrap(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is it safe to unwrap here? When is this None?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be None if min_val is < 0.0. That should not be the case since it is squared distance.
|
I'd like to find time to review this, but I'm pretty swamped right now and just got selected to serve on a 3-week jury trial, so my time will likely be very limited for the rest of January. |
|
@kylebarron No problem. I have some other work I need to progress and will come back to this when you have more time.. |
…sest points
CHANGES.mdorCHANGELOG.md