Merged
…be sorted which fixed a test..
…ous attempts at finding the problem
…es to tests and debug statements to try get flash minsort working
…g during replacement selection still not working for 4 byte data
…ssed with the flash minsort logic, and I tried putting the result file to be after last write, but there's some logic breaking in there
…ile interface and adaptive sort working properly on the arduino
…ng sd file interface methods
rlawrenc
approved these changes
Feb 10, 2026
Description
This PR addresses the issues with sorting and order by, including fixes to the implementation and usage of the desktop/sd file interfaces, heap management in selection sort, region min finding problems in flash minsort, IO problems and logic issues in flash minsort sublist, removing Arduino specific limitations to have it work with adaptive sort, debugging tool additions/changes, makefile and pio build changes, and other miscellaneous fixes.
What was changed
File Interfaces
The sort wrapper was calling the desktop file interface directly to set up files. That caused problems with the distribution files, since the desktop and sd file interfaces aren't meant to be included in the distribution. I added the interface methods to the embed db file interface struct so the wrapper can use those instead, which avoids the include entirely. I also added methods to create random file names, with a conditional check for Windows vs Unix systems, and did the same for the sd file interface. Adaptive sort can now get a file name, pass it to the file setup to create the temporary external sorting file, and remove the file afterwards. The sd file interface was also missing a number of methods used by adaptive sort, including writeRel and readRel for writing and reading at the file pointer position, as well as tell, error, remove, and seek.
Adaptive Replacement Selection and Heap
The heap was being written to memory outside the allocated buffer space and was not being drained properly, so the selection sort sublists had garbage data written to them instead of what was left in the heap. There is now code to start a new run and drain the heap when a new run is needed. The main issue was that the records-left check would pass without checking whether the loop had written SOME of the records but not all of them. The records that had been read along with the heap values were therefore never flushed, and they ended up in a sublist as garbage data. The check now also requires i to be greater than or equal to the number of records read: if that is true, the rest of the data is flushed; otherwise it is not, and the heap stays full of the data we want written. I've also turned off the optimistic flag for now, since it seems to always run the pessimistic version anyway, but this can/should be revisited later.
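To illustrate the heap side of this, here is a minimal sketch of a bounded min-heap that refuses to write past its buffer and can be drained in sorted order at the end of a run. This is not the code in the PR: the MinHeap struct, heap_push, and heap_drain are hypothetical names, and the keys are assumed to be 4-byte integers.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical min-heap of int32_t keys living inside a caller-supplied buffer. */
typedef struct {
    int32_t *data;   /* start of the buffer region reserved for the heap */
    size_t capacity; /* number of slots the buffer can hold */
    size_t size;     /* number of slots currently in use */
} MinHeap;

static void heap_swap(int32_t *a, int32_t *b) {
    int32_t tmp = *a;
    *a = *b;
    *b = tmp;
}

static void sift_down(MinHeap *h, size_t i) {
    for (;;) {
        size_t left = 2 * i + 1, right = left + 1, smallest = i;
        if (left < h->size && h->data[left] < h->data[smallest]) smallest = left;
        if (right < h->size && h->data[right] < h->data[smallest]) smallest = right;
        if (smallest == i) return;
        heap_swap(&h->data[i], &h->data[smallest]);
        i = smallest;
    }
}

/* Returns 0 on success, -1 if the heap is full. Never writes past capacity. */
int heap_push(MinHeap *h, int32_t key) {
    if (h->size >= h->capacity) return -1;
    size_t i = h->size++;
    h->data[i] = key;
    while (i > 0) {
        size_t parent = (i - 1) / 2;
        if (h->data[parent] <= h->data[i]) break;
        heap_swap(&h->data[parent], &h->data[i]);
        i = parent;
    }
    return 0;
}

/* Pops the remaining keys in ascending order into out, up to out_capacity.
   Returns how many keys were written; the caller flushes these to the run. */
size_t heap_drain(MinHeap *h, int32_t *out, size_t out_capacity) {
    size_t written = 0;
    while (h->size > 0 && written < out_capacity) {
        out[written++] = h->data[0];
        h->data[0] = h->data[--h->size];
        sift_down(h, 0);
    }
    return written;
}
```

The capacity check in heap_push is what keeps the heap inside its buffer, and heap_drain is the step that has to run before a run is closed so that nothing left in the heap is lost.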
Flash Minsort
The minsort was overwriting the minimum value of a region when two blocks fell within the same region: it placed the first value of the new block there even if the region's current minimum was lower. So if block 0 had a minimum of 2 and block 1 had a minimum of 10 but they were in the same region, the 2 would be overwritten by the 10. The SEA 100K data test has a ton of repeated values, and this overwriting caused about half the values to disappear. The check no longer overwrites by default; it first makes sure the new value is actually the minimum of the region and only proceeds if that is true. This has had successful results so far. The other issue was that the file pointer was being set to 0 at the end of the method. Since this is a merge algorithm, we need to start reading from the merged output, which begins halfway through the file.
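The region minimum fix can be sketched as follows. This is a simplified illustration, not the minsort code itself: build_region_mins, REGION_EMPTY, and the fixed blocks_per_region mapping are assumptions made for the example.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sentinel meaning "no minimum recorded for this region yet".
   A real key equal to INT32_MAX would be indistinguishable from empty here. */
#define REGION_EMPTY INT32_MAX

/* Fills region_mins from per-block minimums, assuming block b belongs to
   region b / blocks_per_region. A region's minimum is only replaced when
   the block's minimum is strictly smaller than the one already stored. */
void build_region_mins(const int32_t *block_mins, size_t num_blocks,
                       size_t blocks_per_region,
                       int32_t *region_mins, size_t num_regions) {
    for (size_t r = 0; r < num_regions; r++) {
        region_mins[r] = REGION_EMPTY;
    }
    for (size_t b = 0; b < num_blocks; b++) {
        size_t r = b / blocks_per_region;
        if (r >= num_regions) break;
        if (block_mins[b] < region_mins[r]) {
            region_mins[r] = block_mins[b]; /* new minimum for the region */
        }
        /* otherwise keep the existing, lower minimum */
    }
}
```

With block minimums 2 and 10 in the same region, the region keeps 2, which is the case from the example above.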
Flash Minsort Sublist
The minsort sublist had a strange issue with block reading failures. I updated the block read method to return an int for success or failure, and the caller checks that to decide what to do next. If the read fails because of end-of-file or some data corruption issue, it now marks that region as finished and moves on, rather than continuing with a failed read. There's also some code to restart the min check recursively if the corruption occurs at the start of the read. This usually shouldn't be called, but I think it might be necessary if some corruption occurs. I will try to trigger these cases at some point, but the current test code that triggers min sublists does not do that. Since this is only called when there are very few sublists, I don't think the recursive call should be an issue if it's triggered, but again, that needs to be tested properly, which I have not done unfortunately.
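The read-status handling looks roughly like the sketch below. It uses the standard C FILE API rather than the embed db file interface, and read_block, advance_region, RegionState, and the tiny BLOCK_SIZE are all hypothetical names chosen for the example. The recursive restart is left out.

```c
#include <stddef.h>
#include <stdio.h>

#define BLOCK_SIZE 8 /* hypothetical block size, kept tiny for illustration */

/* Returns 0 on success, -1 on a seek failure, short read, or I/O error. */
int read_block(FILE *fp, long block_index, unsigned char *buf) {
    if (fseek(fp, block_index * BLOCK_SIZE, SEEK_SET) != 0) return -1;
    size_t got = fread(buf, 1, BLOCK_SIZE, fp);
    return got == BLOCK_SIZE ? 0 : -1;
}

/* Hypothetical per-region cursor. */
typedef struct {
    long next_block; /* index of the next block to read for this region */
    int finished;    /* set once the region has no more readable blocks */
} RegionState;

/* Loads the region's next block into buf. On a failed read the region is
   marked finished instead of carrying on with a stale buffer.
   Returns 1 if a block was loaded, 0 otherwise. */
int advance_region(FILE *fp, RegionState *region, unsigned char *buf) {
    if (region->finished) return 0;
    if (read_block(fp, region->next_block, buf) != 0) {
        region->finished = 1;
        return 0;
    }
    region->next_block++;
    return 1;
}
```

The point is that the caller never uses buf after a failed read; it only sees that the region is done and moves on to the next one.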
Makefile, Debugging
I added some files for print debugging, called debug_log, that use a low-level call for printing, since printf calls often didn't show up when debugging with breakpoints. I put them in the lib folder so they are not included in the dist. Some macros check whether any of the debug macros are defined before including the file; if the file can't be found, debug_log is defined as void so everything should still run. In the makefile and the .pio build file I removed the print errors debug symbol so it's not defined by default.
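The general shape of that pattern is sketched below. The names are assumptions for illustration only: DEBUG_SORT and DEBUG_MINSORT stand in for whatever debug macros the project defines, debug_log_write and DEBUG_LOG are hypothetical, and fwrite to stderr stands in for the low-level print call, which will differ on the Arduino.

```c
#include <stdarg.h>
#include <stdio.h>

/* Formats into a fixed buffer and writes it directly to stderr.
   Returns the number of bytes written, or -1 on a formatting error. */
int debug_log_write(const char *fmt, ...) {
    char line[128];
    va_list args;
    va_start(args, fmt);
    int n = vsnprintf(line, sizeof line, fmt, args);
    va_end(args);
    if (n < 0) return -1;
    if ((size_t)n >= sizeof line) n = (int)sizeof line - 1; /* truncated */
    return (int)fwrite(line, 1, (size_t)n, stderr);
}

/* Hypothetical flag names. With neither defined, every DEBUG_LOG call
   compiles to a void no-op, so the rest of the code builds unchanged. */
#if defined(DEBUG_SORT) || defined(DEBUG_MINSORT)
#define DEBUG_LOG(...) debug_log_write(__VA_ARGS__)
#else
#define DEBUG_LOG(...) ((void)0)
#endif
```

A call such as DEBUG_LOG("region %d min %d\n", r, min); then costs nothing in a normal build and only prints when one of the flags is passed to the compiler.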
Final comments
There were a lot of changes and a lot of trial and error, and I added a good deal more debugging code in case I or someone down the line needs to revisit these methods. These are far from perfect. They work with the tests and data I've tried so far, but that does NOT mean they are always correct or foolproof: there may be something I missed, something I added erroneously, or a performance improvement I overlooked.