Author: Li Yiming
- Sort select: O(n log n)
- Linear select: 1.5n ~ 5.43n
- Lazy select: 2n + o(n)
See my report (in Chinese) for more technical information.
Run test cases and see the logs under log folder (you have to manually create it if not exist):
python test/test.pyThe log file with WARNING is a simple version of log, and DEBUG is a more verbose version of log.
For example, the lazy-WARNING.log file has only the summary of the algorithm result and time used:
min([ 4 7 2 10 3 3], 0) = 2
Time used: 0.0seconds
min([ 4 7 2 10 3 3], 1) = 3
Time used: 0.0seconds
min([ 4 7 2 10 3 3], 2) = 3
Time used: 0.0seconds
min([ 4 7 2 10 3 3], 3) = 4
Time used: 0.0seconds
min([ 4 7 2 10 3 3], 4) = 7
Time used: 0.0seconds
min([ 4 7 2 10 3 3], 5) = 10
Time used: 0.0secondsBut the lazy-DEBUG.log file contains all the steps taken during the algorithm, it will be useful for you to understand the algorithm quickly.
You can run benchmark with command, the output will be written to files under log folder:
python test/benchmark.pyZipf distribution seems more difficult to select, so I only tested the case when the length of A is 10,000.
| Distribution | Algorithm | Length of A | Time used |
|---|---|---|---|
| Uniform | SORT-SELECT (numpy) |
1,000,000 | 0.24885845184326172seconds |
| Uniform | LINEAR-SELECT | 1,000,000 | 3.949763059616089seconds |
| Uniform | LAZY-SELECT | 1,000,000 | 4.454483985900879seconds |
| Normalize | SORT-SELECT (numpy) |
1,000,000 | 0.12593460083007812seconds |
| Normalize | LINEAR-SELECT | 1,000,000 | 0.5047116279602051seconds |
| Normalize | LAZY-SELECT | 1,000,000 | 2.378643274307251seconds |
| Zipf with ζ = 1.01 | SORT-SELECT (numpy) |
10,000 | 0.0010001659393310547seconds |
| Zipf with ζ = 1.01 | LINEAR-SELECT | 10,000 | 0.9494802951812744seconds |
| Zipf with ζ = 1.01 | LAZY-SELECT | 10,000 | 0.03298377990722656seconds |