Commit 361c7da
Improving sorting notes.
1 parent 9421b0c

File tree: 1 file changed, +39 -19 lines

source/lectures/misc/sorting.md
@@ -29,17 +29,19 @@ among other properties.
 
 ## Algorithm Comparison
 
-The algorithms we are about to study compare [as follows](https://en.wikipedia.org/wiki/Sorting_algorithm#Comparison_of_algorithms):
-
-Name | Best | Average | Worst | Memory | Stable | In-place |
---------- | --- | --- | --- | --- | --- | --- |
-Insertion sort | n | $n^2$ | $n^2$ | 1 | Yes | Yes |
-Heapsort | $n \log n$ | $n \log n$ | $n \log n$ | 1 | No | Yes |
-Bubble sort | n | $n^2$ | $n^2$ | 1 | Yes | Yes |
-Shellsort | $n (\log n)^2$ | $n (\log n)^2$ | $n (\log n)^2$ | 1 | No | Yes |
-Quicksort | $n \log n$ | $n \log n$ | $n^2$ | $\log n$ | No | Yes |
-Selection sort | $n^2$ | $n^2$ | $n^2$ | 1 | No | Yes |
-Merge sort | $n \log n$ | $n \log n$ | $n \log n$ | n | Yes | No |
+The algorithms we are about to study compare [as follows](https://en.wikipedia.org/wiki/Sorting_algorithm#Comparison_of_algorithms), omitting the $O(\cdot)$ and $\times$ symbols, and letting $n$ be the size of the list we have to sort:
+
+Name | Best | Average | Worst | Memory | Stable |
+--------- | --- | --- | --- | --- | --- |
+Insertion sort | $n$ | $n^2$ | $n^2$ | $c$ | Yes |
+Heapsort | $n \log(n)$ | $n \log(n)$ | $n \log(n)$ | $c$ | No |
+Bubble sort | $n$ | $n^2$ | $n^2$ | $c$ | Yes |
+Shellsort | $n (\log(n))^2$ | $n (\log(n))^2$ | $n (\log(n))^2$ | $c$ | No |
+Quicksort | $n \log(n)$ | $n \log(n)$ | $n^2$ | $\log(n)$ | No |
+Selection sort | $n^2$ | $n^2$ | $n^2$ | $c$ | No |
+Merge sort | $n \log(n)$ | $n \log(n)$ | $n \log(n)$ | $n$ | Yes |
+
+All the algorithms above are in-place except for merge sort.
 
 ## Helper Methods
 
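To make the **Stable** column in the new table concrete: a stable sort keeps elements with equal keys in their original relative order. The following minimal sketch (illustrative Python, not the lecture's C# snippets; the function names are ours) contrasts stable insertion sort with unstable selection sort on (key, label) pairs sorted by key only:

```python
def insertion_sort(items):
    """Stable: equal keys are never shifted past each other."""
    result = list(items)
    for bar in range(1, len(result)):
        current = result[bar]
        slot = bar
        # Only strictly larger keys are shifted right, so equal keys keep their order.
        while slot > 0 and result[slot - 1][0] > current[0]:
            result[slot] = result[slot - 1]
            slot -= 1
        result[slot] = current
    return result

def selection_sort(items):
    """Unstable: long-distance swaps can reorder equal keys."""
    result = list(items)
    for i in range(len(result)):
        smallest = i
        for j in range(i + 1, len(result)):
            if result[j][0] < result[smallest][0]:
                smallest = j
        # This swap can jump an element over others that share its key.
        result[i], result[smallest] = result[smallest], result[i]
    return result

pairs = [(2, "a"), (2, "b"), (1, "c")]
print(insertion_sort(pairs))  # [(1, 'c'), (2, 'a'), (2, 'b')]: "a" stays before "b"
print(selection_sort(pairs))  # [(1, 'c'), (2, 'b'), (2, 'a')]: "b" now precedes "a"
```

The first swap of selection sort moves `(2, "a")` to the back of the list, past its equal-keyed twin, which is exactly the behaviour stability forbids.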
@@ -66,17 +68,21 @@ At every step, it positions a `slot` on the bar and looks *back*, moving the value
 
 ### Complexity
 
-[As explained on wikipedia](https://en.wikipedia.org/wiki/Insertion_sort#Best,_worst,_and_average_cases), the simplest worst case input is an array sorted in reverse order.
-With an array sorted in reverse order, every iteration of the inner loop will scan and shift the entire sorted subsection of the array (i.e., from `bar` to the beginning) before inserting the next element. This gives a quadratic running time (i.e., $O(n^2)$).
+[As explained on wikipedia](https://en.wikipedia.org/wiki/Insertion_sort#Best,_worst,_and_average_cases), the simplest **worst** case input is an array sorted in reverse order.
+With an array sorted in reverse order, every iteration of the inner loop will scan and shift the entire sorted subsection of the array (i.e., from `bar` to the beginning) before inserting the next element. This gives a quadratic running time (i.e., $O(n^2)$), since `bar` grows linearly with `n` and the inner loop walks over the whole subsection before it.
+
+On the flip side, if the array is already sorted, then the algorithm is linear, since the inner loop will always execute just one time, giving an overall **best** performance of $O(n)$.
+
+But **on average**, the algorithm remains in $O(n^2)$, since each of the $n$ insertions shifts, on average, half of the sorted subsection.
 
 ## Heapsort Algorithm
 
 ### Implementation
 
-We first define some helper methods:
+We first define a helper method:
 
 ```
-!include`snippetStart="// Helper methods for Heapsort", snippetEnd="// Done with helper methods for Heapsort"` code/projects/Sorting/Sorting/Sorting.cs
+!include`snippetStart="// Helper method for Heapsort", snippetEnd="// Done with helper method for Heapsort"` code/projects/Sorting/Sorting/Sorting.cs
 ```
 
 and then leverage the heap structure to sort:
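The best- and worst-case claims about insertion sort above can be checked by counting inner-loop shifts. A quick Python sketch (our own names, not the course's C# code): a sorted input performs zero shifts, while a reversed input performs $n(n-1)/2$ of them.

```python
def insertion_sort_shifts(list_p):
    """Insertion sort that returns how many inner-loop shifts it performed."""
    lst = list(list_p)
    shifts = 0
    for bar in range(1, len(lst)):
        current = lst[bar]
        slot = bar
        # Shift larger elements right until the insertion point is found.
        while slot > 0 and lst[slot - 1] > current:
            lst[slot] = lst[slot - 1]
            slot -= 1
            shifts += 1
        lst[slot] = current
    return shifts

n = 100
print(insertion_sort_shifts(list(range(n))))         # best case (sorted): 0 shifts
print(insertion_sort_shifts(list(range(n, 0, -1))))  # worst case (reversed): 4950 = n(n-1)/2 shifts
```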
@@ -101,6 +107,7 @@ This algorithm works in two steps:
 - The second step also calls `PercDown` $n$ times, so it is overall $O(n \times \log(n))$ as well.
 
 Hence, the complexity of heapsort is $O(n \times \log(n))$ by [the sum rule](./docs/programming_and_computer_usage/complexity#simplifications).
+Note that the **average**, **worst** and **best** complexities are all the same!
 
 ## Bubble Algorithm
 
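The two steps above (build a max-heap by sifting down, then repeatedly move the maximum to the end) can be sketched as follows. This is an illustrative Python approximation of the notes' C# `PercDown` helper, not the actual snippet:

```python
def perc_down(lst, root, size):
    """Sift lst[root] down until the subtree rooted there is a max-heap."""
    while 2 * root + 1 < size:
        child = 2 * root + 1
        # Pick the larger of the two children, if a second one exists.
        if child + 1 < size and lst[child + 1] > lst[child]:
            child += 1
        if lst[root] >= lst[child]:
            break
        lst[root], lst[child] = lst[child], lst[root]
        root = child

def heapsort(list_p):
    lst = list(list_p)
    n = len(lst)
    # Step 1: build a max-heap (at most n PercDown calls, each O(log n)).
    for root in range(n // 2 - 1, -1, -1):
        perc_down(lst, root, n)
    # Step 2: swap the max to the end, shrink the heap, and re-heapify (n times).
    for end in range(n - 1, 0, -1):
        lst[0], lst[end] = lst[end], lst[0]
        perc_down(lst, 0, end)
    return lst

print(heapsort([5, 1, 4, 2, 3]))  # [1, 2, 3, 4, 5]
```

Both steps do the same work whatever the input looks like, which is why the best, average, and worst cases coincide.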
@@ -116,7 +123,8 @@ The nested loop accomplishes the following: "from the beginning of the list to w
 
 ### Complexity
 
-Since both loops depends on the size of the list, $n$, the algorithm is overall $O(n^2)$: we need to perform $n$ times $n$ operations.
+Since both loops depend on the size of the list, $n$, the algorithm is **on average** $O(n^2)$: we need to perform $n$ times $n$ operations.
+An optimization (not presented here), which stops as soon as a full pass performs no swap, brings the **best** case performance of bubble sort down to linear ($O(n)$).
 
 ## ShellSort Algorithm
 
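The early-exit optimization mentioned in this hunk can be sketched as follows (illustrative Python, not the lecture's C# code). Counting passes shows the linear best case on already-sorted input:

```python
def bubble_sort_passes(list_p):
    """Bubble sort with the early-exit optimization; returns (sorted list, passes)."""
    lst = list(list_p)
    passes = 0
    for i in range(len(lst) - 1, 0, -1):
        passes += 1
        swapped = False
        for j in range(i):
            if lst[j] > lst[j + 1]:
                lst[j], lst[j + 1] = lst[j + 1], lst[j]
                swapped = True
        if not swapped:
            # A full pass without swaps means the list is already sorted.
            break
    return lst, passes

print(bubble_sort_passes([1, 2, 3, 4, 5]))  # best case: 1 pass, so O(n) work
print(bubble_sort_passes([5, 4, 3, 2, 1]))  # worst case: 4 passes, so O(n^2) work
```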
@@ -156,15 +164,27 @@ Consider a list of size 30, we have (assuming `current.CompareTo(listP[slot - ga
 … | … | … |
 1 |
 
-The important point is to understand that we generate the sequences
+The important point is to understand that we generate the sequences of pairs (`slot`, `slot-gap`) as follows:
+
 - *Gap of 11*: (11, 0), (12, 1), (13, 2), … (22, 11), (11, 0), (23, 12), (12, 1), (30, 19), (19, 8),
 - *Gap of 5*: (5, 0), (11, 6), (6, 1), …
 
-which are sequences of values we are comparing.
+which are sequences of indices whose values we compare. For the gap of 11, it means we do the following:
+
+- First, we compare the values at indices 11 and 0, and swap them if needed,
+- Then, we compare the values at indices 12 and 1, and swap them if needed,
+- …
+- Then, we compare the values at indices 30 and 19, and swap them if needed,
+- If we did swap the values previously, then we compare the values at indices 19 and 8, and swap them if needed.
+
 After we are done going through "the $i$ gap", we know that all values $i$ indices apart are sorted.
 Reducing the value of $i$ to $1$ makes it so that the whole array is sorted.
 
-
 ### Complexity
 
+The complexity of shell sort depends on the "gap sequence" that is used. We use `listP.Count / 3 + 1`, `(listP.Count / 3 + 1) / 2`, `(listP.Count / 3 + 1) / 4`, …, `1`.
+This sequence follows Shell's original algorithm, and it is of complexity $O(n^2)$ in the **worst case**: we explore $O(\log(n))$ gaps, and the final, small gaps may each still require on the order of $n^2$ shifts on adversarial inputs.
+In the **best case**, if the array is already mostly sorted, then we still explore $O(\log(n))$ gaps, but each pass then performs only $O(n)$ comparisons, giving a $O(n \times \log(n))$ complexity.
+On **average**, the complexity depends a lot on the sequence, but can be around $O(n^{1.5})$, which is still better than quadratic!
 
+Playing with the gap sequence further can give a **best**, **worst** and **average** performance of $O(n \times (\log(n))^2)$!
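The gap sequence `listP.Count / 3 + 1`, halved repeatedly down to `1`, can be sketched in Python as follows (an approximation of the course's C# snippet; the name `shell_sort` is ours). Each pass is a gapped insertion sort, so the final pass with gap 1 is plain insertion sort on a mostly-sorted list:

```python
def shell_sort(list_p):
    """Shell sort using the lecture's gap sequence: n // 3 + 1, then halved down to 1."""
    lst = list(list_p)
    gap = len(lst) // 3 + 1
    while gap >= 1:
        # Gapped insertion sort: after this pass, values `gap` indices apart are ordered.
        for slot in range(gap, len(lst)):
            current = lst[slot]
            while slot >= gap and lst[slot - gap] > current:
                lst[slot] = lst[slot - gap]
                slot -= gap
            lst[slot] = current
        if gap == 1:
            break
        gap //= 2
    return lst

print(shell_sort([7, 2, 9, 4, 1, 8, 3]))  # [1, 2, 3, 4, 7, 8, 9]
```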
