Sorting algorithms

General concepts

In this part, arrays of $n$ elements are visually represented as histograms. The height of each bar represents the value of the associated element in the array.

Sorting algorithm

A sorting algorithm takes an unsorted array $A=[a_0, ..., a_{n-1}]$ as input and returns a sorted array $A'=[a_0', ..., a_{n-1}']$ as output. $A'$ is a permutation of $A$ such that:

\boxed{a_0'\leqslant a_1'\leqslant...\leqslant a_{n-1}'}

Stability

A sorting algorithm is said to be stable if the order of tied elements is guaranteed to remain the same after sorting the array.

Remark

Examples of stable sorting algorithms include merge sort and insertion sort.

The next sections will go through each sorting algorithm into more detail

Basic sort

Bubble sort

Bubble sort is a stable sorting algorithm that has a time complexity of $\mathcal{O}(n^2)$ and a space complexity of $\mathcal{O}(1)$ .

Intuition Compare consecutive elements and swap them if they are not in the correct order.

Algorithm

Compute step: Starting from the beginning of the array, compare the element $a_l$ at position $i$ with the element $a_r$ at position $i+1$ .
- Case $a_l\leqslant a_r$ : They are already in the correct order. There is nothing to do.
- Case $a_l>a_r$ : They are not in the correct order. Swap them.
  Repeat this process until reaching the end of the array.
Repeat step: Repeat the compute step until no swap can be done.

We note that:

At the end of the $k^{\textrm{th}}$ pass, the last $k$ elements of the array are guaranteed to be in their final positions.
The algorithm finishes in at most $n-1$ passes. In particular:
- If the input array is already sorted, then the algorithm finishes in 1 pass.
- If the input array is reverse sorted, then the algorithm finishes in $n-1$ passes.

Insertion sort

Insertion sort is a stable sorting algorithm that has a time complexity of $\mathcal{O}(n^2)$ and a space complexity of $\mathcal{O}(1)$ .

Intuition Incrementally build a sorted subarray by successively inserting the next unsorted element at its correct position.

Algorithm

Initialization: The first element of the unsorted array can be interpreted as a subarray of one element that is already sorted.
Compute step: Starting from position $i=1$ , we want to insert element $a_{i}$ into the current sorted subarray of size $i$ . In order to do that, we compare $a_i$ with its preceding element $a_p$ :
- As long as $a_p > a_i$ , we iteratively swap $a_p$ with $a_i$ .
- This process ends with either $a_i$ verifying $a_p\leqslant a_i$ or $a_i$ being at position 0 of the array.
  At the end of this step, the sorted subarray is of size $i+1$ .
Repeat step: Repeat the compute step until the sorted subarray reaches a size of $n$ .

Selection sort

Selection sort is a stable sorting algorithm that has a time complexity of $\mathcal{O}(n^2)$ and a space complexity of $\mathcal{O}(1)$ .

Intuition Incrementally build a sorted subarray by successively inserting the minimum value among the remaining elements.

Algorithm

Initialization:
- Find the minimum value of the unsorted array.
- Swap it with the element at the beginning of the array. The first element can now be interpreted as a sorted subarray with one element.
Compute step: Starting from $i=1$ , we want to insert a new element into the sorted subarray of size $i$ .
- Find the minimum value $a_{\textrm{min}}$ among the remaining elements at positions $i, ..., n-1$ of the array. By construction, $a_{\textrm{min}}$ is greater or equal than all elements of the current sorted subarray.
- Swap $a_{\textrm{min}}$ with the element at position $i$ of the array.
  At the end of this step, the sorted subarray is of size $i+1$ .
Repeat step: Repeat the compute step until the sorted subarray reaches a size of $n$ .

Remark

Insertion and selection sort are very similar in that they build the sorted array from scratch and add elements one at a time.

Cycle sort

Cycle sort is an unstable sorting algorithm that has a time complexity of $\mathcal{O}(n^2)$ and a space complexity of $\mathcal{O}(1)$ .

Intuition Determine the index of the final position of each element in the sorted array.

Algorithm

Compute step: Starting from $i=0$ , we want to find the final position of element $a_i$ along with those of the other elements impacted by this move.
To do so, we count the number of elements that are smaller than $a_i$ :
$\boxed{N_i=\#\left\{k, a_k < a_i\right\}}$
The final position of $a_i$ is at index $N_i$ , since we know that there are $N_i$ smaller elements than $a_i$ in the final sorted array.
- Case $N_i=i$ : This is a self-cycle, meaning that the element is already at its correct position. There is nothing to do.
- Case $N_i\neq i$ : This is the start of a cycle of moves. Place $a_i$ at position $N_i$ (or to the right of any duplicates, if applicable) and keep the replaced value in a temporary variable.
  Keep moving elements using this logic until getting back to position $i$ .
Repeat step: Repeat the compute step until reaching the end of the array. We can see cycles being formed from the way elements are moved.

Remark

This algorithm sorts the array with a minimum amount of rewrites since it only moves elements to their final positions.

Efficient sort

Merge sort

Merge sort is a stable sorting algorithm that has a time complexity of $\mathcal{O}(n\log(n))$ and a space complexity of $\mathcal{O}(n)$ .

Intuition Build a sorted array by merging two already-sorted arrays.

Algorithm

Divide step: Divide the array into as many subarrays as there are elements. Each resulting subarray has only one element and can be considered as sorted.
Conquer step: For each pair of sorted subarrays, build a sorted array by merging them using the two-pointer technique.
Repeat the process until all subarrays are merged into one final sorted array.

Remark

This algorithm is usually implemented recursively.

Heap sort

Heap sort is an unstable sorting algorithm that has a time complexity of $\mathcal{O}(n\log(n))$ and a space complexity of $\mathcal{O}(1)$ .

Intuition Incrementally build a sorted subarray by successively retrieving the maximum of remaining elements using a max-heap.

Algorithm

Initialization: Build a max-heap from the unsorted array in $\mathcal{O}(n)$ time. This is done by recursively swapping each parent with their child of highest value.
We can use the same array to represent the max-heap.
Compute step: Starting from $i=0$ , incrementally build a sorted subarray of size $i+1$ that is placed at the end of the array.
- Pop an element from the max-heap and place it at position $n-i-1$ of the array.
  By construction, the popped element is greater or equal than the $n-i-1$ elements of the heap and smaller or equal than the $i$ previously popped elements.
- Heapify the resulting max-heap of size $n-i-1$ in $\mathcal{O}(\log(n))$ time.
Repeat step: Repeat the compute step until the subarray reaches a size of $n$ .

Quick sort

Quick sort is an unstable sorting algorithm that has a time complexity of $\mathcal{O}(n^2)$ and a space complexity of $\mathcal{O}(n)$ .

Intuition Recursively choose an element of the array to be the pivot, and put:

Smaller elements to the left of the pivot.
Bigger elements to the right of the pivot.

Algorithm

Compute step:
- Pivot isolation step: Choose an element of the array to be the pivot $p$ and swap it with the last element of the array.
- Partition step: Starting from indices $l=0$ and $r=0$ , we make sure that elements $a_r$ that are smaller or equal than the pivot $p$ are sent to the left of the array:
  - Case $a_r \leqslant p$ : Swap element $a_r$ at index $r$ with element $a_l$ at index $l$ . Increment $l$ by 1.
  - Case $a_r > p$ : There is nothing to do.
    This process is successively repeated for all indices $r$ until the second-to-last position of the array.
- Pivot inclusion step: The final position of the left pointer $l$ represents the position in the array where its left elements are all array elements that are smaller than or equal to the pivot. We swap the pivot with the element from position $l$ .
  At the end of this step, the pivot is at its correct and final position.
Recursion step: The compute step is run recursively on the resulting subarrays on the left and right of the pivot. They are then merged back together to form the final sorted array.

We note that the runtime of the algorithm is sensitive to the choice of the pivot and the patterns found in the input array. In general, a time complexity of $\mathcal{O}(n\log(n))$ and a space complexity of $\mathcal{O}(\log(n))$ can be obtained when the pivot is a good approximation of the median of the array.

The two most common methods to choose the pivot are:

Special sort

Counting sort

Counting sort is a stable sorting algorithm that is efficient for integer arrays with values within a small range $[\![0, k]\!]$ . It has a time complexity of $\mathcal{O}(n+k)$ and a space complexity of $\mathcal{O}(n+k)$ .

Intuition Determine the final position of each element by counting the number of times its associated value appears in the array.

Algorithm

Number of occurrences: This step takes $\mathcal{O}(n+k)$ time and $\mathcal{O}(k)$ space.
- Initialization: Initialize an array $C$ of length $k+1$ .
- Count step: Scan array $A$ and increment the counter $c_v$ of each encountered value $v\in\{0, ..., k\}$ by 1.
  Each count $c_v$ represents the number of times the value $v\in\{0, ..., k\}$ appears in array $A$ .
- Cumulative step: Compute the cumulative sum of array $C=[c_0, .., c_k]$ and move each resulting element to the right. This operation is done in-place and takes $\mathcal{O}(k)$ time and $\mathcal{O}(1)$ space.
  For each value $v$ present in $A$ , the associated element $c_v$ in the resulting array $C$ indicates its starting index $i$ in the final sorted array $A'$ .
Construction of the sorted array: This step takes $\mathcal{O}(n)$ time and $\mathcal{O}(n)$ space.
- Initialization: Initialize an array $A'$ of length $n$ that will contain the final sorted array.
- Main step: For each value $v$ of the unsorted array $A$ :
  - Write $v$ to index $c_v$ of $A'$ .
  - Increment $c_v$ by 1 so that any later duplicates can be handled.

At the end of this process, we obtain the final sorted array $A'$ .

Radix sort

Radix sort is a stable sorting algorithm that is well suited for integer arrays where elements are written with a limited number of digits $d$ , each digit being in the range $[\![0, k]\!]$ . This algorithm has a time complexity of $\mathcal{O}(d(n+k))$ and a space complexity of $\mathcal{O}(n+k)$ .

Intuition Successively sort elements based on their digits using counting sort.

Algorithm

Compute step: Perform counting sort based on the rightmost digit. This step takes $\mathcal{O}(n+k)$ time and $\mathcal{O}(n+k)$ space.
Repeat step: Repeat the *compute step on the remaining digits until reaching the leftmost digit.
At the end of this process, the array is sorted.

The trick of this algorithm lies in the stability property of counting sort: the relative ordering based on a given (weak) digit helps in breaking ties of later (stronger) digits.

Want more content like this?

Subscribe here to be notified of new Super Study Guide releases!

Sorting algorithms

General concepts​