# Linked lists

*By Afshine Amidi and Shervine Amidi*

## Singly linked lists

### Definition

A singly linked list is a data structure composed of nodes, where each node carries the information of:

- a value $v$
- a $\texttt{next}$ field, that points to the next node

The first node of a singly linked list is called the head. The overall data structure is represented as follows:
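In code, a singly linked list node can be sketched as follows (a minimal illustration; the class name `Node` is my own choice, not prescribed by the text):

```python
class Node:
    """A node of a singly linked list."""
    def __init__(self, value, next=None):
        self.value = value  # the value v carried by the node
        self.next = next    # pointer to the next node (None if this is the last node)

# Build the list 1 -> 2 -> 3; the first node is the head.
head = Node(1, Node(2, Node(3)))
```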

### Operations

The main operations that can be performed on a singly linked list are explained below:

Type | Time | Description |
---|---|---|
Access | O(n) | Starting from the head, the linked list is traversed sequentially until the desired node is reached. If it is the last element, the whole linked list is traversed. |
Search | O(n) | Starting from the head, the linked list is traversed sequentially until the desired node is found. If the node is not present, the whole linked list is traversed. |
Insertion | O(1) | 1. Assign the $\texttt{next}$ pointer of the new element to the following element. 2. Assign the $\texttt{next}$ pointer of the previous-to-be element to the newly-inserted node. |
Deletion | O(1) | Assign the $\texttt{next}$ pointer of the previous node to the node that follows the soon-to-be-removed node. |
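The constant-time insertion and deletion operations can be sketched as follows (an illustrative sketch; the helper names `insert_after` and `delete_after` are my own). Note that the new node must point to the following element before the previous element is rewired, otherwise the rest of the list would become unreachable:

```python
class Node:
    """A node of a singly linked list."""
    def __init__(self, value, next=None):
        self.value = value
        self.next = next

def insert_after(prev_node, value):
    """O(1) insertion: splice a new node right after prev_node."""
    new_node = Node(value, next=prev_node.next)  # new element -> following element
    prev_node.next = new_node                    # previous element -> new element
    return new_node

def delete_after(prev_node):
    """O(1) deletion: bypass the node that follows prev_node."""
    removed = prev_node.next
    if removed is not None:
        prev_node.next = removed.next
    return removed
```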

### Floyd's algorithm

Floyd's algorithm, also known as the tortoise and hare algorithm, is able to detect the starting point of a cycle in a linked list.

It finds the start of the cycle using the two-pointer method in 3 steps, with a time complexity of $\mathcal{O}(n)$ and a space complexity of $\mathcal{O}(1)$:

*Meeting step:* We define two pointers:

- a slow pointer (tortoise) $p_T$ that moves 1 step at a time
- a fast pointer (hare) $p_H$ that moves 2 steps at a time

The tortoise and the hare both start from the head of the linked list and respectively travel distances $d_T$ and $d_H$ until they meet. At the meeting point, these quantities verify $d_H=2d_T$.

We note $\Delta_1$ the difference between the distances traveled by the hare and the tortoise:

$$\Delta_1\triangleq d_H-d_T=d_T$$

Given that the two animals necessarily meet somewhere in the cycle, the hare must have run a whole number of extra laps around the cycle compared to the tortoise. Noting $D$ the length of the cycle and $n_1, n_2$ the numbers of laps completed by the tortoise and the hare respectively, we can also write:

$$d_H=d_T+(n_2-n_1)D$$

This means that we can rewrite $\Delta_1$ as:

$$\Delta_1\triangleq d_H-d_T=(n_2-n_1)D$$

Hence, we have:

$$\boxed{d_T=(n_2-n_1)D}$$

*Restart step:* Now, we keep the hare at the meeting point and place the tortoise back at the head of the linked list. Since the tortoise has moved back by a distance $d_T$, the new distance $\Delta_2$ between them is such that:

$$\Delta_2\triangleq\Delta_1+d_T=2d_T=2(n_2-n_1)D$$

Hence, we have:

$$\boxed{\Delta_2=kD}\quad\textrm{with }k\in\mathbb{N}^*$$

*Detection step:* We make each animal move 1 step at a time, thus keeping the distance $\Delta_2$ between them constant. Since $\Delta_2$ is a multiple of the length $D$ of the cycle, the new meeting point of the two animals is precisely the start of the cycle.

This algorithm can be used to detect cycles and to compute the lengths of portions of a linked list, among other use cases.
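The three steps above can be sketched as follows (a minimal implementation under the stated assumptions; the function name `cycle_start` is my own):

```python
class Node:
    """A node of a singly linked list."""
    def __init__(self, value):
        self.value = value
        self.next = None

def cycle_start(head):
    """Return the first node of the cycle, or None if the list is acyclic.
    Runs in O(n) time and O(1) space."""
    tortoise = hare = head
    # Meeting step: tortoise moves 1 step at a time, hare moves 2 steps.
    while hare is not None and hare.next is not None:
        tortoise = tortoise.next
        hare = hare.next.next
        if tortoise is hare:
            break
    else:
        return None  # the hare reached the end of the list: no cycle
    # Restart step: place the tortoise back at the head.
    tortoise = head
    # Detection step: move both 1 step at a time; they meet at the cycle start.
    while tortoise is not hare:
        tortoise = tortoise.next
        hare = hare.next
    return tortoise
```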

### Duplicate number

Given an array $A=[a_{0}, ..., a_{n}]$ of size $n+1$ where each element $a_i$ is in the range $[\![1, n]\!]$, suppose there exist exactly two elements $a_{i_1}, a_{i_2}$ (with $i_1 \neq i_2$) such that $a_{i_1} = a_{i_2} \triangleq a_{\textrm{dup}}$. All other elements are assumed to be distinct. The goal of the problem is to find the duplicate value $a_{\textrm{dup}}$.

A naive approach would sort the array and scan the result to find the two consecutive elements that are equal. This would take $\mathcal{O}(n\log(n))$ time.

However, a more efficient approach takes $\mathcal{O}(n)$ time and leverages Floyd's algorithm.

**Trick** Suppose we represent the array by $n+1$ distinct nodes that compose one or more linked lists. Each element $a_i$ of the array $A$ is seen as a node $i$ that points to node $a_i$.

We make the following observations:

- For $i\in[\![0, n]\!]$, $a_i$ covers all values between $1$ and $n$. As a result:
  - The $n+1$ nodes point to exactly $n$ distinct nodes, so every node has an outgoing pointer, which guarantees that each resulting linked list has a cycle.
  - No node points to node 0, which means that the linked list starting from node 0 is not itself circular. Linked lists that do not contain node 0 (if any) are necessarily circular.
- Both nodes $i_1$ and $i_2$ point to node $a_{\textrm{dup}}$. As a result:
  - The desired duplicate number is at the start of a cycle.
  - The linked list containing the duplicate number is necessarily non-circular, since node $a_{\textrm{dup}}$ has two incoming pointers.

From the two points above, we deduce that the linked list starting from node 0 contains $a_{\textrm{dup}}$ at the start of its cycle.

**Algorithm** The duplicate number is found in $\mathcal{O}(n)$ time and $\mathcal{O}(1)$ space:

- *Initialization:* Think of the array $A$ in terms of linked list(s) using the trick above. Note that no extra space is needed: for a given element $a_i$, the next element can be accessed via $a_{a_i}$.
- *Compute step:* Apply Floyd's algorithm from node $0$, which takes $\mathcal{O}(n)$ time and $\mathcal{O}(1)$ space.
- *Final step:* The start of the cycle is the duplicate number $a_{\textrm{dup}}$.
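The steps above can be sketched as follows (illustrative; the function name `find_duplicate` is my own). The "next" operation of the implicit linked list is simply an array lookup, so no extra space is used:

```python
def find_duplicate(A):
    """Return the duplicated value in A, an array of n+1 integers in [1, n].
    Runs in O(n) time and O(1) space via Floyd's algorithm."""
    # Meeting step: node i points to node A[i]; start both pointers at node 0.
    tortoise = hare = 0
    while True:
        tortoise = A[tortoise]     # 1 step
        hare = A[A[hare]]          # 2 steps
        if tortoise == hare:
            break
    # Restart + detection steps: the new meeting point is the start
    # of the cycle, which is exactly the duplicate value.
    tortoise = 0
    while tortoise != hare:
        tortoise = A[tortoise]
        hare = A[hare]
    return tortoise
```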

## Doubly linked lists

### Definition

A doubly linked list is a data structure composed of nodes, where each node carries the information of:

- a value $v$
- a $\texttt{next}$ pointer that points to the next node
- a $\texttt{prev}$ pointer that points to the previous node

The first and last nodes of a doubly linked list are called the head and tail respectively. The overall data structure is represented as follows:
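In code, a doubly linked list node can be sketched as follows (a minimal illustration; the class name `DNode` is my own):

```python
class DNode:
    """A node of a doubly linked list."""
    def __init__(self, value):
        self.value = value
        self.prev = None  # pointer to the previous node (None if this node is the head)
        self.next = None  # pointer to the next node (None if this node is the tail)

# Build the list 1 <-> 2 <-> 3 by wiring both directions.
head, middle, tail = DNode(1), DNode(2), DNode(3)
head.next, middle.prev = middle, head
middle.next, tail.prev = tail, middle
```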

### Operations

The main operations that can be performed on a doubly linked list are explained below:

Type | Time | Description |
---|---|---|
Access | O(n) | Starting from the head, the linked list is traversed sequentially until the desired node is reached. If it is the last element, the whole linked list is traversed. |
Search | O(n) | Starting from the head, the linked list is traversed sequentially until the desired node is found. If the node is not present, the whole linked list is traversed. |
Insertion | O(1) | 1. Assign the $\texttt{next}$ pointer of: a. the new element to the following element; b. the previous-to-be element to the newly-inserted node. 2. Assign the $\texttt{prev}$ pointer of: a. the new element to the previous-to-be element; b. the following element to the new element. |
Deletion | O(1) | 1. Assign the $\texttt{next}$ pointer of the previous node to the following node. 2. Assign the $\texttt{prev}$ pointer of the following node to the previous node. |
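These pointer manipulations can be sketched as follows (an illustrative sketch; the helper names `insert_after` and `delete` are my own):

```python
class DNode:
    """A node of a doubly linked list."""
    def __init__(self, value):
        self.value = value
        self.prev = None
        self.next = None

def insert_after(prev_node, new_node):
    """O(1) insertion of new_node right after prev_node."""
    following = prev_node.next
    new_node.next = following      # next pointer: new element -> following element
    new_node.prev = prev_node      # prev pointer: new element -> previous-to-be element
    prev_node.next = new_node      # next pointer: previous-to-be element -> new element
    if following is not None:
        following.prev = new_node  # prev pointer: following element -> new element

def delete(node):
    """O(1) deletion: rewire the neighbors of node around it."""
    if node.prev is not None:
        node.prev.next = node.next
    if node.next is not None:
        node.next.prev = node.prev
```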

### LRU cache

A *Least Recently Used* (LRU) cache of capacity $C$ is a bounded-size structure that keeps track of the $C$ most recently used items. An item is said to be "used" if it is either accessed or inserted.

It is a classic example of an efficient combination of doubly linked lists with hash tables, where:

- The doubly linked list provides an ordering of elements with respect to their last use, with the most recently used ones being closer to the head.
- The hash table maps values to nodes and enables access to existing nodes in $\mathcal{O}(1)$ time.

**Access** The retrieval operation is done in $\mathcal{O}(1)$ time as follows:

- *Identification step:* Identify the node using the hash table.
- *Update step:* Convey the fact that the node is now the most recently used element:
  1. Connect the neighbors of the element to each other.
  2. Bring the element to the front of the doubly linked list.

**Insertion** This operation is achieved in $\mathcal{O}(1)$ time and depends on the situation:

- *Inserted element already in cache:* When a cached element is inserted, that element is brought to the front of the linked list in the same fashion as when the element is accessed.
- *Inserted element not in cache:* When a new element $e$ is inserted, we have the following cases:
  - *If the cache is below capacity:* Insert the new element at the front of the doubly linked list and add it to the hash table.
  - *If the cache is at capacity:* Remove the least recently used element to welcome the new node:
    1. *Removal step:* Remove from the hash table the entry corresponding to the least recently used element and update the tail of the doubly linked list.
    2. *Insertion step:* Insert the new element at the front of the doubly linked list and add it to the hash table.
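The combination described above can be sketched as follows (a minimal sketch, not a production implementation; the class names and the use of sentinel head/tail nodes are my own choices, and the hash table here maps keys to nodes):

```python
class _DNode:
    """A doubly linked list node holding a (key, value) pair."""
    def __init__(self, key=None, value=None):
        self.key, self.value = key, value
        self.prev = self.next = None

class LRUCache:
    """LRU cache: hash table for O(1) lookup + doubly linked list for recency order."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.table = {}           # key -> node
        self.head = _DNode()      # sentinel on the most recently used side
        self.tail = _DNode()      # sentinel on the least recently used side
        self.head.next, self.tail.prev = self.tail, self.head

    def _unlink(self, node):
        # Connect the neighbors of the element to each other.
        node.prev.next, node.next.prev = node.next, node.prev

    def _push_front(self, node):
        # Bring the element to the front (most recently used position).
        node.prev, node.next = self.head, self.head.next
        self.head.next.prev = node
        self.head.next = node

    def get(self, key):
        """O(1) access: return the cached value, or None if absent."""
        if key not in self.table:
            return None
        node = self.table[key]
        self._unlink(node)
        self._push_front(node)
        return node.value

    def put(self, key, value):
        """O(1) insertion, evicting the least recently used element if needed."""
        if key in self.table:
            node = self.table[key]
            node.value = value
            self._unlink(node)
            self._push_front(node)
            return
        if len(self.table) == self.capacity:
            lru = self.tail.prev       # least recently used element
            self._unlink(lru)
            del self.table[lru.key]
        node = _DNode(key, value)
        self.table[key] = node
        self._push_front(node)
```

In practice, the same behavior can also be obtained with `collections.OrderedDict` in Python, but the explicit version above mirrors the structure described in this section.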
