Linked lists
By Afshine Amidi and Shervine Amidi
Singly linked lists
Definition
A singly linked list is a data structure composed of nodes, where each node carries the information of:
- a value
- a field, that points to the next node
The first node of a singly linked list is called the head. The overall data structure is represented as follows:
Operations
The main operations that can be performed on a singly linked list are explained below:
Type | Time | Description | Illustration |
---|---|---|---|
Access | O(n) | Starting from the head, the linked list is traversed in a sequential way to access the desired node. If it is the last element, the whole linked list is traversed. | ![]() |
Search | O(n) | Starting from the head, the linked list is traversed in a sequential way to search for the desired node. If the node is not found, the whole linked list is traversed. | ![]() |
Insertion | O(1) | 1. Assign the next pointer of the previous-to-be element to the newly-inserted node. 2. Assign the next pointer of the new element to the following element. | ![]() |
Deletion | O(1) | Assign the next pointer of the previous node to the following node of the soon-to-be-removed node. | ![]() |
Floyd's algorithm
Floyd's algorithm, also known as the tortoise and hare algorithm, is able to detect the starting point of a cycle in a linked list.
It finds the start of the cycle using the two-pointer method in 3 steps, with a time complexity of and a space complexity of :
Meeting step: We define two pointers:
A slow pointer (tortoise) that goes 1 step at a time.
A fast pointer (hare) that goes 2 steps at a time.
The tortoise and the hare both start from the head of the linked list and respectively travel distances and until they meet. At the meeting point, these quantities verify .
We note the difference between the distances traveled by the hare and the tortoise:
Given that the two animals necessarily meet somewhere in the cycle, we can also write:
This means that we can rewrite as:
Hence, we have:
Restart step: Now, we keep the hare at the meeting point and place the tortoise back to the head of the linked list.
Since the tortoise has moved back by a distance , the new distance between them is such that:
Hence, we have:
Detection step: We make each animal move 1 step at a time, thus keeping the distance between them constant.
Since is a multiple of the length of the cycle, the new meeting point between the two animals will precisely be the start of the cycle.
This algorithm can be used to detect cycles, length of portions of linked lists, among other use cases.
Duplicate number
Given an array of size where each element is in the range , suppose there exists exactly two elements such that . All other elements are assumed to be distinct. The goal of the problem is to find the duplicate value .
A naive approach would sort the array and scan the result to find the two consecutive elements that are equal. This would take time.
However, a more efficient approach takes time and leverages Floyd's algorithm.
Trick Suppose we represent the array by distinct nodes that compose one or more linked lists. Each element of the array is seen as a node that points to node .
We make the following observations:
For , covers all values between and . As a result:
The nodes point to exactly of its nodes, which guarantees that each resulting linked list has a cycle.
No node points to node 0, which means that the cycle in the linked list starting from node 0 is not circular. Linked lists that do not contain node 0 (if any) are necessarily circular.
Both the nodes and point to node . As a result:
The desired duplicate number is at the start of a cycle.
The cycle containing the duplicate number is necessarily non-circular.
From the two points above, we deduce that the linked list starting from node 0 contains at the start of its cycle.
Algorithm The duplicate number is found in time and space:
Initialization: Think of array in terms of linked list(s) using the trick above. Note that no extra-space is needed. For a given element , the next element can be accessed via .
Compute step: Apply Floyd's algorithm from node , which takes time and space.
Final step: The start of the cycle is the duplicate number .
Doubly linked lists
Definition
A doubly linked list is a data structure composed of nodes, where each node carries the information of:
- a value
- a pointer that points to the next node
- a pointer that points to the previous node
The first and last nodes of a doubly linked list are called the head and tail respectively. The overall data structure is represented as follows:
Operations
The main operations that can be performed on a doubly linked list are explained below:
Type | Time | Description | Illustration |
---|---|---|---|
Access | O(n) | Starting from the head, the linked list is traversed in a sequential way to access the desired node. If it is the last element, the whole linked list is traversed. | ![]() |
Search | O(n) | Starting from the head, the linked list is traversed in a sequential way to search for the desired node. If the node is not found, the whole linked list is traversed. | ![]() |
Insertion | O(1) | 1. Assign the next pointer of: a. The previous-to-be element to the newly-inserted node. b. The new element to the following element. 2. Assign the prev pointer of: a. The new element to the previous-to-be element. b. The following element to the new element. | ![]() |
Deletion | O(1) | 1. Assign the next pointer of the previous node to the following node. 2. Assign the prev pointer of the following node to the previous node. | ![]() |
LRU cache
A Least Recently Used (LRU) cache of capacity is a bounded-size structure that keeps track of the most recently used items. An item is said to be "used" if it is either accessed or inserted.
It is a classic example of an efficient combination of doubly linked lists with hash tables, where:
The doubly linked list provides an ordering of elements with respect to their last use, with the most recently used ones being closer to the head.
The hash table maps values to nodes and enables access times to existing nodes in .
Access The retrieval operation is done in time as follows:
Identification step: Identify the node using the hash table.
Update step: Convey the fact that the node is now the most recently used element.
Connect the neighbors of the element between them.
Bring the element to the front of the doubly linked list.
Insertion This operation is achieved in time and depends on the situation:
Inserted element already in cache. When a cached element is inserted, that element is taken to the front of the linked list in the same fashion as when the element is accessed.
Inserted element not in cache. When a new element is inserted, we have the following cases:
If the cache is below capacity: Insert the new element at the front of the doubly linked list and add it to the hash table.
If the cache is at capacity: Remove the least recently used element to welcome the new node.
Removal step: Remove from the hash table the entry corresponding to the element that was the least used and update the tail of the doubly linked list.
Insertion step: Insert the new element at the front of the doubly linked list and add it to the hash table.
Subscribe here to be notified of new Super Study Guide releases!