Skip to main content

Linked lists

By Afshine Amidi and Shervine Amidi

Singly linked lists

Definition

A singly linked list is a data structure composed of nodes, where each node carries the information of:

  • a value vv
  • a next\texttt{next} field, that points to the next node

The first node of a singly linked list is called the head. The overall data structure is represented as follows:

Operations

The main operations that can be performed on a singly linked list are explained below:

TypeTimeDescriptionIllustration
AccessO(n)Starting from the head, the linked list is traversed in a sequential way to access the desired node. If it is the last element, the whole linked list is traversed.
SearchO(n)Starting from the head, the linked list is traversed in a sequential way to search for the desired node. If the node is not found, the whole linked list is traversed.
InsertionO(1)1. Assign the next pointer of the previous-to-be element to the newly-inserted node.
2. Assign the next pointer of the new element to the following element.
DeletionO(1)Assign the next pointer of the previous node to the following node of the soon-to-be-removed node.

Floyd's algorithm

Floyd's algorithm, also known as the tortoise and hare algorithm, is able to detect the starting point of a cycle in a linked list.

It finds the start of the cycle using the two-pointer method in 3 steps, with a time complexity of O(n)\mathcal{O}(n) and a space complexity of O(1)\mathcal{O}(1):

  • Meeting step: We define two pointers:

    • A slow pointer (tortoise) pTp_T that goes 1 step at a time.

    • A fast pointer (hare) pHp_H that goes 2 steps at a time.

      The tortoise and the hare both start from the head of the linked list and respectively travel distances dTd_T and dHd_H until they meet. At the meeting point, these quantities verify dH=2dTd_H=2d_T.

      We note Δ1\Delta_1 the difference between the distances traveled by the hare and the tortoise:

      Δ1dHdT=dT\Delta_1\triangleq d_H-d_T=d_T

      Given that the two animals necessarily meet somewhere in the cycle, we can also write:

      This means that we can rewrite Δ1\Delta_1 as:

      Δ1dHdT=(n2n1)D\Delta_1\triangleq d_H-d_T=(n_2-n_1)D

      Hence, we have:

      dT=(n2n1)D\boxed{d_T=(n_2-n_1)D}
  • Restart step: Now, we keep the hare at the meeting point and place the tortoise back to the head of the linked list.

    Since the tortoise has moved back by a distance dTd_T, the new distance Δ2\Delta_2 between them is such that:

    Δ2Δ1+dT=2dT=2(n2n1)D\Delta_2\triangleq\Delta_1+d_T=2d_T=2(n_2-n_1)D

    Hence, we have:

    Δ2=kDwith kN\boxed{\Delta_2=kD}\quad\textrm{with }k\in\mathbb{N}^*
  • Detection step: We make each animal move 1 step at a time, thus keeping the distance Δ2\Delta_2 between them constant.

    Since Δ2\Delta_2 is a multiple of the length DD of the cycle, the new meeting point between the two animals will precisely be the start of the cycle.

Remark

This algorithm can be used to detect cycles, length of portions of linked lists, among other use cases.

Duplicate number

Given an array A=[a0,...,an]A=[a_{0}, ..., a_{n}] of size n+1n+1 where each element aia_i is in the range [ ⁣[1,n] ⁣][\![1, n]\!], suppose there exists exactly two elements ai1,ai2a_{i_1}, a_{i_2} such that ai1=ai2adupa_{i_1} = a_{i_2} \triangleq a_{\textrm{dup}}. All other elements are assumed to be distinct. The goal of the problem is to find the duplicate value adupa_{\textrm{dup}}.

A naive approach would sort the array and scan the result to find the two consecutive elements that are equal. This would take O(nlog(n))\mathcal{O}(n\log(n)) time.

However, a more efficient approach takes O(n)\mathcal{O}(n) time and leverages Floyd's algorithm.

Trick Suppose we represent the array by n+1n+1 distinct nodes that compose one or more linked lists. Each element aia_i of the array AA is seen as a node ii that points to node aia_i.

We make the following observations:

  • For i[ ⁣[0,n] ⁣]i\in[\![0, n]\!], aia_i covers all values between 11 and nn. As a result:

    • The n+1n+1 nodes point to exactly nn of its nodes, which guarantees that each resulting linked list has a cycle.

    • No node points to node 0, which means that the cycle in the linked list starting from node 0 is not circular. Linked lists that do not contain node 0 (if any) are necessarily circular.

  • Both the nodes i1i_1 and i2i_2 point to node adupa_{\textrm{dup}}. As a result:

    • The desired duplicate number is at the start of a cycle.

    • The cycle containing the duplicate number is necessarily non-circular.

From the two points above, we deduce that the linked list starting from node 0 contains adupa_{\textrm{dup}} at the start of its cycle.

Algorithm The duplicate number is found in O(n)\mathcal{O}(n) time and O(1)\mathcal{O}(1) space:

  • Initialization: Think of array AA in terms of linked list(s) using the trick above. Note that no extra-space is needed. For a given element aia_i, the next element can be accessed via aaia_{a_i}.

  • Compute step: Apply Floyd's algorithm from node 00, which takes O(n)\mathcal{O}(n) time and O(1)\mathcal{O}(1) space.

  • Final step: The start of the cycle is the duplicate number adupa_\textrm{dup}.

Doubly linked lists

Definition

A doubly linked list is a data structure composed of nodes, where each node carries the information of:

  • a value vv
  • a next\texttt{next} pointer that points to the next node
  • a prev\texttt{prev} pointer that points to the previous node

The first and last nodes of a doubly linked list are called the head and tail respectively. The overall data structure is represented as follows:

Operations

The main operations that can be performed on a doubly linked list are explained below:

TypeTimeDescriptionIllustration
AccessO(n)Starting from the head, the linked list is traversed in a sequential way to access the desired node. If it is the last element, the whole linked list is traversed.
SearchO(n)Starting from the head, the linked list is traversed in a sequential way to search for the desired node. If the node is not found, the whole linked list is traversed.
InsertionO(1)1. Assign the next pointer of:
a. The previous-to-be element to the newly-inserted node.
b. The new element to the following element.
2. Assign the prev pointer of:
a. The new element to the previous-to-be element.
b. The following element to the new element.
DeletionO(1)1. Assign the next pointer of the previous node to the following node.
2. Assign the prev pointer of the following node to the previous node.

LRU cache

A Least Recently Used (LRU) cache of capacity CC is a bounded-size structure that keeps track of the CC most recently used items. An item is said to be "used" if it is either accessed or inserted.

It is a classic example of an efficient combination of doubly linked lists with hash tables, where:

  • The doubly linked list provides an ordering of elements with respect to their last use, with the most recently used ones being closer to the head.

  • The hash table maps values to nodes and enables access times to existing nodes in O(1)\mathcal{O}(1).

Access The retrieval operation is done in O(1)\mathcal{O}(1) time as follows:

  • Identification step: Identify the node using the hash table.

  • Update step: Convey the fact that the node is now the most recently used element.

    • Connect the neighbors of the element between them.

    • Bring the element to the front of the doubly linked list.

Insertion This operation is achieved in O(1)\mathcal{O}(1) time and depends on the situation:

  • Inserted element already in cache. When a cached element is inserted, that element is taken to the front of the linked list in the same fashion as when the element is accessed.

  • Inserted element not in cache. When a new element ee is inserted, we have the following cases:

    • If the cache is below capacity: Insert the new element at the front of the doubly linked list and add it to the hash table.

    • If the cache is at capacity: Remove the least recently used element to welcome the new node.

      • Removal step: Remove from the hash table the entry corresponding to the element that was the least used and update the tail of the doubly linked list.

      • Insertion step: Insert the new element at the front of the doubly linked list and add it to the hash table.


Want more content like this?

Subscribe here to be notified of new Super Study Guide releases!