Arrays and strings
By Afshine Amidi and Shervine Amidi
Arrays
Definition
An array is a data structure of fixed size that stores elements in a sequential way. Each element can be accessed by its index .
Most programming languages start their indices from 0. The resulting arrays are said to be 0-indexed.
Operations
The main operations that can be performed on an array are explained below:
Type | Time | Description | Illustration |
---|---|---|---|
Access | O(1) | Using index i, we can directly access the ith element of the array with A[i]. | ![]() |
Search | O(n) | We need to search the array by checking each element one by one until finding the desired value. | ![]() |
Insertion | O(n) | 1. Elements at indices i and up are moved to the right. 2. The new element is inserted at index i. Note that if there is no space for the new element to be added in the existing array, we need to create a bigger array and copy existing elements over there. | ![]() |
Deletion | O(n) | 1. Elements at indices i+1 and up are moved to the left. 2. The former last element of the array is either ignored or removed. | ![]() |
Kadane's algorithm
Kadane's algorithm computes the maximum subarray sum of a given array .
Trick Scan the array and keep track of the following quantities:
: the maximum subarray sum of up to index that contains .
: the maximum subarray sum of up to index .
Algorithm The solution is found in time and space:
Initialization: Set .
Update step: For each index , compute by distinguishing two cases:
Case : Appending to the currently tracked subarray sum is worse than starting over. We set .
Case : It is worth continuing on the current subarray and appending to it. We set .
Then, the maximum subarray sum seen so far is updated if a new maximum is detected:
In the end, we output the maximum subarray sum of the array:
In order to guarantee a space complexity of , values of and are updated in place at each step .
Merge intervals
The merge intervals problem is a classic problem that aims at producing an array of non-overlapping intervals based on an array of intervals that are potentially overlapping.
We note:
the input array of potentially-overlapping intervals, with:
the output array of non-overlapping intervals, with:
The output array is obtained in time and space:
Initialization:
The input array is sorted with respect to the lower bound of each interval in time. Without loss of generality, we renumber intervals so that .
The output array is initialized with .
Update step: Starting from , we would like to incorporate into the output array. By noting the last added interval of , we have the following cases:
Case : does not add any new information since it is entirely overlapping with the last added interval. We discard .
Case : is partially overlapping with the last added interval, so we update the current upper bound to .
Case : There is no overlap between and the last added interval, so we add to the final array at index .
We repeat the update step for all intervals in a sequential way to obtain the output array of non-overlapping intervals .
Strings
Definition
A string is a data structure that can be seen as an array of characters .
The following table summarizes a few categories of strings that have special properties:
Type | Description | Example |
---|---|---|
Palindrome | Reads the same backward and forward. | kayak, madam, radar |
Anagram | Forms another string using a rearrangement of its letters. | listen, silent |
Strobogrammatic | Remains the same when rotated by 180 degrees. | 69, 818, 1001 |
Longest substring
Given a string , the goal of this problem is to find the length of the longest substring that does not contain repeated characters.
A naive approach would check the constraint on every possible substring using a nested \texttt{for}-loop in time. However, there is a more efficient approach that uses the sliding window trick.
Trick Use a left pointer and a right pointer to delimit a window which has the constraint to not have repeated characters. We have the following rules:
Constraint is not violated: The current solution may have not reached its full potential yet. The pointer advances until the constraint is violated.
Constraint is violated: The current solution is not valid anymore, so we need to trim the solution down. The pointer advances until the constraint is no longer violated.
Algorithm The solution is found in time and space:
Initialization:
Initialize an empty hash set that aims at storing the characters that are part of the current substring.
Set a global counter to 0, which keeps track of the maximum substring length encountered so far.
Set the left pointer and right pointer to 0.
Compute step: is the new character proposed to be added to the current substring.
If is already in the hash set, then adding it again would make us have twice in the substring.
Increment and remove from the hash set until is not there anymore. The substring gets trimmed down.
Now, the updated substring is ready to have without violating the constraint. Add to the hash set.
Check whether the length of the resulting substring is greater than the counter . If it is, store the new maximum in .
Increment by 1.
Repeat this process until reaches the end of the string.
- Final step: is the length of the biggest substring without repeated characters.
The sliding window trick is found in many string problems.
Subscribe here to be notified of new Super Study Guide releases!