Data Structures and Algorithms

Intro

Each section has code examples (see "Code Examples') implementing the corresponding algorithm, all of which (plus many more not covered in this blog post - if interested) can be found at my GitHub repo: https://github.com/MattCrook/algorithms-practice. The code examples consist of simply going through and explaining the algorithm and its implementation, as well as "challenges" (denoted by "Challenge" in the link title) which are coding problems you'd commonly see in interviews, which implement the specific algorithm.

Sections:

Stacks

The stack data structure is precisely what it sounds like: a stack of data. In certain types of scenarios, it can be favorable to store data in a stack rather than in an array. A stack uses LIFO (last-in, first- out) ordering. That is, as in a stack of dinner plates. The most recent item added to the stack is the first item to be removed.

Unlike an array, a stack doesn't offer a consent-constant time access to the ith item. However, it does allow constant-time adds and removes as it doesn't require shifting elements around. One case where stacks are often useful is in certain recursive algorithms. Sometimes you need to push temporary data onto a stack as you recurse, but then remove them as you backtrack (for example, because the recursive check failed). A stack offers an intuitive way to do this. A stack can also be used to implement a recursive algorithm iteratively.

Code Examples


Queues

A Queue implements FIFO (first-in, first-out) ordering. As in a line or queue at a ticket stand, items are removed from the data structure in the same order that they are added. A queue can also be implemented with a Linked List. In fact, they are essentially the same thing as long as items are added and removed from opposite sides.

One thing to note with a queue is, it is especially easy to mess up the updating of the first and last nodes. One place are often used in breadth-first search, or an implementing a cache. In breadth-first search for example, we use a queue to store a list of nodes that we need to process. Each time we process a node, we add its adjacent nodes to the back of the queue. This allows us to process nodes in the order in which they are viewed.

Code Examples


Graph Search and Graph Traversal

Adjacency List

An adjacency list is the most common common way to represent a graph. Every vertex (or node) stores a list of adjacent vertices. In an undirected graph, an edge like (a, b) would be stored twice: once in a's adjacent vertices, and once in b's adjacent vertices.

Adjacency Matrix

An adjacency matrix is N x N boolean matrix (where N is the number of nodes), where a true value at matrix[i][j] indicates an edge from node i to node j. You can also use an integer matrix with 0's and 1's. In an undirected graph, an adjacency matrix will be symmetric. In a directed graph, it will not necessarily be. The graph algorithms that are used on adjacency lists can be performed with adjacency matrices, but they may be somewhat less efficient. In the adjacency list representation, you can easily iterate through the neighbors of a node. In the adjacency matrix representation, you will need to iterate through all the nodes to identify a node's neighbors.

Algorithms

Two common ways to search a graph are:

  • Depth-First Search (DFS)
  • Breath-First Search (BFS)
  • In Depth-First Search (DFS), we start at the root (or another arbitrarily selected node) and explore each branch completely before moving onto the next branch. That is, we go deep first (hence the name depth- first search) before we go wide. In DFS we visit a node `a`, and then iterate through each of `a's` neighbors. When visiting a node `b` that is a neighbor of `a`, we visit all of `b's` neighbors before going onto `a's` other neighbors. That is, `a` exhaustively searches `b's` branch before any of its other neighbors. The difference in this algorithm for a graph, is that we must check if the node has been visited. If we do not, we risk getting stuck in an infinite loop.

    In Breath-First Search, we start at the root (or another arbitrarily selected note) and explore each neighbor before going onto any of their children. That is, we go wide (hence the name breath- first) before we go deep. BFS is a bit less intuitive, and it is easy to struggle with the implementation unless you are already familiar with it. The main tripping point is the false assumption that BFS is recursive. It is not, it uses a queue. In BFS, node a visits each of a's neighbors before visiting any of their neighbors. You can think of this as searching level by level out from a. An iterative solution involving a queue usually works best.

    Breath-First Search (BFS) and Depth-First Search (DFS) tend to be used in different scenarios.
    • DFS is often preferred if we want to visit every node in the graph. Both will work fine but depth first search is a bit simpler.
    • However, if we want to find the shortest path or just any path between two notes BFS is generally better.

    Code Examples
    Extra Examples


    When we think of searching algorithms, we generally think of a binary search. In a binary search, we look for an element X in a sorted array by first comparing X to the midpoint of the array. If X is less than the midpoint, then we search the left half of the array. If X is greater than the midpoint, then we searched the right half of the array. We then repeat this process treating the left and right has a sub arrays. Again, we compare X to the midpoint of this sub array, and then search either it's left or right side. We repeat this process until either we find X, or the sub array has a size of 0.

    There are two main ways to perform this type of search. One using a while loop (similar to a queue algorithm), and another using recursion. There are code examples of both provided below. Important to note, though this algorithm may seem somewhat simple on the surface, getting the details correct (especially the plus and minuses (see code examples) when deciding on which direction of a sub array to search next) is more tricky than one might think. (In my opinion).

    Code Examples
    Extra Examples

    Linked Lists

    A Linked List is a data structure that represents a sequence of nodes. In a singly Linked List, each Node points to the next Node in the Linked List. A doubly Linked List gives each Node pointers to both the next Node and the previous Node.

    Unlike an array, a linked list does not provide constant time access to a particular index within the list. This means that if you'd like to find the Kth element in the list, you will need to iterate through K elements. The benefit of a Linked List is that you can add and remove items from the beginning of the list and constant time. For specific applications, this can be useful.

    A circular Linked List connects the last Node back to the first Node.

    Code Examples


    HashMap

    A HashMap or Hash Table is a data structure that maps keys to values for highly efficient lookups. There are a number of ways of implementing this.

    A simple implementation is we can use an array of linked lists, and a hash code function. To insert a key which might be a string or essentially any other data type and value, we first compute the key's hash code. Then, map the hash code to an index in the array. At this index, there is a linked list of keys and values. Store the key and value in this index. (We must use a linked list because of collisions.) Then, to retrieve the value pair by its key, we repeat the process: compute the hash code from the key, then compute the index from the hash code, then search through the linked list for the value with this key.

    Basically, it’s an array of linked list. Ideally, each item in the array is a linked list of just 1 node. So you can get O(1) complexity to perform get and put by calculating the index of the array by hashing the key of the hashmap, and get the single item in the linked list. However, because hash function doesn’t have to create a unique hash code for different values, when more than 1 key have the same hash code, they are added to the linked list. So, to get to the correct key, you’ll need to traverse the linked list and compare the key one by one, thus resulting in a complexity of O(n) for those keys with the same hash code.

    Alternatively, we can implement a look up system with a balanced binary search tree. This gives us O(log n) lookup time. The advantage of this is potentially using less space since, we no longer allocate a large array. We can also iterate through the keys in order; which can also be useful sometimes.

    Code Examples
    Extra Examples


    Two Sum and Three Sum (Two Pointer/ Three Pointer)

    One of the many popular algorithms is the Two Sum Algorithm. Given an array of numbers and a stand alone number, return all combinations of numbers in the array that add up to the stand alone number. Although any approach that works is technically a solve, but Big O determines which is the best answer for your application.

    The two sum problem is a common interview question, and it is a variation of the subset sum problem. The challenge is to find all the pairs of two integers in an unsorted array that sum up to a given Sum.

    The Three Sum problem involves finding unique triplets of numbers in an array that sum up to a given target. Solve it efficiently by building on top of the Two Sum problem and applying sorting and hashing approaches. This problem is a continuation of "Two Sum" in which the premise is very similar. As with that problem, for Three Sum there are naive, combination-driven approaches as well as more-efficient approaches.

    Two Sum Code Examples
    Three Sum Code Examples


    Fibonacci Sequence

    The Fibonacci sequence is the series of numbers where each number is the sum of the two preceding numbers. It starts with 0 and is followed by 1.

    The numbers in this sequence, known as the Fibonacci numbers, are denoted by F(n).

    Code Examples


    Circular Buffer

    A a circular buffer (or circular queue, cyclic buffer or ring buffer) is a data structure that uses a single, fixed-size buffer as if it were connected end-to-end. This structure lends itself easily to buffering data streams. The useful property of a circular buffer is that it does not need to have its elements shuffled around when one is consumed. (If a non-circular buffer were used then it would be necessary to shift all elements when one is consumed.)

    In other words, the circular buffer is well-suited as a FIFO (first in, first out) buffer while a standard, non-circular buffer is well suited as a LIFO (last in, first out) buffer. Circular buffering makes a good implementation strategy for a queue that has fixed maximum size. Should a maximum size be adopted for a queue, then a circular buffer is a completely ideal implementation; all queue operations are constant time. However, expanding a circular buffer requires shifting memory, which is comparatively costly. For arbitrarily expanding queues, a linked list approach may be preferred instead.

    Code Examples
    Extra Examples


    Pascal's Triangle

    In mathematics, Pascal's triangle is an infinite triangular array of the binomial coefficients which play a crucial role in probability theory, combinatorics, and algebra. The rows of Pascal's triangle are conventionally enumerated starting with row n=0 at the top (the 0th row). The entries in each row are numbered from the left beginning with k=0 and are usually staggered relative to the numbers in the adjacent rows.

    The triangle may be constructed in the following manner: In row 0 (the topmost row), there is a unique nonzero entry 1. Each entry of each subsequent row is constructed by adding the number above and to the left with the number above and to the right, treating blank entries as 0. For example, the initial number of row 1 (or any other row) is 1 (the sum of 0 and 1), whereas the numbers 1 and 3 in row 3 are added to produce the number 4 in row 4.

    Code Examples


    Valid Palindrome

    A phrase is a palindrome if, after converting all uppercase letters into lowercase letters and removing all non-alphanumeric characters, it reads the same forward and backward. Alphanumeric characters include letters and numbers.

    Code Examples


    String Manipulation

    Code Examples


    Sorting Algorithms

    Code Examples


    Simple Searching

    Code Examples


    Array/ List Manipulation Algorithms

    Code Examples


    Various Counting Algorithms

    Code Examples


    Finding Sums of Integers

    Code Examples


    Fizzbuzz

    The FizzBuzz problem is a common coding task (often used in interviews or as a programming exercise). The task is to print the numbers from 1 to 100, but for multiples of 3, print "Fizz" instead of the number, and for multiples of 5, print "Buzz". For numbers that are multiples of both 3 and 5, print "FizzBuzz".

    Code Examples


    Written: November 2, 2024