Quick Word
This cheat sheet is designed to help you prepare for your algorithms and data structures exams efficiently. It covers key topics from your course curriculum, broken down into concise sections with visualizations and code examples.
You can find practice problems in the Blind 75 section below.
You can go through the sections sequentially, but you don’t have to. Jump to the topics you need the most help with or revisit specific sections to brush up on key concepts.
Happy studying, and good luck with your exams!
Introduction
Algorithms and data structures form the foundation of computer science and are crucial for success in your CS courses. They are not just theoretical concepts - they help you understand how computers process information efficiently and are a key focus area in most software engineering technical interviews as well.
Most algorithms and data structures exams will test your understanding of:
- Data Structure Implementation & Usage: You’ll need to know when and how to use different data structures effectively. For example, understanding when to use a hash table for constant-time lookups or a binary search tree for maintaining sorted data.
- Algorithm Design & Analysis: You’ll be asked to solve problems by designing efficient algorithms, analyzing their time/space complexity, and explaining your reasoning step-by-step.
- Problem-Solving Skills: Beyond just knowing the concepts, exams test how you approach problems methodically, break them down into smaller parts, and develop optimal solutions.
- Written Communication: While getting the right answer is important, you’ll also need to clearly explain your thought process and justify your solutions in written form.
The following sections will cover all these aspects in detail, with practical examples and common exam patterns. Remember - the goal isn’t just to memorize solutions, but to understand the underlying concepts and principles that will help you solve any problem you encounter on your exam.
Let’s dive into the fundamental building blocks first.
Complexity Analysis
Complexity analysis is a fundamental concept in computer science that helps us evaluate and compare algorithms based on their resource usage (primarily time and space). It provides a framework for measuring how an algorithm’s performance scales with input size, allowing us to make informed decisions about which algorithm is best suited for a particular problem. Understanding complexity analysis is crucial for writing efficient code and is a key topic in technical interviews.
Big O Notation
Big O notation is a mathematical notation that describes the limiting behavior of a function when the argument tends towards a particular value or infinity. In computer science, it’s used to classify algorithms according to how their runtime or space requirements grow as the input size grows.
Key Concepts
- Upper Bound: Big O represents the worst-case scenario - the maximum time/space an algorithm might need.
- Growth Rate: It focuses on the growth rate of the requirements, not the exact count of operations.
- Asymptotic Behavior: We care about the trend as input size approaches infinity, not small inputs.
Rules for Big O Analysis
- Drop Constants: O(2n) becomes O(n)
- Drop Lower Order Terms: O(n² + n) becomes O(n²)
- Consider Worst Case: Unless specified otherwise
- Different Terms for Different Inputs: O(a * b) for nested loops with different inputs
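The last rule matters when two independent inputs drive separate loops. A minimal sketch (function names are illustrative) contrasting O(a + b) with O(a * b):
def print_both(list_a, list_b):
    for x in list_a:  # O(a)
        print(x)
    for y in list_b:  # O(b)
        print(y)
    # Sequential loops over different inputs: O(a + b)
def print_pairs(list_a, list_b):
    for x in list_a:  # O(a)
        for y in list_b:  # O(b) per outer iteration
            print(x, y)
    # Nested loops over different inputs: O(a * b), not O(n²)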
Examples
O(n)
def find_sum(arr):
    total = 0  # O(1)
    for num in arr:  # Loop runs n times
        total += num  # O(1) operation
    return total  # O(1)
# Total: O(n) - linear time complexity
O(n²)
def bubble_sort(arr):
    n = len(arr)
    for i in range(n):  # Outer loop: n iterations
        for j in range(n-1):  # Inner loop: n-1 iterations
            if arr[j] > arr[j+1]:
                arr[j], arr[j+1] = arr[j+1], arr[j]
    return arr
# Total: O(n²) - quadratic time complexity
O(log n)
def binary_search(arr, target):
    left = 0
    right = len(arr) - 1
    while left <= right:
        # Each iteration cuts the search space in half:
        # n -> n/2 -> n/4 -> n/8 -> ... -> 1
        # This takes log₂(n) steps to reach 1
        mid = (left + right) // 2
        if arr[mid] == target:
            return mid
        elif arr[mid] < target:
            left = mid + 1  # Search right half
        else:
            right = mid - 1  # Search left half
    return -1
# Total: O(log n) - logarithmic time complexity
# Because we divide the problem size by 2 each iteration
Time Complexity
Time complexity measures how the runtime of an algorithm grows with respect to the input size. It helps us analyze and compare algorithms independently of hardware/implementation details.
Common Complexities
- O(1) - Constant Time: Runtime stays the same regardless of input size
  - Array access/modification
  - Hash table insertion/lookup
  - Stack push/pop
- O(log n) - Logarithmic Time: Runtime grows logarithmically with input size
  - Binary search
  - Balanced binary tree operations
  - Divide and conquer algorithms
- O(n) - Linear Time: Runtime grows linearly with input size
  - Array/list traversal
  - Linear search
  - String iteration
- O(n log n) - Linearithmic Time: Combination of linear and logarithmic growth
  - Efficient sorting (Merge sort, Quick sort)
  - Heap operations
- O(n²) - Quadratic Time: Runtime grows with square of input size
  - Nested iterations
  - Bubble sort, Selection sort
  - Matrix operations
- O(2ⁿ) - Exponential Time: Runtime doubles with each additional input
  - Recursive fibonacci
  - Power set generation
  - Traveling salesman (brute force)
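As a concrete illustration of exponential growth, the naive recursive Fibonacci recomputes the same subproblems over and over; a minimal sketch:
def fib(n):
    # Each call spawns two more calls, so the call tree roughly
    # doubles at every level: O(2ⁿ) time overall
    if n <= 1:
        return n
    return fib(n - 1) + fib(n - 2)
print(fib(10))  # Output: 55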
Space Complexity
Space complexity measures how much additional memory an algorithm needs relative to input size. Like time complexity, it helps evaluate algorithm efficiency but focuses on memory usage.
Common Complexities
- O(1) - Constant Space: Fixed amount of extra space regardless of input
  - Simple variables and operations
  - Fixed-size data structures
  - In-place algorithms
- O(n) - Linear Space: Extra space grows linearly with input size
  - Creating a new array/list of size n
  - Recursive call stack with depth n
  - Hash tables with n elements
- O(n²) - Quadratic Space: Extra space grows with square of input size
  - 2D arrays/matrices of size n×n
  - Adjacency matrices for graphs
  - Some dynamic programming solutions
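Note that recursion depth counts toward space even when no explicit array is created; a minimal sketch:
def sum_recursive(arr, i=0):
    # Each recursive call adds a stack frame, so the call stack
    # reaches depth n: O(n) space despite allocating no new array
    if i == len(arr):
        return 0
    return arr[i] + sum_recursive(arr, i + 1)
print(sum_recursive([1, 2, 3, 4]))  # Output: 10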
Reference Table
| Data Structure | Search | Space | Notes |
|---|---|---|---|
| Array (Static) | O(n) | O(n) | Fast access, fixed size |
| Dynamic Array | O(n) | O(n) | Resizable array |
| Singly Linked List | O(n) | O(n) | Sequential search |
| Doubly Linked List | O(n) | O(n) | Bidirectional traversal |
| Hash Table | O(1)* | O(n) | *Average case with good hash |
| Binary Search Tree | O(h) | O(n) | h = height of tree |
| AVL Tree | O(log n) | O(n) | Self-balancing |
| Red-Black Tree | O(log n) | O(n) | Self-balancing |
| Binary Heap | O(n) | O(n) | Priority queue |
| Stack | O(n) | O(n) | LIFO structure |
| Queue | O(n) | O(n) | FIFO structure |
| Trie | O(m) | O(n*m) | m = key length |
| Graph (Adj List) | O(V+E) | O(V+E) | V = vertices, E = edges |
| Graph (Adj Matrix) | O(1) | O(V²) | Higher space complexity |
Notes:
- All space complexities assume n elements stored
- Some operations may have different best/worst case complexities
- Implementation details can affect actual performance
- Real-world performance depends on data size and patterns
Fundamental Data Structures
Data structures are the building blocks of computer science and software engineering. They provide different ways to organize and store data efficiently, each with its own strengths, trade-offs and use cases.
Let’s explore each of these in detail below.
Arrays
An array is a fundamental data structure that stores a fixed-size sequence of elements of the same data type in contiguous memory locations. Each element in an array is identified by an index or subscript, which corresponds to its position in the sequence. Arrays allow for efficient random access to elements, meaning any element can be retrieved or modified directly using its index in constant time, O(1).
Arrays are one of the most basic and widely used data structures because they offer:
- Simplicity: Easy to understand and implement.
- Efficiency: Fast access and manipulation of data.
- Predictability: Fixed size and memory layout provide consistent performance.
In programming languages like C and Java, arrays have a fixed length defined at the time of creation. In contrast, languages like Python and JavaScript offer dynamic arrays (often implemented as lists or vectors) that can grow or shrink in size.
Key Characteristics
- Homogeneous Elements: Arrays typically store elements of the same data type, ensuring consistent behavior and memory usage.
- Zero-Based Indexing: Most programming languages use zero-based indexing, where the first element is accessed with index 0.
- Contiguous Memory Allocation: Elements are stored in sequential memory addresses, which optimizes access speed but may require significant contiguous memory space.
Types of Arrays
- One-Dimensional Arrays: A simple list of elements.
# One-dimensional array of integers
numbers = [1, 2, 3, 4, 5]
# Access element at index 2
print(numbers[2]) # Output: 3
# Insert element at index 1
numbers.insert(1, 10)
print(numbers) # Output: [1, 10, 2, 3, 4, 5]
# Modify element at index 2
numbers[2] = 20
print(numbers) # Output: [1, 10, 20, 3, 4, 5]
# Delete element at index 1
del numbers[1]
print(numbers) # Output: [1, 20, 3, 4, 5]
- Multi-Dimensional Arrays: Arrays with more than one dimension.
# Two-dimensional array (matrix)
matrix = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9]
]
# Access element at row 1, column 2
print(matrix[1][2]) # Output: 6
# Insert new row
matrix.insert(1, [10, 11, 12])
print("After insert:", matrix) # Output: [[1, 2, 3], [10, 11, 12], [4, 5, 6], [7, 8, 9]]
# Modify element at row 0, column 1
matrix[0][1] = 20
print("After modify:", matrix) # Output: [[1, 20, 3], [10, 11, 12], [4, 5, 6], [7, 8, 9]]
# Delete row
del matrix[1]
print("After delete:", matrix) # Output: [[1, 20, 3], [4, 5, 6], [7, 8, 9]]
- Dynamic Arrays: Arrays that can grow or shrink in size.
# Using Python's list (dynamic array)
numbers = []
# Insert elements
numbers.append(1) # Add to end
numbers.append(2)
numbers.insert(1, 3) # Insert at index 1
print("After inserts:", numbers) # Output: [1, 3, 2]
# Access element
print("Access index 1:", numbers[1]) # Output: 3
# Modify element
numbers[1] = 4
print("After modify:", numbers) # Output: [1, 4, 2]
# Delete elements
numbers.pop() # Remove last element
numbers.pop(0) # Remove first element
print("After deletes:", numbers) # Output: [4]
Common Use Cases
- Implementation of Other Data Structures:
  - Stacks and queues
  - Hash tables (for chaining)
  - Adjacency lists for graphs
- Memory Management:
  - Memory allocators
  - Garbage collection algorithms
- System Programming:
  - Process scheduling in operating systems
  - File systems
- Applications:
  - Undo functionality in software
  - Music playlists
  - Browser history navigation
  - Image viewers (previous/next functionality)
Common Exam Topics
- Insertion and deletion of elements
- Accessing elements by index
- Traversing the array
- Finding the maximum or minimum element
- Reversing the array
Linked Lists
A linked list is a fundamental data structure that represents a sequence of elements, called nodes, where each node contains two main components:
- Data: The value or information stored in the node.
- Reference (Pointer): A link to the next node in the sequence (and possibly the previous node in the case of doubly linked lists).
Unlike arrays, linked lists do not store elements in contiguous memory locations. Instead, each node is allocated separately in memory and connected via pointers, forming a chain-like structure. This allows for efficient insertion and deletion of elements without the need to shift other elements, as is required in arrays.
Key Characteristics
- Dynamic Size: Linked lists can easily grow and shrink at runtime by allocating and deallocating nodes as needed.
- Non-Contiguous Memory Allocation: Nodes are scattered in memory, connected via pointers, which can lead to non-sequential memory access patterns.
- Sequential Access: Elements must be accessed in order from the head of the list; random access is not directly supported.
- Flexibility: Efficient insertion and deletion operations, especially at the beginning and middle of the list.
Types of Linked Lists
- Singly Linked Lists: Each node points to the next node in the sequence.
# Node class definition
class Node:
    def __init__(self, data):
        self.data = data
        self.next = None

# Create and manipulate a singly linked list
head = Node(1)
second = Node(2)
third = Node(3)
head.next = second
second.next = third

# Traverse and print list
current = head
print("Initial list:", end=" ")
while current:
    print(current.data, end=" ")  # Output: 1 2 3
    current = current.next
print()

# Insert node at beginning
new_head = Node(0)
new_head.next = head
head = new_head

# Insert node in middle
new_node = Node(2.5)
current = head
while current.data != 2:
    current = current.next
new_node.next = current.next
current.next = new_node

# Delete node (remove node with value 2)
current = head
while current.next and current.next.data != 2:
    current = current.next
if current.next:
    current.next = current.next.next

# Print final list
current = head
print("Final list:", end=" ")
while current:
    print(current.data, end=" ")  # Output: 0 1 2.5 3
    current = current.next
- Doubly Linked Lists: Each node points to both the next and previous node in the sequence.
# Node class definition
class Node:
    def __init__(self, data):
        self.data = data
        self.next = None
        self.prev = None

# Create and manipulate a doubly linked list
head = Node(1)
second = Node(2)
third = Node(3)

# Link nodes both ways
head.next = second
second.prev = head
second.next = third
third.prev = second

# Traverse and print list forward
current = head
print("Forward traversal:", end=" ")
while current:
    print(current.data, end=" ")  # Output: 1 2 3
    current = current.next
print()

# Insert node at beginning
new_head = Node(0)
new_head.next = head
head.prev = new_head
head = new_head

# Insert node in middle
new_node = Node(2.5)
current = head
while current.data != 2:
    current = current.next
new_node.next = current.next
new_node.prev = current
current.next.prev = new_node
current.next = new_node

# Delete node (remove node with value 2)
current = head
while current and current.data != 2:
    current = current.next
if current:
    current.prev.next = current.next
    if current.next:
        current.next.prev = current.prev
- Circular Linked Lists: The last node points back to the first node, creating a circular structure.
# Node class definition
class Node:
    def __init__(self, data):
        self.data = data
        self.next = None

# Create a circular linked list
head = Node(1)
second = Node(2)
third = Node(3)
head.next = second
second.next = third
third.next = head  # Make it circular

# Traverse and print list
current = head
print("Initial list:", end=" ")
count = 0
while current and count < 5:  # Limit iterations to avoid infinite loop
    print(current.data, end=" ")  # Output: 1 2 3 1 2
    current = current.next
    count += 1
print()

# Insert node at beginning
new_head = Node(0)
new_head.next = head
third.next = new_head  # Update last node to point to new head
head = new_head

# Delete node (remove node with value 2)
current = head
while current.next.data != 2:
    current = current.next
current.next = current.next.next
Common Use Cases
- Data Structure Implementation:
  - Stacks and queues built on linked nodes
  - Hash table collision chains
  - Adjacency lists for graphs
- Memory Management:
  - Free lists in memory allocators
  - Dynamic memory tracking
- Algorithm Applications:
  - Polynomial arithmetic
  - Round-robin scheduling (circular lists)
  - LRU caches (doubly linked list + hash map)
- Real-world Applications:
  - Music playlists (next/previous track)
  - Browser history navigation
  - Undo/redo functionality
Common Exam Topics
- Insertion and deletion of nodes
- Traversing the list
- Searching for an element
- Finding the maximum or minimum element
- Reversing the list
Stacks
A stack is a fundamental linear data structure that follows the Last-In, First-Out (LIFO) principle. In a stack, the last element added (pushed) onto the stack is the first one to be removed (popped) from it. This behavior is analogous to a stack of books or plates, where you add and remove items from the top.
Stacks are essential in various computing processes due to their simplicity and efficiency. They provide a way to store and manage data in a particular order, ensuring that the most recently added data is accessed first.
Key Characteristics
- LIFO Behavior: The most recently added item is the first to be removed.
- One-End Access: Elements are added and removed from the same end, referred to as the top of the stack.
- Dynamic Size: The size of the stack can grow or shrink dynamically as elements are added or removed.
Terminology
- Push: Add an element to the top of the stack.
- Pop: Remove and return the element from the top of the stack.
- Peek (or Top): Retrieve the element at the top without removing it.
- IsEmpty: Check if the stack is empty.
- Size: Get the number of elements in the stack.
Implementation
# Stack implementation using a list
class Stack:
    def __init__(self):
        self.items = []

    def push(self, item):
        self.items.append(item)

    def pop(self):
        if not self.is_empty():
            return self.items.pop()
        return None

    def peek(self):
        if not self.is_empty():
            return self.items[-1]
        return None

    def is_empty(self):
        return len(self.items) == 0

    def size(self):
        return len(self.items)

# Example usage
stack = Stack()

# Push elements
stack.push(1)
stack.push(2)
stack.push(3)
print("After pushes:", stack.items)  # Output: [1, 2, 3]

# Peek at top element
print("Top element:", stack.peek())  # Output: 3

# Pop elements
print("Popped:", stack.pop())  # Output: 3
print("Popped:", stack.pop())  # Output: 2
print("Final stack:", stack.items)  # Output: [1]
print("Size:", stack.size())  # Output: 1
Common Use Cases
- Program Flow Control:
  - Function call management
  - Expression evaluation
  - Syntax parsing
  - Recursion implementation
- Memory Management:
  - Memory allocation tracking
  - Nested operations handling
  - Undo/redo functionality
  - Program state management
- Real-world Applications:
  - Browser back/forward navigation
  - Text editor undo/redo
  - Calculator operations
  - Parentheses/bracket matching (see the sketch after this list)
- Programming Scenarios:
  - Depth-first search in graphs
  - Backtracking algorithms
  - Expression conversion
  - Tree traversal (pre/post/in-order)
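One of the applications above, parentheses/bracket matching, makes a compact worked example; a minimal sketch:
def is_balanced(s):
    pairs = {')': '(', ']': '[', '}': '{'}
    stack = []
    for ch in s:
        if ch in "([{":
            stack.append(ch)  # Push opening bracket
        elif ch in pairs:
            # A closing bracket must match the most recent opening one
            if not stack or stack.pop() != pairs[ch]:
                return False
    return not stack  # Balanced only if every bracket was matched
print(is_balanced("({[]})"))  # Output: True
print(is_balanced("(]"))      # Output: False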
Common Exam Topics
- Stack operations (push, pop, peek)
- Stack implementation using arrays and linked lists
- Balanced parentheses checking
- Expression evaluation (infix, postfix, prefix)
- Stack-based backtracking
- Memory management using stacks
- Function call stack and recursion
- Stack overflow and underflow handling
- Space and time complexity analysis
- Stack applications in algorithms
Queues
A queue is a fundamental linear data structure that follows the First-In, First-Out (FIFO) principle. In a queue, the element added earliest is the one that is removed first, analogous to a real-world queue, like a line of people waiting at a checkout counter. Elements are added at one end, called the rear (or tail), and removed from the other end, called the front (or head).
Key Characteristics
- FIFO Behavior: The first element enqueued is the first one to be dequeued.
- Two-End Access: Elements are added at the rear and removed from the front.
- Dynamic Size: The size can grow or shrink dynamically as elements are added or removed.
Terminology
- Enqueue: Add an element to the rear of the queue.
- Dequeue: Remove and return the element from the front of the queue.
- Peek (or Front): Retrieve the element at the front without removing it.
- IsEmpty: Check if the queue is empty.
- Size: Get the number of elements in the queue.
Types of Queues
- Simple Queue (Linear Queue)
  - Definition: The basic form of a queue where elements are enqueued at the rear and dequeued from the front.
  - Limitation: In array-based implementations, space may be wasted after several dequeue operations due to fixed size.
# Create a queue
from collections import deque
queue = deque()
# Enqueue elements
queue.append(1) # queue: [1]
queue.append(2) # queue: [1, 2]
queue.append(3) # queue: [1, 2, 3]
# Peek at front element
print("Front element:", queue[0]) # Output: 1
# Dequeue elements
print("Dequeued:", queue.popleft()) # Output: 1
print("Dequeued:", queue.popleft()) # Output: 2
print("Final Queue:", list(queue)) # Output: [3]
print("Size:", len(queue)) # Output: 1
- Circular Queue
  - Definition: A queue implemented so that the rear and front indices wrap around to the beginning of the array when they reach the end, forming a circle.
  - Advantage: Efficiently utilizes space in array-based queues by reusing vacant positions.
# Create a circular queue with size 5
class CircularQueue:
    def __init__(self, size):
        self.size = size
        self.queue = [None] * size
        self.front = self.rear = -1

    def enqueue(self, data):
        if (self.rear + 1) % self.size == self.front:
            print("Queue is full")
            return
        elif self.front == -1:  # Queue is empty
            self.front = 0
            self.rear = 0
        else:
            self.rear = (self.rear + 1) % self.size
        self.queue[self.rear] = data

    def dequeue(self):
        if self.front == -1:  # Queue is empty
            print("Queue is empty")
            return None
        data = self.queue[self.front]
        self.queue[self.front] = None  # Clear slot to avoid stale values
        if self.front == self.rear:  # Last element
            self.front = self.rear = -1
        else:
            self.front = (self.front + 1) % self.size
        return data

# Example usage
cq = CircularQueue(5)
cq.enqueue(1)  # queue: [1]
cq.enqueue(2)  # queue: [1, 2]
cq.enqueue(3)  # queue: [1, 2, 3]
print("Dequeued:", cq.dequeue())  # Output: 1
cq.enqueue(4)  # queue: [2, 3, 4]
print("Current queue:", [x for x in cq.queue if x is not None])  # Output: [2, 3, 4]
- Double-Ended Queue (Deque)
  - Definition: A generalized queue where elements can be added or removed from both the front and the rear.
  - Operations:
    - enqueueFront: Add an element to the front
    - enqueueRear: Add an element to the rear
    - dequeueFront: Remove an element from the front
    - dequeueRear: Remove an element from the rear
  - Use Case: Useful in algorithms that require access to both ends, such as sliding window problems (see the sliding-window sketch after this list).
# Create a deque
from collections import deque
dq = deque()
# Add elements at both ends
dq.appendleft(1) # dq: [1]
dq.append(2) # dq: [1, 2]
dq.appendleft(3) # dq: [3, 1, 2]
dq.append(4) # dq: [3, 1, 2, 4]
print("Current deque:", list(dq)) # Output: [3, 1, 2, 4]
# Remove elements from both ends
print("Remove front:", dq.popleft()) # Output: 3
print("Remove rear:", dq.pop()) # Output: 4
print("Final deque:", list(dq)) # Output: [1, 2]
print("Size:", len(dq)) # Output: 2
- Priority Queue
  - Definition: Each element has a priority, and elements are dequeued based on priority rather than just order of insertion.
  - Implementation: Often implemented using heaps (binary heap, Fibonacci heap).
  - Use Case: Scheduling tasks based on priority levels, like CPU task scheduling.
# Create a priority queue
from queue import PriorityQueue
pq = PriorityQueue()
# Enqueue elements with priorities
pq.put((2, "Medium priority")) # (priority, item)
pq.put((1, "High priority"))
pq.put((3, "Low priority"))
# Dequeue elements (automatically by priority)
print(pq.get()[1]) # Output: High priority
print(pq.get()[1]) # Output: Medium priority
print(pq.get()[1]) # Output: Low priority
print("Queue empty?", pq.empty()) # Output: True
- Concurrent Queue
  - Definition: A thread-safe queue designed for use in multi-threaded environments.
  - Implementation: Provides synchronized access to prevent race conditions.
  - Use Case: Inter-thread communication, task scheduling in concurrent applications.
from queue import Queue
import threading

# Create a thread-safe queue
q = Queue()

# Producer thread function
def producer():
    for i in range(3):
        q.put(i)
        print(f"Produced: {i}")

# Consumer thread function
# (simplified: checking empty() is racy; production code would use a
# blocking get() with a sentinel value to signal shutdown)
def consumer():
    while not q.empty():
        item = q.get()
        print(f"Consumed: {item}")

# Create and start threads
prod = threading.Thread(target=producer)
cons = threading.Thread(target=consumer)
prod.start()
cons.start()

# Wait for threads to complete
prod.join()
cons.join()
- Blocking Queue
  - Definition: A queue that blocks or waits when attempting to dequeue from an empty queue or enqueue into a full queue.
  - Use Case: Producer-consumer problems where producers and consumers operate at different rates.
from queue import Queue, Empty
import threading

# Create a blocking queue
queue = Queue(maxsize=3)  # Queue with max size 3

# Producer function
def producer():
    for i in range(3):
        queue.put(i)  # Will block if queue is full
        print(f"Produced: {i}")

# Consumer function
def consumer():
    while True:
        try:
            item = queue.get(timeout=1)  # Will block if queue is empty
            print(f"Consumed: {item}")
            queue.task_done()
        except Empty:
            break

# Create and start threads
producer_thread = threading.Thread(target=producer)
consumer_thread = threading.Thread(target=consumer)
producer_thread.start()
consumer_thread.start()
producer_thread.join()
consumer_thread.join()
- Double-Ended Priority Queue (DEPQ)
  - Definition: Combines features of a deque and a priority queue, allowing insertion and deletion of elements at both ends based on priority.
  - Use Case: Complex scheduling and resource allocation algorithms.
# Create a double-ended priority queue
from heapq import heappush, heappop

class DEPQ:
    # Note: this simplified sketch keeps two independent heaps, so a value
    # removed via delete_min still lingers in the max heap (and vice versa);
    # a production DEPQ would use lazy deletion or an interval heap.
    def __init__(self):
        self.min_heap = []  # For getting minimum
        self.max_heap = []  # For getting maximum

    def insert(self, val):
        # Add to min heap normally
        heappush(self.min_heap, val)
        # Add to max heap with negated value
        heappush(self.max_heap, -val)

    def get_min(self):
        if self.min_heap:
            return self.min_heap[0]
        return None

    def get_max(self):
        if self.max_heap:
            return -self.max_heap[0]
        return None

    def delete_min(self):
        if self.min_heap:
            return heappop(self.min_heap)
        return None

    def delete_max(self):
        if self.max_heap:
            return -heappop(self.max_heap)
        return None

# Example usage
depq = DEPQ()
depq.insert(3)
depq.insert(1)
depq.insert(5)
print("Min:", depq.get_min())  # Output: 1
print("Max:", depq.get_max())  # Output: 5
print("Delete min:", depq.delete_min())  # Output: 1
print("Delete max:", depq.delete_max())  # Output: 5
Common Use Cases
- Data Processing:
  - Streaming analytics requiring both min and max values
  - Statistical computations (finding median, quartiles)
  - Real-time data monitoring systems
  - Range queries in data streams
- System Design:
  - Task scheduling with priority ranges
  - Resource allocation systems
  - Load balancing applications
  - Network packet prioritization
- Optimization Problems:
  - Game AI decision making
  - Auction systems
  - Stock market trading systems
  - Multi-objective optimization
- Real-world Applications:
  - Emergency response systems
  - Customer service priority management
  - Temperature monitoring systems
  - Financial market analysis tools
Common Exam Topics
- Basic queue operations (enqueue, dequeue, peek)
- Queue implementation using arrays and linked lists
- Circular queue implementation
- Priority queue operations and implementation
- Queue applications in scheduling and buffering
Graphs
A graph is a non-linear data structure consisting of vertices (nodes) and edges that connect these vertices. Graphs are used to represent relationships between objects and are fundamental in modeling networks, social connections, routing problems, and many other real-world scenarios.
Key components:
- Vertex (Node): A fundamental unit that stores data
- Edge: Connection between two vertices
- Weight: Value associated with an edge (in weighted graphs)
- Degree: Number of edges connected to a vertex
Types of Graphs
- Undirected Graph
  - Edges have no direction
  - If vertex A is connected to B, B is also connected to A
  - Example: Social network friendships
class UndirectedGraph:
    def __init__(self):
        self.graph = {}

    def add_vertex(self, vertex):
        if vertex not in self.graph:
            self.graph[vertex] = []

    def add_edge(self, v1, v2):
        if v1 not in self.graph:
            self.add_vertex(v1)
        if v2 not in self.graph:
            self.add_vertex(v2)
        self.graph[v1].append(v2)
        self.graph[v2].append(v1)  # Add both directions

    def print_graph(self):
        for vertex in self.graph:
            print(f"{vertex}: {self.graph[vertex]}")

# Example usage
g = UndirectedGraph()
g.add_edge(0, 1)
g.add_edge(0, 2)
g.add_edge(1, 2)
g.add_edge(2, 3)
g.print_graph()
# Output:
# 0: [1, 2]
# 1: [0, 2]
# 2: [0, 1, 3]
# 3: [2]
- Directed Graph (Digraph)
  - Edges have direction
  - If vertex A points to B, B might not point to A
  - Example: Web page links, social media following
class DirectedGraph:
    def __init__(self):
        self.graph = {}

    def add_vertex(self, vertex):
        if vertex not in self.graph:
            self.graph[vertex] = []

    def add_edge(self, v1, v2):
        if v1 not in self.graph:
            self.add_vertex(v1)
        if v2 not in self.graph:
            self.add_vertex(v2)
        self.graph[v1].append(v2)  # Only add one direction

    def has_path(self, start, end, visited=None):
        if visited is None:
            visited = set()
        if start == end:
            return True
        visited.add(start)
        for neighbor in self.graph[start]:
            if neighbor not in visited:
                if self.has_path(neighbor, end, visited):
                    return True
        return False

# Example usage
g = DirectedGraph()
g.add_edge(0, 1)
g.add_edge(1, 2)
g.add_edge(2, 3)
g.add_edge(3, 1)  # Creates a cycle
print("Path from 0 to 3:", g.has_path(0, 3))  # True
print("Path from 3 to 0:", g.has_path(3, 0))  # False
- Weighted Graph
  - Edges have associated weights/costs
  - Used in shortest path algorithms
  - Example: Road networks with distances
class WeightedGraph:
    def __init__(self):
        self.graph = {}

    def add_vertex(self, vertex):
        if vertex not in self.graph:
            self.graph[vertex] = {}

    def add_edge(self, v1, v2, weight):
        if v1 not in self.graph:
            self.add_vertex(v1)
        if v2 not in self.graph:
            self.add_vertex(v2)
        self.graph[v1][v2] = weight

    def dijkstra(self, start):
        distances = {vertex: float('infinity') for vertex in self.graph}
        distances[start] = 0
        unvisited = set(self.graph.keys())
        while unvisited:
            # Visit the closest unvisited vertex next
            current = min(unvisited, key=lambda x: distances[x])
            unvisited.remove(current)
            for neighbor, weight in self.graph[current].items():
                distance = distances[current] + weight
                if distance < distances[neighbor]:
                    distances[neighbor] = distance
        return distances

# Example usage
g = WeightedGraph()
g.add_edge('A', 'B', 4)
g.add_edge('A', 'C', 2)
g.add_edge('B', 'C', 1)
g.add_edge('B', 'D', 5)
g.add_edge('C', 'D', 8)
print("Shortest paths from A:", g.dijkstra('A'))
# Output: {'A': 0, 'B': 4, 'C': 2, 'D': 9}
Graph Traversal
- Depth-First Search (DFS)
  - Explores as far as possible along each branch before backtracking
  - Uses a stack (recursive or explicit)
  - Applications: Topological sorting, cycle detection
- Breadth-First Search (BFS)
  - Explores all vertices at the current depth before moving deeper
  - Uses a queue
  - Applications: Shortest path (unweighted), level-order traversal
from collections import deque

class Graph:
    def __init__(self):
        self.graph = {}

    def add_edge(self, v1, v2):
        if v1 not in self.graph:
            self.graph[v1] = []
        if v2 not in self.graph:
            self.graph[v2] = []
        self.graph[v1].append(v2)

    def dfs(self, start, visited=None):
        if visited is None:
            visited = set()
        visited.add(start)
        print(start, end=" ")
        for neighbor in self.graph[start]:
            if neighbor not in visited:
                self.dfs(neighbor, visited)

    def bfs(self, start):
        visited = {start}
        queue = deque([start])
        while queue:
            vertex = queue.popleft()  # O(1), unlike list.pop(0)
            print(vertex, end=" ")
            for neighbor in self.graph[vertex]:
                if neighbor not in visited:
                    visited.add(neighbor)
                    queue.append(neighbor)

# Example usage
g = Graph()
g.add_edge(0, 1)
g.add_edge(0, 2)
g.add_edge(1, 2)
g.add_edge(2, 0)
g.add_edge(2, 3)
g.add_edge(3, 3)
print("DFS starting from vertex 2:")
g.dfs(2)  # Output: 2 0 1 3
print("\nBFS starting from vertex 2:")
g.bfs(2)  # Output: 2 0 3 1
Relationship with Trees
- Graphs are a generalization of trees - a tree is a special type of graph that:
- Is acyclic (has no cycles)
- Has exactly one path between any two vertices
- Has n-1 edges for n vertices
- Has a clear parent-child hierarchy
- Trees can be thought of as directed graphs with restrictions on connections
- Many tree algorithms are adaptations of graph algorithms (e.g., DFS/BFS)
Common Use Cases
- Social Networks:
  - Modeling user connections and relationships
  - Friend recommendations
  - Social influence analysis
  - Community detection
- Transportation Networks:
  - Road/flight path mapping
  - Route optimization
  - Traffic flow analysis
  - Public transit planning
- Computer Networks:
  - Network topology representation
  - Routing algorithms
  - Network flow optimization
  - Resource allocation
- Web Applications:
  - Web page linking structure
  - Search engine algorithms
  - Content recommendation systems
  - Knowledge graphs
- Biology and Chemistry:
  - Molecular structures
  - Protein interaction networks
  - Neural networks
  - Ecological food webs
- Project Management:
  - Task dependencies
  - Critical path analysis
  - Resource allocation
  - Workflow optimization
- Computer Science Applications:
  - State machines
  - Compiler design
  - Database schema design
  - File system organization
Common Exam Topics
- Graph representation (adjacency matrix vs. adjacency list)
- Graph traversal algorithms (BFS, DFS)
- Shortest path algorithms (Dijkstra’s, Bellman-Ford)
- Minimum spanning trees (Prim’s, Kruskal’s)
- Topological sorting (see the sketch after this list)
- Strongly connected components
- Graph coloring
- Network flow
- Cycle detection
- Path finding algorithms
- Graph properties (connectivity, bipartiteness)
- Time and space complexity analysis
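Several of these topics build directly on DFS. As one example, here is a minimal topological sort sketch (assuming the input is a DAG in adjacency-list form; the graph and names are illustrative):
def topological_sort(graph):
    visited = set()
    order = []

    def dfs(vertex):
        visited.add(vertex)
        for neighbor in graph.get(vertex, []):
            if neighbor not in visited:
                dfs(neighbor)
        order.append(vertex)  # Appended after all descendants (post-order)

    for vertex in graph:
        if vertex not in visited:
            dfs(vertex)
    return order[::-1]  # Reverse post-order gives a topological order

# Example usage: edges point from prerequisite to dependent task
dag = {'A': ['B', 'C'], 'B': ['D'], 'C': ['D'], 'D': []}
print(topological_sort(dag))  # Output: ['A', 'C', 'B', 'D'] (one valid order)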
Trees
A tree is a hierarchical data structure consisting of nodes connected by edges. Each node contains a value and references to other nodes (children). Trees are widely used to represent hierarchical relationships and for efficient searching and organization of data.
Key components of a tree:
- Root: The topmost node of the tree
- Node: Contains data and references to child nodes
- Edge: Connection between nodes
- Leaf: Node with no children
- Height: Length of the path from root to deepest leaf
- Depth: Length of the path from a node to the root
Types of Trees
- Binary Tree
  - Each node has at most two children (left and right)
  - No restrictions on node values or structure
class TreeNode:
    def __init__(self, val=0):
        self.val = val
        self.left = None
        self.right = None

# Create a binary tree
root = TreeNode(1)
root.left = TreeNode(2)
root.right = TreeNode(3)
root.left.left = TreeNode(4)
root.left.right = TreeNode(5)

def inorder_traversal(node):
    if not node:
        return
    inorder_traversal(node.left)
    print(node.val, end=" ")
    inorder_traversal(node.right)

print("Inorder traversal:", end=" ")
inorder_traversal(root)  # Output: 4 2 5 1 3
- Binary Search Tree (BST)
  - Binary tree with an ordering property
  - Left subtree contains nodes with values less than the parent
  - Right subtree contains nodes with values greater than the parent
class BSTNode:
    def __init__(self, val=0):
        self.val = val
        self.left = None
        self.right = None

class BST:
    def __init__(self):
        self.root = None

    def insert(self, val):
        if not self.root:
            self.root = BSTNode(val)
            return

        def _insert(node, val):
            if val < node.val:
                if not node.left:
                    node.left = BSTNode(val)
                else:
                    _insert(node.left, val)
            else:
                if not node.right:
                    node.right = BSTNode(val)
                else:
                    _insert(node.right, val)

        _insert(self.root, val)

    def search(self, val):
        def _search(node, val):
            if not node:
                return False
            if node.val == val:
                return True
            if val < node.val:
                return _search(node.left, val)
            return _search(node.right, val)

        return _search(self.root, val)

# Example usage
bst = BST()
values = [5, 3, 7, 2, 4, 6, 8]
for val in values:
    bst.insert(val)
print("Search 4:", bst.search(4))  # Output: True
print("Search 9:", bst.search(9))  # Output: False
- AVL Tree
  - Self-balancing BST
  - Height difference between left and right subtrees ≤ 1
  - Maintains O(log n) height through rotations (see the rotation sketch after this list)
- Red-Black Tree
  - Self-balancing BST with color property
  - Every node is either red or black
  - Root and leaves (NIL) are black
  - Red nodes have black children
  - All paths from root to leaves have the same number of black nodes
- B-Tree
  - Generalization of BST
  - Nodes can have more than two children
  - Commonly used in databases and file systems
  - Maintains sorted data for efficient disk access
- Heap
  - Complete binary tree
  - Parent nodes are either greater than or smaller than their children (max-heap or min-heap)
  - Commonly used for priority queues
class MinHeap:
    def __init__(self):
        self.heap = []

    def parent(self, i):
        return (i - 1) // 2

    def left_child(self, i):
        return 2 * i + 1

    def right_child(self, i):
        return 2 * i + 2

    def insert(self, key):
        self.heap.append(key)
        self._sift_up(len(self.heap) - 1)

    def extract_min(self):
        if not self.heap:
            return None
        min_val = self.heap[0]
        self.heap[0] = self.heap[-1]
        self.heap.pop()
        if self.heap:
            self._sift_down(0)
        return min_val

    def _sift_up(self, i):
        while i > 0 and self.heap[self.parent(i)] > self.heap[i]:
            self.heap[i], self.heap[self.parent(i)] = self.heap[self.parent(i)], self.heap[i]
            i = self.parent(i)

    def _sift_down(self, i):
        min_index = i
        size = len(self.heap)
        l = self.left_child(i)
        if l < size and self.heap[l] < self.heap[min_index]:
            min_index = l
        r = self.right_child(i)
        if r < size and self.heap[r] < self.heap[min_index]:
            min_index = r
        if i != min_index:
            self.heap[i], self.heap[min_index] = self.heap[min_index], self.heap[i]
            self._sift_down(min_index)

# Example usage
heap = MinHeap()
for num in [4, 8, 2, 6, 1]:
    heap.insert(num)
print(heap.extract_min())  # Output: 1
- Trie (Prefix Tree)
  - Tree for storing strings
  - Each node represents a character
  - Paths from root to nodes form strings
  - Efficient for prefix-based operations
class TrieNode:
    def __init__(self):
        self.children = {}
        self.is_end = False

class Trie:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, word):
        node = self.root
        for char in word:
            if char not in node.children:
                node.children[char] = TrieNode()
            node = node.children[char]
        node.is_end = True

    def search(self, word):
        node = self.root
        for char in word:
            if char not in node.children:
                return False
            node = node.children[char]
        return node.is_end

    def starts_with(self, prefix):
        node = self.root
        for char in prefix:
            if char not in node.children:
                return False
            node = node.children[char]
        return True

# Example usage
trie = Trie()
words = ["apple", "app", "apricot"]
for word in words:
    trie.insert(word)
print("Search 'apple':", trie.search("apple"))  # True
print("Search 'app':", trie.search("app"))  # True
print("Search 'apt':", trie.search("apt"))  # False
print("Prefix 'apr':", trie.starts_with("apr"))  # True
Tree Traversal
Since trees are graphs, we can use the same traversal methods: depth-first search (DFS) and breadth-first search (BFS). The implementations differ slightly, but the concepts are the same.
class TreeNode:
    def __init__(self, val=0):
        self.val = val
        self.left = None
        self.right = None

def dfs_preorder(node):
    if not node:
        return
    print(node.val, end=" ")   # Root
    dfs_preorder(node.left)    # Left
    dfs_preorder(node.right)   # Right

def dfs_inorder(node):
    if not node:
        return
    dfs_inorder(node.left)     # Left
    print(node.val, end=" ")   # Root
    dfs_inorder(node.right)    # Right

def dfs_postorder(node):
    if not node:
        return
    dfs_postorder(node.left)   # Left
    dfs_postorder(node.right)  # Right
    print(node.val, end=" ")   # Root

def bfs_levelorder(root):
    if not root:
        return
    queue = [root]
    while queue:
        node = queue.pop(0)
        print(node.val, end=" ")
        if node.left:
            queue.append(node.left)
        if node.right:
            queue.append(node.right)

# Create a sample tree
root = TreeNode(1)
root.left = TreeNode(2)
root.right = TreeNode(3)
root.left.left = TreeNode(4)
root.left.right = TreeNode(5)

print("Preorder:", end=" ")
dfs_preorder(root)   # Output: 1 2 4 5 3
print("\nInorder:", end=" ")
dfs_inorder(root)    # Output: 4 2 5 1 3
print("\nPostorder:", end=" ")
dfs_postorder(root)  # Output: 4 5 2 3 1
print("\nLevel-order:", end=" ")
bfs_levelorder(root) # Output: 1 2 3 4 5
Common Use Cases
- Hierarchical Data:
  - File system directories
  - Organization charts
  - HTML/XML DOM structures
- Databases:
  - B-tree and B+ tree indexes
  - Efficient range queries
- Searching and Sorting:
  - Binary search trees for ordered data
  - Heaps for priority queues
- Practical Applications:
  - Autocomplete and spell checking (tries)
  - Expression parsing in compilers
  - Decision trees
  - Network routing hierarchies
Common Exam Topics
- Tree traversal algorithms (preorder, inorder, postorder, level-order)
- Tree properties (height, depth, balance)
- Binary search tree operations (insertion, deletion, search)
- Tree balancing techniques
- Tree rotations
- Tree traversal complexity analysis
- Special tree types (AVL, Red-Black, B-Trees)
- Tree applications and use cases
- Tree serialization and deserialization
- Lowest common ancestor problems (see the sketch after this list)
- Path finding in trees
- Tree construction from traversals
- Tree validation (is BST, is balanced)
- Tree comparison and equality checking
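For the lowest common ancestor topic above, a minimal recursive sketch for binary trees (reusing the TreeNode class and the sample tree built in the traversal example):
def lowest_common_ancestor(node, p, q):
    # If we reach a target value (or run off the tree), report back up
    if node is None or node.val == p or node.val == q:
        return node
    left = lowest_common_ancestor(node.left, p, q)
    right = lowest_common_ancestor(node.right, p, q)
    if left and right:
        return node  # p and q sit in different subtrees, so node is the LCA
    return left or right  # Both targets lie on one side

print(lowest_common_ancestor(root, 4, 5).val)  # Output: 2
print(lowest_common_ancestor(root, 4, 3).val)  # Output: 1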
Hash Tables
A hash table (hash map) is a data structure that implements an associative array abstract data type, a structure that can map keys to values. A hash table uses a hash function to compute an index into an array of buckets/slots, from which the desired value can be found.
Key components:
- Hash Function: Converts keys into array indices
- Bucket: Array slot that stores key-value pairs
- Collision Resolution: Method to handle when different keys hash to same index
- Load Factor: Ratio of filled slots to total slots
Hash Functions
- Division Method
  - h(k) = k mod m
  - Simple but can lead to clustering
- Multiplication Method
  - h(k) = ⌊m(kA mod 1)⌋
  - A is a constant between 0 and 1
- Universal Hashing
  - Family of hash functions
  - Randomly choose a function at runtime
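A minimal sketch of the first two methods (the table size m = 16 is arbitrary; A uses Knuth's commonly suggested constant (√5 − 1)/2):
import math

M = 16  # Table size (illustrative)
A = (math.sqrt(5) - 1) / 2  # ≈ 0.618

def division_hash(k, m=M):
    return k % m

def multiplication_hash(k, m=M, a=A):
    # Take the fractional part of k*A, scale by m, and floor
    return math.floor(m * ((k * a) % 1))

for key in [1234, 5678, 91011]:
    print(key, division_hash(key), multiplication_hash(key))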
Collision Resolution
- Chaining
  - Each bucket contains a linked list of entries
  - Simple but requires extra memory
class HashNode:
    def __init__(self, key, value):
        self.key = key
        self.value = value
        self.next = None

class HashTable:
    def __init__(self, size=10):
        self.size = size
        self.table = [None] * size

    def _hash(self, key):
        return hash(key) % self.size

    def put(self, key, value):
        index = self._hash(key)
        if not self.table[index]:
            self.table[index] = HashNode(key, value)
            return
        # Handle collision by chaining
        current = self.table[index]
        while current:
            if current.key == key:
                current.value = value  # Update existing
                return
            if not current.next:
                break
            current = current.next
        current.next = HashNode(key, value)

    def get(self, key):
        index = self._hash(key)
        current = self.table[index]
        while current:
            if current.key == key:
                return current.value
            current = current.next
        return None

    def remove(self, key):
        index = self._hash(key)
        if not self.table[index]:
            return
        if self.table[index].key == key:
            self.table[index] = self.table[index].next
            return
        current = self.table[index]
        while current.next:
            if current.next.key == key:
                current.next = current.next.next
                return
            current = current.next

# Example usage
ht = HashTable()
ht.put("apple", 5)
ht.put("banana", 8)
ht.put("orange", 3)
print(ht.get("apple"))  # Output: 5
print(ht.get("banana"))  # Output: 8
ht.remove("apple")
print(ht.get("apple"))  # Output: None
- Open Addressing
  - Probing for the next empty slot
  - Types: Linear, Quadratic, Double Hashing
class HashTable:
    def __init__(self, size=10):
        self.size = size
        self.keys = [None] * size
        self.values = [None] * size

    def _hash(self, key):
        return hash(key) % self.size

    def _probe(self, index):
        # Linear probing
        return (index + 1) % self.size

    def put(self, key, value):
        index = self._hash(key)
        while self.keys[index] is not None:
            if self.keys[index] == key:
                self.values[index] = value  # Update existing
                return
            index = self._probe(index)
            if index == self._hash(key):
                raise Exception("Hash table is full")
        self.keys[index] = key
        self.values[index] = value

    def get(self, key):
        index = self._hash(key)
        original_index = index
        while self.keys[index] is not None:
            if self.keys[index] == key:
                return self.values[index]
            index = self._probe(index)
            if index == original_index:
                break
        return None

    def remove(self, key):
        # Note: simply clearing the slot breaks probe chains for keys that
        # probed past it; real implementations use tombstone markers or rehash.
        index = self._hash(key)
        original_index = index
        while self.keys[index] is not None:
            if self.keys[index] == key:
                self.keys[index] = None
                self.values[index] = None
                return
            index = self._probe(index)
            if index == original_index:
                break

# Example usage
ht = HashTable()
ht.put("apple", 5)
ht.put("banana", 8)
ht.put("orange", 3)
print(ht.get("apple"))  # Output: 5
print(ht.get("banana"))  # Output: 8
ht.remove("apple")
print(ht.get("apple"))  # Output: None
Common Use Cases
- Database Operations:
  - Indexing
  - Caching query results
  - In-memory databases
  - Quick lookups
- Caching Systems:
  - Web browser caches
  - Application-level caching
  - Distributed caching (e.g., Redis)
  - Memoization
- Language Features:
  - Symbol tables in compilers
  - Implementation of sets and maps
  - String interning
  - Object property lookups
- Real-World Applications:
  - Password storage (with cryptographic hashing)
  - File deduplication
  - URL shorteners
  - Session management
  - Blockchain mining
Common Exam Topics
- Hash function design and properties (division, multiplication, universal hashing)
- Collision resolution techniques (chaining, open addressing)
- Load factor analysis, optimization, and rehashing strategies
- Hash table operations (insert, delete, search)
- Time and space complexity analysis under different scenarios
- Perfect hashing
- Cryptographic hash functions
- Hash table implementation details
- Applications and use cases
Algorithmic Techniques
Sorting
Sorting is the process of arranging data in a particular sequence or order, typically in ascending or descending order. It is one of the most fundamental operations in computer science and is essential for:
- Organizing and retrieving data efficiently
- Optimizing search operations
- Facilitating data analysis and processing
- Enabling efficient merging of datasets
- Making data more readable and manageable
Sorting serves as a building block for many other algorithms and is crucial for real-world applications like database management, file systems, and data analytics. Understanding sorting algorithms helps develop problem-solving skills and provides insights into algorithm design and analysis.
Bubble Sort
Bubble Sort is one of the simplest sorting algorithms: it repeatedly steps through the list, compares adjacent elements, and swaps them if they are in the wrong order. The algorithm gets its name because, with each pass, the largest unsorted element “bubbles up” to the end of the list.
Key characteristics:
- Simple Implementation: Easy to understand and code
- Time Complexity: O(n²) in worst and average cases, O(n) in best case
- Space Complexity: O(1) - only requires a single additional memory space
- Stable Sort: Maintains relative order of equal elements
- In-place Algorithm: Sorts the array without requiring extra space
Advantages:
- Simple to understand and implement
- Requires no additional memory space
- Good for small datasets
- Detects if array is already sorted
Disadvantages:
- Poor time complexity O(n²)
- Not suitable for large datasets
- Requires multiple passes through the array
How it works:
- Compare adjacent elements
- Swap them if they are in wrong order
- Repeat for each pair in the array
- After each pass, the largest unsorted element “bubbles up” to its correct position
- Continue until no more swaps are needed
def bubble_sort(arr):
    n = len(arr)
    for i in range(n):
        # Flag to optimize if array is already sorted
        swapped = False
        # Last i elements are already in place
        for j in range(0, n-i-1):
            # Swap if current element is greater than next
            if arr[j] > arr[j+1]:
                arr[j], arr[j+1] = arr[j+1], arr[j]
                swapped = True
        # If no swapping occurred, array is sorted
        if not swapped:
            break
    return arr

# Example usage
arr = [64, 34, 25, 12, 22, 11, 90]
print("Bubble Sort:", bubble_sort(arr.copy()))
Selection Sort
Selection Sort is an in-place comparison-based algorithm that repeatedly selects the minimum element from the unsorted portion of the list and swaps it with the first element. This process is repeated for the remaining unsorted portion until the entire list is sorted.
Key characteristics:
- Simple Implementation: Easy to understand and code
- Time Complexity: O(n²) in worst and average cases
- Space Complexity: O(1) - only requires a single additional memory space
- Stability: Not stable by default (equal elements may be reordered by swaps)
Advantages:
- Simple to understand and implement
- Requires no additional memory space
- Good for small datasets
Disadvantages:
- Poor time complexity O(n²)
- Not suitable for large datasets
- Requires multiple passes through the array
How it works:
- Find the minimum element in the unsorted portion
- Swap it with the first element of the unsorted portion
- Repeat for the remaining unsorted portion
- Continue until the entire list is sorted
def selection_sort(arr):
    n = len(arr)
    for i in range(n):
        # Find minimum element in unsorted part
        min_idx = i
        for j in range(i+1, n):
            if arr[j] < arr[min_idx]:
                min_idx = j
        # Swap found minimum with first element
        arr[i], arr[min_idx] = arr[min_idx], arr[i]
    return arr

# Example usage
arr = [64, 34, 25, 12, 22, 11, 90]
print("Selection Sort:", selection_sort(arr.copy()))
Insertion Sort
Insertion Sort is a simple sorting algorithm that builds the final sorted array one item at a time. It iterates through the list, inserting each element into its correct position in the sorted portion of the array.
Key characteristics:
- Simple Implementation: Easy to understand and code
- Time Complexity: O(n²) in worst and average cases
- Space Complexity: O(1) - only requires a single additional memory space
- Stable Sort: Stable by default
Advantages:
- Simple to understand and implement
- Requires no additional memory space
- Good for small datasets
- Stable sort
Disadvantages:
- Poor time complexity O(n²)
- Not suitable for large datasets
- Requires multiple passes through the array
How it works:
- Iterate through the array starting from the second element
- Compare each element with the previous ones
- Shift elements to the right to make space for the current element
- Insert the current element in its correct position
- Continue until the entire array is sorted
def insertion_sort(arr):
    for i in range(1, len(arr)):
        key = arr[i]
        # Move elements greater than key one position ahead
        j = i-1
        while j >= 0 and arr[j] > key:
            arr[j+1] = arr[j]
            j -= 1
        arr[j+1] = key
    return arr

# Example usage
arr = [64, 34, 25, 12, 22, 11, 90]
print("Insertion Sort:", insertion_sort(arr.copy()))
Merge Sort
Merge Sort is a divide-and-conquer algorithm that divides the input array into two halves, recursively sorts each half, and then merges the two sorted halves. It is a stable sort that works well for large datasets.
Key characteristics:
- Time Complexity: O(n log n) in all cases
- Space Complexity: O(n) - requires additional array for merging
- Stable Sort: Maintains relative order of equal elements
- Not In-place: Requires extra space proportional to input size
- Divide and Conquer: Breaks problem into smaller subproblems
- Predictable Performance: Consistent time complexity regardless of input
- Parallelizable: Can be efficiently parallelized for better performance
Advantages:
- Consistent performance
- Stable sort
- Handles large datasets efficiently
Disadvantages:
- Requires additional space
- Not in-place sorting
How it works:
- Divide: Split the array into two equal halves
- Conquer: Recursively sort each half
- Merge: Combine the sorted halves by comparing elements
  - Create temporary arrays for both halves
  - Compare elements from both arrays
  - Place the smaller element in the original array
  - Continue until all elements are merged
- Repeat: Continue the process until the entire array is sorted
def merge_sort(arr):
    if len(arr) <= 1:
        return arr
    # Divide array into two halves
    mid = len(arr) // 2
    left = arr[:mid]
    right = arr[mid:]
    # Recursively sort both halves
    left = merge_sort(left)
    right = merge_sort(right)
    # Merge the sorted halves
    return merge(left, right)

def merge(left, right):
    result = []
    i = j = 0
    # Compare elements from both arrays
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            result.append(left[i])
            i += 1
        else:
            result.append(right[j])
            j += 1
    # Add remaining elements
    result.extend(left[i:])
    result.extend(right[j:])
    return result

# Example usage
arr = [64, 34, 25, 12, 22, 11, 90]
print("Merge Sort:", merge_sort(arr))
Quick Sort
Quick Sort is a divide-and-conquer algorithm that selects a ‘pivot’ element from the array and partitions the other elements into two sub-arrays, according to whether they are less than or greater than the pivot. The sub-arrays are then sorted recursively.
Key characteristics:
- Time Complexity: O(n log n) in average case, O(n²) in worst case
- Space Complexity: O(log n) for the classic in-place variant (recursion stack)
- Not Stable: Unstable by default
- In-place: The standard partition-based variant sorts within the array (the simplified example below uses extra lists for readability)
- Efficient: Handles large datasets well
- Parallelizable: Can be parallelized for better performance
Advantages:
- Efficient for large datasets
- Average case performance is O(n log n)
- In-place sorting
Disadvantages:
- Worst case performance is O(n²)
- Not stable by default
How it works:
- Choose Pivot: Select an element as pivot
- Partition: Rearrange elements in array such that elements less than pivot are on left, greater on right
- Recur: Recursively apply same process to sub-arrays
def quick_sort(arr):
    # Simplified version: builds new lists rather than partitioning in place
    if len(arr) <= 1:
        return arr
    pivot = arr[len(arr) // 2]
    left = [x for x in arr if x < pivot]
    middle = [x for x in arr if x == pivot]
    right = [x for x in arr if x > pivot]
    return quick_sort(left) + middle + quick_sort(right)

# Example usage
arr = [64, 34, 25, 12, 22, 11, 90]
print("Quick Sort:", quick_sort(arr.copy()))
Searching
Searching algorithms are methods used to find a specific item (target) within a collection of data. These algorithms are fundamental to computer science and are used extensively in various applications, from simple data lookups to complex database operations.
The efficiency of a search algorithm is typically measured by its time complexity and space requirements. Different searching techniques are suited for different scenarios based on factors like:
- Data size
- Data structure used
- Whether the data is sorted
- Memory constraints
- Search frequency
Linear Search
Linear Search is the simplest searching algorithm that sequentially checks each element in a collection until a match is found or the entire collection has been searched. It works on both sorted and unsorted data.
Sequential search
Sequential search is the standard form of linear search: it walks through the collection element by element from the start until the target is found or the end is reached.
Key characteristics:
- Time Complexity: O(n) in worst and average cases
- Space Complexity: O(1) - constant extra space
- No Prerequisites: Works with unsorted data
Advantages:
- Simple to understand and implement
- Works on unsorted arrays
- No extra memory needed
- Good for small datasets
Disadvantages:
- Poor time complexity O(n)
- Not suitable for large datasets
- Less efficient than other algorithms for sorted data
How it works:
- Start from the first element
- Compare each element with the target value
- If element matches, return its position
- If element not found, return -1 or appropriate indicator
def linear_search(arr, target):
    for i in range(len(arr)):
        if arr[i] == target:
            return i
    return -1

# Example usage
arr = [64, 34, 25, 12, 22, 11, 90]
target = 22
result = linear_search(arr, target)
print(f"Linear Search: Element {target} found at index {result}")
Sentinel linear search
Sentinel linear search is an optimization technique that eliminates the need for an explicit check to ensure the loop does not run out of bounds. It involves adding a sentinel element at the end of the array, which simplifies the termination condition of the loop.
Key characteristics:
- Time Complexity: O(n) in worst and average cases
- Space Complexity: O(1) - constant extra space
- No Prerequisites: Works with unsorted data
Advantages:
- Simplifies loop termination
- Reduces conditional checks
- Improves readability
Disadvantages:
- Adds a sentinel element
- Not suitable for all scenarios
How it works:
- Add a sentinel element at the end of the array
- Compare each element with the target value
- If element matches, return its position
- If element not found, return -1 or appropriate indicator
def sentinel_linear_search(arr, target):
    n = len(arr)
    # Save the last element and place the target as a sentinel at the end
    last = arr[n - 1]
    arr[n - 1] = target
    i = 0
    while arr[i] != target:  # No bounds check needed: the sentinel stops the loop
        i += 1
    # Restore the array
    arr[n - 1] = last
    # Found if we stopped before the sentinel, or the last element was the target
    if i < n - 1 or last == target:
        return i
    return -1

# Example usage
arr = [64, 34, 25, 12, 22, 11, 90]
target = 22
result = sentinel_linear_search(arr.copy(), target)
print(f"Sentinel Linear Search: Element {target} found at index {result}")
Binary Search
Binary Search is an efficient searching algorithm that works on sorted arrays. It repeatedly divides the search interval in half until the target value is found or the interval is empty.
Basic implementation
def binary_search(arr, target):
    left, right = 0, len(arr) - 1
    while left <= right:
        mid = (left + right) // 2
        if arr[mid] == target:
            return mid
        elif arr[mid] < target:
            left = mid + 1
        else:
            right = mid - 1
    return -1

# Example usage
arr = [11, 12, 22, 25, 34, 64, 90]
target = 22
print("Binary Search: Element", target, "found at index", binary_search(arr, target))
Advantages:
- Extremely efficient O(log n) time complexity
- Highly scalable for large datasets
Disadvantages:
- Requires sorted data
Variations (lower bound, upper bound)
Lower bound and upper bound are variations of binary search that find the first and last occurrence of a target value (or the insertion points if the target doesn’t exist).
Lower Bound: Finds the first position where an element could be inserted while maintaining sorted order.
def lower_bound(arr, target):
    left, right = 0, len(arr)
    while left < right:
        mid = left + (right - left) // 2
        if arr[mid] < target:
            left = mid + 1
        else:
            right = mid
    return left

# Example usage
arr = [1, 2, 2, 2, 3, 4, 5]
target = 2
result = lower_bound(arr, target)
print(f"Lower Bound: First position for {target} is {result}")
Upper Bound: Finds the first position where an element greater than the target could be inserted while maintaining sorted order.
def upper_bound(arr, target):
    left, right = 0, len(arr)
    while left < right:
        mid = left + (right - left) // 2
        if arr[mid] <= target:
            left = mid + 1
        else:
            right = mid
    return left

# Example usage
arr = [1, 2, 2, 2, 3, 4, 5]
target = 2
result = upper_bound(arr, target)
print(f"Upper Bound: First position greater than {target} is {result}")
Binary search on answer concept
Binary search on answer is a problem-solving technique where binary search runs over the range of possible answers rather than over an array. It is useful when you cannot compute the answer directly but can cheaply test whether a given candidate is feasible.
Key characteristics:
- Time Complexity: O(log N) feasibility checks, where N is the size of the answer range
- Space Complexity: O(1)
- Requires: A monotonic condition that can be tested
When to use:
- Problem has a range of possible answers
- Can verify if a value is too high/low
- Answer space is monotonic (condition consistently changes from false to true)
How it works:
- Define search space (minimum and maximum possible answers)
- Binary search through this range
- For each mid point, test if it satisfies the condition
- Adjust search space based on result
def binary_search_answer(low, high, can_solve):
    # Find the smallest value in [low, high] for which can_solve is True;
    # assumes can_solve is monotonic (False, ..., False, True, ..., True)
    while low < high:
        mid = low + (high - low) // 2
        if can_solve(mid):
            high = mid      # mid works; try something smaller
        else:
            low = mid + 1   # mid fails; need something larger
    return low

# Example usage: smallest integer m with m * m >= n (ceiling of the square root)
def square_root(n):
    return binary_search_answer(0, n, lambda mid: mid * mid >= n)

n = 16
result = square_root(n)
print(f"Square root of {n} is approximately {result}")  # 4
Advanced Searching
Interpolation Search
Interpolation search is an improved variant of binary search that works best on uniformly distributed sorted arrays. Instead of always checking the middle element, it makes an educated guess of where the target value might be.
Key characteristics:
- Time complexity: O(log(log n)) average case, O(n) worst case
- Works best on uniformly distributed data
- More efficient than binary search for sorted, uniformly distributed arrays
def interpolation_search(arr, target):
    low, high = 0, len(arr) - 1
    while low <= high and arr[low] <= target <= arr[high]:
        # All values in the range are equal (this also covers low == high)
        # and guards against division by zero below
        if arr[low] == arr[high]:
            return low if arr[low] == target else -1
        # Estimate the probe position by linear interpolation
        pos = low + ((high - low) * (target - arr[low])) // (arr[high] - arr[low])
        if arr[pos] == target:
            return pos
        elif arr[pos] < target:
            low = pos + 1
        else:
            high = pos - 1
    return -1

# Example usage
arr = [1, 2, 4, 6, 8, 10, 12, 14, 16, 18]
target = 10
result = interpolation_search(arr, target)
print(f"Element {target} found at index: {result}")  # Output: Element 10 found at index: 5
Jump Search
Jump search is a searching algorithm designed for sorted arrays that works by skipping a fixed number of elements and then performing a linear search.
Key characteristics:
- Time complexity: O(√n)
- Optimal jump size is √n
- Faster than linear search; asymptotically slower than binary search, but it jumps backward at most once, which helps when backward reads are expensive
def jump_search(arr, target):
    n = len(arr)
    step = int(n ** 0.5)  # Calculate optimal jump size
    # Finding the block where the element may be present
    prev = 0
    while arr[min(step, n) - 1] < target:
        prev = step
        step += int(n ** 0.5)
        if prev >= n:
            return -1
    # Linear search in the identified block
    while arr[prev] < target:
        prev += 1
        if prev == min(step, n):
            return -1
    if arr[prev] == target:
        return prev
    return -1

# Example usage
arr = [0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89]
target = 21
result = jump_search(arr, target)
print(f"Element {target} found at index: {result}")  # Output: Element 21 found at index: 8
Exponential Search
Exponential search (also called doubling search or galloping search) involves two steps: finding a range where the element exists and then performing a binary search in that range.
Key characteristics:
- Time complexity: O(log n)
- Particularly useful for unbounded or infinite sorted arrays, where the size is not known in advance
- Works well when target element is closer to the beginning
def binary_search(arr, left, right, target):
    while left <= right:
        mid = (left + right) // 2
        if arr[mid] == target:
            return mid
        elif arr[mid] < target:
            left = mid + 1
        else:
            right = mid - 1
    return -1

def exponential_search(arr, target):
    if arr[0] == target:
        return 0
    # Double the index until we pass the target (or the end of the array)
    i, n = 1, len(arr)
    while i < n and arr[i] <= target:
        i = i * 2
    # Binary search in the range [i // 2, min(i, n - 1)]
    return binary_search(arr, i // 2, min(i, n - 1), target)

# Example usage
arr = [2, 3, 4, 10, 40, 80, 160, 320, 640, 1280]
target = 40
result = exponential_search(arr, target)
print(f"Element {target} found at index: {result}")  # Output: Element 40 found at index: 4
Recursion
Recursion is a technique in which a function solves a problem by calling itself on smaller instances of the same problem, combining their results once a base case is reached.
Key Components
- Base Case
- Recursive Case
- Stack Space
Common Patterns
- Linear Recursion: factorial, Fibonacci
- Binary Recursion: binary tree traversal, divide and conquer algorithms
- Tail Recursion: optimization techniques, converting to iteration (see the sketch after this list)
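A minimal sketch of linear vs. tail recursion using factorial (function names are illustrative). Note that CPython does not optimize tail calls, so the tail-recursive form is shown alongside its iterative conversion:

def factorial(n):
    # Linear recursion: one recursive call per step, O(n) stack space
    if n <= 1:  # base case
        return 1
    return n * factorial(n - 1)  # recursive case

def factorial_tail(n, acc=1):
    # Tail recursion: the recursive call is the last operation,
    # so it converts mechanically into the loop below
    if n <= 1:
        return acc
    return factorial_tail(n - 1, acc * n)

def factorial_iter(n):
    # Iterative equivalent of the tail-recursive version, O(1) space
    acc = 1
    for k in range(2, n + 1):
        acc *= k
    return acc

print(factorial(5), factorial_tail(5), factorial_iter(5))  # 120 120 120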
Common Use Cases
- Tree traversal
- Backtracking problems
- Mathematical computations
- Divide and conquer algorithms
Common Exam Topics
- Base case identification
- Stack space analysis
- Recursion to iteration conversion
- Time/space complexity analysis
Dynamic Programming
Dynamic programming (DP) solves a problem by breaking it into overlapping subproblems, solving each subproblem once, and storing the results so they are never recomputed.
Key Concepts
- Optimal Substructure
- Overlapping Subproblems
- Memoization vs Tabulation
Common Patterns
- 1D Dynamic Programming: Fibonacci with DP, Climbing Stairs
- 2D Dynamic Programming: matrix path problems, Longest Common Subsequence
- State Space Reduction: space optimization techniques, rolling array method (see the sketch after this list)
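A minimal sketch using Fibonacci: top-down memoization first, then bottom-up tabulation reduced to two rolling variables (the simplest state space reduction):

from functools import lru_cache

@lru_cache(maxsize=None)
def fib_memo(n):
    # Top-down (memoization): cache overlapping subproblems
    if n < 2:
        return n
    return fib_memo(n - 1) + fib_memo(n - 2)

def fib_tab(n):
    # Bottom-up (tabulation) with rolling variables: O(n) time, O(1) space
    prev, curr = 0, 1
    for _ in range(n):
        prev, curr = curr, prev + curr
    return prev

print(fib_memo(10), fib_tab(10))  # 55 55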
Common Use Cases
- Optimization problems
- Path finding
- Resource allocation
- String matching
Common Exam Topics
- State definition
- Transition formula
- Space optimization
- Time/space complexity analysis
Divide & Conquer
Divide and conquer solves a problem by splitting it into independent subproblems, solving each one recursively, and combining the partial results into the final answer.
Key Steps
- Divide
- Conquer
- Combine
Common Algorithms
- Merge Sort (sketched below)
- Quick Sort
- Binary Search
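Merge sort is the canonical example. A minimal sketch that labels the divide, conquer, and combine steps:

def merge_sort(arr):
    # Divide: split the array in half
    if len(arr) <= 1:
        return arr
    mid = len(arr) // 2
    # Conquer: sort each half recursively
    left = merge_sort(arr[:mid])
    right = merge_sort(arr[mid:])
    # Combine: merge the two sorted halves
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged

print(merge_sort([64, 34, 25, 12, 22, 11, 90]))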
Common Use Cases
- Sorting algorithms
- Matrix multiplication
- Closest pair problems
- Fast Fourier Transform
Common Exam Topics
- Problem decomposition
- Recurrence relations
- Time complexity analysis
- Algorithm design
Backtracking
Backtracking builds candidate solutions incrementally and abandons a partial candidate (backtracks) as soon as it can no longer lead to a valid solution.
Key Concepts
- Choice
- Constraints
- Goal
Common Problems
- N-Queens
- Sudoku Solver
- Permutations (sketched below)
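Generating permutations shows the choose / explore / un-choose structure in its simplest form. A minimal sketch:

def permutations(nums):
    result, path, used = [], [], [False] * len(nums)

    def backtrack():
        if len(path) == len(nums):  # goal reached
            result.append(path[:])
            return
        for i, x in enumerate(nums):
            if used[i]:              # constraint: use each element once
                continue
            used[i] = True           # choice
            path.append(x)
            backtrack()              # explore
            path.pop()               # undo the choice (backtrack)
            used[i] = False

    backtrack()
    return result

print(permutations([1, 2, 3]))  # all 6 orderings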
Common Use Cases
- Constraint satisfaction
- Path finding
- Puzzle solving
- Combinatorial problems
Common Exam Topics
- State space tree
- Pruning techniques
- Implementation patterns
- Time complexity analysis
Problem Solving Patterns
Sliding Window
The sliding window technique maintains a contiguous window over an array or string and updates the answer incrementally as the window slides, typically turning a nested O(n·k) scan into a single O(n) pass.
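A minimal sketch of a fixed-size window: the maximum sum of any k consecutive elements (the function name is illustrative):

def max_window_sum(arr, k):
    if k > len(arr):
        return None
    window = sum(arr[:k])
    best = window
    for i in range(k, len(arr)):
        window += arr[i] - arr[i - k]  # slide the window right by one
        best = max(best, window)
    return best

print(max_window_sum([2, 1, 5, 1, 3, 2], 3))  # 9 (5 + 1 + 3)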
Common Use Cases
- Subarray problems
- String matching
- Sliding window maximum/minimum
Common Exam Topics
- Implementation details
- Time complexity analysis
- Edge cases handling
Two Pointers
The two pointers technique moves two indices through a sequence, either from both ends toward the middle or at different speeds, examining pairs or ranges without nested loops and often reducing O(n²) work to O(n).
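A minimal sketch: finding a pair with a given sum in a sorted array by moving pointers inward from both ends:

def pair_with_sum(arr, target):
    left, right = 0, len(arr) - 1
    while left < right:
        s = arr[left] + arr[right]
        if s == target:
            return left, right
        elif s < target:
            left += 1    # need a bigger sum
        else:
            right -= 1   # need a smaller sum
    return None

print(pair_with_sum([1, 2, 4, 6, 8, 11], 10))  # (1, 4) -> 2 + 8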
Common Use Cases
- Subarray problems
- String manipulation
- Linked list operations
Common Exam Topics
- Implementation details
- Time complexity analysis
- Edge cases handling
Greedy Algorithms
Greedy algorithms build a solution one step at a time, always taking the locally optimal choice. They yield a globally optimal answer only when the problem has the greedy-choice property and optimal substructure, so correctness must be argued, not assumed.
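Interval scheduling is the classic case where the greedy choice (always take the interval that finishes earliest) is provably optimal. A minimal sketch:

def max_non_overlapping(intervals):
    # Greedy choice: sort by finish time and take every compatible interval
    intervals.sort(key=lambda iv: iv[1])
    count, last_end = 0, float("-inf")
    for start, end in intervals:
        if start >= last_end:  # compatible with the picks so far
            count += 1
            last_end = end
    return count

print(max_non_overlapping([(1, 3), (2, 4), (3, 5), (6, 8)]))  # 3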
Common Use Cases
- Scheduling problems
- Interval scheduling
- Minimum spanning tree
- Shortest path
Common Exam Topics
- Problem formulation
- Implementation details
- Time complexity analysis
Graph Traversal
Graph traversal visits every vertex reachable from a start node: breadth-first search (BFS) explores level by level using a queue, while depth-first search (DFS) follows each path as deep as possible before backtracking. Both run in O(V + E) on an adjacency list.
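A minimal sketch of both traversals over an adjacency-list graph (the sample graph is illustrative):

from collections import deque

def bfs(graph, start):
    # Breadth-first: visit nodes level by level using a queue
    visited, order = {start}, []
    queue = deque([start])
    while queue:
        node = queue.popleft()
        order.append(node)
        for nb in graph[node]:
            if nb not in visited:
                visited.add(nb)
                queue.append(nb)
    return order

def dfs(graph, node, visited=None):
    # Depth-first: follow one path as deep as possible before backtracking
    if visited is None:
        visited = set()
    visited.add(node)
    order = [node]
    for nb in graph[node]:
        if nb not in visited:
            order.extend(dfs(graph, nb, visited))
    return order

graph = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": []}
print(bfs(graph, "A"))  # ['A', 'B', 'C', 'D']
print(dfs(graph, "A"))  # ['A', 'B', 'D', 'C']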
Common Use Cases
- Shortest path
- Topological sorting
- Cycle detection
- Bipartite graph checking
Common Exam Topics
- Implementation details
- Time complexity analysis
- Edge cases handling
Union-Find & Disjoint Sets
Union-Find (the disjoint-set data structure) maintains a partition of elements into disjoint sets, supporting union and find in near-constant amortized time when path compression and union by rank are combined.
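A minimal sketch with both optimizations applied:

class DisjointSet:
    def __init__(self, n):
        self.parent = list(range(n))
        self.rank = [0] * n

    def find(self, x):
        # Path compression: point nodes directly at the root
        if self.parent[x] != x:
            self.parent[x] = self.find(self.parent[x])
        return self.parent[x]

    def union(self, a, b):
        # Union by rank: attach the shorter tree under the taller one
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return False  # already connected
        if self.rank[ra] < self.rank[rb]:
            ra, rb = rb, ra
        self.parent[rb] = ra
        if self.rank[ra] == self.rank[rb]:
            self.rank[ra] += 1
        return True

ds = DisjointSet(5)
ds.union(0, 1)
ds.union(1, 2)
print(ds.find(0) == ds.find(2))  # True
print(ds.find(0) == ds.find(4))  # False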
Common Use Cases
- Graph connectivity
- Disjoint set operations
- Kruskal’s algorithm
Common Exam Topics
- Implementation details
- Time complexity analysis
- Edge cases handling
Kadane’s Algorithm
Kadane's algorithm finds the maximum-sum contiguous subarray in O(n) time by deciding, at each element, whether to extend the current subarray or start a fresh one at that element.
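A minimal sketch:

def kadane(arr):
    # At each element: extend the running subarray or start fresh here
    best = current = arr[0]
    for x in arr[1:]:
        current = max(x, current + x)
        best = max(best, current)
    return best

print(kadane([-2, 1, -3, 4, -1, 2, 1, -5, 4]))  # 6 -> [4, -1, 2, 1]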
Common Use Cases
- Maximum subarray sum
- Stock buy/sell problem
- Maximum product subarray (solved with a variant of the same idea)
Common Exam Topics
- Implementation details
- Time complexity analysis
- Edge cases handling
The Blind 75
The “Blind 75” is a curated list of 75 LeetCode questions that cover essential coding patterns and concepts frequently asked in technical interviews. These questions were compiled by a Facebook tech lead and have become a popular study resource for interview preparation.
You don’t need to solve all 75 problems - focus on understanding the core patterns and concepts. A good approach is to:
- Start with easier problems in each category
- Try to solve 2-3 problems from each category
- Focus more on categories relevant to your target companies
- Keep track of your progress by checking off completed problems below
The problems below are grouped by category to help you focus on specific data structures and algorithms; use the checkboxes to mark the ones you've completed.