Lesson 20 of 2012 min read
Searching Algorithms
Prerequisites: Lists, loops, functions, sorting (for binary search)
1. Linear Search
Concept, Check each element one by one from the beginning., Works on both sorted and unsorted lists., Stop when the element is found or the list ends., Also called Sequential Search.
Algorithm
For each element in the list:
If element equals the key:
Return the index
Return -1 (not found)
Code
def linear_search(arr, key):
for i in range(len(arr)):
if arr[i] == key:
return i # Found - return index
return -1 # Not found
Dry Run Trace
Input: arr = [10, 23, 45, 70, 11, 15], key = 70
Step 1: Compare arr[0]=10 with 70 -> Not equal
Step 2: Compare arr[1]=23 with 70 -> Not equal
Step 3: Compare arr[2]=45 with 70 -> Not equal
Step 4: Compare arr[3]=70 with 70 -> FOUND at index 3
Result: 70 found at index 3
Total comparisons: 4
Input: arr = [10, 23, 45, 70, 11, 15], key = 50
Step 1: Compare arr[0]=10 with 50 -> Not equal
Step 2: Compare arr[1]=23 with 50 -> Not equal
Step 3: Compare arr[2]=45 with 50 -> Not equal
Step 4: Compare arr[3]=70 with 50 -> Not equal
Step 5: Compare arr[4]=11 with 50 -> Not equal
Step 6: Compare arr[5]=15 with 50 -> Not equal
Result: 50 not found
Total comparisons: 6
Complexity
- Best case: O(1), element is at the first position
- Worst case: O(n), element is at the last position or not present
- Average case: O(n), Works on unsorted data, No preprocessing required
2. Binary Search
Concept, Works only on sorted lists., Repeatedly divide the search interval in half., Compare the key with the middle element:
- If equal: found.
- If key < middle: search the left half.
- If key > middle: search the right half., Much faster than linear search for large datasets.
Algorithm (Iterative)
low = 0
high = n - 1
While low <= high:
mid = (low + high) // 2
If arr[mid] == key:
Return mid
Else if key < arr[mid]:
high = mid - 1
Else:
low = mid + 1
Return -1
Code
def binary_search(arr, key):
low = 0
high = len(arr) - 1
while low <= high:
mid = (low + high) // 2
if arr[mid] == key:
return mid # Found
elif key < arr[mid]:
high = mid - 1 # Search left half
else:
low = mid + 1 # Search right half
return -1 # Not found
Dry Run Trace
Input: arr = [10, 20, 30, 40, 50, 60, 70, 80], key = 50
Step 1: low=0, high=7, mid=(0+7)//2=3
arr[3]=40, 50 > 40 -> Search right: low=4
Step 2: low=4, high=7, mid=(4+7)//2=5
arr[5]=60, 50 < 60 -> Search left: high=4
Step 3: low=4, high=4, mid=(4+4)//2=4
arr[4]=50, 50 == 50 -> FOUND at index 4
Result: 50 found at index 4
Total comparisons: 3
Input: arr = [10, 20, 30, 40, 50, 60, 70, 80], key = 35
Step 1: low=0, high=7, mid=3
arr[3]=40, 35 < 40 -> high=2
Step 2: low=0, high=2, mid=1
arr[1]=20, 35 > 20 -> low=2
Step 3: low=2, high=2, mid=2
arr[2]=30, 35 > 30 -> low=3
Step 4: low=3, high=2 -> low > high -> STOP
Result: 35 not found
Total comparisons: 3
Complexity
- Best case: O(1), element is at the middle
- Worst case: O(log n), repeatedly halving the search space
- Average case: O(log n)
- Requirement: List must be sorted
Why O(log n)?
For a list of 8 elements:
8 -> 4 -> 2 -> 1 (3 comparisons max = log2(8))
For a list of 1000 elements:
1000 -> 500 -> 250 -> 125 -> 63 -> 32 -> 16 -> 8 -> 4 -> 2 -> 1
(about 10 comparisons max = log2(1000) ~ 10)
3. Comparison: Linear Search vs Binary Search
| Feature | Linear Search | Binary Search |
|---|---|---|
| Sorted data required | No | Yes |
| Best case | O(1) | O(1) |
| Worst case | O(n) | O(log n) |
| Average case | O(n) | O(log n) |
| Method | Sequential comparison | Divide and conquer |
| Simple to implement | Yes | Slightly complex |
| Suitable for | Small or unsorted data | Large sorted data |
| Comparisons (n=1000) | Up to 1000 | Up to 10 |
| Comparisons (n=1,000,000) | Up to 1,000,000 | Up to 20 |
4. When to Use Which?
| Scenario | Best Choice |
|---|---|
| Data is unsorted | Linear Search |
| Data is sorted and large | Binary Search |
| Data is very small (< 20 elements) | Either works fine |
| Data changes frequently (many inserts/deletes) | Linear Search (no need to re-sort) |
| Data is sorted and rarely changes | Binary Search |
| Need to find all occurrences | Linear Search (scan all) |
Practice Programs
Program 1: Linear Search in a List
def linear_search(arr, key):
for i in range(len(arr)):
if arr[i] == key:
return i
return -1
# Main program
data = [45, 23, 78, 12, 56, 34, 89, 67]
print("List:", data)
key = int(input("Enter element to search: "))
result = linear_search(data, key)
if result != -1:
print(f"{key} found at index {result} (position {result + 1})")
else:
print(f"{key} not found in the list.")
# Test Run 1:
# List: [45, 23, 78, 12, 56, 34, 89, 67]
# Enter element to search: 56
# 56 found at index 4 (position 5)
# Test Run 2:
# List: [45, 23, 78, 12, 56, 34, 89, 67]
# Enter element to search: 99
# 99 not found in the list.
Program 2: Linear Search Returning All Positions
def linear_search_all(arr, key):
positions = []
for i in range(len(arr)):
if arr[i] == key:
positions.append(i)
return positions
# Main program
data = [10, 20, 30, 20, 40, 20, 50]
print("List:", data)
key = int(input("Enter element to search: "))
positions = linear_search_all(data, key)
if positions:
print(f"{key} found at indices: {positions}")
print(f"Total occurrences: {len(positions)}")
else:
print(f"{key} not found in the list.")
# Test Run:
# List: [10, 20, 30, 20, 40, 20, 50]
# Enter element to search: 20
# 20 found at indices: [1, 3, 5]
# Total occurrences: 3
Program 3: Binary Search in a Sorted List
def binary_search(arr, key):
low = 0
high = len(arr) - 1
while low <= high:
mid = (low + high) // 2
if arr[mid] == key:
return mid
elif key < arr[mid]:
high = mid - 1
else:
low = mid + 1
return -1
# Main program
data = [11, 22, 33, 44, 55, 66, 77, 88, 99]
print("Sorted list:", data)
key = int(input("Enter element to search: "))
result = binary_search(data, key)
if result != -1:
print(f"{key} found at index {result}")
else:
print(f"{key} not found in the list.")
# Test Run 1:
# Sorted list: [11, 22, 33, 44, 55, 66, 77, 88, 99]
# Enter element to search: 66
# 66 found at index 5
# Test Run 2:
# Sorted list: [11, 22, 33, 44, 55, 66, 77, 88, 99]
# Enter element to search: 50
# 50 not found in the list.
Program 4: Binary Search with Trace Showing Comparisons
def binary_search_trace(arr, key):
low = 0
high = len(arr) - 1
step = 0
print(f"\nSearching for {key} in {arr}")
print(f"{'Step':<6}{'Low':<6}{'High':<6}{'Mid':<6}{'arr[mid]':<10}{'Action'}")
print("-" * 50)
while low <= high:
mid = (low + high) // 2
step += 1
if arr[mid] == key:
print(f"{step:<6}{low:<6}{high:<6}{mid:<6}{arr[mid]:<10}FOUND!")
print(f"\n{key} found at index {mid} in {step} comparison(s).")
return mid
elif key < arr[mid]:
print(f"{step:<6}{low:<6}{high:<6}{mid:<6}{arr[mid]:<10}Search LEFT (high={mid-1})")
high = mid - 1
else:
print(f"{step:<6}{low:<6}{high:<6}{mid:<6}{arr[mid]:<10}Search RIGHT (low={mid+1})")
low = mid + 1
print(f"\n{key} not found. Total comparisons: {step}")
return -1
# Test 1: Element present
data = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
binary_search_trace(data, 70)
# Output:
# Searching for 70 in [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
# Step Low High Mid arr[mid] Action
# --------------------------------------------------
# 1 0 9 4 50 Search RIGHT (low=5)
# 2 5 9 7 80 Search LEFT (high=6)
# 3 5 6 5 60 Search RIGHT (low=6)
# 4 6 6 6 70 FOUND!
#
# 70 found at index 6 in 4 comparison(s).
# Test 2: Element not present
binary_search_trace(data, 45)
# Output:
# Searching for 45 in [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
# Step Low High Mid arr[mid] Action
# --------------------------------------------------
# 1 0 9 4 50 Search LEFT (high=3)
# 2 0 3 1 20 Search RIGHT (low=2)
# 3 2 3 2 30 Search RIGHT (low=3)
# 4 3 3 3 40 Search RIGHT (low=4)
#
# 45 not found. Total comparisons: 4
Program 5: Count Comparisons for Both Algorithms on Same Data
def linear_search_count(arr, key):
comparisons = 0
for i in range(len(arr)):
comparisons += 1
if arr[i] == key:
return i, comparisons
return -1, comparisons
def binary_search_count(arr, key):
low = 0
high = len(arr) - 1
comparisons = 0
while low <= high:
mid = (low + high) // 2
comparisons += 1
if arr[mid] == key:
return mid, comparisons
elif key < arr[mid]:
high = mid - 1
else:
low = mid + 1
return -1, comparisons
# Test with sorted data
data = [i * 5 for i in range(1, 21)] # [5, 10, 15, ..., 100]
print("Data:", data)
print(f"Size: {len(data)} elements\n")
# Search for various keys
test_keys = [5, 50, 100, 42]
print(f"{'Key':<8}{'Linear (idx, comps)':<25}{'Binary (idx, comps)':<25}")
print("=" * 58)
for key in test_keys:
l_idx, l_comp = linear_search_count(data, key)
b_idx, b_comp = binary_search_count(data, key)
l_str = f"idx={l_idx}, comps={l_comp}"
b_str = f"idx={b_idx}, comps={b_comp}"
print(f"{key:<8}{l_str:<25}{b_str:<25}")
# Output:
# Data: [5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100]
# Size: 20 elements
#
# Key Linear (idx, comps) Binary (idx, comps)
# ==========================================================
# 5 idx=0, comps=1 idx=0, comps=5
# 50 idx=9, comps=10 idx=9, comps=4
# 100 idx=19, comps=20 idx=19, comps=4
# 42 idx=-1, comps=20 idx=-1, comps=5
Observation: For small data at the beginning, linear search may use fewer comparisons. For large data or elements near the end, binary search is vastly more efficient.
Program 6: Search in a List of Strings
def linear_search_string(arr, key):
"""Linear search in a list of strings (case-insensitive)."""
key = key.lower
for i in range(len(arr)):
if arr[i].lower == key:
return i
return -1
def binary_search_string(arr, key):
"""Binary search in a sorted list of strings (case-insensitive)."""
key = key.lower
low = 0
high = len(arr) - 1
while low <= high:
mid = (low + high) // 2
mid_val = arr[mid].lower
if mid_val == key:
return mid
elif key < mid_val:
high = mid - 1
else:
low = mid + 1
return -1
# Test
names = ["Aman", "Kavya", "Priya", "Rahul", "Sneha", "Vikram"]
print("Names (sorted):", names)
# Linear search
search_name = input("\nEnter name to search (linear): ")
idx = linear_search_string(names, search_name)
if idx != -1:
print(f" Found '{names[idx]}' at index {idx}")
else:
print(f" '{search_name}' not found.")
# Binary search
search_name = input("\nEnter name to search (binary): ")
idx = binary_search_string(names, search_name)
if idx != -1:
print(f" Found '{names[idx]}' at index {idx}")
else:
print(f" '{search_name}' not found.")
# Test Run:
# Names (sorted): ['Aman', 'Kavya', 'Priya', 'Rahul', 'Sneha', 'Vikram']
#
# Enter name to search (linear): priya
# Found 'Priya' at index 2
#
# Enter name to search (binary): RAHUL
# Found 'Rahul' at index 3
Common Mistakes
| Mistake | Correct Approach |
|---|---|
| Using binary search on unsorted data | Binary search requires sorted data |
mid = (low + high) / 2 (float division) |
Use // for integer division: (low + high) // 2 |
while low < high (strict) |
Use while low <= high (inclusive) to check the last element |
Forgetting to update low or high |
After comparison: low = mid + 1 or high = mid - 1 |
Setting low = mid or high = mid |
This causes infinite loops; must be mid + 1 or mid - 1 |
Returning True/False instead of index |
expects the index or position to be returned |
$1$2 Quick Tips
- Binary search requires sorted data - if the question gives unsorted data, sort first or use linear search.
- Trace questions: Show low, high, mid at each step. This is the most common question format.
- Comparison of linear vs binary search is a very frequent 2-3 mark question. Know the table.
- O(n) vs O(log n): For n=1000, linear search needs up to 1000 comparisons while binary search needs at most 10 (log2(1000) ~ 10).
- Code writing: Be able to write both algorithms from memory. Binary search is more commonly asked.
- Edge cases: What happens when the list is empty? When the key is not present? Your code should handle both.
- Remember:
//integer division in the mid calculation, using/gives a float and causes an error. - Binary search loop condition:
low <= high(notlow < high). Missing the=means you skip checking when low equals high. - Position vs Index: Some questions ask for position (1-based) instead of index (0-based). Position = index + 1.
Prefer watching over reading?
Subscribe for free.