Lesson 20 of 2012 min read

Searching Algorithms

Share:WhatsAppLinkedIn

Prerequisites: Lists, loops, functions, sorting (for binary search)


1. Linear Search

Concept, Check each element one by one from the beginning., Works on both sorted and unsorted lists., Stop when the element is found or the list ends., Also called Sequential Search.

Algorithm

For each element in the list:
 If element equals the key:
 Return the index
Return -1 (not found)

Code

def linear_search(arr, key):
 for i in range(len(arr)):
 if arr[i] == key:
 return i # Found - return index
 return -1 # Not found

Dry Run Trace

Input: arr = [10, 23, 45, 70, 11, 15], key = 70

Step 1: Compare arr[0]=10 with 70 -> Not equal
Step 2: Compare arr[1]=23 with 70 -> Not equal
Step 3: Compare arr[2]=45 with 70 -> Not equal
Step 4: Compare arr[3]=70 with 70 -> FOUND at index 3

Result: 70 found at index 3
Total comparisons: 4

Input: arr = [10, 23, 45, 70, 11, 15], key = 50

Step 1: Compare arr[0]=10 with 50 -> Not equal
Step 2: Compare arr[1]=23 with 50 -> Not equal
Step 3: Compare arr[2]=45 with 50 -> Not equal
Step 4: Compare arr[3]=70 with 50 -> Not equal
Step 5: Compare arr[4]=11 with 50 -> Not equal
Step 6: Compare arr[5]=15 with 50 -> Not equal

Result: 50 not found
Total comparisons: 6

Complexity

  • Best case: O(1), element is at the first position
  • Worst case: O(n), element is at the last position or not present
  • Average case: O(n), Works on unsorted data, No preprocessing required

2. Binary Search

Concept, Works only on sorted lists., Repeatedly divide the search interval in half., Compare the key with the middle element:

  • If equal: found.
  • If key < middle: search the left half.
  • If key > middle: search the right half., Much faster than linear search for large datasets.

Algorithm (Iterative)

low = 0
high = n - 1
While low <= high:
 mid = (low + high) // 2
 If arr[mid] == key:
 Return mid
 Else if key < arr[mid]:
 high = mid - 1
 Else:
 low = mid + 1
Return -1

Code

def binary_search(arr, key):
 low = 0
 high = len(arr) - 1

 while low <= high:
 mid = (low + high) // 2
 if arr[mid] == key:
 return mid # Found
 elif key < arr[mid]:
 high = mid - 1 # Search left half
 else:
 low = mid + 1 # Search right half

 return -1 # Not found

Dry Run Trace

Input: arr = [10, 20, 30, 40, 50, 60, 70, 80], key = 50

Step 1: low=0, high=7, mid=(0+7)//2=3
 arr[3]=40, 50 > 40 -> Search right: low=4

Step 2: low=4, high=7, mid=(4+7)//2=5
 arr[5]=60, 50 < 60 -> Search left: high=4

Step 3: low=4, high=4, mid=(4+4)//2=4
 arr[4]=50, 50 == 50 -> FOUND at index 4

Result: 50 found at index 4
Total comparisons: 3

Input: arr = [10, 20, 30, 40, 50, 60, 70, 80], key = 35

Step 1: low=0, high=7, mid=3
 arr[3]=40, 35 < 40 -> high=2

Step 2: low=0, high=2, mid=1
 arr[1]=20, 35 > 20 -> low=2

Step 3: low=2, high=2, mid=2
 arr[2]=30, 35 > 30 -> low=3

Step 4: low=3, high=2 -> low > high -> STOP

Result: 35 not found
Total comparisons: 3

Complexity

  • Best case: O(1), element is at the middle
  • Worst case: O(log n), repeatedly halving the search space
  • Average case: O(log n)
  • Requirement: List must be sorted

Why O(log n)?

For a list of 8 elements:

8 -> 4 -> 2 -> 1 (3 comparisons max = log2(8))

For a list of 1000 elements:

1000 -> 500 -> 250 -> 125 -> 63 -> 32 -> 16 -> 8 -> 4 -> 2 -> 1
(about 10 comparisons max = log2(1000) ~ 10)

3. Comparison: Linear Search vs Binary Search

Feature Linear Search Binary Search
Sorted data required No Yes
Best case O(1) O(1)
Worst case O(n) O(log n)
Average case O(n) O(log n)
Method Sequential comparison Divide and conquer
Simple to implement Yes Slightly complex
Suitable for Small or unsorted data Large sorted data
Comparisons (n=1000) Up to 1000 Up to 10
Comparisons (n=1,000,000) Up to 1,000,000 Up to 20

4. When to Use Which?

Scenario Best Choice
Data is unsorted Linear Search
Data is sorted and large Binary Search
Data is very small (< 20 elements) Either works fine
Data changes frequently (many inserts/deletes) Linear Search (no need to re-sort)
Data is sorted and rarely changes Binary Search
Need to find all occurrences Linear Search (scan all)

Practice Programs

Program 1: Linear Search in a List

def linear_search(arr, key):
 for i in range(len(arr)):
 if arr[i] == key:
 return i
 return -1

# Main program
data = [45, 23, 78, 12, 56, 34, 89, 67]
print("List:", data)

key = int(input("Enter element to search: "))
result = linear_search(data, key)

if result != -1:
 print(f"{key} found at index {result} (position {result + 1})")
else:
 print(f"{key} not found in the list.")

# Test Run 1:
# List: [45, 23, 78, 12, 56, 34, 89, 67]
# Enter element to search: 56
# 56 found at index 4 (position 5)

# Test Run 2:
# List: [45, 23, 78, 12, 56, 34, 89, 67]
# Enter element to search: 99
# 99 not found in the list.

Program 2: Linear Search Returning All Positions

def linear_search_all(arr, key):
 positions = []
 for i in range(len(arr)):
 if arr[i] == key:
 positions.append(i)
 return positions

# Main program
data = [10, 20, 30, 20, 40, 20, 50]
print("List:", data)

key = int(input("Enter element to search: "))
positions = linear_search_all(data, key)

if positions:
 print(f"{key} found at indices: {positions}")
 print(f"Total occurrences: {len(positions)}")
else:
 print(f"{key} not found in the list.")

# Test Run:
# List: [10, 20, 30, 20, 40, 20, 50]
# Enter element to search: 20
# 20 found at indices: [1, 3, 5]
# Total occurrences: 3

Program 3: Binary Search in a Sorted List

def binary_search(arr, key):
 low = 0
 high = len(arr) - 1

 while low <= high:
 mid = (low + high) // 2
 if arr[mid] == key:
 return mid
 elif key < arr[mid]:
 high = mid - 1
 else:
 low = mid + 1

 return -1

# Main program
data = [11, 22, 33, 44, 55, 66, 77, 88, 99]
print("Sorted list:", data)

key = int(input("Enter element to search: "))
result = binary_search(data, key)

if result != -1:
 print(f"{key} found at index {result}")
else:
 print(f"{key} not found in the list.")

# Test Run 1:
# Sorted list: [11, 22, 33, 44, 55, 66, 77, 88, 99]
# Enter element to search: 66
# 66 found at index 5

# Test Run 2:
# Sorted list: [11, 22, 33, 44, 55, 66, 77, 88, 99]
# Enter element to search: 50
# 50 not found in the list.

Program 4: Binary Search with Trace Showing Comparisons

def binary_search_trace(arr, key):
 low = 0
 high = len(arr) - 1
 step = 0

 print(f"\nSearching for {key} in {arr}")
 print(f"{'Step':<6}{'Low':<6}{'High':<6}{'Mid':<6}{'arr[mid]':<10}{'Action'}")
 print("-" * 50)

 while low <= high:
 mid = (low + high) // 2
 step += 1

 if arr[mid] == key:
 print(f"{step:<6}{low:<6}{high:<6}{mid:<6}{arr[mid]:<10}FOUND!")
 print(f"\n{key} found at index {mid} in {step} comparison(s).")
 return mid
 elif key < arr[mid]:
 print(f"{step:<6}{low:<6}{high:<6}{mid:<6}{arr[mid]:<10}Search LEFT (high={mid-1})")
 high = mid - 1
 else:
 print(f"{step:<6}{low:<6}{high:<6}{mid:<6}{arr[mid]:<10}Search RIGHT (low={mid+1})")
 low = mid + 1

 print(f"\n{key} not found. Total comparisons: {step}")
 return -1

# Test 1: Element present
data = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
binary_search_trace(data, 70)

# Output:
# Searching for 70 in [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
# Step Low High Mid arr[mid] Action
# --------------------------------------------------
# 1 0 9 4 50 Search RIGHT (low=5)
# 2 5 9 7 80 Search LEFT (high=6)
# 3 5 6 5 60 Search RIGHT (low=6)
# 4 6 6 6 70 FOUND!
#
# 70 found at index 6 in 4 comparison(s).

# Test 2: Element not present
binary_search_trace(data, 45)

# Output:
# Searching for 45 in [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
# Step Low High Mid arr[mid] Action
# --------------------------------------------------
# 1 0 9 4 50 Search LEFT (high=3)
# 2 0 3 1 20 Search RIGHT (low=2)
# 3 2 3 2 30 Search RIGHT (low=3)
# 4 3 3 3 40 Search RIGHT (low=4)
#
# 45 not found. Total comparisons: 4

Program 5: Count Comparisons for Both Algorithms on Same Data

def linear_search_count(arr, key):
 comparisons = 0
 for i in range(len(arr)):
 comparisons += 1
 if arr[i] == key:
 return i, comparisons
 return -1, comparisons

def binary_search_count(arr, key):
 low = 0
 high = len(arr) - 1
 comparisons = 0

 while low <= high:
 mid = (low + high) // 2
 comparisons += 1
 if arr[mid] == key:
 return mid, comparisons
 elif key < arr[mid]:
 high = mid - 1
 else:
 low = mid + 1

 return -1, comparisons

# Test with sorted data
data = [i * 5 for i in range(1, 21)] # [5, 10, 15, ..., 100]
print("Data:", data)
print(f"Size: {len(data)} elements\n")

# Search for various keys
test_keys = [5, 50, 100, 42]

print(f"{'Key':<8}{'Linear (idx, comps)':<25}{'Binary (idx, comps)':<25}")
print("=" * 58)

for key in test_keys:
 l_idx, l_comp = linear_search_count(data, key)
 b_idx, b_comp = binary_search_count(data, key)
 l_str = f"idx={l_idx}, comps={l_comp}"
 b_str = f"idx={b_idx}, comps={b_comp}"
 print(f"{key:<8}{l_str:<25}{b_str:<25}")

# Output:
# Data: [5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100]
# Size: 20 elements
#
# Key Linear (idx, comps) Binary (idx, comps)
# ==========================================================
# 5 idx=0, comps=1 idx=0, comps=5
# 50 idx=9, comps=10 idx=9, comps=4
# 100 idx=19, comps=20 idx=19, comps=4
# 42 idx=-1, comps=20 idx=-1, comps=5

Observation: For small data at the beginning, linear search may use fewer comparisons. For large data or elements near the end, binary search is vastly more efficient.

Program 6: Search in a List of Strings

def linear_search_string(arr, key):
 """Linear search in a list of strings (case-insensitive)."""
 key = key.lower
 for i in range(len(arr)):
 if arr[i].lower == key:
 return i
 return -1

def binary_search_string(arr, key):
 """Binary search in a sorted list of strings (case-insensitive)."""
 key = key.lower
 low = 0
 high = len(arr) - 1

 while low <= high:
 mid = (low + high) // 2
 mid_val = arr[mid].lower

 if mid_val == key:
 return mid
 elif key < mid_val:
 high = mid - 1
 else:
 low = mid + 1

 return -1

# Test
names = ["Aman", "Kavya", "Priya", "Rahul", "Sneha", "Vikram"]
print("Names (sorted):", names)

# Linear search
search_name = input("\nEnter name to search (linear): ")
idx = linear_search_string(names, search_name)
if idx != -1:
 print(f" Found '{names[idx]}' at index {idx}")
else:
 print(f" '{search_name}' not found.")

# Binary search
search_name = input("\nEnter name to search (binary): ")
idx = binary_search_string(names, search_name)
if idx != -1:
 print(f" Found '{names[idx]}' at index {idx}")
else:
 print(f" '{search_name}' not found.")

# Test Run:
# Names (sorted): ['Aman', 'Kavya', 'Priya', 'Rahul', 'Sneha', 'Vikram']
#
# Enter name to search (linear): priya
# Found 'Priya' at index 2
#
# Enter name to search (binary): RAHUL
# Found 'Rahul' at index 3

Common Mistakes

Mistake Correct Approach
Using binary search on unsorted data Binary search requires sorted data
mid = (low + high) / 2 (float division) Use // for integer division: (low + high) // 2
while low < high (strict) Use while low <= high (inclusive) to check the last element
Forgetting to update low or high After comparison: low = mid + 1 or high = mid - 1
Setting low = mid or high = mid This causes infinite loops; must be mid + 1 or mid - 1
Returning True/False instead of index expects the index or position to be returned

$1$2 Quick Tips

  1. Binary search requires sorted data - if the question gives unsorted data, sort first or use linear search.
  2. Trace questions: Show low, high, mid at each step. This is the most common question format.
  3. Comparison of linear vs binary search is a very frequent 2-3 mark question. Know the table.
  4. O(n) vs O(log n): For n=1000, linear search needs up to 1000 comparisons while binary search needs at most 10 (log2(1000) ~ 10).
  5. Code writing: Be able to write both algorithms from memory. Binary search is more commonly asked.
  6. Edge cases: What happens when the list is empty? When the key is not present? Your code should handle both.
  7. Remember: // integer division in the mid calculation, using / gives a float and causes an error.
  8. Binary search loop condition: low <= high (not low < high). Missing the = means you skip checking when low equals high.
  9. Position vs Index: Some questions ask for position (1-based) instead of index (0-based). Position = index + 1.

Test Your Knowledge

Take a quick quiz on this lesson

Start Quiz →

Prefer watching over reading?

Subscribe for free.

Subscribe on YouTube