Unit 2 of 10 · 10 min read

Programming & Data Structures

C and pointers, recursion, arrays, linked lists, stacks, queues, trees, BSTs, heaps, graphs.


Why this unit matters

Programming and Data Structures is the unit where ISRO CS tests your ability to trace code rather than simply recall definitions. Past papers have featured C pointer arithmetic output prediction, malloc/free pitfalls, recursive function evaluation, BST insertion/deletion sequences, AVL rotation triggers, and heap property verification. Expect 10-12 marks, and expect that at least half of them require you to simulate execution on paper.

The unit connects directly to Algorithms (complexity of DS operations), Compiler Design (symbol table implementation, stack-based expression evaluation), and Operating Systems (process control blocks as linked structures).

Syllabus map

  • C programming essentials

    • Data types, operators, precedence and associativity
    • Pointers: declaration, dereferencing, pointer arithmetic
    • Arrays vs pointers: decay rule, 2D arrays
    • Strings: null-terminated, standard library functions
    • Structures and unions: memory layout, padding, bit fields
    • Dynamic memory: malloc, calloc, realloc, free, dangling pointers, memory leaks
    • Function pointers
    • Recursion: head, tail, indirect, mutual recursion
  • Linear data structures

    • Arrays: static allocation, random access, insertion/deletion cost
    • Linked lists: singly linked, doubly linked, circular linked
    • Stacks: array and linked implementation, applications (balanced brackets, infix-to-postfix, function call stack)
    • Queues: linear, circular, deque, priority queue
  • Trees

    • Binary tree: properties, full, complete, perfect, skewed
    • Tree traversals: inorder, preorder, postorder (recursive and iterative), level-order (BFS)
    • BST: search, insert, delete (three cases)
    • AVL tree: balance factor, LL, LR, RL, RR rotations
    • B-tree and B+ tree: order, properties, use in databases
    • Binary heap: min-heap, max-heap, heapify, build-heap, heap sort
  • Graphs

    • Adjacency matrix vs adjacency list
    • BFS and DFS traversal, time and space complexity

C programming

Pointers and pointer arithmetic

A pointer holds a memory address. Incrementing a pointer of type T* advances the address by sizeof(T) bytes.

int arr[] = {10, 20, 30, 40};
int *p = arr;
printf("%d", *(p + 2)); // prints 30

The array name decays to a pointer to its first element in most contexts. Exceptions: sizeof(arr) gives the total array size, not the pointer size, and &arr gives a pointer to the whole array.

ISRO trap: int *p = arr and int (*p)[4] = &arr are different. The first is a pointer to int; the second is a pointer to an array of 4 ints. Incrementing the second moves 4 * sizeof(int) bytes.

Pointer to function

int add(int a, int b) { return a + b; }
int (*fp)(int, int) = add;
printf("%d", fp(3, 4)); // prints 7

Function pointers are used in callback patterns and jump tables.

Structures and memory layout

The compiler may insert padding bytes between struct members to satisfy alignment requirements. The size of a struct is not necessarily the sum of its member sizes.

struct S {
    char c;    // 1 byte
    // 3 bytes padding
    int x;     // 4 bytes
};
// sizeof(struct S) = 8, not 5

A union uses the same memory for all members. sizeof(union) = size of the largest member (plus any alignment padding).

Recursion

A recursive function must have a base case. Tail recursion: the recursive call is the last operation. Compilers can optimise tail recursion into a loop (tail call optimisation). Head recursion: the recursive call comes before other work in the function body.

Mutual recursion: function A calls B, which calls A. Both need forward declarations in C.

int factorial(int n) {
    if (n == 0) return 1;
    return n * factorial(n - 1); // head recursion: the multiply waits for the recursive call to return
}

Stack depth for this call is O(n). Tail-recursive version:

int fact_tail(int n, int acc) {
    if (n == 0) return acc;
    return fact_tail(n - 1, n * acc);
}

Linear data structures

Linked list operations

Inserting at the head of a singly linked list: O(1). Inserting at the tail without a tail pointer: O(n). Deleting a node requires the predecessor; in a singly linked list you must traverse to find it (O(n)), unless you have the address of the previous node.

A doubly linked list stores both next and prev pointers. Deletion is O(1) given the node (no need to search for the predecessor). Cost: 2x pointer storage.

ISRO trap on circular linked lists: the last node points back to the first node (not NULL). Traversal must check for the head node, not NULL, as the stopping condition.

Stack applications

Infix to postfix conversion uses a stack of operators. The precedence rules determine when to pop:

  • Higher-precedence operators on the stack are popped before a lower-precedence incoming operator is pushed.
  • Left-associative operators at the same precedence are also popped before pushing.

Checking balanced brackets: push opening brackets; on a closing bracket, pop and check match.

Queue variations

A circular queue uses an array of size N with front and rear indices modulo N. The queue is full when (rear + 1) % N == front (one slot wasted to distinguish full from empty), or you maintain a separate count.

A priority queue supports insert and extract-min (or extract-max). Implemented with a binary heap: both operations are O(log n).

Trees

Binary tree properties

For a full binary tree with n internal nodes: exactly n+1 leaf nodes. For a complete binary tree with n nodes: height = floor(log2 n). For a perfect binary tree of height h: 2^(h+1) - 1 nodes.

Tree traversals

Inorder (left, root, right): for a BST, this gives sorted order. Preorder (root, left, right): useful for copying a tree. Postorder (left, right, root): useful for deletion (process children before parent). Level-order: BFS using a queue.

Given preorder and inorder (or postorder and inorder), you can reconstruct the unique binary tree. Given only preorder and postorder, reconstruction is not unique for non-full trees.

BST operations

Search: compare key with root; recurse left if smaller, right if larger. O(h) where h is height. Insert: same traversal as search; insert at the NULL position where search fails. Delete: three cases:

  1. Node is a leaf: simply remove it.
  2. Node has one child: replace node with its child.
  3. Node has two children: replace node's key with its inorder successor (smallest in right subtree), then delete the inorder successor.

AVL tree

An AVL tree maintains the invariant: |height(left subtree) - height(right subtree)| <= 1 for every node. The balance factor (BF) = height(left) - height(right).

When BF becomes +2 or -2 after an insertion, rebalance with rotations:

  • LL imbalance (BF = +2 at node, BF >= 0 at left child): single right rotation.
  • RR imbalance (BF = -2 at node, BF <= 0 at right child): single left rotation.
  • LR imbalance (BF = +2 at node, BF < 0 at left child): double rotation, left on child then right on node.
  • RL imbalance (BF = -2 at node, BF > 0 at right child): double rotation, right on child then left on node.

Binary heap

A max-heap has the property: parent >= children for all nodes. Implemented as an array where for node at index i (1-based): left child = 2i, right child = 2i+1, parent = floor(i/2).

Heapify (sift-down): O(log n). Build-heap from arbitrary array: apply heapify from n/2 down to 1. Total time: O(n), not O(n log n). Extract-max: swap root with last element, reduce size, heapify root. O(log n). Insert: place at end, sift-up. O(log n).

Graphs

Representation comparison

Adjacency matrix: O(V^2) space. O(1) edge existence check. O(V) to find all neighbours. Adjacency list: O(V + E) space. O(degree) for edge check. O(degree) for neighbours.

For sparse graphs (E much less than V^2), adjacency lists are preferred. For dense graphs or when edge-existence queries are frequent, the matrix is preferable.

BFS and DFS

BFS uses a queue. Visits vertices level by level. Can compute shortest-path distance (in terms of number of edges) from the source in an unweighted graph.

DFS uses a stack (or recursion). Produces discovery and finish times, which help in topological sort and identifying strongly connected components.

Both are O(V + E) on adjacency list representation.

Worked examples

Example 1: Pointer output

int a[3][3] = {{1,2,3},{4,5,6},{7,8,9}};
int *p = a[0];
printf("%d %d", *(p+4), *(*a+4));

a[0] is the first row, an array of 3 ints, which decays to an int* pointing at a[0][0]. Since a 2D array is stored row-major, p+4 points to a[1][1] = 5. *a is a[0], so *a+4 is the same address. Output: 5 5.

Example 2: Recursive function trace

int f(int n) {
    if (n <= 1) return n;
    return f(n-1) + f(n-2);
}

This is the Fibonacci function. Each call spawns two further calls, so the running time is exponential, O(2^n).

Trace f(4): f(4) = f(3) + f(2) = (f(2)+f(1)) + (f(1)+f(0)) = ((f(1)+f(0))+1) + (1+0) = ((1+0)+1)+(1+0) = 2+1 = 3.

Example 3: AVL insertion

Insert keys in order: 10, 20, 30.

Insert 10: root = 10. Insert 20: 10's right child = 20. BF(10) = -1. OK. Insert 30: 20's right child = 30. BF(20) = -1. BF(10) = -2. RR imbalance at 10.

Single left rotation on 10: 20 becomes root, 10 becomes 20's left child, 30 stays 20's right child.

Result: balanced BST with 20 at root.

Example 4: Build-heap from array

Array: [4, 10, 3, 5, 1] (1-indexed).

Start heapify from index 2 (n/2 = 2):

  • Heapify index 2: children are 5 (index 4) and 1 (index 5). 10 > 5 and 10 > 1. No swap.
  • Heapify index 1: children are 10 (index 2) and 3 (index 3). 10 is the largest, so swap 4 and 10. Array: [10, 4, 3, 5, 1]. Continue heapifying at index 2: children 5 and 1. 5 > 4, swap. Array: [10, 5, 3, 4, 1]. Index 4 has no children. Done.

Max-heap: [10, 5, 3, 4, 1].

Example 5: Tree reconstruction from preorder + inorder

Preorder: [3, 9, 20, 15, 7]. Inorder: [9, 3, 15, 20, 7].

Root = 3 (first of preorder). In inorder, 3 is at index 1. Left subtree: inorder [9], preorder [9]. Right subtree: inorder [15, 20, 7], preorder [20, 15, 7].

Right subtree root = 20 (first of [20, 15, 7]). In [15, 20, 7], 20 is at index 1. Left of 20: [15], right of 20: [7].

Tree: 3 with left 9 and right 20; 20 with left 15 and right 7.

Quick-revision summary

  • Pointer arithmetic: adding k to a T* pointer moves k * sizeof(T) bytes.
  • Array name decays to pointer in expressions, but not in sizeof.
  • sizeof(struct) includes padding; sizeof(union) = size of largest member.
  • BST delete with two children: replace with inorder successor, then delete successor.
  • AVL imbalance: four cases (LL, RR, LR, RL). LR and RL need double rotations.
  • Build-heap is O(n); repeated insert is O(n log n). Distinction matters for exam questions.
  • BFS: queue, level-order, shortest path in unweighted graphs.
  • DFS: stack/recursion, used for cycle detection, topological sort, SCCs.
  • Adjacency list: O(V+E) space, preferred for sparse graphs.
  • Tail recursion can be optimised into iteration by the compiler.

How to study this unit

  1. Write C programs for each data structure from scratch: linked list insert/delete, stack push/pop, BST insert/delete/search. Running code in your head is faster when you have written it by hand.
  2. Practice tracing C code with pointers on paper. Draw boxes for variables and arrows for pointers. ISRO gives code-output MCQs that require exact simulation.
  3. Draw AVL rotations step by step for insertion sequences of 5-7 keys until you can identify the rotation type in under 10 seconds.
  4. For heaps, always maintain the array representation alongside the tree diagram. Verify heap property after each operation.
  5. Know the time complexities as a table: search/insert/delete for array, linked list, BST (average and worst), AVL, heap, hash table. ISRO loves comparison questions.
  6. Revise traversal reconstruction (given two traversals, build the tree) as it is a classic 2-mark question.
