Back to Basics: Algorithms and Data Structures in C
Every few years I go back to the fundamentals. Not because I’ve forgotten how merge sort works, but because implementing things from scratch in C forces a level of precision that higher-level languages let you skip. There’s no sorted() to lean on. No garbage collector. Just pointers, memory, and the compiler telling you exactly where you went wrong.
This round I worked through problems from two classics: CLRS (Introduction to Algorithms) and Cracking the Coding Interview. All in C, no libraries beyond stdlib.h and string.h.
Sorting: Three Algorithms, One File
Started with the CLRS fundamentals: insertion sort, selection sort, and merge sort. All three implemented in a single file with a shared printArray() helper.
Insertion sort is the one everyone can write from memory, but the details matter. Walking backwards through the sorted portion with a while loop instead of a for avoids the off-by-one bugs that come from managing two index variables:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
void insertion_sort(int * arr, int len)
{
for (int i = 1; i <= SIZE-1; i++)
{
int j = i;
while (j > 0 && arr[j] < arr[j-1])
{
int tmp = arr[j-1];
arr[j-1] = arr[j];
arr[j] = tmp;
j--;
}
}
}
Selection sort is the conceptual inverse – find the minimum, swap it to the front, repeat. The only subtlety is skipping the swap when min_index == i (the element is already in place).
Merge sort is where C makes you earn it. You’re malloc-ing temporary arrays for each merge, copying data in, merging back, and free-ing the temps. The merge step uses explicit bounds checking for when one half empties before the other – no fancy sentinel values:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
while (k <= r)
{
if (j >= rightSize)
arr[k] = left[i++];
else if (i >= leftSize)
arr[k] = right[j++];
else if (left[i] < right[j])
arr[k] = left[i++];
else
arr[k] = right[j++];
k++;
}
free(left);
free(right);
Also implemented recursive binary search. The twist: it returns the value that matches or the next higher value, which is slightly more useful than a simple “found/not found” interface.
Cracking the Coding Interview: Chapter 1 Strings
Worked through seven string problems from CTCI Chapter 1. A few highlights:
Unique Characters (1.1) – Three Approaches
This one is great for interviews because there are three natural solutions with different tradeoffs:
- O(n^2) brute force – nested loops, no extra space
- O(n) with a lookup array – 128-element boolean array for ASCII
- O(n) with a bit vector – pack 26 flags into a single
unsigned int
The bit vector version is the most satisfying:
1
2
3
4
5
6
7
8
9
10
11
12
bool all_unique_in_string_bit_vector(char * str)
{
unsigned int check = 0;
for (int i = 0; str[i] != '\0'; i++)
{
int bitValue = str[i] - 'a';
bool isDupe = (check & (1 << bitValue)) > 0;
if (isDupe) return false;
check |= (1 << bitValue);
}
return true;
}
26 characters, 26 bits. No array, no hash map, just arithmetic.
One Edit Away (1.5)
Given two strings, check if they’re zero or one edits apart (insert, remove, or replace a single character). The key insight is that insert and remove are the same operation from different perspectives – if string A needs one insert to become string B, then string B needs one remove to become string A.
The implementation splits into two cases: same-length strings (check for at most one character difference) and strings differing by one character (walk both with independent indices, allowing one skip). Clean, O(n), no extra space.
In-Place Matrix Rotation (1.7)
Rotate an NxN matrix 90 degrees clockwise, in place. The trick is processing concentric “rings” from outside in, and for each ring, rotating four elements at a time:
1
2
3
4
5
6
7
8
9
10
11
for (int j = 0; j < 4; j++) // 360 / 90 = 4 swaps per element
{
int nextRow = currCol;
int nextCol = matrixSize - currRow - 1;
int nextVal = matrix[nextRow][nextCol];
tmp = nextVal;
matrix[nextRow][nextCol] = currVal;
currVal = tmp;
currRow = nextRow;
currCol = nextCol;
}
The formula nextRow = currCol, nextCol = size - currRow - 1 handles the 90-degree coordinate mapping. Four swaps per position, processing size/2 rings. O(n^2) time, O(1) space.
String Compression (1.6)
Run-length encoding: aabccccc becomes a2b1c5. The interesting constraint is returning the original string if compression doesn’t actually shrink it. In C this means managing a malloc‘d buffer and deciding at the end whether to return it or the original.
Why C?
I write Go and Python at work. C is a choice, not a requirement. But there’s something about C that keeps the fundamentals honest:
- No hidden allocations. Every byte of memory is your responsibility. Merge sort’s
malloc/freedance makes the O(n) space cost visceral. - No string class. String manipulation means pointer arithmetic and null terminators. You can’t accidentally O(n^2) a string builder if you’re thinking about every byte.
- No generics, no templates. Your sort function works on
int*. If you want it to work on something else, you writeqsortwith void pointers and a comparator. It’s ugly and educational.
The full source is on GitHub.