Post

Back to Basics: Algorithms and Data Structures in C

Back to Basics: Algorithms and Data Structures in C

Every few years I go back to the fundamentals. Not because I’ve forgotten how merge sort works, but because implementing things from scratch in C forces a level of precision that higher-level languages let you skip. There’s no sorted() to lean on. No garbage collector. Just pointers, memory, and the compiler telling you exactly where you went wrong.

This round I worked through problems from two classics: CLRS (Introduction to Algorithms) and Cracking the Coding Interview. All in C, no libraries beyond stdlib.h and string.h.

Algorithms and data structures in C

Sorting: Three Algorithms, One File

Started with the CLRS fundamentals: insertion sort, selection sort, and merge sort. All three implemented in a single file with a shared printArray() helper.

Insertion sort is the one everyone can write from memory, but the details matter. Walking backwards through the sorted portion with a while loop instead of a for avoids the off-by-one bugs that come from managing two index variables:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
void insertion_sort(int * arr, int len)
{
  for (int i = 1; i <= SIZE-1; i++)
  {
    int j = i;
    while (j > 0 && arr[j] < arr[j-1])
    {
      int tmp = arr[j-1];
      arr[j-1] = arr[j];
      arr[j] = tmp;
      j--;
    }
  }
}

Selection sort is the conceptual inverse – find the minimum, swap it to the front, repeat. The only subtlety is skipping the swap when min_index == i (the element is already in place).

Merge sort is where C makes you earn it. You’re malloc-ing temporary arrays for each merge, copying data in, merging back, and free-ing the temps. The merge step uses explicit bounds checking for when one half empties before the other – no fancy sentinel values:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
while (k <= r)
{
    if (j >= rightSize)
        arr[k] = left[i++];
    else if (i >= leftSize)
        arr[k] = right[j++];
    else if (left[i] < right[j])
        arr[k] = left[i++];
    else
        arr[k] = right[j++];
    k++;
}
free(left);
free(right);

Also implemented recursive binary search. The twist: it returns the value that matches or the next higher value, which is slightly more useful than a simple “found/not found” interface.

Cracking the Coding Interview: Chapter 1 Strings

Worked through seven string problems from CTCI Chapter 1. A few highlights:

Unique Characters (1.1) – Three Approaches

This one is great for interviews because there are three natural solutions with different tradeoffs:

  1. O(n^2) brute force – nested loops, no extra space
  2. O(n) with a lookup array – 128-element boolean array for ASCII
  3. O(n) with a bit vector – pack 26 flags into a single unsigned int

The bit vector version is the most satisfying:

1
2
3
4
5
6
7
8
9
10
11
12
bool all_unique_in_string_bit_vector(char * str)
{
    unsigned int check = 0;
    for (int i = 0; str[i] != '\0'; i++)
    {
        int bitValue = str[i] - 'a';
        bool isDupe = (check & (1 << bitValue)) > 0;
        if (isDupe) return false;
        check |= (1 << bitValue);
    }
    return true;
}

26 characters, 26 bits. No array, no hash map, just arithmetic.

One Edit Away (1.5)

Given two strings, check if they’re zero or one edits apart (insert, remove, or replace a single character). The key insight is that insert and remove are the same operation from different perspectives – if string A needs one insert to become string B, then string B needs one remove to become string A.

The implementation splits into two cases: same-length strings (check for at most one character difference) and strings differing by one character (walk both with independent indices, allowing one skip). Clean, O(n), no extra space.

In-Place Matrix Rotation (1.7)

Rotate an NxN matrix 90 degrees clockwise, in place. The trick is processing concentric “rings” from outside in, and for each ring, rotating four elements at a time:

1
2
3
4
5
6
7
8
9
10
11
for (int j = 0; j < 4; j++)  // 360 / 90 = 4 swaps per element
{
    int nextRow = currCol;
    int nextCol = matrixSize - currRow - 1;
    int nextVal = matrix[nextRow][nextCol];
    tmp = nextVal;
    matrix[nextRow][nextCol] = currVal;
    currVal = tmp;
    currRow = nextRow;
    currCol = nextCol;
}

The formula nextRow = currCol, nextCol = size - currRow - 1 handles the 90-degree coordinate mapping. Four swaps per position, processing size/2 rings. O(n^2) time, O(1) space.

String Compression (1.6)

Run-length encoding: aabccccc becomes a2b1c5. The interesting constraint is returning the original string if compression doesn’t actually shrink it. In C this means managing a malloc‘d buffer and deciding at the end whether to return it or the original.

Why C?

I write Go and Python at work. C is a choice, not a requirement. But there’s something about C that keeps the fundamentals honest:

  • No hidden allocations. Every byte of memory is your responsibility. Merge sort’s malloc/free dance makes the O(n) space cost visceral.
  • No string class. String manipulation means pointer arithmetic and null terminators. You can’t accidentally O(n^2) a string builder if you’re thinking about every byte.
  • No generics, no templates. Your sort function works on int*. If you want it to work on something else, you write qsort with void pointers and a comparator. It’s ugly and educational.

The full source is on GitHub.

This post is licensed under CC BY 4.0 by the author.