How to Implement a Linked List from Scratch

What is linked list implementation?

Think of a linked list not as a list, but as a train. Each car (node) holds your data and has a single coupling (a next pointer) that points to the next car. There's no central depot; the entire train is found by starting at the first car and following the couplings one by one. This is the core intuition for a singly linked list: a chain of independent nodes connected only by these forward pointers.

This pointer-based chain is fundamentally different from an array's contiguous block of memory. An array is like a long row of houses—you know exactly where house #5 is by simple arithmetic. In a linked list train, you have to walk from the engine (the head) through each car to find the fifth one. That walk is traversal, and it's the price you pay for the list's flexible, dynamic size.

Interactive Lab: Array vs. Linked List Access

Watch how the computer finds Index 3 in both structures. Notice how the Linked List must visit every previous node to get there.

Array (Contiguous Memory) Ready
0
1
2
3
Linked List (Scattered Memory) Ready
0
1
2
3
Click "Find Index 3" to compare access methods.

A common beginner assumption is that linked lists are universally "faster" than arrays because inserting or deleting feels like a simple pointer change. This is often wrong for real-world performance. The overhead of those extra pointers (memory) and the cost of chasing scattered memory addresses (time) usually makes arrays faster for most access patterns.

Arrays benefit from cache locality: when you read element 0, the CPU likely preloads elements 1, 2, 3... into its fast cache because they're right next door. A linked list's nodes can be scattered all over memory, causing constant cache misses and slowing down even simple traversal. So, while linked lists excel at frequent, arbitrary insertions/deletions, they are typically slower for sequential access than a well-used array.

Data structures for beginners: linked list fundamentals

Think of a singly linked list node as the simplest possible "smart container". It holds data and knows about exactly one other container. This simple pattern—a self-referential structure with a next pointer—is a fundamental building block.

Once you understand how to chain these nodes together, you can build more complex structures by simply changing the rules of connection:

Stack (LIFO)

Just a linked list where you only add/remove at the head.

Queue (FIFO)

A linked list with a tail pointer, adding at the tail and removing from the head.

Trees & Graphs

Nodes with multiple pointers (like left, right). The core idea is the same.

In computer science, we separate the what (the Abstract Data Type, or ADT) from the how (the data structure). The ADT "list" promises operations like insert and delete. A linked list is one way to fulfill that promise—using dynamic nodes and pointers. Another way is an array (contiguous memory). Understanding the linked list implementation gives you a mental model for how dynamic, pointer-based structures work under the hood.

Common Pitfall: Assuming Constant-Time Indexing

A critical insight from our train analogy: to reach the fifth car, you must start at the engine and walk forward, counting each coupling. There is no way to jump directly to "position 5."

This means accessing an element by its index (like list[5]) is an O(n) operation. The time it takes grows linearly with how far you want to go.

Simulation: The Cost of Traversal

Watch how the computer "walks" through the list to find a specific index. Unlike an array, it cannot teleport.

0
Head
1
Node
2
Node
3
Node
4
Tail
Current
Target Index: 3
Ready to traverse...

The Code Behind the Walk

// Start at the first car
Node* current = head;

// Walk 'index' steps forward
for (int i = 0; i < index; i++) {
    // Check if we fell off the end
    if (current == NULL) {
        return NULL; 
    }
    // Follow the coupling to next car
    current = current->next;   
}
// Found the target car
return current;

The for loop is the walk. If index is 1000, you must perform up to 1000 pointer dereferences (current->next). This is fundamentally different from an array, where array[1000] calculates the memory address instantly (base address + element size * 1000), a constant-time O(1) operation.

Why this matters: If your algorithm frequently needs random access (e.g., "give me the 500th element"), a linked list will be dramatically slower than an array. The linked list's strength is efficient insertion/deletion once you're already at the correct position, and efficient sequential traversal. Never assume it provides fast random access—that's the array's job.

Linked list in C: setting up the environment

Welcome to the "under the hood" experience. If you've programmed in Python or Java, you're used to references feeling like magic—just assign node.next = other_node and the computer handles the rest.

In C, the magic curtain is pulled back. You are now the architect of memory. You explicitly ask for space, you explicitly hold the addresses, and you explicitly free it. This isn't just "harder"—it's empowering. By forcing you to handle the * (dereference) and & (address-of) operators, C ensures you truly understand that a pointer is simply an integer storing a memory location.

The C Skeleton: Your First Linked List

Before we write complex logic, we need our building block. A C linked list node is a struct that contains data and a pointer to itself.

#include <stdio.h>
#include <stdlib.h> // CRITICAL: For memory allocation

// 1. The Node Definition
typedef struct Node {
    int data;               // The payload
    struct Node* next;      // The bridge to the next node
} Node;

int main() {
    // Start with an empty list (NULL)
    Node* head = NULL;

    // 2. Allocate memory for the first node
    // malloc asks the OS for enough bytes to hold a Node
    head = (Node*)malloc(sizeof(Node));

    // 3. Initialize
    head->data = 10;        // Set the value
    head->next = NULL;     // Mark end of list

    // ... use the list ...

    // 4. Clean up (prevent memory leaks)
    free(head);
    return 0;
}
Why typedef?

It saves typing. Instead of writing struct Node* everywhere, you can just write Node*.

Why head = NULL?

A list without a head pointer is lost. NULL tells us the list is currently empty.

The Silent Killer: Forgetting stdlib.h

This is the #1 reason beginners get Segmentation Faults even when their logic seems perfect. If you forget #include <stdlib.h>, the compiler doesn't know what malloc returns.

Memory Truncation Simulator

Return Value 0x7FFF...
Status: Safe
The pointer is full size (64-bit). It points to the correct memory.
Compiler Logic:
void* = malloc(size); // Correct type

When the compiler assumes int (32-bit) instead of void* (64-bit on modern systems), it chops off the upper half of your memory address. You end up with a pointer that points to the wrong place—often inside the operating system's protected memory. When you try to write data there, the OS kills your program instantly.

The Rule: Always include <stdlib.h> at the top. It tells the compiler: "I am going to ask for memory, and I expect a full pointer back."

Implement linked list from scratch: core structure

We've talked about the "train" analogy, but now we need to look at the blueprint. In C, a node isn't a magical object; it's a rigid block of memory defined by a struct.

The magic of the linked list lies in the next pointer. It doesn't hold the next node inside it—that would be infinite recursion! Instead, it holds the address (a number) of where the next node lives.

Anatomy of a Node: Memory Layout

When you call malloc(sizeof(Node)), the computer reserves a single, contiguous block of memory. Here is exactly what that block looks like on a 64-bit system:

struct Node sizeof = 12 bytes
data (int)
42
4 bytes
next (pointer)
0x7F...
8 bytes

Key Insight: The data and next are neighbors in memory. They are glued together.

However, the next pointer points to a completely different memory address, likely far away in the heap. That is the "coupling" that connects the train cars.

Notice something important: there is no struct LinkedList. The "list" is an illusion created by the head pointer.

The Silent Killer: Uninitialized Pointers

In C, declaring a variable does not give it a value. It gives it garbage.

If you write Node* head;, the head variable contains a random number from your computer's memory. If you try to use it immediately, your program will crash.

Lab: The Danger of Uninitialized Memory

Let's simulate what happens when you forget to initialize head to NULL.

main.c
1 Node* head;
2 // head contains GARBAGE
3 head->data = 10; // CRASH HERE?
Memory Inspector (Value of 'head'):
0x00000000 (NULL)

Choose Scenario

The rule is simple: Always initialize your pointers. Even if you plan to assign them a valid memory address immediately after, setting them to NULL first is a safety habit that saves you from hours of debugging "Segmentation Faults."

Once head is safely NULL, you are ready to start allocating your first node and linking them together.

Core linked list operations: Rewiring the Train

We've established that a linked list is a chain of couplings. Now, let's perform surgery on the train. In C, operations like insertion and deletion are purely about rewiring.

You don't "shift" elements like in an array. You simply change the next pointer of the previous node to point to a different address.

The Deletion Trap: Free vs. Rewire

The most common bug in linked list deletion is order of operations.
Wrong: Free the memory first. (The pointer becomes a "dangling pointer" pointing to garbage).
Right: Rewire the previous node's coupling first, then free the memory.

Simulation: Deleting Node B

A
Head
B
To Delete
C
Tail
Ready to delete Node B
// The Logic
// 1. Locate B
// 2. ...

Why the order matters: If you call free(B) before changing A's pointer, A is left pointing to a "dead" memory address. If you try to follow A's pointer later, your program will crash (Segmentation Fault).

Here is the correct pattern for deletion. Notice how we handle the rewiring (lines 13-17) before we call free() (line 19).

delete_node.c
// Function to delete node at specific index
void delete_node_at_index(Node** head_ref, int index) {
    Node* current = *head_ref;
    Node* prev = NULL;

    // 1. Find the node to delete (and its predecessor)
    for (int i = 0; current != NULL && i < index; i++) {
        prev = current;
        current = current->next;
    }
    if (current == NULL) return; // Index out of bounds

    // 2. REWIRE: The Critical Step
    if (prev == NULL) {
        // Special case: Deleting the Head
        *head_ref = current->next; // Head now points to second node
    } else {
        // Normal case: Skip over 'current'
        prev->next = current->next; 
    }

    // 3. FREE: Now it's safe to scrap the memory
    free(current); 
}

The Golden Rule of Deletion

Never free a node while it is still being pointed to. Always update the previous node's next pointer to bypass the target node before releasing the target node's memory.

Insertion in linked list: step-by-step guide

We've mastered the anatomy of the train. Now, let's perform the most common surgery: Insertion.

Think of the train cars again. Inserting a new car depends entirely on where you attach it.

1

Head Insertion

You are adding a new engine. You don't need to walk anywhere.

  • 1. New Node points to old Head.
  • 2. Head pointer points to New Node.
  • Complexity: O(1) (Instant)
2

Tail Insertion

You are adding a caboose. You must walk the whole train.

  • 1. Walk until current->next == NULL.
  • 2. Set last car's next to New Node.
  • Complexity: O(n) (Slow)
3

Middle Insertion

You are adding a car between two others.

  • 1. Find the predecessor node.
  • 2. New Node points to predecessor's next.
  • 3. Predecessor points to New Node.
  • Complexity: O(n) (Walk + Link)

Head insertion is the fastest operation you can perform—technically O(1). It doesn't matter if your list has 1 car or 1,000,000 cars. You just attach the new one at the front.

However, in C, this speed comes with a dangerous trap. Beginners often write code that looks correct but fails silently. This is the "Pass-by-Value" Trap.

The C Trap: Updating the Head Pointer

When you call a function in C, arguments are passed by value. This means the function gets a copy of the variable.

If you pass the head pointer to a function, the function receives a copy of the address. If you change that copy inside the function, the original head in main remains unchanged. You've created a new node, but you've lost the link to it.

Simulation: The Scope Problem

Main Scope (Caller) Function Scope (Callee)
Variable: head
NULL
Points to Heap
Parameter: head (Copy)
NULL
Heap (New Node)
Node(10)
// The Function Logic
// Click a button to run simulation

Observation: In the "Wrong" scenario, the function thinks it worked (its local copy points to the node), but main still thinks the list is empty. The new node is lost forever (memory leak).

To fix this, we need to give the function the address of the head variable itself. This is called a double pointer (Node**).

insert_head.c
// Correct Implementation
void insert_at_head(Node** head_ref, int data) {
    // 1. Allocate new node
    Node* new_node = malloc(sizeof(Node));
    new_node->data = data;

    // 2. Point new node to the OLD head
    // *head_ref gives us the value of head (the address)
    new_node->next = *head_ref;

    // 3. Update the ORIGINAL head pointer
    // *head_ref = new_node changes the variable in main()
    *head_ref = new_node;
}

This pattern—Node** head_ref—is the standard way to modify the head of a linked list in C. It ensures that whether the list is empty or full, your new node becomes the new engine, and your program remains safe.

Traversal and searching in linked list

Traversal is simply walking the train. You start with your hand on the engine (head) and follow each coupling (next) until you reach the end (NULL).

There is no alternative—this is the only way to visit every node in sequence. Every other operation (insertion, deletion, searching) must begin with some form of traversal to reach the relevant position.

Simulation: The Linear Walk

Watch how the current pointer moves from node to node. It cannot jump; it must visit every link in the chain.

10
Node 0
25
Node 1
42
Node 2
8
Node 3
NULL
End
current
Steps: 0
Ready to traverse...

The Code Behind the Walk

// Start at the engine
Node* current = head;

// Keep walking until you hit the end
while (current != NULL) {
    printf("%d ", current->data); // Process data
    current = current->next;            // Move to next coupling
}

The while loop is the walk. current is your moving hand. You never skip cars; you visit each one exactly once. This makes traversal inherently O(n)—time grows linearly with the number of nodes.

This linear cost is by design, not a bug. It's the trade-off for dynamic size and cheap insertion/deletion. Traversal is acceptable when:

  • You need to process every element (e.g., printing, summing).
  • Your list is relatively small.
  • Insertions/deletions are frequent, but full scans are rare.
  • You're building something like a stack or queue where you only ever touch the head or tail.

Common Misconception: Searching is Faster

The Myth: "Since linked lists are dynamic and pointers are 'fast,' searching (by value) should be quicker than in an array."

The Reality: Searching by value (e.g., "find the first node with data == 42") requires traversal in both linked lists and arrays. You must inspect each element until you find a match. In the worst case (value not present), you visit every node—O(n) for both.

Simulation: Search for Value 42

10
5
99
42
NULL
Checking...
Ready to search...

Why this is false: Searching by value requires traversal in both linked lists and arrays. You must inspect each element until you find a match. In the worst case (value not present), you visit every node/element—O(n) for both.

The Hidden Cost: Cache Locality

Even though the algorithmic complexity is O(n) for both, arrays are often faster in practice for sequential access.

Array (Contiguous Memory): The CPU can prefetch adjacent elements into its fast cache. If you read element 0, elements 1, 2, 3 are likely already loaded nearby.

Linked List (Scattered Memory): Nodes are scattered. Each current->next may jump to a completely unrelated memory address, causing cache misses that slow down even simple traversal.

Bottom line: Don't expect linked lists to beat arrays at searching or random access. Their strength is flexible reshaping of the chain after you've walked to the right spot. If your workload involves many "find and insert/delete" operations on large lists, consider whether the traversal overhead negates the insertion benefit—sometimes a balanced tree or a dynamic array is better.

Testing and debugging linked list code

You've written insert_at_head, delete_at_index, and traverse. You run main(), it prints something, and it exits without crashing. Is it correct? Probably not.

Linked list bugs are silent assassins. A single misplaced pointer can leave your list internally inconsistent—nodes orphaned, cycles created, memory leaked—yet the program may still run and print something. The only way to gain confidence is to systematically verify the list's structure after every operation, not just its output.

Simulation: The Test Harness

A test harness is a small function that builds a known state, performs an operation, and then walks the list to verify the sequence. We use assert() to stop immediately if the structure breaks.

Verifying: 10 → 20 → 30 → NULL

10
Node 0
20
Node 1
30
Node 2
NULL
End
Check
Ready to verify list structure...

The Test Code

// Test 1: Insert three nodes at head
insert_at_head(&head, 30);
insert_at_head(&head, 20);
insert_at_head(&head, 10); 

// Expected: 10 -> 20 -> 30 -> NULL
int expected[] = {10, 20, 30};
Node* current = head;

for (int i = 0; i < 3; i++) {
    // 1. Check if node exists
    assert(current != NULL);        
    // 2. Check if data matches
    assert(current->data == expected[i]); 
    // 3. Move to next
    current = current->next;
}
// 4. Check termination
assert(current == NULL);

Why this works: assert() halts the program immediately if a condition fails. We verify both the data sequence and the termination (current == NULL). This catches incorrect links and accidental cycles.

Common Misconception: "If the program runs and prints something, it's fine. I'll only debug when it segfaults."

The "Silent" Memory Errors

Most linked list errors are memory logic errors, not immediate crashes:

  • Dangling pointers: You free() a node, but another node still points to it. The program may continue for thousands of instructions before crashing.
  • Memory leaks: You allocate nodes but lose the head pointer. The program exits "cleanly" but leaks memory.
  • Cycles: A next pointer points to a previous node. Traversal loops forever, eventually causing a stack overflow.

Simulation: The Dangling Pointer Trap

This is the classic Use-After-Free bug. You delete a node, but forget to update the previous node's pointer. The previous node is now pointing to a "dead" memory address.

Memory Inspector

A 10
B 20
C 30
Select an action to see the memory consequences...
// The Correct Pattern
void delete_node(Node** head_ref, int key) {
    Node* temp = *head_ref;
    Node* prev = NULL;

    // 1. Locate the node
    while (temp != NULL && temp->data != key) {
        prev = temp;
        temp = temp->next;
    }
    if (temp == NULL) return;

    // 2. REWIRE (Critical Step)
    // If prev exists, point it to temp's next
    if (prev != NULL) prev->next = temp->next;
    else *head_ref = temp->next;

    // 3. FREE (Safe now)
    free(temp); 
}

Tools of the Trade:
AddressSanitizer (ASan): Compile with gcc -fsanitize=address -g. It detects invalid reads/writes and use-after-free errors instantly.
Valgrind: Run valgrind --leak-check=full ./program. It tracks every allocation and reports exactly where memory was lost.

Bottom line: A program that "seems to work" is often a program with a ticking memory bomb. Your job is to prove it's safe, not assume it. Write assert tests for structure, and run your code through ASan/Valgrind for memory safety.

Frequently Asked Questions (FAQ)

You've built the train, learned to add cars, and even performed surgery on the couplings. Now, let's address the most common questions that trip up students.

1. What is the difference between singly and doubly linked lists?

A Singly Linked List is a one-way street. You can only look forward. A Doubly Linked List gives every car a rear-view mirror, allowing you to look backward as well.

Data 10
prev

Click to add the prev pointer

Singly Linked List:
  • Uses struct Node* next;
  • Less memory (8 bytes less per node).
  • Cannot traverse backwards.
Doubly Linked List:
  • Uses struct Node* prev; AND next;
  • More memory usage.
  • Insertion/Deletion is easier (no need to find predecessor).

2. Why does my linked list crash when I delete the head?

This is the "Pass-by-Value" Trap. In C, when you pass a pointer to a function, you are passing a copy of the address. If you change that copy inside the function, the original head in your main() function doesn't know about it.

Wrong: Pass by Value

void delete_head(Node* head) {
    free(head);
    // head = head->next; 
    // This only changes the LOCAL copy!
    // The original 'head' in main() is still pointing to free'd memory.
}

Result: Segmentation Fault because the original pointer is now a "dangling pointer".

Right: Pass by Reference

void delete_head(Node** head_ref) {
    // Use ** (pointer to pointer)
    Node* temp = *head_ref;
    *head_ref = (*head_ref)->next; // Update the ORIGINAL head
    free(temp);
}

Result: Safe. The *head_ref dereferences the address of the head pointer, allowing us to modify the original variable.

3. How does memory allocation affect performance?

This is the Cache Locality problem. Arrays live in a contiguous block of memory (neighbors). Linked List nodes live wherever the memory manager (malloc) has space (strangers).

Array (Contiguous)

0
1
2
3

CPU loads them all at once into Cache.
Fast Traversal.

Linked List (Scattered)

10
20
30

CPU must fetch each one individually from RAM.
Slower Traversal (Cache Misses).

4. How do I handle circular linked lists?

A circular linked list is a train where the last car connects back to the engine. There is no NULL at the end.

A B C D E F
HEAD

Key Rules:

  • Termination: Stop when current == head (not NULL).
  • Traversal: Use a do-while loop to ensure you process the head node at least once.
  • Deletion: Be careful not to leave the list in a broken loop. If you delete the only node, set head = NULL.
while (current->next != head) {
  current = current->next;
}

5. What are common pitfalls for beginners?

Uninitialized Head

Always set Node* head = NULL; initially. Garbage values cause crashes.

Dangling Pointers

Never access node->next after calling free(node).

Infinite Loops

Accidentally creating a cycle (e.g., tail->next = head in a linear list) causes your loop to never end.

Recursion Depth

Avoid recursive traversal for large lists. It will crash the stack. Use a while loop.

6. Can I implement a linked list in languages other than C?

Yes! The logic is identical, but the memory management changes.

Python / Java / JavaScript

You use References (similar to pointers, but safer). You don't call malloc or free. The Garbage Collector handles memory automatically.

C++

Similar to C, but use new and delete. Modern C++ encourages Smart Pointers (std::unique_ptr) to prevent leaks automatically.

Post a Comment

Previous Post Next Post