How to Implement LFU Cache from Scratch for Technical Interviews

Design LFU Cache: Intuition & Trade-offs

Welcome back! Today we are tackling the LFU (Least Frequently Used) Cache. This is a classic interview problem that feels tricky at first, but once you visualize the structure, it becomes elegant.

Your goal is to build a cache where get and put operations are always O(1). The eviction rule is specific: remove the item that has been used the least often. If two items have the same frequency, remove the one that was used least recently (LRU).

The "Shelves" Analogy

Unlike a simple LRU cache (which just needs a list and a map), LFU must handle two moving parts simultaneously: frequency counts and recency.

The cleanest way to achieve O(1) is to maintain a hierarchy of frequency buckets. Think of your cache as a library with specific shelves:

  • Shelf 1: Contains items accessed exactly 1 time.
  • Shelf 2: Contains items accessed exactly 2 times.
  • Shelf 3: Contains items accessed exactly 3 times.

Crucially, on each shelf, the items are arranged by recency (like an LRU list). The most recently used item is at the front, and the least recently used is at the back.

The Golden Rule:

When you access an item, you pick it up from its current shelf and move it to the next higher shelf, placing it at the very front.

Visualizing the Shelves

Shelf 1
Item A (Freq: 1)
Shelf 2
Shelf 3

*Notice how "Item A" moves from Shelf 1 to Shelf 2 when accessed, maintaining its position at the front (most recent).

The Misconception: "One Hash Map is Enough"

A very common first thought is: "I'll just store key → (value, frequency) in a single hash map. When I need to evict, I'll loop through the map to find the smallest frequency."

Problem: That loop is O(n). If your cache has 1 million items, finding the one to remove takes a long time. That breaks our O(1) requirement.

The Solution: Two Coordinated Maps

To achieve strict O(1) performance, we need to stop searching for the minimum frequency and start tracking it. We do this with two data structures working in harmony:

1. Key → Node Map

This is your standard dictionary. It lets you find any item instantly.

"user_123" → Node{val: 5, freq: 2, pointers...}

2. Freq → List Map

This maps a frequency number to a Doubly Linked List of items.

1 → [A, C]
2 → [B]
3 → [D]

Why does this work in O(1)?

  1. Finding the Minimum Frequency: We maintain a simple variable min_freq. When we insert a new item, min_freq resets to 1. When we evict the last item from the min_freq list, we increment min_freq. No scanning required!
  2. Moving Nodes: Because we have direct pointers to where a node sits in a linked list (from Map #1), we can "splice" it out of Shelf 1 and "splice" it into Shelf 2 instantly.
  3. Eviction: We simply look at freq_list_map[min_freq] and remove the item at the tail (the least recently used one).

This two-map structure is the minimal setup that preserves all the relationships you need for constant-time operations. It trades a bit of memory (to store the pointers) for massive gains in speed.

The LFU Logic: Frequency First, Recency Second

Now that we have our "Shelves" visualized, let's dig into the specific logic that governs them. Understanding the decision hierarchy is the key to acing the interview question.

The "Score": A Two-Step Decision

When the cache is full and needs to make space, it doesn't just pick a random item. It performs a strict two-step check to determine the "score" of every item. Think of it like a sorting algorithm where every item has a tuple: (frequency, last_access_time).

The Eviction Algorithm

  1. Step 1: Find the Minimum Frequency.
    Look at all items and find the lowest frequency number (e.g., everyone on "Shelf 1"). Ignore everything else for a moment.
  2. Step 2: Tie-Break with Recency.
    If multiple items share that lowest frequency, look at their timestamps. Evict the one that was accessed longest ago.

The Misconception: "Is LFU just LRU with a counter?"

This is the most dangerous trap in this problem. Many students think LFU is simply LRU where we add a `count` variable. It is not.

LRU only cares about time.
LFU cares about popularity first, and time second.

LRU vs. LFU: A Showdown

Current Cache State
  • A
    Item A
    Freq: 100 | Last: 10 min ago
    Popular but Old
  • B
    Item B
    Freq: 1 | Last: Just now
    New & Rare
LRU Strategy:
Evicts Item A
Why? Item A was accessed longer ago (10 min vs Just now).
LFU Strategy:
Evicts Item B
Why? Item B has lower frequency (1 vs 100). Recency doesn't matter yet!

As you can see above, the two strategies make opposite decisions.

If you implement LFU by simply storing a frequency count and then scanning for the minimum count on eviction (without maintaining frequency groups with their own recency orders), you will fail to handle these ties correctly, or worse, you will achieve O(n) performance instead of O(1).

The correct LFU design maintains separate recency orders for each frequency level. When you access an item, you don't just increment its count—you move it from its current frequency list to the front of the next higher frequency list.

O(1) Cache Operations: The Real Cost of Efficiency

In an interview, O(1) isn't just a theoretical checkbox—it's a direct test of your ability to engineer a system that won't buckle under real traffic. Caches are the backbone of high-throughput services; if a single get or put operation degrades to O(n) due to a hidden scan, latency spikes will cascade into system-wide slowdowns.

Interviewers ask for O(1) to see if you can manage the inherent complexity of LFU (two moving dimensions: frequency and recency) without resorting to brute force. Achieving this with LFU is harder than with LRU, precisely because you must coordinate multiple data structures seamlessly. Nailing it demonstrates you understand how to use memory (extra pointers, maps) to buy time—a fundamental trade-off in systems design.

The Search for the Minimum Frequency

Current Cache State
Item A Freq: 5
Item B Freq: 2
Item C Freq: 8
Item D Freq: 1
Algorithm Execution
Ready to search...

*Click the buttons to see how the two approaches differ. The Brute Force method checks every single item. The LFU method looks at a single variable.

The Misconception: "One Map is Enough"

The most common flawed approach is: "I'll just keep a single hash map key → (value, count) and, when evicting, scan all entries to find the smallest count."

Problem: That scan is O(n). Even if you try to track min_freq separately, you'll face O(n) updates when that list becomes empty. Why? To find the new minimum frequency, you'd need to scan all possible frequency counts—again, O(n).

The core issue is that a single map gives you no efficient way to group by frequency or find the LRU item within the minimum group.

The Solution: Two Coordinated Maps

The only path to true O(1) is the two-map structure we discussed. This allows every operation to be a constant-time pointer manipulation:

1. Eviction

node = freq_list_map[min_freq].tail.prev
Direct access to the least recent item in the minimum frequency bucket. No searching required.

2. Access (Get/Put)

remove(node)add_to_front(node.freq + 1)
Splice the node out of its current list and push it to the front of the next list.

Consider the critical step in a get(key) operation. Notice how there are no loops here:

def get(self, key: int) -> int:
    # 1. Check if key exists (O(1))
    if key not in self.key_node_map:
        return -1

    node = self.key_node_map[key]
    
    # 2. Update Frequency (O(1))
    # We splice 'node' out of its current list and move it to freq + 1
    self._update_freq(node)
    
    return node.val

Every step is a hash lookup or pointer update. That's the O(1) guarantee interviewers expect—and the misconception that a single map suffices is exactly what they'll probe to see if you truly understand the layered design.

Technical Interview Data Structures: The LFU Trio

Welcome back! Now that we understand the logic of LFU, let's look at the mechanics. In an interview, you won't just say "I'll use a map." You need to specify exactly which data structures work together to achieve that O(1) performance.

To build a working LFU Cache, you need exactly three components working in harmony. I call this the "LFU Trio".

The LFU Trio: How They Connect

1. Key-Node Map
"User_123"
2. The Node
Node Object
val: 50
freq: 2
[prev] <-- [me] --> [next]
3. Freq-List Map
Key: 2
Value: [Linked List]
A
B
C

The Components Explained

1. Key-Node Map

Map<String, Node>
This is your standard dictionary. It maps the Key directly to the memory address of the Node. It allows you to find any item instantly without scanning.

2. The Node

The container for your data. It holds the Value and the Frequency. Crucially, it has prev and next pointers to link itself into a Doubly Linked List.

3. Freq-List Map

Map<Int, DoublyLinkedList>
This is the heart of the design. The key is the frequency (1, 2, 3...). The value is a list of all nodes with that frequency, ordered by recency.

The Misconception: "One Map is Enough"

Many students try to simplify this by using just one map: key → (value, frequency). They think, "I can just loop through the map to find the minimum frequency when I need to evict."

Problem: That loop is O(n). If you have 100,000 items, looping through all of them just to find one to delete is too slow. It breaks the O(1) requirement.

The "Single Map" Trap vs. The "Two Map" Solution

Cache Memory (100 Items)
Algorithm Execution
Ready to evict...

*Notice how the Single Map approach has to check every single item (yellow) to find the one to evict. The Two Map approach goes straight to the bucket.

The Freq-List Map solves both problems instantly:

  1. Finding the Victim: You don't search. You just look at freq_list_map[min_freq]. The tail of that list is always the item to evict.
  2. Updating Min Freq: If a list becomes empty, you know the new minimum must be the next integer up (or the frequency of the node you just added). No scanning required.

Without these frequency-ordered lists, you cannot efficiently group by frequency and maintain recency within the group. That grouping is non-negotiable for O(1) LFU.

LFU Cache Implementation: The Stepwise Assembly

Now that we understand the theory, let's build the machine. Implementing an LFU cache is like assembling a complex engine: you need the right parts (Data Structures) and the correct sequence of operations (Workflow).

The core insight for implementation is that every access is a migration event. When you touch an item, you aren't just "reading" it; you are promoting it to a higher status.

The Implementation Workflow

1. Key-Node Map (The Address Book)
"Item A"
Node { val: 10, freq: 1 }
2. Freq-List Map (The Shelves)
Freq 1
Item A
Freq 2
Waiting for promotion...
Ready to simulate...

The Misconception: Skipping the Frequency Update

The most subtle bug in LFU implementation is forgetting to increment the frequency counter. It's tempting to think: "I found the node, I'll just move it to the front of the list to mark it as recent."

The Trap: If you move the node but don't update node.freq += 1, the node stays in the wrong bucket. Your cache effectively becomes an LRU cache with a broken frequency counter.

The "Stagnant Frequency" Bug

❌ Broken Logic (No Freq Update)
Freq 1
Item A
Freq 2
Empty

Result: Item A stays in Freq 1. The system thinks it's "rare," even though you just accessed it.

✅ Fixed Logic (Update Freq)
Freq 1
Empty
Freq 2
Item A

Result: Item A moves to Freq 2. The system correctly recognizes it as "popular."

*Click "Run Comparison" to see how the broken approach fails to promote the item.

The Critical Code Pattern

To avoid the trap above, your get and put methods must follow this strict sequence. Note the bolded line—that is the heartbeat of the LFU algorithm.

def update_freq(self, node):
    # 1. Identify current state
    old_freq = node.freq
    
    # 2. REMOVE from old list
    self.freq_list_map[old_freq].remove_node(node)
    
    # 3. UPDATE frequency (The Critical Step!)
    node.freq += 1
    
    # 4. ADD to new list
    if node.freq not in self.freq_list_map:
        self.freq_list_map[node.freq] = FreqList()
    self.freq_list_map[node.freq].add_to_front(node)
    
    # 5. Maintain min_freq if needed
    if not self.freq_list_map[old_freq]:
        if old_freq == self.min_freq:
            self.min_freq += 1

This pattern ensures that the "Address Book" (Key-Node Map) and the "Shelves" (Freq-List Map) remain perfectly synchronized. If you skip step 3, the synchronization breaks, and your cache logic collapses.

Cache Eviction Policy Interview: Intuition & Edge Cases

Now, let's talk about how to sell this solution in an interview. You've built the structure, but the interviewer needs to know you understand the decision-making process behind the eviction.

When asked "How do you decide what to evict?", do not just say "The least frequent." That is incomplete. You must articulate the Two-Level Decision Rule:

The Golden Rule of LFU Eviction:

  1. Primary Filter: Find the lowest frequency group (the bucket with the smallest number).
  2. Tie-Breaker: If multiple items share that frequency, evict the one that is Least Recently Used (the one at the tail of the list).

The Tie-Breaker Visualization

This is the most common interview trap. Imagine a scenario where you have two items, Item A and Item B, and they both have a frequency of 2. The cache is full and needs space. Who goes?

If you only look at frequency, it's a tie. You must look at recency. The item accessed longer ago is the victim.

Scenario: The Tie-Breaker

Frequency Bucket: 2
A
Item A
Freq: 2 | Last Access: 10 mins ago
Tail (LRU)
B
Item B
Freq: 2 | Last Access: Just now
Head (MRU)

Waiting for eviction trigger...

*This proves why simply storing a frequency count is insufficient. You need a linked list for each frequency to track recency order.

The Misconception: "Eviction is Automatic"

A frequent mistake is assuming that once you have the two-map structure, eviction "just works." In reality, interviewers will probe how you handle these subtle edge cases:

1. The "Aging" Problem

LFU uses cumulative counts. An item accessed heavily in the past stays in a high-frequency bucket forever, even if never used again.

Interview Tip: Acknowledge this trade-off. Mention that variants like LFU with Aging periodically decay counts to fix this.

2. The min_freq Trap

When you evict the last item from min_freq, you must find the next smallest frequency.

Simply incrementing min_freq by 1 fails if there are gaps (e.g., Freq 2 is empty, but Freq 3 has items). You must check for the next non-empty list.

The Critical Code Pattern

To handle the min_freq update correctly, your eviction logic must be precise. When the list at min_freq becomes empty, you don't just guess the next frequency; you verify it.

def _evict(self):
    # 1. Identify the victim
    victim = self.freq_list_map[self.min_freq].remove_tail()
    
    # 2. Check if the bucket is now empty
    if not self.freq_list_map[self.min_freq]:
        # 3. CRITICAL: Find next valid min_freq
        # (Assuming no gaps usually, but safe to check)
        while self.min_freq in self.freq_list_map and not self.freq_list_map[self.min_freq]:
            self.min_freq += 1
        # If map is empty, reset min_freq to 1 (new insertion case)
        if self.min_freq not in self.freq_list_map:
            self.min_freq = 1
            
    return victim.key

Explicitly listing these edge cases—and showing how your code addresses them—proves you've thought beyond the happy path. It shows you understand that the eviction policy is not a single rule, but a set of invariants your data structure must uphold after every operation.

Core Data Structures: Hash Map and Doubly Linked List

Now that we understand the logic of LFU, let's look at the mechanics. In an interview, you won't just say "I'll use a map." You need to specify exactly which data structures work together to achieve that O(1) performance.

Intuition: How Hash Map Provides O(1) Key Lookup

Think of the key_node_map as your master contact list. When someone asks for a key, you don't search through a line of people—you look it up directly by name.

That's exactly what a hash map (or dictionary) gives you: given any key, you find its associated node in constant time. The node is the container holding the value, the current frequency count, and—critically—pointers that tie it to its position in a frequency list. Without this instant lookup, every get or put would start with a slow scan, breaking the O(1) promise.

Hash Map vs. Linear Search

Data Structure: Array vs. Map
A
B
C
D Target
E
Algorithm Execution
Ready to search for "D"...

*Click the buttons to see the difference. The Array must check every item. The Hash Map jumps straight to "D".

The Misconception: Ignoring the Need for a Doubly Linked List

Here's the subtle trap: you might think, "I'll store the nodes in the hash map and keep a separate list of all keys sorted by frequency." But sorting is O(n log n) and updating it on every access is worse.

The real insight is that within each frequency group, you need an LRU list, and you need to be able to yank any node out of that list instantly—not by searching, but by following pointers stored in the node itself.

That's why the node must have prev and next pointers, and why each frequency bucket is a doubly linked list. The list orders nodes by recency (newest at head, oldest at tail).

The "Splicing" Challenge

Head
A
B
C
Tail
Removing Node B
Select an approach to see how we remove Node B.

*Notice how the Singly Linked list has to traverse from the start to find Node A (the predecessor). The Doubly Linked list just looks at Node B's "Prev" pointer.

If you tried to use an array or a singly linked list, removal would require scanning to find the node's position—O(n) again. The doubly linked list, with pointers stored in the node, lets you remove and insert anywhere in constant time. The hash map gives you the node; the node's pointers let you reposition it within its frequency hierarchy. You need both: one for fast lookup, the other for fast reordering. Forget either, and your operations degrade to linear time.

Step-by-Step Implementation Guide

Now that we have the data structures ready, let's build the engine. In an interview, you won't just be asked to draw the structure—you'll need to write the code that makes it work.

The core insight for implementation is that every access is a migration event. Whether you are calling get or put, once you find the node, the logic is identical: you must update its frequency and physically move it to a higher-frequency bucket.

The Shared Workflow: Get vs. Put

1. Key-Node Map (The Address Book)
"Key A"
Node { val: 10, freq: 1 }
2. Freq-List Map (The Shelves)
Freq 1
Key A
Freq 2
Waiting for promotion...
Ready to simulate access...

The Misconception: The "Read-Only" Trap

The most common and critical bug in LFU implementation is treating get as a read-only operation. You might think: "I just need to return the value; the frequency shouldn't change."

The Trap: In LFU, every access—including get—counts as a use. If you don't run the full frequency update after a get, the cache's state becomes corrupted.

What happens if you skip the update in get?

The Consequence:

  • The node's freq counter stays the same.
  • The node remains in its old frequency bucket.
  • Over time, items never "graduate" to higher buckets. The min_freq bucket becomes a dumping ground.
  • Result: You've effectively built an LRU cache, not an LFU cache.

Remember: The get operation is not just a lookup—it's an access that must affect eviction priority. The moment you find the node in key_node_map, you must immediately call your frequency update helper.

The Critical Code Pattern

To avoid the trap above, your get and put methods must follow this strict sequence. Note the bolded line in the _update_node function—that is the heartbeat of the LFU algorithm.

def get(self, key):
    if key not in self.key_node_map:
        return -1
    node = self.key_node_map[key]
    # This is non-negotiable!
    self._update_node(node)  
    return node.value

def _update_node(self, node):
    old_freq = node.freq
    
    # 1. REMOVE from old list
    self.freq_list_map[old_freq].remove_node(node)
    
    # 2. UPDATE frequency (The Critical Step!)
    node.freq += 1
    new_freq = node.freq
    
    # 3. ADD to new list
    if new_freq not in self.freq_list_map:
        self.freq_list_map[new_freq] = FreqList()
    self.freq_list_map[new_freq].add_to_front(node)
    
    # 4. Maintain min_freq if needed
    if not self.freq_list_map[old_freq]:
        if old_freq == self.min_freq:
            self.min_freq = new_freq

This pattern ensures that the "Address Book" (Key-Node Map) and the "Shelves" (Freq-List Map) remain perfectly synchronized. If you skip step 2, the synchronization breaks, and your cache logic collapses.

Advanced Topics: Production Polish & Edge Cases

Congratulations! You have built the core LFU engine. It works, it's O(1), and it handles the main logic. But in a real-world system (or a senior-level interview), we often ask: "What about memory leaks? What about weird edge cases?"

Let's look at two advanced refinements that make your solution production-ready: cleaning up empty memory buckets and handling the capacity = 0 edge case.

Refinement 1: Shrinking Frequency Lists (Memory Cleanup)

In our basic design, when a node moves from Frequency 1 to Frequency 2, we remove it from the list at Freq 1. If that list becomes empty, we usually just leave the empty list sitting in our freq_list_map.

Over time, if you have 1,000 items that all climb to Frequency 100, you might end up with a map containing empty entries for frequencies 1 through 99. This wastes memory. The fix is to delete the empty list immediately after it becomes empty.

The "Empty Bucket" Cleanup

Freq List Map (Dictionary)
1
Item A
2
Empty
3
Empty
Memory State
Map contains keys: [1, 2, 3]

*Step 1: Promote Item A. Bucket 1 becomes empty. Step 2: Prune. Bucket 1 is removed from memory.

def _update_node(self, node):
    old_freq = node.freq
    
    # 1. Remove from old list
    self.freq_list_map[old_freq].remove_node(node)
    
    # 2. Check if list is empty
    if not self.freq_list_map[old_freq]:
        # CRITICAL: If it was the minimum, update min_freq first
        if old_freq == self.min_freq:
            self.min_freq += 1
        # THEN delete the key from the map
        del self.freq_list_map[old_freq]

Refinement 2: The "Zero Capacity" Edge Case

What if the interviewer asks: "What happens if I initialize the cache with capacity = 0?"

Logic dictates that a cache of size 0 cannot hold anything. Therefore:

  • get(key) should always return -1.
  • put(key, val) should do absolutely nothing.

This is a classic "Guard Clause" scenario. You handle it at the very top of your functions before any logic runs.

The Zero Capacity Guard

Current Capacity
0
STORE DISABLED
Execution Log
Ready. Capacity is 0.

*Notice how the code returns immediately without allocating memory when capacity is 0.

def put(self, key, value):
    # GUARD CLAUSE: Handle edge case first
    if self.capacity == 0:
        return
    
    # ... rest of logic ...

Interview Strategy: Don't Over-Optimize

While these refinements are great for production, remember the Interview Priority:

The Golden Rule of Interviewing:

Correctness & Clarity > Micro-Optimizations.

Your interviewer wants to see that you understand the Two-Map Structure and can handle the O(1) Frequency Update.

If you spend 20 minutes trying to optimize memory cleanup before your core logic works, you will fail.

Always implement the core solution first. Once you have a working, bug-free O(1) cache, then you can say: "One final optimization I'd make in production is pruning empty frequency buckets to save memory..." That shows seniority without risking the core solution.

Testing and Validation: Proving Your O(1) Promise

Congratulations! You've designed the structure and written the code. But in a real interview (and in production), saying "It works" isn't enough. You must prove "It works efficiently, regardless of size."

This is where Testing comes in. Your test suite is your evidence. It needs to validate two things:

1. Functional Correctness

Does the cache evict the right item? Does the frequency counter update correctly after a get?

2. Performance Guarantees

Does the operation time stay constant as the cache grows? Are there hidden loops?

The Test Suite Simulator

test_lfu_cache.py
# Ready to execute tests...
Test Scenarios
  • 1
    Eviction Order
    Verify LFU policy & Tie-Breaker
    PENDING
  • 2
    Frequency Update
    Verify node migration after GET
    PENDING
  • 3
    O(1) Scalability
    10,000 ops on size 1000 cache
    PENDING

The Misconception: "Manual Testing is Enough"

It is very common to walk through an example by hand: "I put A, I put B, I get A, so B is evicted. See? It works."

The Problem: Manual testing is fragile. It misses edge cases (like the "Zero Capacity" guard) and gives zero evidence about performance. You can't "see" O(1) by hand.

A robust test suite must be automated. It should explicitly check the invariants that make your solution O(1).

def test_lfu_eviction_order():
    # Setup: Cache size 2
    cache = LFUCache(2)
    
    # 1. Insert items (freq=1)
    cache.put(1, "a") # Key 1
    cache.put(2, "b") # Key 2
    
    # 2. Access Key 1 (freq becomes 2)
    cache.get(1) 
    
    # 3. Insert Key 3 (Cache full, min_freq is 1)
    # Should evict Key 2 (freq 1, older than Key 1)
    cache.put(3, "c")
    
    assert cache.get(2) == -1  # Evicted?
    assert cache.get(1) == "a" # Still there?
    assert cache.get(3) == "c" # New item?

The Stress Test: Validating O(1)

How do you prove it's O(1)? You run a Stress Test. You create a cache with 1,000 items and perform 10,000 random operations.

If your implementation is correct, the time taken per operation should remain constant. If you have a hidden loop (like scanning for the minimum frequency), the time will spike as the cache grows.

Post a Comment

Previous Post Next Post