At 3 AM on a Tuesday in July 2019, our production memory metrics started climbing and didn’t stop. No new deployments. No traffic spike. Just a slow, steady ascent toward OOM that our alerting hadn’t been tuned to catch early enough.
What followed was two days of debugging that ultimately had nothing to do with a memory leak in the traditional sense — no unreachable objects, no goroutine accumulation, no cgo boundary mismanagement. The culprit was a documented, intentional behavior in Go’s runtime that I had simply never thought through in the context of long-running services.
This post walks through that incident: what Go’s map allocator actually does, why it matters for services that process data in bulk cycles, how this interacts with GC pressure in production, and what Rust’s HashMap design reveals about a different set of trade-offs.
The Service and the Symptom
The service was a batch processor: pull a dataset from upstream, load it into memory, transform and emit, clear state, repeat. The cycle ran every few minutes. The implementation was straightforward Go — a top-level map that got populated and cleared on each iteration.
// records is the top-level map, reused and "cleared" on every cycle.
var records = make(map[int][128]byte)

func processCycle(source DataSource) error {
	// Phase 1: Load
	for i, record := range source.All() {
		records[i] = record
	}

	// Phase 2: Transform and emit
	if err := emitAll(records); err != nil {
		return err
	}

	// Phase 3: "Clear" and prepare for next cycle
	for k := range records {
		delete(records, k)
	}
	runtime.GC()

	return nil
}
After a few hours in production, RSS memory stabilized not at baseline but at ~290 MB — the high-water mark from the first populated cycle. Subsequent cycles showed no additional growth, but memory never returned toward zero.
Our initial hypothesis was a goroutine leak. Then a closure capturing the map reference. Both were wrong. runtime.ReadMemStats pointed at heap in use, not leaked goroutines, and pprof heap profiles showed the memory was in the map’s bucket array — fully reachable, not leaked, just not released.
What Go’s Map Allocator Actually Does
Go’s map is implemented as a hash table with a bucket array. In the classic implementation, each bucket holds 8 key-value pairs; when inserts push the table past its load factor (6.5 entries per bucket), the runtime allocates a new, larger bucket array and migrates entries incrementally. (Go 1.24 replaced this layout with a Swiss-table design, but the property described next is unchanged.)
The critical property: this bucket array never shrinks. Deleting a key marks the slot as empty but leaves the bucket array at its peak-allocation size. A map that once held 1,000,000 entries will retain the bucket memory for 1,000,000 entries even after every key has been deleted.
This is verifiable in isolation:
package main

import (
	"fmt"
	"runtime"
)

func main() {
	printHeap("initial") // ~0 MB

	m := make(map[int][128]byte)
	for i := 0; i < 1_000_000; i++ {
		m[i] = [128]byte{}
	}
	printHeap("after fill") // ~305 MB

	for k := range m {
		delete(m, k)
	}
	runtime.GC()

	// Reference m to prevent it from being collected entirely
	_ = len(m)
	printHeap("after delete + GC") // ~287 MB — buckets retained
}

func printHeap(label string) {
	var s runtime.MemStats
	runtime.ReadMemStats(&s)
	fmt.Printf("[%s] HeapInuse = %d MB\n", label, s.HeapInuse/1024/1024)
}
Output (Go 1.24):
[initial] HeapInuse = 0 MB
[after fill] HeapInuse = 305 MB
[after delete + GC] HeapInuse = 287 MB
The GC reclaims a small amount (partially freed spans at the allocator level), but the bucket array itself stays. This is not a bug; it is intentional, long-standing runtime behavior, motivated by a real trade-off: if you re-populate the map, you avoid the cost of re-growing the bucket array. The map has optimized for the case where you intend to reuse it.
The Production Impact: Why This Compounds
In an isolated benchmark, 290 MB sitting idle is manageable. In a production service, several factors compound the problem:
1. Pod/container sizing. Kubernetes memory limits are set based on observed usage. A service that spikes to 300 MB on first cycle and never reclaims will trigger OOM kills if requests arrive during a spike, or will force over-provisioned limits that cost money at scale.
2. GC pressure. The GC’s target heap size is governed by GOGC (default 100, meaning the heap may roughly double over the live set before a collection triggers). If your baseline is 290 MB of retained buckets, the GC may allow the heap to grow to ~580 MB before collecting. In a service processing large transient datasets, this shifts the GC trigger substantially; see the sketch after this list.
3. Multiple maps. A service with several maps following this pattern — caches, in-flight request state, batch buffers — can retain 3–5x more heap than its actual working set.
4. Long tail outliers. If one cycle processes 5x the normal dataset (burst traffic, catch-up after lag), the map grows to accommodate it and stays at that size permanently. Every subsequent smaller cycle sits at the burst high-water mark.
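If the retention can’t be eliminated right away, two runtime knobs bound the damage. A hedged sketch, not from the original incident: debug.SetGCPercent is the programmatic form of GOGC, and debug.SetMemoryLimit (Go 1.19 and later) is a soft heap cap that forces earlier collections as the total heap approaches it. The 1 GiB value is illustrative.

import "runtime/debug"

func init() {
	// GOGC: with ~290 MB of live, retained buckets, the default of 100
	// lets the heap roughly double (to ~580 MB) before the next GC.
	debug.SetGCPercent(100)

	// Soft memory limit: the GC runs more aggressively as the total heap
	// approaches this cap. Set it below the container limit.
	debug.SetMemoryLimit(1 << 30) // 1 GiB, illustrative
}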
The Fixes, Ranked by Invasiveness
Option 1: Reassign the map (lowest friction)
Drop the reference to the old map and allocate a fresh one. The GC will collect the old bucket array when no references remain.
// Instead of ranging and deleting:
records = make(map[int][128]byte) // old map eligible for GC
runtime.GC() // optional; GC will collect on next trigger
The key distinction from delete() in a loop: you’re not zeroing entries in an existing structure — you’re abandoning the structure entirely. The old bucket array becomes unreachable and will be collected.
This adds one allocation per cycle (allocating the new map header and initial buckets), but for a service running cycles every few minutes, this is not a meaningful overhead.
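Applied to the original cycle, the change is one line. A minimal sketch assuming the same top-level records variable:

func processCycle(source DataSource) error {
	// Phases 1 and 2 are unchanged: load, then emit.
	for i, record := range source.All() {
		records[i] = record
	}
	if err := emitAll(records); err != nil {
		return err
	}

	// Phase 3: abandon the old map instead of range-deleting it. The old
	// bucket array is now unreachable and will be collected by a later GC.
	records = make(map[int][128]byte)
	return nil
}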
Option 2: Use pointer values to reduce bucket size
Go moves values larger than 128 bytes to the heap and stores a pointer in the bucket. If you’re storing large structs directly as values, switching to pointers reduces how much memory the bucket array itself consumes — and, critically, allows the GC to collect the pointed-to values independently from the bucket array.
// Before: bucket array holds 128 bytes per slot inline
records := make(map[int][128]byte)
// After: bucket array holds 8-byte pointers; values collected independently
records := make(map[int]*[128]byte)
With pointer values, after delete(), the values themselves become unreachable and are collected. Only the bucket array (8 bytes per slot) is retained. For 1 million entries, this means the retained bucket memory drops from ~128 MB to ~8 MB — a 16x reduction in the persistent overhead.
This does not fully solve the problem (the bucket array still doesn’t shrink), but it significantly reduces the cost of the non-shrinking behavior.
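The 16x figure is easy to check with a variation of the earlier demo. A sketch assuming the same printHeap helper; the expected values follow from the per-slot math above rather than from a specific run:

m := make(map[int]*[128]byte)
for i := 0; i < 1_000_000; i++ {
	m[i] = &[128]byte{} // value lives on the heap; bucket slot holds a pointer
}
printHeap("after fill")

for k := range m {
	delete(m, k)
}
runtime.GC()
_ = len(m) // keep the map itself reachable

printHeap("after delete + GC") // expect roughly the ~8 MB bucket array to remain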
Option 3: Pre-size when you know the cardinality
If your cycle size is predictable, pre-allocate the map to avoid bucket growth during population. This doesn’t help with the shrinking problem, but it eliminates repeated reallocations during fill.
records := make(map[int][128]byte, expectedSize)
Pre-sizing is primarily a throughput optimization (fewer incremental rehashes), not a memory fix. But for a large, predictable dataset, it reduces GC pressure during the fill phase.
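A minimal benchmark sketch of that throughput effect, reusing the 1M-entry size from the earlier demo (results will vary by hardware and Go version); run with go test -bench=Fill -benchmem:

package maps_test

import "testing"

const n = 1_000_000

func BenchmarkFillGrown(b *testing.B) {
	for i := 0; i < b.N; i++ {
		m := make(map[int][128]byte) // grows and rehashes incrementally during fill
		for j := 0; j < n; j++ {
			m[j] = [128]byte{}
		}
	}
}

func BenchmarkFillPresized(b *testing.B) {
	for i := 0; i < b.N; i++ {
		m := make(map[int][128]byte, n) // buckets allocated once, up front
		for j := 0; j < n; j++ {
			m[j] = [128]byte{}
		}
	}
}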
How Rust’s HashMap Approaches This
Rust’s std::collections::HashMap takes a different approach. Since Rust 1.36 it has been backed by hashbrown, a SwissTable-derived design with SIMD-accelerated group probing (the earlier standard-library implementation used Robin Hood hashing). More relevant here is that it exposes explicit memory control that Go’s map does not.
use std::collections::HashMap;

fn process_cycle(source: &[[u8; 128]]) -> Result<(), Error> {
    let mut records: HashMap<i32, [u8; 128]> = HashMap::with_capacity(1_000_000);

    // Fill
    for (i, record) in source.iter().enumerate() {
        records.insert(i as i32, *record);
    }

    // Emit
    emit_all(&records)?;

    // Clear and release memory
    records.clear();
    records.shrink_to_fit(); // returns bucket memory to the allocator

    Ok(())
}
clear() removes all entries. shrink_to_fit() deallocates the bucket array down to the minimum required for the current entry count (zero), returning the memory to the allocator.
The tradeoff: if you call shrink_to_fit() and then re-populate in the next cycle, Rust will reallocate the bucket array from scratch. You pay the allocation cost on each cycle. Go’s approach avoids this by retaining buckets. Both are intentional design decisions — they optimize for different access patterns.
For patterns where you can predict you’ll re-populate to a similar size, Rust offers shrink_to(min_capacity):
// Keep capacity for 1M entries; don't release everything
records.shrink_to(1_000_000);
This lets you tune the trade-off explicitly: retain some capacity to avoid reallocation cost, but shed the tail of excess capacity from spike cycles.
Value Size and Bucket Layout: A Hidden Multiplier
There’s a related issue that compounds the memory retention problem: the relationship between value size and bucket layout in Go.
Go’s runtime has a threshold (~128 bytes) above which it stores values by pointer in the bucket rather than inline. This is an optimization for large values — it makes bucket operations cheaper (moving pointers instead of copying large structs). But it’s automatic and invisible.
// Go decides this transparently based on value size
smallMap := make(map[int][64]byte) // values inline in buckets
largeMap := make(map[int][256]byte) // Go stores pointer in bucket, value on heap
For the memory retention problem, this matters because when Go stores values by pointer, the GC can collect the values independently from the bucket array. After a delete(), pointed-to values become unreachable and are collected. Only the pointer slots in the bucket array remain — 8 bytes per slot instead of N bytes per slot.
Rust makes this explicit:
// Inline: entire value stored in bucket table
let mut inline_map: HashMap<i32, [u8; 128]> = HashMap::new();
// Box: pointer stored in bucket; value on separate heap allocation
let mut boxed_map: HashMap<i32, Box<[u8; 128]>> = HashMap::new();
With boxed values in Rust (or pointer values in Go), clear()/delete() makes the value allocations immediately collectible. The bucket array is still retained (in Go) or shrinkable (in Rust), but the bulk of the memory — the values themselves — can be reclaimed.
Distributed Systems Implications
This class of problem — resource retained beyond the scope of a logical operation — has analogs across distributed systems design.
Connection pools that don’t shrink. A database connection pool that grew to 200 connections during a traffic spike and has no idle-connection eviction policy will retain those connections indefinitely. Go’s map retention is the same behavior at the memory allocator level.
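Unlike Go’s map, database/sql exposes the eviction policy directly; a sketch with illustrative numbers:

import (
	"database/sql"
	"time"
)

func configurePool(db *sql.DB) {
	db.SetMaxOpenConns(200)                // ceiling during a spike
	db.SetMaxIdleConns(20)                 // steady-state pool size
	db.SetConnMaxIdleTime(5 * time.Minute) // spike connections idle out instead of persisting
}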
In-memory caches with TTL-only eviction. A cache that evicts entries by TTL but never compacts its internal structure will retain the bucket/shard allocation from peak load. Under bursty traffic patterns, this produces the same high-water mark behavior.
Retry buffers in messaging systems. A service buffering messages for retry in an in-memory map — keyed by message ID — will not reclaim the bucket memory after successful retries clear the map. If a DLQ backlog is processed and then drains, the allocated buffer stays.
Observability note. The failure mode is particularly difficult to diagnose because standard heap profilers show the memory as live and reachable. It’s not leaked. pprof’s heap profile will correctly attribute it to the map’s bucket allocation. The issue is that operators observing RSS or container memory metrics see high usage without understanding why the GC isn’t collecting it.
Production Failure Case: The Catch-Up Spike
Here is the specific failure sequence from our 2019 incident, reconstructed with more precise framing:
Normal steady state: Cycles process ~500K records. Map grows to ~150 MB per cycle, then gets deleted and re-populated. Because we were re-using the same map (range-delete pattern), the bucket array stayed at ~150 MB across cycles. Acceptable.
Upstream lag event: A 40-minute upstream outage caused records to queue. When the outage resolved, a single catch-up cycle processed ~3.2M records — 6x the normal batch size. The map grew its bucket array to accommodate. Peak RSS: ~960 MB.
Post-recovery: Subsequent cycles returned to 500K records. But the map’s bucket array was now sized for 3.2M entries. Every cycle ran at ~960 MB RSS. Our pod limit was 1.2 GB. We had 240 MB of headroom where we’d previously had ~600 MB.
The kill shot: Two weeks later, a separate bug caused a temporary increase in value size. The service hit the 1.2 GB limit and was OOM-killed. The root cause in the post-mortem was identified as the OOM — but the contributing factor was the map that had been running at 960 MB for two weeks because of a single catch-up cycle nobody noticed.
Fix applied: Changed the cycle implementation to reassign the map on each iteration. RSS returned to the steady-state range immediately. The map-reassignment cost was unmeasurable at the cycle frequency.
Implementation Pattern: Cycle-Safe Map Usage in Go
type BatchProcessor struct {
	// Deliberately no map field: don't hold the map as long-lived state
	// if you follow this pattern.
}

func (p *BatchProcessor) RunCycle(source DataSource) error {
	// Allocate a fresh map each cycle with expected capacity.
	// Cost: one allocation per cycle (negligible at minute-level frequency).
	records := make(map[int]*Record, source.EstimatedSize())

	if err := p.load(records, source); err != nil {
		return fmt.Errorf("load: %w", err)
	}
	if err := p.emit(records); err != nil {
		return fmt.Errorf("emit: %w", err)
	}

	// records dies with this frame: once RunCycle returns, nothing
	// references the map and it is GC-eligible.
	return nil
}
Key decisions here:
- Per-cycle allocation. The map is not shared state. Each cycle gets a new map. The old one becomes GC-eligible at cycle end.
- Pointer values. Using *Record instead of Record inline reduces bucket size and makes values independently collectible during the cycle if any partial cleanup is needed.
- Pre-sized. source.EstimatedSize() pre-allocates capacity to avoid rehashing during fill. If the estimate is wrong, the map grows, but you've minimized the average number of reallocations.
- Scope-bounded lifetime. Because records is a local, the reference is dead once RunCycle returns; no explicit nil-ing (in a defer or otherwise) is needed, since the GC does not treat dead locals as roots.
Implementation Pattern: Cycle-Safe HashMap in Rust
use std::collections::HashMap;

pub struct BatchProcessor {
    // Reuse the allocation across cycles to avoid repeated allocator round-trips.
    buffer: HashMap<i32, Record>,
}

impl BatchProcessor {
    pub fn new(expected_size: usize) -> Self {
        Self {
            buffer: HashMap::with_capacity(expected_size),
        }
    }

    pub fn run_cycle(&mut self, source: &DataSource) -> Result<(), Error> {
        // Clear entries but retain the bucket allocation (amortized reuse).
        self.buffer.clear();

        // If the previous cycle was an outlier, shed the excess capacity.
        if self.buffer.capacity() > source.estimated_size() * 2 {
            self.buffer.shrink_to(source.estimated_size());
        }

        self.load(source)?;
        self.emit()?;
        Ok(())
    }
}
This pattern explicitly manages the trade-off: retain capacity when the cycle size is stable (amortize allocation cost), but shed capacity when a spike cycle inflated the buffer to 2x or more of the expected size. The threshold is configurable based on your latency tolerance for the reallocation.
What the Language Design Reveals
Go’s map behavior is not an accident or an oversight. The Go runtime team has discussed map shrinking multiple times. The decision not to implement it reflects a concrete philosophy: the runtime should minimize latency variance, and shrinking a map requires copying all live entries to a new, smaller bucket array — a pause proportional to map size.
This is a reasonable trade-off in a language designed for services where developer productivity and operational simplicity are primary constraints. It costs memory to save latency. Most Go services never encounter this because their maps are either small, long-lived caches (where high-water retention is expected), or short-lived per-request state (where the map itself is GC’d when the request completes).
It becomes a problem specifically when maps are large, transient, and long-lived: populated in bulk, cleared, and held as a container for the next cycle. This is exactly the batch processor pattern.
Rust’s approach reflects the opposite philosophy: give the developer precise control and let them make the right call for their workload. shrink_to_fit() and shrink_to(min) exist not because they’re always the right choice, but because sometimes they are — and you should be able to make that decision without rewriting the data structure.
Neither is strictly better. Go’s behavior is safe and predictable for most services. Rust’s behavior requires you to know what you’re doing, but when you do, it’s more precise.
Diagnostics Checklist
When you observe unexplained heap retention in a Go service:
- Capture a heap profile at steady state. Run go tool pprof http://localhost:6060/debug/pprof/heap and look for large allocations attributed to runtime.makemap or map bucket internals. (A sketch for exposing the endpoint follows this list.)
- Check runtime.MemStats fields. Compare HeapInuse vs HeapAlloc. Large HeapInuse with low HeapAlloc suggests retained-but-empty structures.
- Identify maps with high-water behavior. Any map that is populated from external data (batch loads, cache warm-ups, request buffers) is a candidate.
- Trace cycle high-water marks. Add instrumentation to log map len() and an estimate of capacity at cycle boundaries. Spikes that don't recover indicate the retention pattern.
- Test with map reassignment. Replace for k := range m { delete(m, k) } with m = make(...) and observe HeapInuse across cycles.
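A minimal sketch of the plumbing the first and fourth items assume: the endpoint registration is standard net/http/pprof, while the cycle-boundary log line and the records name are illustrative.

package main

import (
	"log"
	"net/http"
	_ "net/http/pprof" // registers the /debug/pprof/* handlers
)

func main() {
	// Expose the heap profile that go tool pprof reads.
	go func() {
		log.Println(http.ListenAndServe("localhost:6060", nil))
	}()

	// Inside the cycle loop, log the high-water mark at each boundary:
	//   log.Printf("cycle end: len(records) = %d", len(records))

	select {} // stand-in for the service's real run loop
}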
Summary
Go maps don’t shrink because the runtime trades memory for allocation predictability. This is the right trade-off for most use cases. It is the wrong trade-off for long-running services that process large, transient datasets in repeating cycles.
The fix is simple — either reassign the map or use pointer values to reduce bucket footprint — but you have to know the problem exists to reach for the fix.
Rust’s HashMap exposes shrink_to_fit() and shrink_to(n) because it assumes you want to make this decision yourself. Whether that additional control is worth the additional responsibility depends on your team and your workload.
In production, the difference between these two designs shows up not in benchmarks but in the post-mortem after your service gets OOM-killed two weeks after a traffic spike nobody noticed — because the map from that spike never let go.