Perf Benchmark: what is the memory footprint of a PremiumEntry instance?¶
Most processes that use PremiumEntry are performance-sensitive, so it is useful to understand how much memory each instance takes at runtime.
Background: how we measure¶
We use tracemalloc, a Python standard-library tool that tracks memory allocations on the heap. The approach:
- Force a garbage collection so freed objects don't skew results.
- Take a snapshot of the heap.
- Run the code under measurement.
- Take a second snapshot, then compute the difference.
This gives two numbers:
- delta — how much heap is retained after the code finishes (the objects you hold in memory when it's done)
- peak — the highest watermark reached at any point during execution (includes temporary allocations that were already freed by the end)
For plain object construction, delta ≈ peak. For database reads, peak is much higher than delta because SQLAlchemy holds raw rows and ORM objects in memory simultaneously while mapping them to domain objects, then discards everything except the final result.
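The measurement steps above can be sketched with the standard-library tracemalloc module. This is a minimal, self-contained version of the approach, not the repository's actual test code; the `Entry` class is a hypothetical stand-in for `PremiumEntry`.

```python
import gc
import tracemalloc

def measure(build):
    """Return (result, delta, peak): bytes retained after build() finishes,
    and the high-water mark reached while it ran."""
    gc.collect()                       # drop already-freed objects so they don't skew results
    tracemalloc.start()
    baseline, _ = tracemalloc.get_traced_memory()
    result = build()                   # run the code under measurement
    gc.collect()                       # free temporaries before the final reading
    current, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return result, current - baseline, peak - baseline

# Hypothetical stand-in for a domain object (not the real PremiumEntry).
class Entry:
    def __init__(self, amount, component):
        self.amount = amount
        self.component = component

entries, delta, peak = measure(lambda: [Entry(i * 1.5, i % 6) for i in range(1_000)])
```

Keeping `result` alive through the final reading is what makes `delta` reflect retained memory rather than zero; `peak` is always at least as large as `delta`.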
How to reproduce¶
direnv exec backend uv run pytest \
backend/components/be/internal/premium_computation/tests/test_repository_memory.py \
-v -s
cat memory_usage.log
The two tests run back to back and append to the same memory_usage.log file in the repository root.
Results, April 2026, 1,000 entries × 6 components¶
Domain objects only (no database)¶
Plain PremiumEntry objects constructed directly in Python, no I/O involved.
| | Per entry | Total (10,000 entries) |
|---|---|---|
| Delta (retained) | ~2,300 bytes | ~22 MB |
| Peak | ~2,300 bytes | ~22 MB |
Delta and peak are the same because plain construction allocates almost nothing temporarily: every object created is still alive at the end, so the high-water mark equals the retained size.
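A quick way to observe this (a sketch, not the repository's test code): trace a plain list comprehension and compare the two readings. The dict literal below is a stand-in for a simple domain object.

```python
import gc
import tracemalloc

gc.collect()
tracemalloc.start()
# Plain construction: simple objects, no I/O, no intermediate layers.
objs = [{"amount": i * 1.5, "component": i % 6} for i in range(1_000)]
current, peak = tracemalloc.get_traced_memory()
tracemalloc.stop()

# current (the retained delta) and peak should be nearly identical here,
# because almost nothing was allocated and then freed during construction.
```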
E2E via get_premiums_for_payroll (full DB path)¶
Data written to Postgres, read back through SQLAlchemy, mapped to domain objects. The session is discarded after the call.
| | Per entry | Total (1,000 entries) |
|---|---|---|
| Delta (retained) | ~6,600 bytes | ~6.6 MB |
| Peak | ~19,500 bytes | ~18.6 MB |
The delta is ~3x the domain-only cost: the extra ~4,300 bytes per entry are permanent — UUIDs, enum instances, and other objects allocated during ORM mapping that stay alive as long as you hold the list.
The peak is ~8x the domain-only cost: during the call, SQLAlchemy holds raw cursor rows, ORM model instances, and the final domain objects all in memory at the same time. Once the session closes, only the domain objects remain.
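The peak-versus-delta gap can be reproduced in miniature without a database. This sketch (not the real read path) mimics the three layers: raw rows, intermediate mapped objects, and final domain objects; the first two are freed when the function returns, so they count toward peak but not delta.

```python
import gc
import tracemalloc

def load_entries(n):
    # Stand-ins for the layers held simultaneously during ORM mapping:
    raw_rows = [tuple(str(i) for _ in range(6)) for i in range(n)]  # raw cursor rows
    mapped = [dict(enumerate(row)) for row in raw_rows]             # intermediate ORM-like objects
    return [sorted(m.values()) for m in mapped]                     # final "domain objects"
    # raw_rows and mapped become unreachable once the function returns

gc.collect()
tracemalloc.start()
entries = load_entries(1_000)
gc.collect()
delta, peak = tracemalloc.get_traced_memory()
tracemalloc.stop()

# peak counts all three layers at once; delta counts only what `entries` retains.
```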
Practical implication: loading 10,000 entries will retain ~66 MB but will transiently require ~195 MB at peak.
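The projection above is just the per-entry figures scaled linearly, which assumes memory grows proportionally with entry count:

```python
# Per-entry figures from the April 2026 E2E run above.
PER_ENTRY_DELTA_BYTES = 6_600    # retained after the call
PER_ENTRY_PEAK_BYTES = 19_500    # high-water mark during the call

def footprint_mb(n_entries):
    """Rough projection assuming linear scaling with entry count."""
    return (PER_ENTRY_DELTA_BYTES * n_entries / 1_000_000,
            PER_ENTRY_PEAK_BYTES * n_entries / 1_000_000)

retained_mb, peak_mb = footprint_mb(10_000)   # → (66.0, 195.0)
```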