Some Problems Only Appear After 24 Hours

Memory leaks, connection leaks, and gradual performance degradation only become visible under extended load. Our soak testing runs your system at sustained load for 8 to 72 hours to surface problems that standard load tests miss entirely.

Duration: 3–5 days
Team: 1 Senior Load Testing Engineer

You might be experiencing...

Memory usage climbs gradually under production load and you restart services weekly to manage it
Database connections accumulate over time and eventually exhaust the pool — but only after days of uptime
Performance is fine after a fresh deploy but degrades by 40% after 3 days of production traffic
Your system passed a 30-minute load test but you need confidence it will hold up under continuous traffic

Soak testing — also called endurance testing — exposes failure modes that standard load tests miss because they run too briefly. Memory leaks that accumulate at 50 MB per hour are invisible in a 30-minute test but manifest as an out-of-memory crash after 20 hours of production traffic. Connection pool leaks that grow by 2 connections per minute exhaust a 500-connection pool in roughly four hours. Thread pool saturation from unreleased goroutines or threads builds over days. These are application defects masquerading as infrastructure problems.
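The leak-rate arithmetic above can be checked with a minimal projection sketch. The rates are the illustrative figures from this section; roughly 1 GB of heap headroom is an assumption to match the 20-hour figure:

```python
def hours_to_failure(capacity, baseline, leak_rate_per_hour):
    """Project when a steadily leaking resource exhausts its capacity."""
    return (capacity - baseline) / leak_rate_per_hour

# Connection pool: 500 connections, leaking 2 per minute (120 per hour).
pool_hours = hours_to_failure(capacity=500, baseline=0, leak_rate_per_hour=2 * 60)

# Heap: assume ~1 GB (1024 MB) of headroom, leaking at 50 MB per hour.
heap_hours = hours_to_failure(capacity=1024, baseline=0, leak_rate_per_hour=50)

print(round(pool_hours, 1), round(heap_hours, 1))  # 4.2 20.5
```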

The time-series analysis of soak test metrics is what distinguishes a meaningful soak test from simply running load for a long time. We look for upward-trending metrics that do not stabilise: heap size that grows linearly with runtime, file descriptor counts that increase without ceiling, database connection counts that ratchet upward with each traffic spike. Every identified trend comes with a rate-of-change measurement and a projected time to failure.
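The rate-of-change and time-to-failure analysis can be sketched as a least-squares fit over timestamped samples; the heap series below is synthetic, and the 4 GB ceiling is an illustrative assumption:

```python
def linear_trend(samples):
    """Least-squares slope and intercept for (hours, value) samples."""
    n = len(samples)
    mean_t = sum(t for t, _ in samples) / n
    mean_v = sum(v for _, v in samples) / n
    num = sum((t - mean_t) * (v - mean_v) for t, v in samples)
    den = sum((t - mean_t) ** 2 for t, _ in samples)
    slope = num / den
    return slope, mean_v - slope * mean_t

def projected_failure_hours(samples, ceiling):
    """Hours until the fitted trend line crosses the resource ceiling."""
    slope, intercept = linear_trend(samples)
    if slope <= 0:
        return None  # metric is stable or shrinking: no projected failure
    return (ceiling - intercept) / slope

# Heap size (MB) sampled every 4 hours, growing ~50 MB/hour toward a 4096 MB limit.
heap = [(0, 600), (4, 805), (8, 1010), (12, 1190), (16, 1400)]
eta = projected_failure_hours(heap, ceiling=4096)
```

A metric whose fitted slope stays near zero needs no projection; only upward-trending metrics get a time-to-failure estimate.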

Root cause identification for memory leaks requires correlating the metric trend with code paths: which allocation call sites are producing objects that are not being garbage collected? Heap profilers — JVM heap dumps, Go pprof memory profiles, Python memory_profiler — taken at intervals during the soak test reveal the answer. We pair memory profiling with soak load to surface the specific code responsible for the leak.
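For a Python service, the interval-snapshot technique can be sketched with the standard library's tracemalloc; the deliberately leaking list below stands in for a real allocation site:

```python
import tracemalloc

tracemalloc.start()

def snapshot_top_allocators(limit=5):
    """Capture a heap snapshot and return the heaviest allocation sites.

    During a soak test, calling this at fixed intervals and diffing
    successive snapshots shows which call sites keep growing.
    """
    snap = tracemalloc.take_snapshot()
    return snap.statistics("lineno")[:limit]

# Simulated leak: a module-level list that grows and is never cleared.
_leak = []
for _ in range(1000):
    _leak.append(bytearray(1024))  # ~1 MB retained in total

top = snapshot_top_allocators()
# The bytearray allocation line dominates the top statistics.
```

The same workflow applies with jmap heap dumps on the JVM or `go tool pprof` heap profiles in Go: snapshot on a schedule, diff, and look for the site whose retained size grows monotonically.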

Engagement Phases

Day 1

Soak Test Design & Instrumentation

We design the soak test scenario at your target sustained load level (typically 70–80% of peak capacity). We configure comprehensive memory and resource monitoring: JVM heap, goroutine counts, file descriptors, database connection counts, Redis connection counts, and application-level metrics. We set up time-series dashboards for all key metrics.
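The sustained-load target is derived from measured peak capacity; a minimal sketch, assuming an illustrative peak of 2,000 req/s from a prior load test:

```python
PEAK_RPS = 2000  # assumed: measured peak capacity from an earlier load test

def soak_target_rps(peak_rps, fraction=0.75):
    """Sustained soak load: typically 70-80% of measured peak capacity."""
    return int(peak_rps * fraction)

target = soak_target_rps(PEAK_RPS)  # midpoint of the 70-80% band
low, high = soak_target_rps(PEAK_RPS, 0.70), soak_target_rps(PEAK_RPS, 0.80)
print(low, target, high)  # 1400 1500 1600
```

Running below peak is deliberate: the goal is to hold a realistic sustained level for days, not to find the breaking point, which is the job of a stress test.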

Days 2–4 (8–72 hours)

Extended Load Execution

We run sustained load at the target level for the agreed duration. We monitor memory and resource metrics at 5-minute intervals, looking for upward trends that indicate leaks. We analyse application logs for increasing error rates, timeout patterns, and garbage collection pressure. We record the precise timing of any metric that begins trending upward.

Day 5

Analysis & Remediation

We produce a soak test analysis report identifying all metrics that trended upward, their rate of change, and the projected time to failure. For identified leaks, we provide root cause analysis with recommended code or configuration fixes. We pair with your engineering team to implement and validate the highest-priority fixes.

Deliverables

Soak test monitoring dashboard with time-series metrics
Memory and resource leak analysis report
Time-to-failure projections for each identified trend
Root cause analysis for each identified leak
Remediation recommendations with effort estimates

Before & After

| Metric | Before | After |
|---|---|---|
| Memory leaks identified | Unknown | 3 found at 10 MB/hr |
| Connection leaks | Unknown | 2 found, fixed |
| Time to failure | Unknown | 72 hrs without restart |

Tools We Use

k6 / Locust extended config
Prometheus / Grafana
Heap profilers

Frequently Asked Questions

How do you run a 72-hour test without it consuming your full attention?

We configure automated monitoring with alert thresholds that notify us if any metric exceeds expected bounds during the test run. We check in at regular intervals (morning and evening) and review the time-series data. The load generation runs autonomously — k6 and Locust are designed for extended runs. We are available for immediate response if an alert fires.

What is the difference between a memory leak and normal memory growth?

Normal memory growth stabilises: the application allocates memory for caches and in-flight requests, reaches a steady state, and garbage collects appropriately. A memory leak grows continuously without stabilising — the trend line has a positive slope that does not flatten. We use rate-of-change analysis to distinguish between the two, and we look for correlation with request count to identify whether memory growth is proportional to traffic.
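The distinction can be sketched as a trailing-slope check: a warm-up plateau flattens, a leak does not. The series below are synthetic, and the window and tolerance values are illustrative:

```python
def trailing_slope(samples, window):
    """Least-squares slope over the last `window` (hours, MB) samples."""
    pts = samples[-window:]
    n = len(pts)
    mt = sum(t for t, _ in pts) / n
    mv = sum(v for _, v in pts) / n
    den = sum((t - mt) ** 2 for t, _ in pts)
    return sum((t - mt) * (v - mv) for t, v in pts) / den

def looks_like_leak(samples, window=4, tolerance=1.0):
    """Leak heuristic: growth that never flattens keeps a positive
    trailing slope (MB/hour); a warm-up plateau drops to near zero."""
    return trailing_slope(samples, window) > tolerance

# Caches filling then reaching steady state: not a leak.
plateau = [(h, min(200 + 50 * h, 400)) for h in range(12)]

# Steady 10 MB/hour growth that never flattens: a leak.
leak = [(h, 200 + 10 * h) for h in range(12)]
```

Correlating the slope with request count, as described above, then separates traffic-proportional working memory from true unreclaimed growth.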

We restart our application nightly — does that mean soak testing doesn't apply?

Nightly restarts are a common workaround for memory or connection leaks — which means the leak is present and you are managing it operationally rather than fixing it. Soak testing identifies the specific leak so it can be fixed, eliminating the need for scheduled restarts. Planned restarts also create brief downtime windows and complicate deployment scheduling.

Know Your Scaling Ceiling

Book a free 30-minute capacity scope call with our load testing engineers. We review your architecture, traffic expectations, and upcoming scaling events — and scope the load test that will give you the data you need.

Talk to an Expert