There’s an interesting (and slightly worrying) shift happening right now in the Postgres + Linux world.
A recent Linux kernel change is showing significant performance regressions for PostgreSQL workloads — in some cases halving throughput or turning 50ms queries into 500ms ones overnight. (hostingartisan)
And here’s the catch:
nothing in Postgres changed.
What’s actually going on?
At a high level, this sits at the intersection of three things:
- Linux scheduler / preemption changes
- PostgreSQL’s process + locking model
- Modern high-core-count CPUs (especially ARM / Graviton)
There are reports that more aggressive preemption and changed scheduler behaviour are hitting spinlocks and concurrency-heavy workloads. (Reddit)
That’s basically Postgres in a nutshell.
Postgres leans heavily on shared memory + lightweight locks. If the kernel starts interrupting those threads more aggressively, you get contention… and performance tanks.
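A toy illustration of the mechanism (not Postgres internals — the class name, iteration counts, and thread counts are all made up): several workers hammer a test-and-set spinlock. The result is always correct, but every time the scheduler preempts the lock holder, all the other workers burn CPU spinning until it runs again. More preemption means more wasted spins, which is the shape of the contention problem described above.

```python
# Toy sketch: why aggressive preemption hurts spin-style locks.
# Illustrative only; Postgres LWLocks/spinlocks are C, not Python.
import threading

class SpinLock:
    def __init__(self):
        self._flag = threading.Lock()

    def acquire(self):
        # Busy-wait until the flag is free. If the kernel preempts the
        # current holder, every spinner wastes CPU until the holder
        # gets scheduled again.
        while not self._flag.acquire(blocking=False):
            pass

    def release(self):
        self._flag.release()

lock = SpinLock()
counter = 0

def worker(iters):
    global counter
    for _ in range(iters):
        lock.acquire()
        counter += 1  # critical section kept deliberately tiny
        lock.release()

threads = [threading.Thread(target=worker, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)  # 40000: correctness is fine; the cost is time spent spinning
```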
Why this matters more on ARM (Graviton)
If you’re running on AWS, this hits a very specific sweet spot:
- You’re likely on Graviton (ARM) for cost/perf
- You’re likely on high core counts (r7g, m7g etc.)
- You’re likely running I/O heavy workloads on EBS
That combination amplifies the issue.
More cores = more concurrent workers = more contention when scheduling changes
And ARM environments (aarch64) are already where some I/O behaviour differences show up first in Postgres testing. (PostgreSQL)
But wait… wasn’t Postgres supposed to get faster?
Yes – and this is where it gets messy.
Postgres 18 is introducing asynchronous I/O, including support for modern Linux interfaces like io_uring, designed to improve performance significantly in cloud environments. (Neon)
In theory:
- Better parallel reads
- Less syscall overhead
- Much better performance on network-attached storage (like EBS)
In practice:
- Kernel behaviour is shifting underneath it
- Some features (like io_uring) are still maturing
- The “best” path is now highly kernel-dependent
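Concretely, PostgreSQL 18 exposes this choice through a new io_method server parameter. A minimal postgresql.conf sketch (check the value names and default against your exact version’s release notes before relying on them):

```
# postgresql.conf (PostgreSQL 18+)
# io_method = sync       # pre-18 style synchronous I/O, safest fallback
# io_method = worker     # async I/O via background I/O worker processes
io_method = io_uring     # Linux io_uring; needs kernel and build support
```

Which value actually wins depends on your kernel version and storage — which is exactly the “highly kernel-dependent” point above.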
So we’re in that awkward phase where:
the database is evolving with the kernel…
but the kernel just changed the rules mid-game
What does this mean for AWS RDS users?
This is the important bit.
If you’re on Amazon RDS or Aurora PostgreSQL running on Graviton:
- You don’t control the kernel version directly
- But AWS does upgrade underlying infrastructure over time
- And performance changes can appear with no config change on your side
That’s the dangerous part.
You could:
- Upgrade minor version
- Move instance class
- Have AWS roll out a host update underneath you
…and suddenly:
same queries, same plans, very different performance
What should you actually do?
A few practical things (not theoretical):
- Baseline now
  - Capture query latency, CPU, and wait events
  - Especially LWLock and I/O waits
- Watch for silent regressions
  - Not all performance drops come from your code or schema anymore
- Test instance upgrades properly
  - Especially when moving between Graviton generations
- Be cautious with “latest everything”
  - New kernel + new Postgres + new instance type = stacked unknowns
- Keep an eye on async I/O adoption
  - It will be the long-term win
  - But right now, it’s still settling
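The baselining idea can be sketched in a few lines: snapshot wait-event counts periodically (e.g. from pg_stat_activity) and diff snapshots, so a silent jump in LWLock or I/O waits stands out. The snapshots below are hand-written stand-ins for real query results, and the 2x threshold is an arbitrary example:

```python
from collections import Counter

def wait_event_regressions(baseline, current, threshold=2.0):
    """Flag wait events whose count grew by at least `threshold`x."""
    flagged = {}
    for event, count in current.items():
        base = baseline.get(event, 0)
        if base and count / base >= threshold:
            flagged[event] = (base, count)
    return flagged

# Stand-in snapshots; in practice, populate these from something like:
#   SELECT wait_event_type || ':' || wait_event, count(*)
#     FROM pg_stat_activity
#    WHERE wait_event IS NOT NULL
#    GROUP BY 1;
baseline = Counter({"LWLock:BufferMapping": 3, "IO:DataFileRead": 10})
current  = Counter({"LWLock:BufferMapping": 12, "IO:DataFileRead": 11})

print(wait_event_regressions(baseline, current))
# {'LWLock:BufferMapping': (3, 12)}
```

The point is the habit, not the script: without a recorded baseline, a kernel-driven regression is indistinguishable from a query or schema problem.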
The bigger takeaway
We’ve spent years treating the database as the thing to tune.
Increasingly, that’s not true.
The performance profile of Postgres today is just as dependent on:
- Kernel scheduling
- I/O subsystems
- CPU architecture
as it is on indexes and query plans.
And this is a perfect example of that shift.
If you’re running Postgres on AWS (especially Graviton), it’s worth paying attention to this one.
Because this is exactly the kind of issue that shows up as:
“nothing changed… but everything got slower”
—and those are always the hardest to debug.