Fixing a 700ms PostgreSQL Query: From Seq Scan to 38ms

By Taha Tekin • Engineering • PostgreSQL

When you are the sole technical owner of a growing mobile product, every performance bottleneck is your responsibility.

In my recent project, Common Ground (a real-time social planning app built with React Native and Django), user engagement heavily relied on the in-app messaging feature. Initially, loading a chat history felt instantaneous. But as the Message table grew, the chat screen started taking noticeably longer to load.

When I checked my distributed tracing logs in Sentry, the bottleneck was glaringly obvious: the database query fetching the chat history was taking around 700ms at p95.

For a real-time app backed by WebSockets, a 700ms delay just to fetch historical messages creates unacceptable friction. Here is how I diagnosed the issue using PostgreSQL’s EXPLAIN ANALYZE and reduced the latency to ~38ms by leveraging query-aligned compound B-tree indexes.

The Problem: The Dreaded Sequential Scan & In-Memory Sorting

In a typical chat application, fetching the message history between two users involves filtering by their IDs and sorting by the timestamp to get the latest N messages.

Using Django’s ORM, the query looked something like this:

from django.db.models import Q

# Latest 100 messages exchanged between user1 and user2, newest first
messages = Message.objects.filter(
    (Q(sender=user1) & Q(receiver=user2)) |
    (Q(sender=user2) & Q(receiver=user1))
).order_by('-time_stamp')[:100]

At first glance, this is a standard, clean ORM query. But under the hood, PostgreSQL was working inefficiently. I fired up my database console and ran an EXPLAIN ANALYZE on the generated SQL.

The output revealed a Sequential Scan (Seq Scan).

Because there were no specific indexes covering this exact OR condition pattern combined with an ORDER BY, PostgreSQL had to scan through the entire Message table, checking every single message to see if it belonged to the conversation between user1 and user2.

To make matters worse, after finding the relevant messages, the database had to load them into memory and perform a Top-N Heapsort to order them by time_stamp before applying the LIMIT 100.

As the table grew, this operation became prohibitively slow and resource-intensive. At 700ms, it was already degrading the user experience.

The Solution: Designing Compound B-Tree Indexes

The naive fix for a slow query is usually "just add an index to the columns." If I had added separate, single-column indexes on sender, receiver, and time_stamp, PostgreSQL might have utilized them, but it would still likely need to compute the intersection and then sort the results in memory.

Instead, I needed indexes that exactly matched my production query pattern.

A B-tree index orders its entries by its columns from left to right: first by the leading column, then by the second within ties, and so on. That makes column order in a compound index critical. The rule of thumb is: Equality checks first, Range/Sort checks last.
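The effect of this ordering can be sketched in plain Python. Index entries behave like lexicographically sorted tuples, so equality on the leading columns pins down one contiguous range that is already sorted by the trailing column (a toy model of the idea, not PostgreSQL internals):

```python
import bisect

# Toy model of compound-index entries on (sender, receiver, time_stamp):
# stored in lexicographic tuple order, like B-tree leaf entries.
entries = sorted([
    (1, 2, "2024-01-03"), (2, 1, "2024-01-01"),
    (1, 2, "2024-01-05"), (1, 3, "2024-01-02"),
    (2, 1, "2024-01-04"), (1, 2, "2024-01-01"),
])

# Equality on the leading columns (sender=1, receiver=2) becomes a
# contiguous slice of the sorted entries...
lo = bisect.bisect_left(entries, (1, 2, ""))
hi = bisect.bisect_left(entries, (1, 3))
conversation = entries[lo:hi]

# ...and within that slice, rows are already ordered by time_stamp,
# so no separate sort step is needed.
timestamps = [ts for (_, _, ts) in conversation]
```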

Because my query looks for messages where (sender=A AND receiver=B) OR (sender=B AND receiver=A), I designed two separate compound indexes in my Django model's Meta class to cover both sides of the OR clause:

class Message(models.Model):
    # ... fields (sender, receiver, message_body, time_stamp) ...

    class Meta:
        indexes = [
            models.Index(fields=['sender', 'receiver', 'time_stamp']),
            models.Index(fields=['receiver', 'sender', 'time_stamp']),
        ]

Why did this work so effectively?

  • Fast Lookup via Index Scan: When the query runs, PostgreSQL traverses the B-tree to quickly locate the exact (sender, receiver) and (receiver, sender) pairs. Instead of a full table scan, it performs highly efficient index scans and merges the results (often via bitmap index operations such as BitmapOr).
  • Eliminating Explicit Sorts: Because time_stamp is the third column in the index, the index entries for a specific (sender, receiver) pair are already ordered by time_stamp within the B-tree structure.
  • Leveraging Index Ordering & LIMIT: To get the latest 100 messages (ORDER BY time_stamp DESC), PostgreSQL doesn't need to load the data into memory to sort it. It can walk the pre-sorted index structure, potentially via Index Scan Backward operations, eliminating the explicit sort cost and stopping the scan as soon as the first 100 rows are retrieved.

By eliminating the in-memory Top-N Heapsort and replacing the Seq Scan with targeted index lookups, the CPU overhead and I/O operations were drastically reduced.
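The before/after shift in the query plan can be reproduced in miniature with Python's stdlib sqlite3. SQLite's plan vocabulary differs from PostgreSQL's (SCAN/SEARCH rather than Seq Scan/Index Scan, and the table and index names below are made up for the sketch), but the principle is the same:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE message (sender INT, receiver INT, time_stamp TEXT, body TEXT)"
)

query = """
    SELECT * FROM message
    WHERE (sender = 1 AND receiver = 2) OR (sender = 2 AND receiver = 1)
    ORDER BY time_stamp DESC LIMIT 100
"""

def plan(q):
    # EXPLAIN QUERY PLAN rows are (id, parent, notused, detail)
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + q)]

before = plan(query)  # no index covers the predicate: full table scan

# One compound index per side of the OR, mirroring the Django Meta indexes
conn.execute("CREATE INDEX msg_sr_ts ON message (sender, receiver, time_stamp)")
conn.execute("CREATE INDEX msg_rs_ts ON message (receiver, sender, time_stamp)")

after = plan(query)   # index searches replace the scan
```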

The Result: From ~700ms to 38ms

I ran EXPLAIN ANALYZE again after applying the Django migrations.

The Sequential Scan was completely gone. In its place were Index Scan Backward operations utilizing the new compound indexes.

The query execution time plummeted from ~700ms to ~38ms under the same production workload.

This wasn't just a 94.6% reduction in latency; it was a massive reduction in database CPU load. For the end-user, the chat screen went back to feeling instant. For the infrastructure, the cost of scaling the messaging feature dropped significantly.

Next Steps & Trade-offs

While this solution solved the immediate bottleneck, performance engineering is an iterative process. If the application scales to millions of users, further optimizations could include:

  • Covering Indexes: Adding INCLUDE (message_body) to the index could potentially allow index-only scans, preventing the database from ever touching the heap (table data). However, since message bodies can be variable and large, the increased index size (and write penalty) wasn't worth the trade-off at this stage.
  • Data Model > Index Tuning: At a higher scale, even this dual-index pattern may degrade due to the overhead of OR expansion and bitmap merges. At that point, normalizing the query shape becomes more important than index tuning. Refactoring the schema to include a unique conversation_id (e.g., a hashed combination of user1 and user2) would eliminate the OR clause entirely, allowing a single optimal index (conversation_id, time_stamp DESC) to handle the query flawlessly.
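A minimal sketch of that normalization idea, assuming a hypothetical helper (not from the production codebase) that derives a canonical conversation key, so that (A, B) and (B, A) map to the same conversation_id and the OR clause disappears from the query:

```python
def conversation_id(user_a: int, user_b: int) -> str:
    """Canonical key for a two-user conversation: order-independent,
    so both participants' queries hit the same (conversation_id, time_stamp)
    index without an OR clause."""
    low, high = sorted((user_a, user_b))
    return f"{low}:{high}"
```

With a conversation_id column populated this way, the chat-history query reduces to a single equality filter plus an ORDER BY that one index satisfies end to end.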

Takeaways

When building scalable systems, especially as a solo engineer, you have to be intentional about optimization:

  • Don't guess, measure: EXPLAIN ANALYZE is your best friend. Distributed tracing (like Sentry) tells you where the bottleneck is, but EXPLAIN tells you why.
  • ORMs hide complexity: Django ORM makes writing queries incredibly easy, but it doesn't absolve you from understanding how relational databases actually execute those queries.
  • Index for your specific query patterns: A compound index is significantly more effective than single-column indexes if you align the column order with your WHERE and ORDER BY clauses.
  • Performance engineering isn't just about writing complex code; it's about deeply understanding the constraints of your system and designing targeted solutions to overcome them.