postgres SQL optimization
The optimizer:
query -> parser -> rewriter -> planner -> executor -> result
Parser
- syntax validation
- data type resolution
- nodes and parse tree construction
Rewriter
- CTE processing
- subquery transformation
- join reordering
- predicate pushdown
Tip
Even if the rewriter can rewrite your query into a more optimized form, writing it well from the start saves the rewriter that work and gains a bit of performance.
Planner
- processing input
- generating plans
- estimating costs
- selecting plan
- creating plan tree
statistics
- based on column values
- updated with ANALYZE (or automatically after commits for some)
- stored in pg_stat_all_tables and pg_stats (see the queries below)
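A minimal way to look at those statistics (the orders table name is illustrative):
-- Refresh the planner statistics for one table
ANALYZE orders;
-- Per-column statistics the planner relies on
SELECT attname, n_distinct, null_frac, most_common_vals
FROM pg_stats
WHERE tablename = 'orders';
-- When the table was last (auto-)analyzed
SELECT relname, last_analyze, last_autoanalyze
FROM pg_stat_all_tables
WHERE relname = 'orders';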
costs
Unit costs:
- seq_page_cost: cost of reading one page sequentially; this is the reference unit (1.0 by default)
- random_page_cost: cost of reading one page randomly, which is what index scans do (4.0 by default)
- cpu_tuple_cost: cost of processing one row at the CPU level (0.01 by default)
- cpu_index_tuple_cost: cost of processing one index entry (0.005 by default)
seq scan example
pages * seq_page_cost + rows * cpu_tuple_cost
“rows” in the cost formulas = estimated number of rows, based on pg_stats.
index scan example
index_pages * random_page_cost + rows * (cpu_index_tuple_cost + cpu_tuple_cost)
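A rough sketch of how the seq scan formula plays out (the orders table and the numbers in the comments are illustrative, assuming the default unit costs):
-- Page and row counts the planner plugs into the formula
SELECT relpages, reltuples FROM pg_class WHERE relname = 'orders';
-- e.g. 100 pages and 5000 rows with the default costs:
--   100 * seq_page_cost (1.0) + 5000 * cpu_tuple_cost (0.01) = 150
EXPLAIN SELECT * FROM orders;
-- Seq Scan on orders  (cost=0.00..150.00 rows=5000 width=...)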
Executor
- executes the plan
  - Postgres is not as strong here as Oracle
- manages resource allocation
  - how much CPU/memory it needs, whether to parallelize, …
- can perform some limited adaptations
  - e.g. for a big table, it might prefer matching by pages instead of rows
Misconception 1
The optimizer always selects the best plan!
The optimizer only works with the data it has, so it does not always select the best plan. Some common scenarios:
- statistics not up-to-date
- data modelling issues
  - Postgres cannot know that some columns are correlated, e.g. two columns city and country (see the example below)
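A sketch of that correlation problem, assuming a customers table with city and country columns: the planner multiplies the selectivity of each predicate as if they were independent, so the estimate ends up much lower than reality.
-- Almost every customer in Paris is also in France, but the planner does not know that,
-- so it multiplies the two selectivities and underestimates the number of matching rows.
EXPLAIN ANALYZE
SELECT * FROM customers
WHERE city = 'Paris' AND country = 'France';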
Explain: your best friend
-- Create the execution plan
EXPLAIN
SELECT * FROM orders WHERE customer_id = 123;

-- Create the execution plan and execute the query
EXPLAIN ANALYZE
SELECT * FROM orders WHERE customer_id = 123;

-- VERBOSE: adds information on the columns and their data types
-- BUFFERS: shows the hits on shared blocks (in memory) and what had to be fetched from disk
-- WAL: shows WAL usage (can add an overhead)
-- MEMORY: shows the memory usage of the query
EXPLAIN (ANALYZE, VERBOSE, BUFFERS, WAL, MEMORY)
SELECT * FROM orders WHERE customer_id = 123;
Explaining explain output:
# `Nested Loop`: Type of the node
# `cost`:
# - 1st number (4.65): the estimated cost to fetch the 1st row
# - 2nd number (26.59): the estimated cost to fetch all rows
# `rows`: the estimated number of rows returned
# `width`: the estimated average size in bytes of each returned row
Nested Loop (cost=4.65..26.59 rows=1 width=325)
# Line provided by the `EXPLAIN ANALYZE`
# `time`:
# - 1st number (0.025): the time to fetch the 1st row
# - 2nd number (0.032): the time to fetch all rows
# `loops`: the number of times it performed the operation
(actual time=0.025..0.032 rows=3 loops=1)
# Line provided by `EXPLAIN VERBOSE`
Output: customers.id, customers.name, customers.email
# Line provided by `EXPLAIN (BUFFERS)`
# Here, it did not read from disk (everything was found in shared buffers)
Buffers: shared hit=8
# Line provided by `EXPLAIN`
# Indicates it's using an index scan, not a seq scan
-> Index Scan using customers_pkey on public.customers
(cost=0.29..8.30 rows=1 width=126)
(actual time=0.012..0.013 rows=1 loops=1)
Output: customers.id, customers.name, customers.email
# Filter applied at the index level
Index Cond: (customers.id = 123)
...
# Summary of the plan
Planning Time: 0.195ms
Execution Time: 0.062ms
Some visualization tools:
Performance red flags
- seq scan on large tables
- huge differences between estimated and actual rows
  - ⇒ the statistics are probably not up-to-date
- high-cost sort or hash operations
  - can make Postgres spill to disk (or swap), so a slower SQL query
  - worse: it can cause an out-of-memory error
- nested loop joins with large outer relations
  - a hash join is usually better in that case
- Filter conditions (vs. Index Cond)
  - a Filter means Postgres first fetches the rows and then filters them out, which is less performant (see the example below)
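A sketch of that difference (table and index names are illustrative):
-- Without an index on customer_id, the condition shows up as a Filter on a Seq Scan:
--   Seq Scan on orders
--     Filter: (customer_id = 123)
-- With an index, the condition is applied at the index level instead:
CREATE INDEX orders_customer_id_idx ON orders (customer_id);
EXPLAIN SELECT * FROM orders WHERE customer_id = 123;
--   Index Scan using orders_customer_id_idx on orders
--     Index Cond: (customer_id = 123)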
Indexes
- b-tree
- default index
- used for equality and range comparisons (<, <=, >, >=)
- hash
- only for equality
- GiST
- for geometric data and range data
- GIN
- for full-text search, arrays, and jsonb
- BRIN
- for ordered data
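The index type is chosen at creation time with USING; a sketch on hypothetical tables:
CREATE INDEX ON customers (email);                    -- b-tree, the default
CREATE INDEX ON customers USING hash (email);         -- hash, equality only
CREATE INDEX ON shops USING gist (location);          -- GiST, e.g. geometric data
CREATE INDEX ON articles USING gin (search_vector);   -- GIN, e.g. full-text search
CREATE INDEX ON events USING brin (created_at);       -- BRIN, e.g. naturally ordered data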
Different index types
- functional indexes
- multicolumn indexes
- covering indexes
- index on column A, but the value of column B is also stored in the index entry
- partial indexes
  - index with a WHERE clause
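A sketch of each variant, with illustrative table and column names:
-- functional index
CREATE INDEX ON customers (lower(email));
-- multicolumn index
CREATE INDEX ON orders (customer_id, order_date);
-- covering index: indexed on customer_id, but order_date is stored alongside in the index entry
CREATE INDEX ON orders (customer_id) INCLUDE (order_date);
-- partial index: only rows matching the WHERE clause are indexed
CREATE INDEX ON orders (order_date) WHERE status = 'pending';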
When indexes help
- selective queries
SELECT * FROM customers WHERE id = 123;
SELECT * FROM orders WHERE order_date BETWEEN '2023-01-01' AND '2023-01-31';
- join operations
- add indexes on foreign keys
SELECT *
FROM orders o
INNER JOIN order_items i ON o.order_id = i.order_id;
- sort operations
SELECT * FROM orders
WHERE customer_id = 123
ORDER BY order_date DESC;
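Indexes that would plausibly support the queries above (a sketch; names are illustrative, and the first query is already covered by the primary key on customers):
-- range queries on order_date
CREATE INDEX ON orders (order_date);
-- join on the foreign key
CREATE INDEX ON order_items (order_id);
-- filter on customer_id + sort on order_date in a single index
CREATE INDEX ON orders (customer_id, order_date DESC);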
When indexes hurt
- write performance
-- An index on order_date will make this query slower: every index on the table
-- has to be maintained when rows are updated.
-- Additional indexes on status, processing_date, updated_at would likely make the situation worse.
UPDATE orders
SET
status = 'processed',
processing_date = current_date,
updated_at = now()
WHERE
order_date BETWEEN '2023-01-01' AND '2023-01-31'
AND status = 'pending';
- low selectivity
-- If the planner decides to use the index, the execution will likely be slower than scanning the whole table.
-- => The Stack Overflow Trap
SELECT * FROM customers WHERE status = 'active';
- indexes drawback
- maintenance (writes, vacuum, …)
- disk space (storage costs, backup, …)
Finding the right balance
- use EXPLAIN ANALYZE
- look for columns with a lot of reads and few writes
- look for good selectivity
- consider multicolumn indexes and covering indexes
- regularly check index usage
- regularly look for duplicate indexes
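For the "regularly check index usage" point above, the statistics views give a starting point; a sketch:
-- Indexes that are never or rarely scanned are candidates for removal
SELECT schemaname, relname, indexrelname, idx_scan,
       pg_size_pretty(pg_relation_size(indexrelid)) AS index_size
FROM pg_stat_user_indexes
ORDER BY idx_scan ASC;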
Examples of issues
Low selectivity
In the EXPLAIN ANALYZE output:
# 'Rows Removed by Filter' counts the rows that were fetched and then discarded by the filter.
# When this number is close to the total number of rows in the table, and the estimated/actual rows
# are way lower than the number of rows removed, the selectivity is low and a sequential scan is
# actually faster (PostgreSQL roughly stops considering an index worthwhile once a query selects
# more than about 5-10% of the rows).
Rows Removed by Filter: 9588
Outdated statistics
When the estimated number of rows is off by an order of magnitude from the actual number of rows. Mitigation: run ANALYZE (see below).
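A minimal check and fix, with an illustrative table name:
-- Refresh the statistics, then re-run EXPLAIN ANALYZE and compare estimated vs actual rows
ANALYZE orders;
-- n_mod_since_analyze shows how many rows changed since the last ANALYZE
SELECT relname, last_analyze, last_autoanalyze, n_mod_since_analyze
FROM pg_stat_all_tables
WHERE relname = 'orders';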
Swap
temp read=8554 written=8554
Sort Method: external merge Disk: 68432kB
That means the sort did not fit in memory and spilled to temporary files on disk.
Mitigation: increase the work_mem parameter.
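For example, just for the current session (the value is illustrative):
-- Give sorts and hashes more memory before re-running the query
SET work_mem = '128MB';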