1️ Use Proper Indexing (Most
Important)
Indexes allow the database to avoid full table scans.
Create indexes on:
- JOIN
keys
- WHERE
clause columns
- GROUP
BY columns
- ORDER
BY columns
Example:
CREATE INDEX idx_orders_customer_id ON orders(customer_id);
CREATE INDEX idx_customer_country ON customer(country);
Key idea:
Indexes let the database filter and join data efficiently
without copying or storing anything.
2️.Rewrite Subqueries as JOINs or
EXISTS
Avoid IN (subquery) and correlated subqueries when possible.
❌ Slow
SELECT * FROM orders
WHERE customer_id IN (SELECT id FROM customers WHERE country='US');
✅ Faster (JOIN)
SELECT o.*
FROM orders o
JOIN customers c ON o.customer_id = c.id
WHERE c.country='US';
✅ Also fast (EXISTS)
SELECT *
FROM orders o
WHERE EXISTS (
SELECT 1 FROM
customers c
WHERE c.id =
o.customer_id AND c.country='US'
);
3️.Choose the Right JOIN Type
Unnecessary join types can degrade performance.
Replace:
- LEFT
JOIN → INNER JOIN (if possible)
- FULL
OUTER JOIN → split logic into UNION ALL
- CROSS
JOIN → avoid unless needed
Fewer rows processed = faster queries.
4️.Push Filters Down Early
(Predicate Pushdown)
Apply filters on the smallest dataset first.
❌ Slow
SELECT ...
FROM big_table b
JOIN small_table s ON ...
WHERE s.type = 'X';
✅ Fast
Move predicate to small table before join:
SELECT ...
FROM big_table b
JOIN (SELECT * FROM small_table WHERE type='X') s ON ...
This reduces join workload without materializing the data.
5️.Avoid Functions on Indexed
Columns
This blocks index usage.
❌ Bad
WHERE DATE(created_at) = '2024-01-01'
✅ Good
WHERE created_at >= '2024-01-01'
AND created_at < '2024-01-02'
6️.Use Covering Indexes
A covering index contains all columns needed, so the
DB doesn't fetch the table.
Example query:
SELECT amount, created_at
FROM orders
WHERE customer_id = 100;
Create covering index:
CREATE INDEX idx_orders_cover ON orders(customer_id,
created_at, amount);
The DB can serve the entire query from the index only
→ faster, no temp storing.
7️⃣ **Avoid SELECT ***
Only select columns you need.
❌ Bad
SELECT *
FROM orders o
JOIN customers c ON ...
✅ Good
SELECT o.id, o.amount, c.name
FROM orders o
JOIN customers c ON ...
Less data scanned + less data transferred.
8️.Use LIMIT, WINDOWING, and
Pagination
Avoid scanning large datasets.
Example Pagination:
SELECT * FROM orders
ORDER BY id
LIMIT 50 OFFSET 0;
Avoid OFFSET for large pages:
Use keyset pagination:
SELECT *
FROM orders
WHERE id > last_seen_id
ORDER BY id
LIMIT 50;
9️⃣ Normalize Query Logic (No
Redundant Operations)
Avoid repeating the same subquery multiple times.
❌ Bad
SELECT (SELECT price FROM products WHERE id = o.product_id),
(SELECT
category FROM products WHERE id = o.product_id)
FROM orders o;
✅ Good
SELECT p.price, p.category
FROM orders o
JOIN products p ON p.id = o.product_id;
10. Use Database-specific
Optimizer Hints (When Needed)
These do not materialize data; they influence execution
plan.
Examples:
- MySQL:
STRAIGHT_JOIN
- Oracle:
USE_NL, NO_MERGE
- SQL
Server: OPTION (HASH JOIN)
- Postgres:
SET enable_seqscan=off (temporary)
Only use when the optimizer chooses a poor plan.
1️1.Partitioning (Logical, Not Materializing)
Partitioning does not materialize data; it splits tables for
faster scanning.
Use partitioning on:
- date
columns
- high-cardinality
keys
Improves:
- scanning
- filtering
- aggregation
Without storing extra copies of data.
1️2.Use Window Functions Instead of Self-Joins
Window functions compute aggregates without extra joins.
❌ Slow
SELECT o.*,
(SELECT SUM(amount)
FROM orders WHERE customer_id=o.customer_id)
FROM orders o;
✅ Fast (window)
SELECT o.*,
SUM(amount) OVER
(PARTITION BY customer_id) AS customer_total
FROM orders o;
๐ง Summary: Optimization
Without Materializing Data
|
Technique
|
Benefit
|
|
Indexes
|
Fast filtering and joining
|
|
Rewriting subqueries
|
Reduce scans + better execution plans
|
|
Join optimization
|
Process fewer rows
|
|
Predicate pushdown
|
Filter early
|
|
Covering indexes
|
Avoid table lookups
|
|
Avoid functions on indexed columns
|
Enable index usage
|
|
Keyset pagination
|
Avoid large offsets
|
|
Window functions
|
Avoid redundant joins
|
|
Partitioning
|
Faster scans on large datasets
|