Database Query Optimization

Common Heuristics

Predicate Pushdown: Push down projects (π) and selects (σ) before join
Projection Pushdown: create smaller tuples and reduce intermediate results
Reorder predicates so that the DBMS applies the most selective one first.
Merge Predicate: If the two queries have intersection
Subquery Optimization - Rewriting: de-correlating and / or flattening nested subqueries.
Join elimination.

Note that you can only project away columns that are not used in the rest of the query (i.e. if they are in a SELECT or a WHERE clause that hasn’t yet been evaluated, you can’t get rid of the column).

Cost Estimation

Selection Statistics

used to determine the number of tuples that will be selected for a given input. So we can choose which index to use.

Optimization

Bottom up - System R

Use static rules to perform initial optimization. Then use dynamic programming to determine the best join order for tables using a divide-and conquer search method.

Break query up into blocks and generate the logical operators for each block
For each logical operator, generate a set of physical operators that implement it
Then, iteratively construct a ”left-deep” tree that minimizes the estimated amount of work to execute the plan

Top down - Volcano

Start with a logical plan of what we want the query to be. Perform a branch-and-bound search to traverse the plan tree by converting logical operators into physical operators.

Keep track of global best plan during search.
Treat physical properties of data as first-class entities during planning

Notes

Database Query Processing

Fish Touching🐟🎣

Explorer