Common Heuristics
- Predicate Pushdown: Push down projects (π) and selects (σ) before join
- Projection Pushdown: create smaller tuples and reduce intermediate results
- Reorder predicates so that the DBMS applies the most selective one first.
- Merge Predicate: If the two queries have intersection
- Subquery Optimization - Rewriting: de-correlating and / or flattening nested subqueries.
- Join elimination.
- Note that you can only project away columns that are not used in the rest of the query (i.e. if they are in a SELECT or a WHERE clause that hasn’t yet been evaluated, you can’t get rid of the column).
Cost Estimation
Selection Statistics
used to determine the number of tuples that will be selected for a given input. So we can choose which index to use.
Optimization
Bottom up - System R
Use static rules to perform initial optimization. Then use dynamic programming to determine the best join order for tables using a divide-and conquer search method.
- Break query up into blocks and generate the logical operators for each block
- For each logical operator, generate a set of physical operators that implement it
- Then, iteratively construct a ”left-deep” tree that minimizes the estimated amount of work to execute the plan
Top down - Volcano
Start with a logical plan of what we want the query to be. Perform a branch-and-bound search to traverse the plan tree by converting logical operators into physical operators.
- Keep track of global best plan during search.
- Treat physical properties of data as first-class entities during planning