1. What Are Oracle Subqueries?
A subquery is a query nested inside another query. Subqueries can be:
- Single-row: returns one row → used with =, <, >
- Multi-row: returns multiple rows → used with IN, ANY, ALL
- Scalar: returns one value → used as expression in SELECT/WHERE
- Correlated: references outer query columns → executed per row
- Non-Correlated: independent → executed once
- Inline View: used in FROM clause as derived table
2. Why Follow Best Practices?
- Avoid performance issues (especially with correlated subqueries)
- Ensure correct results (avoid NULL pitfalls, ORA-01427 errors)
- Improve readability and maintainability
- Make queries optimizer-friendly for faster execution
3. Best Practice 1: Prefer Non-Correlated Subqueries
- Non-correlated subqueries are executed once → faster
- Correlated subqueries execute per row → slower for large datasets
Example:
SELECT e.employee_id
FROM employees e
WHERE salary > (
SELECT AVG(salary)
FROM employees
WHERE department_id = e.department_id
);
SELECT employee_id
FROM (
SELECT employee_id, salary,
AVG(salary) OVER (PARTITION BY department_id) dept_avg
FROM employees
)
WHERE salary > dept_avg;
4. Best Practice 2: Use Aggregates to Ensure Single-Row Subquery
- Single-row subqueries must return exactly one row
- Use aggregate functions like AVG(), MAX(), MIN(), COUNT() to guarantee a single value
Example:
SELECT employee_id
FROM employees
WHERE salary > (SELECT AVG(salary) FROM employees);
5. Best Practice 3: Prefer EXISTS Instead of IN for Correlated Subqueries
- EXISTS is optimized for checking existence
- Stops processing once condition is met
- Safer when dealing with NULLs
Example:
SELECT e.employee_id
FROM employees e
WHERE EXISTS (
SELECT 1
FROM job_history j
WHERE j.employee_id = e.employee_id
);
SELECT e.employee_id
FROM employees e
WHERE e.employee_id IN (
SELECT employee_id
FROM job_history
);
6. Best Practice 4: Avoid NOT IN With NULLs
- NOT IN fails if subquery returns NULL
- Use NOT EXISTS instead
Example:
SELECT e.employee_id
FROM employees e
WHERE NOT EXISTS (
SELECT 1
FROM departments d
WHERE e.department_id = d.department_id
);
7. Best Practice 5: Use Inline Views or WITH Clause for Complex Aggregation
- Simplifies query
- Optimizer can merge inline views
- Avoids repeated computation
Inline View:
SELECT e.employee_id, d.avg_sal
FROM employees e
JOIN (
SELECT department_id, AVG(salary) AS avg_sal
FROM employees
GROUP BY department_id
) d
ON e.department_id = d.department_id
WHERE e.salary > d.avg_sal;
WITH Clause (CTE):
WITH dept_avg AS (
SELECT department_id, AVG(salary) AS avg_sal
FROM employees
GROUP BY department_id
)
SELECT e.employee_id, e.salary
FROM employees e
JOIN dept_avg d
ON e.department_id = d.department_id
WHERE e.salary > d.avg_sal;
8. Best Practice 6: Index Correlated Columns
- Correlated subqueries reference outer columns → repeated scans
- Indexing improves performance
Example:
CREATE INDEX idx_emp_dept ON employees(department_id);
9. Best Practice 7: Use Analytic Functions Instead of Correlated Subqueries
- Analytic functions calculate per-group results once per partition
- Avoid repeated inner query execution
Example:
SELECT employee_id, salary,
AVG(salary) OVER (PARTITION BY department_id) dept_avg
FROM employees
WHERE salary > dept_avg;
10. Best Practice 8: Limit Rows Early
- Use WHERE clauses in subqueries to reduce unnecessary processing
Example:
SELECT employee_id
FROM employees
WHERE department_id IN (
SELECT department_id
FROM departments
WHERE location_id = 1700
);
11. Best Practice 9: Avoid Deeply Nested Subqueries
- Excessive nesting → hard to read and harder for optimizer
- Flatten queries using JOINs, inline views, or CTEs
12. Best Practice 10: Check Execution Plan
- Use EXPLAIN PLAN to see subquery execution
- Check for full table scans, nested loops on large datasets, and unnesting opportunities
Example:
EXPLAIN PLAN FOR
SELECT e.employee_id
FROM employees e
WHERE salary > (
SELECT AVG(salary)
FROM employees
);
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY());
13. Summary Table of Best Practices
|
Best Practice |
Description |
|
Prefer Non-Correlated Subquery |
Executes once, faster |
|
Use Aggregates for Single-Row |
Prevent ORA-01427 |
|
Prefer EXISTS over IN |
Optimized for correlated checks |
|
Avoid NOT IN with NULLs |
Use NOT EXISTS |
|
Inline Views / CTEs |
Simplifies complex queries |
|
Index Correlated Columns |
Improves performance |
|
Use Analytic Functions |
Efficient per-group computation |
|
Limit Rows Early |
Reduces unnecessary processing |
|
Avoid Deep Nesting |
Maintain readability & optimizer efficiency |
|
Check Execution Plan |
Validate optimizer choices |
14. Short Interview Answer
“Oracle subquery best practices include using non-correlated subqueries when possible, preferring EXISTS over IN for correlated queries, using aggregates to guarantee single-row results, indexing correlated columns, limiting rows early, using inline views or CTEs for complex aggregations, and leveraging analytic functions to replace repeated correlated subqueries. Always check the execution plan to ensure optimal performance.”
No comments:
Post a Comment