The DISTINCT keyword in Oracle SQL is used to remove duplicate rows from the result set, ensuring that only unique (distinct) rows are returned. This is especially useful when you want to get a list of unique values from one or more columns in a query.
Key Points About DISTINCT in Oracle
1. Basic Usage of DISTINCT
The DISTINCT keyword is placed directly after SELECT to eliminate duplicate rows from the result set based on the columns specified in the query.
Syntax:
SELECT DISTINCT column1, column2, ...
FROM table_name;
Example:
SELECT DISTINCT department_id
FROM employees;
In this example, the query will return only unique department_id values, even if some departments appear multiple times in the table.
2. DISTINCT with Multiple Columns
When DISTINCT is applied to multiple columns, the uniqueness of the result set is determined by the combination of all the specified columns. This means that only rows with a unique combination of values across all the specified columns will be returned.
Example:
SELECT DISTINCT department_id, job_id
FROM employees;
In this case, the query will return only distinct combinations of department_id and job_id. If the same pair of department_id and job_id appears in multiple rows, only one row will be returned.
3. DISTINCT on a Single Column
You can apply DISTINCT to a single column to get unique values from that column.
Example:
SELECT DISTINCT job_id
FROM employees;
This query will return a list of unique job IDs across all employees.
4. DISTINCT and Aggregate Functions
When combined with aggregate functions such as COUNT(), SUM(), AVG(), etc., DISTINCT affects only the arguments within the aggregate function: duplicate values of the argument are discarded before the aggregate is computed.
- COUNT DISTINCT: Counts only the unique non-NULL values in a column.
Example:
SELECT COUNT(DISTINCT department_id)
FROM employees;
This query will return the number of distinct department IDs in the employees table.
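To see how DISTINCT changes an aggregate, it can help to put the plain and DISTINCT forms side by side. The following is a minimal sketch; it assumes the employees table also has a numeric salary column, which does not appear in the examples in this post.

```sql
-- Assumption: employees has a numeric salary column (not shown elsewhere here).
SELECT COUNT(*)                      AS total_rows,           -- every row
       COUNT(department_id)          AS non_null_departments, -- rows with a non-NULL department_id
       COUNT(DISTINCT department_id) AS distinct_departments, -- unique non-NULL department IDs
       SUM(salary)                   AS total_salary,         -- every salary is added
       SUM(DISTINCT salary)          AS sum_distinct_salaries -- each salary value is added once
FROM employees;
```

If two employees earn the same salary, SUM(DISTINCT salary) adds that amount only once, which is rarely what a payroll total needs, so the DISTINCT form of SUM() and AVG() should be chosen deliberately.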
5. DISTINCT with ORDER BY Clause
You can use DISTINCT in combination with the ORDER BY clause to return distinct values sorted in a particular order.
Example:
SELECT DISTINCT department_id
FROM employees
ORDER BY department_id;
This query will return distinct department IDs, sorted in ascending order.
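One restriction is worth knowing when combining the two: with SELECT DISTINCT, Oracle expects every ORDER BY expression to appear in the select list. The sketch below reuses the employees columns from the earlier examples; the failing statement is left commented out.

```sql
-- Fails with ORA-01791 (not a SELECTed expression),
-- because job_id is not in the select list:
-- SELECT DISTINCT department_id FROM employees ORDER BY job_id;

-- Works: every ORDER BY expression is also selected.
SELECT DISTINCT department_id, job_id
FROM employees
ORDER BY job_id, department_id;
```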
6. Performance Considerations with DISTINCT
- Using DISTINCT requires Oracle to compare the rows of the result set and remove duplicates, typically with a sort or hash step. This can have performance implications, especially on large tables or when applied to multiple columns.
- To optimize performance, consider indexing the columns you use with DISTINCT, if possible. An index can allow Oracle to read the values from the index instead of the whole table, which may reduce the overall cost of the operation.
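A quick way to check what DISTINCT costs for a given query is to look at its execution plan. The sketch below uses EXPLAIN PLAN and DBMS_XPLAN.DISPLAY; the index name emp_department_idx is hypothetical, and the CREATE INDEX statement is only relevant if the column is not already indexed.

```sql
-- Hypothetical index name; skip this if department_id is already indexed.
CREATE INDEX emp_department_idx ON employees (department_id);

-- Ask the optimizer how it would run the query.
EXPLAIN PLAN FOR
SELECT DISTINCT department_id
FROM employees;

-- Show the plan that was just generated.
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
```

The duplicate-removal step typically shows up as a HASH UNIQUE or SORT UNIQUE operation. Whether the index is used depends on the Oracle version, statistics, and column definition. For example, a B-tree index does not contain rows where the indexed column is NULL, so Oracle may still read the table when the column is nullable.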
7. DISTINCT and NULL Values
DISTINCT treats NULL values as identical. If a column has multiple NULL values, DISTINCT will consider them as a single value, meaning only one NULL will appear in the result set.
Example:
SELECT DISTINCT department_id
FROM employees;
If the department_id column contains multiple NULL values, only one NULL value will appear in the result set, even if the table contains many rows with NULL.
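If the single NULL row is not wanted in the output, it can be filtered out before DISTINCT is applied. A minimal sketch using the same employees table:

```sql
-- Return only the distinct non-NULL department IDs.
SELECT DISTINCT department_id
FROM employees
WHERE department_id IS NOT NULL;
```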
8. Using DISTINCT with JOIN
DISTINCT can also be used in queries involving JOIN operations to return unique rows based on a combination of columns from different tables.
Example:
SELECT DISTINCT e.department_id, d.department_name
FROM employees e
JOIN departments d ON e.department_id = d.department_id;
This query will return unique combinations of department_id and department_name from the employees and departments tables.
9. Using DISTINCT with GROUP BY
DISTINCT is sometimes written together with GROUP BY, but it is not necessary when the select list contains the grouping columns, because GROUP BY already returns one row per unique combination of those columns.
Example with DISTINCT:
SELECT DISTINCT department_id
FROM employees
GROUP BY department_id;
However, since GROUP BY groups the data by the specified column(s), applying DISTINCT would be redundant here.
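For comparison, the same result can be obtained by dropping DISTINCT and keeping only the GROUP BY. This is a sketch of the simplified form, not a claim about which variant Oracle executes faster.

```sql
-- GROUP BY alone already yields one row per department_id.
SELECT department_id
FROM employees
GROUP BY department_id;
```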
10. DISTINCT vs. GROUP BY
Both DISTINCT and GROUP BY can be used to remove duplicates, but they serve different purposes:
- DISTINCT: Removes duplicate rows based on all the columns in the select list.
- GROUP BY: Groups rows by specific columns and can be used in conjunction with aggregate functions (e.g., COUNT(), SUM(), AVG()).
Example of DISTINCT:
SELECT DISTINCT department_id
FROM employees;
Example of GROUP BY:
SELECT department_id, COUNT(*)
FROM employees
GROUP BY department_id;
11. Common Use Cases for DISTINCT
- Retrieving Unique Values: When you need a list of unique values from a table (e.g., unique department IDs, job titles).
- Removing Duplicates: When duplicates may arise in a result set due to joins or other operations.
- Data Quality Checks: When analyzing data quality, you might use DISTINCT to check how many unique entries exist in a column.
- Aggregating Data: When you want to count or summarize distinct values in your dataset.
12. Limitations of DISTINCT
- Performance: As mentioned earlier, using DISTINCT can slow down query performance, especially on large datasets.
- Duplication in Complex Joins: DISTINCT only removes rows that are identical in every selected column. If a join multiplies rows and the selected columns differ between them, the result can still contain rows that look like duplicates for the columns you care about.
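When a join is used only to check that a related row exists, one alternative is to avoid the row multiplication in the first place instead of cleaning it up with DISTINCT. The sketch below lists departments that have at least one employee, using the same employees and departments tables as the JOIN example above.

```sql
-- Each department appears at most once, because the subquery only
-- tests for existence and does not add rows to the result.
SELECT d.department_id, d.department_name
FROM departments d
WHERE EXISTS (
    SELECT 1
    FROM employees e
    WHERE e.department_id = d.department_id
);
```

This assumes department_id is unique in departments. If it is not, duplicate department rows would still be returned.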
Examples of DISTINCT in Queries
1. Simple Use Case
Get distinct job IDs from the employees table.
SELECT DISTINCT job_id
FROM employees;
2. Using DISTINCT with Multiple Columns
Get distinct department ID and job ID combinations.
SELECT DISTINCT department_id, job_id
FROM employees;
3. Using DISTINCT with COUNT
Count the number of unique departments.
SELECT COUNT(DISTINCT department_id)
FROM employees;
4. Using DISTINCT with ORDER BY
Get distinct department IDs sorted in ascending order.
SELECT DISTINCT department_id
FROM employees
ORDER BY department_id;
5. Combining DISTINCT with JOIN
Get distinct department IDs and department names from two tables (employees and departments).
SELECT DISTINCT e.department_id, d.department_name
FROM employees e
JOIN departments d ON e.department_id = d.department_id;