← Back to Dashboard

Day 4: Combining Data (UNION & UNION ALL)

Operator Behavior Performance
UNION Combines result sets and removes duplicates. Slower (requires the engine to sort and filter data).
UNION ALL Combines result sets and keeps all duplicates. Faster (simply appends data without checking for uniqueness).

1. What are Set Operators?

While JOINs are used to combine columns horizontally (side-by-side), Set Operators like UNION are used to combine rows vertically (stacking them on top of each other). This is incredibly useful when you have similar data spread across multiple tables.

2. The Golden Rules of UNION

You cannot just stack any two random tables together. To use UNION or UNION ALL, your SELECT statements must strictly follow these three rules:

  1. Same Number of Columns: Every SELECT statement must ask for the exact same number of columns.
  2. Same Data Types & Order: The columns must have compatible data types, and they must be in the exact same order in all queries.
  3. ORDER BY at the End: You cannot use an ORDER BY clause in the middle of a UNION. It can only be placed at the very end of the final SELECT statement.

3. UNION vs UNION ALL Examples

Scenario:

Imagine you have two tables: CurrentEmployees and FormerEmployees. You need a single master list of all email addresses.

Using UNION (Removes Duplicates)

If an employee quit and was rehired (meaning their email is in both tables), UNION will only show their email once.

SELECT EmailAddress, Department FROM CurrentEmployees
UNION
SELECT EmailAddress, Department FROM FormerEmployees;

Using UNION ALL (Keeps Duplicates)

This will return every single row from both tables. If an email is in both tables, it will appear twice in your results.

SELECT EmailAddress, Department FROM CurrentEmployees
UNION ALL
SELECT EmailAddress, Department FROM FormerEmployees;

Applying ORDER BY Correctly

Notice how the ORDER BY is only written once, at the very end of the entire query.

SELECT FirstName, LastName, 'Active' AS Status FROM CurrentEmployees
UNION ALL
SELECT FirstName, LastName, 'Inactive' AS Status FROM FormerEmployees
ORDER BY LastName ASC; -- Placed at the very end!