PostgreSQL SELF JOIN



PostgreSQL SELF JOIN

The PostgreSQL SELF JOIN is a regular join but the table is joined with itself. This is essential for querying hierarchical data or comparing rows within the same table.


Syntax

SELECT a.column1, b.column2
FROM table_name a
JOIN table_name b
ON a.common_field = b.common_field;

The SELF JOIN has the following components:

  • a: The alias for the first instance of the table.
  • b: The alias for the second instance of the table.
  • common_field: The field used to join the table to itself.

Example PostgreSQL SELF JOIN Queries

Let's look at some examples of PostgreSQL SELF JOIN queries:

1. Basic SELF JOIN Example

SELECT a.employee_id, a.employee_name, b.manager_name
FROM employees a
JOIN employees b
ON a.manager_id = b.employee_id;

This query retrieves the employee_id and employee_name from the first instance of the employees table and the manager_name from the second instance of the employees table where the manager_id in the first instance matches the employee_id in the second instance.

2. SELF JOIN with WHERE Clause

SELECT a.employee_id, a.employee_name, b.manager_name
FROM employees a
JOIN employees b
ON a.manager_id = b.employee_id
WHERE a.department_id = 1;

This query retrieves the employee_id and employee_name from the first instance of the employees table and the manager_name from the second instance of the employees table where the manager_id in the first instance matches the employee_id in the second instance, and the department_id in the first instance is 1.

3. SELF JOIN to Find Duplicates

SELECT a.employee_name, b.employee_name
FROM employees a
JOIN employees b
ON a.email = b.email
AND a.employee_id <> b.employee_id;

This query retrieves pairs of employee_name from two instances of the employees table where the email is the same but the employee_id is different, helping to find duplicates.


Full Example

Let's go through a complete example that includes creating a table, inserting data, and using the SELF JOIN clause to retrieve hierarchical data.

Step 1: Creating a Table

This step involves creating a new table named employees to store employee data.

CREATE TABLE employees (
    employee_id SERIAL PRIMARY KEY,
    employee_name VARCHAR(100),
    manager_id INT,
    department_id INT,
    email VARCHAR(100)
);

In this example, we create a table named employees with columns for employee_id, employee_name, manager_id, department_id, and email.

Step 2: Inserting Data into the Table

This step involves inserting some sample data into the employees table.

INSERT INTO employees (employee_name, manager_id, department_id, email)
VALUES ('John Doe', NULL, 1, 'john.doe@example.com'),
       ('Jane Smith', 1, 1, 'jane.smith@example.com'),
       ('Jim Brown', 1, 2, 'jim.brown@example.com'),
       ('Jake White', 2, 1, 'jake.white@example.com');

Here, we insert data into the employees table.

Step 3: Using the SELF JOIN Clause

This step involves using the SELF JOIN clause to retrieve hierarchical data from the employees table.

-- Basic SELF JOIN
SELECT a.employee_id, a.employee_name, b.employee_name AS manager_name
FROM employees a
JOIN employees b
ON a.manager_id = b.employee_id;

-- SELF JOIN with WHERE Clause
SELECT a.employee_id, a.employee_name, b.employee_name AS manager_name
FROM employees a
JOIN employees b
ON a.manager_id = b.employee_id
WHERE a.department_id = 1;

-- SELF JOIN to Find Duplicates
SELECT a.employee_name, b.employee_name
FROM employees a
JOIN employees b
ON a.email = b.email
AND a.employee_id <> b.employee_id;

These queries demonstrate how to use the SELF JOIN clause to retrieve hierarchical data from the employees table, including basic self-joins, self-joins with additional conditions, and self-joins to find duplicates.

Conclusion

The PostgreSQL SELF JOIN is a fundamental tool for querying hierarchical data or comparing rows within the same table. Understanding how to use the SELF JOIN and its syntax is essential for effective data retrieval and manipulation in PostgreSQL databases.