Categories:
Query Syntax

JOIN

A JOIN clause is a sub-clause of a FROM clause.

A JOIN operation combines rows from two tables (or other sources, such as views or table functions) to create a new combined row that can be used in the query.

Typically, the data in the two tables is related in some way. For example, one table might hold information about projects, and one table might hold information about the employees working on those projects.

The two tables usually contain one or more columns in common so that the rows in one table can be associated with the corresponding rows in the other table. For example, each row in the projects table might have a unique project ID number, and each row in the employees table might include the ID number of the project that the employee is currently assigned to.

The JOIN clause specifies (explicitly or implicitly) how to relate rows in one table to the corresponding rows in the other table, typically by referencing the common column(s), such as project ID.

Syntax

Use one of the following:

SELECT ...
FROM <object_ref1> [
                     {
                       INNER
                       | { LEFT | RIGHT | FULL } [ OUTER ]
                     }
                   ]
                   JOIN <object_ref2>
  [ ON <condition> ]
[ ... ]
SELECT *
FROM <object_ref1> [
                     {
                       INNER
                       | { LEFT | RIGHT | FULL } [ OUTER ]
                     }
                   ]
                   JOIN <object_ref2>
  [ USING( <column_list> ) ]
[ ... ]
SELECT ...
FROM <object_ref1> [
                     {
                       | NATURAL [ { LEFT | RIGHT | FULL } [ OUTER ] ]
                       | CROSS
                     }
                   ]
                   JOIN <object_ref2>
[ ... ]

JOIN

The semantics of joins are as follows (for brevity, this topic uses o1 and o2 for object_ref1 and object_ref2, respectively):

Join Type Semantics
o1 INNER JOIN o2 For each row of o1, a row is produced for each row of o2 that matches according to the ON condition subclause. (Note that you can also use a comma to specify an inner join. For an example, see the examples section below.) If you use INNER JOIN without the ON clause (or if you use comma without a WHERE clause), the result is the same as using CROSS JOIN: a cartesian product (every row of o1 paired with every row of o2).
o1 LEFT OUTER JOIN o2 The result of the inner join is augmented with a row for each row of o1 that has no matches in o2. The result columns referencing o2 contain null.
o1 RIGHT OUTER JOIN o2 The result of the inner join is augmented with a row for each row of o2 that has no matches in o1. The result columns referencing o1 contain null.
o1 FULL OUTER JOIN o2 Returns all joined rows, plus one row for each unmatched left side row (extended with nulls on the right), plus one row for each unmatched right side row (extended with nulls on the left).
o1 CROSS JOIN o2 For every possible combination of rows from o1 and o2 (i.e. Cartesian product), the joined table contains a row consisting of all columns in o1 followed by all columns in o2. A CROSS JOIN cannot be combined with an ON condition clause. However, you can use a WHERE clause to filter the results.
o1 NATURAL JOIN o2 A NATURAL JOIN is identical to an explicit JOIN on the common columns of the two tables, except that the common columns are included only once in the output. (A natural join assumes that columns with the same name, but in different tables, contain corresponding data.) See the Examples section below for some examples. A NATURAL JOIN can be combined with an OUTER JOIN. A NATURAL JOIN cannot be combined with an ON condition clause because the JOIN condition is already implied. However, you can use a WHERE clause to filter the results.

Default: INNER JOIN

If the word JOIN is used without specifying INNER or OUTER, then the JOIN will be an inner join.

ON condition

A boolean expression that defines the rows from the two sides of the JOIN that are considered to match, for example:

ON object_ref2.id_number = object_ref1.id_number

Conditions are discussed in more detail in the WHERE clause.

The ON clause is prohibited for CROSS JOIN.

The ON clause is unnecessary (and prohibited) for NATURAL JOIN; the join columns are implied.

For other joins, the ON clause is optional. However, omitting the ON clause results in a cartesian product (every row of object_ref1 paired with every row of object_ref2). A cartesian product can produce a very large volume of output, almost all of which consists of pairs of rows that aren’t actually related; this consumes a lot of resources and is often a user error.

USING( <column_list> )

A list of columns in common between the two tables being joined; these columns are used as the join columns. The columns must have the same name and meaning in each of the tables being joined.

For example, suppose that the SQL statement contains:

... o1 JOIN o2
    USING (key_column)

In the simple case, this would be equivalent to:

... o1 JOIN o2
    ON o2.key_column = o1.key_column

To use the USING clause properly, the projection list (the list of columns and other expressions after the SELECT keyword) should be “*”. This allows the server to return the key_column exactly once. If the key column is referenced more than once, the query results might be unexpected. For examples of standard and non-standard usage, see the examples below.

Examples

Many of the JOIN examples use two tables, t1 and t2. The tables and their data are created as shown below:

CREATE TABLE t1 (col1 INTEGER);
CREATE TABLE t2 (col1 INTEGER);
INSERT INTO t1 (col1) VALUES 
   (2),
   (3),
   (4);
INSERT INTO t2 (col1) VALUES 
   (1),
   (2),
   (2),
   (3);

Inner join:

SELECT t1.col1, t2.col1 FROM t1 INNER JOIN t2 ON t2.col1 = t1.col1 ORDER BY 1,2;
+------+------+
| COL1 | COL1 |
|------+------|
|    2 |    2 |
|    2 |    2 |
|    3 |    3 |
+------+------+

This shows a left outer join. Note the NULL value for the row in table t1 that doesn’t have a matching row in table t2.

SELECT t1.col1, t2.col1 FROM t1 LEFT OUTER JOIN t2 ON t2.col1 = t1.col1 ORDER BY 1,2;
+------+------+
| COL1 | COL1 |
|------+------|
|    2 |    2 |
|    2 |    2 |
|    3 |    3 |
|    4 | NULL |
+------+------+

This shows a right outer join. Note the NULL value for the row in table t1 that doesn’t have a matching row in table t2.

SELECT t1.col1, t2.col1 FROM t1 RIGHT OUTER JOIN t2 ON t2.col1 = t1.col1 ORDER BY 1,2;
+------+------+
| COL1 | COL1 |
|------+------|
|    2 |    2 |
|    2 |    2 |
|    3 |    3 |
| NULL |    1 |
+------+------+

This shows a full outer join. Note that since each table has a row that doesn’t have a matching row in the other table, the output contains two rows with NULL values:

SELECT t1.col1, t2.col1 FROM t1 FULL OUTER JOIN t2 ON t2.col1 = t1.col1 ORDER BY 1,2;
+------+------+
| COL1 | COL1 |
|------+------|
|    2 |    2 |
|    2 |    2 |
|    3 |    3 |
|    4 | NULL |
| NULL |    1 |
+------+------+

Here is an example of a cross join, which will produce the cartesian product. Note that the cross join does not have an ON clause.

SELECT t1.col1, t2.col1 FROM t1 CROSS JOIN t2 ORDER BY 1, 2;
+------+------+
| COL1 | COL1 |
|------+------|
|    2 |    1 |
|    2 |    2 |
|    2 |    2 |
|    2 |    3 |
|    3 |    1 |
|    3 |    2 |
|    3 |    2 |
|    3 |    3 |
|    4 |    1 |
|    4 |    2 |
|    4 |    2 |
|    4 |    3 |
+------+------+

A cross join can be filtered by a WHERE clause, as shown in the example below:

SELECT t1.col1, t2.col1 FROM t1 CROSS JOIN t2 WHERE t2.col1 = t1.col1 ORDER BY 1, 2;
+------+------+
| COL1 | COL1 |
|------+------|
|    2 |    2 |
|    2 |    2 |
|    3 |    3 |
+------+------+

This is an example of a natural join. This produces the same output as the corresponding inner join, except that we don’t get a second copy of the join column:

CREATE OR REPLACE TABLE d1 (
  id number,
  name string
  );
+--------------------------------+
| status                         |
|--------------------------------|
| Table D1 successfully created. |
+--------------------------------+
INSERT INTO d1 (id, name) VALUES
  (1,'a'),
  (2,'b'),
  (4,'c');
+-------------------------+
| number of rows inserted |
|-------------------------|
|                       3 |
+-------------------------+
CREATE OR REPLACE TABLE d2 (
  id number,
  value string
  );
+--------------------------------+
| status                         |
|--------------------------------|
| Table D2 successfully created. |
+--------------------------------+
INSERT INTO d2 (id, value) VALUES
  (1,'xx'),
  (2,'yy'),
  (5,'zz');
+-------------------------+
| number of rows inserted |
|-------------------------|
|                       3 |
+-------------------------+
SELECT * FROM d1 NATURAL INNER JOIN d2 ORDER BY id;
+----+------+-------+
| ID | NAME | VALUE |
|----+------+-------|
|  1 | a    | xx    |
|  2 | b    | yy    |
+----+------+-------+

Natural joins can be combined with outer joins, for example:

SELECT *
  FROM d1 NATURAL FULL OUTER JOIN d2
  ORDER BY ID;
+----+------+-------+
| ID | NAME | VALUE |
|----+------+-------|
|  1 | a    | xx    |
|  2 | b    | yy    |
|  4 | c    | NULL  |
|  5 | NULL | zz    |
+----+------+-------+

The comma operator is older syntax for INNER JOIN. When a comma is used instead of INNER JOIN, the filter condition is put in the WHERE clause rather than the ON clause. The following two statements are equivalent:

Newer (preferred) style:

SELECT t1.col1, t2.col1 FROM t1 JOIN t2 ON t2.col1 = t1.col1 ORDER BY 1, 2;
+------+------+
| COL1 | COL1 |
|------+------|
|    2 |    2 |
|    2 |    2 |
|    3 |    3 |
+------+------+

Older style:

SELECT t1.col1, t2.col1 FROM t1, t2 WHERE t2.col1 = t1.col1 ORDER BY 1, 2;
+------+------+
| COL1 | COL1 |
|------+------|
|    2 |    2 |
|    2 |    2 |
|    3 |    3 |
+------+------+

Joins can be combined in the FROM clause. The following query shows two joins chained together in the FROM clause:

SELECT t1.*, t2.*, t3.*
  FROM t1
     LEFT OUTER JOIN t2 ON (t1.col1 = t2.col1)
     RIGHT OUTER JOIN t3 ON (t3.col1 = t2.col1)
  ORDER BY t1.col1;
+------+------+------+
| COL1 | COL1 | COL1 |
|------+------+------|
|    2 |    2 |    2 |
|    2 |    2 |    2 |
| NULL | NULL |    6 |
+------+------+------+

In such a query, the results are determined based on the joins taking place from left to right (though the optimizer may reorder the joins if a different join order will produce the same result). If the right outer join is meant to take place before the left outer join, then the query can be written as follows:

SELECT t1.*, t2.*, t3.*
  FROM t1
     LEFT OUTER JOIN
     (t2 RIGHT OUTER JOIN t3 ON (t3.col1 = t2.col1))
     ON (t1.col1 = t2.col1)
  ORDER BY t1.col1;
+------+------+------+
| COL1 | COL1 | COL1 |
|------+------+------|
|    2 |    2 |    2 |
|    2 |    2 |    2 |
|    3 | NULL | NULL |
|    4 | NULL | NULL |
+------+------+------+

The two examples below show standard and non-standard usage of the USING clause.

This first example shows correct usage. Specifically, the projection list contains “*” and nothing else. Even though the query joins two tables, and each table has one column, and the query asks for all columns, the output contains one column, not two columns.

WITH 
    l AS (
         SELECT 'a' AS userid
         ),
    r AS (
         SELECT 'b' AS userid
         )
  SELECT * 
    FROM l LEFT JOIN r USING(userid)
;
+--------+
| USERID |
|--------|
| a      |
+--------+

The following example shows non-standard usage; the projection list contains something other than “*”. Because the usage is non-standard, the output contains two columns named “userid”, and the second occurrence (which you might expect to contain a value from table ‘r’) contains a value that is not in the table (the value ‘a’ is not in the table ‘r’).

WITH 
    l AS (
         SELECT 'a' AS userid
       ),
    r AS (
         SELECT 'b' AS userid
         )
  SELECT l.userid as UI_L,
         r.userid as UI_R  -- Incorrect usage!
    FROM l LEFT JOIN r USING(userid)
;
+------+------+
| UI_L | UI_R |
|------+------|
| a    | a    |
+------+------+