SELECT v17

Name

SELECT Retrieve rows from a table or view.

Synopsis

SELECT [ optimizer_hint ] [ ALL | DISTINCT | UNIQUE ]
  * | expression [ AS output_name ] [, ...]
  FROM from_item [, ...]
  [ WHERE condition ]
  [ [ START WITH start_expression ]
      CONNECT BY { PRIOR parent_expr = child_expr |
        child_expr = PRIOR parent_expr }
    [ ORDER SIBLINGS BY expression [ ASC | DESC ] [, ...] ] ]
  [ GROUP BY { expression | ROLLUP ( expr_list ) |
      CUBE ( expr_list ) | GROUPING SETS ( expr_list ) } [, ...]
      [ LEVEL ] ]
  [ HAVING condition [, ...] ]
  [ { UNION [ ALL ] | INTERSECT | MINUS } select ]
  [ ORDER BY expression [ ASC | DESC ] [, ...] ]
  [ FOR UPDATE [WAIT n|NOWAIT|SKIP LOCKED]]

Where from_item can be one of:

table_name[@dblink ] [ alias ]
( select ) alias
from_item [ NATURAL ] join_type from_item
  [ ON join_condition | USING ( join_column [, ...] ) ]

Description

SELECT retrieves rows from one or more tables. The general processing of SELECT is as follows:

  1. All elements in the FROM list are computed. (Each element in the FROM list is a real or virtual table.) If you specify more than one element in the FROM list, they are cross joined. See FROM clause.
  2. If you specify the WHERE clause, all rows that don't satisfy the condition are eliminated from the output. See WHERE clause.
  3. If you specify the GROUP BY clause, the output is divided into groups of rows that match one or more values. The HAVING clause eliminates groups that don't satisfy the given condition. See GROUP BY clause and HAVING clause.
  4. Using the operators UNION, INTERSECT, and MINUS, you can combine the output of more than one SELECT statement to form a single result set. The UNION operator returns all rows that are in one or both of the result sets. The INTERSECT operator returns all rows that are strictly in both result sets. The MINUS operator returns the rows that are in the first result set but not in the second. In all three cases, duplicate rows are eliminated. In the case of the UNION operator, if you specify ALL, then duplicates aren't eliminated. See UNION clause, INTERSECT clause, and MINUS clause.
  5. The actual output rows are computed using the SELECT output expressions for each selected row. See SELECT list.
  6. The CONNECT BY clause is used to select data that has a hierarchical relationship. Such data has a parent-child relationship between rows. See CONNECT BY clause.
  7. If you specify the ORDER BY clause, the returned rows are sorted in the specified order. Otherwise, the rows are returned in the order the system finds fastest to produce. See ORDER BY clause.
  8. DISTINCT | UNIQUE eliminates duplicate rows from the result. ALL (the default) returns all candidate rows, including duplicates. See DISTINCT | UNIQUE clause.
  9. The FOR UPDATE clause causes the SELECT statement to lock the selected rows against concurrent updates. See FOR UPDATE clause.

You must have SELECT privilege on a table to read its values. The use of FOR UPDATE requires UPDATE privilege as well.

Parameters

optimizer_hint

Comment-embedded hints to the optimizer for selecting an execution plan. See Optimizer hints for information about optimizer hints.

FROM clause

The FROM clause specifies one or more source tables for a SELECT statement. The syntax is:

FROM source [, ...]

Where source can be one of following elements:

table_name[@dblink ]

The name (optionally schema-qualified) of an existing table or view. dblink is a database link name identifying a remote database. See the CREATE DATABASE LINK command for information on database links.

alias

A substitute name for the FROM item containing the alias. Use an alias for brevity or to eliminate ambiguity for self-joins where the same table is scanned multiple times. Providing an alias completely hides the name of the table or function. For example, given FROM foo AS f, the remainder of the SELECT must refer to this FROM item as f, not foo.

select

A sub-SELECT can appear in the FROM clause. This acts as though its output were created as a temporary table for the duration of this single SELECT command. Surround the sub-SELECT with parentheses, and provide an alias for it.

join_type

One of the following:

[ INNER ] JOIN

LEFT [ OUTER ] JOIN

RIGHT [ OUTER ] JOIN

FULL [ OUTER ] JOIN

CROSS JOIN

For the INNER and OUTER join types, you must specify a join condition, namely one of NATURAL, ON join_condition, or USING (join_column [, ...] ). For CROSS JOIN, you can't use any of these clauses.

A JOIN clause combines two FROM items. Use parentheses to determine the order of nesting. Without parentheses, JOINs nest left-to-right. In any case JOIN binds more tightly than the commas separating FROM items.

CROSS JOIN and INNER JOIN produce a simple Cartesian product, the same result that you get from listing the two items at the top level of FROM but restricted by any join condition. CROSS JOIN is equivalent to INNER JOIN ON (TRUE), that is, no rows are removed by qualification. These join types are a notational convenience, since you can accomplish the same thing using FROM and WHERE.

LEFT OUTER JOIN returns all rows in the qualified Cartesian product, that is, all combined rows that pass its join condition. It also returns one copy of each row in the left-hand table for which there was no right-hand row that passed the join condition. This left-hand row is extended to the full width of the joined table by inserting null values for the right-hand columns. Only the JOIN clause’s own condition is considered while deciding which rows have matches. Outer conditions are applied afterwards.

Conversely, RIGHT OUTER JOIN returns all the joined rows plus one row for each unmatched right-hand row extended with nulls on the left. This is a notational convenience, since you can convert it to a LEFT OUTER JOIN by switching the left and right inputs.

FULL OUTER JOIN returns all the joined rows, one row for each unmatched left-hand row extended with nulls on the right, and one row for each unmatched right-hand row extended with nulls on the left.

ON join_condition

join_condition is an expression resulting in a value of type BOOLEAN (similar to a WHERE clause) that specifies the rows in a join that are considered to match.

USING (join_column [, ...] )

A clause of the form USING (a, b, ... ) is shorthand for ON left_table.a = right_table.a AND left_table.b = right_table.b .... Also, USING implies that only one of each pair of equivalent columns is included in the join output, not both.

NATURAL

NATURAL is shorthand for a USING list that mentions all columns in the two tables that have the same names.

If you specify multiple sources, the result is the Cartesian product (cross join) of all the sources. Usually qualification conditions are added to restrict the returned rows to a small subset of the Cartesian product.

Example

This example selects all of the entries from the dept table:

SELECT * FROM dept;
Output
deptno |  dname      |  loc
-------+-------------+-----------
    10 |  ACCOUNTING |  NEW YORK
    20 |  RESEARCH   |  DALLAS
    30 |  SALES      |  CHICAGO
    40 | OPERATIONS  |  BOSTON
 (4 rows)

WHERE clause

The optional WHERE clause has the form:

WHERE condition

where condition is any expression that evaluates to a result of type BOOLEAN. Any row that doesn't satisfy this condition is eliminated from the output. A row satisfies the condition if it returns TRUE when the actual row values are substituted for any variable references.

Example

This example joins the contents of the emp and dept tables. The value of the deptno column in the emp table is equal to the value of the deptno column in the deptno table.

SELECT d.deptno, d.dname, e.empno, e.ename, e.mgr, e.hiredate
    FROM emp e, dept d
    WHERE d.deptno = e.deptno;
Output
 deptno |   dname    | empno | ename  | mgr  |      hiredate
--------+------------+-------+--------+------+--------------------
     10 | ACCOUNTING |  7934 | MILLER | 7782 | 23-JAN-82 00:00:00
     10 | ACCOUNTING |  7782 | CLARK  | 7839 | 09-JUN-81 00:00:00
     10 | ACCOUNTING |  7839 | KING   |      | 17-NOV-81 00:00:00
     20 | RESEARCH   |  7788 | SCOTT  | 7566 | 19-APR-87 00:00:00
     20 | RESEARCH   |  7566 | JONES  | 7839 | 02-APR-81 00:00:00
     20 | RESEARCH   |  7369 | SMITH  | 7902 | 17-DEC-80 00:00:00
     20 | RESEARCH   |  7876 | ADAMS  | 7788 | 23-MAY-87 00:00:00
     20 | RESEARCH   |  7902 | FORD   | 7566 | 03-DEC-81 00:00:00
     30 | SALES      |  7521 | WARD   | 7698 | 22-FEB-81 00:00:00
     30 | SALES      |  7844 | TURNER | 7698 | 08-SEP-81 00:00:00
     30 | SALES      |  7499 | ALLEN  | 7698 | 20-FEB-81 00:00:00
     30 | SALES      |  7698 | BLAKE  | 7839 | 01-MAY-81 00:00:00
     30 | SALES      |  7654 | MARTIN | 7698 | 28-SEP-81 00:00:00
     30 | SALES      |  7900 | JAMES  | 7698 | 03-DEC-81 00:00:00
(14 rows)

GROUP BY clause

The optional GROUP BY clause has the form:

GROUP BY { expression | ROLLUP ( expr_list ) |
  CUBE ( expr_list ) | GROUPING SETS ( expr_list ) } [, ...]

GROUP BY condenses all selected rows that share the same values for the grouped expressions into a single row. expression can be an input column name or the name or ordinal number of an output column (SELECT list item). Or it can be an arbitrary expression formed from input-column values. In case of ambiguity, a GROUP BY name is interpreted as an input-column name rather than an output column name.

ROLLUP, CUBE, and GROUPING SETS are extensions to the GROUP BY clause for supporting multidimensional analysis.

Aggregate functions, if any are used, are computed across all rows making up each group, producing a separate value for each group. Without GROUP BY, an aggregate produces a single value computed across all the selected rows. When GROUP BY is present, it isn't valid for the SELECT list expressions to refer to ungrouped columns except in aggregate functions, since there is more than one possible value to return for an ungrouped column.

Example

This example computes the sum of the sal column in the emp table, grouping the results by department number:

SELECT deptno, SUM(sal) AS total
    FROM emp
    GROUP BY deptno;
Output
 deptno |  total
--------+----------
     10 |  8750.00
     20 | 10875.00
     30 |  9400.00
(3 rows)

HAVING clause

The optional HAVING clause has the form:

HAVING condition

Where condition is the same as specified for the WHERE clause.

HAVING eliminates group rows that don't satisfy the specified condition. HAVING is different from WHERE. WHERE filters individual rows before applying GROUP BY, while HAVING filters group rows created by GROUP BY. Each column referenced in condition must unambiguously reference a grouping column, unless the reference appears in an aggregate function.

Example

Sum the column sal of all employees, group the results by department number, and show those group totals that are less than 10000:

SELECT deptno, SUM(sal) AS total
    FROM emp
    GROUP BY deptno
    HAVING SUM(sal) < 10000;
Output
 deptno |  total
--------+---------
     10 | 8750.00
     30 | 9400.00
(2 rows)

SELECT List

The SELECT list (between the keywords SELECT and FROM) specifies expressions that form the output rows of the SELECT statement. The expressions can refer to columns computed in the FROM clause, and they usually do. Using the clause AS output_name, you can specify another name for an output column. This name is primarily used to label the column for display. You can also use it to refer to the column’s value in ORDER BY and GROUP BY clauses but not in the WHERE or HAVING clauses. In those clauses, you must write out the expression.

Instead of an expression, you can write * in the output list as a shorthand for all the columns of the selected rows.

Example

The SELECT list in this example specifies for the result set to include the empno column, the ename column, the mgr column, and the hiredate column:

SELECT empno, ename, mgr, hiredate FROM emp;
Output
 empno | ename  | mgr  |      hiredate
-------+--------+------+--------------------
  7934 | MILLER | 7782 | 23-JAN-82 00:00:00
  7782 | CLARK  | 7839 | 09-JUN-81 00:00:00
  7839 | KING   |      | 17-NOV-81 00:00:00
  7788 | SCOTT  | 7566 | 19-APR-87 00:00:00
  7566 | JONES  | 7839 | 02-APR-81 00:00:00
  7369 | SMITH  | 7902 | 17-DEC-80 00:00:00
  7876 | ADAMS  | 7788 | 23-MAY-87 00:00:00
  7902 | FORD   | 7566 | 03-DEC-81 00:00:00
  7521 | WARD   | 7698 | 22-FEB-81 00:00:00
  7844 | TURNER | 7698 | 08-SEP-81 00:00:00
  7499 | ALLEN  | 7698 | 20-FEB-81 00:00:00
  7698 | BLAKE  | 7839 | 01-MAY-81 00:00:00
  7654 | MARTIN | 7698 | 28-SEP-81 00:00:00
  7900 | JAMES  | 7698 | 03-DEC-81 00:00:00
(14 rows)

UNION clause

The UNION clause has the form:

select_statement UNION [ ALL ] select_statement

Where select_statement is any SELECT statement without an ORDER BY or FOR UPDATE clause. You can attach ORDER BY to a sub-expression if you enclose it in parentheses. Without parentheses, these clauses apply to the result of the UNION, not to its right-hand input expression.

The UNION operator computes the set union of the rows returned by the involved SELECT statements. A row is in the set union of two result sets if it appears in at least one of the result sets. The two SELECT statements that represent the direct operands of the UNION must produce the same number of columns, and corresponding columns must have compatible data types.

The result of UNION doesn't contain any duplicate rows unless you specify the ALL option. ALL prevents eliminating duplicates.

Without parentheses, multiple UNION operators in the same SELECT statement are evaluated left to right.

Currently, you can't specify FOR UPDATE either for a UNION result or for any input of a UNION.

INTERSECT clause

The INTERSECT clause has the form:

select_statement INTERSECT select_statement

Where select_statement is any SELECT statement without an ORDER BY or FOR UPDATE clause.

The INTERSECT operator computes the set intersection of the rows returned by the involved SELECT statements. A row is in the intersection of two result sets if it appears in both result sets.

The result of INTERSECT doesn't contain any duplicate rows.

Without parentheses, multiple INTERSECT operators in the same SELECT statement are evaluated left to right. INTERSECT binds more tightly than UNION. That is, A UNION B INTERSECT C is read as A UNION (B INTERSECT C).

MINUS clause

The MINUS clause has this general form:

select_statement MINUS select_statement

Where select_statement is any SELECT statement without an ORDER BY or FOR UPDATE clause.

The MINUS operator computes the set of rows that are in the result of the left SELECT statement but not in the result of the right one.

The result of MINUS doesn't contain any duplicate rows.

Without parentheses, multiple MINUS operators in the same SELECT statement are evaluated left to right. MINUS binds at the same level as UNION.

CONNECT BY clause

The CONNECT BY clause determines the parent-child relationship of rows when performing a hierarchical query. It has the general form:

CONNECT BY { PRIOR parent_expr = child_expr |
  child_expr = PRIOR parent_expr }

Where parent_expr is evaluated on a candidate parent row. If parent_expr = child_expr results in TRUE for a row returned by the FROM clause, then this row is considered a child of the parent.

You can specify the following optional clauses with the CONNECT BY clause.

START WITH start_expression

The rows returned by the FROM clause on which start_expression evaluates to TRUE become the root nodes of the hierarchy.

ORDER SIBLINGS BY expression [ ASC | DESC ] [, ...]

Sibling rows of the hierarchy are ordered by expression in the result set.

Note

EDB Postgres Advanced Server doesn't support the use of AND or other operators in the CONNECT BY clause.

ORDER BY clause

The optional ORDER BY clause has the form:

ORDER BY expression [ ASC | DESC ] [, ...]

Where expression can be the name or ordinal number of an output column (SELECT list item), or it can be an arbitrary expression formed from input-column values.

The ORDER BY clause sorts the result rows according to the specified expressions. If two rows are equal according to the left-most expression, they are compared according to the next expression, and so on. If they are equal according to all specified expressions, they are returned in an implementation-dependent order.

The ordinal number refers to the ordinal (left-to-right) position of the result column. This feature makes it possible to define an ordering on the basis of a column that doesn't have a unique name. This is never required because you can always assign a name to a result column using the AS clause.

You also can use arbitrary expressions in the ORDER BY clause, including columns that don't appear in the SELECT result list. Thus the following statement is valid:

SELECT ename FROM emp ORDER BY empno;

A limitation of this feature is that an ORDER BY clause applying to the result of a UNION, INTERSECT, or MINUS clause can specify only an output column name or number, not an expression.

If an ORDER BY expression is a simple name that matches both a result column name and an input column name, ORDER BY interprets it as the result column name. This behavior is the opposite of the choice that GROUP BY makes in the same situation. This inconsistency is compatible with the SQL standard.

Optionally, you can add the key word ASC (ascending) or DESC (descending) after any expression in the ORDER BY clause. ASC is the default.

The null value sorts higher than any other value. In other words, with ascending sort order, null values sort at the end. With descending sort order, null values sort at the beginning.

Character-string data is sorted according to the locale-specific collation order that was established when the database cluster was initialized.

Note

If you specify SELECT DISTINCT or if a SELECT statement includes the SELECT DISTINCT ...ORDER BY clause, then all the expressions in ORDER BY must be present in the select list of the SELECT DISTINCT query.

Examples

These examples sort the individual results according to the contents of the second column (dname):

SELECT * FROM dept ORDER BY dname;
Output
 deptno |   dname    |   loc
--------+------------+----------
     10 | ACCOUNTING | NEW YORK
     40 | OPERATIONS | BOSTON
     20 | RESEARCH   | DALLAS
     30 | SALES      | CHICAGO
(4 rows)
SELECT * FROM dept ORDER BY 2;
Output
 deptno |   dname    |   loc
--------+------------+----------
     10 | ACCOUNTING | NEW YORK
     40 | OPERATIONS | BOSTON
     20 | RESEARCH   | DALLAS
     30 | SALES      | CHICAGO
(4 rows)

This example uses the SELECT DISTINCT ...ORDER BY clause to fetch the job and deptno from table emp:

CREATE TABLE EMP(EMPNO NUMBER(4) NOT NULL,
ENAME VARCHAR2(10),
JOB VARCHAR2(9),
DEPTNO NUMBER(2));
INSERT INTO EMP VALUES (7369, 'SMITH', 'CLERK', 20);
INSERT 0 1
INSERT INTO EMP VALUES (7499, 'ALLEN', 'SALESMAN', 30);
INSERT 0 1
INSERT INTO EMP VALUES (7521, 'WARD', 'SALESMAN', 30);
INSERT 0 1
INSERT INTO EMP VALUES (7566, 'JONES', 'MANAGER', 20);
INSERT 0 1
SELECT DISTINCT e.job, e.deptno FROM emp e ORDER BY e.job, e.deptno;
Output
 job      | deptno
----------+--------
 CLERK    |     20
 MANAGER  |     20
 SALESMAN |     30
(3 rows)

DISTINCT | UNIQUE clause

If a SELECT statement specifies DISTINCT or UNIQUE, all duplicate rows are removed from the result set. One row is kept from each group of duplicates. The DISTINCT or UNIQUE clauses are synonymous when used with a SELECT statement. The ALL keyword specifies the opposite, which is that all rows are kept (the default).

Error messages resulting from the improper use of a SELECT statement that includes the DISTINCT or UNIQUE keywords include both the DISTINCT | UNIQUE keywords:

psql: ERROR: FOR UPDATE is not allowed with DISTINCT/UNIQUE clause

FOR UPDATE clause

The FOR UPDATE clause takes the form:

FOR UPDATE [WAIT n|NOWAIT|SKIP LOCKED]

FOR UPDATE causes the rows retrieved by the SELECT statement to be locked as though for update. This prevents a row from being modified or deleted by other transactions until the current transaction ends. Any transaction that attempts to UPDATE, DELETE, or SELECT FOR UPDATE a selected row is blocked until the current transaction ends. If an UPDATE, DELETE, or SELECT FOR UPDATE from another transaction has already locked a selected row or rows, SELECT FOR UPDATE waits for the first transaction to complete. It then locks and returns the updated row, or it returns no row if the row was deleted.

You can't use FOR UPDATE in contexts where you can't clearly identify returned rows with individual table rows, for example, with aggregation.

Use FOR UPDATE options to specify locking preferences:

  • Include the WAIT n keywords to specify the number of seconds or fractional seconds for the SELECT statement to wait for a row locked by another session. Use a decimal form to specify fractional seconds. For example, WAIT 1.5 waits one and a half seconds. Specify up to four digits to the right of the decimal.
  • Include the NOWAIT keyword to report an error immediately if the current session can't lock a row.
  • Include SKIP LOCKED to lock rows, if possible, and skip rows that are already locked by another session.