Categories:

String & Binary Functions (Matching/Comparison)

SUBSTR , SUBSTRING

Returns the portion of the string or binary value from base_expr, starting from the character/byte specified by start_expr, with optionally limited length.

These functions are synonymous.

See also

LEFT , RIGHT

Syntax

SUBSTR( <base_expr>, <start_expr> [ , <length_expr> ] )

SUBSTRING( <base_expr>, <start_expr> [ , <length_expr> ] )

Arguments

base_expr

This must be a VARCHAR or BINARY value.

start_expr

The start position should be an expression that evaluates to an integer. It specifies the offset from which the substring starts. The offset is measured in:

  • The number of UTF-8 characters if the input is VARCHAR.

  • The number of bytes if the input is BINARY.

The start position is 1-based, not 0-based. SUBSTR('abc', 1, 1) returns ‘a’, not ‘b’.

length_expr

The length should be an expression that evaluates to an integer. It should specify:

  • The number of UTF-8 characters to return if the input is VARCHAR.

  • The number of bytes to return if the input is BINARY.

Returns

The data type of the returned value is the same as the data type of the base_expr (BINARY or VARCHAR).

Usage Notes

  • If length_expr is used, up to length_expr characters/bytes are returned, otherwise all the characters until the end of the string or binary value are returned.

  • The values in start_expr start from 1:

    • If 0 is specified, it is treated as 1.

    • If a negative value is specified, the starting position is computed as the start_expr characters/bytes from the end of the string or binary value. If the position is outside of the range of a string or binary value, an empty value is returned.

  • If any of the inputs are NULL, NULL is returned.

Collation Details

  • Collation applies to VARCHAR inputs. Collation does not apply if the input data type of the first parameter is BINARY.

  • No impact. Although collation is accepted syntactically, collations have no impact on processing. For example, languages with two-character and three-character letters (e.g. “dzs” in Hungarian, “ch” in Czech) still count those as two or three characters (not one character) for the length argument.

  • The collation of the result is the same as the collation of the input. This can be useful if the returned value is passed to another function as part of nested function calls.

Examples

SELECT SUBSTR('testing 1 2 3', 9, 5) FROM x;

-------------------------------+
 substr('testing 1 2 3', 9, 5) |
-------------------------------+
 1 2 3                         |
-------------------------------+

SELECT '123456', pos, len, SUBSTR('123456', pos, len) FROM o;

----------+--------+-----+----------------------------+
 '123456' |  pos   | len | substr('123456', pos, len) |
----------+--------+-----+----------------------------+
 123456   | -1     | 3   | 6                          |
 123456   | -3     | 3   | 456                        |
 123456   | -3     | 7   | 456                        |
 123456   | -5     | 3   | 234                        |
 123456   | -7     | 3   |                            |
 123456   | 0      | 3   | 123                        |
 123456   | 0      | 7   | 123456                     |
 123456   | 1      | 3   | 123                        |
 123456   | 3      | 3   | 345                        |
 123456   | 3      | 7   | 3456                       |
 123456   | 5      | 3   | 56                         |
 123456   | 5      | 7   | 56                         |
 123456   | 7      | 3   |                            |
 123456   | [NULL] | 3   | [NULL]                     |
 123456   | [NULL] | 7   | [NULL]                     |
----------+--------+-----+----------------------------+