Chapter 12. Functions and Operators

Table of Contents

12.1. Function and Operator Reference
12.2. Type Conversion in Expression Evaluation
12.3. Operators
12.3.1. Operator Precedence
12.3.2. Comparison Functions and Operators
12.3.3. Logical Operators
12.3.4. Assignment Operators
12.4. Control Flow Functions
12.5. String Functions
12.5.1. String Comparison Functions
12.5.2. Regular Expressions
12.6. Numeric Functions and Operators
12.6.1. Arithmetic Operators
12.6.2. Mathematical Functions
12.7. Date and Time Functions
12.8. What Calendar Is Used By MySQL?
12.9. Full-Text Search Functions
12.9.1. Natural Language Full-Text Searches
12.9.2. Boolean Full-Text Searches
12.9.3. Full-Text Searches with Query Expansion
12.9.4. Full-Text Stopwords
12.9.5. Full-Text Restrictions
12.9.6. Fine-Tuning MySQL Full-Text Search
12.9.7. Adding a Collation for Full-Text Indexing
12.10. Cast Functions and Operators
12.11. XML Functions
12.12. Bit Functions
12.13. Encryption and Compression Functions
12.14. Information Functions
12.15. Functions Used with Global Transaction IDs
12.16. Miscellaneous Functions
12.17. Functions and Modifiers for Use with GROUP BY Clauses
12.17.1. GROUP BY (Aggregate) Functions
12.17.2. GROUP BY Modifiers
12.17.3. MySQL Extensions to GROUP BY
12.18. Spatial Extensions
12.18.1. Introduction to MySQL Spatial Support
12.18.2. The OpenGIS Geometry Model
12.18.3. Supported Spatial Data Formats
12.18.4. Creating a Spatially Enabled MySQL Database
12.18.5. Spatial Analysis Functions
12.18.6. Optimizing Spatial Analysis
12.18.7. MySQL Conformance and Compatibility
12.19. Precision Math
12.19.1. Types of Numeric Values
12.19.2. DECIMAL Data Type Changes
12.19.3. Expression Handling
12.19.4. Rounding Behavior
12.19.5. Precision Math Examples

Expressions can be used at several points in SQL statements, such as in the ORDER BY or HAVING clauses of SELECT statements, in the WHERE clause of a SELECT, DELETE, or UPDATE statement, or in SET statements. Expressions can be written using literal values, column values, NULL, built-in functions, stored functions, user-defined functions, and operators. This chapter describes the functions and operators that are permitted for writing expressions in MySQL. Instructions for writing stored functions and user-defined functions are given in Section 18.2, “Using Stored Routines (Procedures and Functions)”, and Section 22.3, “Adding New Functions to MySQL”. See Section 9.2.4, “Function Name Parsing and Resolution”, for the rules describing how the server interprets references to different kinds of functions.

An expression that contains NULL always produces a NULL value unless otherwise indicated in the documentation for a particular function or operator.

Note

By default, there must be no whitespace between a function name and the parenthesis following it. This helps the MySQL parser distinguish between function calls and references to tables or columns that happen to have the same name as a function. However, spaces around function arguments are permitted.

You can tell the MySQL server to accept spaces after function names by starting it with the --sql-mode=IGNORE_SPACE option. (See Section 5.1.7, “Server SQL Modes”.) Individual client programs can request this behavior by using the CLIENT_IGNORE_SPACE option for mysql_real_connect(). In either case, all function names become reserved words.

For the sake of brevity, most examples in this chapter display the output from the mysql program in abbreviated form. Rather than showing examples in this format:

mysql> SELECT MOD(29,9);
+-----------+
| mod(29,9) |
+-----------+
|         2 |
+-----------+
1 rows in set (0.00 sec)

This format is used instead:

mysql> SELECT MOD(29,9);
        -> 2

12.1. Function and Operator Reference

Table 12.1. Functions/Operators

NameDescription
ABS()Return the absolute value
ACOS()Return the arc cosine
ADDDATE()Add time values (intervals) to a date value
ADDTIME()Add time
AES_DECRYPT()Decrypt using AES
AES_ENCRYPT()Encrypt using AES
AND, &&Logical AND
ASCII()Return numeric value of left-most character
ASIN()Return the arc sine
=Assign a value (as part of a SET statement, or as part of the SET clause in an UPDATE statement)
:=Assign a value
ATAN2(), ATAN()Return the arc tangent of the two arguments
ATAN()Return the arc tangent
AVG()Return the average value of the argument
BENCHMARK()Repeatedly execute an expression
BETWEEN ... AND ...Check whether a value is within a range of values
BIN()Return a string containing binary representation of a number
BINARYCast a string to a binary string
BIT_AND()Return bitwise and
BIT_COUNT()Return the number of bits that are set
BIT_LENGTH()Return length of argument in bits
BIT_OR()Return bitwise or
BIT_XOR()Return bitwise xor
&Bitwise AND
~Invert bits
|Bitwise OR
^Bitwise XOR
CASECase operator
CAST()Cast a value as a certain type
CEIL()Return the smallest integer value not less than the argument
CEILING()Return the smallest integer value not less than the argument
CHAR_LENGTH()Return number of characters in argument
CHAR()Return the character for each integer passed
CHARACTER_LENGTH()A synonym for CHAR_LENGTH()
CHARSET()Return the character set of the argument
COALESCE()Return the first non-NULL argument
COERCIBILITY()Return the collation coercibility value of the string argument
COLLATION()Return the collation of the string argument
COMPRESS()Return result as a binary string
CONCAT_WS()Return concatenate with separator
CONCAT()Return concatenated string
CONNECTION_ID()Return the connection ID (thread ID) for the connection
CONV()Convert numbers between different number bases
CONVERT_TZ()Convert from one timezone to another
CONVERT()Cast a value as a certain type
COS()Return the cosine
COT()Return the cotangent
COUNT(DISTINCT)Return the count of a number of different values
COUNT()Return a count of the number of rows returned
CRC32()Compute a cyclic redundancy check value
CURDATE()Return the current date
CURRENT_DATE(), CURRENT_DATESynonyms for CURDATE()
CURRENT_TIME(), CURRENT_TIMESynonyms for CURTIME()
CURRENT_TIMESTAMP(), CURRENT_TIMESTAMPSynonyms for NOW()
CURRENT_USER(), CURRENT_USERThe authenticated user name and host name
CURTIME()Return the current time
DATABASE()Return the default (current) database name
DATE_ADD()Add time values (intervals) to a date value
DATE_FORMAT()Format date as specified
DATE_SUB()Subtract a time value (interval) from a date
DATE()Extract the date part of a date or datetime expression
DATEDIFF()Subtract two dates
DAY()Synonym for DAYOFMONTH()
DAYNAME()Return the name of the weekday
DAYOFMONTH()Return the day of the month (0-31)
DAYOFWEEK()Return the weekday index of the argument
DAYOFYEAR()Return the day of the year (1-366)
DECODE()Decodes a string encrypted using ENCODE()
DEFAULT()Return the default value for a table column
DEGREES()Convert radians to degrees
DES_DECRYPT()Decrypt a string
DES_ENCRYPT()Encrypt a string
DIVInteger division
/Division operator
ELT()Return string at index number
ENCODE()Encode a string
ENCRYPT()Encrypt a string
<=>NULL-safe equal to operator
=Equal operator
EXP()Raise to the power of
EXPORT_SET()Return a string such that for every bit set in the value bits, you get an on string and for every unset bit, you get an off string
EXTRACT()Extract part of a date
ExtractValue()Extracts a value from an XML string using XPath notation
FIELD()Return the index (position) of the first argument in the subsequent arguments
FIND_IN_SET()Return the index position of the first argument within the second argument
FLOOR()Return the largest integer value not greater than the argument
FORMAT()Return a number formatted to specified number of decimal places
FOUND_ROWS()For a SELECT with a LIMIT clause, the number of rows that would be returned were there no LIMIT clause
FROM_BASE64()Decode to a base-64 string and return result
FROM_DAYS()Convert a day number to a date
FROM_UNIXTIME()Format UNIX timestamp as a date
GET_FORMAT()Return a date format string
GET_LOCK()Get a named lock
>=Greater than or equal operator
>Greater than operator
GREATEST()Return the largest argument
GROUP_CONCAT()Return a concatenated string
GTID_SUBSET()Return true if all GTIDs in subset are also in set; otherwise false.
GTID_SUBTRACT()Return all GTIDs in set that are not in subset.
HEX()Return a hexadecimal representation of a decimal or string value
HOUR()Extract the hour
IF()If/else construct
IFNULL()Null if/else construct
IN()Check whether a value is within a set of values
INET_ATON()Return the numeric value of an IP address
INET_NTOA()Return the IP address from a numeric value
INET6_ATON()Return the numeric value of an IPv6 address
INET6_NTOA()Return the IPv6 address from a numeric value
INSERT()Insert a substring at the specified position up to the specified number of characters
INSTR()Return the index of the first occurrence of substring
INTERVAL()Return the index of the argument that is less than the first argument
IS_FREE_LOCK()Checks whether the named lock is free
IS_IPV4_COMPAT()Return true if argument is an IPv4-compatible address
IS_IPV4_MAPPED()Return true if argument is an IPv4-mapped address
IS_IPV4()Return true if argument is an IPv4 address
IS_IPV6()Return true if argument is an IPv6 address
IS NOT NULLNOT NULL value test
IS NOTTest a value against a boolean
IS NULLNULL value test
IS_USED_LOCK()Checks whether the named lock is in use. Return connection identifier if true.
ISTest a value against a boolean
ISNULL()Test whether the argument is NULL
LAST_DAYReturn the last day of the month for the argument
LAST_INSERT_ID()Value of the AUTOINCREMENT column for the last INSERT
LCASE()Synonym for LOWER()
LEAST()Return the smallest argument
<<Left shift
LEFT()Return the leftmost number of characters as specified
LENGTH()Return the length of a string in bytes
<=Less than or equal operator
<Less than operator
LIKESimple pattern matching
LN()Return the natural logarithm of the argument
LOAD_FILE()Load the named file
LOCALTIME(), LOCALTIMESynonym for NOW()
LOCALTIMESTAMP, LOCALTIMESTAMP()Synonym for NOW()
LOCATE()Return the position of the first occurrence of substring
LOG10()Return the base-10 logarithm of the argument
LOG2()Return the base-2 logarithm of the argument
LOG()Return the natural logarithm of the first argument
LOWER()Return the argument in lowercase
LPAD()Return the string argument, left-padded with the specified string
LTRIM()Remove leading spaces
MAKE_SET()Return a set of comma-separated strings that have the corresponding bit in bits set
MAKEDATE()Create a date from the year and day of year
MAKETIMEMAKETIME()
MASTER_POS_WAIT()Block until the slave has read and applied all updates up to the specified position
MATCHPerform full-text search
MAX()Return the maximum value
MD5()Calculate MD5 checksum
MICROSECOND()Return the microseconds from argument
MID()Return a substring starting from the specified position
MIN()Return the minimum value
-Minus operator
MINUTE()Return the minute from the argument
MOD()Return the remainder
% or MODModulo operator
MONTH()Return the month from the date passed
MONTHNAME()Return the name of the month
NAME_CONST()Causes the column to have the given name
NOT BETWEEN ... AND ...Check whether a value is not within a range of values
!=, <>Not equal operator
NOT IN()Check whether a value is not within a set of values
NOT LIKENegation of simple pattern matching
NOT REGEXPNegation of REGEXP
NOT, !Negates value
NOW()Return the current date and time
NULLIF()Return NULL if expr1 = expr2
OCT()Return a string containing octal representation of a number
OCTET_LENGTH()A synonym for LENGTH()
OLD_PASSWORD()Return the value of the pre-4.1 implementation of PASSWORD
||, ORLogical OR
ORD()Return character code for leftmost character of the argument
PASSWORD()Calculate and return a password string
PERIOD_ADD()Add a period to a year-month
PERIOD_DIFF()Return the number of months between periods
PI()Return the value of pi
+Addition operator
POSITION()A synonym for LOCATE()
POW()Return the argument raised to the specified power
POWER()Return the argument raised to the specified power
PROCEDURE ANALYSE()Analyze the results of a query
QUARTER()Return the quarter from a date argument
QUOTE()Escape the argument for use in an SQL statement
RADIANS()Return argument converted to radians
RAND()Return a random floating-point value
REGEXPPattern matching using regular expressions
RELEASE_LOCK()Releases the named lock
REPEAT()Repeat a string the specified number of times
REPLACE()Replace occurrences of a specified string
REVERSE()Reverse the characters in a string
>>Right shift
RIGHT()Return the specified rightmost number of characters
RLIKESynonym for REGEXP
ROUND()Round the argument
ROW_COUNT()The number of rows updated
RPAD()Append string the specified number of times
RTRIM()Remove trailing spaces
SCHEMA()A synonym for DATABASE()
SEC_TO_TIME()Converts seconds to 'HH:MM:SS' format
SECOND()Return the second (0-59)
SESSION_USER()Synonym for USER()
SHA1(), SHA()Calculate an SHA-1 160-bit checksum
SHA2()Calculate an SHA-2 checksum
SIGN()Return the sign of the argument
SIN()Return the sine of the argument
SLEEP()Sleep for a number of seconds
SOUNDEX()Return a soundex string
SOUNDS LIKECompare sounds
SPACE()Return a string of the specified number of spaces
SQRT()Return the square root of the argument
STD()Return the population standard deviation
STDDEV_POP()Return the population standard deviation
STDDEV_SAMP()Return the sample standard deviation
STDDEV()Return the population standard deviation
STR_TO_DATE()Convert a string to a date
STRCMP()Compare two strings
SUBDATE()A synonym for DATE_SUB() when invoked with three arguments
SUBSTR()Return the substring as specified
SUBSTRING_INDEX()Return a substring from a string before the specified number of occurrences of the delimiter
SUBSTRING()Return the substring as specified
SUBTIME()Subtract times
SUM()Return the sum
SYSDATE()Return the time at which the function executes
SYSTEM_USER()Synonym for USER()
TAN()Return the tangent of the argument
TIME_FORMAT()Format as time
TIME_TO_SEC()Return the argument converted to seconds
TIME()Extract the time portion of the expression passed
TIMEDIFF()Subtract time
*Multiplication operator
TIMESTAMP()With a single argument, this function returns the date or datetime expression; with two arguments, the sum of the arguments
TIMESTAMPADD()Add an interval to a datetime expression
TIMESTAMPDIFF()Subtract an interval from a datetime expression
TO_BASE64()Return the argument converted to a base-64 string
TO_DAYS()Return the date argument converted to days
TO_SECONDS()Return the date or datetime argument converted to seconds since Year 0
TRIM()Remove leading and trailing spaces
TRUNCATE()Truncate to specified number of decimal places
UCASE()Synonym for UPPER()
-Change the sign of the argument
UNCOMPRESS()Uncompress a string compressed
UNCOMPRESSED_LENGTH()Return the length of a string before compression
UNHEX()Return a string containing hex representation of a number
UNIX_TIMESTAMP()Return a UNIX timestamp
UpdateXML()Return replaced XML fragment
UPPER()Convert to uppercase
USER()The user name and host name provided by the client
UTC_DATE()Return the current UTC date
UTC_TIME()Return the current UTC time
UTC_TIMESTAMP()Return the current UTC date and time
UUID_SHORT()Return an integer-valued universal identifier
UUID()Return a Universal Unique Identifier (UUID)
VALIDATE_PASSWORD_STRENGTH()Determine strength of password
VALUES()Defines the values to be used during an INSERT
VAR_POP()Return the population standard variance
VAR_SAMP()Return the sample variance
VARIANCE()Return the population standard variance
VERSION()Returns a string that indicates the MySQL server version
WAIT_UNTIL_SQL_THREAD_AFTER_GTIDS()Wait until the slave SQL thread has executed all the given GTIDs. Returns: the number of events that were executed (or NULL, if GTID mode is not enabled).
WEEK()Return the week number
WEEKDAY()Return the weekday index
WEEKOFYEAR()Return the calendar week of the date (0-53)
WEIGHT_STRING()Return the weight string for a string
XORLogical XOR
YEAR()Return the year
YEARWEEK()Return the year and week

12.2. Type Conversion in Expression Evaluation

When an operator is used with operands of different types, type conversion occurs to make the operands compatible. Some conversions occur implicitly. For example, MySQL automatically converts numbers to strings as necessary, and vice versa.

mysql> SELECT 1+'1';
        -> 2
mysql> SELECT CONCAT(2,' test');
        -> '2 test'

It is also possible to convert a number to a string explicitly using the CAST() function. Conversion occurs implicitly with the CONCAT() function because it expects string arguments.

mysql> SELECT 38.8, CAST(38.8 AS CHAR);
        -> 38.8, '38.8'
mysql> SELECT 38.8, CONCAT(38.8);
        -> 38.8, '38.8'

See later in this section for information about the character set of implicit number-to-string conversions.

The following rules describe how conversion occurs for comparison operations:

  • If one or both arguments are NULL, the result of the comparison is NULL, except for the NULL-safe <=> equality comparison operator. For NULL <=> NULL, the result is true. No conversion is needed.

  • If both arguments in a comparison operation are strings, they are compared as strings.

  • If both arguments are integers, they are compared as integers.

  • Hexadecimal values are treated as binary strings if not compared to a number.

  • If one of the arguments is a TIMESTAMP or DATETIME column and the other argument is a constant, the constant is converted to a timestamp before the comparison is performed. This is done to be more ODBC-friendly. Note that this is not done for the arguments to IN()! To be safe, always use complete datetime, date, or time strings when doing comparisons. For example, to achieve best results when using BETWEEN with date or time values, use CAST() to explicitly convert the values to the desired data type.

    A single-row subquery from a table or tables is not considered a constant. For example, if a subquery returns an integer to be compared to a DATETIME value, the comparison is done as two integers. The integer is not converted to a temporal value. To compare the operands as DATETIME values, use CAST() to explicitly convert the subquery value to DATETIME.

  • If one of the arguments is a decimal value, comparison depends on the other argument. The arguments are compared as decimal values if the other argument is a decimal or integer value, or as floating-point values if the other argument is a floating-point value.

  • In all other cases, the arguments are compared as floating-point (real) numbers.

For information about conversion of values from one temporal type to another, see Section 11.3.7, “Conversion Between Date and Time Types”.

The following examples illustrate conversion of strings to numbers for comparison operations:

mysql> SELECT 1 > '6x';
        -> 0
mysql> SELECT 7 > '6x';
        -> 1
mysql> SELECT 0 > 'x6';
        -> 0
mysql> SELECT 0 = 'x6';
        -> 1

For comparisons of a string column with a number, MySQL cannot use an index on the column to look up the value quickly. If str_col is an indexed string column, the index cannot be used when performing the lookup in the following statement:

SELECT * FROM tbl_name WHERE str_col=1;

The reason for this is that there are many different strings that may convert to the value 1, such as '1', ' 1', or '1a'.

Comparisons that use floating-point numbers (or values that are converted to floating-point numbers) are approximate because such numbers are inexact. This might lead to results that appear inconsistent:

mysql> SELECT '18015376320243458' = 18015376320243458;
        -> 1
mysql> SELECT '18015376320243459' = 18015376320243459;
        -> 0

Such results can occur because the values are converted to floating-point numbers, which have only 53 bits of precision and are subject to rounding:

mysql> SELECT '18015376320243459'+0.0;
        -> 1.8015376320243e+16

Furthermore, the conversion from string to floating-point and from integer to floating-point do not necessarily occur the same way. The integer may be converted to floating-point by the CPU, whereas the string is converted digit by digit in an operation that involves floating-point multiplications.

The results shown will vary on different systems, and can be affected by factors such as computer architecture or the compiler version or optimization level. One way to avoid such problems is to use CAST() so that a value will not be converted implicitly to a float-point number:

mysql> SELECT CAST('18015376320243459' AS UNSIGNED) = 18015376320243459;
        -> 1

For more information about floating-point comparisons, see Section C.5.5.8, “Problems with Floating-Point Values”.

In MySQL 5.7, the server includes dtoa, a conversion library that provides the basis for improved conversion between string or DECIMAL values and approximate-value (FLOAT/DOUBLE) numbers:

  • Consistent conversion results across platforms, which eliminates, for example, Unix versus Windows conversion differences.

  • Accurate representation of values in cases where results previously did not provide sufficient precision, such as for values close to IEEE limits.

  • Conversion of numbers to string format with the best possible precision. The precision of dtoa is always the same or better than that of the standard C library functions.

Because the conversions produced by this library differ in some cases from previous results, the potential exists for incompatibilities in applications that rely on previous results. For example, applications that depend on a specific exact result from previous conversions might need adjustment to accommodate additional precision.

The dtoa library provides conversions with the following properties. D represents a value with a DECIMAL or string representation, and F represents a floating-point number in native binary (IEEE) format.

  • F -> D conversion is done with the best possible precision, returning D as the shortest string that yields F when read back in and rounded to the nearest value in native binary format as specified by IEEE.

  • D -> F conversion is done such that F is the nearest native binary number to the input decimal string D.

These properties imply that F -> D -> F conversions are lossless unless F is -inf, +inf, or NaN. The latter values are not supported because the SQL standard defines them as invalid values for FLOAT or DOUBLE.

For D -> F -> D conversions, a sufficient condition for losslessness is that D uses 15 or fewer digits of precision, is not a denormal value, -inf, +inf, or NaN. In some cases, the conversion is lossless even if D has more than 15 digits of precision, but this is not always the case.

In MySQL 5.7, implicit conversion of a numeric or temporal value to string produces a value that has a character set and collation determined by the character_set_connection and collation_connection system variables. (These variables commonly are set with SET NAMES. For information about connection character sets, see Section 10.1.4, “Connection Character Sets and Collations”.)

This means that such a conversion results in a character (nonbinary) string (a CHAR, VARCHAR, or LONGTEXT value), except in the case that the connection character set is set to binary. In that case, the conversion result is a binary string (a BINARY, VARBINARY, or LONGBLOB value).

12.3. Operators

Table 12.2. Operators

NameDescription
AND, &&Logical AND
=Assign a value (as part of a SET statement, or as part of the SET clause in an UPDATE statement)
:=Assign a value
BETWEEN ... AND ...Check whether a value is within a range of values
BINARYCast a string to a binary string
&Bitwise AND
~Invert bits
|Bitwise OR
^Bitwise XOR
CASECase operator
DIVInteger division
/Division operator
<=>NULL-safe equal to operator
=Equal operator
>=Greater than or equal operator
>Greater than operator
IS NOT NULLNOT NULL value test
IS NOTTest a value against a boolean
IS NULLNULL value test
ISTest a value against a boolean
<<Left shift
<=Less than or equal operator
<Less than operator
LIKESimple pattern matching
-Minus operator
% or MODModulo operator
NOT BETWEEN ... AND ...Check whether a value is not within a range of values
!=, <>Not equal operator
NOT LIKENegation of simple pattern matching
NOT REGEXPNegation of REGEXP
NOT, !Negates value
||, ORLogical OR
+Addition operator
REGEXPPattern matching using regular expressions
>>Right shift
RLIKESynonym for REGEXP
SOUNDS LIKECompare sounds
*Multiplication operator
-Change the sign of the argument
XORLogical XOR

12.3.1. Operator Precedence

Operator precedences are shown in the following list, from highest precedence to the lowest. Operators that are shown together on a line have the same precedence.

INTERVAL
BINARY, COLLATE
!
- (unary minus), ~ (unary bit inversion)
^
*, /, DIV, %, MOD
-, +
<<, >>
&
|
= (comparison), <=>, >=, >, <=, <, <>, !=, IS, LIKE, REGEXP, IN
BETWEEN, CASE, WHEN, THEN, ELSE
NOT
&&, AND
XOR
||, OR
= (assignment), :=

The precedence of = depends on whether it is used as a comparison operator (=) or as an assignment operator (=). When used as a comparison operator, it has the same precedence as <=>, >=, >, <=, <, <>, !=, IS, LIKE, REGEXP, and IN. When used as an assignment operator, it has the same precedence as :=. Section 13.7.4, “SET Syntax”, and Section 9.4, “User-Defined Variables”, explain how MySQL determines which interpretation of = should apply.

The meaning of some operators depends on the SQL mode:

  • By default, || is a logical OR operator. With PIPES_AS_CONCAT enabled, || is string concatenation, with a precedence between ^ and the unary operators.

  • By default, ! has a higher precedence than NOT. With HIGH_NOT_PRECEDENCE enabled, ! and NOT have the same precedence.

See Section 5.1.7, “Server SQL Modes”.

The precedence of operators determines the order of evaluation of terms in an expression. To override this order and group terms explicitly, use parentheses. For example:

mysql> SELECT 1+2*3;
        -> 7
mysql> SELECT (1+2)*3;
        -> 9

12.3.2. Comparison Functions and Operators

Table 12.3. Comparison Operators

NameDescription
BETWEEN ... AND ...Check whether a value is within a range of values
COALESCE()Return the first non-NULL argument
<=>NULL-safe equal to operator
=Equal operator
>=Greater than or equal operator
>Greater than operator
GREATEST()Return the largest argument
IN()Check whether a value is within a set of values
INTERVAL()Return the index of the argument that is less than the first argument
IS NOT NULLNOT NULL value test
IS NOTTest a value against a boolean
IS NULLNULL value test
ISTest a value against a boolean
ISNULL()Test whether the argument is NULL
LEAST()Return the smallest argument
<=Less than or equal operator
<Less than operator
LIKESimple pattern matching
NOT BETWEEN ... AND ...Check whether a value is not within a range of values
!=, <>Not equal operator
NOT IN()Check whether a value is not within a set of values
NOT LIKENegation of simple pattern matching
STRCMP()Compare two strings

Comparison operations result in a value of 1 (TRUE), 0 (FALSE), or NULL. These operations work for both numbers and strings. Strings are automatically converted to numbers and numbers to strings as necessary.

The following relational comparison operators can be used to compare not only scalar operands, but row operands:

=  >  <  >=  <=  <>  !=

For examples of row comparisons, see Section 13.2.10.5, “Row Subqueries”.

Some of the functions in this section return values other than 1 (TRUE), 0 (FALSE), or NULL. For example, LEAST() and GREATEST(). However, the value they return is based on comparison operations performed according to the rules described in Section 12.2, “Type Conversion in Expression Evaluation”.

To convert a value to a specific type for comparison purposes, you can use the CAST() function. String values can be converted to a different character set using CONVERT(). See Section 12.10, “Cast Functions and Operators”.

By default, string comparisons are not case sensitive and use the current character set. The default is latin1 (cp1252 West European), which also works well for English.

  • =

    Equal:

    mysql> SELECT 1 = 0;
            -> 0
    mysql> SELECT '0' = 0;
            -> 1
    mysql> SELECT '0.0' = 0;
            -> 1
    mysql> SELECT '0.01' = 0;
            -> 0
    mysql> SELECT '.01' = 0.01;
            -> 1
    
  • <=>

    NULL-safe equal. This operator performs an equality comparison like the = operator, but returns 1 rather than NULL if both operands are NULL, and 0 rather than NULL if one operand is NULL.

    mysql> SELECT 1 <=> 1, NULL <=> NULL, 1 <=> NULL;
            -> 1, 1, 0
    mysql> SELECT 1 = 1, NULL = NULL, 1 = NULL;
            -> 1, NULL, NULL
    
  • <>, !=

    Not equal:

    mysql> SELECT '.01' <> '0.01';
            -> 1
    mysql> SELECT .01 <> '0.01';
            -> 0
    mysql> SELECT 'zapp' <> 'zappp';
            -> 1
    
  • <=

    Less than or equal:

    mysql> SELECT 0.1 <= 2;
            -> 1
    
  • <

    Less than:

    mysql> SELECT 2 < 2;
            -> 0
    
  • >=

    Greater than or equal:

    mysql> SELECT 2 >= 2;
            -> 1
    
  • >

    Greater than:

    mysql> SELECT 2 > 2;
            -> 0
    
  • IS boolean_value

    Tests a value against a boolean value, where boolean_value can be TRUE, FALSE, or UNKNOWN.

    mysql> SELECT 1 IS TRUE, 0 IS FALSE, NULL IS UNKNOWN;
            -> 1, 1, 1
    
  • IS NOT boolean_value

    Tests a value against a boolean value, where boolean_value can be TRUE, FALSE, or UNKNOWN.

    mysql> SELECT 1 IS NOT UNKNOWN, 0 IS NOT UNKNOWN, NULL IS NOT UNKNOWN;
            -> 1, 1, 0
    
  • IS NULL

    Tests whether a value is NULL.

    mysql> SELECT 1 IS NULL, 0 IS NULL, NULL IS NULL;
            -> 0, 0, 1
    

    To work well with ODBC programs, MySQL supports the following extra features when using IS NULL:

  • IS NOT NULL

    Tests whether a value is not NULL.

    mysql> SELECT 1 IS NOT NULL, 0 IS NOT NULL, NULL IS NOT NULL;
            -> 1, 1, 0
    
  • expr BETWEEN min AND max

    If expr is greater than or equal to min and expr is less than or equal to max, BETWEEN returns 1, otherwise it returns 0. This is equivalent to the expression (min <= expr AND expr <= max) if all the arguments are of the same type. Otherwise type conversion takes place according to the rules described in Section 12.2, “Type Conversion in Expression Evaluation”, but applied to all the three arguments.

    mysql> SELECT 2 BETWEEN 1 AND 3, 2 BETWEEN 3 and 1;
            -> 1, 0
    mysql> SELECT 1 BETWEEN 2 AND 3;
            -> 0
    mysql> SELECT 'b' BETWEEN 'a' AND 'c';
            -> 1
    mysql> SELECT 2 BETWEEN 2 AND '3';
            -> 1
    mysql> SELECT 2 BETWEEN 2 AND 'x-3';
            -> 0
    

    For best results when using BETWEEN with date or time values, use CAST() to explicitly convert the values to the desired data type. Examples: If you compare a DATETIME to two DATE values, convert the DATE values to DATETIME values. If you use a string constant such as '2001-1-1' in a comparison to a DATE, cast the string to a DATE.

  • expr NOT BETWEEN min AND max

    This is the same as NOT (expr BETWEEN min AND max).

  • COALESCE(value,...)

    Returns the first non-NULL value in the list, or NULL if there are no non-NULL values.

    mysql> SELECT COALESCE(NULL,1);
            -> 1
    mysql> SELECT COALESCE(NULL,NULL,NULL);
            -> NULL
    
  • GREATEST(value1,value2,...)

    With two or more arguments, returns the largest (maximum-valued) argument. The arguments are compared using the same rules as for LEAST().

    mysql> SELECT GREATEST(2,0);
            -> 2
    mysql> SELECT GREATEST(34.0,3.0,5.0,767.0);
            -> 767.0
    mysql> SELECT GREATEST('B','A','C');
            -> 'C'
    

    GREATEST() returns NULL if any argument is NULL.

  • expr IN (value,...)

    Returns 1 if expr is equal to any of the values in the IN list, else returns 0. If all values are constants, they are evaluated according to the type of expr and sorted. The search for the item then is done using a binary search. This means IN is very quick if the IN value list consists entirely of constants. Otherwise, type conversion takes place according to the rules described in Section 12.2, “Type Conversion in Expression Evaluation”, but applied to all the arguments.

    mysql> SELECT 2 IN (0,3,5,7);
            -> 0
    mysql> SELECT 'wefwf' IN ('wee','wefwf','weg');
            -> 1
    

    You should never mix quoted and unquoted values in an IN list because the comparison rules for quoted values (such as strings) and unquoted values (such as numbers) differ. Mixing types may therefore lead to inconsistent results. For example, do not write an IN expression like this:

    SELECT val1 FROM tbl1 WHERE val1 IN (1,2,'a');

    Instead, write it like this:

    SELECT val1 FROM tbl1 WHERE val1 IN ('1','2','a');

    The number of values in the IN list is only limited by the max_allowed_packet value.

    To comply with the SQL standard, IN returns NULL not only if the expression on the left hand side is NULL, but also if no match is found in the list and one of the expressions in the list is NULL.

    IN() syntax can also be used to write certain types of subqueries. See Section 13.2.10.3, “Subqueries with ANY, IN, or SOME.

  • expr NOT IN (value,...)

    This is the same as NOT (expr IN (value,...)).

  • ISNULL(expr)

    If expr is NULL, ISNULL() returns 1, otherwise it returns 0.

    mysql> SELECT ISNULL(1+1);
            -> 0
    mysql> SELECT ISNULL(1/0);
            -> 1
    

    ISNULL() can be used instead of = to test whether a value is NULL. (Comparing a value to NULL using = always yields false.)

    The ISNULL() function shares some special behaviors with the IS NULL comparison operator. See the description of IS NULL.

  • INTERVAL(N,N1,N2,N3,...)

    Returns 0 if N < N1, 1 if N < N2 and so on or -1 if N is NULL. All arguments are treated as integers. It is required that N1 < N2 < N3 < ... < Nn for this function to work correctly. This is because a binary search is used (very fast).

    mysql> SELECT INTERVAL(23, 1, 15, 17, 30, 44, 200);
            -> 3
    mysql> SELECT INTERVAL(10, 1, 10, 100, 1000);
            -> 2
    mysql> SELECT INTERVAL(22, 23, 30, 44, 200);
            -> 0
    
  • LEAST(value1,value2,...)

    With two or more arguments, returns the smallest (minimum-valued) argument. The arguments are compared using the following rules:

    • If any argument is NULL, the result is NULL. No comparison is needed.

    • If the return value is used in an INTEGER context or all arguments are integer-valued, they are compared as integers.

    • If the return value is used in a REAL context or all arguments are real-valued, they are compared as reals.

    • If the arguments comprise a mix of numbers and strings, they are compared as numbers.

    • If any argument is a nonbinary (character) string, the arguments are compared as nonbinary strings.

    • In all other cases, the arguments are compared as binary strings.

    mysql> SELECT LEAST(2,0);
            -> 0
    mysql> SELECT LEAST(34.0,3.0,5.0,767.0);
            -> 3.0
    mysql> SELECT LEAST('B','A','C');
            -> 'A'
    

    Note that the preceding conversion rules can produce strange results in some borderline cases:

    mysql> SELECT CAST(LEAST(3600, 9223372036854775808.0) as SIGNED);
            -> -9223372036854775808
    

    This happens because MySQL reads 9223372036854775808.0 in an integer context. The integer representation is not good enough to hold the value, so it wraps to a signed integer.

12.3.3. Logical Operators

Table 12.4. Logical Operators

NameDescription
AND, &&Logical AND
NOT, !Negates value
||, ORLogical OR
XORLogical XOR

In SQL, all logical operators evaluate to TRUE, FALSE, or NULL (UNKNOWN). In MySQL, these are implemented as 1 (TRUE), 0 (FALSE), and NULL. Most of this is common to different SQL database servers, although some servers may return any nonzero value for TRUE.

MySQL evaluates any nonzero, non-NULL value to TRUE. For example, the following statements all assess to TRUE:

mysql> SELECT 10 IS TRUE;
-> 1
mysql> SELECT -10 IS TRUE;
-> 1
mysql> SELECT 'string' IS NOT NULL;
-> 1
  • NOT, !

    Logical NOT. Evaluates to 1 if the operand is 0, to 0 if the operand is nonzero, and NOT NULL returns NULL.

    mysql> SELECT NOT 10;
            -> 0
    mysql> SELECT NOT 0;
            -> 1
    mysql> SELECT NOT NULL;
            -> NULL
    mysql> SELECT ! (1+1);
            -> 0
    mysql> SELECT ! 1+1;
            -> 1
    

    The last example produces 1 because the expression evaluates the same way as (!1)+1.

  • AND, &&

    Logical AND. Evaluates to 1 if all operands are nonzero and not NULL, to 0 if one or more operands are 0, otherwise NULL is returned.

    mysql> SELECT 1 && 1;
            -> 1
    mysql> SELECT 1 && 0;
            -> 0
    mysql> SELECT 1 && NULL;
            -> NULL
    mysql> SELECT 0 && NULL;
            -> 0
    mysql> SELECT NULL && 0;
            -> 0
    
  • OR, ||

    Logical OR. When both operands are non-NULL, the result is 1 if any operand is nonzero, and 0 otherwise. With a NULL operand, the result is 1 if the other operand is nonzero, and NULL otherwise. If both operands are NULL, the result is NULL.

    mysql> SELECT 1 || 1;
            -> 1
    mysql> SELECT 1 || 0;
            -> 1
    mysql> SELECT 0 || 0;
            -> 0
    mysql> SELECT 0 || NULL;
            -> NULL
    mysql> SELECT 1 || NULL;
            -> 1
    
  • XOR

    Logical XOR. Returns NULL if either operand is NULL. For non-NULL operands, evaluates to 1 if an odd number of operands is nonzero, otherwise 0 is returned.

    mysql> SELECT 1 XOR 1;
            -> 0
    mysql> SELECT 1 XOR 0;
            -> 1
    mysql> SELECT 1 XOR NULL;
            -> NULL
    mysql> SELECT 1 XOR 1 XOR 1;
            -> 1
    

    a XOR b is mathematically equal to (a AND (NOT b)) OR ((NOT a) and b).

12.3.4. Assignment Operators

Table 12.5. Assignment Operators

NameDescription
=Assign a value (as part of a SET statement, or as part of the SET clause in an UPDATE statement)
:=Assign a value

  • :=

    Assignment operator. Causes the user variable on the left hand side of the operator to take on the value to its right. The value on the right hand side may be a literal value, another variable storing a value, or any legal expression that yields a scalar value, including the result of a query (provided that this value is a scalar value). You can perform multiple assignments in the same SET statement. You can perform multiple assignments in the same statement-

    Unlike =, the := operator is never interpreted as a comparison operator. This means you can use := in any valid SQL statement (not just in SET statements) to assign a value to a variable.

    mysql> SELECT @var1, @var2;
            -> NULL, NULL
    mysql> SELECT @var1 := 1, @var2;
            -> 1, NULL
    mysql> SELECT @var1, @var2;
            -> 1, NULL
    mysql> SELECT @var1, @var2 := @var1;
            -> 1, 1
    mysql> SELECT @var1, @var2;
            -> 1, 1
    
    mysql> SELECT @var1:=COUNT(*) FROM t1;
            -> 4
    mysql> SELECT @var1;
            -> 4
    

    You can make value assignments using := in other statements besides SELECT, such as UPDATE, as shown here:

    mysql> SELECT @var1;
            -> 4
    mysql> SELECT * FROM t1;
            -> 1, 3, 5, 7
    
    mysql> UPDATE t1 SET c1 = 2 WHERE c1 = @var1:= 1;
    Query OK, 1 row affected (0.00 sec)
    Rows matched: 1  Changed: 1  Warnings: 0
    
    mysql> SELECT @var1;
            -> 1
    mysql> SELECT * FROM t1;
            -> 2, 3, 5, 7
    

    While it is also possible both to set and to read the value of the same variable in a single SQL statement using the := operator, this is not recommended. Section 9.4, “User-Defined Variables”, explains why you should avoid doing this.

  • =

    This operator is used to perform value assignments in two cases, described in the next two paragraphs.

    Within a SET statement, = is treated as an assignment operator that causes the user variable on the left hand side of the operator to take on the value to its right. (In other words, when used in a SET statement, = is treated identically to :=.) The value on the right hand side may be a literal value, another variable storing a value, or any legal expression that yields a scalar value, including the result of a query (provided that this value is a scalar value). You can perform multiple assignments in the same SET statement.

    In the SET clause of an UPDATE statement, = also acts as an assignment operator; in this case, however, it causes the column named on the left hand side of the operator to assume the value given to the right, provided any WHERE conditions that are part of the UPDATE are met. You can make multiple assignments in the same SET clause of an UPDATE statement.

    In any other context, = is treated as a comparison operator.

    mysql> SELECT @var1, @var2;
            -> NULL, NULL
    mysql> SELECT @var1 := 1, @var2;
            -> 1, NULL
    mysql> SELECT @var1, @var2;
            -> 1, NULL
    mysql> SELECT @var1, @var2 := @var1;
            -> 1, 1
    mysql> SELECT @var1, @var2;
            -> 1, 1
    

    For more information, see Section 13.7.4, “SET Syntax”, Section 13.2.11, “UPDATE Syntax”, and Section 13.2.10, “Subquery Syntax”.

12.4. Control Flow Functions

Table 12.6. Flow Control Operators

NameDescription
CASECase operator
IF()If/else construct
IFNULL()Null if/else construct
NULLIF()Return NULL if expr1 = expr2

  • CASE value WHEN [compare_value] THEN result [WHEN [compare_value] THEN result ...] [ELSE result] END

    CASE WHEN [condition] THEN result [WHEN [condition] THEN result ...] [ELSE result] END

    The first version returns the result where value=compare_value. The second version returns the result for the first condition that is true. If there was no matching result value, the result after ELSE is returned, or NULL if there is no ELSE part.

    mysql> SELECT CASE 1 WHEN 1 THEN 'one'
        ->     WHEN 2 THEN 'two' ELSE 'more' END;
            -> 'one'
    mysql> SELECT CASE WHEN 1>0 THEN 'true' ELSE 'false' END;
            -> 'true'
    mysql> SELECT CASE BINARY 'B'
        ->     WHEN 'a' THEN 1 WHEN 'b' THEN 2 END;
            -> NULL
    

    The return type of a CASE expression is the compatible aggregated type of all return values, but also depends on the context in which it is used. If used in a string context, the result is returned as a string. If used in a numeric context, the result is returned as a decimal, real, or integer value.

    Note

    The syntax of the CASE expression shown here differs slightly from that of the SQL CASE statement described in Section 13.6.5.1, “CASE Syntax”, for use inside stored programs. The CASE statement cannot have an ELSE NULL clause, and it is terminated with END CASE instead of END.

  • IF(expr1,expr2,expr3)

    If expr1 is TRUE (expr1 <> 0 and expr1 <> NULL) then IF() returns expr2; otherwise it returns expr3. IF() returns a numeric or string value, depending on the context in which it is used.

    mysql> SELECT IF(1>2,2,3);
            -> 3
    mysql> SELECT IF(1<2,'yes','no');
            -> 'yes'
    mysql> SELECT IF(STRCMP('test','test1'),'no','yes');
            -> 'no'
    

    If only one of expr2 or expr3 is explicitly NULL, the result type of the IF() function is the type of the non-NULL expression.

    The default return type of IF() (which may matter when it is stored into a temporary table) is calculated as follows.

    ExpressionReturn Value
    expr2 or expr3 returns a stringstring
    expr2 or expr3 returns a floating-point valuefloating-point
    expr2 or expr3 returns an integerinteger

    If expr2 and expr3 are both strings, the result is case sensitive if either string is case sensitive.

    Note

    There is also an IF statement, which differs from the IF() function described here. See Section 13.6.5.2, “IF Syntax”.

  • IFNULL(expr1,expr2)

    If expr1 is not NULL, IFNULL() returns expr1; otherwise it returns expr2. IFNULL() returns a numeric or string value, depending on the context in which it is used.

    mysql> SELECT IFNULL(1,0);
            -> 1
    mysql> SELECT IFNULL(NULL,10);
            -> 10
    mysql> SELECT IFNULL(1/0,10);
            -> 10
    mysql> SELECT IFNULL(1/0,'yes');
            -> 'yes'
    

    The default result value of IFNULL(expr1,expr2) is the more general of the two expressions, in the order STRING, REAL, or INTEGER. Consider the case of a table based on expressions or where MySQL must internally store a value returned by IFNULL() in a temporary table:

    mysql> CREATE TABLE tmp SELECT IFNULL(1,'test') AS test;
    mysql> DESCRIBE tmp;
    +-------+--------------+------+-----+---------+-------+
    | Field | Type         | Null | Key | Default | Extra |
    +-------+--------------+------+-----+---------+-------+
    | test  | varbinary(4) | NO   |     |         |       |
    +-------+--------------+------+-----+---------+-------+
    

    In this example, the type of the test column is VARBINARY(4).

  • NULLIF(expr1,expr2)

    Returns NULL if expr1 = expr2 is true, otherwise returns expr1. This is the same as CASE WHEN expr1 = expr2 THEN NULL ELSE expr1 END.

    mysql> SELECT NULLIF(1,1);
            -> NULL
    mysql> SELECT NULLIF(1,2);
            -> 1
    

    Note that MySQL evaluates expr1 twice if the arguments are not equal.

12.5. String Functions

Table 12.7. String Operators

NameDescription
ASCII()Return numeric value of left-most character
BIN()Return a string containing binary representation of a number
BIT_LENGTH()Return length of argument in bits
CHAR_LENGTH()Return number of characters in argument
CHAR()Return the character for each integer passed
CHARACTER_LENGTH()A synonym for CHAR_LENGTH()
CONCAT_WS()Return concatenate with separator
CONCAT()Return concatenated string
ELT()Return string at index number
EXPORT_SET()Return a string such that for every bit set in the value bits, you get an on string and for every unset bit, you get an off string
FIELD()Return the index (position) of the first argument in the subsequent arguments
FIND_IN_SET()Return the index position of the first argument within the second argument
FORMAT()Return a number formatted to specified number of decimal places
FROM_BASE64()Decode to a base-64 string and return result
HEX()Return a hexadecimal representation of a decimal or string value
INSERT()Insert a substring at the specified position up to the specified number of characters
INSTR()Return the index of the first occurrence of substring
LCASE()Synonym for LOWER()
LEFT()Return the leftmost number of characters as specified
LENGTH()Return the length of a string in bytes
LIKESimple pattern matching
LOAD_FILE()Load the named file
LOCATE()Return the position of the first occurrence of substring
LOWER()Return the argument in lowercase
LPAD()Return the string argument, left-padded with the specified string
LTRIM()Remove leading spaces
MAKE_SET()Return a set of comma-separated strings that have the corresponding bit in bits set
MATCHPerform full-text search
MID()Return a substring starting from the specified position
NOT LIKENegation of simple pattern matching
NOT REGEXPNegation of REGEXP
OCT()Return a string containing octal representation of a number
OCTET_LENGTH()A synonym for LENGTH()
ORD()Return character code for leftmost character of the argument
POSITION()A synonym for LOCATE()
QUOTE()Escape the argument for use in an SQL statement
REGEXPPattern matching using regular expressions
REPEAT()Repeat a string the specified number of times
REPLACE()Replace occurrences of a specified string
REVERSE()Reverse the characters in a string
RIGHT()Return the specified rightmost number of characters
RLIKESynonym for REGEXP
RPAD()Append string the specified number of times
RTRIM()Remove trailing spaces
SOUNDEX()Return a soundex string
SOUNDS LIKECompare sounds
SPACE()Return a string of the specified number of spaces
STRCMP()Compare two strings
SUBSTR()Return the substring as specified
SUBSTRING_INDEX()Return a substring from a string before the specified number of occurrences of the delimiter
SUBSTRING()Return the substring as specified
TO_BASE64()Return the argument converted to a base-64 string
TRIM()Remove leading and trailing spaces
UCASE()Synonym for UPPER()
UNHEX()Return a string containing hex representation of a number
UPPER()Convert to uppercase
WEIGHT_STRING()Return the weight string for a string

String-valued functions return NULL if the length of the result would be greater than the value of the max_allowed_packet system variable. See Section 8.11.2, “Tuning Server Parameters”.

For functions that operate on string positions, the first position is numbered 1.

For functions that take length arguments, noninteger arguments are rounded to the nearest integer.

  • ASCII(str)

    Returns the numeric value of the leftmost character of the string str. Returns 0 if str is the empty string. Returns NULL if str is NULL. ASCII() works for 8-bit characters.

    mysql> SELECT ASCII('2');
            -> 50
    mysql> SELECT ASCII(2);
            -> 50
    mysql> SELECT ASCII('dx');
            -> 100
    

    See also the ORD() function.

  • BIN(N)

    Returns a string representation of the binary value of N, where N is a longlong (BIGINT) number. This is equivalent to CONV(N,10,2). Returns NULL if N is NULL.

    mysql> SELECT BIN(12);
            -> '1100'
    
  • BIT_LENGTH(str)

    Returns the length of the string str in bits.

    mysql> SELECT BIT_LENGTH('text');
            -> 32
    
  • CHAR(N,... [USING charset_name])

    CHAR() interprets each argument N as an integer and returns a string consisting of the characters given by the code values of those integers. NULL values are skipped.

    mysql> SELECT CHAR(77,121,83,81,'76');
            -> 'MySQL'
    mysql> SELECT CHAR(77,77.3,'77.3');
            -> 'MMM'
    

    CHAR() arguments larger than 255 are converted into multiple result bytes. For example, CHAR(256) is equivalent to CHAR(1,0), and CHAR(256*256) is equivalent to CHAR(1,0,0):

    mysql> SELECT HEX(CHAR(1,0)), HEX(CHAR(256));
    +----------------+----------------+
    | HEX(CHAR(1,0)) | HEX(CHAR(256)) |
    +----------------+----------------+
    | 0100           | 0100           |
    +----------------+----------------+
    mysql> SELECT HEX(CHAR(1,0,0)), HEX(CHAR(256*256));
    +------------------+--------------------+
    | HEX(CHAR(1,0,0)) | HEX(CHAR(256*256)) |
    +------------------+--------------------+
    | 010000           | 010000             |
    +------------------+--------------------+
    

    By default, CHAR() returns a binary string. To produce a string in a given character set, use the optional USING clause:

    mysql> SELECT CHARSET(CHAR(0x65)), CHARSET(CHAR(0x65 USING utf8));
    +---------------------+--------------------------------+
    | CHARSET(CHAR(0x65)) | CHARSET(CHAR(0x65 USING utf8)) |
    +---------------------+--------------------------------+
    | binary              | utf8                           |
    +---------------------+--------------------------------+
    

    If USING is given and the result string is illegal for the given character set, a warning is issued. Also, if strict SQL mode is enabled, the result from CHAR() becomes NULL.

  • CHAR_LENGTH(str)

    Returns the length of the string str, measured in characters. A multi-byte character counts as a single character. This means that for a string containing five 2-byte characters, LENGTH() returns 10, whereas CHAR_LENGTH() returns 5.

  • CHARACTER_LENGTH(str)

    CHARACTER_LENGTH() is a synonym for CHAR_LENGTH().

  • CONCAT(str1,str2,...)

    Returns the string that results from concatenating the arguments. May have one or more arguments. If all arguments are nonbinary strings, the result is a nonbinary string. If the arguments include any binary strings, the result is a binary string. A numeric argument is converted to its equivalent nonbinary string form.

    CONCAT() returns NULL if any argument is NULL.

    mysql> SELECT CONCAT('My', 'S', 'QL');
            -> 'MySQL'
    mysql> SELECT CONCAT('My', NULL, 'QL');
            -> NULL
    mysql> SELECT CONCAT(14.3);
            -> '14.3'
    

    For quoted strings, concatenation can be performed by placing the strings next to each other:

    mysql> SELECT 'My' 'S' 'QL';
            -> 'MySQL'
    
  • CONCAT_WS(separator,str1,str2,...)

    CONCAT_WS() stands for Concatenate With Separator and is a special form of CONCAT(). The first argument is the separator for the rest of the arguments. The separator is added between the strings to be concatenated. The separator can be a string, as can the rest of the arguments. If the separator is NULL, the result is NULL.

    mysql> SELECT CONCAT_WS(',','First name','Second name','Last Name');
            -> 'First name,Second name,Last Name'
    mysql> SELECT CONCAT_WS(',','First name',NULL,'Last Name');
            -> 'First name,Last Name'
    

    CONCAT_WS() does not skip empty strings. However, it does skip any NULL values after the separator argument.

  • ELT(N,str1,str2,str3,...)

    ELT() returns the Nth element of the list of strings: str1 if N = 1, str2 if N = 2, and so on. Returns NULL if N is less than 1 or greater than the number of arguments. ELT() is the complement of FIELD().

    mysql> SELECT ELT(1, 'ej', 'Heja', 'hej', 'foo');
            -> 'ej'
    mysql> SELECT ELT(4, 'ej', 'Heja', 'hej', 'foo');
            -> 'foo'
    
  • EXPORT_SET(bits,on,off[,separator[,number_of_bits]])

    Returns a string such that for every bit set in the value bits, you get an on string and for every bit not set in the value, you get an off string. Bits in bits are examined from right to left (from low-order to high-order bits). Strings are added to the result from left to right, separated by the separator string (the default being the comma character ,). The number of bits examined is given by number_of_bits, which has a default of 64 if not specified. number_of_bits is silently clipped to 64 if larger than 64. It is treated as an unsigned integer, so a value of –1 is effectively the same as 64.

    mysql> SELECT EXPORT_SET(5,'Y','N',',',4);
            -> 'Y,N,Y,N'
    mysql> SELECT EXPORT_SET(6,'1','0',',',10);
            -> '0,1,1,0,0,0,0,0,0,0'
    
  • FIELD(str,str1,str2,str3,...)

    Returns the index (position) of str in the str1, str2, str3, ... list. Returns 0 if str is not found.

    If all arguments to FIELD() are strings, all arguments are compared as strings. If all arguments are numbers, they are compared as numbers. Otherwise, the arguments are compared as double.

    If str is NULL, the return value is 0 because NULL fails equality comparison with any value. FIELD() is the complement of ELT().

    mysql> SELECT FIELD('ej', 'Hej', 'ej', 'Heja', 'hej', 'foo');
            -> 2
    mysql> SELECT FIELD('fo', 'Hej', 'ej', 'Heja', 'hej', 'foo');
            -> 0
    
  • FIND_IN_SET(str,strlist)

    Returns a value in the range of 1 to N if the string str is in the string list strlist consisting of N substrings. A string list is a string composed of substrings separated by , characters. If the first argument is a constant string and the second is a column of type SET, the FIND_IN_SET() function is optimized to use bit arithmetic. Returns 0 if str is not in strlist or if strlist is the empty string. Returns NULL if either argument is NULL. This function does not work properly if the first argument contains a comma (,) character.

    mysql> SELECT FIND_IN_SET('b','a,b,c,d');
            -> 2
    
  • FORMAT(X,D[,locale])

    Formats the number X to a format like '#,###,###.##', rounded to D decimal places, and returns the result as a string. If D is 0, the result has no decimal point or fractional part.

    The optional third parameter enables a locale to be specified to be used for the result number's decimal point, thousands separator, and grouping between separators. Permissible locale values are the same as the legal values for the lc_time_names system variable (see Section 10.7, “MySQL Server Locale Support”). If no locale is specified, the default is 'en_US'.

    mysql> SELECT FORMAT(12332.123456, 4);
            -> '12,332.1235'
    mysql> SELECT FORMAT(12332.1,4);
            -> '12,332.1000'
    mysql> SELECT FORMAT(12332.2,0);
            -> '12,332'
    mysql> SELECT FORMAT(12332.2,2,'de_DE');
            -> '12.332,20'
    
  • FROM_BASE64(str)

    Takes a string encoded with the base-64 encoded rules used by TO_BASE64() and returns the decoded result as a binary string. The result is NULL if the argument is NULL or not a valid base-64 string. See the description of TO_BASE64() for details about the encoding and decoding rules.

    mysql> SELECT TO_BASE64('abc'), FROM_BASE64(TO_BASE64('abc'));
            -> 'JWJj', 'abc'
    
  • HEX(str), HEX(N)

    For a string argument str, HEX() returns a hexadecimal string representation of str where each character in str is converted to two hexadecimal digits. The inverse of this operation is performed by the UNHEX() function.

    For a numeric argument N, HEX() returns a hexadecimal string representation of the value of N treated as a longlong (BIGINT) number. This is equivalent to CONV(N,10,16). The inverse of this operation is performed by CONV(HEX(N),16,10).

    mysql> SELECT 0x616263, HEX('abc'), UNHEX(HEX('abc'));
            -> 'abc', 616263, 'abc'
    mysql> SELECT HEX(255), CONV(HEX(255),16,10);
            -> 'FF', 255
    
  • INSERT(str,pos,len,newstr)

    Returns the string str, with the substring beginning at position pos and len characters long replaced by the string newstr. Returns the original string if pos is not within the length of the string. Replaces the rest of the string from position pos if len is not within the length of the rest of the string. Returns NULL if any argument is NULL.

    mysql> SELECT INSERT('Quadratic', 3, 4, 'What');
            -> 'QuWhattic'
    mysql> SELECT INSERT('Quadratic', -1, 4, 'What');
            -> 'Quadratic'
    mysql> SELECT INSERT('Quadratic', 3, 100, 'What');
            -> 'QuWhat'
    

    This function is multi-byte safe.

  • INSTR(str,substr)

    Returns the position of the first occurrence of substring substr in string str. This is the same as the two-argument form of LOCATE(), except that the order of the arguments is reversed.

    mysql> SELECT INSTR('foobarbar', 'bar');
            -> 4
    mysql> SELECT INSTR('xbar', 'foobar');
            -> 0
    

    This function is multi-byte safe, and is case sensitive only if at least one argument is a binary string.

  • LCASE(str)

    LCASE() is a synonym for LOWER().

    In MySQL 5.7, LCASE() used in a view is rewritten as LOWER() when storing the view's definition. (Bug #12844279)

  • LEFT(str,len)

    Returns the leftmost len characters from the string str, or NULL if any argument is NULL.

    mysql> SELECT LEFT('foobarbar', 5);
            -> 'fooba'
    

    This function is multi-byte safe.

  • LENGTH(str)

    Returns the length of the string str, measured in bytes. A multi-byte character counts as multiple bytes. This means that for a string containing five 2-byte characters, LENGTH() returns 10, whereas CHAR_LENGTH() returns 5.

    mysql> SELECT LENGTH('text');
            -> 4
    
  • LOAD_FILE(file_name)

    Reads the file and returns the file contents as a string. To use this function, the file must be located on the server host, you must specify the full path name to the file, and you must have the FILE privilege. The file must be readable by all and its size less than max_allowed_packet bytes. If the secure_file_priv system variable is set to a nonempty directory name, the file to be loaded must be located in that directory.

    If the file does not exist or cannot be read because one of the preceding conditions is not satisfied, the function returns NULL.

    The character_set_filesystem system variable controls interpretation of file names that are given as literal strings.

    mysql> UPDATE t
                SET blob_col=LOAD_FILE('/tmp/picture')
                WHERE id=1;
    
  • LOCATE(substr,str), LOCATE(substr,str,pos)

    The first syntax returns the position of the first occurrence of substring substr in string str. The second syntax returns the position of the first occurrence of substring substr in string str, starting at position pos. Returns 0 if substr is not in str.

    mysql> SELECT LOCATE('bar', 'foobarbar');
            -> 4
    mysql> SELECT LOCATE('xbar', 'foobar');
            -> 0
    mysql> SELECT LOCATE('bar', 'foobarbar', 5);
            -> 7
    

    This function is multi-byte safe, and is case-sensitive only if at least one argument is a binary string.

  • LOWER(str)

    Returns the string str with all characters changed to lowercase according to the current character set mapping. The default is latin1 (cp1252 West European).

    mysql> SELECT LOWER('QUADRATICALLY');
            -> 'quadratically'
    

    LOWER() (and UPPER()) are ineffective when applied to binary strings (BINARY, VARBINARY, BLOB). To perform lettercase conversion, convert the string to a nonbinary string:

    mysql> SET @str = BINARY 'New York';
    mysql> SELECT LOWER(@str), LOWER(CONVERT(@str USING latin1));
    +-------------+-----------------------------------+
    | LOWER(@str) | LOWER(CONVERT(@str USING latin1)) |
    +-------------+-----------------------------------+
    | New York    | new york                          |
    +-------------+-----------------------------------+
    

    For Unicode character sets, LOWER() and UPPER() work accounting to Unicode Collation Algorithm (UCA) 5.2.0 for xxx_unicode_520_ci collations and for language-specific collations that are derived from them. For other Unicode collations, LOWER() and UPPER() work accounting to Unicode Collation Algorithm (UCA) 4.0.0. See Section 10.1.14.1, “Unicode Character Sets”.

    This function is multi-byte safe.

    In previous versions of MySQL, LOWER() used within a view was rewritten as LCASE() when storing the view's definition. In MySQL 5.7, LOWER() is never rewritten in such cases, but LCASE() used within views is instead rewritten as LOWER(). (Bug #12844279)

  • LPAD(str,len,padstr)

    Returns the string str, left-padded with the string padstr to a length of len characters. If str is longer than len, the return value is shortened to len characters.

    mysql> SELECT LPAD('hi',4,'??');
            -> '??hi'
    mysql> SELECT LPAD('hi',1,'??');
            -> 'h'
    
  • LTRIM(str)

    Returns the string str with leading space characters removed.

    mysql> SELECT LTRIM('  barbar');
            -> 'barbar'
    

    This function is multi-byte safe.

  • MAKE_SET(bits,str1,str2,...)

    Returns a set value (a string containing substrings separated by , characters) consisting of the strings that have the corresponding bit in bits set. str1 corresponds to bit 0, str2 to bit 1, and so on. NULL values in str1, str2, ... are not appended to the result.

    mysql> SELECT MAKE_SET(1,'a','b','c');
            -> 'a'
    mysql> SELECT MAKE_SET(1 | 4,'hello','nice','world');
            -> 'hello,world'
    mysql> SELECT MAKE_SET(1 | 4,'hello','nice',NULL,'world');
            -> 'hello'
    mysql> SELECT MAKE_SET(0,'a','b','c');
            -> ''
    
  • MID(str,pos,len)

    MID(str,pos,len) is a synonym for SUBSTRING(str,pos,len).

  • OCT(N)

    Returns a string representation of the octal value of N, where N is a longlong (BIGINT) number. This is equivalent to CONV(N,10,8). Returns NULL if N is NULL.

    mysql> SELECT OCT(12);
            -> '14'
    
  • OCTET_LENGTH(str)

    OCTET_LENGTH() is a synonym for LENGTH().

  • ORD(str)

    If the leftmost character of the string str is a multi-byte character, returns the code for that character, calculated from the numeric values of its constituent bytes using this formula:

      (1st byte code)
    + (2nd byte code * 256)
    + (3rd byte code * 2562) ...
    

    If the leftmost character is not a multi-byte character, ORD() returns the same value as the ASCII() function.

    mysql> SELECT ORD('2');
            -> 50
    
  • POSITION(substr IN str)

    POSITION(substr IN str) is a synonym for LOCATE(substr,str).

  • QUOTE(str)

    Quotes a string to produce a result that can be used as a properly escaped data value in an SQL statement. The string is returned enclosed by single quotation marks and with each instance of backslash (\), single quote ('), ASCII NUL, and Control+Z preceded by a backslash. If the argument is NULL, the return value is the word NULL without enclosing single quotation marks.

    mysql> SELECT QUOTE('Don\'t!');
            -> 'Don\'t!'
    mysql> SELECT QUOTE(NULL);
            -> NULL
    

    For comparison, see the quoting rules for literal strings and within the C API in Section 9.1.1, “String Literals”, and Section 21.8.7.54, “mysql_real_escape_string().

  • REPEAT(str,count)

    Returns a string consisting of the string str repeated count times. If count is less than 1, returns an empty string. Returns NULL if str or count are NULL.

    mysql> SELECT REPEAT('MySQL', 3);
            -> 'MySQLMySQLMySQL'
    
  • REPLACE(str,from_str,to_str)

    Returns the string str with all occurrences of the string from_str replaced by the string to_str. REPLACE() performs a case-sensitive match when searching for from_str.

    mysql> SELECT REPLACE('www.mysql.com', 'w', 'Ww');
            -> 'WwWwWw.mysql.com'
    

    This function is multi-byte safe.

  • REVERSE(str)

    Returns the string str with the order of the characters reversed.

    mysql> SELECT REVERSE('abc');
            -> 'cba'
    

    This function is multi-byte safe.

  • RIGHT(str,len)

    Returns the rightmost len characters from the string str, or NULL if any argument is NULL.

    mysql> SELECT RIGHT('foobarbar', 4);
            -> 'rbar'
    

    This function is multi-byte safe.

  • RPAD(str,len,padstr)

    Returns the string str, right-padded with the string padstr to a length of len characters. If str is longer than len, the return value is shortened to len characters.

    mysql> SELECT RPAD('hi',5,'?');
            -> 'hi???'
    mysql> SELECT RPAD('hi',1,'?');
            -> 'h'
    

    This function is multi-byte safe.

  • RTRIM(str)

    Returns the string str with trailing space characters removed.

    mysql> SELECT RTRIM('barbar   ');
            -> 'barbar'
    

    This function is multi-byte safe.

  • SOUNDEX(str)

    Returns a soundex string from str. Two strings that sound almost the same should have identical soundex strings. A standard soundex string is four characters long, but the SOUNDEX() function returns an arbitrarily long string. You can use SUBSTRING() on the result to get a standard soundex string. All nonalphabetic characters in str are ignored. All international alphabetic characters outside the A-Z range are treated as vowels.

    Important

    When using SOUNDEX(), you should be aware of the following limitations:

    • This function, as currently implemented, is intended to work well with strings that are in the English language only. Strings in other languages may not produce reliable results.

    • This function is not guaranteed to provide consistent results with strings that use multi-byte character sets, including utf-8.

      We hope to remove these limitations in a future release. See Bug #22638 for more information.

    mysql> SELECT SOUNDEX('Hello');
            -> 'H400'
    mysql> SELECT SOUNDEX('Quadratically');
            -> 'Q36324'
    
    Note

    This function implements the original Soundex algorithm, not the more popular enhanced version (also described by D. Knuth). The difference is that original version discards vowels first and duplicates second, whereas the enhanced version discards duplicates first and vowels second.

  • expr1 SOUNDS LIKE expr2

    This is the same as SOUNDEX(expr1) = SOUNDEX(expr2).

  • SPACE(N)

    Returns a string consisting of N space characters.

    mysql> SELECT SPACE(6);
            -> '      '
    
  • SUBSTR(str,pos), SUBSTR(str FROM pos), SUBSTR(str,pos,len), SUBSTR(str FROM pos FOR len)

    SUBSTR() is a synonym for SUBSTRING().

  • SUBSTRING(str,pos), SUBSTRING(str FROM pos), SUBSTRING(str,pos,len), SUBSTRING(str FROM pos FOR len)

    The forms without a len argument return a substring from string str starting at position pos. The forms with a len argument return a substring len characters long from string str, starting at position pos. The forms that use FROM are standard SQL syntax. It is also possible to use a negative value for pos. In this case, the beginning of the substring is pos characters from the end of the string, rather than the beginning. A negative value may be used for pos in any of the forms of this function.

    For all forms of SUBSTRING(), the position of the first character in the string from which the substring is to be extracted is reckoned as 1.

    mysql> SELECT SUBSTRING('Quadratically',5);
            -> 'ratically'
    mysql> SELECT SUBSTRING('foobarbar' FROM 4);
            -> 'barbar'
    mysql> SELECT SUBSTRING('Quadratically',5,6);
            -> 'ratica'
    mysql> SELECT SUBSTRING('Sakila', -3);
            -> 'ila'
    mysql> SELECT SUBSTRING('Sakila', -5, 3);
            -> 'aki'
    mysql> SELECT SUBSTRING('Sakila' FROM -4 FOR 2);
            -> 'ki'
    

    This function is multi-byte safe.

    If len is less than 1, the result is the empty string.

  • SUBSTRING_INDEX(str,delim,count)

    Returns the substring from string str before count occurrences of the delimiter delim. If count is positive, everything to the left of the final delimiter (counting from the left) is returned. If count is negative, everything to the right of the final delimiter (counting from the right) is returned. SUBSTRING_INDEX() performs a case-sensitive match when searching for delim.

    mysql> SELECT SUBSTRING_INDEX('www.mysql.com', '.', 2);
            -> 'www.mysql'
    mysql> SELECT SUBSTRING_INDEX('www.mysql.com', '.', -2);
            -> 'mysql.com'
    

    This function is multi-byte safe.

  • TO_BASE64(str)

    Converts the string argument to base-64 encoded form and returns the result as a character string with the connection character set and collation. If the argument is not a string, it is converted to a string before conversion takes place. The result is NULL if the argument is NULL. Base-64 encoded strings can be decoded using the the FROM_BASE64() function.

    mysql> SELECT TO_BASE64('abc'), FROM_BASE64(TO_BASE64('abc'));
            -> 'JWJj', 'abc'
    

    Different base-64 encoding schemes exist. These are the encoding and decoding rules used by TO_BASE64() and FROM_BASE64():

    • The encoding for alphabet value 62 is '+'.

    • The encoding for alphabet value 63 is '/'.

    • Encoded output consists of groups of 4 printable characters. Each 3 bytes of the input data are encoded using 4 characters. If the last group is incomplete, it is padded with '=' characters to a length of 4.

    • A newline is added after each 76 characters of encoded output to divide long output into multiple lines.

    • Decoding recognizes and ignores newline, carriage return, tab, and space.

  • TRIM([{BOTH | LEADING | TRAILING} [remstr] FROM] str), TRIM([remstr FROM] str)

    Returns the string str with all remstr prefixes or suffixes removed. If none of the specifiers BOTH, LEADING, or TRAILING is given, BOTH is assumed. remstr is optional and, if not specified, spaces are removed.

    mysql> SELECT TRIM('  bar   ');
            -> 'bar'
    mysql> SELECT TRIM(LEADING 'x' FROM 'xxxbarxxx');
            -> 'barxxx'
    mysql> SELECT TRIM(BOTH 'x' FROM 'xxxbarxxx');
            -> 'bar'
    mysql> SELECT TRIM(TRAILING 'xyz' FROM 'barxxyz');
            -> 'barx'
    

    This function is multi-byte safe.

  • UCASE(str)

    UCASE() is a synonym for UPPER().

    In previous versions of MySQL, UPPER() used within a view was rewritten as UCASE() when storing the view's definition. In MySQL 5.7, UPPER() is never rewritten in such cases, but UCASE() used within views is instead rewritten as UPPER(). (Bug #12844279)

    In MySQL 5.7, UCASE() used in a view is rewritten as UPPER() when storing the view's definition. (Bug #12844279)

  • UNHEX(str)

    For a string argument str, UNHEX(str) performs the inverse operation of HEX(str). That is, it interprets each pair of characters in the argument as a hexadecimal number and converts it to the character represented by the number. The return value is a binary string.

    mysql> SELECT UNHEX('4D7953514C');
            -> 'MySQL'
    mysql> SELECT 0x4D7953514C;
            -> 'MySQL'
    mysql> SELECT UNHEX(HEX('string'));
            -> 'string'
    mysql> SELECT HEX(UNHEX('1267'));
            -> '1267'
    

    The characters in the argument string must be legal hexadecimal digits: '0' .. '9', 'A' .. 'F', 'a' .. 'f'. If the argument contains any nonhexadecimal digits, the result is NULL:

    mysql> SELECT UNHEX('GG');
    +-------------+
    | UNHEX('GG') |
    +-------------+
    | NULL        |
    +-------------+
    

    A NULL result can occur if the argument to UNHEX() is a BINARY column, because values are padded with 0x00 bytes when stored but those bytes are not stripped on retrieval. For example, '41' is stored into a CHAR(3) column as '41 ' and retrieved as '41' (with the trailing pad space stripped), so UNHEX() for the column value returns 'A'. By contrast '41' is stored into a BINARY(3) column as '41\0' and retrieved as '41\0' (with the trailing pad 0x00 byte not stripped). '\0' is not a legal hexadecimal digit, so UNHEX() for the column value returns NULL.

    For a numeric argument N, the inverse of HEX(N) is not performed by UNHEX(). Use CONV(HEX(N),16,10) instead. See the description of HEX().

  • UPPER(str)

    Returns the string str with all characters changed to uppercase according to the current character set mapping. The default is latin1 (cp1252 West European).

    mysql> SELECT UPPER('Hej');
            -> 'HEJ'
    

    See the description of LOWER() for information that also applies to UPPER(). This included information about how to perform lettercase conversion of binary strings (BINARY, VARBINARY, BLOB) for which these functions are ineffective, and information about case folding for Unicode character sets.

    This function is multi-byte safe.

    In previous versions of MySQL, UPPER() used within a view was rewritten as UCASE() when storing the view's definition. In MySQL 5.7, UPPER() is never rewritten in such cases, but UCASE() used within views is instead rewritten as UPPER(). (Bug #12844279)

  • WEIGHT_STRING(str [AS {CHAR|BINARY}(N)] [LEVEL levels] [flags])

    levels: N [ASC|DESC|REVERSE] [, N [ASC|DESC|REVERSE]] ...

    This function returns the weight string for the input string. The return value is a binary string that represents the sorting and comparison value of the string. It has these properties:

    WEIGHT_STRING() can be used for testing and debugging of collations, especially if you are adding a new collation. See Section 10.4, “Adding a Collation to a Character Set”.

    The input string, str, is a string expression. If the input is a nonbinary (character) string such as a CHAR, VARCHAR, or TEXT value, the return value contains the collation weights for the string. If the input is a binary (byte) string such as a BINARY, VARBINARY, or BLOB value, the return value is the same as the input (the weight for each byte in a binary string is the byte value). If the input is NULL, WEIGHT_STRING() returns NULL.

    Examples:

    mysql> SET @s = _latin1 'AB' COLLATE latin1_swedish_ci;
    mysql> SELECT @s, HEX(@s), HEX(WEIGHT_STRING(@s));
    +------+---------+------------------------+
    | @s   | HEX(@s) | HEX(WEIGHT_STRING(@s)) |
    +------+---------+------------------------+
    | AB   | 4142    | 4142                   |
    +------+---------+------------------------+
    
    mysql> SET @s = _latin1 'ab' COLLATE latin1_swedish_ci;
    mysql> SELECT @s, HEX(@s), HEX(WEIGHT_STRING(@s));
    +------+---------+------------------------+
    | @s   | HEX(@s) | HEX(WEIGHT_STRING(@s)) |
    +------+---------+------------------------+
    | ab   | 6162    | 4142                   |
    +------+---------+------------------------+
    
    mysql> SET @s = CAST('AB' AS BINARY);
    mysql> SELECT @s, HEX(@s), HEX(WEIGHT_STRING(@s));
    +------+---------+------------------------+
    | @s   | HEX(@s) | HEX(WEIGHT_STRING(@s)) |
    +------+---------+------------------------+
    | AB   | 4142    | 4142                   |
    +------+---------+------------------------+
    
    mysql> SET @s = CAST('ab' AS BINARY);
    mysql> SELECT @s, HEX(@s), HEX(WEIGHT_STRING(@s));
    +------+---------+------------------------+
    | @s   | HEX(@s) | HEX(WEIGHT_STRING(@s)) |
    +------+---------+------------------------+
    | ab   | 6162    | 6162                   |
    +------+---------+------------------------+
    

    The preceding examples use HEX() to display the WEIGHT_STRING() result. Because the result is a binary value, HEX() can be especially useful when the result contains nonprinting values, to display it in printable form:

    mysql> SET @s = CONVERT(0xC39F USING utf8) COLLATE utf8_czech_ci;
    mysql> SELECT HEX(WEIGHT_STRING(@s));
    +------------------------+
    | HEX(WEIGHT_STRING(@s)) |
    +------------------------+
    | 0FEA0FEA               |
    +------------------------+
    

    For non-NULL return values, the data type of the value is VARBINARY if its length is within the maximum length for VARBINARY, otherwise the data type is BLOB.

    The AS clause may be given to cast the input string to a nonbinary or binary string and to force it to a given length:

    • AS CHAR(N) casts the string to a nonbinary string and pads it on the right with spaces to a length of N characters. N must be at least 1. If N is less than the length of the input string, the string is truncated to N characters. No warning occurs for truncation.

    • AS BINARY(N) is similar but casts the string to a binary string, N is measured in bytes (not characters), and padding uses 0x00 bytes (not spaces).

    mysql> SELECT HEX(WEIGHT_STRING('ab' AS CHAR(4)));
    +-------------------------------------+
    | HEX(WEIGHT_STRING('ab' AS CHAR(4))) |
    +-------------------------------------+
    | 41422020                            |
    +-------------------------------------+
    
    mysql> SELECT HEX(WEIGHT_STRING('ab' AS BINARY(4)));
    +---------------------------------------+
    | HEX(WEIGHT_STRING('ab' AS BINARY(4))) |
    +---------------------------------------+
    | 61620000                              |
    +---------------------------------------+
    

    The LEVEL clause may be given to specify that the return value should contain weights for specific collation levels.

    The levels specifier following the LEVEL keyword may be given either as a list of one or more integers separated by commas, or as a range of two integers separated by a dash. Whitespace around the punctuation characters does not matter.

    Examples:

    LEVEL 1
    LEVEL 2, 3, 5
    LEVEL 1-3

    Any level less than 1 is treated as 1. Any level greater than the maximum for the input string collation is treated as maximum for the collation. The maximum varies per collation, but is never greater than 6.

    In a list of levels, levels must be given in increasing order. In a range of levels, if the second number is less than the first, it is treated as the first number (for example, 4-2 is the same as 4-4).

    If the LEVEL clause is omitted, MySQL assumes LEVEL 1 - max, where max is the maximum level for the collation.

    If LEVEL is specified using list syntax (not range syntax), any level number can be followed by these modifiers:

    • ASC: Return the weights without modification. This is the default.

    • DESC: Return bitwise-inverted weights (for example, 0x78f0 DESC = 0x870f).

    • REVERSE: Return the weights in reverse order (that is,the weights for the reversed string, with the first character last and the last first).

    Examples:

    mysql> SELECT HEX(WEIGHT_STRING(0x007fff LEVEL 1));
    +--------------------------------------+
    | HEX(WEIGHT_STRING(0x007fff LEVEL 1)) |
    +--------------------------------------+
    | 007FFF                               |
    +--------------------------------------+
    
    mysql> SELECT HEX(WEIGHT_STRING(0x007fff LEVEL 1 DESC));
    +-------------------------------------------+
    | HEX(WEIGHT_STRING(0x007fff LEVEL 1 DESC)) |
    +-------------------------------------------+
    | FF8000                                    |
    +-------------------------------------------+
    
    mysql> SELECT HEX(WEIGHT_STRING(0x007fff LEVEL 1 REVERSE));
    +----------------------------------------------+
    | HEX(WEIGHT_STRING(0x007fff LEVEL 1 REVERSE)) |
    +----------------------------------------------+
    | FF7F00                                       |
    +----------------------------------------------+
    
    mysql> SELECT HEX(WEIGHT_STRING(0x007fff LEVEL 1 DESC REVERSE));
    +---------------------------------------------------+
    | HEX(WEIGHT_STRING(0x007fff LEVEL 1 DESC REVERSE)) |
    +---------------------------------------------------+
    | 0080FF                                            |
    +---------------------------------------------------+
    

    The flags clause currently is unused.

12.5.1. String Comparison Functions

Table 12.8. String Comparison Operators

NameDescription
LIKESimple pattern matching
NOT LIKENegation of simple pattern matching
STRCMP()Compare two strings

If a string function is given a binary string as an argument, the resulting string is also a binary string. A number converted to a string is treated as a binary string. This affects only comparisons.

Normally, if any expression in a string comparison is case sensitive, the comparison is performed in case-sensitive fashion.

  • expr LIKE pat [ESCAPE 'escape_char']

    Pattern matching using SQL simple regular expression comparison. Returns 1 (TRUE) or 0 (FALSE). If either expr or pat is NULL, the result is NULL.

    The pattern need not be a literal string. For example, it can be specified as a string expression or table column.

    Per the SQL standard, LIKE performs matching on a per-character basis, thus it can produce results different from the = comparison operator:

    mysql> SELECT 'ä' LIKE 'ae' COLLATE latin1_german2_ci;
    +-----------------------------------------+
    | 'ä' LIKE 'ae' COLLATE latin1_german2_ci |
    +-----------------------------------------+
    |                                       0 |
    +-----------------------------------------+
    mysql> SELECT 'ä' = 'ae' COLLATE latin1_german2_ci;
    +--------------------------------------+
    | 'ä' = 'ae' COLLATE latin1_german2_ci |
    +--------------------------------------+
    |                                    1 |
    +--------------------------------------+
    

    In particular, trailing spaces are significant, which is not true for CHAR or VARCHAR comparisons performed with the = operator:

    mysql> SELECT 'a' = 'a ', 'a' LIKE 'a ';
    +------------+---------------+
    | 'a' = 'a ' | 'a' LIKE 'a ' |
    +------------+---------------+
    |          1 |             0 |
    +------------+---------------+
    1 row in set (0.00 sec)
    

    With LIKE you can use the following two wildcard characters in the pattern.

    CharacterDescription
    %Matches any number of characters, even zero characters
    _Matches exactly one character
    mysql> SELECT 'David!' LIKE 'David_';
            -> 1
    mysql> SELECT 'David!' LIKE '%D%v%';
            -> 1
    

    To test for literal instances of a wildcard character, precede it by the escape character. If you do not specify the ESCAPE character, \ is assumed.

    StringDescription
    \%Matches one % character
    \_Matches one _ character
    mysql> SELECT 'David!' LIKE 'David\_';
            -> 0
    mysql> SELECT 'David_' LIKE 'David\_';
            -> 1
    

    To specify a different escape character, use the ESCAPE clause:

    mysql> SELECT 'David_' LIKE 'David|_' ESCAPE '|';
            -> 1
    

    The escape sequence should be empty or one character long. The expression must evaluate as a constant at execution time. If the NO_BACKSLASH_ESCAPES SQL mode is enabled, the sequence cannot be empty.

    The following two statements illustrate that string comparisons are not case sensitive unless one of the operands is a binary string:

    mysql> SELECT 'abc' LIKE 'ABC';
            -> 1
    mysql> SELECT 'abc' LIKE BINARY 'ABC';
            -> 0
    

    In MySQL, LIKE is permitted on numeric expressions. (This is an extension to the standard SQL LIKE.)

    mysql> SELECT 10 LIKE '1%';
            -> 1
    
    Note

    Because MySQL uses C escape syntax in strings (for example, \n to represent a newline character), you must double any \ that you use in LIKE strings. For example, to search for \n, specify it as \\n. To search for \, specify it as \\\\; this is because the backslashes are stripped once by the parser and again when the pattern match is made, leaving a single backslash to be matched against.

    Exception: At the end of the pattern string, backslash can be specified as \\. At the end of the string, backslash stands for itself because there is nothing following to escape. Suppose that a table contains the following values:

    mysql> SELECT filename FROM t1;
    +--------------+
    | filename     |
    +--------------+
    | C:           | 
    | C:\          | 
    | C:\Programs  | 
    | C:\Programs\ | 
    +--------------+
    

    To test for values that end with backslash, you can match the values using either of the following patterns:

    mysql> SELECT filename, filename LIKE '%\\' FROM t1;
    +--------------+---------------------+
    | filename     | filename LIKE '%\\' |
    +--------------+---------------------+
    | C:           |                   0 | 
    | C:\          |                   1 | 
    | C:\Programs  |                   0 | 
    | C:\Programs\ |                   1 | 
    +--------------+---------------------+
    
    mysql> SELECT filename, filename LIKE '%\\\\' FROM t1;
    +--------------+-----------------------+
    | filename     | filename LIKE '%\\\\' |
    +--------------+-----------------------+
    | C:           |                     0 | 
    | C:\          |                     1 | 
    | C:\Programs  |                     0 | 
    | C:\Programs\ |                     1 | 
    +--------------+-----------------------+
    
  • expr NOT LIKE pat [ESCAPE 'escape_char']

    This is the same as NOT (expr LIKE pat [ESCAPE 'escape_char']).

    Note

    Aggregate queries involving NOT LIKE comparisons with columns containing NULL may yield unexpected results. For example, consider the following table and data:

    CREATE TABLE foo (bar VARCHAR(10));
    
    INSERT INTO foo VALUES (NULL), (NULL);

    The query SELECT COUNT(*) FROM foo WHERE bar LIKE '%baz%'; returns 0. You might assume that SELECT COUNT(*) FROM foo WHERE bar NOT LIKE '%baz%'; would return 2. However, this is not the case: The second query returns 0. This is because NULL NOT LIKE expr always returns NULL, regardless of the value of expr. The same is true for aggregate queries involving NULL and comparisons using NOT RLIKE or NOT REGEXP. In such cases, you must test explicitly for NOT NULL using OR (and not AND), as shown here:

    SELECT COUNT(*) FROM foo WHERE bar NOT LIKE '%baz%' OR bar IS NULL;
  • STRCMP(expr1,expr2)

    STRCMP() returns 0 if the strings are the same, -1 if the first argument is smaller than the second according to the current sort order, and 1 otherwise.

    mysql> SELECT STRCMP('text', 'text2');
            -> -1
    mysql> SELECT STRCMP('text2', 'text');
            -> 1
    mysql> SELECT STRCMP('text', 'text');
            -> 0
    

    STRCMP() performs the comparison using the collation of the arguments.

    mysql> SET @s1 = _latin1 'x' COLLATE latin1_general_ci;
    mysql> SET @s2 = _latin1 'X' COLLATE latin1_general_ci;
    mysql> SET @s3 = _latin1 'x' COLLATE latin1_general_cs;
    mysql> SET @s4 = _latin1 'X' COLLATE latin1_general_cs;
    mysql> SELECT STRCMP(@s1, @s2), STRCMP(@s3, @s4);
    +------------------+------------------+
    | STRCMP(@s1, @s2) | STRCMP(@s3, @s4) |
    +------------------+------------------+
    |                0 |                1 |
    +------------------+------------------+
    

    If the collations are incompatible, one of the arguments must be converted to be compatible with the other. See Section 10.1.7.5, “Collation of Expressions”.

    mysql> SELECT STRCMP(@s1, @s3);
    ERROR 1267 (HY000) at line 10: Illegal mix of collations (latin1_general_ci,IMPLICIT) and (latin1_general_cs,IMPLICIT) for operation 'strcmp'
    mysql> SELECT STRCMP(@s1, @s3 COLLATE latin1_general_ci);
    +--------------------------------------------+
    | STRCMP(@s1, @s3 COLLATE latin1_general_ci) |
    +--------------------------------------------+
    |                                          0 |
    +--------------------------------------------+
    

12.5.2. Regular Expressions

Table 12.9. String Regular Expression Operators

NameDescription
NOT REGEXPNegation of REGEXP
REGEXPPattern matching using regular expressions
RLIKESynonym for REGEXP

A regular expression is a powerful way of specifying a pattern for a complex search.

MySQL uses Henry Spencer's implementation of regular expressions, which is aimed at conformance with POSIX 1003.2. MySQL uses the extended version to support pattern-matching operations performed with the REGEXP operator in SQL statements.

This section summarizes, with examples, the special characters and constructs that can be used in MySQL for REGEXP operations. It does not contain all the details that can be found in Henry Spencer's regex(7) manual page. That manual page is included in MySQL source distributions, in the regex.7 file under the regex directory. See also Section 3.3.4.7, “Pattern Matching”.

  • expr NOT REGEXP pat, expr NOT RLIKE pat

    This is the same as NOT (expr REGEXP pat).

  • expr REGEXP pat, expr RLIKE pat

    Performs a pattern match of a string expression expr against a pattern pat. The pattern can be an extended regular expression. The syntax for regular expressions is discussed in Section 12.5.2, “Regular Expressions”. Returns 1 if expr matches pat; otherwise it returns 0. If either expr or pat is NULL, the result is NULL. RLIKE is a synonym for REGEXP, provided for mSQL compatibility.

    The pattern need not be a literal string. For example, it can be specified as a string expression or table column.

    Note

    Because MySQL uses the C escape syntax in strings (for example, \n to represent the newline character), you must double any \ that you use in your REGEXP strings.

    REGEXP is not case sensitive, except when used with binary strings.

    mysql> SELECT 'Monty!' REGEXP '.*';
            -> 1
    mysql> SELECT 'new*\n*line' REGEXP 'new\\*.\\*line';
            -> 1
    mysql> SELECT 'a' REGEXP 'A', 'a' REGEXP BINARY 'A';
            -> 1  0
    mysql> SELECT 'a' REGEXP '^[a-d]';
            -> 1
    

    REGEXP and RLIKE use the character set and collations of the arguments when deciding the type of a character and performing the comparison. If the arguments have different character sets or collations, coercibility rules apply as described in Section 10.1.7.5, “Collation of Expressions”.

    Warning

    The REGEXP and RLIKE operators work in byte-wise fashion, so they are not multi-byte safe and may produce unexpected results with multi-byte character sets. In addition, these operators compare characters by their byte values and accented characters may not compare as equal even if a given collation treats them as equal.

A regular expression describes a set of strings. The simplest regular expression is one that has no special characters in it. For example, the regular expression hello matches hello and nothing else.

Nontrivial regular expressions use certain special constructs so that they can match more than one string. For example, the regular expression hello|word matches either the string hello or the string word.

As a more complex example, the regular expression B[an]*s matches any of the strings Bananas, Baaaaas, Bs, and any other string starting with a B, ending with an s, and containing any number of a or n characters in between.

A regular expression for the REGEXP operator may use any of the following special characters and constructs:

  • ^

    Match the beginning of a string.

    mysql> SELECT 'fo\nfo' REGEXP '^fo$';                   -> 0
    mysql> SELECT 'fofo' REGEXP '^fo';                      -> 1
    
  • $

    Match the end of a string.

    mysql> SELECT 'fo\no' REGEXP '^fo\no$';                 -> 1
    mysql> SELECT 'fo\no' REGEXP '^fo$';                    -> 0
    
  • .

    Match any character (including carriage return and newline).

    mysql> SELECT 'fofo' REGEXP '^f.*$';                    -> 1
    mysql> SELECT 'fo\r\nfo' REGEXP '^f.*$';                -> 1
    
  • a*

    Match any sequence of zero or more a characters.

    mysql> SELECT 'Ban' REGEXP '^Ba*n';                     -> 1
    mysql> SELECT 'Baaan' REGEXP '^Ba*n';                   -> 1
    mysql> SELECT 'Bn' REGEXP '^Ba*n';                      -> 1
    
  • a+

    Match any sequence of one or more a characters.

    mysql> SELECT 'Ban' REGEXP '^Ba+n';                     -> 1
    mysql> SELECT 'Bn' REGEXP '^Ba+n';                      -> 0
    
  • a?

    Match either zero or one a character.

    mysql> SELECT 'Bn' REGEXP '^Ba?n';                      -> 1
    mysql> SELECT 'Ban' REGEXP '^Ba?n';                     -> 1
    mysql> SELECT 'Baan' REGEXP '^Ba?n';                    -> 0
    
  • de|abc

    Match either of the sequences de or abc.

    mysql> SELECT 'pi' REGEXP 'pi|apa';                     -> 1
    mysql> SELECT 'axe' REGEXP 'pi|apa';                    -> 0
    mysql> SELECT 'apa' REGEXP 'pi|apa';                    -> 1
    mysql> SELECT 'apa' REGEXP '^(pi|apa)$';                -> 1
    mysql> SELECT 'pi' REGEXP '^(pi|apa)$';                 -> 1
    mysql> SELECT 'pix' REGEXP '^(pi|apa)$';                -> 0
    
  • (abc)*

    Match zero or more instances of the sequence abc.

    mysql> SELECT 'pi' REGEXP '^(pi)*$';                    -> 1
    mysql> SELECT 'pip' REGEXP '^(pi)*$';                   -> 0
    mysql> SELECT 'pipi' REGEXP '^(pi)*$';                  -> 1
    
  • {1}, {2,3}

    {n} or {m,n} notation provides a more general way of writing regular expressions that match many occurrences of the previous atom (or piece) of the pattern. m and n are integers.

    • a*

      Can be written as a{0,}.

    • a+

      Can be written as a{1,}.

    • a?

      Can be written as a{0,1}.

    To be more precise, a{n} matches exactly n instances of a. a{n,} matches n or more instances of a. a{m,n} matches m through n instances of a, inclusive.

    m and n must be in the range from 0 to RE_DUP_MAX (default 255), inclusive. If both m and n are given, m must be less than or equal to n.

    mysql> SELECT 'abcde' REGEXP 'a[bcd]{2}e';              -> 0
    mysql> SELECT 'abcde' REGEXP 'a[bcd]{3}e';              -> 1
    mysql> SELECT 'abcde' REGEXP 'a[bcd]{1,10}e';           -> 1
    
  • [a-dX], [^a-dX]

    Matches any character that is (or is not, if ^ is used) either a, b, c, d or X. A - character between two other characters forms a range that matches all characters from the first character to the second. For example, [0-9] matches any decimal digit. To include a literal ] character, it must immediately follow the opening bracket [. To include a literal - character, it must be written first or last. Any character that does not have a defined special meaning inside a [] pair matches only itself.

    mysql> SELECT 'aXbc' REGEXP '[a-dXYZ]';                 -> 1
    mysql> SELECT 'aXbc' REGEXP '^[a-dXYZ]$';               -> 0
    mysql> SELECT 'aXbc' REGEXP '^[a-dXYZ]+$';              -> 1
    mysql> SELECT 'aXbc' REGEXP '^[^a-dXYZ]+$';             -> 0
    mysql> SELECT 'gheis' REGEXP '^[^a-dXYZ]+$';            -> 1
    mysql> SELECT 'gheisa' REGEXP '^[^a-dXYZ]+$';           -> 0
    
  • [.characters.]

    Within a bracket expression (written using [ and ]), matches the sequence of characters of that collating element. characters is either a single character or a character name like newline. The following table lists the permissible character names.

    The following table shows the permissible character names and the characters that they match. For characters given as numeric values, the values are represented in octal.

    NameCharacterNameCharacter
    NUL0SOH001
    STX002ETX003
    EOT004ENQ005
    ACK006BEL007
    alert007BS010
    backspace'\b'HT011
    tab'\t'LF012
    newline'\n'VT013
    vertical-tab'\v'FF014
    form-feed'\f'CR015
    carriage-return'\r'SO016
    SI017DLE020
    DC1021DC2022
    DC3023DC4024
    NAK025SYN026
    ETB027CAN030
    EM031SUB032
    ESC033IS4034
    FS034IS3035
    GS035IS2036
    RS036IS1037
    US037space' '
    exclamation-mark'!'quotation-mark'"'
    number-sign'#'dollar-sign'$'
    percent-sign'%'ampersand'&'
    apostrophe'\''left-parenthesis'('
    right-parenthesis')'asterisk'*'
    plus-sign'+'comma','
    hyphen'-'hyphen-minus'-'
    period'.'full-stop'.'
    slash'/'solidus'/'
    zero'0'one'1'
    two'2'three'3'
    four'4'five'5'
    six'6'seven'7'
    eight'8'nine'9'
    colon':'semicolon';'
    less-than-sign'<'equals-sign'='
    greater-than-sign'>'question-mark'?'
    commercial-at'@'left-square-bracket'['
    backslash'\\'reverse-solidus'\\'
    right-square-bracket']'circumflex'^'
    circumflex-accent'^'underscore'_'
    low-line'_'grave-accent'`'
    left-brace'{'left-curly-bracket'{'
    vertical-line'|'right-brace'}'
    right-curly-bracket'}'tilde'~'
    DEL177  
    mysql> SELECT '~' REGEXP '[[.~.]]';                     -> 1
    mysql> SELECT '~' REGEXP '[[.tilde.]]';                 -> 1
    
  • [=character_class=]

    Within a bracket expression (written using [ and ]), [=character_class=] represents an equivalence class. It matches all characters with the same collation value, including itself. For example, if o and (+) are the members of an equivalence class, [[=o=]], [[=(+)=]], and [o(+)] are all synonymous. An equivalence class may not be used as an endpoint of a range.

  • [:character_class:]

    Within a bracket expression (written using [ and ]), [:character_class:] represents a character class that matches all characters belonging to that class. The following table lists the standard class names. These names stand for the character classes defined in the ctype(3) manual page. A particular locale may provide other class names. A character class may not be used as an endpoint of a range.

    Character Class NameMeaning
    alnumAlphanumeric characters
    alphaAlphabetic characters
    blankWhitespace characters
    cntrlControl characters
    digitDigit characters
    graphGraphic characters
    lowerLowercase alphabetic characters
    printGraphic or space characters
    punctPunctuation characters
    spaceSpace, tab, newline, and carriage return
    upperUppercase alphabetic characters
    xdigitHexadecimal digit characters
    mysql> SELECT 'justalnums' REGEXP '[[:alnum:]]+';       -> 1
    mysql> SELECT '!!' REGEXP '[[:alnum:]]+';               -> 0
    
  • [[:<:]], [[:>:]]

    These markers stand for word boundaries. They match the beginning and end of words, respectively. A word is a sequence of word characters that is not preceded by or followed by word characters. A word character is an alphanumeric character in the alnum class or an underscore (_).

    mysql> SELECT 'a word a' REGEXP '[[:<:]]word[[:>:]]';   -> 1
    mysql> SELECT 'a xword a' REGEXP '[[:<:]]word[[:>:]]';  -> 0
    

To use a literal instance of a special character in a regular expression, precede it by two backslash (\) characters. The MySQL parser interprets one of the backslashes, and the regular expression library interprets the other. For example, to match the string 1+2 that contains the special + character, only the last of the following regular expressions is the correct one:

mysql> SELECT '1+2' REGEXP '1+2';                       -> 0
mysql> SELECT '1+2' REGEXP '1\+2';                      -> 0
mysql> SELECT '1+2' REGEXP '1\\+2';                     -> 1

12.6. Numeric Functions and Operators

Table 12.10. Numeric Functions and Operators

NameDescription
ABS()Return the absolute value
ACOS()Return the arc cosine
ASIN()Return the arc sine
ATAN2(), ATAN()Return the arc tangent of the two arguments
ATAN()Return the arc tangent
CEIL()Return the smallest integer value not less than the argument
CEILING()Return the smallest integer value not less than the argument
CONV()Convert numbers between different number bases
COS()Return the cosine
COT()Return the cotangent
CRC32()Compute a cyclic redundancy check value
DEGREES()Convert radians to degrees
DIVInteger division
/Division operator
EXP()Raise to the power of
FLOOR()Return the largest integer value not greater than the argument
LN()Return the natural logarithm of the argument
LOG10()Return the base-10 logarithm of the argument
LOG2()Return the base-2 logarithm of the argument
LOG()Return the natural logarithm of the first argument
-Minus operator
MOD()Return the remainder
% or MODModulo operator
PI()Return the value of pi
+Addition operator
POW()Return the argument raised to the specified power
POWER()Return the argument raised to the specified power
RADIANS()Return argument converted to radians
RAND()Return a random floating-point value
ROUND()Round the argument
SIGN()Return the sign of the argument
SIN()Return the sine of the argument
SQRT()Return the square root of the argument
TAN()Return the tangent of the argument
*Multiplication operator
TRUNCATE()Truncate to specified number of decimal places
-Change the sign of the argument

12.6.1. Arithmetic Operators

Table 12.11. Arithmetic Operators

NameDescription
DIVInteger division
/Division operator
-Minus operator
% or MODModulo operator
+Addition operator
*Multiplication operator
-Change the sign of the argument

The usual arithmetic operators are available. The result is determined according to the following rules:

  • In the case of -, +, and *, the result is calculated with BIGINT (64-bit) precision if both operands are integers.

  • If both operands are integers and any of them are unsigned, the result is an unsigned integer. For subtraction, if the NO_UNSIGNED_SUBTRACTION SQL mode is enabled, the result is signed even if any operand is unsigned.

  • If any of the operands of a +, -, /, *, % is a real or string value, the precision of the result is the precision of the operand with the maximum precision.

  • In division performed with /, the scale of the result when using two exact-value operands is the scale of the first operand plus the value of the div_precision_increment system variable (which is 4 by default). For example, the result of the expression 5.05 / 0.014 has a scale of six decimal places (360.714286).

These rules are applied for each operation, such that nested calculations imply the precision of each component. Hence, (14620 / 9432456) / (24250 / 9432456), resolves first to (0.0014) / (0.0026), with the final result having 8 decimal places (0.60288653).

Because of these rules and the way they are applied, care should be taken to ensure that components and subcomponents of a calculation use the appropriate level of precision. See Section 12.10, “Cast Functions and Operators”.

For information about handling of overflow in numeric expression evaluation, see Section 11.2.6, “Out-of-Range and Overflow Handling”.

Arithmetic operators apply to numbers. For other types of values, alternative operations may be available. For example, to add date values, use DATE_ADD(); see Section 12.7, “Date and Time Functions”.

  • +

    Addition:

    mysql> SELECT 3+5;
            -> 8
    
  • -

    Subtraction:

    mysql> SELECT 3-5;
            -> -2
    
  • -

    Unary minus. This operator changes the sign of the operand.

    mysql> SELECT - 2;
            -> -2
    
    Note

    If this operator is used with a BIGINT, the return value is also a BIGINT. This means that you should avoid using - on integers that may have the value of –263.

  • *

    Multiplication:

    mysql> SELECT 3*5;
            -> 15
    mysql> SELECT 18014398509481984*18014398509481984.0;
            -> 324518553658426726783156020576256.0
    mysql> SELECT 18014398509481984*18014398509481984;
            -> out-of-range error
    

    The last expression produces an error because the result of the integer multiplication exceeds the 64-bit range of BIGINT calculations. (See Section 11.2, “Numeric Types”.)

  • /

    Division:

    mysql> SELECT 3/5;
            -> 0.60
    

    Division by zero produces a NULL result:

    mysql> SELECT 102/(1-1);
            -> NULL
    

    A division is calculated with BIGINT arithmetic only if performed in a context where its result is converted to an integer.

  • DIV

    Integer division. Similar to FLOOR(), but is safe with BIGINT values.

    In MySQL 5.7, if either operand has a noninteger type, the operands are converted to DECIMAL and divided using DECIMAL arithmetic before converting the result to BIGINT. If the result exceeds BIGINT range, an error occurs.

    mysql> SELECT 5 DIV 2;
            -> 2
    
  • N % M, N MOD M

    Modulo operation. Returns the remainder of N divided by M. For more information, see the description for the MOD() function in Section 12.6.2, “Mathematical Functions”.

12.6.2. Mathematical Functions

Table 12.12. Mathematical Functions

NameDescription
ABS()Return the absolute value
ACOS()Return the arc cosine
ASIN()Return the arc sine
ATAN2(), ATAN()Return the arc tangent of the two arguments
ATAN()Return the arc tangent
CEIL()Return the smallest integer value not less than the argument
CEILING()Return the smallest integer value not less than the argument
CONV()Convert numbers between different number bases
COS()Return the cosine
COT()Return the cotangent
CRC32()Compute a cyclic redundancy check value
DEGREES()Convert radians to degrees
EXP()Raise to the power of
FLOOR()Return the largest integer value not greater than the argument
LN()Return the natural logarithm of the argument
LOG10()Return the base-10 logarithm of the argument
LOG2()Return the base-2 logarithm of the argument
LOG()Return the natural logarithm of the first argument
MOD()Return the remainder
PI()Return the value of pi
POW()Return the argument raised to the specified power
POWER()Return the argument raised to the specified power
RADIANS()Return argument converted to radians
RAND()Return a random floating-point value
ROUND()Round the argument
SIGN()Return the sign of the argument
SIN()Return the sine of the argument
SQRT()Return the square root of the argument
TAN()Return the tangent of the argument
TRUNCATE()Truncate to specified number of decimal places

All mathematical functions return NULL in the event of an error.

  • ABS(X)

    Returns the absolute value of X.

    mysql> SELECT ABS(2);
            -> 2
    mysql> SELECT ABS(-32);
            -> 32
    

    This function is safe to use with BIGINT values.

  • ACOS(X)

    Returns the arc cosine of X, that is, the value whose cosine is X. Returns NULL if X is not in the range -1 to 1.

    mysql> SELECT ACOS(1);
            -> 0
    mysql> SELECT ACOS(1.0001);
            -> NULL
    mysql> SELECT ACOS(0);
            -> 1.5707963267949
    
  • ASIN(X)

    Returns the arc sine of X, that is, the value whose sine is X. Returns NULL if X is not in the range -1 to 1.

    mysql> SELECT ASIN(0.2);
            -> 0.20135792079033
    mysql> SELECT ASIN('foo');
    
    +-------------+
    | ASIN('foo') |
    +-------------+
    |           0 |
    +-------------+
    1 row in set, 1 warning (0.00 sec)
    
    mysql> SHOW WARNINGS;
    +---------+------+-----------------------------------------+
    | Level   | Code | Message                                 |
    +---------+------+-----------------------------------------+
    | Warning | 1292 | Truncated incorrect DOUBLE value: 'foo' |
    +---------+------+-----------------------------------------+
    
  • ATAN(X)

    Returns the arc tangent of X, that is, the value whose tangent is X.

    mysql> SELECT ATAN(2);
            -> 1.1071487177941
    mysql> SELECT ATAN(-2);
            -> -1.1071487177941
    
  • ATAN(Y,X), ATAN2(Y,X)

    Returns the arc tangent of the two variables X and Y. It is similar to calculating the arc tangent of Y / X, except that the signs of both arguments are used to determine the quadrant of the result.

    mysql> SELECT ATAN(-2,2);
            -> -0.78539816339745
    mysql> SELECT ATAN2(PI(),0);
            -> 1.5707963267949
    
  • CEIL(X)

    CEIL() is a synonym for CEILING().

  • CEILING(X)

    Returns the smallest integer value not less than X.

    mysql> SELECT CEILING(1.23);
            -> 2
    mysql> SELECT CEILING(-1.23);
            -> -1
    

    For exact-value numeric arguments, the return value has an exact-value numeric type. For string or floating-point arguments, the return value has a floating-point type.

  • CONV(N,from_base,to_base)

    Converts numbers between different number bases. Returns a string representation of the number N, converted from base from_base to base to_base. Returns NULL if any argument is NULL. The argument N is interpreted as an integer, but may be specified as an integer or a string. The minimum base is 2 and the maximum base is 36. If to_base is a negative number, N is regarded as a signed number. Otherwise, N is treated as unsigned. CONV() works with 64-bit precision.

    mysql> SELECT CONV('a',16,2);
            -> '1010'
    mysql> SELECT CONV('6E',18,8);
            -> '172'
    mysql> SELECT CONV(-17,10,-18);
            -> '-H'
    mysql> SELECT CONV(10+'10'+'10'+0xa,10,10);
            -> '40'
    
  • COS(X)

    Returns the cosine of X, where X is given in radians.

    mysql> SELECT COS(PI());
            -> -1
    
  • COT(X)

    Returns the cotangent of X.

    mysql> SELECT COT(12);
            -> -1.5726734063977
    mysql> SELECT COT(0);
            -> NULL
    
  • CRC32(expr)

    Computes a cyclic redundancy check value and returns a 32-bit unsigned value. The result is NULL if the argument is NULL. The argument is expected to be a string and (if possible) is treated as one if it is not.

    mysql> SELECT CRC32('MySQL');
            -> 3259397556
    mysql> SELECT CRC32('mysql');
            -> 2501908538
    
  • DEGREES(X)

    Returns the argument X, converted from radians to degrees.

    mysql> SELECT DEGREES(PI());
            -> 180
    mysql> SELECT DEGREES(PI() / 2);
            -> 90
    
  • EXP(X)

    Returns the value of e (the base of natural logarithms) raised to the power of X. The inverse of this function is LOG() (using a single argument only) or LN().

    mysql> SELECT EXP(2);
            -> 7.3890560989307
    mysql> SELECT EXP(-2);
            -> 0.13533528323661
    mysql> SELECT EXP(0);
            -> 1
    
  • FLOOR(X)

    Returns the largest integer value not greater than X.

    mysql> SELECT FLOOR(1.23);
            -> 1
    mysql> SELECT FLOOR(-1.23);
            -> -2
    

    For exact-value numeric arguments, the return value has an exact-value numeric type. For string or floating-point arguments, the return value has a floating-point type.

  • FORMAT(X,D)

    Formats the number X to a format like '#,###,###.##', rounded to D decimal places, and returns the result as a string. For details, see Section 12.5, “String Functions”.

  • HEX(N_or_S)

    This function can be used to obtain a hexadecimal representation of a decimal number or a string; the manner in which it does so varies according to the argument's type. See this function's description in Section 12.5, “String Functions”, for details.

  • LN(X)

    Returns the natural logarithm of X; that is, the base-e logarithm of X. If X is less than or equal to 0, then NULL is returned.

    mysql> SELECT LN(2);
            -> 0.69314718055995
    mysql> SELECT LN(-2);
            -> NULL
    

    This function is synonymous with LOG(X). The inverse of this function is the EXP() function.

  • LOG(X), LOG(B,X)

    If called with one parameter, this function returns the natural logarithm of X. If X is less than or equal to 0, then NULL is returned.

    The inverse of this function (when called with a single argument) is the EXP() function.

    mysql> SELECT LOG(2);
            -> 0.69314718055995
    mysql> SELECT LOG(-2);
            -> NULL
    

    If called with two parameters, this function returns the logarithm of X to the base B. If X is less than or equal to 0, or if B is less than or equal to 1, then NULL is returned.

    mysql> SELECT LOG(2,65536);
            -> 16
    mysql> SELECT LOG(10,100);
            -> 2
    mysql> SELECT LOG(1,100);
            -> NULL
    

    LOG(B,X) is equivalent to LOG(X) / LOG(B).

  • LOG2(X)

    Returns the base-2 logarithm of X.

    mysql> SELECT LOG2(65536);
            -> 16
    mysql> SELECT LOG2(-100);
            -> NULL
    

    LOG2() is useful for finding out how many bits a number requires for storage. This function is equivalent to the expression LOG(X) / LOG(2).

  • LOG10(X)

    Returns the base-10 logarithm of X.

    mysql> SELECT LOG10(2);
            -> 0.30102999566398
    mysql> SELECT LOG10(100);
            -> 2
    mysql> SELECT LOG10(-100);
            -> NULL
    

    LOG10(X) is equivalent to LOG(10,X).

  • MOD(N,M), N % M, N MOD M

    Modulo operation. Returns the remainder of N divided by M.

    mysql> SELECT MOD(234, 10);
            -> 4
    mysql> SELECT 253 % 7;
            -> 1
    mysql> SELECT MOD(29,9);
            -> 2
    mysql> SELECT 29 MOD 9;
            -> 2
    

    This function is safe to use with BIGINT values.

    MOD() also works on values that have a fractional part and returns the exact remainder after division:

    mysql> SELECT MOD(34.5,3);
            -> 1.5
    

    MOD(N,0) returns NULL.

  • PI()

    Returns the value of π (pi). The default number of decimal places displayed is seven, but MySQL uses the full double-precision value internally.

    mysql> SELECT PI();
            -> 3.141593
    mysql> SELECT PI()+0.000000000000000000;
            -> 3.141592653589793116
    
  • POW(X,Y)

    Returns the value of X raised to the power of Y.

    mysql> SELECT POW(2,2);
            -> 4
    mysql> SELECT POW(2,-2);
            -> 0.25
    
  • POWER(X,Y)

    This is a synonym for POW().

  • RADIANS(X)

    Returns the argument X, converted from degrees to radians. (Note that π radians equals 180 degrees.)

    mysql> SELECT RADIANS(90);
            -> 1.5707963267949
    
  • RAND(), RAND(N)

    Returns a random floating-point value v in the range 0 <= v < 1.0. If a constant integer argument N is specified, it is used as the seed value, which produces a repeatable sequence of column values. In the following example, note that the sequences of values produced by RAND(3) is the same both places where it occurs.

    mysql> CREATE TABLE t (i INT);
    Query OK, 0 rows affected (0.42 sec)
    
    mysql> INSERT INTO t VALUES(1),(2),(3);
    Query OK, 3 rows affected (0.00 sec)
    Records: 3  Duplicates: 0  Warnings: 0
    
    mysql> SELECT i, RAND() FROM t;
    +------+------------------+
    | i    | RAND()           |
    +------+------------------+
    |    1 | 0.61914388706828 |
    |    2 | 0.93845168309142 |
    |    3 | 0.83482678498591 |
    +------+------------------+
    3 rows in set (0.00 sec)
    
    mysql> SELECT i, RAND(3) FROM t;
    +------+------------------+
    | i    | RAND(3)          |
    +------+------------------+
    |    1 | 0.90576975597606 |
    |    2 | 0.37307905813035 |
    |    3 | 0.14808605345719 |
    +------+------------------+
    3 rows in set (0.00 sec)
    
    mysql> SELECT i, RAND() FROM t;
    +------+------------------+
    | i    | RAND()           |
    +------+------------------+
    |    1 | 0.35877890638893 |
    |    2 | 0.28941420772058 |
    |    3 | 0.37073435016976 |
    +------+------------------+
    3 rows in set (0.00 sec)
    
    mysql> SELECT i, RAND(3) FROM t;
    +------+------------------+
    | i    | RAND(3)          |
    +------+------------------+
    |    1 | 0.90576975597606 |
    |    2 | 0.37307905813035 |
    |    3 | 0.14808605345719 |
    +------+------------------+
    3 rows in set (0.01 sec)
    

    With a constant initializer, the seed is initialized once when the statement is compiled, prior to execution. If a nonconstant initializer (such as a column name) is used as the argument, the seed is initialized with the value for each invocation of RAND(). (One implication of this is that for equal argument values, RAND() will return the same value each time.)

    To obtain a random integer R in the range i <= R < j, use the expression FLOOR(i + RAND() * (ji)). For example, to obtain a random integer in the range the range 7 <= R < 12, you could use the following statement:

    SELECT FLOOR(7 + (RAND() * 5));

    RAND() in a WHERE clause is re-evaluated every time the WHERE is executed.

    You cannot use a column with RAND() values in an ORDER BY clause, because ORDER BY would evaluate the column multiple times. However, you can retrieve rows in random order like this:

    mysql> SELECT * FROM tbl_name ORDER BY RAND();
    

    ORDER BY RAND() combined with LIMIT is useful for selecting a random sample from a set of rows:

    mysql> SELECT * FROM table1, table2 WHERE a=b AND c<d -> ORDER BY RAND() LIMIT 1000;
    

    RAND() is not meant to be a perfect random generator. It is a fast way to generate random numbers on demand that is portable between platforms for the same MySQL version.

    This function is unsafe for statement-based replication. A warning is logged if you use this function when binlog_format is set to STATEMENT. (Bug #49222)

  • ROUND(X), ROUND(X,D)

    Rounds the argument X to D decimal places. The rounding algorithm depends on the data type of X. D defaults to 0 if not specified. D can be negative to cause D digits left of the decimal point of the value X to become zero.

    mysql> SELECT ROUND(-1.23);
            -> -1
    mysql> SELECT ROUND(-1.58);
            -> -2
    mysql> SELECT ROUND(1.58);
            -> 2
    mysql> SELECT ROUND(1.298, 1);
            -> 1.3
    mysql> SELECT ROUND(1.298, 0);
            -> 1
    mysql> SELECT ROUND(23.298, -1);
            -> 20
    

    The return type is the same type as that of the first argument (assuming that it is integer, double, or decimal). This means that for an integer argument, the result is an integer (no decimal places):

    mysql> SELECT ROUND(150.000,2), ROUND(150,2);
    +------------------+--------------+
    | ROUND(150.000,2) | ROUND(150,2) |
    +------------------+--------------+
    |           150.00 |          150 |
    +------------------+--------------+
    

    ROUND() uses the following rules depending on the type of the first argument:

    • For exact-value numbers, ROUND() uses the round half away from zero or round toward nearest rule: A value with a fractional part of .5 or greater is rounded up to the next integer if positive or down to the next integer if negative. (In other words, it is rounded away from zero.) A value with a fractional part less than .5 is rounded down to the next integer if positive or up to the next integer if negative.

    • For approximate-value numbers, the result depends on the C library. On many systems, this means that ROUND() uses the "round to nearest even" rule: A value with any fractional part is rounded to the nearest even integer.

    The following example shows how rounding differs for exact and approximate values:

    mysql> SELECT ROUND(2.5), ROUND(25E-1);
    +------------+--------------+
    | ROUND(2.5) | ROUND(25E-1) |
    +------------+--------------+
    | 3          |            2 |
    +------------+--------------+
    

    For more information, see Section 12.19, “Precision Math”.

  • SIGN(X)

    Returns the sign of the argument as -1, 0, or 1, depending on whether X is negative, zero, or positive.

    mysql> SELECT SIGN(-32);
            -> -1
    mysql> SELECT SIGN(0);
            -> 0
    mysql> SELECT SIGN(234);
            -> 1
    
  • SIN(X)

    Returns the sine of X, where X is given in radians.

    mysql> SELECT SIN(PI());
            -> 1.2246063538224e-16
    mysql> SELECT ROUND(SIN(PI()));
            -> 0
    
  • SQRT(X)

    Returns the square root of a nonnegative number X.

    mysql> SELECT SQRT(4);
            -> 2
    mysql> SELECT SQRT(20);
            -> 4.4721359549996
    mysql> SELECT SQRT(-16);
            -> NULL
    
  • TAN(X)

    Returns the tangent of X, where X is given in radians.

    mysql> SELECT TAN(PI());
            -> -1.2246063538224e-16
    mysql> SELECT TAN(PI()+1);
            -> 1.5574077246549
    
  • TRUNCATE(X,D)

    Returns the number X, truncated to D decimal places. If D is 0, the result has no decimal point or fractional part. D can be negative to cause D digits left of the decimal point of the value X to become zero.

    mysql> SELECT TRUNCATE(1.223,1);
            -> 1.2
    mysql> SELECT TRUNCATE(1.999,1);
            -> 1.9
    mysql> SELECT TRUNCATE(1.999,0);
            -> 1
    mysql> SELECT TRUNCATE(-1.999,1);
            -> -1.9
    mysql> SELECT TRUNCATE(122,-2);
           -> 100
    mysql> SELECT TRUNCATE(10.28*100,0);
           -> 1028
    

    All numbers are rounded toward zero.

12.7. Date and Time Functions

This section describes the functions that can be used to manipulate temporal values. See Section 11.3, “Date and Time Types”, for a description of the range of values each date and time type has and the valid formats in which values may be specified.

Table 12.13. Date/Time Functions

NameDescription
ADDDATE()Add time values (intervals) to a date value
ADDTIME()Add time
CONVERT_TZ()Convert from one timezone to another
CURDATE()Return the current date
CURRENT_DATE(), CURRENT_DATESynonyms for CURDATE()
CURRENT_TIME(), CURRENT_TIMESynonyms for CURTIME()
CURRENT_TIMESTAMP(), CURRENT_TIMESTAMPSynonyms for NOW()
CURTIME()Return the current time
DATE_ADD()Add time values (intervals) to a date value
DATE_FORMAT()Format date as specified
DATE_SUB()Subtract a time value (interval) from a date
DATE()Extract the date part of a date or datetime expression
DATEDIFF()Subtract two dates
DAY()Synonym for DAYOFMONTH()
DAYNAME()Return the name of the weekday
DAYOFMONTH()Return the day of the month (0-31)
DAYOFWEEK()Return the weekday index of the argument
DAYOFYEAR()Return the day of the year (1-366)
EXTRACT()Extract part of a date
FROM_DAYS()Convert a day number to a date
FROM_UNIXTIME()Format UNIX timestamp as a date
GET_FORMAT()Return a date format string
HOUR()Extract the hour
LAST_DAYReturn the last day of the month for the argument
LOCALTIME(), LOCALTIMESynonym for NOW()
LOCALTIMESTAMP, LOCALTIMESTAMP()Synonym for NOW()
MAKEDATE()Create a date from the year and day of year
MAKETIMEMAKETIME()
MICROSECOND()Return the microseconds from argument
MINUTE()Return the minute from the argument
MONTH()Return the month from the date passed
MONTHNAME()Return the name of the month
NOW()Return the current date and time
PERIOD_ADD()Add a period to a year-month
PERIOD_DIFF()Return the number of months between periods
QUARTER()Return the quarter from a date argument
SEC_TO_TIME()Converts seconds to 'HH:MM:SS' format
SECOND()Return the second (0-59)
STR_TO_DATE()Convert a string to a date
SUBDATE()A synonym for DATE_SUB() when invoked with three arguments
SUBTIME()Subtract times
SYSDATE()Return the time at which the function executes
TIME_FORMAT()Format as time
TIME_TO_SEC()Return the argument converted to seconds
TIME()Extract the time portion of the expression passed
TIMEDIFF()Subtract time
TIMESTAMP()With a single argument, this function returns the date or datetime expression; with two arguments, the sum of the arguments
TIMESTAMPADD()Add an interval to a datetime expression
TIMESTAMPDIFF()Subtract an interval from a datetime expression
TO_DAYS()Return the date argument converted to days
TO_SECONDS()Return the date or datetime argument converted to seconds since Year 0
UNIX_TIMESTAMP()Return a UNIX timestamp
UTC_DATE()Return the current UTC date
UTC_TIME()Return the current UTC time
UTC_TIMESTAMP()Return the current UTC date and time
WEEK()Return the week number
WEEKDAY()Return the weekday index
WEEKOFYEAR()Return the calendar week of the date (0-53)
YEAR()Return the year
YEARWEEK()Return the year and week

Here is an example that uses date functions. The following query selects all rows with a date_col value from within the last 30 days:

mysql> SELECT something FROM tbl_name
    -> WHERE DATE_SUB(CURDATE(),INTERVAL 30 DAY) <= date_col;

The query also selects rows with dates that lie in the future.

Functions that expect date values usually accept datetime values and ignore the time part. Functions that expect time values usually accept datetime values and ignore the date part.

Functions that return the current date or time each are evaluated only once per query at the start of query execution. This means that multiple references to a function such as NOW() within a single query always produce the same result. (For our purposes, a single query also includes a call to a stored program (stored routine, trigger, or event) and all subprograms called by that program.) This principle also applies to CURDATE(), CURTIME(), UTC_DATE(), UTC_TIME(), UTC_TIMESTAMP(), and to any of their synonyms.

The CURRENT_TIMESTAMP(), CURRENT_TIME(), CURRENT_DATE(), and FROM_UNIXTIME() functions return values in the connection's current time zone, which is available as the value of the time_zone system variable. In addition, UNIX_TIMESTAMP() assumes that its argument is a datetime value in the current time zone. See Section 10.6, “MySQL Server Time Zone Support”.

Some date functions can be used with zero dates or incomplete dates such as '2001-11-00', whereas others cannot. Functions that extract parts of dates typically work with incomplete dates and thus can return 0 when you might otherwise expect a nonzero value. For example:

mysql> SELECT DAYOFMONTH('2001-11-00'), MONTH('2005-00-00');
        -> 0, 0

Other functions expect complete dates and return NULL for incomplete dates. These include functions that perform date arithmetic or that map parts of dates to names. For example:

mysql> SELECT DATE_ADD('2006-05-00',INTERVAL 1 DAY);
        -> NULL
mysql> SELECT DAYNAME('2006-05-00');
        -> NULL

Several functions are more strict when passed a DATE() function value as their argument and reject incomplete dates with a day part of zero. These functions are affected: CONVERT_TZ(), DATE_ADD(), DATE_SUB(), DAYOFYEAR(), LAST_DAY() (permits a day part of zero), TIMESTAMPDIFF(), TO_DAYS(), TO_SECONDS(), WEEK(), WEEKDAY(), WEEKOFYEAR(), YEARWEEK().

Fractional seconds for TIME, DATETIME, and TIMESTAMP values are supported, with up to microsecond precision. Functions that take temporal arguments accept values with fractional seconds. Return values from temporal functions include fractional seconds as appropriate.

  • ADDDATE(date,INTERVAL expr unit), ADDDATE(expr,days)

    When invoked with the INTERVAL form of the second argument, ADDDATE() is a synonym for DATE_ADD(). The related function SUBDATE() is a synonym for DATE_SUB(). For information on the INTERVAL unit argument, see the discussion for DATE_ADD().

    mysql> SELECT DATE_ADD('2008-01-02', INTERVAL 31 DAY);
            -> '2008-02-02'
    mysql> SELECT ADDDATE('2008-01-02', INTERVAL 31 DAY);
            -> '2008-02-02'
    

    When invoked with the days form of the second argument, MySQL treats it as an integer number of days to be added to expr.

    mysql> SELECT ADDDATE('2008-01-02', 31);
            -> '2008-02-02'
    
  • ADDTIME(expr1,expr2)

    ADDTIME() adds expr2 to expr1 and returns the result. expr1 is a time or datetime expression, and expr2 is a time expression.

    mysql> SELECT ADDTIME('2007-12-31 23:59:59.999999', '1 1:1:1.000002');
            -> '2008-01-02 01:01:01.000001'
    mysql> SELECT ADDTIME('01:00:00.999999', '02:00:00.999998');
            -> '03:00:01.999997'
    
  • CONVERT_TZ(dt,from_tz,to_tz)

    CONVERT_TZ() converts a datetime value dt from the time zone given by from_tz to the time zone given by to_tz and returns the resulting value. Time zones are specified as described in Section 10.6, “MySQL Server Time Zone Support”. This function returns NULL if the arguments are invalid.

    If the value falls out of the supported range of the TIMESTAMP type when converted from from_tz to UTC, no conversion occurs. The TIMESTAMP range is described in Section 11.1.2, “Date and Time Type Overview”.

    mysql> SELECT CONVERT_TZ('2004-01-01 12:00:00','GMT','MET');
            -> '2004-01-01 13:00:00'
    mysql> SELECT CONVERT_TZ('2004-01-01 12:00:00','+00:00','+10:00');
            -> '2004-01-01 22:00:00'
    
    Note

    To use named time zones such as 'MET' or 'Europe/Moscow', the time zone tables must be properly set up. See Section 10.6, “MySQL Server Time Zone Support”, for instructions.

  • CURDATE()

    Returns the current date as a value in 'YYYY-MM-DD' or YYYYMMDD format, depending on whether the function is used in a string or numeric context.

    mysql> SELECT CURDATE();
            -> '2008-06-13'
    mysql> SELECT CURDATE() + 0;
            -> 20080613
    
  • CURRENT_DATE, CURRENT_DATE()

    CURRENT_DATE and CURRENT_DATE() are synonyms for CURDATE().

  • CURRENT_TIME, CURRENT_TIME([fsp])

    CURRENT_TIME and CURRENT_TIME() are synonyms for CURTIME().

  • CURRENT_TIMESTAMP, CURRENT_TIMESTAMP([fsp])

    CURRENT_TIMESTAMP and CURRENT_TIMESTAMP() are synonyms for NOW().

  • CURTIME([fsp])

    Returns the current time as a value in 'HH:MM:SS' or HHMMSS format, depending on whether the function is used in a string or numeric context. The value is expressed in the current time zone.

    If the fsp argument is given to specify a fractional seconds precision from 0 to 6, the return value includes a fractional seconds part of that many digits.

    mysql> SELECT CURTIME();
            -> '23:50:26'
    mysql> SELECT CURTIME() + 0;
            -> 235026.000000
    
  • DATE(expr)

    Extracts the date part of the date or datetime expression expr.

    mysql> SELECT DATE('2003-12-31 01:02:03');
            -> '2003-12-31'
    
  • DATEDIFF(expr1,expr2)

    DATEDIFF() returns expr1expr2 expressed as a value in days from one date to the other. expr1 and expr2 are date or date-and-time expressions. Only the date parts of the values are used in the calculation.

    mysql> SELECT DATEDIFF('2007-12-31 23:59:59','2007-12-30');
            -> 1
    mysql> SELECT DATEDIFF('2010-11-30 23:59:59','2010-12-31');
            -> -31
    
  • DATE_ADD(date,INTERVAL expr unit), DATE_SUB(date,INTERVAL expr unit)

    These functions perform date arithmetic. The date argument specifies the starting date or datetime value. expr is an expression specifying the interval value to be added or subtracted from the starting date. expr is a string; it may start with a - for negative intervals. unit is a keyword indicating the units in which the expression should be interpreted.

    The INTERVAL keyword and the unit specifier are not case sensitive.

    The following table shows the expected form of the expr argument for each unit value.

    unit ValueExpected expr Format
    MICROSECONDMICROSECONDS
    SECONDSECONDS
    MINUTEMINUTES
    HOURHOURS
    DAYDAYS
    WEEKWEEKS
    MONTHMONTHS
    QUARTERQUARTERS
    YEARYEARS
    SECOND_MICROSECOND'SECONDS.MICROSECONDS'
    MINUTE_MICROSECOND'MINUTES:SECONDS.MICROSECONDS'
    MINUTE_SECOND'MINUTES:SECONDS'
    HOUR_MICROSECOND'HOURS:MINUTES:SECONDS.MICROSECONDS'
    HOUR_SECOND'HOURS:MINUTES:SECONDS'
    HOUR_MINUTE'HOURS:MINUTES'
    DAY_MICROSECOND'DAYS HOURS:MINUTES:SECONDS.MICROSECONDS'
    DAY_SECOND'DAYS HOURS:MINUTES:SECONDS'
    DAY_MINUTE'DAYS HOURS:MINUTES'
    DAY_HOUR'DAYS HOURS'
    YEAR_MONTH'YEARS-MONTHS'

    The return value depends on the arguments:

    • DATETIME if the first argument is a DATETIME (or TIMESTAMP) value, or if the first argument is a DATE and the unit value uses HOURS, MINUTES, or SECONDS.

    • String otherwise.

    To ensure that the result is DATETIME, you can use CAST() to convert the first argument to DATETIME.

    MySQL permits any punctuation delimiter in the expr format. Those shown in the table are the suggested delimiters. If the date argument is a DATE value and your calculations involve only YEAR, MONTH, and DAY parts (that is, no time parts), the result is a DATE value. Otherwise, the result is a DATETIME value.

    Date arithmetic also can be performed using INTERVAL together with the + or - operator:

    date + INTERVAL expr unit
    date - INTERVAL expr unit
    

    INTERVAL expr unit is permitted on either side of the + operator if the expression on the other side is a date or datetime value. For the - operator, INTERVAL expr unit is permitted only on the right side, because it makes no sense to subtract a date or datetime value from an interval.

    mysql> SELECT '2008-12-31 23:59:59' + INTERVAL 1 SECOND;
            -> '2009-01-01 00:00:00'
    mysql> SELECT INTERVAL 1 DAY + '2008-12-31';
            -> '2009-01-01'
    mysql> SELECT '2005-01-01' - INTERVAL 1 SECOND;
            -> '2004-12-31 23:59:59'
    mysql> SELECT DATE_ADD('2000-12-31 23:59:59',
        ->                 INTERVAL 1 SECOND);
            -> '2001-01-01 00:00:00'
    mysql> SELECT DATE_ADD('2010-12-31 23:59:59',
        ->                 INTERVAL 1 DAY);
            -> '2011-01-01 23:59:59'
    mysql> SELECT DATE_ADD('2100-12-31 23:59:59',
        ->                 INTERVAL '1:1' MINUTE_SECOND);
            -> '2101-01-01 00:01:00'
    mysql> SELECT DATE_SUB('2005-01-01 00:00:00',
        ->                 INTERVAL '1 1:1:1' DAY_SECOND);
            -> '2004-12-30 22:58:59'
    mysql> SELECT DATE_ADD('1900-01-01 00:00:00',
        ->                 INTERVAL '-1 10' DAY_HOUR);
            -> '1899-12-30 14:00:00'
    mysql> SELECT DATE_SUB('1998-01-02', INTERVAL 31 DAY);
            -> '1997-12-02'
    mysql> SELECT DATE_ADD('1992-12-31 23:59:59.000002',
        ->            INTERVAL '1.999999' SECOND_MICROSECOND);
            -> '1993-01-01 00:00:01.000001'
    

    If you specify an interval value that is too short (does not include all the interval parts that would be expected from the unit keyword), MySQL assumes that you have left out the leftmost parts of the interval value. For example, if you specify a unit of DAY_SECOND, the value of expr is expected to have days, hours, minutes, and seconds parts. If you specify a value like '1:10', MySQL assumes that the days and hours parts are missing and the value represents minutes and seconds. In other words, '1:10' DAY_SECOND is interpreted in such a way that it is equivalent to '1:10' MINUTE_SECOND. This is analogous to the way that MySQL interprets TIME values as representing elapsed time rather than as a time of day.

    Because expr is treated as a string, be careful if you specify a nonstring value with INTERVAL. For example, with an interval specifier of HOUR_MINUTE, 6/4 evaluates to 1.5000 and is treated as 1 hour, 5000 minutes:

    mysql> SELECT 6/4;
            -> 1.5000
    mysql> SELECT DATE_ADD('2009-01-01', INTERVAL 6/4 HOUR_MINUTE);
            -> '2009-01-04 12:20:00'
    

    To ensure interpretation of the interval value as you expect, a CAST() operation may be used. To treat 6/4 as 1 hour, 5 minutes, cast it to a DECIMAL value with a single fractional digit:

    mysql> SELECT CAST(6/4 AS DECIMAL(3,1));
            -> 1.5
    mysql> SELECT DATE_ADD('1970-01-01 12:00:00',
        ->                 INTERVAL CAST(6/4 AS DECIMAL(3,1)) HOUR_MINUTE);
            -> '1970-01-01 13:05:00'
    

    If you add to or subtract from a date value something that contains a time part, the result is automatically converted to a datetime value:

    mysql> SELECT DATE_ADD('2013-01-01', INTERVAL 1 DAY);
            -> '2013-01-02'
    mysql> SELECT DATE_ADD('2013-01-01', INTERVAL 1 HOUR);
            -> '2013-01-01 01:00:00'
    

    If you add MONTH, YEAR_MONTH, or YEAR and the resulting date has a day that is larger than the maximum day for the new month, the day is adjusted to the maximum days in the new month:

    mysql> SELECT DATE_ADD('2009-01-30', INTERVAL 1 MONTH);
            -> '2009-02-28'
    

    Date arithmetic operations require complete dates and do not work with incomplete dates such as '2006-07-00' or badly malformed dates:

    mysql> SELECT DATE_ADD('2006-07-00', INTERVAL 1 DAY);
            -> NULL
    mysql> SELECT '2005-03-32' + INTERVAL 1 MONTH;
            -> NULL
    
  • DATE_FORMAT(date,format)

    Formats the date value according to the format string.

    The following specifiers may be used in the format string. The % character is required before format specifier characters.

    SpecifierDescription
    %aAbbreviated weekday name (Sun..Sat)
    %bAbbreviated month name (Jan..Dec)
    %cMonth, numeric (0..12)
    %DDay of the month with English suffix (0th, 1st, 2nd, 3rd, …)
    %dDay of the month, numeric (00..31)
    %eDay of the month, numeric (0..31)
    %fMicroseconds (000000..999999)
    %HHour (00..23)
    %hHour (01..12)
    %IHour (01..12)
    %iMinutes, numeric (00..59)
    %jDay of year (001..366)
    %kHour (0..23)
    %lHour (1..12)
    %MMonth name (January..December)
    %mMonth, numeric (00..12)
    %pAM or PM
    %rTime, 12-hour (hh:mm:ss followed by AM or PM)
    %SSeconds (00..59)
    %sSeconds (00..59)
    %TTime, 24-hour (hh:mm:ss)
    %UWeek (00..53), where Sunday is the first day of the week
    %uWeek (00..53), where Monday is the first day of the week
    %VWeek (01..53), where Sunday is the first day of the week; used with %X
    %vWeek (01..53), where Monday is the first day of the week; used with %x
    %WWeekday name (Sunday..Saturday)
    %wDay of the week (0=Sunday..6=Saturday)
    %XYear for the week where Sunday is the first day of the week, numeric, four digits; used with %V
    %xYear for the week, where Monday is the first day of the week, numeric, four digits; used with %v
    %YYear, numeric, four digits
    %yYear, numeric (two digits)
    %%A literal % character
    %xx, for any x not listed above

    Ranges for the month and day specifiers begin with zero due to the fact that MySQL permits the storing of incomplete dates such as '2014-00-00'.

    The language used for day and month names and abbreviations is controlled by the value of the lc_time_names system variable (Section 10.7, “MySQL Server Locale Support”).

    DATE_FORMAT() returns a string with a character set and collation given by character_set_connection and collation_connection so that it can return month and weekday names containing non-ASCII characters.

    mysql> SELECT DATE_FORMAT('2009-10-04 22:23:00', '%W %M %Y');
            -> 'Sunday October 2009'
    mysql> SELECT DATE_FORMAT('2007-10-04 22:23:00', '%H:%i:%s');
            -> '22:23:00'
    mysql> SELECT DATE_FORMAT('1900-10-04 22:23:00',
        ->                 '%D %y %a %d %m %b %j');
            -> '4th 00 Thu 04 10 Oct 277'
    mysql> SELECT DATE_FORMAT('1997-10-04 22:23:00',
        ->                 '%H %k %I %r %T %S %w');
            -> '22 22 10 10:23:00 PM 22:23:00 00 6'
    mysql> SELECT DATE_FORMAT('1999-01-01', '%X %V');
            -> '1998 52'
    mysql> SELECT DATE_FORMAT('2006-06-00', '%d');
            -> '00'
    
  • DATE_SUB(date,INTERVAL expr unit)

    See the description for DATE_ADD().

  • DAY(date)

    DAY() is a synonym for DAYOFMONTH().

  • DAYNAME(date)

    Returns the name of the weekday for date. The language used for the name is controlled by the value of the lc_time_names system variable (Section 10.7, “MySQL Server Locale Support”).

    mysql> SELECT DAYNAME('2007-02-03');
            -> 'Saturday'
    
  • DAYOFMONTH(date)

    Returns the day of the month for date, in the range 1 to 31, or 0 for dates such as '0000-00-00' or '2008-00-00' that have a zero day part.

    mysql> SELECT DAYOFMONTH('2007-02-03');
            -> 3
    
  • DAYOFWEEK(date)

    Returns the weekday index for date (1 = Sunday, 2 = Monday, …, 7 = Saturday). These index values correspond to the ODBC standard.

    mysql> SELECT DAYOFWEEK('2007-02-03');
            -> 7
    
  • DAYOFYEAR(date)

    Returns the day of the year for date, in the range 1 to 366.

    mysql> SELECT DAYOFYEAR('2007-02-03');
            -> 34
    
  • EXTRACT(unit FROM date)

    The EXTRACT() function uses the same kinds of unit specifiers as DATE_ADD() or DATE_SUB(), but extracts parts from the date rather than performing date arithmetic.

    mysql> SELECT EXTRACT(YEAR FROM '2009-07-02');
           -> 2009
    mysql> SELECT EXTRACT(YEAR_MONTH FROM '2009-07-02 01:02:03');
           -> 200907
    mysql> SELECT EXTRACT(DAY_MINUTE FROM '2009-07-02 01:02:03');
           -> 20102
    mysql> SELECT EXTRACT(MICROSECOND
        ->                FROM '2003-01-02 10:30:00.000123');
            -> 123
    
  • FROM_DAYS(N)

    Given a day number N, returns a DATE value.

    mysql> SELECT FROM_DAYS(730669);
            -> '2007-07-03'
    

    Use FROM_DAYS() with caution on old dates. It is not intended for use with values that precede the advent of the Gregorian calendar (1582). See Section 12.8, “What Calendar Is Used By MySQL?”.

  • FROM_UNIXTIME(unix_timestamp), FROM_UNIXTIME(unix_timestamp,format)

    Returns a representation of the unix_timestamp argument as a value in 'YYYY-MM-DD HH:MM:SS' or YYYYMMDDHHMMSS format, depending on whether the function is used in a string or numeric context. The value is expressed in the current time zone. unix_timestamp is an internal timestamp value such as is produced by the UNIX_TIMESTAMP() function.

    If format is given, the result is formatted according to the format string, which is used the same way as listed in the entry for the DATE_FORMAT() function.

    mysql> SELECT FROM_UNIXTIME(1196440219);
            -> '2007-11-30 10:30:19'
    mysql> SELECT FROM_UNIXTIME(1196440219) + 0;
            -> 20071130103019.000000
    mysql> SELECT FROM_UNIXTIME(UNIX_TIMESTAMP(),
        ->                      '%Y %D %M %h:%i:%s %x');
            -> '2007 30th November 10:30:59 2007'
    

    Note: If you use UNIX_TIMESTAMP() and FROM_UNIXTIME() to convert between TIMESTAMP values and Unix timestamp values, the conversion is lossy because the mapping is not one-to-one in both directions. For details, see the description of the UNIX_TIMESTAMP() function.

  • GET_FORMAT({DATE|TIME|DATETIME}, {'EUR'|'USA'|'JIS'|'ISO'|'INTERNAL'})

    Returns a format string. This function is useful in combination with the DATE_FORMAT() and the STR_TO_DATE() functions.

    The possible values for the first and second arguments result in several possible format strings (for the specifiers used, see the table in the DATE_FORMAT() function description). ISO format refers to ISO 9075, not ISO 8601.

    TIMESTAMP can also be used as the first argument to GET_FORMAT(), in which case the function returns the same values as for DATETIME.

    mysql> SELECT DATE_FORMAT('2003-10-03',GET_FORMAT(DATE,'EUR'));
            -> '03.10.2003'
    mysql> SELECT STR_TO_DATE('10.31.2003',GET_FORMAT(DATE,'USA'));
            -> '2003-10-31'
    
  • HOUR(time)

    Returns the hour for time. The range of the return value is 0 to 23 for time-of-day values. However, the range of TIME values actually is much larger, so HOUR can return values greater than 23.

    mysql> SELECT HOUR('10:05:03');
            -> 10
    mysql> SELECT HOUR('272:59:59');
            -> 272
    
  • LAST_DAY(date)

    Takes a date or datetime value and returns the corresponding value for the last day of the month. Returns NULL if the argument is invalid.

    mysql> SELECT LAST_DAY('2003-02-05');
            -> '2003-02-28'
    mysql> SELECT LAST_DAY('2004-02-05');
            -> '2004-02-29'
    mysql> SELECT LAST_DAY('2004-01-01 01:01:01');
            -> '2004-01-31'
    mysql> SELECT LAST_DAY('2003-03-32');
            -> NULL
    
  • LOCALTIME, LOCALTIME([fsp])

    LOCALTIME and LOCALTIME() are synonyms for NOW().

  • LOCALTIMESTAMP, LOCALTIMESTAMP([fsp])

    LOCALTIMESTAMP and LOCALTIMESTAMP() are synonyms for NOW().

  • MAKEDATE(year,dayofyear)

    Returns a date, given year and day-of-year values. dayofyear must be greater than 0 or the result is NULL.

    mysql> SELECT MAKEDATE(2011,31), MAKEDATE(2011,32);
            -> '2011-01-31', '2011-02-01'
    mysql> SELECT MAKEDATE(2011,365), MAKEDATE(2014,365);
            -> '2011-12-31', '2014-12-31'
    mysql> SELECT MAKEDATE(2011,0);
            -> NULL
    
  • MAKETIME(hour,minute,second)

    Returns a time value calculated from the hour, minute, and second arguments.

    The second argument can have a fractional part.

    mysql> SELECT MAKETIME(12,15,30);
            -> '12:15:30'
    
  • MICROSECOND(expr)

    Returns the microseconds from the time or datetime expression expr as a number in the range from 0 to 999999.

    mysql> SELECT MICROSECOND('12:00:00.123456');
            -> 123456
    mysql> SELECT MICROSECOND('2009-12-31 23:59:59.000010');
            -> 10
    
  • MINUTE(time)

    Returns the minute for time, in the range 0 to 59.

    mysql> SELECT MINUTE('2008-02-03 10:05:03');
            -> 5
    
  • MONTH(date)

    Returns the month for date, in the range 1 to 12 for January to December, or 0 for dates such as '0000-00-00' or '2008-00-00' that have a zero month part.

    mysql> SELECT MONTH('2008-02-03');
            -> 2
    
  • MONTHNAME(date)

    Returns the full name of the month for date. The language used for the name is controlled by the value of the lc_time_names system variable (Section 10.7, “MySQL Server Locale Support”).

    mysql> SELECT MONTHNAME('2008-02-03');
            -> 'February'
    
  • NOW([fsp])

    Returns the current date and time as a value in 'YYYY-MM-DD HH:MM:SS' or YYYYMMDDHHMMSS format, depending on whether the function is used in a string or numeric context. The value is expressed in the current time zone.

    If the fsp argument is given to specify a fractional seconds precision from 0 to 6, the return value includes a fractional seconds part of that many digits.

    mysql> SELECT NOW();
            -> '2007-12-15 23:50:26'
    mysql> SELECT NOW() + 0;
            -> 20071215235026.000000
    

    NOW() returns a constant time that indicates the time at which the statement began to execute. (Within a stored function or trigger, NOW() returns the time at which the function or triggering statement began to execute.) This differs from the behavior for SYSDATE(), which returns the exact time at which it executes.

    mysql> SELECT NOW(), SLEEP(2), NOW();
    +---------------------+----------+---------------------+
    | NOW()               | SLEEP(2) | NOW()               |
    +---------------------+----------+---------------------+
    | 2006-04-12 13:47:36 |        0 | 2006-04-12 13:47:36 |
    +---------------------+----------+---------------------+
    
    mysql> SELECT SYSDATE(), SLEEP(2), SYSDATE();
    +---------------------+----------+---------------------+
    | SYSDATE()           | SLEEP(2) | SYSDATE()           |
    +---------------------+----------+---------------------+
    | 2006-04-12 13:47:44 |        0 | 2006-04-12 13:47:46 |
    +---------------------+----------+---------------------+
    

    In addition, the SET TIMESTAMP statement affects the value returned by NOW() but not by SYSDATE(). This means that timestamp settings in the binary log have no effect on invocations of SYSDATE(). Setting the timestamp to a nonzero value causes each subsequent invocation of NOW() to return that value. Setting the timestamp to zero cancels this effect so that NOW() once again returns the current date and time.

    See the description for SYSDATE() for additional information about the differences between the two functions.

  • PERIOD_ADD(P,N)

    Adds N months to period P (in the format YYMM or YYYYMM). Returns a value in the format YYYYMM. Note that the period argument P is not a date value.

    mysql> SELECT PERIOD_ADD(200801,2);
            -> 200803
    
  • PERIOD_DIFF(P1,P2)

    Returns the number of months between periods P1 and P2. P1 and P2 should be in the format YYMM or YYYYMM. Note that the period arguments P1 and P2 are not date values.

    mysql> SELECT PERIOD_DIFF(200802,200703);
            -> 11
    
  • QUARTER(date)

    Returns the quarter of the year for date, in the range 1 to 4.

    mysql> SELECT QUARTER('2008-04-01');
            -> 2
    
  • SECOND(time)

    Returns the second for time, in the range 0 to 59.

    mysql> SELECT SECOND('10:05:03');
            -> 3
    
  • SEC_TO_TIME(seconds)

    Returns the seconds argument, converted to hours, minutes, and seconds, as a TIME value. The range of the result is constrained to that of the TIME data type. A warning occurs if the argument corresponds to a value outside that range.

    mysql> SELECT SEC_TO_TIME(2378);
            -> '00:39:38'
    mysql> SELECT SEC_TO_TIME(2378) + 0;
            -> 3938
    
  • STR_TO_DATE(str,format)

    This is the inverse of the DATE_FORMAT() function. It takes a string str and a format string format. STR_TO_DATE() returns a DATETIME value if the format string contains both date and time parts, or a DATE or TIME value if the string contains only date or time parts. If the date, time, or datetime value extracted from str is illegal, STR_TO_DATE() returns NULL and produces a warning.

    The server scans str attempting to match format to it. The format string can contain literal characters and format specifiers beginning with %. Literal characters in format must match literally in str. Format specifiers in format must match a date or time part in str. For the specifiers that can be used in format, see the DATE_FORMAT() function description.

    mysql> SELECT STR_TO_DATE('01,5,2013','%d,%m,%Y');
            -> '2013-05-01'
    mysql> SELECT STR_TO_DATE('May 1, 2013','%M %d,%Y');
            -> '2013-05-01'
    

    Scanning starts at the beginning of str and fails if format is found not to match. Extra characters at the end of str are ignored.

    mysql> SELECT STR_TO_DATE('a09:30:17','a%h:%i:%s');
            -> '09:30:17'
    mysql> SELECT STR_TO_DATE('a09:30:17','%h:%i:%s');
            -> NULL
    mysql> SELECT STR_TO_DATE('09:30:17a','%h:%i:%s');
            -> '09:30:17'
    

    Unspecified date or time parts have a value of 0, so incompletely specified values in str produce a result with some or all parts set to 0:

    mysql> SELECT STR_TO_DATE('abc','abc');
            -> '0000-00-00'
    mysql> SELECT STR_TO_DATE('9','%m');
            -> '0000-09-00'
    mysql> SELECT STR_TO_DATE('9','%s');
            -> '00:00:09'
    

    Range checking on the parts of date values is as described in Section 11.3.1, “The DATE, DATETIME, and TIMESTAMP Types”. This means, for example, that zero dates or dates with part values of 0 are permitted unless the SQL mode is set to disallow such values.

    mysql> SELECT STR_TO_DATE('00/00/0000', '%m/%d/%Y');
            -> '0000-00-00'
    mysql> SELECT STR_TO_DATE('04/31/2004', '%m/%d/%Y');
            -> '2004-04-31'
    
    Note

    You cannot use format "%X%V" to convert a year-week string to a date because the combination of a year and week does not uniquely identify a year and month if the week crosses a month boundary. To convert a year-week to a date, you should also specify the weekday:

    mysql> SELECT STR_TO_DATE('200442 Monday', '%X%V %W');
            -> '2004-10-18'
    
  • SUBDATE(date,INTERVAL expr unit), SUBDATE(expr,days)

    When invoked with the INTERVAL form of the second argument, SUBDATE() is a synonym for DATE_SUB(). For information on the INTERVAL unit argument, see the discussion for DATE_ADD().

    mysql> SELECT DATE_SUB('2008-01-02', INTERVAL 31 DAY);
            -> '2007-12-02'
    mysql> SELECT SUBDATE('2008-01-02', INTERVAL 31 DAY);
            -> '2007-12-02'
    

    The second form enables the use of an integer value for days. In such cases, it is interpreted as the number of days to be subtracted from the date or datetime expression expr.

    mysql> SELECT SUBDATE('2008-01-02 12:00:00', 31);
            -> '2007-12-02 12:00:00'
    
  • SUBTIME(expr1,expr2)

    SUBTIME() returns expr1expr2 expressed as a value in the same format as expr1. expr1 is a time or datetime expression, and expr2 is a time expression.

    mysql> SELECT SUBTIME('2007-12-31 23:59:59.999999','1 1:1:1.000002');
            -> '2007-12-30 22:58:58.999997'
    mysql> SELECT SUBTIME('01:00:00.999999', '02:00:00.999998');
            -> '-00:59:59.999999'
    
  • SYSDATE()

    Returns the current date and time as a value in 'YYYY-MM-DD HH:MM:SS' or YYYYMMDDHHMMSS format, depending on whether the function is used in a string or numeric context.

    SYSDATE() returns the time at which it executes. This differs from the behavior for NOW(), which returns a constant time that indicates the time at which the statement began to execute. (Within a stored function or trigger, NOW() returns the time at which the function or triggering statement began to execute.)

    mysql> SELECT NOW(), SLEEP(2), NOW();
    +---------------------+----------+---------------------+
    | NOW()               | SLEEP(2) | NOW()               |
    +---------------------+----------+---------------------+
    | 2006-04-12 13:47:36 |        0 | 2006-04-12 13:47:36 |
    +---------------------+----------+---------------------+
    
    mysql> SELECT SYSDATE(), SLEEP(2), SYSDATE();
    +---------------------+----------+---------------------+
    | SYSDATE()           | SLEEP(2) | SYSDATE()           |
    +---------------------+----------+---------------------+
    | 2006-04-12 13:47:44 |        0 | 2006-04-12 13:47:46 |
    +---------------------+----------+---------------------+
    

    In addition, the SET TIMESTAMP statement affects the value returned by NOW() but not by SYSDATE(). This means that timestamp settings in the binary log have no effect on invocations of SYSDATE().

    Because SYSDATE() can return different values even within the same statement, and is not affected by SET TIMESTAMP, it is nondeterministic and therefore unsafe for replication if statement-based binary logging is used. If that is a problem, you can use row-based logging.

    Alternatively, you can use the --sysdate-is-now option to cause SYSDATE() to be an alias for NOW(). This works if the option is used on both the master and the slave.

    The nondeterministic nature of SYSDATE() also means that indexes cannot be used for evaluating expressions that refer to it.

  • TIME(expr)

    Extracts the time part of the time or datetime expression expr and returns it as a string.

    This function is unsafe for statement-based replication. In MySQL 5.7, a warning is logged if you use this function when binlog_format is set to STATEMENT. (Bug #47995)

    mysql> SELECT TIME('2003-12-31 01:02:03');
            -> '01:02:03'
    mysql> SELECT TIME('2003-12-31 01:02:03.000123');
            -> '01:02:03.000123'
    
  • TIMEDIFF(expr1,expr2)

    TIMEDIFF() returns expr1expr2 expressed as a time value. expr1 and expr2 are time or date-and-time expressions, but both must be of the same type.

    The result returned by TIMEDIFF() is limited to the range allowed for TIME values. Alternatively, you can use either of the functions TIMESTAMPDIFF() and UNIX_TIMESTAMP(), both of which return integers.

    mysql> SELECT TIMEDIFF('2000:01:01 00:00:00',
        ->                 '2000:01:01 00:00:00.000001');
            -> '-00:00:00.000001'
    mysql> SELECT TIMEDIFF('2008-12-31 23:59:59.000001',
        ->                 '2008-12-30 01:01:01.000002');
            -> '46:58:57.999999'
    
  • TIMESTAMP(expr), TIMESTAMP(expr1,expr2)

    With a single argument, this function returns the date or datetime expression expr as a datetime value. With two arguments, it adds the time expression expr2 to the date or datetime expression expr1 and returns the result as a datetime value.

    mysql> SELECT TIMESTAMP('2003-12-31');
            -> '2003-12-31 00:00:00'
    mysql> SELECT TIMESTAMP('2003-12-31 12:00:00','12:00:00');
            -> '2004-01-01 00:00:00'
    
  • TIMESTAMPADD(unit,interval,datetime_expr)

    Adds the integer expression interval to the date or datetime expression datetime_expr. The unit for interval is given by the unit argument, which should be one of the following values: MICROSECOND (microseconds), SECOND, MINUTE, HOUR, DAY, WEEK, MONTH, QUARTER, or YEAR.

    The unit value may be specified using one of keywords as shown, or with a prefix of SQL_TSI_. For example, DAY and SQL_TSI_DAY both are legal.

    mysql> SELECT TIMESTAMPADD(MINUTE,1,'2003-01-02');
            -> '2003-01-02 00:01:00'
    mysql> SELECT TIMESTAMPADD(WEEK,1,'2003-01-02');
            -> '2003-01-09'
    
  • TIMESTAMPDIFF(unit,datetime_expr1,datetime_expr2)

    Returns datetime_expr2datetime_expr1, where datetime_expr1 and datetime_expr2 are date or datetime expressions. One expression may be a date and the other a datetime; a date value is treated as a datetime having the time part '00:00:00' where necessary. The unit for the result (an integer) is given by the unit argument. The legal values for unit are the same as those listed in the description of the TIMESTAMPADD() function.

    mysql> SELECT TIMESTAMPDIFF(MONTH,'2003-02-01','2003-05-01');
            -> 3
    mysql> SELECT TIMESTAMPDIFF(YEAR,'2002-05-01','2001-01-01');
            -> -1
    mysql> SELECT TIMESTAMPDIFF(MINUTE,'2003-02-01','2003-05-01 12:05:55');
            -> 128885
    
    Note

    The order of the date or datetime arguments for this function is the opposite of that used with the TIMESTAMP() function when invoked with 2 arguments.

  • TIME_FORMAT(time,format)

    This is used like the DATE_FORMAT() function, but the format string may contain format specifiers only for hours, minutes, seconds, and microseconds. Other specifiers produce a NULL value or 0.

    If the time value contains an hour part that is greater than 23, the %H and %k hour format specifiers produce a value larger than the usual range of 0..23. The other hour format specifiers produce the hour value modulo 12.

    mysql> SELECT TIME_FORMAT('100:00:00', '%H %k %h %I %l');
            -> '100 100 04 04 4'
    
  • TIME_TO_SEC(time)

    Returns the time argument, converted to seconds.

    mysql> SELECT TIME_TO_SEC('22:23:00');
            -> 80580
    mysql> SELECT TIME_TO_SEC('00:39:38');
            -> 2378
    
  • TO_DAYS(date)

    Given a date date, returns a day number (the number of days since year 0).

    mysql> SELECT TO_DAYS(950501);
            -> 728779
    mysql> SELECT TO_DAYS('2007-10-07');
            -> 733321
    

    TO_DAYS() is not intended for use with values that precede the advent of the Gregorian calendar (1582), because it does not take into account the days that were lost when the calendar was changed. For dates before 1582 (and possibly a later year in other locales), results from this function are not reliable. See Section 12.8, “What Calendar Is Used By MySQL?”, for details.

    Remember that MySQL converts two-digit year values in dates to four-digit form using the rules in Section 11.3, “Date and Time Types”. For example, '2008-10-07' and '08-10-07' are seen as identical dates:

    mysql> SELECT TO_DAYS('2008-10-07'), TO_DAYS('08-10-07');
            -> 733687, 733687
    

    In MySQL, the zero date is defined as '0000-00-00', even though this date is itself considered invalid. This means that, for '0000-00-00' and '0000-01-01', TO_DAYS() returns the values shown here:

    mysql> SELECT TO_DAYS('0000-00-00');
    +-----------------------+
    | to_days('0000-00-00') |
    +-----------------------+
    |                  NULL |
    +-----------------------+
    1 row in set, 1 warning (0.00 sec)
    
    mysql> SHOW WARNINGS;
    +---------+------+----------------------------------------+
    | Level   | Code | Message                                |
    +---------+------+----------------------------------------+
    | Warning | 1292 | Incorrect datetime value: '0000-00-00' |
    +---------+------+----------------------------------------+
    1 row in set (0.00 sec)
    
    
    mysql> SELECT TO_DAYS('0000-01-01');
    +-----------------------+
    | to_days('0000-01-01') |
    +-----------------------+
    |                     1 |
    +-----------------------+
    1 row in set (0.00 sec)
    

    This is true whether or not the ALLOW_INVALID_DATES SQL server mode is enabled.

  • TO_SECONDS(expr)

    Given a date or datetime expr, returns a the number of seconds since the year 0. If expr is not a valid date or datetime value, returns NULL.

    mysql> SELECT TO_SECONDS(950501);
            -> 62966505600
    mysql> SELECT TO_SECONDS('2009-11-29');
            -> 63426672000
    mysql> SELECT TO_SECONDS('2009-11-29 13:43:32');
            -> 63426721412
    mysql> SELECT TO_SECONDS( NOW() );
            -> 63426721458
    

    Like TO_DAYS(), TO_SECONDS() is not intended for use with values that precede the advent of the Gregorian calendar (1582), because it does not take into account the days that were lost when the calendar was changed. For dates before 1582 (and possibly a later year in other locales), results from this function are not reliable. See Section 12.8, “What Calendar Is Used By MySQL?”, for details.

    Like TO_DAYS(), TO_SECONDS(), converts two-digit year values in dates to four-digit form using the rules in Section 11.3, “Date and Time Types”.

    In MySQL, the zero date is defined as '0000-00-00', even though this date is itself considered invalid. This means that, for '0000-00-00' and '0000-01-01', TO_SECONDS() returns the values shown here:

    mysql> SELECT TO_SECONDS('0000-00-00');
    +--------------------------+
    | TO_SECONDS('0000-00-00') |
    +--------------------------+
    |                     NULL |
    +--------------------------+
    1 row in set, 1 warning (0.00 sec)
    
    mysql> SHOW WARNINGS;
    +---------+------+----------------------------------------+
    | Level   | Code | Message                                |
    +---------+------+----------------------------------------+
    | Warning | 1292 | Incorrect datetime value: '0000-00-00' |
    +---------+------+----------------------------------------+
    1 row in set (0.00 sec)
    
    
    mysql> SELECT TO_SECONDS('0000-01-01');
    +--------------------------+
    | TO_SECONDS('0000-01-01') |
    +--------------------------+
    |                    86400 |
    +--------------------------+
    1 row in set (0.00 sec)
    

    This is true whether or not the ALLOW_INVALID_DATES SQL server mode is enabled.

  • UNIX_TIMESTAMP(), UNIX_TIMESTAMP(date)

    If called with no argument, returns a Unix timestamp (seconds since '1970-01-01 00:00:00' UTC) as an unsigned integer. If UNIX_TIMESTAMP() is called with a date argument, it returns the value of the argument as seconds since '1970-01-01 00:00:00' UTC. date may be a DATE string, a DATETIME string, a TIMESTAMP, or a number in the format YYMMDD or YYYYMMDD. The server interprets date as a value in the current time zone and converts it to an internal value in UTC. Clients can set their time zone as described in Section 10.6, “MySQL Server Time Zone Support”.

    mysql> SELECT UNIX_TIMESTAMP();
            -> 1196440210
    mysql> SELECT UNIX_TIMESTAMP('2007-11-30 10:30:19');
            -> 1196440219
    

    When UNIX_TIMESTAMP() is used on a TIMESTAMP column, the function returns the internal timestamp value directly, with no implicit string-to-Unix-timestamp conversion. If you pass an out-of-range date to UNIX_TIMESTAMP(), it returns 0.

    Note: If you use UNIX_TIMESTAMP() and FROM_UNIXTIME() to convert between TIMESTAMP values and Unix timestamp values, the conversion is lossy because the mapping is not one-to-one in both directions. For example, due to conventions for local time zone changes, it is possible for two UNIX_TIMESTAMP() to map two TIMESTAMP values to the same Unix timestamp value. FROM_UNIXTIME() will map that value back to only one of the original TIMESTAMP values. Here is an example, using TIMESTAMP values in the CET time zone:

    mysql> SELECT UNIX_TIMESTAMP('2005-03-27 03:00:00');
    +---------------------------------------+
    | UNIX_TIMESTAMP('2005-03-27 03:00:00') |
    +---------------------------------------+
    |                            1111885200 |
    +---------------------------------------+
    mysql> SELECT UNIX_TIMESTAMP('2005-03-27 02:00:00');
    +---------------------------------------+
    | UNIX_TIMESTAMP('2005-03-27 02:00:00') |
    +---------------------------------------+
    |                            1111885200 |
    +---------------------------------------+
    mysql> SELECT FROM_UNIXTIME(1111885200);
    +---------------------------+
    | FROM_UNIXTIME(1111885200) |
    +---------------------------+
    | 2005-03-27 03:00:00       |
    +---------------------------+
    

    If you want to subtract UNIX_TIMESTAMP() columns, you might want to cast the result to signed integers. See Section 12.10, “Cast Functions and Operators”.

  • UTC_DATE, UTC_DATE()

    Returns the current UTC date as a value in 'YYYY-MM-DD' or YYYYMMDD format, depending on whether the function is used in a string or numeric context.

    mysql> SELECT UTC_DATE(), UTC_DATE() + 0;
            -> '2003-08-14', 20030814
    
  • UTC_TIME, UTC_TIME([fsp])

    Returns the current UTC time as a value in 'HH:MM:SS' or HHMMSS format, depending on whether the function is used in a string or numeric context.

    If the fsp argument is given to specify a fractional seconds precision from 0 to 6, the return value includes a fractional seconds part of that many digits.

    mysql> SELECT UTC_TIME(), UTC_TIME() + 0;
            -> '18:07:53', 180753.000000
    
  • UTC_TIMESTAMP, UTC_TIMESTAMP([fsp])

    Returns the current UTC date and time as a value in 'YYYY-MM-DD HH:MM:SS' or YYYYMMDDHHMMSS format, depending on whether the function is used in a string or numeric context.

    If the fsp argument is given to specify a fractional seconds precision from 0 to 6, the return value includes a fractional seconds part of that many digits.

    mysql> SELECT UTC_TIMESTAMP(), UTC_TIMESTAMP() + 0;
            -> '2003-08-14 18:08:04', 20030814180804.000000
    
  • WEEK(date[,mode])

    This function returns the week number for date. The two-argument form of WEEK() enables you to specify whether the week starts on Sunday or Monday and whether the return value should be in the range from 0 to 53 or from 1 to 53. If the mode argument is omitted, the value of the default_week_format system variable is used. See Section 5.1.4, “Server System Variables”.

    The following table describes how the mode argument works.

    ModeFirst day of weekRangeWeek 1 is the first week …
    0Sunday0-53with a Sunday in this year
    1Monday0-53with more than 3 days this year
    2Sunday1-53with a Sunday in this year
    3Monday1-53with more than 3 days this year
    4Sunday0-53with more than 3 days this year
    5Monday0-53with a Monday in this year
    6Sunday1-53with more than 3 days this year
    7Monday1-53with a Monday in this year
    mysql> SELECT WEEK('2008-02-20');
            -> 7
    mysql> SELECT WEEK('2008-02-20',0);
            -> 7
    mysql> SELECT WEEK('2008-02-20',1);
            -> 8
    mysql> SELECT WEEK('2008-12-31',1);
            -> 53
    

    Note that if a date falls in the last week of the previous year, MySQL returns 0 if you do not use 2, 3, 6, or 7 as the optional mode argument:

    mysql> SELECT YEAR('2000-01-01'), WEEK('2000-01-01',0);
            -> 2000, 0
    

    One might argue that MySQL should return 52 for the WEEK() function, because the given date actually occurs in the 52nd week of 1999. We decided to return 0 instead because we want the function to return the week number in the given year. This makes use of the WEEK() function reliable when combined with other functions that extract a date part from a date.

    If you would prefer the result to be evaluated with respect to the year that contains the first day of the week for the given date, use 0, 2, 5, or 7 as the optional mode argument.

    mysql> SELECT WEEK('2000-01-01',2);
            -> 52
    

    Alternatively, use the YEARWEEK() function:

    mysql> SELECT YEARWEEK('2000-01-01');
            -> 199952
    mysql> SELECT MID(YEARWEEK('2000-01-01'),5,2);
            -> '52'
    
  • WEEKDAY(date)

    Returns the weekday index for date (0 = Monday, 1 = Tuesday, … 6 = Sunday).

    mysql> SELECT WEEKDAY('2008-02-03 22:23:00');
            -> 6
    mysql> SELECT WEEKDAY('2007-11-06');
            -> 1
    
  • WEEKOFYEAR(date)

    Returns the calendar week of the date as a number in the range from 1 to 53. WEEKOFYEAR() is a compatibility function that is equivalent to WEEK(date,3).

    mysql> SELECT WEEKOFYEAR('2008-02-20');
            -> 8
    
  • YEAR(date)

    Returns the year for date, in the range 1000 to 9999, or 0 for the zero date.

    mysql> SELECT YEAR('1987-01-01');
            -> 1987
    
  • YEARWEEK(date), YEARWEEK(date,mode)

    Returns year and week for a date. The mode argument works exactly like the mode argument to WEEK(). The year in the result may be different from the year in the date argument for the first and the last week of the year.

    mysql> SELECT YEARWEEK('1987-01-01');
            -> 198653
    

    Note that the week number is different from what the WEEK() function would return (0) for optional arguments 0 or 1, as WEEK() then returns the week in the context of the given year.

12.8. What Calendar Is Used By MySQL?

MySQL uses what is known as a proleptic Gregorian calendar.

Every country that has switched from the Julian to the Gregorian calendar has had to discard at least ten days during the switch. To see how this works, consider the month of October 1582, when the first Julian-to-Gregorian switch occurred.

MondayTuesdayWednesdayThursdayFridaySaturdaySunday
1234151617
18192021222324
25262728293031

There are no dates between October 4 and October 15. This discontinuity is called the cutover. Any dates before the cutover are Julian, and any dates following the cutover are Gregorian. Dates during a cutover are nonexistent.

A calendar applied to dates when it was not actually in use is called proleptic. Thus, if we assume there was never a cutover and Gregorian rules always rule, we have a proleptic Gregorian calendar. This is what is used by MySQL, as is required by standard SQL. For this reason, dates prior to the cutover stored as MySQL DATE or DATETIME values must be adjusted to compensate for the difference. It is important to realize that the cutover did not occur at the same time in all countries, and that the later it happened, the more days were lost. For example, in Great Britain, it took place in 1752, when Wednesday September 2 was followed by Thursday September 14. Russia remained on the Julian calendar until 1918, losing 13 days in the process, and what is popularly referred to as its October Revolution occurred in November according to the Gregorian calendar.

12.9. Full-Text Search Functions

MATCH (col1,col2,...) AGAINST (expr [search_modifier])

search_modifier:
  {
       IN NATURAL LANGUAGE MODE
     | IN NATURAL LANGUAGE MODE WITH QUERY EXPANSION
     | IN BOOLEAN MODE
     | WITH QUERY EXPANSION
  }

MySQL has support for full-text indexing and searching:

  • A full-text index in MySQL is an index of type FULLTEXT.

  • Full-text indexes can be used only with InnoDB or MyISAM tables, and can be created only for CHAR, VARCHAR, or TEXT columns.

  • A FULLTEXT index definition can be given in the CREATE TABLE statement when a table is created, or added later using ALTER TABLE or CREATE INDEX.

  • For large data sets, it is much faster to load your data into a table that has no FULLTEXT index and then create the index after that, than to load data into a table that has an existing FULLTEXT index.

Full-text searching is performed using MATCH() ... AGAINST syntax. MATCH() takes a comma-separated list that names the columns to be searched. AGAINST takes a string to search for, and an optional modifier that indicates what type of search to perform. The search string must be a string value that is constant during query evaluation. This rules out, for example, a table column because that can differ for each row.

There are three types of full-text searches:

  • A natural language search interprets the search string as a phrase in natural human language (a phrase in free text). There are no special operators. The stopword list applies, controlled by innodb_ft_enable_stopword, innodb_ft_server_stopword_table, and innodb_ft_user_stopword_table for InnoDB search indexes, and ft_stopword_file for MyISAM ones.

  • Full-text searches are natural language searches if the IN NATURAL LANGUAGE MODE modifier is given or if no modifier is given. For more information, see Section 12.9.1, “Natural Language Full-Text Searches”.

  • A boolean search interprets the search string using the rules of a special query language. The string contains the words to search for. It can also contain operators that specify requirements such that a word must be present or absent in matching rows, or that it should be weighted higher or lower than usual. Certain common words (stopwords) are omitted from the search index and do not match if present in the search string. The IN BOOLEAN MODE modifier specifies a boolean search. For more information, see Section 12.9.2, “Boolean Full-Text Searches”.

  • A query expansion search is a modification of a natural language search. The search string is used to perform a natural language search. Then words from the most relevant rows returned by the search are added to the search string and the search is done again. The query returns the rows from the second search. The IN NATURAL LANGUAGE MODE WITH QUERY EXPANSION or WITH QUERY EXPANSION modifier specifies a query expansion search. For more information, see Section 12.9.3, “Full-Text Searches with Query Expansion”.

For information about FULLTEXT query performance, see Section 8.3.4, “Column Indexes”.

For more technical details about processing for InnoDB FULLTEXT indexes, see Section 14.2.3.13.3, “FULLTEXT Indexes”.

Constraints on full-text searching are listed in Section 12.9.5, “Full-Text Restrictions”.

The myisam_ftdump utility dumps the contents of a MyISAM full-text index. This may be helpful for debugging full-text queries. See Section 4.6.2, “myisam_ftdump — Display Full-Text Index information”.

12.9.1. Natural Language Full-Text Searches

By default or with the IN NATURAL LANGUAGE MODE modifier, the MATCH() function performs a natural language search for a string against a text collection. A collection is a set of one or more columns included in a FULLTEXT index. The search string is given as the argument to AGAINST(). For each row in the table, MATCH() returns a relevance value; that is, a similarity measure between the search string and the text in that row in the columns named in the MATCH() list.

mysql> CREATE TABLE articles (
      id INT UNSIGNED AUTO_INCREMENT NOT NULL PRIMARY KEY,
      title VARCHAR(200),
      body TEXT,
      FULLTEXT (title,body)
    ) ENGINE=InnoDB;
Query OK, 0 rows affected (0.00 sec)

mysql> INSERT INTO articles (title,body) VALUES
    ('MySQL Tutorial','DBMS stands for DataBase ...'),
    ('How To Use MySQL Well','After you went through a ...'),
    ('Optimizing MySQL','In this tutorial we will show ...'),
    ('1001 MySQL Tricks','1. Never run mysqld as root. 2. ...'),
    ('MySQL vs. YourSQL','In the following database comparison ...'),
    ('MySQL Security','When configured properly, MySQL ...');
Query OK, 6 rows affected (0.00 sec)
Records: 6  Duplicates: 0  Warnings: 0

mysql> SELECT * FROM articles
    WHERE MATCH (title,body)
    AGAINST ('database' IN NATURAL LANGUAGE MODE);
+----+-------------------+------------------------------------------+
| id | title             | body                                     |
+----+-------------------+------------------------------------------+
|  1 | MySQL Tutorial    | DBMS stands for DataBase ...             |
|  5 | MySQL vs. YourSQL | In the following database comparison ... |
+----+-------------------+------------------------------------------+
2 rows in set (0.00 sec)

By default, the search is performed in case-insensitive fashion. To perform a case-sensitive full-text search, use a binary collation for the indexed columns. For example, a column that uses the latin1 character set of can be assigned a collation of latin1_bin to make it case sensitive for full-text searches.

When MATCH() is used in a WHERE clause, as in the example shown earlier, the rows returned are automatically sorted with the highest relevance first. Relevance values are nonnegative floating-point numbers. Zero relevance means no similarity. Relevance is computed based on the number of words in the row, the number of unique words in that row, the total number of words in the collection, and the number of documents (rows) that contain a particular word.

To simply count matches, you could use a query like this:

mysql> SELECT COUNT(*) FROM articles
    WHERE MATCH (title,body)
    AGAINST ('database' IN NATURAL LANGUAGE MODE);
+----------+
| COUNT(*) |
+----------+
|        2 |
+----------+
1 row in set (0.00 sec)

You might find it quicker to rewrite the query as follows:

mysql> SELECT
    COUNT(IF(MATCH (title,body) AGAINST ('database' IN NATURAL LANGUAGE MODE), 1, NULL))
    AS count
    FROM articles;
+-------+
| count |
+-------+
|     2 |
+-------+
1 row in set (0.03 sec)

The first query does some extra work (sorting the results by relevance) but also can use an index lookup based on the WHERE clause. The index lookup might make the first query faster if the search matches few rows. The second query performs a full table scan, which might be faster than the index lookup if the search term was present in most rows.

For natural-language full-text searches, the columns named in the MATCH() function must be the same columns included in some FULLTEXT index in your table. For the preceding query, note that the columns named in the MATCH() function (title and body) are the same as those named in the definition of the article table's FULLTEXT index. To search the title or body separately, you would create separate FULLTEXT indexes for each column.

You can also perform a boolean search or a search with query expansion. These search types are described in Section 12.9.2, “Boolean Full-Text Searches”, and Section 12.9.3, “Full-Text Searches with Query Expansion”.

A full-text search that uses an index can name columns only from a single table in the MATCH() clause because an index cannot span multiple tables. For MyISAM tables, a boolean search can be done in the absence of an index (albeit more slowly), in which case it is possible to name columns from multiple tables.

The preceding example is a basic illustration that shows how to use the MATCH() function where rows are returned in order of decreasing relevance. The next example shows how to retrieve the relevance values explicitly. Returned rows are not ordered because the SELECT statement includes neither WHERE nor ORDER BY clauses:

mysql> SELECT id, MATCH (title,body)
    AGAINST ('Tutorial' IN NATURAL LANGUAGE MODE) AS score
    FROM articles;
+----+---------------------+
| id | score               |
+----+---------------------+
|  1 | 0.22764469683170319 |
|  2 |                   0 |
|  3 | 0.22764469683170319 |
|  4 |                   0 |
|  5 |                   0 |
|  6 |                   0 |
+----+---------------------+
6 rows in set (0.00 sec)

The following example is more complex. The query returns the relevance values and it also sorts the rows in order of decreasing relevance. To achieve this result, specify MATCH() twice: once in the SELECT list and once in the WHERE clause. This causes no additional overhead, because the MySQL optimizer notices that the two MATCH() calls are identical and invokes the full-text search code only once.

mysql> SELECT id, body, MATCH (title,body) AGAINST
    ('Security implications of running MySQL as root'
    IN NATURAL LANGUAGE MODE) AS score
    FROM articles WHERE MATCH (title,body) AGAINST
    ('Security implications of running MySQL as root'
    IN NATURAL LANGUAGE MODE);
+----+-----------------------+------------------------------------------+
| id | title                 | body                                     |
+----+-----------------------+------------------------------------------+
|  5 | MySQL vs. YourSQL     | In the following database comparison ... |
|  1 | MySQL Tutorial        | DBMS stands for DataBase ...             |
|  3 | Optimizing MySQL      | In this tutorial we will show ...        |
|  6 | MySQL Security        | When configured properly, MySQL ...      |
|  2 | How To Use MySQL Well | After you went through a ...             |
|  4 | 1001 MySQL Tricks     | 1. Never run mysqld as root. 2. ...      |
+----+-----------------------+------------------------------------------+
6 rows in set (0.00 sec)

The MySQL FULLTEXT implementation regards any sequence of true word characters (letters, digits, and underscores) as a word. That sequence may also contain apostrophes ('), but not more than one in a row. This means that aaa'bbb is regarded as one word, but aaa''bbb is regarded as two words. Apostrophes at the beginning or the end of a word are stripped by the FULLTEXT parser; 'aaa'bbb' would be parsed as aaa'bbb.

The FULLTEXT parser determines where words start and end by looking for certain delimiter characters; for example,   (space), , (comma), and . (period). If words are not separated by delimiters (as in, for example, Chinese), the FULLTEXT parser cannot determine where a word begins or ends. To be able to add words or other indexed terms in such languages to a FULLTEXT index, you must preprocess them so that they are separated by some arbitrary delimiter such as ".

In MySQL 5.7, it is possible to write a plugin that replaces the built-in full-text parser. For details, see Section 22.2, “The MySQL Plugin API”. For example parser plugin source code, see the plugin/fulltext directory of a MySQL source distribution.

Some words are ignored in full-text searches:

  • Any word that is too short is ignored. The default minimum length of words that are found by full-text searches is three characters for InnoDB search indexes, or four characters for MyISAM. You can control the cutoff by setting a configuration option before creating the index: innodb_ft_min_token_size configuration option for InnoDB search indexes, or ft_min_word_len for MyISAM.

  • Words in the stopword list are ignored. A stopword is a word such as the or some that is so common that it is considered to have zero semantic value. There is a built-in stopword list, but it can be overridden by a user-defined list. The stopword lists and related configuration options are different for InnoDB search indexes and MyISAM ones. Stopword processing is controlled by the configuration options innodb_ft_enable_stopword, innodb_ft_server_stopword_table, and innodb_ft_user_stopword_table for InnoDB search indexes, and ft_stopword_file for MyISAM ones.

The default stopword lists are shown in Section 12.9.4, “Full-Text Stopwords”. The default minimum word length and stopword list can be changed as described in Section 12.9.6, “Fine-Tuning MySQL Full-Text Search”.

Every correct word in the collection and in the query is weighted according to its significance in the collection or query. Thus, a word that is present in many documents has a lower weight, because it has lower semantic value in this particular collection. Conversely, if the word is rare, it receives a higher weight. The weights of the words are combined to compute the relevance of the row. This technique works best with large collections.

MyISAM Limitation

For very small tables, word distribution does not adequately reflect their semantic value, and this model may sometimes produce bizarre results for search indexes on MyISAM tables. For example, although the word MySQL is present in every row of the articles table shown earlier, a search for the word in a MyISAM search index produces no results:

mysql> SELECT * FROM articles
    WHERE MATCH (title,body)
    AGAINST ('MySQL' IN NATURAL LANGUAGE MODE);
Empty set (0.00 sec)

The search result is empty because the word MySQL is present in at least 50% of the rows, and so is effectively treated as a stopword. This filtering technique is more suitable for large data sets, where you might not want the result set to return every second row from a 1GB table, than for small data sets where it might cause poor results for popular terms.

The 50% threshold can surprise you when you first try full-text searching to see how it works, and makes InnoDB tables more suited to experimentation with full-text searches. If you create a MyISAM table and insert only one or two rows of text into it, every word in the text occurs in at least 50% of the rows. As a result, no search returns any results until the table contains more rows. Users who need to bypass the 50% limitation can build search indexes on InnoDB tables, or the boolean search mode explained in Section 12.9.2, “Boolean Full-Text Searches”.

12.9.2. Boolean Full-Text Searches

MySQL can perform boolean full-text searches using the IN BOOLEAN MODE modifier. With this modifier, certain characters have special meaning at the beginning or end of words in the search string. In the following query, the + and - operators indicate that a word must be present or absent, respectively, for a match to occur. Thus, the query retrieves all the rows that contain the word MySQL but that do not contain the word YourSQL:

mysql> SELECT * FROM articles WHERE MATCH (title,body)
    AGAINST ('+MySQL -YourSQL' IN BOOLEAN MODE);
+----+-----------------------+-------------------------------------+
| id | title                 | body                                |
+----+-----------------------+-------------------------------------+
|  1 | MySQL Tutorial        | DBMS stands for DataBase ...        |
|  2 | How To Use MySQL Well | After you went through a ...        |
|  3 | Optimizing MySQL      | In this tutorial we will show ...   |
|  4 | 1001 MySQL Tricks     | 1. Never run mysqld as root. 2. ... |
|  6 | MySQL Security        | When configured properly, MySQL ... |
+----+-----------------------+-------------------------------------+
Note

In implementing this feature, MySQL uses what is sometimes referred to as implied Boolean logic, in which

  • + stands for AND

  • - stands for NOT

  • [no operator] implies OR

Boolean full-text searches have these characteristics:

  • They do not use the 50% threshold that applies to MyISAM search indexes.

  • They do not automatically sort rows in order of decreasing relevance.

  • Boolean queries against a MyISAM search index can work even without a FULLTEXT index, although a search executed in this fashion would be quite slow. InnoDB tables require a FULLTEXT index on all columns of the MATCH() expression to perform boolean queries.

  • The minimum and maximum word length full-text parameters apply: innodb_ft_min_token_size and innodb_ft_max_token_size for InnoDB search indexes, and ft_min_word_len and ft_max_word_len for MyISAM ones.

  • The stopword list applies, controlled by innodb_ft_enable_stopword, innodb_ft_server_stopword_table, and innodb_ft_user_stopword_table for InnoDB search indexes, and ft_stopword_file for MyISAM ones.

  • InnoDB full-text search does not support the use of multiple operators on a single search word, as in this example: '++apple'. Use of multiple operators on a single search word returns a syntax error to standard out. MyISAM full-text search will successfully process the same search ignoring all operators except for the operator immediately adjacent to the search word.

The boolean full-text search capability supports the following operators:

  • +

    A leading plus sign indicates that this word must be present in each row that is returned.

  • -

    A leading minus sign indicates that this word must not be present in any of the rows that are returned.

    Note: The - operator acts only to exclude rows that are otherwise matched by other search terms. Thus, a boolean-mode search that contains only terms preceded by - returns an empty result. It does not return all rows except those containing any of the excluded terms.

  • (no operator)

    By default (when neither + nor - is specified), the word is optional, but the rows that contain it are rated higher. This mimics the behavior of MATCH() ... AGAINST() without the IN BOOLEAN MODE modifier.

  • @distance

    This operator works on InnoDB tables only. It tests whether two or more words all start within a specified distance from each other, measured in words. Specify the search words within a double-quoted string immediately before the @distance operator, for example, MATCH(col1) AGAINST('"word1 word2 word3" @8' IN BOOLEAN MODE)

  • > <

    These two operators are used to change a word's contribution to the relevance value that is assigned to a row. The > operator increases the contribution and the < operator decreases it. See the example following this list.

  • ( )

    Parentheses group words into subexpressions. Parenthesized groups can be nested.

  • ~

    A leading tilde acts as a negation operator, causing the word's contribution to the row's relevance to be negative. This is useful for marking noise words. A row containing such a word is rated lower than others, but is not excluded altogether, as it would be with the - operator.

  • *

    The asterisk serves as the truncation (or wildcard) operator. Unlike the other operators, it is appended to the word to be affected. Words match if they begin with the word preceding the * operator.

    If a word is specified with the truncation operator, it is not stripped from a boolean query, even if it is too short or a stopword. Whether a word is too short is determined from the innodb_ft_min_token_size setting for InnoDB tables, or ft_min_word_len for MyISAM tables. The wildcarded word is considered as a prefix that must be present at the start of one or more words. If the minimum word length is 4, a search for '+word +the*' could return fewer rows than a search for '+word +the', because the second query ignores the too-short search term the.

  • "

    A phrase that is enclosed within double quote (") characters matches only rows that contain the phrase literally, as it was typed. The full-text engine splits the phrase into words and performs a search in the FULLTEXT index for the words. Nonword characters need not be matched exactly: Phrase searching requires only that matches contain exactly the same words as the phrase and in the same order. For example, "test phrase" matches "test, phrase".

    If the phrase contains no words that are in the index, the result is empty. The words might not be in the index because of a combination of factors: if they do not exist in the text, are stopwords, or are shorter than the minimum length of indexed words.

The following examples demonstrate some search strings that use boolean full-text operators:

  • 'apple banana'

    Find rows that contain at least one of the two words.

  • '+apple +juice'

    Find rows that contain both words.

  • '+apple macintosh'

    Find rows that contain the word apple, but rank rows higher if they also contain macintosh.

  • '+apple -macintosh'

    Find rows that contain the word apple but not macintosh.

  • '+apple ~macintosh'

    Find rows that contain the word apple, but if the row also contains the word macintosh, rate it lower than if row does not. This is softer than a search for '+apple -macintosh', for which the presence of macintosh causes the row not to be returned at all.

  • '+apple +(>turnover <strudel)'

    Find rows that contain the words apple and turnover, or apple and strudel (in any order), but rank apple turnover higher than apple strudel.

  • 'apple*'

    Find rows that contain words such as apple, apples, applesauce, or applet.

  • '"some words"'

    Find rows that contain the exact phrase some words (for example, rows that contain some words of wisdom but not some noise words). Note that the " characters that enclose the phrase are operator characters that delimit the phrase. They are not the quotation marks that enclose the search string itself.

12.9.3. Full-Text Searches with Query Expansion

Full-text search supports query expansion (and in particular, its variant blind query expansion). This is generally useful when a search phrase is too short, which often means that the user is relying on implied knowledge that the full-text search engine lacks. For example, a user searching for database may really mean that MySQL, Oracle, DB2, and RDBMS all are phrases that should match databases and should be returned, too. This is implied knowledge.

Blind query expansion (also known as automatic relevance feedback) is enabled by adding WITH QUERY EXPANSION or IN NATURAL LANGUAGE MODE WITH QUERY EXPANSION following the search phrase. It works by performing the search twice, where the search phrase for the second search is the original search phrase concatenated with the few most highly relevant documents from the first search. Thus, if one of these documents contains the word databases and the word MySQL, the second search finds the documents that contain the word MySQL even if they do not contain the word database. The following example shows this difference:

mysql> SELECT * FROM articles
    WHERE MATCH (title,body)
    AGAINST ('database' IN NATURAL LANGUAGE MODE);
+----+-------------------+------------------------------------------+
| id | title             | body                                     |
+----+-------------------+------------------------------------------+
|  1 | MySQL Tutorial    | DBMS stands for DataBase ...             |
|  5 | MySQL vs. YourSQL | In the following database comparison ... |
+----+-------------------+------------------------------------------+
2 rows in set (0.00 sec)

mysql> SELECT * FROM articles
    WHERE MATCH (title,body)
    AGAINST ('database' WITH QUERY EXPANSION);
+----+-----------------------+------------------------------------------+
| id | title                 | body                                     |
+----+-----------------------+------------------------------------------+
|  5 | MySQL vs. YourSQL     | In the following database comparison ... |
|  1 | MySQL Tutorial        | DBMS stands for DataBase ...             |
|  3 | Optimizing MySQL      | In this tutorial we will show ...        |
|  6 | MySQL Security        | When configured properly, MySQL ...      |
|  2 | How To Use MySQL Well | After you went through a ...             |
|  4 | 1001 MySQL Tricks     | 1. Never run mysqld as root. 2. ...      |
+----+-----------------------+------------------------------------------+
6 rows in set (0.00 sec)

Another example could be searching for books by Georges Simenon about Maigret, when a user is not sure how to spell Maigret. A search for Megre and the reluctant witnesses finds only Maigret and the Reluctant Witnesses without query expansion. A search with query expansion finds all books with the word Maigret on the second pass.

Note

Because blind query expansion tends to increase noise significantly by returning nonrelevant documents, use it only when a search phrase is short.

12.9.4. Full-Text Stopwords

The stopword list is loaded and searched for full-text queries using the server character set and collation (the values of the character_set_server and collation_server system variables). False hits or misses might occur for stopword lookups if the stopword file or columns used for full-text indexing or searches have a character set or collation different from character_set_server or collation_server.

Case sensitivity of stopword lookups depends on the server collation. For example, lookups are case insensitive if the collation is latin1_swedish_ci, whereas lookups are case sensitive if the collation is latin1_general_cs or latin1_bin.

Stopwords for InnoDB Search Indexes

InnoDB has a relatively short list of default stopwords, because documents from technical, literary, and so on sources often use short words as keywords or in significant phrases. For example, you might search for to be or not to be and expect to get a sensible result, rather than having all those words ignored.

To see the list, query the table information_schema.innodb_ft_default_stopword. To define your own stopword list used for all InnoDB tables, define a table with the same structure as innodb_ft_default_stopword, fill it with the desired stopwords, and set the value of the innodb_ft_server_stopword_table option to a value of the form db_name/table_name before creating the search index. To create special stopword lists on a table-by-table basis, define other tables to hold these lists and specify the appropriate one in the innodb_ft_user_stopword_table option before creating the search index.

Stopwords for MyISAM Search Indexes

In MySQL 5.7, the stopword file is loaded and searched using latin1 if character_set_server is ucs2, utf16, utf16le, or utf32.

The following table shows the default list of stopwords for MyISAM search indexes. In a MySQL source distribution, you can find this list in the storage/myisam/ft_static.c file.

a'sableaboutaboveaccording
accordinglyacrossactuallyafterafterwards
againagainstain'tallallow
allowsalmostalonealongalready
alsoalthoughalwaysamamong
amongstanandanotherany
anybodyanyhowanyoneanythinganyway
anywaysanywhereapartappearappreciate
appropriatearearen'taroundas
asideaskaskingassociatedat
availableawayawfullybebecame
becausebecomebecomesbecomingbeen
beforebeforehandbehindbeingbelieve
belowbesidebesidesbestbetter
betweenbeyondbothbriefbut
byc'monc'scamecan
can'tcannotcantcausecauses
certaincertainlychangesclearlyco
comcomecomesconcerningconsequently
considerconsideringcontaincontainingcontains
correspondingcouldcouldn'tcoursecurrently
definitelydescribeddespitediddidn't
differentdodoesdoesn'tdoing
don'tdonedowndownwardsduring
eacheduegeighteither
elseelsewhereenoughentirelyespecially
etetceveneverevery
everybodyeveryoneeverythingeverywhereex
exactlyexampleexceptfarfew
fifthfirstfivefollowedfollowing
followsforformerformerlyforth
fourfromfurtherfurthermoreget
getsgettinggivengivesgo
goesgoinggonegotgotten
greetingshadhadn'thappenshardly
hashasn'thavehaven'thaving
hehe'shellohelphence
herherehere'shereafterhereby
hereinhereuponhersherselfhi
himhimselfhishitherhopefully
howhowbeithoweveri'di'll
i'mi'veieifignored
immediateininasmuchincindeed
indicateindicatedindicatesinnerinsofar
insteadintoinwardisisn't
itit'dit'llit'sits
itselfjustkeepkeepskept
knowknownknowslastlately
laterlatterlatterlyleastless
lestletlet'slikeliked
likelylittlelooklookinglooks
ltdmainlymanymaymaybe
memeanmeanwhilemerelymight
moremoreovermostmostlymuch
mustmymyselfnamenamely
ndnearnearlynecessaryneed
needsneitherneverneverthelessnew
nextninenonobodynon
nonenoonenornormallynot
nothingnovelnownowhereobviously
ofoffoftenohok
okayoldononceone
onesonlyontoorother
othersotherwiseoughtourours
ourselvesoutoutsideoveroverall
ownparticularparticularlyperperhaps
placedpleasepluspossiblepresumably
probablyprovidesquequiteqv
ratherrdrereallyreasonably
regardingregardlessregardsrelativelyrespectively
rightsaidsamesawsay
sayingsayssecondsecondlysee
seeingseemseemedseemingseems
seenselfselvessensiblesent
seriousseriouslysevenseveralshall
sheshouldshouldn'tsincesix
sosomesomebodysomehowsomeone
somethingsometimesometimessomewhatsomewhere
soonsorryspecifiedspecifyspecifying
stillsubsuchsupsure
t'staketakentelltends
ththanthankthanksthanx
thatthat'sthatsthetheir
theirsthemthemselvesthenthence
therethere'sthereaftertherebytherefore
thereintheresthereuponthesethey
they'dthey'llthey'rethey'vethink
thirdthisthoroughthoroughlythose
thoughthreethroughthroughoutthru
thustotogethertootook
towardtowardstriedtriestruly
trytryingtwicetwoun
underunfortunatelyunlessunlikelyuntil
untoupuponususe
usedusefulusesusingusually
valuevariousveryviaviz
vswantwantswaswasn't
waywewe'dwe'llwe're
we'vewelcomewellwentwere
weren'twhatwhat'swhateverwhen
whencewheneverwherewhere'swhereafter
whereaswherebywhereinwhereuponwherever
whetherwhichwhilewhitherwho
who'swhoeverwholewhomwhose
whywillwillingwishwith
withinwithoutwon'twonderwould
wouldn'tyesyetyouyou'd
you'llyou'reyou'veyouryours
yourselfyourselveszero  

12.9.5. Full-Text Restrictions

  • Full-text searches are supported for InnoDB and MyISAM tables only.

  • Full-text searches are not supported for partitioned tables. See Section 17.6, “Restrictions and Limitations on Partitioning”.

  • Full-text searches can be used with most multi-byte character sets. The exception is that for Unicode, the utf8 character set can be used, but not the ucs2 character set. Although FULLTEXT indexes on ucs2 columns cannot be used, you can perform IN BOOLEAN MODE searches on a ucs2 column that has no such index.

    The remarks for utf8 also apply to utf8mb4, and the remarks for ucs2 also apply to utf16, utf16le, and utf32.

  • Ideographic languages such as Chinese and Japanese do not have word delimiters. Therefore, the FULLTEXT parser cannot determine where words begin and end in these and other such languages. The implications of this and some workarounds for the problem are described in Section 12.9, “Full-Text Search Functions”.

  • Although the use of multiple character sets within a single table is supported, all columns in a FULLTEXT index must use the same character set and collation.

  • The MATCH() column list must match exactly the column list in some FULLTEXT index definition for the table, unless this MATCH() is IN BOOLEAN MODE on a MyISAM table. For MyISAM tables, boolean-mode searches can be done on nonindexed columns, although they are likely to be slow.

  • The argument to AGAINST() must be a string value that is constant during query evaluation. This rules out, for example, a table column because that can differ for each row.

  • Index hints are more limited for FULLTEXT searches than for non-FULLTEXT searches. See Section 13.2.9.3, “Index Hint Syntax”.

12.9.6. Fine-Tuning MySQL Full-Text Search

MySQL's full-text search capability has few user-tunable parameters. You can exert more control over full-text searching behavior if you have a MySQL source distribution because some changes require source code modifications. See Section 2.9, “Installing MySQL from Source”.

Note that full-text search is carefully tuned for effectiveness. Modifying the default behavior in most cases can actually decrease effectiveness. Do not alter the MySQL sources unless you know what you are doing.

Most full-text variables described in this section must be set at server startup time. A server restart is required to change them; they cannot be modified while the server is running.

Some variable changes require that you rebuild the FULLTEXT indexes in your tables. Instructions for doing so are given later in this section.

  • The minimum and maximum lengths of words to be indexed are defined by the innodb_ft_min_token_size and innodb_ft_max_token_size for InnoDB search indexes, and ft_min_word_len and ft_max_word_len for MyISAM ones. After changing any of these options, rebuild your FULLTEXT indexes for the change to take effect. For example, to make two-character words searchable, you could put the following lines in an option file:

    [mysqld]
    innodb_ft_min_token_size=2
    ft_min_word_len=2

    Then restart the server and rebuild your FULLTEXT indexes. For MyISAM tables, note particularly the remarks regarding myisamchk in the instructions following this list.

  • To override the default stopword list, set the ft_stopword_file system variable. (See Section 5.1.4, “Server System Variables”.) The variable value should be the path name of the file containing the stopword list, or the empty string to disable stopword filtering. The server looks for the file in the data directory unless an absolute path name is given to specify a different directory. After changing the value of this variable or the contents of the stopword file, restart the server and rebuild your FULLTEXT indexes.

    The stopword list is free-form, separating stopwords with any nonalphanumeric character such as newline, space, or comma. Exceptions are the underscore character (_) and a single apostrophe (') which are treated as part of a word. The character set of the stopword list is the server's default character set; see Section 10.1.3.1, “Server Character Set and Collation”.

  • The 50% threshold for natural language searches is determined by the particular weighting scheme chosen. To disable it, look for the following line in storage/myisam/ftdefs.h:

    #define GWS_IN_USE GWS_PROB

    Change that line to this:

    #define GWS_IN_USE GWS_FREQ

    Then recompile MySQL. There is no need to rebuild the indexes in this case.

    Note

    By making this change, you severely decrease MySQL's ability to provide adequate relevance values for the MATCH() function. If you really need to search for such common words, it would be better to search using IN BOOLEAN MODE instead, which does not observe the 50% threshold.

  • To change the operators used for boolean full-text searches on MyISAM tables, set the ft_boolean_syntax system variable. (InnoDB does not have an equivalent setting.) This variable can be changed while the server is running, but you must have the SUPER privilege to do so. No rebuilding of indexes is necessary in this case. See Section 5.1.4, “Server System Variables”, which describes the rules governing how to set this variable.

  • You can change the set of characters that are considered word characters in several ways, as described in the following list. After making the modification, rebuild the indexes for each table that contains any FULLTEXT indexes. Suppose that you want to treat the hyphen character ('-') as a word character. Use one of these methods:

    • Modify the MySQL source: In storage/myisam/ftdefs.h, see the true_word_char() and misc_word_char() macros. Add '-' to one of those macros and recompile MySQL.

    • Modify a character set file: This requires no recompilation. The true_word_char() macro uses a character type table to distinguish letters and numbers from other characters. . You can edit the contents of the <ctype><map> array in one of the character set XML files to specify that '-' is a letter. Then use the given character set for your FULLTEXT indexes. For information about the <ctype><map> array format, see Section 10.3.1, “Character Definition Arrays”.

    • Add a new collation for the character set used by the indexed columns, and alter the columns to use that collation. For general information about adding collations, see Section 10.4, “Adding a Collation to a Character Set”. For an example specific to full-text indexing, see Section 12.9.7, “Adding a Collation for Full-Text Indexing”.

If you modify full-text variables that affect indexing (innodb_ft_min_token_size, innodb_ft_max_token_size, innodb_ft_server_stopword_table, innodb_ft_user_stopword_table, innodb_ft_enable_stopword, ft_min_word_len, ft_max_word_len, or ft_stopword_file), or if you change the stopword file itself, you must rebuild your FULLTEXT indexes after making the changes and restarting the server.

To rebuild the FULLTEXT indexes for an InnoDB table, use ALTER TABLE with the DROP INDEX and ADD INDEX options to drop and re-create each index.

To rebuild the FULLTEXT indexes for a MyISAM table, it is sufficient to do a QUICK repair operation:

mysql> REPAIR TABLE tbl_name QUICK;

Alternatively, use ALTER TABLE as just described. In some cases, this may be faster than a repair operation.

Each table that contains any FULLTEXT index must be repaired as just shown. Otherwise, queries for the table may yield incorrect results, and modifications to the table will cause the server to see the table as corrupt and in need of repair.

Note that if you use myisamchk to perform an operation that modifies MyISAM table indexes (such as repair or analyze), the FULLTEXT indexes are rebuilt using the default full-text parameter values for minimum word length, maximum word length, and stopword file unless you specify otherwise. This can result in queries failing.

The problem occurs because these parameters are known only by the server. They are not stored in MyISAM index files. To avoid the problem if you have modified the minimum or maximum word length or stopword file values used by the server, specify the same ft_min_word_len, ft_max_word_len, and ft_stopword_file values for myisamchk that you use for mysqld. For example, if you have set the minimum word length to 3, you can repair a table with myisamchk like this:

shell> myisamchk --recover --ft_min_word_len=3 tbl_name.MYI

To ensure that myisamchk and the server use the same values for full-text parameters, place each one in both the [mysqld] and [myisamchk] sections of an option file:

[mysqld]
ft_min_word_len=3

[myisamchk]
ft_min_word_len=3

An alternative to using myisamchk for MyISAM table index modification is to use the REPAIR TABLE, ANALYZE TABLE, OPTIMIZE TABLE, or ALTER TABLE statements. These statements are performed by the server, which knows the proper full-text parameter values to use.

12.9.7. Adding a Collation for Full-Text Indexing

This section describes how to add a new collation for full-text searches. The sample collation is like latin1_swedish_ci but treats the '-' character as a letter rather than as a punctuation character so that it can be indexed as a word character. General information about adding collations is given in Section 10.4, “Adding a Collation to a Character Set”; it is assumed that you have read it and are familiar with the files involved.

To add a collation for full-text indexing, use this procedure:

  1. Add a collation to the Index.xml file. The collation ID must be unused, so choose a value different from 1000 if that ID is already taken on your system.

    <charset name="latin1">
    ...
    <collation name="latin1_fulltext_ci" id="1000"/>
    </charset>
  2. Declare the sort order for the collation in the latin1.xml file. In this case, the order can be copied from latin1_swedish_ci:

    <collation name="latin1_fulltext_ci">
    <map>
    00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F
    10 11 12 13 14 15 16 17 18 19 1A 1B 1C 1D 1E 1F
    20 21 22 23 24 25 26 27 28 29 2A 2B 2C 2D 2E 2F
    30 31 32 33 34 35 36 37 38 39 3A 3B 3C 3D 3E 3F
    40 41 42 43 44 45 46 47 48 49 4A 4B 4C 4D 4E 4F
    50 51 52 53 54 55 56 57 58 59 5A 5B 5C 5D 5E 5F
    60 41 42 43 44 45 46 47 48 49 4A 4B 4C 4D 4E 4F
    50 51 52 53 54 55 56 57 58 59 5A 7B 7C 7D 7E 7F
    80 81 82 83 84 85 86 87 88 89 8A 8B 8C 8D 8E 8F
    90 91 92 93 94 95 96 97 98 99 9A 9B 9C 9D 9E 9F
    A0 A1 A2 A3 A4 A5 A6 A7 A8 A9 AA AB AC AD AE AF
    B0 B1 B2 B3 B4 B5 B6 B7 B8 B9 BA BB BC BD BE BF
    41 41 41 41 5C 5B 5C 43 45 45 45 45 49 49 49 49
    44 4E 4F 4F 4F 4F 5D D7 D8 55 55 55 59 59 DE DF
    41 41 41 41 5C 5B 5C 43 45 45 45 45 49 49 49 49
    44 4E 4F 4F 4F 4F 5D F7 D8 55 55 55 59 59 DE FF
    </map>
    </collation>
  3. Modify the ctype array in latin1.xml. Change the value corresponding to 0x2D (which is the code for the '-' character) from 10 (punctuation) to 01 (small letter). In the following array, this is the element in the fourth row down, third value from the end.

    <ctype>
    <map>
    00
    20 20 20 20 20 20 20 20 20 28 28 28 28 28 20 20
    20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20
    48 10 10 10 10 10 10 10 10 10 10 10 10 01 10 10
    84 84 84 84 84 84 84 84 84 84 10 10 10 10 10 10
    10 81 81 81 81 81 81 01 01 01 01 01 01 01 01 01
    01 01 01 01 01 01 01 01 01 01 01 10 10 10 10 10
    10 82 82 82 82 82 82 02 02 02 02 02 02 02 02 02
    02 02 02 02 02 02 02 02 02 02 02 10 10 10 10 20
    10 00 10 02 10 10 10 10 10 10 01 10 01 00 01 00
    00 10 10 10 10 10 10 10 10 10 02 10 02 00 02 01
    48 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10
    10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10
    01 01 01 01 01 01 01 01 01 01 01 01 01 01 01 01
    01 01 01 01 01 01 01 10 01 01 01 01 01 01 01 02
    02 02 02 02 02 02 02 02 02 02 02 02 02 02 02 02
    02 02 02 02 02 02 02 10 02 02 02 02 02 02 02 02
    </map>
    </ctype>
    
  4. Restart the server.

  5. To employ the new collation, include it in the definition of columns that are to use it:

    mysql> DROP TABLE IF EXISTS t1;
    Query OK, 0 rows affected (0.13 sec)
    
    mysql> CREATE TABLE t1 (
        a TEXT CHARACTER SET latin1 COLLATE latin1_fulltext_ci,
        FULLTEXT INDEX(a)
        ) ENGINE=InnoDB;
    Query OK, 0 rows affected (0.47 sec)
    
  6. Test the collation to verify that hyphen is considered as a word character:

    mysql> INSERT INTO t1 VALUEs ('----'),('....'),('abcd');
    Query OK, 3 rows affected (0.22 sec)
    Records: 3  Duplicates: 0  Warnings: 0
    
    mysql> SELECT * FROM t1 WHERE MATCH a AGAINST ('----' IN BOOLEAN MODE);
    +------+
    | a    |
    +------+
    | ---- |
    +------+
    1 row in set (0.00 sec)
    

12.10. Cast Functions and Operators

Table 12.14. Cast Functions

NameDescription
BINARYCast a string to a binary string
CAST()Cast a value as a certain type
CONVERT()Cast a value as a certain type

  • BINARY

    The BINARY operator casts the string following it to a binary string. This is an easy way to force a column comparison to be done byte by byte rather than character by character. This causes the comparison to be case sensitive even if the column is not defined as BINARY or BLOB. BINARY also causes trailing spaces to be significant.

    mysql> SELECT 'a' = 'A';
            -> 1
    mysql> SELECT BINARY 'a' = 'A';
            -> 0
    mysql> SELECT 'a' = 'a ';
            -> 1
    mysql> SELECT BINARY 'a' = 'a ';
            -> 0
    

    In a comparison, BINARY affects the entire operation; it can be given before either operand with the same result.

    BINARY str is shorthand for CAST(str AS BINARY).

    Note that in some contexts, if you cast an indexed column to BINARY, MySQL is not able to use the index efficiently.

  • CAST(expr AS type)

    The CAST() function takes an expression of any type and produces a result value of a specified type, similar to CONVERT(). See the description of CONVERT() for more information.

  • CONVERT(expr,type), CONVERT(expr USING transcoding_name)

    The CONVERT() and CAST() functions take an expression of any type and produce a result value of a specified type.

    The type for the result can be one of the following values:

    • BINARY[(N)]

    • CHAR[(N)]

    • DATE

    • DATETIME

    • DECIMAL[(M[,D])]

    • SIGNED [INTEGER]

    • TIME

    • UNSIGNED [INTEGER]

    BINARY produces a string with the BINARY data type. See Section 11.4.2, “The BINARY and VARBINARY Types” for a description of how this affects comparisons. If the optional length N is given, BINARY(N) causes the cast to use no more than N bytes of the argument. Values shorter than N bytes are padded with 0x00 bytes to a length of N.

    CHAR(N) causes the cast to use no more than N characters of the argument.

    CAST() and CONVERT(... USING ...) are standard SQL syntax. The non-USING form of CONVERT() is ODBC syntax.

    CONVERT() with USING is used to convert data between different character sets. In MySQL, transcoding names are the same as the corresponding character set names. For example, this statement converts the string 'abc' in the default character set to the corresponding string in the utf8 character set:

    SELECT CONVERT('abc' USING utf8);

Normally, you cannot compare a BLOB value or other binary string in case-insensitive fashion because binary strings have no character set, and thus no concept of lettercase. To perform a case-insensitive comparison, use the CONVERT() function to convert the value to a nonbinary string. Comparisons of the result use the string collation. For example, if the character set of the result has a case-insensitive collation, a LIKE operation is not case sensitive:

SELECT 'A' LIKE CONVERT(blob_col USING latin1) FROM tbl_name;

To use a different character set, substitute its name for latin1 in the preceding statement. To specify a particular collation for the converted string, use a COLLATE clause following the CONVERT() call, as described in Section 10.1.9.2, “CONVERT() and CAST(). For example, to use latin1_german1_ci:

SELECT 'A' LIKE CONVERT(blob_col USING latin1) COLLATE latin1_german1_ci
  FROM tbl_name;

CONVERT() can be used more generally for comparing strings that are represented in different character sets.

LOWER() (and UPPER()) are ineffective when applied to binary strings (BINARY, VARBINARY, BLOB). To perform lettercase conversion, convert the string to a nonbinary string:

mysql> SET @str = BINARY 'New York';
mysql> SELECT LOWER(@str), LOWER(CONVERT(@str USING latin1));
+-------------+-----------------------------------+
| LOWER(@str) | LOWER(CONVERT(@str USING latin1)) |
+-------------+-----------------------------------+
| New York    | new york                          |
+-------------+-----------------------------------+

The cast functions are useful when you want to create a column with a specific type in a CREATE TABLE ... SELECT statement:

CREATE TABLE new_table SELECT CAST('2000-01-01' AS DATE);

The functions also can be useful for sorting ENUM columns in lexical order. Normally, sorting of ENUM columns occurs using the internal numeric values. Casting the values to CHAR results in a lexical sort:

SELECT enum_col FROM tbl_name ORDER BY CAST(enum_col AS CHAR);

CAST(str AS BINARY) is the same thing as BINARY str. CAST(expr AS CHAR) treats the expression as a string with the default character set.

CAST() also changes the result if you use it as part of a more complex expression such as CONCAT('Date: ',CAST(NOW() AS DATE)).

You should not use CAST() to extract data in different formats but instead use string functions like LEFT() or EXTRACT(). See Section 12.7, “Date and Time Functions”.

To cast a string to a numeric value in numeric context, you normally do not have to do anything other than to use the string value as though it were a number:

mysql> SELECT 1+'1';
       -> 2

If you use a string in an arithmetic operation, it is converted to a floating-point number during expression evaluation.

If you use a number in string context, the number automatically is converted to a string:

mysql> SELECT CONCAT('hello you ',2);
        -> 'hello you 2'

For information about implicit conversion of numbers to strings, see Section 12.2, “Type Conversion in Expression Evaluation”.

MySQL supports arithmetic with both signed and unsigned 64-bit values. If you are using numeric operators (such as + or -) and one of the operands is an unsigned integer, the result is unsigned by default (see Section 12.6.1, “Arithmetic Operators”). You can override this by using the SIGNED or UNSIGNED cast operator to cast a value to a signed or unsigned 64-bit integer, respectively.

mysql> SELECT CAST(1-2 AS UNSIGNED)
        -> 18446744073709551615
mysql> SELECT CAST(CAST(1-2 AS UNSIGNED) AS SIGNED);
        -> -1

If either operand is a floating-point value, the result is a floating-point value and is not affected by the preceding rule. (In this context, DECIMAL column values are regarded as floating-point values.)

mysql> SELECT CAST(1 AS UNSIGNED) - 2.0;
        -> -1.0

The SQL mode affects the result of conversion operations. Examples:

  • If you convert a zero date string to a date, CONVERT() and CAST() return NULL and produce a warning when the NO_ZERO_DATE SQL mode is enabled.

  • For integer subtraction, if the NO_UNSIGNED_SUBTRACTION SQL mode is enabled, the subtraction result is signed even if any operand is unsigned.

For more information, see Section 5.1.7, “Server SQL Modes”.

12.11. XML Functions

Table 12.15. XML Functions

NameDescription
ExtractValue()Extracts a value from an XML string using XPath notation
UpdateXML()Return replaced XML fragment

This section discusses XML and related functionality in MySQL.

Note

It is possible to obtain XML-formatted output from MySQL in the mysql and mysqldump clients by invoking them with the --xml option. See Section 4.5.1, “mysql — The MySQL Command-Line Tool”, and Section 4.5.4, “mysqldump — A Database Backup Program”.

Two functions providing basic XPath 1.0 (XML Path Language, version 1.0) capabilities are available. Some basic information about XPath syntax and usage is provided later in this section; however, an in-depth discussion of these topics is beyond the scope of this Manual, and you should refer to the XML Path Language (XPath) 1.0 standard for definitive information. A useful resource for those new to XPath or who desire a refresher in the basics is the Zvon.org XPath Tutorial, which is available in several languages.

Note

These functions remain under development. We continue to improve these and other aspects of XML and XPath functionality in MySQL 5.7 and onwards. You may discuss these, ask questions about them, and obtain help from other users with them in the MySQL XML User Forum.

XPath expressions used with these functions support user variables and local stored program variables. User variables are weakly checked; variables local to stored programs are strongly checked (see also Bug #26518):

  • User variables (weak checking).  Variables using the syntax $@variable_name (that is, user variables) are not checked. No warnings or errors are issued by the server if a variable has the wrong type or has previously not been assigned a value. This also means the user is fully responsible for any typographical errors, since no warnings will be given if (for example) $@myvariable is used where $@myvariable was intended.

    Example:

    mysql> SET @xml = '<a><b>X</b><b>Y</b></a>';
    Query OK, 0 rows affected (0.00 sec)
    
    mysql> SET @i =1, @j = 2;
    Query OK, 0 rows affected (0.00 sec)
    
    mysql> SELECT @i, ExtractValue(@xml, '//b[$@i]');
    +------+--------------------------------+
    | @i   | ExtractValue(@xml, '//b[$@i]') |
    +------+--------------------------------+
    |    1 | X                              |
    +------+--------------------------------+
    1 row in set (0.00 sec)
    
    mysql> SELECT @j, ExtractValue(@xml, '//b[$@j]');
    +------+--------------------------------+
    | @j   | ExtractValue(@xml, '//b[$@j]') |
    +------+--------------------------------+
    |    2 | Y                              |
    +------+--------------------------------+
    1 row in set (0.00 sec)
    
    mysql> SELECT @k, ExtractValue(@xml, '//b[$@k]');
    +------+--------------------------------+
    | @k   | ExtractValue(@xml, '//b[$@k]') |
    +------+--------------------------------+
    | NULL |                                |
    +------+--------------------------------+
    1 row in set (0.00 sec)
    
  • Variables in stored programs (strong checking).  Variables using the syntax $variable_name can be declared and used with these functions when they are called inside stored programs. Such variables are local to the stored program in which they are defined, and are strongly checked for type and value.

    Example:

    mysql> DELIMITER |
    
    mysql> CREATE PROCEDURE myproc ()
        -> BEGIN
        ->   DECLARE i INT DEFAULT 1;
        ->   DECLARE xml VARCHAR(25) DEFAULT '<a>X</a><a>Y</a><a>Z</a>';
        ->
        ->   WHILE i < 4 DO
        ->     SELECT xml, i, ExtractValue(xml, '//a[$i]');
        ->     SET i = i+1;
        ->   END WHILE;
        -> END |
    Query OK, 0 rows affected (0.01 sec)
    
    mysql> DELIMITER ;
    
    mysql> CALL myproc;
    +--------------------------+---+------------------------------+
    | xml                      | i | ExtractValue(xml, '//a[$i]') |
    +--------------------------+---+------------------------------+
    | <a>X</a><a>Y</a><a>Z</a> | 1 | X                            |
    +--------------------------+---+------------------------------+
    1 row in set (0.00 sec)
    
    +--------------------------+---+------------------------------+
    | xml                      | i | ExtractValue(xml, '//a[$i]') |
    +--------------------------+---+------------------------------+
    | <a>X</a><a>Y</a><a>Z</a> | 2 | Y                            |
    +--------------------------+---+------------------------------+
    1 row in set (0.01 sec)
    
    +--------------------------+---+------------------------------+
    | xml                      | i | ExtractValue(xml, '//a[$i]') |
    +--------------------------+---+------------------------------+
    | <a>X</a><a>Y</a><a>Z</a> | 3 | Z                            |
    +--------------------------+---+------------------------------+
    1 row in set (0.01 sec)
    

    Parameters.  Variables used in XPath expressions inside stored routines that are passed in as parameters are also subject to strong checking.

Expressions containing user variables or variables local to stored programs must otherwise (except for notation) conform to the rules for XPath expressions containing variables as given in the XPath 1.0 specification.

Note

Currently, a user variable used to store an XPath expression is treated as an empty string. Because of this, it is not possible to store an XPath expression as a user variable. (Bug #32911)

  • ExtractValue(xml_frag, xpath_expr)

    ExtractValue() takes two string arguments, a fragment of XML markup xml_frag and an XPath expression xpath_expr (also known as a locator); it returns the text (CDATA) of the first text node which is a child of the elements or elements matched by the XPath expression.

    Using this function is the equivalent of performing a match using the xpath_expr after appending /text(). In other words, ExtractValue('<a><b>Sakila</b></a>', '/a/b') and ExtractValue('<a><b>Sakila</b></a>', '/a/b/text()') produce the same result.

    If multiple matches are found, the content of the first child text node of each matching element is returned (in the order matched) as a single, space-delimited string.

    If no matching text node is found for the expression (including the implicit /text())—for whatever reason, as long as xpath_expr is valid, and xml_frag consists of elements which are properly nested and closed—an empty string is returned. No distinction is made between a match on an empty element and no match at all. This is by design.

    If you need to determine whether no matching element was found in xml_frag or such an element was found but contained no child text nodes, you should test the result of an expression that uses the XPath count() function. For example, both of these statements return an empty string, as shown here:

    mysql> SELECT ExtractValue('<a><b/></a>', '/a/b');
    +-------------------------------------+
    | ExtractValue('<a><b/></a>', '/a/b') |
    +-------------------------------------+
    |                                     |
    +-------------------------------------+
    1 row in set (0.00 sec)
    
    mysql> SELECT ExtractValue('<a><c/></a>', '/a/b');
    +-------------------------------------+
    | ExtractValue('<a><c/></a>', '/a/b') |
    +-------------------------------------+
    |                                     |
    +-------------------------------------+
    1 row in set (0.00 sec)
    

    However, you can determine whether there was actually a matching element using the following:

    mysql> SELECT ExtractValue('<a><b/></a>', 'count(/a/b)');
    +-------------------------------------+
    | ExtractValue('<a><b/></a>', 'count(/a/b)') |
    +-------------------------------------+
    | 1                                   |
    +-------------------------------------+
    1 row in set (0.00 sec)
    
    mysql> SELECT ExtractValue('<a><c/></a>', 'count(/a/b)');
    +-------------------------------------+
    | ExtractValue('<a><c/></a>', 'count(/a/b)') |
    +-------------------------------------+
    | 0                                   |
    +-------------------------------------+
    1 row in set (0.01 sec)
    
    Important

    ExtractValue() returns only CDATA, and does not return any tags that might be contained within a matching tag, nor any of their content (see the result returned as val1 in the following example).

    mysql> SELECT
        ->   ExtractValue('<a>ccc<b>ddd</b></a>', '/a') AS val1,
        ->   ExtractValue('<a>ccc<b>ddd</b></a>', '/a/b') AS val2,
        ->   ExtractValue('<a>ccc<b>ddd</b></a>', '//b') AS val3,
        ->   ExtractValue('<a>ccc<b>ddd</b></a>', '/b') AS val4,
        ->   ExtractValue('<a>ccc<b>ddd</b><b>eee</b></a>', '//b') AS val5;
    
    +------+------+------+------+---------+
    | val1 | val2 | val3 | val4 | val5    |
    +------+------+------+------+---------+
    | ccc  | ddd  | ddd  |      | ddd eee |
    +------+------+------+------+---------+
    

    This function uses the current SQL collation for making comparisons with contains(), performing the same collation aggregation as other string functions (such as CONCAT()), in taking into account the collation coercibility of their arguments; see Section 10.1.7.5, “Collation of Expressions”, for an explanation of the rules governing this behavior.

    (Previously, binary—that is, case-sensitive—comparison was always used.)

    NULL is returned if xml_frag contains elements which are not properly nested or closed, and a warning is generated, as shown in this example:

    mysql> SELECT ExtractValue('<a>c</a><b', '//a');
    +-----------------------------------+
    | ExtractValue('<a>c</a><b', '//a') |
    +-----------------------------------+
    | NULL                              |
    +-----------------------------------+
    1 row in set, 1 warning (0.00 sec)
    
    mysql> SHOW WARNINGS;
    +---------+------+-------------------------------------------------------------------------------------------+
    | Level   | Code | Message                                                                                   |
    +---------+------+-------------------------------------------------------------------------------------------+
    | Warning | 1523 | Incorrect XML value: 'parse error at line 1 pos 11: END-OF-INPUT unexpected ('>' wanted)' |
    +---------+------+-------------------------------------------------------------------------------------------+
    1 row in set (0.00 sec)
    
    mysql> SELECT ExtractValue('<a>c</a><b/>', '//a');
    +-------------------------------------+
    | ExtractValue('<a>c</a><b/>', '//a') |
    +-------------------------------------+
    | c                                   |
    +-------------------------------------+
    1 row in set (0.00 sec)
    
  • UpdateXML(xml_target, xpath_expr, new_xml)

    This function replaces a single portion of a given fragment of XML markup xml_target with a new XML fragment new_xml, and then returns the changed XML. The portion of xml_target that is replaced matches an XPath expression xpath_expr supplied by the user.

    If no expression matching xpath_expr is found, or if multiple matches are found, the function returns the original xml_target XML fragment. All three arguments should be strings.

    mysql> SELECT
        ->   UpdateXML('<a><b>ccc</b><d></d></a>', '/a', '<e>fff</e>') AS val1,
        ->   UpdateXML('<a><b>ccc</b><d></d></a>', '/b', '<e>fff</e>') AS val2,
        ->   UpdateXML('<a><b>ccc</b><d></d></a>', '//b', '<e>fff</e>') AS val3,
        ->   UpdateXML('<a><b>ccc</b><d></d></a>', '/a/d', '<e>fff</e>') AS val4,
        ->   UpdateXML('<a><d></d><b>ccc</b><d></d></a>', '/a/d', '<e>fff</e>') AS val5
        -> \G
    
    *************************** 1. row ***************************
    val1: <e>fff</e>
    val2: <a><b>ccc</b><d></d></a>
    val3: <a><e>fff</e><d></d></a>
    val4: <a><b>ccc</b><e>fff</e></a>
    val5: <a><d></d><b>ccc</b><d></d></a>
    
Note

A discussion in depth of XPath syntax and usage are beyond the scope of this Manual. Please see the XML Path Language (XPath) 1.0 specification for definitive information. A useful resource for those new to XPath or who are wishing a refresher in the basics is the Zvon.org XPath Tutorial, which is available in several languages.

Descriptions and examples of some basic XPath expressions follow:

  • /tag

    Matches <tag/> if and only if <tag/> is the root element.

    Example: /a has a match in <a><b/></a> because it matches the outermost (root) tag. It does not match the inner a element in <b><a/></b> because in this instance it is the child of another element.

  • /tag1/tag2

    Matches <tag2/> if and only if it is a child of <tag1/>, and <tag1/> is the root element.

    Example: /a/b matches the b element in the XML fragment <a><b/></a> because it is a child of the root element a. It does not have a match in <b><a/></b> because in this case, b is the root element (and hence the child of no other element). Nor does the XPath expression have a match in <a><c><b/></c></a>; here, b is a descendant of a, but not actually a child of a.

    This construct is extendable to three or more elements. For example, the XPath expression /a/b/c matches the c element in the fragment <a><b><c/></b></a>.

  • //tag

    Matches any instance of <tag>.

    Example: //a matches the a element in any of the following: <a><b><c/></b></a>; <c><a><b/></a></b>; <c><b><a/></b></c>.

    // can be combined with /. For example, //a/b matches the b element in either of the fragments <a><b/></a> or <a><b><c/></b></a>

    Note

    //tag is the equivalent of /descendant-or-self::*/tag. A common error is to confuse this with /descendant-or-self::tag, although the latter expression can actually lead to very different results, as can be seen here:

    mysql> SET @xml = '<a><b><c>w</c><b>x</b><d>y</d>z</b></a>';
    Query OK, 0 rows affected (0.00 sec)
    
    mysql> SELECT @xml;
    +-----------------------------------------+
    | @xml                                    |
    +-----------------------------------------+
    | <a><b><c>w</c><b>x</b><d>y</d>z</b></a> |
    +-----------------------------------------+
    1 row in set (0.00 sec)
    
    mysql> SELECT ExtractValue(@xml, '//b[1]');
    +------------------------------+
    | ExtractValue(@xml, '//b[1]') |
    +------------------------------+
    | x z                          |
    +------------------------------+
    1 row in set (0.00 sec)
    
    mysql> SELECT ExtractValue(@xml, '//b[2]');
    +------------------------------+
    | ExtractValue(@xml, '//b[2]') |
    +------------------------------+
    |                              |
    +------------------------------+
    1 row in set (0.01 sec)
    
    mysql> SELECT ExtractValue(@xml, '/descendant-or-self::*/b[1]');
    +---------------------------------------------------+
    | ExtractValue(@xml, '/descendant-or-self::*/b[1]') |
    +---------------------------------------------------+
    | x z                                               |
    +---------------------------------------------------+
    1 row in set (0.06 sec)
    
    mysql> SELECT ExtractValue(@xml, '/descendant-or-self::*/b[2]');
    +---------------------------------------------------+
    | ExtractValue(@xml, '/descendant-or-self::*/b[2]') |
    +---------------------------------------------------+
    |                                                   |
    +---------------------------------------------------+
    1 row in set (0.00 sec)
    
    
    mysql> SELECT ExtractValue(@xml, '/descendant-or-self::b[1]');
    +-------------------------------------------------+
    | ExtractValue(@xml, '/descendant-or-self::b[1]') |
    +-------------------------------------------------+
    | z                                               |
    +-------------------------------------------------+
    1 row in set (0.00 sec)
    
    mysql> SELECT ExtractValue(@xml, '/descendant-or-self::b[2]');
    +-------------------------------------------------+
    | ExtractValue(@xml, '/descendant-or-self::b[2]') |
    +-------------------------------------------------+
    | x                                               |
    +-------------------------------------------------+
    1 row in set (0.00 sec)
    
  • The * operator acts as a wildcard that matches any element. For example, the expression /*/b matches the b element in either of the XML fragments <a><b/></a> or <c><b/></c>. However, the expression does not produce a match in the fragment <b><a/></b> because b must be a child of some other element. The wildcard may be used in any position: The expression /*/b/* will match any child of a b element that is itself not the root element.

  • You can match any of several locators using the | (UNION) operator. For example, the expression //b|//c matches all b and c elements in the XML target.

  • It is also possible to match an element based on the value of one or more of its attributes. This done using the syntax tag[@attribute="value"]. For example, the expression //b[@id="idB"] matches the second b element in the fragment <a><b id="idA"/><c/><b id="idB"/></a>. To match against any element having attribute="value", use the XPath expression //*[attribute="value"].

    To filter multiple attribute values, simply use multiple attribute-comparison clauses in succession. For example, the expression //b[@c="x"][@d="y"] matches the element <b c="x" d="y"/> occurring anywhere in a given XML fragment.

    To find elements for which the same attribute matches any of several values, you can use multiple locators joined by the | operator. For example, to match all b elements whose c attributes have either of the values 23 or 17, use the expression //b[@c="23"]|//b[@c="17"]. You can also use the logical or operator for this purpose: //b[@c="23" or @c="17"].

    Note

    The difference between or and | is that or joins conditions, while | joins result sets.

XPath Limitations.  The XPath syntax supported by these functions is currently subject to the following limitations:

  • Nodeset-to-nodeset comparison (such as '/a/b[@c=@d]') is not supported.

  • All of the standard XPath comparison operators are supported. (Bug #22823)

  • Relative locator expressions are resolved in the context of the root node. For example, consider the following query and result:

    mysql> SELECT ExtractValue(
        ->   '<a><b c="1">X</b><b c="2">Y</b></a>',
        ->    'a/b'
        -> ) AS result;
    +--------+
    | result |
    +--------+
    | X Y    |
    +--------+
    1 row in set (0.03 sec)
    

    In this case, the locator a/b resolves to /a/b.

    Relative locators are also supported within predicates. In the following example, d[../@c="1"] is resolved as /a/b[@c="1"]/d:

    mysql> SELECT ExtractValue(
        ->      '<a>
        ->        <b c="1"><d>X</d></b>
        ->        <b c="2"><d>X</d></b>
        ->      </a>',
        ->      'a/b/d[../@c="1"]')
        -> AS result;
    +--------+
    | result |
    +--------+
    | X      |
    +--------+
    1 row in set (0.00 sec)
    
  • Locators prefixed with expressions that evaluate as scalar values—including variable references, literals, numbers, and scalar function calls—are not permitted, and their use results in an error.

  • The :: operator is not supported in combination with node types such as the following:

    • axis::comment()

    • axis::text()

    • axis::processing-instructions()

    • axis::node()

    However, name tests (such as axis::name and axis::*) are supported, as shown in these examples:

    mysql> SELECT ExtractValue('<a><b>x</b><c>y</c></a>','/a/child::b');
    +-------------------------------------------------------+
    | ExtractValue('<a><b>x</b><c>y</c></a>','/a/child::b') |
    +-------------------------------------------------------+
    | x                                                     |
    +-------------------------------------------------------+
    1 row in set (0.02 sec)
    
    mysql> SELECT ExtractValue('<a><b>x</b><c>y</c></a>','/a/child::*');
    +-------------------------------------------------------+
    | ExtractValue('<a><b>x</b><c>y</c></a>','/a/child::*') |
    +-------------------------------------------------------+
    | x y                                                   |
    +-------------------------------------------------------+
    1 row in set (0.01 sec)
    
  • Up-and-down navigation is not supported in cases where the path would lead above the root element. That is, you cannot use expressions which match on descendants of ancestors of a given element, where one or more of the ancestors of the current element is also an ancestor of the root element (see Bug #16321).

  • The following XPath functions are not supported, or have known issues as indicated:

    • id()

    • lang()

    • local-name()

    • name()

    • namespace-uri()

    • normalize-space()

    • starts-with()

    • string()

    • substring-after()

    • substring-before()

    • translate()

  • The following axes are not supported:

    • following-sibling

    • following

    • preceding-sibling

    • preceding

XPath expressions passed as arguments to ExtractValue() and UpdateXML() may contain the colon character (:) in element selectors, which enables their use with markup employing XML namespaces notation. For example:

mysql> SET @xml = '<a>111<b:c>222<d>333</d><e:f>444</e:f></b:c></a>';
Query OK, 0 rows affected (0.00 sec)

mysql> SELECT ExtractValue(@xml, '//e:f');
+-----------------------------+
| ExtractValue(@xml, '//e:f') |
+-----------------------------+
| 444                         |
+-----------------------------+
1 row in set (0.00 sec)

mysql> SELECT UpdateXML(@xml, '//b:c', '<g:h>555</g:h>');
+--------------------------------------------+
| UpdateXML(@xml, '//b:c', '<g:h>555</g:h>') |
+--------------------------------------------+
| <a>111<g:h>555</g:h></a>                   |
+--------------------------------------------+
1 row in set (0.00 sec)

This is similar in some respects to what is permitted by Apache Xalan and some other parsers, and is much simpler than requiring namespace declarations or the use of the namespace-uri() and local-name() functions.

Error handling.  For both ExtractValue() and UpdateXML(), the XPath locator used must be valid and the XML to be searched must consist of elements which are properly nested and closed. If the locator is invalid, an error is generated:

mysql> SELECT ExtractValue('<a>c</a><b/>', '/&a');
ERROR 1105 (HY000): XPATH syntax error: '&a'

If xml_frag does not consist of elements which are properly nested and closed, NULL is returned and a warning is generated, as shown in this example:

mysql> SELECT ExtractValue('<a>c</a><b', '//a');
+-----------------------------------+
| ExtractValue('<a>c</a><b', '//a') |
+-----------------------------------+
| NULL                              |
+-----------------------------------+
1 row in set, 1 warning (0.00 sec)

mysql> SHOW WARNINGS;
+---------+------+-------------------------------------------------------------------------------------------+
| Level   | Code | Message                                                                                   |
+---------+------+-------------------------------------------------------------------------------------------+
| Warning | 1523 | Incorrect XML value: 'parse error at line 1 pos 11: END-OF-INPUT unexpected ('>' wanted)' |
+---------+------+-------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

mysql> SELECT ExtractValue('<a>c</a><b/>', '//a');
+-------------------------------------+
| ExtractValue('<a>c</a><b/>', '//a') |
+-------------------------------------+
| c                                   |
+-------------------------------------+
1 row in set (0.00 sec)
Important

The replacement XML used as the third argument to UpdateXML() is not checked to determine whether it consists solely of elements which are properly nested and closed.

XPath Injection.  code injection occurs when malicious code is introduced into the system to gain unauthorized access to privileges and data. It is based on exploiting assumptions made by developers about the type and content of data input from users. XPath is no exception in this regard.

A common scenario in which this can happen is the case of application which handles authorization by matching the combination of a login name and password with those found in an XML file, using an XPath expression like this one:

//user[login/text()='neapolitan' and password/text()='1c3cr34m']/attribute::id

This is the XPath equivalent of an SQL statement like this one:

SELECT id FROM users WHERE login='neapolitan' AND password='1c3cr34m';

A PHP application employing XPath might handle the login process like this:

<?php

  $file     =   "users.xml";

  $login    =   $POST["login"];
  $password =   $POST["password"];

  $xpath = "//user[login/text()=$login and password/text()=$password]/attribute::id";

  if( file_exists($file) )
  {
    $xml = simplexml_load_file($file);

    if($result = $xml->xpath($xpath))
      echo "You are now logged in as user $result[0].";
    else
      echo "Invalid login name or password.";
  }
  else
    exit("Failed to open $file.");

?>

No checks are performed on the input. This means that a malevolent user can short-circuit the test by entering ' or 1=1 for both the login name and password, resulting in $xpath being evaluated as shown here:

//user[login/text()='' or 1=1 and password/text()='' or 1=1]/attribute::id

Since the expression inside the square brackets always evaluates as true, it is effectively the same as this one, which matches the id attribute of every user element in the XML document:

//user/attribute::id

One way in which this particular attack can be circumvented is simply by quoting the variable names to be interpolated in the definition of $xpath, forcing the values passed from a Web form to be converted to strings:

$xpath = "//user[login/text()='$login' and password/text()='$password']/attribute::id";

This is the same strategy that is often recommended for preventing SQL injection attacks. In general, the practices you should follow for preventing XPath injection attacks are the same as for preventing SQL injection:

  • Never accepted untested data from users in your application.

  • Check all user-submitted data for type; reject or convert data that is of the wrong type

  • Test numeric data for out of range values; truncate, round, or reject values that are out of range. Test strings for illegal characters and either strip them out or reject input containing them.

  • Do not output explicit error messages that might provide an unauthorized user with clues that could be used to compromise the system; log these to a file or database table instead.

Just as SQL injection attacks can be used to obtain information about database schemas, so can XPath injection be used to traverse XML files to uncover their structure, as discussed in Amit Klein's paper Blind XPath Injection (PDF file, 46KB).

It is also important to check the output being sent back to the client. Consider what can happen when we use the MySQL ExtractValue() function:

mysql> SELECT ExtractValue(
    ->     LOAD_FILE('users.xml'),
    ->     '//user[login/text()="" or 1=1 and password/text()="" or 1=1]/attribute::id'
    -> ) AS id;
+-------------------------------+
| id                            |
+-------------------------------+
| 00327 13579 02403 42354 28570 |
+-------------------------------+
1 row in set (0.01 sec)

Because ExtractValue() returns multiple matches as a single space-delimited string, this injection attack provides every valid ID contained within users.xml to the user as a single row of output. As an extra safeguard, you should also test output before returning it to the user. Here is a simple example:

mysql> SELECT @id = ExtractValue(
    ->     LOAD_FILE('users.xml'),
    ->     '//user[login/text()="" or 1=1 and password/text()="" or 1=1]/attribute::id'
    -> );
Query OK, 0 rows affected (0.00 sec)

mysql> SELECT IF(
    ->     INSTR(@id, ' ') = 0,
    ->     @id,
    ->     'Unable to retrieve user ID')
    -> AS singleID;
+----------------------------+
| singleID                   |
+----------------------------+
| Unable to retrieve user ID |
+----------------------------+
1 row in set (0.00 sec)

In general, the guidelines for returning data to users securely are the same as for accepting user input. These can be summed up as:

  • Always test outgoing data for type and permissible values.

  • Never permit unauthorized users to view error messages that might provide information about the application that could be used to exploit it.

12.12. Bit Functions

Table 12.16. Bitwise Functions

NameDescription
BIT_COUNT()Return the number of bits that are set
&Bitwise AND
~Invert bits
|Bitwise OR
^Bitwise XOR
<<Left shift
>>Right shift

MySQL uses BIGINT (64-bit) arithmetic for bit operations, so these operators have a maximum range of 64 bits.

  • |

    Bitwise OR:

    mysql> SELECT 29 | 15;
            -> 31
    

    The result is an unsigned 64-bit integer.

  • &

    Bitwise AND:

    mysql> SELECT 29 & 15;
            -> 13
    

    The result is an unsigned 64-bit integer.

  • ^

    Bitwise XOR:

    mysql> SELECT 1 ^ 1;
            -> 0
    mysql> SELECT 1 ^ 0;
            -> 1
    mysql> SELECT 11 ^ 3;
            -> 8
    

    The result is an unsigned 64-bit integer.

  • <<

    Shifts a longlong (BIGINT) number to the left.

    mysql> SELECT 1 << 2;
            -> 4
    

    The result is an unsigned 64-bit integer. The value is truncated to 64 bits. In particular, if the shift count is greater or equal to the width of an unsigned 64-bit number, the result is zero.

  • >>

    Shifts a longlong (BIGINT) number to the right.

    mysql> SELECT 4 >> 2;
            -> 1
    

    The result is an unsigned 64-bit integer. The value is truncated to 64 bits. In particular, if the shift count is greater or equal to the width of an unsigned 64-bit number, the result is zero.

  • ~

    Invert all bits.

    mysql> SELECT 5 & ~1;
            -> 4
    

    The result is an unsigned 64-bit integer.

  • BIT_COUNT(N)

    Returns the number of bits that are set in the argument N.

    mysql> SELECT BIT_COUNT(29), BIT_COUNT(b'101010');
            -> 4, 3
    

12.13. Encryption and Compression Functions

Table 12.17. Encryption Functions

NameDescription
AES_DECRYPT()Decrypt using AES
AES_ENCRYPT()Encrypt using AES
COMPRESS()Return result as a binary string
DECODE()Decodes a string encrypted using ENCODE()
DES_DECRYPT()Decrypt a string
DES_ENCRYPT()Encrypt a string
ENCODE()Encode a string
ENCRYPT()Encrypt a string
MD5()Calculate MD5 checksum
OLD_PASSWORD()Return the value of the pre-4.1 implementation of PASSWORD
PASSWORD()Calculate and return a password string
SHA1(), SHA()Calculate an SHA-1 160-bit checksum
SHA2()Calculate an SHA-2 checksum
UNCOMPRESS()Uncompress a string compressed
UNCOMPRESSED_LENGTH()Return the length of a string before compression
VALIDATE_PASSWORD_STRENGTH()Determine strength of password

Many encryption and compression functions return strings for which the result might contain arbitrary byte values. If you want to store these results, use a column with a VARBINARY or BLOB binary string data type. This will avoid potential problems with trailing space removal or character set conversion that would change data values, such as may occur if you use a nonbinary string data type (CHAR, VARCHAR, TEXT).

Some encryption functions return strings of ASCII characters: MD5(), OLD_PASSWORD(), PASSWORD(), SHA(), SHA1(), SHA2(). In MySQL 5.7, their return value is a nonbinary string that has a character set and collation determined by the character_set_connection and collation_connection system variables.

For versions in which functions such as MD5() or SHA1() return a string of hex digits as a binary string, the return value cannot be converted to uppercase or compared in case-insensitive fashion as is. You must convert the value to a nonbinary string. See the discussion of binary string conversion in Section 12.10, “Cast Functions and Operators”.

If an application stores values from a function such as MD5() or SHA1() that returns a string of hex digits, more efficient storage and comparisons can be obtained by converting the hex representation to binary using UNHEX() and storing the result in a BINARY(N) column. Each pair of hex digits requires one byte in binary form, so the value of N depends on the length of the hex string. N is 16 for an MD5() value and 20 for a SHA1() value. For SHA2(), N ranges from 28 to 32 depending on the argument specifying the desired bit length of the result.

The size penalty for storing the hex string in a CHAR column is at least two times, up to eight times if the value is stored in a column that uses the utf8 character set (where each character uses 4 bytes). Storing the string also results in slower comparisons because of the larger values and the need to take character set collation rules into account.

Suppose that an application stores MD5() string values in a CHAR(32) column:

CREATE TABLE md5_tbl (md5_val CHAR(32), ...);
INSERT INTO md5_tbl (md5_val, ...) VALUES(MD5('abcdef'), ...);

To convert hex strings to more compact form, modify the application to use UNHEX() and BINARY(16) instead as follows:

CREATE TABLE md5_tbl (md5_val BINARY(16), ...);
INSERT INTO md5_tbl (md5_val, ...) VALUES(UNHEX(MD5('abcdef')), ...);

Applications should be prepared to handle the very rare case that a hashing function produces the same value for two different input values. One way to make collisions detectable is to make the hash column a primary key.

Note

Exploits for the MD5 and SHA-1 algorithms have become known. You may wish to consider using one of the other encryption functions described in this section instead, such as SHA2().

Caution

Passwords or other sensitive values supplied as arguments to encryption functions are sent in plaintext to the MySQL server unless an SSL connection is used. Also, such values will appear in any MySQL logs to which they are written. To avoid these types of exposure, applications can encrypt sensitive values on the client side before sending them to the server. The same considerations apply to encryption keys. To avoid exposing these, applications can use stored procedures to encrypt and decrypt values on the server side.

  • AES_DECRYPT(crypt_str,key_str)

    This function decrypts data using the official AES (Advanced Encryption Standard) algorithm. For more information, see the description of AES_ENCRYPT().

  • AES_ENCRYPT(str,key_str)

    AES_ENCRYPT() and AES_DECRYPT() implement encryption and decryption of data using the official AES (Advanced Encryption Standard) algorithm, previously known as Rijndael. The AES standard allows various key sizes. These functions implement AES with a 128-bit key length, but you can extend them to 256 bits by modifying the source. The key length is a trade off between performance and security.

    AES_ENCRYPT() encrypts the string str using a key key_str and returns a binary string containing the encrypted output. AES_DECRYPT() decrypts the encrypted string crypt_str using a key key_str and returns the original plaintext string. If either argument is NULL, the results of these functions are also NULL.

    The str and crypt_str arguments can be any length, and padding is automatically added to str so it is a multiple of a block as required by block-based algorithms such as AES. This padding is automatically removed by the AES_DECRYPT() function. The length of crypt_str can be calculated using this formula:

    16 * (trunc(string_length / 16) + 1)
    

    The most secure way to pass a key to the key_str argument is to create a truly random 128-bit value and pass it as a binary value. For example:

    INSERT INTO t
    VALUES (1,AES_ENCRYPT('text',UNHEX('F3229A0B371ED2D9441B830D21A390C3')));

    A passphrase can be used to generate an AES key by hashing the passphrase. For example:

    INSERT INTO t VALUES (1,AES_ENCRYPT('text', SHA2('My secret passphrase',512)));

    Do not pass a password or passphrase directly to crypt_str, hash it first. Previous versions of this documentation suggested the former approach, but it is no longer recommended as the examples shown here are more secure.

    If AES_DECRYPT() detects invalid data or incorrect padding, it returns NULL. However, it is possible for AES_DECRYPT() to return a non-NULL value (possibly garbage) if the input data or the key is invalid.

  • COMPRESS(string_to_compress)

    Compresses a string and returns the result as a binary string. This function requires MySQL to have been compiled with a compression library such as zlib. Otherwise, the return value is always NULL. The compressed string can be uncompressed with UNCOMPRESS().

    mysql> SELECT LENGTH(COMPRESS(REPEAT('a',1000)));
            -> 21
    mysql> SELECT LENGTH(COMPRESS(''));
            -> 0
    mysql> SELECT LENGTH(COMPRESS('a'));
            -> 13
    mysql> SELECT LENGTH(COMPRESS(REPEAT('a',16)));
            -> 15
    

    The compressed string contents are stored the following way:

    • Empty strings are stored as empty strings.

    • Nonempty strings are stored as a 4-byte length of the uncompressed string (low byte first), followed by the compressed string. If the string ends with space, an extra . character is added to avoid problems with endspace trimming should the result be stored in a CHAR or VARCHAR column. (However, use of nonbinary string data types such as CHAR or VARCHAR to store compressed strings is not recommended anyway because character set conversion may occur. Use a VARBINARY or BLOB binary string column instead.)

  • DECODE(crypt_str,pass_str)

    DECODE() decrypts the encrypted string crypt_str using pass_str as the password. crypt_str should be a string returned from ENCODE().

    Note

    The ENCODE() and DECODE() functions are deprecated in MySQL 5.7, will be removed in a future MySQL release, and should no longer be used.

  • DES_DECRYPT(crypt_str[,key_str])

    Decrypts a string encrypted with DES_ENCRYPT(). If an error occurs, this function returns NULL.

    This function works only if MySQL has been configured with SSL support. See Section 6.3.9, “Using SSL for Secure Connections”.

    If no key_str argument is given, DES_DECRYPT() examines the first byte of the encrypted string to determine the DES key number that was used to encrypt the original string, and then reads the key from the DES key file to decrypt the message. For this to work, the user must have the SUPER privilege. The key file can be specified with the --des-key-file server option.

    If you pass this function a key_str argument, that string is used as the key for decrypting the message.

    If the crypt_str argument does not appear to be an encrypted string, MySQL returns the given crypt_str.

  • DES_ENCRYPT(str[,{key_num|key_str}])

    Encrypts the string with the given key using the Triple-DES algorithm.

    This function works only if MySQL has been configured with SSL support. See Section 6.3.9, “Using SSL for Secure Connections”.

    The encryption key to use is chosen based on the second argument to DES_ENCRYPT(), if one was given. With no argument, the first key from the DES key file is used. With a key_num argument, the given key number (0 to 9) from the DES key file is used. With a key_str argument, the given key string is used to encrypt str.

    The key file can be specified with the --des-key-file server option.

    The return string is a binary string where the first character is CHAR(128 | key_num). If an error occurs, DES_ENCRYPT() returns NULL.

    The 128 is added to make it easier to recognize an encrypted key. If you use a string key, key_num is 127.

    The string length for the result is given by this formula:

    new_len = orig_len + (8 - (orig_len % 8)) + 1
    

    Each line in the DES key file has the following format:

    key_num des_key_str
    

    Each key_num value must be a number in the range from 0 to 9. Lines in the file may be in any order. des_key_str is the string that is used to encrypt the message. There should be at least one space between the number and the key. The first key is the default key that is used if you do not specify any key argument to DES_ENCRYPT().

    You can tell MySQL to read new key values from the key file with the FLUSH DES_KEY_FILE statement. This requires the RELOAD privilege.

    One benefit of having a set of default keys is that it gives applications a way to check for the existence of encrypted column values, without giving the end user the right to decrypt those values.

    mysql> SELECT customer_address FROM customer_table 
         > WHERE crypted_credit_card = DES_ENCRYPT('credit_card_number');
    
  • ENCODE(str,pass_str)

    ENCODE() encrypts str using pass_str as the password. The result is a binary string of the same length as str. To decrypt the result, use DECODE().

    Note

    The ENCODE() and DECODE() functions are deprecated in MySQL 5.7, will be removed in a future MySQL release, and should no longer be used.

    If you still need to use ENCODE(), a salt value must be used with it to reduce risk. For example:

    ENCODE('plaintext', CONCAT('my_random_salt','my_secret_password'))

    A new random salt value must be used whenever a password is updated.

  • ENCRYPT(str[,salt])

    Encrypts str using the Unix crypt() system call and returns a binary string. The salt argument must be a string with at least two characters or the result will be NULL. If no salt argument is given, a random value is used.

    mysql> SELECT ENCRYPT('hello');
            -> 'VxuFAJXVARROc'
    

    ENCRYPT() ignores all but the first eight characters of str, at least on some systems. This behavior is determined by the implementation of the underlying crypt() system call.

    The use of ENCRYPT() with the ucs2, utf16, utf16le, or utf32 multi-byte character sets is not recommended because the system call expects a string terminated by a zero byte.

    If crypt() is not available on your system (as is the case with Windows), ENCRYPT() always returns NULL.

  • MD5(str)

    Calculates an MD5 128-bit checksum for the string. The value is returned as a string of 32 hex digits, or NULL if the argument was NULL. The return value can, for example, be used as a hash key. See the notes at the beginning of this section about storing hash values efficiently.

    The return value is a nonbinary string in the connection character set.

    mysql> SELECT MD5('testing');
            -> 'ae2b1fca515949e5d54fb22b8ed95575'
    

    This is the RSA Data Security, Inc. MD5 Message-Digest Algorithm.

    See the note regarding the MD5 algorithm at the beginning this section.

  • OLD_PASSWORD(str)

    OLD_PASSWORD() was added when the implementation of PASSWORD() was changed in MySQL 4.1 to improve security. OLD_PASSWORD() returns the value of the pre-4.1 implementation of PASSWORD() as a string, and is intended to permit you to reset passwords for any pre-4.1 clients that need to connect to your version 5.7 MySQL server without locking them out. See Section 6.1.2.4, “Password Hashing in MySQL”.

    The return value is a nonbinary string in the connection character set.

    Note

    Passwords that use the pre-4.1 hashing method are less secure than passwords that use the native password hashing method and should be avoided. Pre-4.1 passwords are deprecated and support for them will be removed in a future MySQL release.

  • PASSWORD(str)

    Calculates and returns a hashed password string from the cleartext password str and returns a nonbinary string in the connection character set, or NULL if the argument is NULL. This function is the SQL interface to the algorithm used by the server to encrypt MySQL passwords for storage in the mysql.user grant table.

    The old_passwords system variable controls the password hashing method used by PASSWORD() (as well as for the IDENTIFIED BY clause of the CREATE USER and GRANT statements).

    The following table shows the permitted values of old_passwords, the password hashing method for each value, and which authentication plugins use passwords hashed with each method.

    ValuePassword Hashing FormatIntended Use
    0MySQL 4.1 native hashingAccounts that authenticate with the mysql_native_password plugin
    1Pre-4.1 (old) hashingAccounts that authenticate with the mysql_old_password plugin
    2SHA-256 hashingAccounts that authenticate with the sha256_password plugin

    If old_passwords=1, PASSWORD('str') returns the same value as OLD_PASSWORD('str'). The latter function is not affected by the value of old_passwords.

    Examples:

    mysql> SET old_passwords = 0;
    mysql> SELECT PASSWORD('mypass');
    +-------------------------------------------+
    | PASSWORD('mypass')                        |
    +-------------------------------------------+
    | *6C8989366EAF75BB670AD8EA7A7FC1176A95CEF4 |
    +-------------------------------------------+
    
    mysql> SET old_passwords = 1;
    mysql> SELECT PASSWORD('mypass');
    +--------------------+
    | PASSWORD('mypass') |
    +--------------------+
    | 6f8c114b58f2ce9e   |
    +--------------------+
    

    SHA-256 password hashing (old_passwords=2) uses a random salt value, which makes the result from PASSWORD() nondeterministic. Consequently, statements that use this function are not safe for statement-based replication and cannot be stored in the query cache.

    Encryption performed by PASSWORD() is one-way (not reversible). It is not the same type of encryption as used for Unix passwords; for that, use ENCRYPT().

    Note

    The PASSWORD() function is used by the authentication system in MySQL Server; you should not use it in your own applications. For that purpose, consider MD5() or SHA2() instead. Also see RFC 2195, section 2 (Challenge-Response Authentication Mechanism (CRAM)), for more information about handling passwords and authentication securely in your applications.

    Note

    Passwords that use the pre-4.1 hashing method are less secure than passwords that use the native password hashing method and should be avoided. Pre-4.1 passwords are deprecated and support for them will be removed in a future MySQL release.

    Caution

    Statements that invoke PASSWORD() may be recorded in server logs or in a history file such as ~/.mysql_history, which means that cleartext passwords may be read by anyone having read access to that information. See Section 6.1.2, “Keeping Passwords Secure”.

  • SHA1(str), SHA(str)

    Calculates an SHA-1 160-bit checksum for the string, as described in RFC 3174 (Secure Hash Algorithm). The value is returned as a string of 40 hex digits, or NULL if the argument was NULL. One of the possible uses for this function is as a hash key. See the notes at the beginning of this section about storing hash values efficiently. You can also use SHA1() as a cryptographic function for storing passwords. SHA() is synonymous with SHA1().

    The return value is a nonbinary string in the connection character set.

    mysql> SELECT SHA1('abc');
            -> 'a9993e364706816aba3e25717850c26c9cd0d89d'
    

    SHA1() can be considered a cryptographically more secure equivalent of MD5(). However, see the note regarding the MD5 and SHA-1 algorithms at the beginning this section.

  • SHA2(str, hash_length)

    Calculates the SHA-2 family of hash functions (SHA-224, SHA-256, SHA-384, and SHA-512). The first argument is the cleartext string to be hashed. The second argument indicates the desired bit length of the result, which must have a value of 224, 256, 384, 512, or 0 (which is equivalent to 256). If either argument is NULL or the hash length is not one of the permitted values, the return value is NULL. Otherwise, the function result is a hash value containing the desired number of bits. See the notes at the beginning of this section about storing hash values efficiently.

    The return value is a nonbinary string in the connection character set.

    mysql> SELECT SHA2('abc', 224);
            -> '23097d223405d8228642a477bda255b32aadbce4bda0b3f7e36c9da7'
    

    This function works only if MySQL has been configured with SSL support. See Section 6.3.9, “Using SSL for Secure Connections”.

    SHA2() can be considered cryptographically more secure than MD5() or SHA1().

  • UNCOMPRESS(string_to_uncompress)

    Uncompresses a string compressed by the COMPRESS() function. If the argument is not a compressed value, the result is NULL. This function requires MySQL to have been compiled with a compression library such as zlib. Otherwise, the return value is always NULL.

    mysql> SELECT UNCOMPRESS(COMPRESS('any string'));
            -> 'any string'
    mysql> SELECT UNCOMPRESS('any string');
            -> NULL
    
  • UNCOMPRESSED_LENGTH(compressed_string)

    Returns the length that the compressed string had before being compressed.

    mysql> SELECT UNCOMPRESSED_LENGTH(COMPRESS(REPEAT('a',30)));
            -> 30
    
  • VALIDATE_PASSWORD_STRENGTH(str)

    Given an argument representing a cleartext password, this function returns an integer to indicate how strong the password is. The return value ranges from 0 (weak) to 100 (strong).

    The password is subjected to increasingly strict tests and the return value reflects which tests were satisfied, as shown in the following table.

    Password TestReturn Value
    Length < 40
    Length ≥ 4 and < validate_password_length25
    Satisfies policy 1 (LOW)50
    Satisfies policy 2 (MEDIUM)75
    Satisfies policy 3 (STRONG)100

    Password assessment by VALIDATE_PASSWORD_STRENGTH() is done by the validate_password plugin. If that plugin is not installed, the function always returns 0. For information about installing the validate_password plugin, see Section 6.1.2.6, “The Password Validation Plugin”. To examine or configure the parameters that affect password testing, check or set the system variables implemented by validate_password plugin. See Section 6.1.2.6.2, “Password Validation Plugin Options and Variables”.

12.14. Information Functions

Table 12.18. Information Functions

NameDescription
BENCHMARK()Repeatedly execute an expression
CHARSET()Return the character set of the argument
COERCIBILITY()Return the collation coercibility value of the string argument
COLLATION()Return the collation of the string argument
CONNECTION_ID()Return the connection ID (thread ID) for the connection
CURRENT_USER(), CURRENT_USERThe authenticated user name and host name
DATABASE()Return the default (current) database name
FOUND_ROWS()For a SELECT with a LIMIT clause, the number of rows that would be returned were there no LIMIT clause
LAST_INSERT_ID()Value of the AUTOINCREMENT column for the last INSERT
ROW_COUNT()The number of rows updated
SCHEMA()A synonym for DATABASE()
SESSION_USER()Synonym for USER()
SYSTEM_USER()Synonym for USER()
USER()The user name and host name provided by the client
VERSION()Returns a string that indicates the MySQL server version

  • BENCHMARK(count,expr)

    The BENCHMARK() function executes the expression expr repeatedly count times. It may be used to time how quickly MySQL processes the expression. The result value is always 0. The intended use is from within the mysql client, which reports query execution times:

    mysql> SELECT BENCHMARK(1000000,ENCODE('hello','goodbye'));
    +----------------------------------------------+
    | BENCHMARK(1000000,ENCODE('hello','goodbye')) |
    +----------------------------------------------+
    |                                            0 |
    +----------------------------------------------+
    1 row in set (4.74 sec)
    

    The time reported is elapsed time on the client end, not CPU time on the server end. It is advisable to execute BENCHMARK() several times, and to interpret the result with regard to how heavily loaded the server machine is.

    BENCHMARK() is intended for measuring the runtime performance of scalar expressions, which has some significant implications for the way that you use it and interpret the results:

    • Only scalar expressions can be used. Although the expression can be a subquery, it must return a single column and at most a single row. For example, BENCHMARK(10, (SELECT * FROM t)) will fail if the table t has more than one column or more than one row.

    • Executing a SELECT expr statement N times differs from executing SELECT BENCHMARK(N, expr) in terms of the amount of overhead involved. The two have very different execution profiles and you should not expect them to take the same amount of time. The former involves the parser, optimizer, table locking, and runtime evaluation N times each. The latter involves only runtime evaluation N times, and all the other components just once. Memory structures already allocated are reused, and runtime optimizations such as local caching of results already evaluated for aggregate functions can alter the results. Use of BENCHMARK() thus measures performance of the runtime component by giving more weight to that component and removing the noise introduced by the network, parser, optimizer, and so forth.

  • CHARSET(str)

    Returns the character set of the string argument.

    mysql> SELECT CHARSET('abc');
            -> 'latin1'
    mysql> SELECT CHARSET(CONVERT('abc' USING utf8));
            -> 'utf8'
    mysql> SELECT CHARSET(USER());
            -> 'utf8'
    
  • COERCIBILITY(str)

    Returns the collation coercibility value of the string argument.

    mysql> SELECT COERCIBILITY('abc' COLLATE latin1_swedish_ci);
            -> 0
    mysql> SELECT COERCIBILITY(USER());
            -> 3
    mysql> SELECT COERCIBILITY('abc');
            -> 4
    

    The return values have the meanings shown in the following table. Lower values have higher precedence.

    CoercibilityMeaningExample
    0Explicit collationValue with COLLATE clause
    1No collationConcatenation of strings with different collations
    2Implicit collationColumn value, stored routine parameter or local variable
    3System constantUSER() return value
    4CoercibleLiteral string
    5IgnorableNULL or an expression derived from NULL
  • COLLATION(str)

    Returns the collation of the string argument.

    mysql> SELECT COLLATION('abc');
            -> 'latin1_swedish_ci'
    mysql> SELECT COLLATION(_utf8'abc');
            -> 'utf8_general_ci'
    
  • CONNECTION_ID()

    Returns the connection ID (thread ID) for the connection. Every connection has an ID that is unique among the set of currently connected clients.

    mysql> SELECT CONNECTION_ID();
            -> 23786
    
  • CURRENT_USER, CURRENT_USER()

    Returns the user name and host name combination for the MySQL account that the server used to authenticate the current client. This account determines your access privileges. The return value is a string in the utf8 character set.

    The value of CURRENT_USER() can differ from the value of USER().

    mysql> SELECT USER();
            -> 'davida@localhost'
    mysql> SELECT * FROM mysql.user;
    ERROR 1044: Access denied for user ''@'localhost' to
    database 'mysql'
    mysql> SELECT CURRENT_USER();
            -> '@localhost'
    

    The example illustrates that although the client specified a user name of davida (as indicated by the value of the USER() function), the server authenticated the client using an anonymous user account (as seen by the empty user name part of the CURRENT_USER() value). One way this might occur is that there is no account listed in the grant tables for davida.

    Within a stored program or view, CURRENT_USER() returns the account for the user who defined the object (as given by its DEFINER value) unless defined with the SQL SECURITY INVOKER characteristic. In the latter case, CURRENT_USER() returns the object's invoker.

    Triggers and events have no option to define the SQL SECURITY characteristic, so for these objects, CURRENT_USER() returns the account for the user who defined the object. To return the invoker, use USER() or SESSION_USER().

    The following statements support use of the CURRENT_USER() function to take the place of the name of (and, possibly, a host for) an affected user or a definer; in such cases, CURRENT_USER() is expanded where and as needed:

    For information about the implications that this expansion of CURRENT_USER() has for replication in different releases of MySQL 5.7, see Section 16.4.1.7, “Replication of CURRENT_USER().

  • DATABASE()

    Returns the default (current) database name as a string in the utf8 character set. If there is no default database, DATABASE() returns NULL. Within a stored routine, the default database is the database that the routine is associated with, which is not necessarily the same as the database that is the default in the calling context.

    mysql> SELECT DATABASE();
            -> 'test'
    

    If there is no default database, DATABASE() returns NULL.

  • FOUND_ROWS()

    A SELECT statement may include a LIMIT clause to restrict the number of rows the server returns to the client. In some cases, it is desirable to know how many rows the statement would have returned without the LIMIT, but without running the statement again. To obtain this row count, include a SQL_CALC_FOUND_ROWS option in the SELECT statement, and then invoke FOUND_ROWS() afterward:

    mysql> SELECT SQL_CALC_FOUND_ROWS * FROM tbl_name
        -> WHERE id > 100 LIMIT 10;
    mysql> SELECT FOUND_ROWS();
    

    The second SELECT returns a number indicating how many rows the first SELECT would have returned had it been written without the LIMIT clause.

    In the absence of the SQL_CALC_FOUND_ROWS option in the most recent successful SELECT statement, FOUND_ROWS() returns the number of rows in the result set returned by that statement. If the statement includes a LIMIT clause, FOUND_ROWS() returns the number of rows up to the limit. For example, FOUND_ROWS() returns 10 or 60, respectively, if the statement includes LIMIT 10 or LIMIT 50, 10.

    The row count available through FOUND_ROWS() is transient and not intended to be available past the statement following the SELECT SQL_CALC_FOUND_ROWS statement. If you need to refer to the value later, save it:

    mysql> SELECT SQL_CALC_FOUND_ROWS * FROM ... ;
    mysql> SET @rows = FOUND_ROWS();
    

    If you are using SELECT SQL_CALC_FOUND_ROWS, MySQL must calculate how many rows are in the full result set. However, this is faster than running the query again without LIMIT, because the result set need not be sent to the client.

    SQL_CALC_FOUND_ROWS and FOUND_ROWS() can be useful in situations when you want to restrict the number of rows that a query returns, but also determine the number of rows in the full result set without running the query again. An example is a Web script that presents a paged display containing links to the pages that show other sections of a search result. Using FOUND_ROWS() enables you to determine how many other pages are needed for the rest of the result.

    The use of SQL_CALC_FOUND_ROWS and FOUND_ROWS() is more complex for UNION statements than for simple SELECT statements, because LIMIT may occur at multiple places in a UNION. It may be applied to individual SELECT statements in the UNION, or global to the UNION result as a whole.

    The intent of SQL_CALC_FOUND_ROWS for UNION is that it should return the row count that would be returned without a global LIMIT. The conditions for use of SQL_CALC_FOUND_ROWS with UNION are:

    • The SQL_CALC_FOUND_ROWS keyword must appear in the first SELECT of the UNION.

    • The value of FOUND_ROWS() is exact only if UNION ALL is used. If UNION without ALL is used, duplicate removal occurs and the value of FOUND_ROWS() is only approximate.

    • If no LIMIT is present in the UNION, SQL_CALC_FOUND_ROWS is ignored and returns the number of rows in the temporary table that is created to process the UNION.

    Beyond the cases described here, the behavior of FOUND_ROWS() is undefined (for example, its value following a SELECT statement that fails with an error).

    Important

    FOUND_ROWS() is not replicated reliably using statement-based replication. This function is automatically replicated using row-based replication.

  • LAST_INSERT_ID(), LAST_INSERT_ID(expr)

    With no argument, LAST_INSERT_ID() returns a BIGINT UNSIGNED (64-bit) value representing the first automatically generated value successfully inserted for an AUTO_INCREMENT column as a result of the most recently executed INSERT statement. The value of LAST_INSERT_ID() remains unchanged if no rows are successfully inserted.

    With an argument, LAST_INSERT_ID() returns an unsigned integer.

    For example, after inserting a row that generates an AUTO_INCREMENT value, you can get the value like this:

    mysql> SELECT LAST_INSERT_ID();
            -> 195
    

    The currently executing statement does not affect the value of LAST_INSERT_ID(). Suppose that you generate an AUTO_INCREMENT value with one statement, and then refer to LAST_INSERT_ID() in a multiple-row INSERT statement that inserts rows into a table with its own AUTO_INCREMENT column. The value of LAST_INSERT_ID() will remain stable in the second statement; its value for the second and later rows is not affected by the earlier row insertions. (However, if you mix references to LAST_INSERT_ID() and LAST_INSERT_ID(expr), the effect is undefined.)

    If the previous statement returned an error, the value of LAST_INSERT_ID() is undefined. For transactional tables, if the statement is rolled back due to an error, the value of LAST_INSERT_ID() is left undefined. For manual ROLLBACK, the value of LAST_INSERT_ID() is not restored to that before the transaction; it remains as it was at the point of the ROLLBACK.

    Within the body of a stored routine (procedure or function) or a trigger, the value of LAST_INSERT_ID() changes the same way as for statements executed outside the body of these kinds of objects. The effect of a stored routine or trigger upon the value of LAST_INSERT_ID() that is seen by following statements depends on the kind of routine:

    • If a stored procedure executes statements that change the value of LAST_INSERT_ID(), the changed value is seen by statements that follow the procedure call.

    • For stored functions and triggers that change the value, the value is restored when the function or trigger ends, so following statements will not see a changed value.

    The ID that was generated is maintained in the server on a per-connection basis. This means that the value returned by the function to a given client is the first AUTO_INCREMENT value generated for most recent statement affecting an AUTO_INCREMENT column by that client. This value cannot be affected by other clients, even if they generate AUTO_INCREMENT values of their own. This behavior ensures that each client can retrieve its own ID without concern for the activity of other clients, and without the need for locks or transactions.

    The value of LAST_INSERT_ID() is not changed if you set the AUTO_INCREMENT column of a row to a non-magic value (that is, a value that is not NULL and not 0).

    Important

    If you insert multiple rows using a single INSERT statement, LAST_INSERT_ID() returns the value generated for the first inserted row only. The reason for this is to make it possible to reproduce easily the same INSERT statement against some other server.

    For example:

    mysql> USE test;
    Database changed
    mysql> CREATE TABLE t (
        ->   id INT AUTO_INCREMENT NOT NULL PRIMARY KEY,
        ->   name VARCHAR(10) NOT NULL
        -> );
    Query OK, 0 rows affected (0.09 sec)
    
    mysql> INSERT INTO t VALUES (NULL, 'Bob');
    Query OK, 1 row affected (0.01 sec)
    
    mysql> SELECT * FROM t;
    +----+------+
    | id | name |
    +----+------+
    |  1 | Bob  |
    +----+------+
    1 row in set (0.01 sec)
    
    mysql> SELECT LAST_INSERT_ID();
    +------------------+
    | LAST_INSERT_ID() |
    +------------------+
    |                1 |
    +------------------+
    1 row in set (0.00 sec)
    
    mysql> INSERT INTO t VALUES
        -> (NULL, 'Mary'), (NULL, 'Jane'), (NULL, 'Lisa');
    Query OK, 3 rows affected (0.00 sec)
    Records: 3  Duplicates: 0  Warnings: 0
    
    mysql> SELECT * FROM t;
    +----+------+
    | id | name |
    +----+------+
    |  1 | Bob  |
    |  2 | Mary |
    |  3 | Jane |
    |  4 | Lisa |
    +----+------+
    4 rows in set (0.01 sec)
    
    mysql> SELECT LAST_INSERT_ID();
    +------------------+
    | LAST_INSERT_ID() |
    +------------------+
    |                2 |
    +------------------+
    1 row in set (0.00 sec)
    

    Although the second INSERT statement inserted three new rows into t, the ID generated for the first of these rows was 2, and it is this value that is returned by LAST_INSERT_ID() for the following SELECT statement.

    If you use INSERT IGNORE and the row is ignored, the AUTO_INCREMENT counter is not incremented and LAST_INSERT_ID() returns 0, which reflects that no row was inserted.

    If expr is given as an argument to LAST_INSERT_ID(), the value of the argument is returned by the function and is remembered as the next value to be returned by LAST_INSERT_ID(). This can be used to simulate sequences:

    1. Create a table to hold the sequence counter and initialize it:

      mysql> CREATE TABLE sequence (id INT NOT NULL);
      mysql> INSERT INTO sequence VALUES (0);
      
    2. Use the table to generate sequence numbers like this:

      mysql> UPDATE sequence SET id=LAST_INSERT_ID(id+1);
      mysql> SELECT LAST_INSERT_ID();
      

      The UPDATE statement increments the sequence counter and causes the next call to LAST_INSERT_ID() to return the updated value. The SELECT statement retrieves that value. The mysql_insert_id() C API function can also be used to get the value. See Section 21.8.7.37, “mysql_insert_id().

    You can generate sequences without calling LAST_INSERT_ID(), but the utility of using the function this way is that the ID value is maintained in the server as the last automatically generated value. It is multi-user safe because multiple clients can issue the UPDATE statement and get their own sequence value with the SELECT statement (or mysql_insert_id()), without affecting or being affected by other clients that generate their own sequence values.

    Note that mysql_insert_id() is only updated after INSERT and UPDATE statements, so you cannot use the C API function to retrieve the value for LAST_INSERT_ID(expr) after executing other SQL statements like SELECT or SET.

  • ROW_COUNT()

    In MySQL 5.7, ROW_COUNT() returns a value as follows:

    • DDL statements: 0. This applies to statements such as CREATE TABLE or DROP TABLE.

    • DML statements other than SELECT: The number of affected rows. This applies to statements such as UPDATE, INSERT, or DELETE (as before), but now also to statements such as ALTER TABLE and LOAD DATA INFILE.

    • SELECT: -1 if the statement returns a result set, or the number of rows affected if it does not. For example, for SELECT * FROM t1, ROW_COUNT() returns -1. For SELECT * FROM t1 INTO OUTFILE 'file_name', ROW_COUNT() returns the number of rows written to the file.

    • SIGNAL statements: 0.

    For UPDATE statements, the affected-rows value by default is the number of rows actually changed. If you specify the CLIENT_FOUND_ROWS flag to mysql_real_connect() when connecting to mysqld, the affected-rows value is the number of rows found; that is, matched by the WHERE clause.

    For REPLACE statements, the affected-rows value is 2 if the new row replaced an old row, because in this case, one row was inserted after the duplicate was deleted.

    For INSERT ... ON DUPLICATE KEY UPDATE statements, the affected-rows value per row is 1 if the row is inserted as a new row, 2 if an existing row is updated, and 0 if an existing row is set to its current values. If you specify the CLIENT_FOUND_ROWS flag, the affected-rows value is 1 (not 0) if an existing row is set to its current values.

    The ROW_COUNT() value is similar to the value from the mysql_affected_rows() C API function and the row count that the mysql client displays following statement execution.

    mysql> INSERT INTO t VALUES(1),(2),(3);
    Query OK, 3 rows affected (0.00 sec)
    Records: 3  Duplicates: 0  Warnings: 0
    
    mysql> SELECT ROW_COUNT();
    +-------------+
    | ROW_COUNT() |
    +-------------+
    |           3 |
    +-------------+
    1 row in set (0.00 sec)
    
    mysql> DELETE FROM t WHERE i IN(1,2);
    Query OK, 2 rows affected (0.00 sec)
    
    mysql> SELECT ROW_COUNT();
    +-------------+
    | ROW_COUNT() |
    +-------------+
    |           2 |
    +-------------+
    1 row in set (0.00 sec)
    
    Important

    ROW_COUNT() is not replicated reliably using statement-based replication. This function is automatically replicated using row-based replication.

  • SCHEMA()

    This function is a synonym for DATABASE().

  • SESSION_USER()

    SESSION_USER() is a synonym for USER().

  • SYSTEM_USER()

    SYSTEM_USER() is a synonym for USER().

  • USER()

    Returns the current MySQL user name and host name as a string in the utf8 character set.

    mysql> SELECT USER();
            -> 'davida@localhost'
    

    The value indicates the user name you specified when connecting to the server, and the client host from which you connected. The value can be different from that of CURRENT_USER().

    You can extract only the user name part like this:

    mysql> SELECT SUBSTRING_INDEX(USER(),'@',1);
            -> 'davida'
    
  • VERSION()

    Returns a string that indicates the MySQL server version. The string uses the utf8 character set. The value might have a suffix in addition to the version number. See the description of the version system variable in Section 5.1.4, “Server System Variables”.

    This function is unsafe for statement-based replication. In MySQL 5.7, a warning is logged if you use this function when binlog_format is set to STATEMENT. (Bug #47995)

    mysql> SELECT VERSION();
            -> '5.7.3-standard'
    

12.15. Functions Used with Global Transaction IDs

The functions described in this section are used with GTID-based replication. For more information about GTIDs and how these functions are used, see Section 16.1.3, “Replication with Global Transaction Identifiers”.

Note

The union of two GTID sets is simply their string representations joined together with an interposed comma. In other words, you can define a very simple function for obtaining the union of two GTID sets like this:

CREATE FUNCTION GTID_UNION(g1 TEXT, g2 TEXT) 
    RETURNS TEXT DETERMINISTIC
RETURN CONCAT(g1,',',g2);  

Table 12.19. GTID Functions

NameDescription
GTID_SUBSET()Return true if all GTIDs in subset are also in set; otherwise false.
GTID_SUBTRACT()Return all GTIDs in set that are not in subset.
WAIT_UNTIL_SQL_THREAD_AFTER_GTIDS()Wait until the slave SQL thread has executed all the given GTIDs. Returns: the number of events that were executed (or NULL, if GTID mode is not enabled).

  • GTID_SUBSET(subset,set)

    Given two sets of global transaction IDs subset and set, returns true if all GTIDs in subset are also in set. Returns false otherwise.

  • GTID_SUBTRACT(set,subset)

    Given two sets of global transaction IDs subset and set, returns only those GTIDs from set that are not in subset.

  • WAIT_UNTIL_SQL_THREAD_AFTER_GTIDS(gtid_set[, timeout])

    Wait until the slave SQL thread has executed all of the transactions whose global transaction identifiers are contained in gtid_set (see Section 16.1.3.1, “GTID Concepts”, for a definition of GTID sets), or until timeout seconds have elapsed, whichever occurs first. timeout is optional; the default timeout is 0 seconds, in which case the master simply waits until all of the transactions in the GTID set have been executed.

    For more information, see Section 16.1.3, “Replication with Global Transaction Identifiers”.

    mysql> SELECT WAIT_UNTIL_SQL_THREAD_AFTER_GTIDS('3E11FA47-71CA-11E1-9E33-C80AA9429562:1-5');
            -> 5
    

    The return value is the number of transactional events that were executed. If GTID-based replication is not active (that is, if the value of the gtid_mode variable is OFF), then this value is undefined and thus WAIT_UNTIL_SQL_THREAD_AFTER_GTIDS() returns NULL.

12.16. Miscellaneous Functions

Table 12.20. Miscellaneous Functions

NameDescription
DEFAULT()Return the default value for a table column
GET_LOCK()Get a named lock
INET_ATON()Return the numeric value of an IP address
INET_NTOA()Return the IP address from a numeric value
INET6_ATON()Return the numeric value of an IPv6 address
INET6_NTOA()Return the IPv6 address from a numeric value
IS_FREE_LOCK()Checks whether the named lock is free
IS_IPV4_COMPAT()Return true if argument is an IPv4-compatible address
IS_IPV4_MAPPED()Return true if argument is an IPv4-mapped address
IS_IPV4()Return true if argument is an IPv4 address
IS_IPV6()Return true if argument is an IPv6 address
IS_USED_LOCK()Checks whether the named lock is in use. Return connection identifier if true.
MASTER_POS_WAIT()Block until the slave has read and applied all updates up to the specified position
NAME_CONST()Causes the column to have the given name
RAND()Return a random floating-point value
RELEASE_LOCK()Releases the named lock
SLEEP()Sleep for a number of seconds
UUID_SHORT()Return an integer-valued universal identifier
UUID()Return a Universal Unique Identifier (UUID)
VALUES()Defines the values to be used during an INSERT

  • DEFAULT(col_name)

    Returns the default value for a table column. An error results if the column has no default value.

    mysql> UPDATE t SET i = DEFAULT(i)+1 WHERE id < 100;
    
  • FORMAT(X,D)

    Formats the number X to a format like '#,###,###.##', rounded to D decimal places, and returns the result as a string. For details, see Section 12.5, “String Functions”.

  • GET_LOCK(str,timeout)

    Tries to obtain a lock with a name given by the string str, using a timeout of timeout seconds. Returns 1 if the lock was obtained successfully, 0 if the attempt timed out (for example, because another client has previously locked the name), or NULL if an error occurred (such as running out of memory or the thread was killed with mysqladmin kill). If you have a lock obtained with GET_LOCK(), it is released when you execute RELEASE_LOCK(), execute a new GET_LOCK(), or your connection terminates (either normally or abnormally). Locks obtained with GET_LOCK() do not interact with transactions. That is, committing a transaction does not release any such locks obtained during the transaction.

    This function can be used to implement application locks or to simulate record locks. Names are locked on a server-wide basis. If a name has been locked by one client, GET_LOCK() blocks any request by another client for a lock with the same name. This enables clients that agree on a given lock name to use the name to perform cooperative advisory locking. But be aware that it also enables a client that is not among the set of cooperating clients to lock a name, either inadvertently or deliberately, and thus prevent any of the cooperating clients from locking that name. One way to reduce the likelihood of this is to use lock names that are database-specific or application-specific. For example, use lock names of the form db_name.str or app_name.str.

    mysql> SELECT GET_LOCK('lock1',10);
            -> 1
    mysql> SELECT IS_FREE_LOCK('lock2');
            -> 1
    mysql> SELECT GET_LOCK('lock2',10);
            -> 1
    mysql> SELECT RELEASE_LOCK('lock2');
            -> 1
    mysql> SELECT RELEASE_LOCK('lock1');
            -> NULL
    

    The second RELEASE_LOCK() call returns NULL because the lock 'lock1' was automatically released by the second GET_LOCK() call.

    If multiple clients are waiting for a lock, the order in which they will acquire it is undefined and depends on factors such as the thread library in use. In particular, applications should not assume that clients will acquire the lock in the same order that they issued the lock requests.

    This function is unsafe for statement-based replication. In MySQL 5.7, a warning is logged if you use this function when binlog_format is set to STATEMENT. (Bug #47995)

  • INET_ATON(expr)

    Given the dotted-quad representation of an IPv4 network address as a string, returns an integer that represents the numeric value of the address in network byte order (big endian). INET_ATON() returns NULL if it does not understand its argument.

    mysql> SELECT INET_ATON('10.0.5.9');
            -> 167773449
    

    For this example, the return value is calculated as 10×2563 + 0×2562 + 5×256 + 9.

    INET_ATON() may or may not return a non-NULL result for short-form IP addresses (such as '127.1' as a representation of '127.0.0.1'). Because of this, INET_ATON()a should not be used for such addresses.

    Note

    To store values generated by INET_ATON(), use an INT UNSIGNED column rather than INT, which is signed. If you use a signed column, values corresponding to IP addresses for which the first octet is greater than 127 cannot be stored correctly. See Section 11.2.6, “Out-of-Range and Overflow Handling”.

  • INET_NTOA(expr)

    Given a numeric IPv4 network address in network byte order, returns the dotted-quad string representation of the address as a nonbinary string in the connection character set. INET_NTOA() returns NULL if it does not understand its argument.

    mysql> SELECT INET_NTOA(167773449);
            -> '10.0.5.9'
    
  • INET6_ATON(expr)

    Given an IPv6 or IPv4 network address as a string, returns a binary string that represents the numeric value of the address in network byte order (big endian). Because numeric-format IPv6 addresses require more bytes than the largest integer type, the representation returned by this function has the VARBINARY data type: VARBINARY(16) for IPv6 addresses and VARBINARY(4) for IPv4 addresses. If the argument is not a valid address, INET6_ATON() returns NULL.

    The following examples use HEX() to display the INET6_ATON() result in printable form:

    mysql> SELECT HEX(INET6_ATON('fdfe::5a55:caff:fefa:9089'));
            -> 'FDFE0000000000005A55CAFFFEFA9089'
    mysql> SELECT HEX(INET6_ATON('10.0.5.9'));
            -> '0A000509'
    

    INET6_ATON() observes several constraints on valid arguments. These are given in the following list along with examples.

    • A trailing zone ID is not permitted, as in fe80::3%1 or fe80::3%eth0.

    • A trailing network mask is not permitted, as in 2001:45f:3:ba::/64 or 192.168.1.0/24.

    • For values representing IPv4 addresses, only classless addresses are supported. Classful addresses such as 192.168.1 are rejected. A trailing port number is not permitted, as in 192.168.1.2:8080. Hexadecimal numbers in address components are not permitted, as in 192.0xa0.1.2. Octal numbers are not supported: 192.168.010.1 is treated as 192.168.10.1, not 192.168.8.1. These IPv4 constraints also apply to IPv6 addresses that have IPv4 address parts, such as IPv4-compatible or IPv4-mapped addresses.

    To convert an IPv4 address expr represented in numeric form as an INT value to an IPv6 address represented in numeric form as a VARBINARY value, use this expression:

    INET6_ATON(INET_NTOA(expr))
    

    For example:

    mysql> SELECT HEX(INET6_ATON(INET_NTOA(167773449)));
            -> '0A000509'
    
  • INET6_NTOA(expr)

    Given an IPv6 or IPv4 network address represented in numeric form as a binary string, returns the string representation of the address as a nonbinary string in the connection character set. If the argument is not a valid address, INET6_NTOA() returns NULL.

    INET6_NTOA() has these properties:

    • It does not use operating system functions to perform conversions, thus the output string is platform independent.

    • The return string has a maximum length of 39 (4 x 8 + 7). Given this statement:

      CREATE TABLE t AS SELECT INET6_NTOA(expr) AS c1;
      

      The resulting table would have this definition:

      CREATE TABLE t (c1 VARCHAR(39) CHARACTER SET utf8 DEFAULT NULL);
    • The return string uses lowercase letters for IPv6 addresses.

    mysql> SELECT INET6_NTOA(INET6_ATON('fdfe::5a55:caff:fefa:9089'));
            -> 'fdfe::5a55:caff:fefa:9089'
    mysql> SELECT INET6_NTOA(INET6_ATON('10.0.5.9'));
            -> '10.0.5.9'
    
    mysql> SELECT INET6_NTOA(UNHEX('FDFE0000000000005A55CAFFFEFA9089'));
            -> 'fdfe::5a55:caff:fefa:9089'
    mysql> SELECT INET6_NTOA(UNHEX('0A000509'));
            -> '10.0.5.9'
    
  • IS_FREE_LOCK(str)

    Checks whether the lock named str is free to use (that is, not locked). Returns 1 if the lock is free (no one is using the lock), 0 if the lock is in use, and NULL if an error occurs (such as an incorrect argument).

    This function is unsafe for statement-based replication. In MySQL 5.7, a warning is logged if you use this function when binlog_format is set to STATEMENT. (Bug #47995)

  • IS_IPV4(expr)

    Returns 1 if the argument is a valid IPv4 address specified as a string, 0 otherwise.

    mysql> SELECT IS_IPV4('10.0.5.9'), IS_IPV4('10.0.5.256');
            -> 1, 0
    

    For a given argument, if IS_IPV4() returns 1, INET_ATON() (and INET6_ATON()) will return non-NULL. The converse statement is not true: In some cases, INET_ATON() returns non-NULL when IS_IPV4() returns 0.

    As implied by the preceding remarks, IS_IPV4() is more strict than INET_ATON() about what constitutes a valid IPv4 address, so it may be useful for applications that need to perform strong checks against invalid values. Alternatively, use INET6_ATON() to convert IPv4 addresses to internal form and check for a NULL result (which indicates an invalid address). INET6_ATON() is equally strong as IS_IPV4() about checking IPv4 addresses.

  • IS_IPV4_COMPAT(expr)

    This function takes an IPv6 address represented in numeric form as a binary string, as returned by INET6_ATON(). It returns 1 if the argument is a valid IPv4-compatible IPv6 address, 0 otherwise. IPv4-compatible addresses have the form ::ipv4_address.

    mysql> SELECT IS_IPV4_COMPAT(INET6_ATON('::10.0.5.9'));
            -> 1
    mysql> SELECT IS_IPV4_COMPAT(INET6_ATON('::ffff:10.0.5.9'));
            -> 0
    

    The IPv4 part of an IPv4-compatible address can also be represented using hexadecimal notation. For example, 192.168.0.1 has this raw hexadecimal value:

    mysql> SELECT HEX(INET6_ATON('192.168.0.1'));
            -> 'C0A80001'
    

    Expressed in IPv4-compatible form, ::192.168.0.1 is equivalent to ::c0a8:0001 or (without leading zeros) ::c0a8:1

    mysql> SELECT
        ->   IS_IPV4_COMPAT(INET6_ATON('::192.168.0.1')),
        ->   IS_IPV4_COMPAT(INET6_ATON('::c0a8:0001')),
        ->   IS_IPV4_COMPAT(INET6_ATON('::c0a8:1'));
            -> 1, 1, 1
    
  • IS_IPV4_MAPPED(expr)

    This function takes an IPv6 address represented in numeric form as a binary string, as returned by INET6_ATON(). It returns 1 if the argument is a valid IPv4-mapped IPv6 address, 0 otherwise. IPv4-mapped addresses have the form ::ffff:ipv4_address.

    mysql> SELECT IS_IPV4_MAPPED(INET6_ATON('::10.0.5.9'));
            -> 0
    mysql> SELECT IS_IPV4_MAPPED(INET6_ATON('::ffff:10.0.5.9'));
            -> 1
    

    As with IS_IPV4_COMPAT() the IPv4 part of an IPv4-mapped address can also be represented using hexadecimal notation:

    mysql> SELECT
        ->   IS_IPV4_MAPPED(INET6_ATON('::ffff:192.168.0.1')),
        ->   IS_IPV4_MAPPED(INET6_ATON('::ffff:c0a8:0001')),
        ->   IS_IPV4_MAPPED(INET6_ATON('::ffff:c0a8:1'));
            -> 1, 1, 1
    
  • IS_IPV6(expr)

    Returns 1 if the argument is a valid IPv6 address specified as a string, 0 otherwise. This function does not consider IPv4 addresses to be valid IPv6 addresses.

    mysql> SELECT IS_IPV6('10.0.5.9'), IS_IPV6('::1');
            -> 0, 1
    

    For a given argument, if IS_IPV6() returns 1, INET6_ATON() will return non-NULL.

  • IS_USED_LOCK(str)

    Checks whether the lock named str is in use (that is, locked). If so, it returns the connection identifier of the client that holds the lock. Otherwise, it returns NULL.

    This function is unsafe for statement-based replication. In MySQL 5.7, a warning is logged if you use this function when binlog_format is set to STATEMENT. (Bug #47995)

  • MASTER_POS_WAIT(log_name,log_pos[,timeout])

    This function is useful for control of master/slave synchronization. It blocks until the slave has read and applied all updates up to the specified position in the master log. The return value is the number of log events the slave had to wait for to advance to the specified position. The function returns NULL if the slave SQL thread is not started, the slave's master information is not initialized, the arguments are incorrect, or an error occurs. It returns -1 if the timeout has been exceeded. If the slave SQL thread stops while MASTER_POS_WAIT() is waiting, the function returns NULL. If the slave is past the specified position, the function returns immediately.

    If a timeout value is specified, MASTER_POS_WAIT() stops waiting when timeout seconds have elapsed. timeout must be greater than 0; a zero or negative timeout means no timeout.

    This function is unsafe for statement-based replication. In MySQL 5.7, a warning is logged if you use this function when binlog_format is set to STATEMENT. (Bug #47995)

  • NAME_CONST(name,value)

    Returns the given value. When used to produce a result set column, NAME_CONST() causes the column to have the given name. The arguments should be constants.

    mysql> SELECT NAME_CONST('myname', 14);
    +--------+
    | myname |
    +--------+
    |     14 |
    +--------+
    

    This function is for internal use only. The server uses it when writing statements from stored programs that contain references to local program variables, as described in Section 18.7, “Binary Logging of Stored Programs”, You might see this function in the output from mysqlbinlog.

  • RELEASE_LOCK(str)

    Releases the lock named by the string str that was obtained with GET_LOCK(). Returns 1 if the lock was released, 0 if the lock was not established by this thread (in which case the lock is not released), and NULL if the named lock did not exist. The lock does not exist if it was never obtained by a call to GET_LOCK() or if it has previously been released.

    The DO statement is convenient to use with RELEASE_LOCK(). See Section 13.2.3, “DO Syntax”.

    This function is unsafe for statement-based replication. In MySQL 5.7, a warning is logged if you use this function when binlog_format is set to STATEMENT. (Bug #47995)

  • SLEEP(duration)

    Sleeps (pauses) for the number of seconds given by the duration argument, then returns 0. If SLEEP() is interrupted, it returns 1. The duration may have a fractional part. If the argument is NULL or negative, SLEEP() produces a warning, or an error in strict SQL mode.

    This function is unsafe for statement-based replication. In MySQL 5.7, a warning is logged if you use this function when binlog_format is set to STATEMENT. (Bug #47995)

  • UUID()

    Returns a Universal Unique Identifier (UUID) generated according to DCE 1.1: Remote Procedure Call (Appendix A) CAE (Common Applications Environment) Specifications published by The Open Group in October 1997 (Document Number C706, http://www.opengroup.org/public/pubs/catalog/c706.htm).

    A UUID is designed as a number that is globally unique in space and time. Two calls to UUID() are expected to generate two different values, even if these calls are performed on two separate computers that are not connected to each other.

    A UUID is a 128-bit number represented by a utf8 string of five hexadecimal numbers in aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee format:

    • The first three numbers are generated from a timestamp.

    • The fourth number preserves temporal uniqueness in case the timestamp value loses monotonicity (for example, due to daylight saving time).

    • The fifth number is an IEEE 802 node number that provides spatial uniqueness. A random number is substituted if the latter is not available (for example, because the host computer has no Ethernet card, or we do not know how to find the hardware address of an interface on your operating system). In this case, spatial uniqueness cannot be guaranteed. Nevertheless, a collision should have very low probability.

      Currently, the MAC address of an interface is taken into account only on FreeBSD and Linux. On other operating systems, MySQL uses a randomly generated 48-bit number.

    mysql> SELECT UUID();
            -> '6ccd780c-baba-1026-9564-0040f4311e29'
    
    Warning

    Although UUID() values are intended to be unique, they are not necessarily unguessable or unpredictable. If unpredictability is required, UUID values should be generated some other way.

    Note

    UUID() does not work with statement-based replication.

  • UUID_SHORT()

    Returns a short universal identifier as a 64-bit unsigned integer (rather than a string-form 128-bit identifier as returned by the UUID() function).

    The value of UUID_SHORT() is guaranteed to be unique if the following conditions hold:

    • The server_id of the current host is unique among your set of master and slave servers

    • server_id is between 0 and 255

    • You do not set back your system time for your server between mysqld restarts

    • You do not invoke UUID_SHORT() on average more than 16 million times per second between mysqld restarts

    The UUID_SHORT() return value is constructed this way:

      (server_id & 255) << 56
    + (server_startup_time_in_seconds << 24)
    + incremented_variable++;
    mysql> SELECT UUID_SHORT();
            -> 92395783831158784
    

    Note that UUID_SHORT() does not work with statement-based replication.

  • VALUES(col_name)

    In an INSERT ... ON DUPLICATE KEY UPDATE statement, you can use the VALUES(col_name) function in the UPDATE clause to refer to column values from the INSERT portion of the statement. In other words, VALUES(col_name) in the UPDATE clause refers to the value of col_name that would be inserted, had no duplicate-key conflict occurred. This function is especially useful in multiple-row inserts. The VALUES() function is meaningful only in the ON DUPLICATE KEY UPDATE clause of INSERT statements and returns NULL otherwise. See Section 13.2.5.3, “INSERT ... ON DUPLICATE KEY UPDATE Syntax”.

    mysql> INSERT INTO table (a,b,c) VALUES (1,2,3),(4,5,6)
        -> ON DUPLICATE KEY UPDATE c=VALUES(a)+VALUES(b);
    

12.17. Functions and Modifiers for Use with GROUP BY Clauses

12.17.1. GROUP BY (Aggregate) Functions

Table 12.21. Aggregate (GROUP BY) Functions

NameDescription
AVG()Return the average value of the argument
BIT_AND()Return bitwise and
BIT_OR()Return bitwise or
BIT_XOR()Return bitwise xor
COUNT(DISTINCT)Return the count of a number of different values
COUNT()Return a count of the number of rows returned
GROUP_CONCAT()Return a concatenated string
MAX()Return the maximum value
MIN()Return the minimum value
STD()Return the population standard deviation
STDDEV_POP()Return the population standard deviation
STDDEV_SAMP()Return the sample standard deviation
STDDEV()Return the population standard deviation
SUM()Return the sum
VAR_POP()Return the population standard variance
VAR_SAMP()Return the sample variance
VARIANCE()Return the population standard variance

This section describes group (aggregate) functions that operate on sets of values. Unless otherwise stated, group functions ignore NULL values.

If you use a group function in a statement containing no GROUP BY clause, it is equivalent to grouping on all rows. For more information, see Section 12.17.3, “MySQL Extensions to GROUP BY.

For numeric arguments, the variance and standard deviation functions return a DOUBLE value. The SUM() and AVG() functions return a DECIMAL value for exact-value arguments (integer or DECIMAL), and a DOUBLE value for approximate-value arguments (FLOAT or DOUBLE).

The SUM() and AVG() aggregate functions do not work with temporal values. (They convert the values to numbers, losing everything after the first nonnumeric character.) To work around this problem, convert to numeric units, perform the aggregate operation, and convert back to a temporal value. Examples:

SELECT SEC_TO_TIME(SUM(TIME_TO_SEC(time_col))) FROM tbl_name;
SELECT FROM_DAYS(SUM(TO_DAYS(date_col))) FROM tbl_name;

Functions such as SUM() or AVG() that expect a numeric argument cast the argument to a number if necessary. For SET or ENUM values, the cast operation causes the underlying numeric value to be used.

  • AVG([DISTINCT] expr)

    Returns the average value of expr. The DISTINCT option can be used to return the average of the distinct values of expr.

    AVG() returns NULL if there were no matching rows.

    mysql> SELECT student_name, AVG(test_score)
        ->        FROM student
        ->        GROUP BY student_name;
    
  • BIT_AND(expr)

    Returns the bitwise AND of all bits in expr. The calculation is performed with 64-bit (BIGINT) precision.

    This function returns 18446744073709551615 if there were no matching rows. (This is the value of an unsigned BIGINT value with all bits set to 1.)

  • BIT_OR(expr)

    Returns the bitwise OR of all bits in expr. The calculation is performed with 64-bit (BIGINT) precision.

    This function returns 0 if there were no matching rows.

  • BIT_XOR(expr)

    Returns the bitwise XOR of all bits in expr. The calculation is performed with 64-bit (BIGINT) precision.

    This function returns 0 if there were no matching rows.

  • COUNT(expr)

    Returns a count of the number of non-NULL values of expr in the rows retrieved by a SELECT statement. The result is a BIGINT value.

    COUNT() returns 0 if there were no matching rows.

    mysql> SELECT student.student_name,COUNT(*)
        ->        FROM student,course
        ->        WHERE student.student_id=course.student_id
        ->        GROUP BY student_name;
    

    COUNT(*) is somewhat different in that it returns a count of the number of rows retrieved, whether or not they contain NULL values.

    COUNT(*) is optimized to return very quickly if the SELECT retrieves from one table, no other columns are retrieved, and there is no WHERE clause. For example:

    mysql> SELECT COUNT(*) FROM student;
    

    This optimization applies only to MyISAM tables only, because an exact row count is stored for this storage engine and can be accessed very quickly. For transactional storage engines such as InnoDB, storing an exact row count is more problematic because multiple transactions may be occurring, each of which may affect the count.

  • COUNT(DISTINCT expr,[expr...])

    Returns a count of the number of rows with different non-NULL expr values.

    COUNT(DISTINCT) returns 0 if there were no matching rows.

    mysql> SELECT COUNT(DISTINCT results) FROM student;
    

    In MySQL, you can obtain the number of distinct expression combinations that do not contain NULL by giving a list of expressions. In standard SQL, you would have to do a concatenation of all expressions inside COUNT(DISTINCT ...).

  • GROUP_CONCAT(expr)

    This function returns a string result with the concatenated non-NULL values from a group. It returns NULL if there are no non-NULL values. The full syntax is as follows:

    GROUP_CONCAT([DISTINCT] expr [,expr ...]
                 [ORDER BY {unsigned_integer | col_name | expr}
                     [ASC | DESC] [,col_name ...]]
                 [SEPARATOR str_val])
    
    mysql> SELECT student_name,
        ->     GROUP_CONCAT(test_score)
        ->     FROM student
        ->     GROUP BY student_name;
    

    Or:

    mysql> SELECT student_name,
        ->     GROUP_CONCAT(DISTINCT test_score
        ->               ORDER BY test_score DESC SEPARATOR ' ')
        ->     FROM student
        ->     GROUP BY student_name;
    

    In MySQL, you can get the concatenated values of expression combinations. To eliminate duplicate values, use the DISTINCT clause. To sort values in the result, use the ORDER BY clause. To sort in reverse order, add the DESC (descending) keyword to the name of the column you are sorting by in the ORDER BY clause. The default is ascending order; this may be specified explicitly using the ASC keyword. The default separator between values in a group is comma (,). To specify a separator explicitly, use SEPARATOR followed by the string literal value that should be inserted between group values. To eliminate the separator altogether, specify SEPARATOR ''.

    The result is truncated to the maximum length that is given by the group_concat_max_len system variable, which has a default value of 1024. The value can be set higher, although the effective maximum length of the return value is constrained by the value of max_allowed_packet. The syntax to change the value of group_concat_max_len at runtime is as follows, where val is an unsigned integer:

    SET [GLOBAL | SESSION] group_concat_max_len = val;
    

    The return value is a nonbinary or binary string, depending on whether the arguments are nonbinary or binary strings. The result type is TEXT or BLOB unless group_concat_max_len is less than or equal to 512, in which case the result type is VARCHAR or VARBINARY.

    See also CONCAT() and CONCAT_WS(): Section 12.5, “String Functions”.

  • MAX([DISTINCT] expr)

    Returns the maximum value of expr. MAX() may take a string argument; in such cases, it returns the maximum string value. See Section 8.3.1, “How MySQL Uses Indexes”. The DISTINCT keyword can be used to find the maximum of the distinct values of expr, however, this produces the same result as omitting DISTINCT.

    MAX() returns NULL if there were no matching rows.

    mysql> SELECT student_name, MIN(test_score), MAX(test_score)
        ->        FROM student
        ->        GROUP BY student_name;
    

    For MAX(), MySQL currently compares ENUM and SET columns by their string value rather than by the string's relative position in the set. This differs from how ORDER BY compares them. This is expected to be rectified in a future MySQL release.

  • MIN([DISTINCT] expr)

    Returns the minimum value of expr. MIN() may take a string argument; in such cases, it returns the minimum string value. See Section 8.3.1, “How MySQL Uses Indexes”. The DISTINCT keyword can be used to find the minimum of the distinct values of expr, however, this produces the same result as omitting DISTINCT.

    MIN() returns NULL if there were no matching rows.

    mysql> SELECT student_name, MIN(test_score), MAX(test_score)
        ->        FROM student
        ->        GROUP BY student_name;
    

    For MIN(), MySQL currently compares ENUM and SET columns by their string value rather than by the string's relative position in the set. This differs from how ORDER BY compares them. This is expected to be rectified in a future MySQL release.

  • STD(expr)

    Returns the population standard deviation of expr. This is an extension to standard SQL. The standard SQL function STDDEV_POP() can be used instead.

    This function returns NULL if there were no matching rows.

  • STDDEV(expr)

    Returns the population standard deviation of expr. This function is provided for compatibility with Oracle. The standard SQL function STDDEV_POP() can be used instead.

    This function returns NULL if there were no matching rows.

  • STDDEV_POP(expr)

    Returns the population standard deviation of expr (the square root of VAR_POP()). You can also use STD() or STDDEV(), which are equivalent but not standard SQL.

    STDDEV_POP() returns NULL if there were no matching rows.

  • STDDEV_SAMP(expr)

    Returns the sample standard deviation of expr (the square root of VAR_SAMP().

    STDDEV_SAMP() returns NULL if there were no matching rows.

  • SUM([DISTINCT] expr)

    Returns the sum of expr. If the return set has no rows, SUM() returns NULL. The DISTINCT keyword can be used to sum only the distinct values of expr.

    SUM() returns NULL if there were no matching rows.

  • VAR_POP(expr)

    Returns the population standard variance of expr. It considers rows as the whole population, not as a sample, so it has the number of rows as the denominator. You can also use VARIANCE(), which is equivalent but is not standard SQL.

    VAR_POP() returns NULL if there were no matching rows.

  • VAR_SAMP(expr)

    Returns the sample variance of expr. That is, the denominator is the number of rows minus one.

    VAR_SAMP() returns NULL if there were no matching rows.

  • VARIANCE(expr)

    Returns the population standard variance of expr. This is an extension to standard SQL. The standard SQL function VAR_POP() can be used instead.

    VARIANCE() returns NULL if there were no matching rows.

12.17.2. GROUP BY Modifiers

The GROUP BY clause permits a WITH ROLLUP modifier that causes extra rows to be added to the summary output. These rows represent higher-level (or super-aggregate) summary operations. ROLLUP thus enables you to answer questions at multiple levels of analysis with a single query. It can be used, for example, to provide support for OLAP (Online Analytical Processing) operations.

Suppose that a table named sales has year, country, product, and profit columns for recording sales profitability:

CREATE TABLE sales
(
    year    INT NOT NULL,
    country VARCHAR(20) NOT NULL,
    product VARCHAR(32) NOT NULL,
    profit  INT
);

The table's contents can be summarized per year with a simple GROUP BY like this:

mysql> SELECT year, SUM(profit) FROM sales GROUP BY year;
+------+-------------+
| year | SUM(profit) |
+------+-------------+
| 2000 |        4525 |
| 2001 |        3010 |
+------+-------------+

This output shows the total profit for each year, but if you also want to determine the total profit summed over all years, you must add up the individual values yourself or run an additional query.

Or you can use ROLLUP, which provides both levels of analysis with a single query. Adding a WITH ROLLUP modifier to the GROUP BY clause causes the query to produce another row that shows the grand total over all year values:

mysql> SELECT year, SUM(profit) FROM sales GROUP BY year WITH ROLLUP;
+------+-------------+
| year | SUM(profit) |
+------+-------------+
| 2000 |        4525 |
| 2001 |        3010 |
| NULL |        7535 |
+------+-------------+

The grand total super-aggregate line is identified by the value NULL in the year column.

ROLLUP has a more complex effect when there are multiple GROUP BY columns. In this case, each time there is a break (change in value) in any but the last grouping column, the query produces an extra super-aggregate summary row.

For example, without ROLLUP, a summary on the sales table based on year, country, and product might look like this:

mysql> SELECT year, country, product, SUM(profit)
    -> FROM sales
    -> GROUP BY year, country, product;
+------+---------+------------+-------------+
| year | country | product    | SUM(profit) |
+------+---------+------------+-------------+
| 2000 | Finland | Computer   |        1500 |
| 2000 | Finland | Phone      |         100 |
| 2000 | India   | Calculator |         150 |
| 2000 | India   | Computer   |        1200 |
| 2000 | USA     | Calculator |          75 |
| 2000 | USA     | Computer   |        1500 |
| 2001 | Finland | Phone      |          10 |
| 2001 | USA     | Calculator |          50 |
| 2001 | USA     | Computer   |        2700 |
| 2001 | USA     | TV         |         250 |
+------+---------+------------+-------------+

The output indicates summary values only at the year/country/product level of analysis. When ROLLUP is added, the query produces several extra rows:

mysql> SELECT year, country, product, SUM(profit)
    -> FROM sales
    -> GROUP BY year, country, product WITH ROLLUP;
+------+---------+------------+-------------+
| year | country | product    | SUM(profit) |
+------+---------+------------+-------------+
| 2000 | Finland | Computer   |        1500 |
| 2000 | Finland | Phone      |         100 |
| 2000 | Finland | NULL       |        1600 |
| 2000 | India   | Calculator |         150 |
| 2000 | India   | Computer   |        1200 |
| 2000 | India   | NULL       |        1350 |
| 2000 | USA     | Calculator |          75 |
| 2000 | USA     | Computer   |        1500 |
| 2000 | USA     | NULL       |        1575 |
| 2000 | NULL    | NULL       |        4525 |
| 2001 | Finland | Phone      |          10 |
| 2001 | Finland | NULL       |          10 |
| 2001 | USA     | Calculator |          50 |
| 2001 | USA     | Computer   |        2700 |
| 2001 | USA     | TV         |         250 |
| 2001 | USA     | NULL       |        3000 |
| 2001 | NULL    | NULL       |        3010 |
| NULL | NULL    | NULL       |        7535 |
+------+---------+------------+-------------+

For this query, adding ROLLUP causes the output to include summary information at four levels of analysis, not just one. Here is how to interpret the ROLLUP output:

  • Following each set of product rows for a given year and country, an extra summary row is produced showing the total for all products. These rows have the product column set to NULL.

  • Following each set of rows for a given year, an extra summary row is produced showing the total for all countries and products. These rows have the country and products columns set to NULL.

  • Finally, following all other rows, an extra summary row is produced showing the grand total for all years, countries, and products. This row has the year, country, and products columns set to NULL.

Other Considerations When using ROLLUP

The following items list some behaviors specific to the MySQL implementation of ROLLUP.

When you use ROLLUP, you cannot also use an ORDER BY clause to sort the results. In other words, ROLLUP and ORDER BY are mutually exclusive. However, you still have some control over sort order. GROUP BY in MySQL sorts results, and you can use explicit ASC and DESC keywords with columns named in the GROUP BY list to specify sort order for individual columns. (The higher-level summary rows added by ROLLUP still appear after the rows from which they are calculated, regardless of the sort order.)

LIMIT can be used to restrict the number of rows returned to the client. LIMIT is applied after ROLLUP, so the limit applies against the extra rows added by ROLLUP. For example:

mysql> SELECT year, country, product, SUM(profit)
    -> FROM sales
    -> GROUP BY year, country, product WITH ROLLUP
    -> LIMIT 5;
+------+---------+------------+-------------+
| year | country | product    | SUM(profit) |
+------+---------+------------+-------------+
| 2000 | Finland | Computer   |        1500 |
| 2000 | Finland | Phone      |         100 |
| 2000 | Finland | NULL       |        1600 |
| 2000 | India   | Calculator |         150 |
| 2000 | India   | Computer   |        1200 |
+------+---------+------------+-------------+

Using LIMIT with ROLLUP may produce results that are more difficult to interpret, because you have less context for understanding the super-aggregate rows.

The NULL indicators in each super-aggregate row are produced when the row is sent to the client. The server looks at the columns named in the GROUP BY clause following the leftmost one that has changed value. For any column in the result set with a name that is a lexical match to any of those names, its value is set to NULL. (If you specify grouping columns by column number, the server identifies which columns to set to NULL by number.)

Because the NULL values in the super-aggregate rows are placed into the result set at such a late stage in query processing, you cannot test them as NULL values within the query itself. For example, you cannot add HAVING product IS NULL to the query to eliminate from the output all but the super-aggregate rows.

On the other hand, the NULL values do appear as NULL on the client side and can be tested as such using any MySQL client programming interface.

MySQL permits a column that does not appear in the GROUP BY list to be named in the select list. In this case, the server is free to choose any value from this nonaggregated column in summary rows, and this includes the extra rows added by WITH ROLLUP. For example, in the following query, country is a nonaggregated column that does not appear in the GROUP BY list and values chosen for this column are indeterminate:

mysql> SELECT year, country, SUM(profit)
    -> FROM sales GROUP BY year WITH ROLLUP;
+------+---------+-------------+
| year | country | SUM(profit) |
+------+---------+-------------+
| 2000 | India   |        4525 |
| 2001 | USA     |        3010 |
| NULL | USA     |        7535 |
+------+---------+-------------+

This behavior occurs if the ONLY_FULL_GROUP_BY SQL mode is not enabled. If that mode is enabled, the server rejects the query as illegal because country is not listed in the GROUP BY clause. For more information about nonaggregated columns and GROUP BY, see Section 12.17.3, “MySQL Extensions to GROUP BY.

12.17.3. MySQL Extensions to GROUP BY

In standard SQL, a query that includes a GROUP BY clause cannot refer to nonaggregated columns in the select list that are not named in the GROUP BY clause. For example, this query is illegal in standard SQL because the name column in the select list does not appear in the GROUP BY:

SELECT o.custid, c.name, MAX(o.payment)
  FROM orders AS o, customers AS c
  WHERE o.custid = c.custid
  GROUP BY o.custid;

For the query to be legal, the name column must be omitted from the select list or named in the GROUP BY clause.

MySQL extends the use of GROUP BY so that the select list can refer to nonaggregated columns not named in the GROUP BY clause. This means that the preceding query is legal in MySQL. You can use this feature to get better performance by avoiding unnecessary column sorting and grouping. However, this is useful primarily when all values in each nonaggregated column not named in the GROUP BY are the same for each group. The server is free to choose any value from each group, so unless they are the same, the values chosen are indeterminate. Furthermore, the selection of values from each group cannot be influenced by adding an ORDER BY clause. Sorting of the result set occurs after values have been chosen, and ORDER BY does not affect which values within each group the server chooses.

A similar MySQL extension applies to the HAVING clause. In standard SQL, a query that includes a GROUP BY clause cannot refer to nonaggregated columns in the HAVING clause that are not named in the GROUP BY clause. A MySQL extension permits references to such columns to simplify calculations. This extension assumes that the nongrouped columns will have the same group-wise values. Otherwise, the result is indeterminate.

To disable the MySQL GROUP BY extension, enable the ONLY_FULL_GROUP_BY SQL mode. This enables standard SQL behavior: Columns not named in the GROUP BY clause cannot be used in the select list or HAVING clause unless enclosed in an aggregate function.

ONLY_FULL_GROUP_BY also affects use of aliases in the HAVING clauses. For example, the following query returns name values that occur only once in table orders:

SELECT name, COUNT(name) FROM orders
  GROUP BY name
  HAVING COUNT(name) = 1;

MySQL extends this behavior to permit the use of an alias in the HAVING clause for the aggregated column:

SELECT name, COUNT(name) AS c FROM orders
  GROUP BY name
  HAVING c = 1;

Enabling ONLY_FULL_GROUP_BY disables this MySQL extension and a non-grouping field 'c' is used in HAVING clause error occurs because the column c in the HAVING clause is not enclosed in an aggregate function (instead, it is an aggregate function).

The select list extension also applies to ORDER BY. That is, you can refer to nonaggregated columns in the ORDER BY clause that do not appear in the GROUP BY clause. (However, as mentioned previously, ORDER BY does not affect which values are chosen from nonaggregated columns; it only sorts them after they have been chosen.) This extension does not apply if the ONLY_FULL_GROUP_BY SQL mode is enabled.

In some cases, you can use MIN() and MAX() to obtain a specific column value even if it is not unique. If the sort column contains integers no larger than 6 digits, the following query gives the value of column from the row containing the smallest sort value:

SUBSTR(MIN(CONCAT(LPAD(sort,6,'0'),column)),7)

See Section 3.6.4, “The Rows Holding the Group-wise Maximum of a Certain Column”.

If you are trying to follow standard SQL, you cannot use expressions in GROUP BY clauses. As a workaround, use an alias for the expression:

SELECT id, FLOOR(value/100) AS val
  FROM tbl_name
  GROUP BY id, val;

MySQL permits expressions in GROUP BY clauses, so the alias is unnecessary:

SELECT id, FLOOR(value/100)
  FROM tbl_name
  GROUP BY id, FLOOR(value/100);

12.18. Spatial Extensions

MySQL supports spatial extensions to enable the generation, storage, and analysis of geographic features. These features are available for MyISAM, InnoDB, NDB, and ARCHIVE tables.

For spatial columns, MyISAM supports both SPATIAL and non-SPATIAL indexes. Other storage engines support non-SPATIAL indexes, as described in Section 13.1.11, “CREATE INDEX Syntax”.

This chapter covers the following topics:

  • The basis of these spatial extensions in the OpenGIS geometry model

  • Data formats for representing spatial data

  • How to use spatial data in MySQL

  • Use of indexing for spatial data

  • MySQL differences from the OpenGIS specification

Additional Resources

  • The Open Geospatial Consortium publishes the OpenGIS® Simple Features Specifications For SQL, a document that proposes several conceptual ways for extending an SQL RDBMS to support spatial data. This specification is available from the OGC Web site at http://www.opengis.org/docs/99-049.pdf.

  • If you have questions or concerns about the use of the spatial extensions to MySQL, you can discuss them in the GIS forum: http://forums.mysql.com/list.php?23.

12.18.1. Introduction to MySQL Spatial Support

MySQL implements spatial extensions following the specification of the Open Geospatial Consortium (OGC). This is an international consortium of more than 250 companies, agencies, and universities participating in the development of publicly available conceptual solutions that can be useful with all kinds of applications that manage spatial data. The OGC maintains a Web site at http://www.opengis.org/.

In 1997, the Open Geospatial Consortium published the OpenGIS® Simple Features Specifications For SQL, a document that proposes several conceptual ways for extending an SQL RDBMS to support spatial data. This specification is available from the OGC Web site at http://www.opengis.org/docs/99-049.pdf. It contains additional information relevant to this chapter.

MySQL implements a subset of the SQL with Geometry Types environment proposed by OGC. This term refers to an SQL environment that has been extended with a set of geometry types. A geometry-valued SQL column is implemented as a column that has a geometry type. The specification describe a set of SQL geometry types, as well as functions on those types to create and analyze geometry values.

A geographic feature is anything in the world that has a location. A feature can be:

  • An entity. For example, a mountain, a pond, a city.

  • A space. For example, town district, the tropics.

  • A definable location. For example, a crossroad, as a particular place where two streets intersect.

Some documents use the term geospatial feature to refer to geographic features.

Geometry is another word that denotes a geographic feature. Originally the word geometry meant measurement of the earth. Another meaning comes from cartography, referring to the geometric features that cartographers use to map the world.

This chapter uses all of these terms synonymously: geographic feature, geospatial feature, feature, or geometry. Here, the term most commonly used is geometry, defined as a point or an aggregate of points representing anything in the world that has a location.

12.18.2. The OpenGIS Geometry Model

The set of geometry types proposed by OGC's SQL with Geometry Types environment is based on the OpenGIS Geometry Model. In this model, each geometric object has the following general properties:

  • It is associated with a Spatial Reference System, which describes the coordinate space in which the object is defined.

  • It belongs to some geometry class.

12.18.2.1. The Geometry Class Hierarchy

The geometry classes define a hierarchy as follows:

  • Geometry (noninstantiable)

    • Point (instantiable)

    • Curve (noninstantiable)

      • LineString (instantiable)

        • Line

        • LinearRing

    • Surface (noninstantiable)

      • Polygon (instantiable)

    • GeometryCollection (instantiable)

      • MultiPoint (instantiable)

      • MultiCurve (noninstantiable)

        • MultiLineString (instantiable)

      • MultiSurface (noninstantiable)

        • MultiPolygon (instantiable)

It is not possible to create objects in noninstantiable classes. It is possible to create objects in instantiable classes. All classes have properties, and instantiable classes may also have assertions (rules that define valid class instances).

Geometry is the base class. It is an abstract class. The instantiable subclasses of Geometry are restricted to zero-, one-, and two-dimensional geometric objects that exist in two-dimensional coordinate space. All instantiable geometry classes are defined so that valid instances of a geometry class are topologically closed (that is, all defined geometries include their boundary).

The base Geometry class has subclasses for Point, Curve, Surface, and GeometryCollection:

  • Point represents zero-dimensional objects.

  • Curve represents one-dimensional objects, and has subclass LineString, with sub-subclasses Line and LinearRing.

  • Surface is designed for two-dimensional objects and has subclass Polygon.

  • GeometryCollection has specialized zero-, one-, and two-dimensional collection classes named MultiPoint, MultiLineString, and MultiPolygon for modeling geometries corresponding to collections of Points, LineStrings, and Polygons, respectively. MultiCurve and MultiSurface are introduced as abstract superclasses that generalize the collection interfaces to handle Curves and Surfaces.

Geometry, Curve, Surface, MultiCurve, and MultiSurface are defined as noninstantiable classes. They define a common set of methods for their subclasses and are included for extensibility.

Point, LineString, Polygon, GeometryCollection, MultiPoint, MultiLineString, and MultiPolygon are instantiable classes.

12.18.2.2. Class Geometry

Geometry is the root class of the hierarchy. It is a noninstantiable class but has a number of properties that are common to all geometry values created from any of the Geometry subclasses. These properties are described in the following list. Particular subclasses have their own specific properties, described later.

Geometry Properties

A geometry value has the following properties:

  • Its type. Each geometry belongs to one of the instantiable classes in the hierarchy.

  • Its SRID, or Spatial Reference Identifier. This value identifies the geometry's associated Spatial Reference System that describes the coordinate space in which the geometry object is defined.

    In MySQL, the SRID value is just an integer associated with the geometry value. All calculations are done assuming Euclidean (planar) geometry.

  • Its coordinates in its Spatial Reference System, represented as double-precision (8-byte) numbers. All nonempty geometries include at least one pair of (X,Y) coordinates. Empty geometries contain no coordinates.

    Coordinates are related to the SRID. For example, in different coordinate systems, the distance between two objects may differ even when objects have the same coordinates, because the distance on the planar coordinate system and the distance on the geocentric system (coordinates on the Earth's surface) are different things.

  • Its interior, boundary, and exterior.

    Every geometry occupies some position in space. The exterior of a geometry is all space not occupied by the geometry. The interior is the space occupied by the geometry. The boundary is the interface between the geometry's interior and exterior.

  • Its MBR (Minimum Bounding Rectangle), or Envelope. This is the bounding geometry, formed by the minimum and maximum (X,Y) coordinates:

    ((MINX MINY, MAXX MINY, MAXX MAXY, MINX MAXY, MINX MINY))
  • Whether the value is simple or nonsimple. Geometry values of types (LineString, MultiPoint, MultiLineString) are either simple or nonsimple. Each type determines its own assertions for being simple or nonsimple.

  • Whether the value is closed or not closed. Geometry values of types (LineString, MultiString) are either closed or not closed. Each type determines its own assertions for being closed or not closed.

  • Whether the value is empty or nonempty A geometry is empty if it does not have any points. Exterior, interior, and boundary of an empty geometry are not defined (that is, they are represented by a NULL value). An empty geometry is defined to be always simple and has an area of 0.

  • Its dimension. A geometry can have a dimension of –1, 0, 1, or 2:

    • –1 for an empty geometry.

    • 0 for a geometry with no length and no area.

    • 1 for a geometry with nonzero length and zero area.

    • 2 for a geometry with nonzero area.

    Point objects have a dimension of zero. LineString objects have a dimension of 1. Polygon objects have a dimension of 2. The dimensions of MultiPoint, MultiLineString, and MultiPolygon objects are the same as the dimensions of the elements they consist of.

12.18.2.3. Class Point

A Point is a geometry that represents a single location in coordinate space.

Point Examples

  • Imagine a large-scale map of the world with many cities. A Point object could represent each city.

  • On a city map, a Point object could represent a bus stop.

Point Properties

  • X-coordinate value.

  • Y-coordinate value.

  • Point is defined as a zero-dimensional geometry.

  • The boundary of a Point is the empty set.

12.18.2.4. Class Curve

A Curve is a one-dimensional geometry, usually represented by a sequence of points. Particular subclasses of Curve define the type of interpolation between points. Curve is a noninstantiable class.

Curve Properties

  • A Curve has the coordinates of its points.

  • A Curve is defined as a one-dimensional geometry.

  • A Curve is simple if it does not pass through the same point twice.

  • A Curve is closed if its start point is equal to its endpoint.

  • The boundary of a closed Curve is empty.

  • The boundary of a nonclosed Curve consists of its two endpoints.

  • A Curve that is simple and closed is a LinearRing.

12.18.2.5. Class LineString

A LineString is a Curve with linear interpolation between points.

LineString Examples

  • On a world map, LineString objects could represent rivers.

  • In a city map, LineString objects could represent streets.

LineString Properties

  • A LineString has coordinates of segments, defined by each consecutive pair of points.

  • A LineString is a Line if it consists of exactly two points.

  • A LineString is a LinearRing if it is both closed and simple.

12.18.2.6. Class Surface

A Surface is a two-dimensional geometry. It is a noninstantiable class. Its only instantiable subclass is Polygon.

Surface Properties

  • A Surface is defined as a two-dimensional geometry.

  • The OpenGIS specification defines a simple Surface as a geometry that consists of a single patch that is associated with a single exterior boundary and zero or more interior boundaries.

  • The boundary of a simple Surface is the set of closed curves corresponding to its exterior and interior boundaries.

12.18.2.7. Class Polygon

A Polygon is a planar Surface representing a multisided geometry. It is defined by a single exterior boundary and zero or more interior boundaries, where each interior boundary defines a hole in the Polygon.

Polygon Examples

  • On a region map, Polygon objects could represent forests, districts, and so on.

Polygon Assertions

  • The boundary of a Polygon consists of a set of LinearRing objects (that is, LineString objects that are both simple and closed) that make up its exterior and interior boundaries.

  • A Polygon has no rings that cross. The rings in the boundary of a Polygon may intersect at a Point, but only as a tangent.

  • A Polygon has no lines, spikes, or punctures.

  • A Polygon has an interior that is a connected point set.

  • A Polygon may have holes. The exterior of a Polygon with holes is not connected. Each hole defines a connected component of the exterior.

The preceding assertions make a Polygon a simple geometry.

12.18.2.8. Class GeometryCollection

A GeometryCollection is a geometry that is a collection of one or more geometries of any class.

All the elements in a GeometryCollection must be in the same Spatial Reference System (that is, in the same coordinate system). There are no other constraints on the elements of a GeometryCollection, although the subclasses of GeometryCollection described in the following sections may restrict membership. Restrictions may be based on:

  • Element type (for example, a MultiPoint may contain only Point elements)

  • Dimension

  • Constraints on the degree of spatial overlap between elements

12.18.2.9. Class MultiPoint

A MultiPoint is a geometry collection composed of Point elements. The points are not connected or ordered in any way.

MultiPoint Examples

  • On a world map, a MultiPoint could represent a chain of small islands.

  • On a city map, a MultiPoint could represent the outlets for a ticket office.

MultiPoint Properties

  • A MultiPoint is a zero-dimensional geometry.

  • A MultiPoint is simple if no two of its Point values are equal (have identical coordinate values).

  • The boundary of a MultiPoint is the empty set.

12.18.2.10. Class MultiCurve

A MultiCurve is a geometry collection composed of Curve elements. MultiCurve is a noninstantiable class.

MultiCurve Properties

  • A MultiCurve is a one-dimensional geometry.

  • A MultiCurve is simple if and only if all of its elements are simple; the only intersections between any two elements occur at points that are on the boundaries of both elements.

  • A MultiCurve boundary is obtained by applying the mod 2 union rule (also known as the odd-even rule): A point is in the boundary of a MultiCurve if it is in the boundaries of an odd number of MultiCurve elements.

  • A MultiCurve is closed if all of its elements are closed.

  • The boundary of a closed MultiCurve is always empty.

12.18.2.11. Class MultiLineString

A MultiLineString is a MultiCurve geometry collection composed of LineString elements.

MultiLineString Examples

  • On a region map, a MultiLineString could represent a river system or a highway system.

12.18.2.12. Class MultiSurface

A MultiSurface is a geometry collection composed of surface elements. MultiSurface is a noninstantiable class. Its only instantiable subclass is MultiPolygon.

MultiSurface Assertions

  • Two MultiSurface surfaces have no interiors that intersect.

  • Two MultiSurface elements have boundaries that intersect at most at a finite number of points.

12.18.2.13. Class MultiPolygon

A MultiPolygon is a MultiSurface object composed of Polygon elements.

MultiPolygon Examples

  • On a region map, a MultiPolygon could represent a system of lakes.

MultiPolygon Assertions

  • A MultiPolygon has no two Polygon elements with interiors that intersect.

  • A MultiPolygon has no two Polygon elements that cross (crossing is also forbidden by the previous assertion), or that touch at an infinite number of points.

  • A MultiPolygon may not have cut lines, spikes, or punctures. A MultiPolygon is a regular, closed point set.

  • A MultiPolygon that has more than one Polygon has an interior that is not connected. The number of connected components of the interior of a MultiPolygon is equal to the number of Polygon values in the MultiPolygon.

MultiPolygon Properties

  • A MultiPolygon is a two-dimensional geometry.

  • A MultiPolygon boundary is a set of closed curves (LineString values) corresponding to the boundaries of its Polygon elements.

  • Each Curve in the boundary of the MultiPolygon is in the boundary of exactly one Polygon element.

  • Every Curve in the boundary of an Polygon element is in the boundary of the MultiPolygon.

12.18.3. Supported Spatial Data Formats

This section describes the standard spatial data formats that are used to represent geometry objects in queries. They are:

  • Well-Known Text (WKT) format

  • Well-Known Binary (WKB) format

Internally, MySQL stores geometry values in a format that is not identical to either WKT or WKB format.

12.18.3.1. Well-Known Text (WKT) Format

The Well-Known Text (WKT) representation of Geometry is designed to exchange geometry data in ASCII form. For a Backus-Naur grammar that specifies the formal production rules for writing WKT values, see the OpenGIS specification document referenced in Section 12.18, “Spatial Extensions”.

Examples of WKT representations of geometry objects:

  • A Point:

    POINT(15 20)

    Note that point coordinates are specified with no separating comma. This differs from the syntax for the SQL POINT() function, which requires a comma between the coordinates. Take care to use the syntax appropriate to the context of a given spatial operation. For example, the following statements both extract the X-coordinate from a Point object. The first produces the object directly using the POINT() function. The second uses a WKT representation converted to a Point with GeomFromText().

    mysql> SELECT X(POINT(15, 20));
    +------------------+
    | X(POINT(15, 20)) |
    +------------------+
    |               15 |
    +------------------+
    
    mysql> SELECT X(GeomFromText('POINT(15 20)'));
    +---------------------------------+
    | X(GeomFromText('POINT(15 20)')) |
    +---------------------------------+
    |                              15 |
    +---------------------------------+
    
  • A LineString with four points:

    LINESTRING(0 0, 10 10, 20 25, 50 60)

    Note that point coordinate pairs are separated by commas.

  • A Polygon with one exterior ring and one interior ring:

    POLYGON((0 0,10 0,10 10,0 10,0 0),(5 5,7 5,7 7,5 7, 5 5))
  • A MultiPoint with three Point values:

    MULTIPOINT(0 0, 20 20, 60 60)
  • A MultiLineString with two LineString values:

    MULTILINESTRING((10 10, 20 20), (15 15, 30 15))
  • A MultiPolygon with two Polygon values:

    MULTIPOLYGON(((0 0,10 0,10 10,0 10,0 0)),((5 5,7 5,7 7,5 7, 5 5)))
  • A GeometryCollection consisting of two Point values and one LineString:

    GEOMETRYCOLLECTION(POINT(10 10), POINT(30 30), LINESTRING(15 15, 20 20))

12.18.3.2. Well-Known Binary (WKB) Format

The Well-Known Binary (WKB) representation for geometric values is defined by the OpenGIS specification. It is also defined in the ISO SQL/MM Part 3: Spatial standard.

WKB is used to exchange geometry data as binary streams represented by BLOB values containing geometric WKB information.

WKB uses 1-byte unsigned integers, 4-byte unsigned integers, and 8-byte double-precision numbers (IEEE 754 format). A byte is eight bits.

For example, a WKB value that corresponds to POINT(1 1) consists of this sequence of 21 bytes (each represented here by two hex digits):

0101000000000000000000F03F000000000000F03F

The sequence may be broken down into these components:

Byte order : 01
WKB type   : 01000000
X          : 000000000000F03F
Y          : 000000000000F03F

Component representation is as follows:

  • The byte order may be either 1 or 0 to indicate little-endian or big-endian storage. The little-endian and big-endian byte orders are also known as Network Data Representation (NDR) and External Data Representation (XDR), respectively.

  • The WKB type is a code that indicates the geometry type. Values from 1 through 7 indicate Point, LineString, Polygon, MultiPoint, MultiLineString, MultiPolygon, and GeometryCollection.

  • A Point value has X and Y coordinates, each represented as a double-precision value.

WKB values for more complex geometry values are represented by more complex data structures, as detailed in the OpenGIS specification.

12.18.4. Creating a Spatially Enabled MySQL Database

This section describes the data types you can use for representing spatial data in MySQL, and the functions available for creating and retrieving spatial values.

12.18.4.1. MySQL Spatial Data Types

MySQL has data types that correspond to OpenGIS classes. Some of these types hold single geometry values:

  • GEOMETRY

  • POINT

  • LINESTRING

  • POLYGON

GEOMETRY can store geometry values of any type. The other single-value types (POINT, LINESTRING, and POLYGON) restrict their values to a particular geometry type.

The other data types hold collections of values:

  • MULTIPOINT

  • MULTILINESTRING

  • MULTIPOLYGON

  • GEOMETRYCOLLECTION

GEOMETRYCOLLECTION can store a collection of objects of any type. The other collection types (MULTIPOINT, MULTILINESTRING, MULTIPOLYGON, and GEOMETRYCOLLECTION) restrict collection members to those having a particular geometry type.

12.18.4.2. Creating Spatial Values

This section describes how to create spatial values using Well-Known Text and Well-Known Binary functions that are defined in the OpenGIS standard, and using MySQL-specific functions.

12.18.4.2.1. Creating Geometry Values Using WKT Functions

MySQL provides a number of functions that take as arguments a Well-Known Text representation and, optionally, a spatial reference system identifier (SRID). They return the corresponding geometry.

GeomFromText() accepts a WKT of any geometry type as its first argument. An implementation also provides type-specific construction functions for construction of geometry values of each geometry type.

The OpenGIS specification also defines the following optional functions, which MySQL does not implement. These functions construct Polygon or MultiPolygon values based on the WKT representation of a collection of rings or closed LineString values. These values may intersect.

  • BdMPolyFromText(wkt,srid)

    Constructs a MultiPolygon value from a MultiLineString value in WKT format containing an arbitrary collection of closed LineString values.

  • BdPolyFromText(wkt,srid)

    Constructs a Polygon value from a MultiLineString value in WKT format containing an arbitrary collection of closed LineString values.

12.18.4.2.2. Creating Geometry Values Using WKB Functions

MySQL provides a number of functions that take as arguments a BLOB containing a Well-Known Binary representation and, optionally, a spatial reference system identifier (SRID). They return the corresponding geometry.

These functions also accept geometry objects for compatibility with the return value of the functions in Section 12.18.4.2.3, “Creating Geometry Values Using MySQL-Specific Functions”. Thus, those functions may be used to provide the first argument to the functions in this section.

The OpenGIS specification also describes optional functions for constructing Polygon or MultiPolygon values based on the WKB representation of a collection of rings or closed LineString values. These values may intersect. MySQL does not implement these functions:

  • BdMPolyFromWKB(wkb,srid)

    Constructs a MultiPolygon value from a MultiLineString value in WKB format containing an arbitrary collection of closed LineString values.

  • BdPolyFromWKB(wkb,srid)

    Constructs a Polygon value from a MultiLineString value in WKB format containing an arbitrary collection of closed LineString values.

12.18.4.2.3. Creating Geometry Values Using MySQL-Specific Functions

MySQL provides a set of useful nonstandard functions for creating geometry values. The functions described in this section are MySQL extensions to the OpenGIS specification.

These functions produce geometry objects from either WKB values or geometry objects as arguments. If any argument is not a proper WKB or geometry representation of the proper object type, the return value is NULL.

For example, you can insert the geometry return value from Point() directly into a Point column:

INSERT INTO t1 (pt_col) VALUES(Point(1,2));
  • GeometryCollection(g1,g2,...)

    Constructs a GeometryCollection.

  • LineString(pt1,pt2,...)

    Constructs a LineString value from a number of Point or WKB Point arguments. If the number of arguments is less than two, the return value is NULL.

  • MultiLineString(ls1,ls2,...)

    Constructs a MultiLineString value using LineString or WKB LineString arguments.

  • MultiPoint(pt1,pt2,...)

    Constructs a MultiPoint value using Point or WKB Point arguments.

  • MultiPolygon(poly1,poly2,...)

    Constructs a MultiPolygon value from a set of Polygon or WKB Polygon arguments.

  • Point(x,y)

    Constructs a Point using its coordinates.

  • Polygon(ls1,ls2,...)

    Constructs a Polygon value from a number of LineString or WKB LineString arguments. If any argument does not represent a LinearRing (that is, not a closed and simple LineString), the return value is NULL.

12.18.4.3. Creating Spatial Columns

MySQL provides a standard way of creating spatial columns for geometry types, for example, with CREATE TABLE or ALTER TABLE. Currently, spatial columns are supported for MyISAM, InnoDB, NDB, and ARCHIVE tables. See also the annotations about spatial indexes under Section 12.18.6.1, “Creating Spatial Indexes”.

  • Use the CREATE TABLE statement to create a table with a spatial column:

    CREATE TABLE geom (g GEOMETRY);
  • Use the ALTER TABLE statement to add or drop a spatial column to or from an existing table:

    ALTER TABLE geom ADD pt POINT;
    ALTER TABLE geom DROP pt;

12.18.4.4. Populating Spatial Columns

After you have created spatial columns, you can populate them with spatial data.

Values should be stored in internal geometry format, but you can convert them to that format from either Well-Known Text (WKT) or Well-Known Binary (WKB) format. The following examples demonstrate how to insert geometry values into a table by converting WKT values into internal geometry format:

  • Perform the conversion directly in the INSERT statement:

    INSERT INTO geom VALUES (GeomFromText('POINT(1 1)'));
    
    SET @g = 'POINT(1 1)';
    INSERT INTO geom VALUES (GeomFromText(@g));
  • Perform the conversion prior to the INSERT:

    SET @g = GeomFromText('POINT(1 1)');
    INSERT INTO geom VALUES (@g);

The following examples insert more complex geometries into the table:

SET @g = 'LINESTRING(0 0,1 1,2 2)';
INSERT INTO geom VALUES (GeomFromText(@g));

SET @g = 'POLYGON((0 0,10 0,10 10,0 10,0 0),(5 5,7 5,7 7,5 7, 5 5))';
INSERT INTO geom VALUES (GeomFromText(@g));

SET @g =
'GEOMETRYCOLLECTION(POINT(1 1),LINESTRING(0 0,1 1,2 2,3 3,4 4))';
INSERT INTO geom VALUES (GeomFromText(@g));

The preceding examples all use GeomFromText() to create geometry values. You can also use type-specific functions:

SET @g = 'POINT(1 1)';
INSERT INTO geom VALUES (PointFromText(@g));

SET @g = 'LINESTRING(0 0,1 1,2 2)';
INSERT INTO geom VALUES (LineStringFromText(@g));

SET @g = 'POLYGON((0 0,10 0,10 10,0 10,0 0),(5 5,7 5,7 7,5 7, 5 5))';
INSERT INTO geom VALUES (PolygonFromText(@g));

SET @g =
'GEOMETRYCOLLECTION(POINT(1 1),LINESTRING(0 0,1 1,2 2,3 3,4 4))';
INSERT INTO geom VALUES (GeomCollFromText(@g));

Note that if a client application program wants to use WKB representations of geometry values, it is responsible for sending correctly formed WKB in queries to the server. However, there are several ways of satisfying this requirement. For example:

  • Inserting a POINT(1 1) value with hex literal syntax:

    mysql> INSERT INTO geom VALUES
        -> (GeomFromWKB(0x0101000000000000000000F03F000000000000F03F));
    
  • An ODBC application can send a WKB representation, binding it to a placeholder using an argument of BLOB type:

    INSERT INTO geom VALUES (GeomFromWKB(?))

    Other programming interfaces may support a similar placeholder mechanism.

  • In a C program, you can escape a binary value using mysql_real_escape_string() and include the result in a query string that is sent to the server. See Section 21.8.7.54, “mysql_real_escape_string().

12.18.4.5. Fetching Spatial Data

Geometry values stored in a table can be fetched in internal format. You can also convert them into WKT or WKB format.

  • Fetching spatial data in internal format:

    Fetching geometry values using internal format can be useful in table-to-table transfers:

    CREATE TABLE geom2 (g GEOMETRY) SELECT g FROM geom;
  • Fetching spatial data in WKT format:

    The AsText() function converts a geometry from internal format into a WKT string.

    SELECT AsText(g) FROM geom;
  • Fetching spatial data in WKB format:

    The AsBinary() function converts a geometry from internal format into a BLOB containing the WKB value.

    SELECT AsBinary(g) FROM geom;

12.18.5. Spatial Analysis Functions

After populating spatial columns with values, you are ready to query and analyze them. MySQL provides a set of functions to perform various operations on spatial data. These functions can be grouped into four major categories according to the type of operation they perform:

  • Functions that convert geometries between various formats

  • Functions that provide access to qualitative or quantitative properties of a geometry

  • Functions that describe relations between two geometries

  • Functions that create new geometries from existing ones

Spatial analysis functions can be used in many contexts, such as:

  • Any interactive SQL program, such as mysql.

  • Application programs written in any language that supports a MySQL client API

12.18.5.1. Geometry Format Conversion Functions

MySQL supports the following functions for converting geometry values between internal format and either WKT or WKB format:

12.18.5.2. Geometry Property Functions

Each function that belongs to this group takes a geometry value as its argument and returns some quantitative or qualitative property of the geometry. Some functions restrict their argument type. Such functions return NULL if the argument is of an incorrect geometry type. For example, Area() returns NULL if the object type is neither Polygon nor MultiPolygon.

12.18.5.2.1. General Geometry Functions

The functions listed in this section do not restrict their argument and accept a geometry value of any type.

  • Dimension(g)

    Returns the inherent dimension of the geometry value g. The result can be –1, 0, 1, or 2. The meaning of these values is given in Section 12.18.2.2, “Class Geometry.

    mysql> SELECT Dimension(GeomFromText('LineString(1 1,2 2)'));
    +------------------------------------------------+
    | Dimension(GeomFromText('LineString(1 1,2 2)')) |
    +------------------------------------------------+
    |                                              1 |
    +------------------------------------------------+
    
  • Envelope(g)

    Returns the Minimum Bounding Rectangle (MBR) for the geometry value g. The result is returned as a Polygon value.

    The polygon is defined by the corner points of the bounding box:

    POLYGON((MINX MINY, MAXX MINY, MAXX MAXY, MINX MAXY, MINX MINY))
    mysql> SELECT AsText(Envelope(GeomFromText('LineString(1 1,2 2)')));
    +-------------------------------------------------------+
    | AsText(Envelope(GeomFromText('LineString(1 1,2 2)'))) |
    +-------------------------------------------------------+
    | POLYGON((1 1,2 1,2 2,1 2,1 1))                        |
    +-------------------------------------------------------+
    
  • GeometryType(g)

    Returns as a binary string the name of the geometry type of which the geometry instance g is a member. The name corresponds to one of the instantiable Geometry subclasses.

    mysql> SELECT GeometryType(GeomFromText('POINT(1 1)'));
    +------------------------------------------+
    | GeometryType(GeomFromText('POINT(1 1)')) |
    +------------------------------------------+
    | POINT                                    |
    +------------------------------------------+
    
  • IsSimple(g)

    Returns 1 if the geometry value g has no anomalous geometric points, such as self-intersection or self-tangency. IsSimple() returns 0 if the argument is not simple, and NULL if it is NULL.

    The description of each instantiable geometric class given earlier in the chapter includes the specific conditions that cause an instance of that class to be classified as not simple. (See Section 12.18.2.1, “The Geometry Class Hierarchy”.)

  • SRID(g)

    Returns an integer indicating the Spatial Reference System ID for the geometry value g.

    In MySQL, the SRID value is just an integer associated with the geometry value. All calculations are done assuming Euclidean (planar) geometry.

    mysql> SELECT SRID(GeomFromText('LineString(1 1,2 2)',101));
    +-----------------------------------------------+
    | SRID(GeomFromText('LineString(1 1,2 2)',101)) |
    +-----------------------------------------------+
    |                                           101 |
    +-----------------------------------------------+
    

The OpenGIS specification also defines the following functions, which MySQL does not implement:

  • Boundary(g)

    Returns a geometry that is the closure of the combinatorial boundary of the geometry value g.

  • IsEmpty(g)

    This function is a placeholder that returns 0 for any valid geometry value, 1 for any invalid geometry value or NULL.

    MySQL does not support GIS EMPTY values such as POINT EMPTY.

12.18.5.2.2. Point Functions

A Point consists of X and Y coordinates, which may be obtained using the following functions:

  • X(p)

    Returns the X-coordinate value for the Point object p as a double-precision number.

    mysql> SELECT X(POINT(56.7, 53.34));
    +-----------------------+
    | X(POINT(56.7, 53.34)) |
    +-----------------------+
    |                  56.7 |
    +-----------------------+
    
  • Y(p)

    Returns the Y-coordinate value for the Point object p as a double-precision number.

    mysql> SELECT Y(POINT(56.7, 53.34));
    +-----------------------+
    | Y(POINT(56.7, 53.34)) |
    +-----------------------+
    |                 53.34 |
    +-----------------------+
    
12.18.5.2.3. LineString Functions

A LineString consists of Point values. You can extract particular points of a LineString, count the number of points that it contains, or obtain its length.

  • EndPoint(ls)

    Returns the Point that is the endpoint of the LineString value ls.

    mysql> SET @ls = 'LineString(1 1,2 2,3 3)';
    mysql> SELECT AsText(EndPoint(GeomFromText(@ls)));
    +-------------------------------------+
    | AsText(EndPoint(GeomFromText(@ls))) |
    +-------------------------------------+
    | POINT(3 3)                          |
    +-------------------------------------+
    
  • GLength(ls)

    Returns as a double-precision number the length of the LineString value ls in its associated spatial reference.

    mysql> SET @ls = 'LineString(1 1,2 2,3 3)';
    mysql> SELECT GLength(GeomFromText(@ls));
    +----------------------------+
    | GLength(GeomFromText(@ls)) |
    +----------------------------+
    |            2.8284271247462 |
    +----------------------------+
    

    GLength() is a nonstandard name. It corresponds to the OpenGIS Length() function.

  • IsClosed(ls)

    Returns 1 if the LineString value ls is closed (that is, its StartPoint() and EndPoint() values are the same) and is simple (does not pass through the same point more than once). Returns 0 if ls is not closed, and –1 if it is NULL.

    mysql> SET @ls1 = 'LineString(1 1,2 2,3 3,2 2)';
    Query OK, 0 rows affected (0.00 sec)
    
    mysql> SET @ls2 = 'LineString(1 1,2 2,3 3,1 1)';
    Query OK, 0 rows affected (0.00 sec)
    
    mysql> SELECT IsClosed(GeomFromText(@ls1));
    +------------------------------+
    | IsClosed(GeomFromText(@ls1)) |
    +------------------------------+
    |                            0 |
    +------------------------------+
    1 row in set (0.00 sec)
    
    mysql> SELECT IsClosed(GeomFromText(@ls2));
    +------------------------------+
    | IsClosed(GeomFromText(@ls2)) |
    +------------------------------+
    |                            1 |
    +------------------------------+
    1 row in set (0.00 sec)
    
  • NumPoints(ls)

    Returns the number of Point objects in the LineString value ls.

    mysql> SET @ls = 'LineString(1 1,2 2,3 3)';
    mysql> SELECT NumPoints(GeomFromText(@ls));
    +------------------------------+
    | NumPoints(GeomFromText(@ls)) |
    +------------------------------+
    |                            3 |
    +------------------------------+
    
  • PointN(ls,N)

    Returns the N-th Point in the Linestring value ls. Points are numbered beginning with 1.

    mysql> SET @ls = 'LineString(1 1,2 2,3 3)';
    mysql> SELECT AsText(PointN(GeomFromText(@ls),2));
    +-------------------------------------+
    | AsText(PointN(GeomFromText(@ls),2)) |
    +-------------------------------------+
    | POINT(2 2)                          |
    +-------------------------------------+
    
  • StartPoint(ls)

    Returns the Point that is the start point of the LineString value ls.

    mysql> SET @ls = 'LineString(1 1,2 2,3 3)';
    mysql> SELECT AsText(StartPoint(GeomFromText(@ls)));
    +---------------------------------------+
    | AsText(StartPoint(GeomFromText(@ls))) |
    +---------------------------------------+
    | POINT(1 1)                            |
    +---------------------------------------+
    
12.18.5.2.4. MultiLineString Functions

These functions return properties of MultiLineString values.

  • GLength(mls)

    Returns as a double-precision number the length of the MultiLineString value mls. The length of mls is equal to the sum of the lengths of its elements.

    mysql> SET @mls = 'MultiLineString((1 1,2 2,3 3),(4 4,5 5))';
    mysql> SELECT GLength(GeomFromText(@mls));
    +-----------------------------+
    | GLength(GeomFromText(@mls)) |
    +-----------------------------+
    |             4.2426406871193 |
    +-----------------------------+
    

    GLength() is a nonstandard name. It corresponds to the OpenGIS Length() function.

  • IsClosed(mls)

    Returns 1 if the MultiLineString value mls is closed (that is, the StartPoint() and EndPoint() values are the same for each LineString in mls). Returns 0 if mls is not closed, and –1 if it is NULL.

    mysql> SET @mls = 'MultiLineString((1 1,2 2,3 3),(4 4,5 5))';
    mysql> SELECT IsClosed(GeomFromText(@mls));
    +------------------------------+
    | IsClosed(GeomFromText(@mls)) |
    +------------------------------+
    |                            0 |
    +------------------------------+
    
12.18.5.2.5. Polygon Functions

These functions return properties of Polygon values.

  • Area(poly)

    Returns as a double-precision number the area of the Polygon value poly, as measured in its spatial reference system.

    mysql> SET @poly = 'Polygon((0 0,0 3,3 0,0 0),(1 1,1 2,2 1,1 1))';
    mysql> SELECT Area(GeomFromText(@poly));
    +---------------------------+
    | Area(GeomFromText(@poly)) |
    +---------------------------+
    |                         4 |
    +---------------------------+
    
  • ExteriorRing(poly)

    Returns the exterior ring of the Polygon value poly as a LineString.

    mysql> SET @poly =
        -> 'Polygon((0 0,0 3,3 3,3 0,0 0),(1 1,1 2,2 2,2 1,1 1))';
    mysql> SELECT AsText(ExteriorRing(GeomFromText(@poly)));
    +-------------------------------------------+
    | AsText(ExteriorRing(GeomFromText(@poly))) |
    +-------------------------------------------+
    | LINESTRING(0 0,0 3,3 3,3 0,0 0)           |
    +-------------------------------------------+
    
  • InteriorRingN(poly,N)

    Returns the N-th interior ring for the Polygon value poly as a LineString. Rings are numbered beginning with 1.

    mysql> SET @poly =
        -> 'Polygon((0 0,0 3,3 3,3 0,0 0),(1 1,1 2,2 2,2 1,1 1))';
    mysql> SELECT AsText(InteriorRingN(GeomFromText(@poly),1));
    +----------------------------------------------+
    | AsText(InteriorRingN(GeomFromText(@poly),1)) |
    +----------------------------------------------+
    | LINESTRING(1 1,1 2,2 2,2 1,1 1)              |
    +----------------------------------------------+
    
  • NumInteriorRings(poly)

    Returns the number of interior rings in the Polygon value poly.

    mysql> SET @poly =
        -> 'Polygon((0 0,0 3,3 3,3 0,0 0),(1 1,1 2,2 2,2 1,1 1))';
    mysql> SELECT NumInteriorRings(GeomFromText(@poly));
    +---------------------------------------+
    | NumInteriorRings(GeomFromText(@poly)) |
    +---------------------------------------+
    |                                     1 |
    +---------------------------------------+
    
12.18.5.2.6. MultiPolygon Functions

These functions return properties of MultiPolygon values.

  • Area(mpoly)

    Returns as a double-precision number the area of the MultiPolygon value mpoly, as measured in its spatial reference system.

    mysql> SET @mpoly =
        -> 'MultiPolygon(((0 0,0 3,3 3,3 0,0 0),(1 1,1 2,2 2,2 1,1 1)))';
    mysql> SELECT Area(GeomFromText(@mpoly));
    +----------------------------+
    | Area(GeomFromText(@mpoly)) |
    +----------------------------+
    |                          8 |
    +----------------------------+
    
  • Centroid(mpoly)

    Returns the mathematical centroid for the MultiPolygon value mpoly as a Point. The result is not guaranteed to be on the MultiPolygon.

    mysql> SET @poly =
        -> GeomFromText('POLYGON((0 0,10 0,10 10,0 10,0 0),(5 5,7 5,7 7,5 7,5 5))');
    mysql> SELECT GeometryType(@poly),AsText(Centroid(@poly));
    +---------------------+--------------------------------------------+
    | GeometryType(@poly) | AsText(Centroid(@poly))                    |
    +---------------------+--------------------------------------------+
    | POLYGON             | POINT(4.958333333333333 4.958333333333333) |
    +---------------------+--------------------------------------------+
    

The OpenGIS specification also defines the following function, which MySQL does not implement:

  • PointOnSurface(mpoly)

    Returns a Point value that is guaranteed to be on the MultiPolygon value mpoly.

12.18.5.2.7. GeometryCollection Functions

These functions return properties of GeometryCollection values.

  • GeometryN(gc,N)

    Returns the N-th geometry in the GeometryCollection value gc. Geometries are numbered beginning with 1.

    mysql> SET @gc = 'GeometryCollection(Point(1 1),LineString(2 2, 3 3))';
    mysql> SELECT AsText(GeometryN(GeomFromText(@gc),1));
    +----------------------------------------+
    | AsText(GeometryN(GeomFromText(@gc),1)) |
    +----------------------------------------+
    | POINT(1 1)                             |
    +----------------------------------------+
    
  • NumGeometries(gc)

    Returns the number of geometries in the GeometryCollection value gc.

    mysql> SET @gc = 'GeometryCollection(Point(1 1),LineString(2 2, 3 3))';
    mysql> SELECT NumGeometries(GeomFromText(@gc));
    +----------------------------------+
    | NumGeometries(GeomFromText(@gc)) |
    +----------------------------------+
    |                                2 |
    +----------------------------------+
    

12.18.5.3. Functions That Create New Geometries from Existing Ones

The following sections describe functions that take geometry values as arguments and return new geometry values.

12.18.5.3.1. Geometry Functions That Produce New Geometries

Section 12.18.5.2, “Geometry Property Functions”, discusses several functions that construct new geometries from existing ones. See that section for descriptions of these functions:

12.18.5.3.2. Spatial Operators

OpenGIS proposes a number of other functions that can produce geometries. They are designed to implement spatial operators.

  • Buffer(g,d)

    Returns a geometry that represents all points whose distance from the geometry value g is less than or equal to a distance of d.

    Buffer() supports negative distances for polygons, multipolygons, and geometry collections containing polygons or multipolygons. For point, multipoint, linestring, multilinestring, and geometry collections not containing any polygons or multipolygons, Buffer() with a negative distance returns NULL.

The OpenGIS specification also defines the following functions, which MySQL does not implement:

  • ConvexHull(g)

    Returns a geometry that represents the convex hull of the geometry value g.

  • Difference(g1,g2)

    Returns a geometry that represents the point set difference of the geometry value g1 with g2.

  • Intersection(g1,g2)

    Returns a geometry that represents the point set intersection of the geometry values g1 with g2.

  • SymDifference(g1,g2)

    Returns a geometry that represents the point set symmetric difference of the geometry value g1 with g2.

  • Union(g1,g2)

    Returns a geometry that represents the point set union of the geometry values g1 and g2.

12.18.5.4. Functions for Testing Spatial Relations Between Geometric Objects

The functions described in these sections take two geometries as input parameters and return a qualitative or quantitative relation between them.

12.18.5.4.1. Relations on Geometry Minimal Bounding Rectangles (MBRs)

MySQL provides several functions that test relations between minimal bounding rectangles of two geometries g1 and g2. The return values 1 and 0 indicate true and false, respectively.

  • MBRContains(g1,g2)

    Returns 1 or 0 to indicate whether the Minimum Bounding Rectangle of g1 contains the Minimum Bounding Rectangle of g2. This tests the opposite relationship as MBRWithin().

    mysql> SET @g1 = GeomFromText('Polygon((0 0,0 3,3 3,3 0,0 0))');
    mysql> SET @g2 = GeomFromText('Point(1 1)');
    mysql> SELECT MBRContains(@g1,@g2), MBRContains(@g2,@g1);
    ----------------------+----------------------+
    | MBRContains(@g1,@g2) | MBRContains(@g2,@g1) |
    +----------------------+----------------------+
    |                    1 |                    0 |
    +----------------------+----------------------+
    
  • MBRDisjoint(g1,g2)

    Returns 1 or 0 to indicate whether the Minimum Bounding Rectangles of the two geometries g1 and g2 are disjoint (do not intersect).

  • MBREqual(g1,g2)

    Returns 1 or 0 to indicate whether the Minimum Bounding Rectangles of the two geometries g1 and g2 are the same.

  • MBRIntersects(g1,g2)

    Returns 1 or 0 to indicate whether the Minimum Bounding Rectangles of the two geometries g1 and g2 intersect.

  • MBROverlaps(g1,g2)

    Returns 1 or 0 to indicate whether the Minimum Bounding Rectangles of the two geometries g1 and g2 overlap. The term spatially overlaps is used if two geometries intersect and their intersection results in a geometry of the same dimension but not equal to either of the given geometries.

  • MBRTouches(g1,g2)

    Returns 1 or 0 to indicate whether the Minimum Bounding Rectangles of the two geometries g1 and g2 touch. Two geometries spatially touch if the interiors of the geometries do not intersect, but the boundary of one of the geometries intersects either the boundary or the interior of the other.

  • MBRWithin(g1,g2)

    Returns 1 or 0 to indicate whether the Minimum Bounding Rectangle of g1 is within the Minimum Bounding Rectangle of g2. This tests the opposite relationship as MBRContains().

    mysql> SET @g1 = GeomFromText('Polygon((0 0,0 3,3 3,3 0,0 0))');
    mysql> SET @g2 = GeomFromText('Polygon((0 0,0 5,5 5,5 0,0 0))');
    mysql> SELECT MBRWithin(@g1,@g2), MBRWithin(@g2,@g1);
    +--------------------+--------------------+
    | MBRWithin(@g1,@g2) | MBRWithin(@g2,@g1) |
    +--------------------+--------------------+
    |                  1 |                  0 |
    +--------------------+--------------------+
    
12.18.5.4.2. Functions That Test Spatial Relationships Between Geometries

The OpenGIS specification defines the following functions. They test the relationship between two geometry values g1 and g2.

The return values 1 and 0 indicate true and false, respectively.

Note

MySQL originally implemented these functions such that they used object bounding rectangles and returned the same result as the corresponding MBR-based functions. Corresponding versions are available that use precise object shapes. These versions are named with an ST_ prefix. For example, Contains() uses object bounding rectangles, whereas ST_Contains() uses object shapes.

There are also ST_ aliases for existing spatial functions that were already exact. For example, ST_IsEmpty() is an alias for IsEmpty()

Functions That Use Object Shapes
  • ST_Contains(g1,g2)

    Returns 1 or 0 to indicate whether g1 completely contains g2. This tests the opposite relationship as ST_Within().

  • ST_Crosses(g1,g2)

    Returns 1 if g1 spatially crosses g2. Returns NULL if g1 is a Polygon or a MultiPolygon, or if g2 is a Point or a MultiPoint. Otherwise, returns 0.

    The term spatially crosses denotes a spatial relation between two given geometries that has the following properties:

    • The two geometries intersect

    • Their intersection results in a geometry that has a dimension that is one less than the maximum dimension of the two given geometries

    • Their intersection is not equal to either of the two given geometries

  • ST_Disjoint(g1,g2)

    Returns 1 or 0 to indicate whether g1 is spatially disjoint from (does not intersect) g2.

  • ST_Equals(g1,g2)

    Returns 1 or 0 to indicate whether g1 is spatially equal to g2.

  • ST_Intersects(g1,g2)

    Returns 1 or 0 to indicate whether g1 spatially intersects g2.

  • ST_Overlaps(g1,g2)

    Returns 1 or 0 to indicate whether g1 spatially overlaps g2. The term spatially overlaps is used if two geometries intersect and their intersection results in a geometry of the same dimension but not equal to either of the given geometries.

  • ST_Touches(g1,g2)

    Returns 1 or 0 to indicate whether g1 spatially touches g2. Two geometries spatially touch if the interiors of the geometries do not intersect, but the boundary of one of the geometries intersects either the boundary or the interior of the other.

  • ST_Within(g1,g2)

    Returns 1 or 0 to indicate whether g1 is spatially within g2. This tests the opposite relationship as ST_Contains().

Functions That Use Bounding Rectangles
  • Contains(g1,g2)

    Returns 1 or 0 to indicate whether g1 completely contains g2. This tests the opposite relationship as Within().

  • Crosses(g1,g2)

    Returns 1 if g1 spatially crosses g2. Returns NULL if g1 is a Polygon or a MultiPolygon, or if g2 is a Point or a MultiPoint. Otherwise, returns 0.

    The term spatially crosses denotes a spatial relation between two given geometries that has the following properties:

    • The two geometries intersect

    • Their intersection results in a geometry that has a dimension that is one less than the maximum dimension of the two given geometries

    • Their intersection is not equal to either of the two given geometries

  • Disjoint(g1,g2)

    Returns 1 or 0 to indicate whether g1 is spatially disjoint from (does not intersect) g2.

  • Equals(g1,g2)

    Returns 1 or 0 to indicate whether g1 is spatially equal to g2.

  • Intersects(g1,g2)

    Returns 1 or 0 to indicate whether g1 spatially intersects g2.

  • Overlaps(g1,g2)

    Returns 1 or 0 to indicate whether g1 spatially overlaps g2. The term spatially overlaps is used if two geometries intersect and their intersection results in a geometry of the same dimension but not equal to either of the given geometries.

  • Touches(g1,g2)

    Returns 1 or 0 to indicate whether g1 spatially touches g2. Two geometries spatially touch if the interiors of the geometries do not intersect, but the boundary of one of the geometries intersects either the boundary or the interior of the other.

  • Within(g1,g2)

    Returns 1 or 0 to indicate whether g1 is spatially within g2. This tests the opposite relationship as Contains().

12.18.6. Optimizing Spatial Analysis

For MyISAM tables, Search operations in nonspatial databases can be optimized using SPATIAL indexes. This is true for spatial databases as well. With the help of a great variety of multi-dimensional indexing methods that have previously been designed, it is possible to optimize spatial searches. The most typical of these are:

  • Point queries that search for all objects that contain a given point

  • Region queries that search for all objects that overlap a given region

MySQL uses R-Trees with quadratic splitting for SPATIAL indexes on spatial columns. A SPATIAL index is built using the MBR of a geometry. For most geometries, the MBR is a minimum rectangle that surrounds the geometries. For a horizontal or a vertical linestring, the MBR is a rectangle degenerated into the linestring. For a point, the MBR is a rectangle degenerated into the point.

It is also possible to create normal indexes on spatial columns. In a non-SPATIAL index, you must declare a prefix for any spatial column except for POINT columns.

MyISAM supports both SPATIAL and non-SPATIAL indexes. Other storage engines support non-SPATIAL indexes, as described in Section 13.1.11, “CREATE INDEX Syntax”.

12.18.6.1. Creating Spatial Indexes

For MyISAM tables, MySQL can create spatial indexes using syntax similar to that for creating regular indexes, but extended with the SPATIAL keyword. Currently, columns in spatial indexes must be declared NOT NULL. The following examples demonstrate how to create spatial indexes:

  • With CREATE TABLE:

    CREATE TABLE geom (g GEOMETRY NOT NULL, SPATIAL INDEX(g)) ENGINE=MyISAM;
  • With ALTER TABLE:

    ALTER TABLE geom ADD SPATIAL INDEX(g);
  • With CREATE INDEX:

    CREATE SPATIAL INDEX sp_index ON geom (g);

For MyISAM tables, SPATIAL INDEX creates an R-tree index. For storage engines that support nonspatial indexing of spatial columns, the engine creates a B-tree index. A B-tree index on spatial values will be useful for exact-value lookups, but not for range scans.

For more information on indexing spatial columns, see Section 13.1.11, “CREATE INDEX Syntax”.

To drop spatial indexes, use ALTER TABLE or DROP INDEX:

Example: Suppose that a table geom contains more than 32,000 geometries, which are stored in the column g of type GEOMETRY. The table also has an AUTO_INCREMENT column fid for storing object ID values.

mysql> DESCRIBE geom;
+-------+----------+------+-----+---------+----------------+
| Field | Type     | Null | Key | Default | Extra          |
+-------+----------+------+-----+---------+----------------+
| fid   | int(11)  |      | PRI | NULL    | auto_increment |
| g     | geometry |      |     |         |                |
+-------+----------+------+-----+---------+----------------+
2 rows in set (0.00 sec)

mysql> SELECT COUNT(*) FROM geom;
+----------+
| count(*) |
+----------+
|    32376 |
+----------+
1 row in set (0.00 sec)

To add a spatial index on the column g, use this statement:

mysql> ALTER TABLE geom ADD SPATIAL INDEX(g);
Query OK, 32376 rows affected (4.05 sec)
Records: 32376  Duplicates: 0  Warnings: 0

12.18.6.2. Using a Spatial Index

The optimizer investigates whether available spatial indexes can be involved in the search for queries that use a function such as MBRContains() or MBRWithin() in the WHERE clause. The following query finds all objects that are in the given rectangle:

mysql> SET @poly =
    -> 'Polygon((30000 15000,
                               31000 15000,
                               31000 16000,
                               30000 16000,
                               30000 15000))';
mysql> SELECT fid,AsText(g) FROM geom WHERE
    -> MBRContains(GeomFromText(@poly),g);
+-----+---------------------------------------------------------------+
| fid | AsText(g)                                                     |
+-----+---------------------------------------------------------------+
|  21 | LINESTRING(30350.4 15828.8,30350.6 15845,30333.8 15845,30 ... |
|  22 | LINESTRING(30350.6 15871.4,30350.6 15887.8,30334 15887.8, ... |
|  23 | LINESTRING(30350.6 15914.2,30350.6 15930.4,30334 15930.4, ... |
|  24 | LINESTRING(30290.2 15823,30290.2 15839.4,30273.4 15839.4, ... |
|  25 | LINESTRING(30291.4 15866.2,30291.6 15882.4,30274.8 15882. ... |
|  26 | LINESTRING(30291.6 15918.2,30291.6 15934.4,30275 15934.4, ... |
| 249 | LINESTRING(30337.8 15938.6,30337.8 15946.8,30320.4 15946. ... |
|   1 | LINESTRING(30250.4 15129.2,30248.8 15138.4,30238.2 15136. ... |
|   2 | LINESTRING(30220.2 15122.8,30217.2 15137.8,30207.6 15136, ... |
|   3 | LINESTRING(30179 15114.4,30176.6 15129.4,30167 15128,3016 ... |
|   4 | LINESTRING(30155.2 15121.4,30140.4 15118.6,30142 15109,30 ... |
|   5 | LINESTRING(30192.4 15085,30177.6 15082.2,30179.2 15072.4, ... |
|   6 | LINESTRING(30244 15087,30229 15086.2,30229.4 15076.4,3024 ... |
|   7 | LINESTRING(30200.6 15059.4,30185.6 15058.6,30186 15048.8, ... |
|  10 | LINESTRING(30179.6 15017.8,30181 15002.8,30190.8 15003.6, ... |
|  11 | LINESTRING(30154.2 15000.4,30168.6 15004.8,30166 15014.2, ... |
|  13 | LINESTRING(30105 15065.8,30108.4 15050.8,30118 15053,3011 ... |
| 154 | LINESTRING(30276.2 15143.8,30261.4 15141,30263 15131.4,30 ... |
| 155 | LINESTRING(30269.8 15084,30269.4 15093.4,30258.6 15093,30 ... |
| 157 | LINESTRING(30128.2 15011,30113.2 15010.2,30113.6 15000.4, ... |
+-----+---------------------------------------------------------------+
20 rows in set (0.00 sec)

Use EXPLAIN to check the way this query is executed:

mysql> SET @poly =
    -> 'Polygon((30000 15000,
                               31000 15000,
                               31000 16000,
                               30000 16000,
                               30000 15000))';
mysql> EXPLAIN SELECT fid,AsText(g) FROM geom WHERE
    -> MBRContains(GeomFromText(@poly),g)\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: geom
         type: range
possible_keys: g
          key: g
      key_len: 32
          ref: NULL
         rows: 50
        Extra: Using where
1 row in set (0.00 sec)

Check what would happen without a spatial index:

mysql> SET @poly =
    -> 'Polygon((30000 15000,
                               31000 15000,
                               31000 16000,
                               30000 16000,
                               30000 15000))';
mysql> EXPLAIN SELECT fid,AsText(g) FROM g IGNORE INDEX (g) WHERE
    -> MBRContains(GeomFromText(@poly),g)\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: geom
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 32376
        Extra: Using where
1 row in set (0.00 sec)

Executing the SELECT statement without the spatial index yields the same result but causes the execution time to rise from 0.00 seconds to 0.46 seconds:

mysql> SET @poly =
    -> 'Polygon((30000 15000,
                               31000 15000,
                               31000 16000,
                               30000 16000,
                               30000 15000))';
mysql> SELECT fid,AsText(g) FROM geom IGNORE INDEX (g) WHERE
    -> MBRContains(GeomFromText(@poly),g);
+-----+---------------------------------------------------------------+
| fid | AsText(g)                                                     |
+-----+---------------------------------------------------------------+
|   1 | LINESTRING(30250.4 15129.2,30248.8 15138.4,30238.2 15136. ... |
|   2 | LINESTRING(30220.2 15122.8,30217.2 15137.8,30207.6 15136, ... |
|   3 | LINESTRING(30179 15114.4,30176.6 15129.4,30167 15128,3016 ... |
|   4 | LINESTRING(30155.2 15121.4,30140.4 15118.6,30142 15109,30 ... |
|   5 | LINESTRING(30192.4 15085,30177.6 15082.2,30179.2 15072.4, ... |
|   6 | LINESTRING(30244 15087,30229 15086.2,30229.4 15076.4,3024 ... |
|   7 | LINESTRING(30200.6 15059.4,30185.6 15058.6,30186 15048.8, ... |
|  10 | LINESTRING(30179.6 15017.8,30181 15002.8,30190.8 15003.6, ... |
|  11 | LINESTRING(30154.2 15000.4,30168.6 15004.8,30166 15014.2, ... |
|  13 | LINESTRING(30105 15065.8,30108.4 15050.8,30118 15053,3011 ... |
|  21 | LINESTRING(30350.4 15828.8,30350.6 15845,30333.8 15845,30 ... |
|  22 | LINESTRING(30350.6 15871.4,30350.6 15887.8,30334 15887.8, ... |
|  23 | LINESTRING(30350.6 15914.2,30350.6 15930.4,30334 15930.4, ... |
|  24 | LINESTRING(30290.2 15823,30290.2 15839.4,30273.4 15839.4, ... |
|  25 | LINESTRING(30291.4 15866.2,30291.6 15882.4,30274.8 15882. ... |
|  26 | LINESTRING(30291.6 15918.2,30291.6 15934.4,30275 15934.4, ... |
| 154 | LINESTRING(30276.2 15143.8,30261.4 15141,30263 15131.4,30 ... |
| 155 | LINESTRING(30269.8 15084,30269.4 15093.4,30258.6 15093,30 ... |
| 157 | LINESTRING(30128.2 15011,30113.2 15010.2,30113.6 15000.4, ... |
| 249 | LINESTRING(30337.8 15938.6,30337.8 15946.8,30320.4 15946. ... |
+-----+---------------------------------------------------------------+
20 rows in set (0.46 sec)

12.18.7. MySQL Conformance and Compatibility

MySQL does not yet implement the following GIS features:

  • Additional Metadata Views

    OpenGIS specifications propose several additional metadata views. For example, a system view named GEOMETRY_COLUMNS contains a description of geometry columns, one row for each geometry column in the database.

  • The OpenGIS function Length() on LineString and MultiLineString currently should be called in MySQL as GLength()

    The problem is that there is an existing SQL function Length() that calculates the length of string values, and sometimes it is not possible to distinguish whether the function is called in a textual or spatial context. We need either to solve this somehow, or decide on another function name.

12.19. Precision Math

MySQL 5.7 provides support for precision math: numeric value handling that results in extremely accurate results and a high degree control over invalid values. Precision math is based on these two features:

  • SQL modes that control how strict the server is about accepting or rejecting invalid data.

  • The MySQL library for fixed-point arithmetic.

These features have several implications for numeric operations and provide a high degree of compliance with standard SQL:

  • Precise calculations: For exact-value numbers, calculations do not introduce floating-point errors. Instead, exact precision is used. For example, MySQL treats a number such as .0001 as an exact value rather than as an approximation, and summing it 10,000 times produces a result of exactly 1, not a value that is merely close to 1.

  • Well-defined rounding behavior: For exact-value numbers, the result of ROUND() depends on its argument, not on environmental factors such as how the underlying C library works.

  • Platform independence: Operations on exact numeric values are the same across different platforms such as Windows and Unix.

  • Control over handling of invalid values: Overflow and division by zero are detectable and can be treated as errors. For example, you can treat a value that is too large for a column as an error rather than having the value truncated to lie within the range of the column's data type. Similarly, you can treat division by zero as an error rather than as an operation that produces a result of NULL. The choice of which approach to take is determined by the setting of the server SQL mode.

The following discussion covers several aspects of how precision math works, including possible incompatibilities with older applications. At the end, some examples are given that demonstrate how MySQL 5.7 handles numeric operations precisely. For information about controlling the SQL mode, see Section 5.1.7, “Server SQL Modes”.

12.19.1. Types of Numeric Values

The scope of precision math for exact-value operations includes the exact-value data types (DECIMAL and integer types) and exact-value numeric literals. Approximate-value data types and numeric literals are handled as floating-point numbers.

Exact-value numeric literals have an integer part or fractional part, or both. They may be signed. Examples: 1, .2, 3.4, -5, -6.78, +9.10.

Approximate-value numeric literals are represented in scientific notation with a mantissa and exponent. Either or both parts may be signed. Examples: 1.2E3, 1.2E-3, -1.2E3, -1.2E-3.

Two numbers that look similar may be treated differently. For example, 2.34 is an exact-value (fixed-point) number, whereas 2.34E0 is an approximate-value (floating-point) number.

The DECIMAL data type is a fixed-point type and calculations are exact. In MySQL, the DECIMAL type has several synonyms: NUMERIC, DEC, FIXED. The integer types also are exact-value types.

The FLOAT and DOUBLE data types are floating-point types and calculations are approximate. In MySQL, types that are synonymous with FLOAT or DOUBLE are DOUBLE PRECISION and REAL.

12.19.2. DECIMAL Data Type Changes

This section discusses the characteristics of the DECIMAL data type (and its synonyms) in MySQL 5.7, with particular regard to the following topics:

  • Maximum number of digits

  • Storage format

  • Storage requirements

  • The nonstandard MySQL extension to the upper range of DECIMAL columns

Possible incompatibilities with applications that are written for older versions of MySQL (prior to 5.0.3) are noted throughout this section.

The declaration syntax for a DECIMAL column is DECIMAL(M,D). The ranges of values for the arguments in MySQL 5.7 are as follows:

  • M is the maximum number of digits (the precision). It has a range of 1 to 65. (Older versions of MySQL permitted a range of 1 to 254.)

  • D is the number of digits to the right of the decimal point (the scale). It has a range of 0 to 30 and must be no larger than M.

The maximum value of 65 for M means that calculations on DECIMAL values are accurate up to 65 digits. This limit of 65 digits of precision also applies to exact-value numeric literals, so the maximum range of such literals differs from before. (In older versions of MySQL, decimal values could have up to 254 digits. However, calculations were done using floating-point and thus were approximate, not exact.)

Values for DECIMAL columns in MySQL 5.7 are stored using a binary format that packs nine decimal digits into 4 bytes. The storage requirements for the integer and fractional parts of each value are determined separately. Each multiple of nine digits requires 4 bytes, and any remaining digits left over require some fraction of 4 bytes. The storage required for remaining digits is given by the following table.

Leftover DigitsNumber of Bytes
00
1–21
3–42
5–63
7–94

For example, a DECIMAL(18,9) column has nine digits on either side of the decimal point, so the integer part and the fractional part each require 4 bytes. A DECIMAL(20,6) column has fourteen integer digits and six fractional digits. The integer digits require four bytes for nine of the digits and 3 bytes for the remaining five digits. The six fractional digits require 3 bytes.

Unlike some older versions of MySQL, DECIMAL columns in MySQL 5.7 do not store a leading + character or - character or leading 0 digits. If you insert +0003.1 into a DECIMAL(5,1) column, it is stored as 3.1. For negative numbers, a literal - character is not stored. Applications that rely on the older behavior must be modified to account for this change.

DECIMAL columns in MySQL 5.7 do not permit values larger than the range implied by the column definition. For example, a DECIMAL(3,0) column supports a range of -999 to 999. A DECIMAL(M,D) column permits at most M - D digits to the left of the decimal point. This is not compatible with applications relying on older versions of MySQL that permitted storing an extra digit in lieu of a + sign.

The SQL standard requires that the precision of NUMERIC(M,D) be exactly M digits. For DECIMAL(M,D), the standard requires a precision of at least M digits but permits more. In MySQL, DECIMAL(M,D) and NUMERIC(M,D) are the same, and both have a precision of exactly M digits.

For more detailed information about porting applications that rely on the old treatment of the DECIMAL data type, see the MySQL 5.0 Reference Manual.

12.19.3. Expression Handling

With precision math, exact-value numbers are used as given whenever possible. For example, numbers in comparisons are used exactly as given without a change in value. In strict SQL mode, for INSERT into a column with an exact data type (DECIMAL or integer), a number is inserted with its exact value if it is within the column range. When retrieved, the value should be the same as what was inserted. (Without strict mode, truncation for INSERT is permissible.)

Handling of a numeric expression depends on what kind of values the expression contains:

  • If any approximate values are present, the expression is approximate and is evaluated using floating-point arithmetic.

  • If no approximate values are present, the expression contains only exact values. If any exact value contains a fractional part (a value following the decimal point), the expression is evaluated using DECIMAL exact arithmetic and has a precision of 65 digits. The term exact is subject to the limits of what can be represented in binary. For example, 1.0/3.0 can be approximated in decimal notation as .333..., but not written as an exact number, so (1.0/3.0)*3.0 does not evaluate to exactly 1.0.

  • Otherwise, the expression contains only integer values. The expression is exact and is evaluated using integer arithmetic and has a precision the same as BIGINT (64 bits).

If a numeric expression contains any strings, they are converted to double-precision floating-point values and the expression is approximate.

Inserts into numeric columns are affected by the SQL mode, which is controlled by the sql_mode system variable. (See Section 5.1.7, “Server SQL Modes”.) The following discussion mentions strict mode (selected by the STRICT_ALL_TABLES or STRICT_TRANS_TABLES mode values) and ERROR_FOR_DIVISION_BY_ZERO. To turn on all restrictions, you can simply use TRADITIONAL mode, which includes both strict mode values and ERROR_FOR_DIVISION_BY_ZERO:

mysql> SET sql_mode='TRADITIONAL';

If a number is inserted into an exact type column (DECIMAL or integer), it is inserted with its exact value if it is within the column range.

If the value has too many digits in the fractional part, rounding occurs and a warning is generated. Rounding is done as described in Section 12.19.4, “Rounding Behavior”.

If the value has too many digits in the integer part, it is too large and is handled as follows:

  • If strict mode is not enabled, the value is truncated to the nearest legal value and a warning is generated.

  • If strict mode is enabled, an overflow error occurs.

Underflow is not detected, so underflow handling is undefined.

By default, division by zero produces a result of NULL and no warning. With the ERROR_FOR_DIVISION_BY_ZERO SQL mode enabled, MySQL handles division by zero differently:

  • If strict mode is not enabled, a warning occurs.

  • If strict mode is enabled, inserts and updates involving division by zero are prohibited, and an error occurs.

In other words, inserts and updates involving expressions that perform division by zero can be treated as errors, but this requires ERROR_FOR_DIVISION_BY_ZERO in addition to strict mode.

Suppose that we have this statement:

INSERT INTO t SET i = 1/0;

This is what happens for combinations of strict and ERROR_FOR_DIVISION_BY_ZERO modes.

sql_mode ValueResult
'' (Default)No warning, no error; i is set to NULL.
strictNo warning, no error; i is set to NULL.
ERROR_FOR_DIVISION_BY_ZEROWarning, no error; i is set to NULL.
strict,ERROR_FOR_DIVISION_BY_ZEROError condition; no row is inserted.

For inserts of strings into numeric columns, conversion from string to number is handled as follows if the string has nonnumeric contents:

  • A string that does not begin with a number cannot be used as a number and produces an error in strict mode, or a warning otherwise. This includes the empty string.

  • A string that begins with a number can be converted, but the trailing nonnumeric portion is truncated. If the truncated portion contains anything other than spaces, this produces an error in strict mode, or a warning otherwise.

12.19.4. Rounding Behavior

This section discusses precision math rounding for the ROUND() function and for inserts into columns with exact-value types (DECIMAL and integer).

The ROUND() function rounds differently depending on whether its argument is exact or approximate:

  • For exact-value numbers, ROUND() uses the round half up rule: A value with a fractional part of .5 or greater is rounded up to the next integer if positive or down to the next integer if negative. (In other words, it is rounded away from zero.) A value with a fractional part less than .5 is rounded down to the next integer if positive or up to the next integer if negative.

  • For approximate-value numbers, the result depends on the C library. On many systems, this means that ROUND() uses the round to nearest even rule: A value with any fractional part is rounded to the nearest even integer.

The following example shows how rounding differs for exact and approximate values:

mysql> SELECT ROUND(2.5), ROUND(25E-1);
+------------+--------------+
| ROUND(2.5) | ROUND(25E-1) |
+------------+--------------+
| 3          |            2 |
+------------+--------------+

For inserts into a DECIMAL or integer column, the target is an exact data type, so rounding uses round half away from zero, regardless of whether the value to be inserted is exact or approximate:

mysql> CREATE TABLE t (d DECIMAL(10,0));
Query OK, 0 rows affected (0.00 sec)

mysql> INSERT INTO t VALUES(2.5),(2.5E0);
Query OK, 2 rows affected, 2 warnings (0.00 sec)
Records: 2  Duplicates: 0  Warnings: 2

mysql> SELECT d FROM t;
+------+
| d    |
+------+
| 3    |
| 3    |
+------+

12.19.5. Precision Math Examples

This section provides some examples that show precision math query results in MySQL 5.7. These examples demonstrate the principles described in Section 12.19.3, “Expression Handling”, and Section 12.19.4, “Rounding Behavior”.

Example 1. Numbers are used with their exact value as given when possible:

mysql> SELECT (.1 + .2) = .3;
+----------------+
| (.1 + .2) = .3 |
+----------------+
|              1 |
+----------------+

For floating-point values, results are inexact:

mysql> SELECT (.1E0 + .2E0) = .3E0;
+----------------------+
| (.1E0 + .2E0) = .3E0 |
+----------------------+
|                    0 |
+----------------------+

Another way to see the difference in exact and approximate value handling is to add a small number to a sum many times. Consider the following stored procedure, which adds .0001 to a variable 1,000 times.

CREATE PROCEDURE p ()
BEGIN
  DECLARE i INT DEFAULT 0;
  DECLARE d DECIMAL(10,4) DEFAULT 0;
  DECLARE f FLOAT DEFAULT 0;
  WHILE i < 10000 DO
    SET d = d + .0001;
    SET f = f + .0001E0;
    SET i = i + 1;
  END WHILE;
  SELECT d, f;
END;

The sum for both d and f logically should be 1, but that is true only for the decimal calculation. The floating-point calculation introduces small errors:

+--------+------------------+
| d      | f                |
+--------+------------------+
| 1.0000 | 0.99999999999991 |
+--------+------------------+

Example 2. Multiplication is performed with the scale required by standard SQL. That is, for two numbers X1 and X2 that have scale S1 and S2, the scale of the result is S1 + S2:

mysql> SELECT .01 * .01;
+-----------+
| .01 * .01 |
+-----------+
| 0.0001    |
+-----------+

Example 3. Rounding behavior for exact-value numbers is well-defined:

Rounding behavior (for example, with the ROUND() function) is independent of the implementation of the underlying C library, which means that results are consistent from platform to platform.

  • Rounding for exact-value columns (DECIMAL and integer) and exact-valued numbers uses the round half away from zero rule. Values with a fractional part of .5 or greater are rounded away from zero to the nearest integer, as shown here:

    mysql> SELECT ROUND(2.5), ROUND(-2.5);
    +------------+-------------+
    | ROUND(2.5) | ROUND(-2.5) |
    +------------+-------------+
    | 3          | -3          |
    +------------+-------------+
    
  • Rounding for floating-point values uses the C library, which on many systems uses the round to nearest even rule. Values with any fractional part on such systems are rounded to the nearest even integer:

    mysql> SELECT ROUND(2.5E0), ROUND(-2.5E0);
    +--------------+---------------+
    | ROUND(2.5E0) | ROUND(-2.5E0) |
    +--------------+---------------+
    |            2 |            -2 |
    +--------------+---------------+
    

Example 4. In strict mode, inserting a value that is out of range for a column causes an error, rather than truncation to a legal value.

When MySQL is not running in strict mode, truncation to a legal value occurs:

mysql> SET sql_mode='';
Query OK, 0 rows affected (0.00 sec)

mysql> CREATE TABLE t (i TINYINT);
Query OK, 0 rows affected (0.01 sec)

mysql> INSERT INTO t SET i = 128;
Query OK, 1 row affected, 1 warning (0.00 sec)

mysql> SELECT i FROM t;
+------+
| i    |
+------+
|  127 |
+------+
1 row in set (0.00 sec)

However, an error occurs if strict mode is in effect:

mysql> SET sql_mode='STRICT_ALL_TABLES';
Query OK, 0 rows affected (0.00 sec)

mysql> CREATE TABLE t (i TINYINT);
Query OK, 0 rows affected (0.00 sec)

mysql> INSERT INTO t SET i = 128;
ERROR 1264 (22003): Out of range value adjusted for column 'i' at row 1

mysql> SELECT i FROM t;
Empty set (0.00 sec)

Example 5: In strict mode and with ERROR_FOR_DIVISION_BY_ZERO set, division by zero causes an error, not a result of NULL.

In nonstrict mode, division by zero has a result of NULL:

mysql> SET sql_mode='';
Query OK, 0 rows affected (0.01 sec)

mysql> CREATE TABLE t (i TINYINT);
Query OK, 0 rows affected (0.00 sec)

mysql> INSERT INTO t SET i = 1 / 0;
Query OK, 1 row affected (0.00 sec)

mysql> SELECT i FROM t;
+------+
| i    |
+------+
| NULL |
+------+
1 row in set (0.03 sec)

However, division by zero is an error if the proper SQL modes are in effect:

mysql> SET sql_mode='STRICT_ALL_TABLES,ERROR_FOR_DIVISION_BY_ZERO';
Query OK, 0 rows affected (0.00 sec)

mysql> CREATE TABLE t (i TINYINT);
Query OK, 0 rows affected (0.00 sec)

mysql> INSERT INTO t SET i = 1 / 0;
ERROR 1365 (22012): Division by 0

mysql> SELECT i FROM t;
Empty set (0.01 sec)

Example 6. Exact-value literals are evaluated as exact values.

Prior to MySQL 5.0.3, exact-value and approximate-value literals both are evaluated as double-precision floating-point values:

mysql> SELECT VERSION();
+------------+
| VERSION()  |
+------------+
| 4.1.18-log |
+------------+
1 row in set (0.01 sec)

mysql> CREATE TABLE t SELECT 2.5 AS a, 25E-1 AS b;
Query OK, 1 row affected (0.07 sec)
Records: 1  Duplicates: 0  Warnings: 0

mysql> DESCRIBE t;
+-------+-------------+------+-----+---------+-------+
| Field | Type        | Null | Key | Default | Extra |
+-------+-------------+------+-----+---------+-------+
| a     | double(3,1) |      |     | 0.0     |       |
| b     | double      |      |     | 0       |       |
+-------+-------------+------+-----+---------+-------+
2 rows in set (0.04 sec)

As of MySQL 5.0.3, the approximate-value literal is evaluated using floating point, but the exact-value literal is handled as DECIMAL:

mysql> SELECT VERSION();
+-----------------+
| VERSION()       |
+-----------------+
| 5.1.6-alpha-log |
+-----------------+
1 row in set (0.11 sec)

mysql> CREATE TABLE t SELECT 2.5 AS a, 25E-1 AS b;
Query OK, 1 row affected (0.01 sec)
Records: 1  Duplicates: 0  Warnings: 0

mysql> DESCRIBE t;
+-------+-----------------------+------+-----+---------+-------+
| Field | Type                  | Null | Key | Default | Extra |
+-------+-----------------------+------+-----+---------+-------+
| a     | decimal(2,1) unsigned | NO   |     | 0.0     |       |
| b     | double                | NO   |     | 0       |       |
+-------+-----------------------+------+-----+---------+-------+
2 rows in set (0.01 sec)

Example 7. If the argument to an aggregate function is an exact numeric type, the result is also an exact numeric type, with a scale at least that of the argument.

Consider these statements:

mysql> CREATE TABLE t (i INT, d DECIMAL, f FLOAT);
mysql> INSERT INTO t VALUES(1,1,1);
mysql> CREATE TABLE y SELECT AVG(i), AVG(d), AVG(f) FROM t;

Before MySQL 5.0.3, the result is a double no matter the argument type:

mysql> DESCRIBE y;
+--------+--------------+------+-----+---------+-------+
| Field  | Type         | Null | Key | Default | Extra |
+--------+--------------+------+-----+---------+-------+
| AVG(i) | double(17,4) | YES  |     | NULL    |       |
| AVG(d) | double(17,4) | YES  |     | NULL    |       |
| AVG(f) | double       | YES  |     | NULL    |       |
+--------+--------------+------+-----+---------+-------+

As of MySQL 5.0.3, the result is a double only for the floating-point argument. For exact type arguments, the result is also an exact type:

mysql> DESCRIBE y;
+--------+---------------+------+-----+---------+-------+
| Field  | Type          | Null | Key | Default | Extra |
+--------+---------------+------+-----+---------+-------+
| AVG(i) | decimal(14,4) | YES  |     | NULL    |       |
| AVG(d) | decimal(14,4) | YES  |     | NULL    |       |
| AVG(f) | double        | YES  |     | NULL    |       |
+--------+---------------+------+-----+---------+-------+

The result is a double only for the floating-point argument. For exact type arguments, the result is also an exact type.