Table of Contents
GROUP BY
Clauses
Expressions can be used at several points in
SQL statements, such as in the
ORDER BY
or HAVING
clauses of
SELECT
statements, in the
WHERE
clause of a
SELECT
,
DELETE
, or
UPDATE
statement, or in
SET
statements. Expressions can be written using literal values, column
values, NULL
, built-in functions, stored
functions, user-defined functions, and operators. This chapter
describes the functions and operators that are permitted for writing
expressions in MySQL. Instructions for writing stored functions and
user-defined functions are given in
Section 18.2, “Using Stored Routines (Procedures and Functions)”, and
Section 22.3, “Adding New Functions to MySQL”. See
Section 9.2.4, “Function Name Parsing and Resolution”, for the rules describing how
the server interprets references to different kinds of functions.
An expression that contains NULL
always produces
a NULL
value unless otherwise indicated in the
documentation for a particular function or operator.
By default, there must be no whitespace between a function name and the parenthesis following it. This helps the MySQL parser distinguish between function calls and references to tables or columns that happen to have the same name as a function. However, spaces around function arguments are permitted.
You can tell the MySQL server to accept spaces after function names
by starting it with the
--sql-mode=IGNORE_SPACE
option. (See
Section 5.1.7, “Server SQL Modes”.) Individual client programs can
request this behavior by using the
CLIENT_IGNORE_SPACE
option for
mysql_real_connect()
. In either
case, all function names become reserved words.
For the sake of brevity, most examples in this chapter display the output from the mysql program in abbreviated form. Rather than showing examples in this format:
mysql> SELECT MOD(29,9);
+-----------+
| mod(29,9) |
+-----------+
| 2 |
+-----------+
1 rows in set (0.00 sec)
This format is used instead:
mysql> SELECT MOD(29,9);
-> 2
Table 12.1. Functions/Operators
Name | Description |
---|---|
ABS() | Return the absolute value |
ACOS() | Return the arc cosine |
ADDDATE() | Add time values (intervals) to a date value |
ADDTIME() | Add time |
AES_DECRYPT() | Decrypt using AES |
AES_ENCRYPT() | Encrypt using AES |
AND , && | Logical AND |
ASCII() | Return numeric value of left-most character |
ASIN() | Return the arc sine |
= | Assign a value (as part of a SET statement, or as part of the SET clause in an UPDATE statement) |
:= | Assign a value |
ATAN2() , ATAN() | Return the arc tangent of the two arguments |
ATAN() | Return the arc tangent |
AVG() | Return the average value of the argument |
BENCHMARK() | Repeatedly execute an expression |
BETWEEN ... AND ... | Check whether a value is within a range of values |
BIN() | Return a string containing binary representation of a number |
BINARY | Cast a string to a binary string |
BIT_AND() | Return bitwise and |
BIT_COUNT() | Return the number of bits that are set |
BIT_LENGTH() | Return length of argument in bits |
BIT_OR() | Return bitwise or |
BIT_XOR() | Return bitwise xor |
& | Bitwise AND |
~ | Invert bits |
| | Bitwise OR |
^ | Bitwise XOR |
CASE | Case operator |
CAST() | Cast a value as a certain type |
CEIL() | Return the smallest integer value not less than the argument |
CEILING() | Return the smallest integer value not less than the argument |
CHAR_LENGTH() | Return number of characters in argument |
CHAR() | Return the character for each integer passed |
CHARACTER_LENGTH() | A synonym for CHAR_LENGTH() |
CHARSET() | Return the character set of the argument |
COALESCE() | Return the first non-NULL argument |
COERCIBILITY() | Return the collation coercibility value of the string argument |
COLLATION() | Return the collation of the string argument |
COMPRESS() | Return result as a binary string |
CONCAT_WS() | Return concatenate with separator |
CONCAT() | Return concatenated string |
CONNECTION_ID() | Return the connection ID (thread ID) for the connection |
CONV() | Convert numbers between different number bases |
CONVERT_TZ() | Convert from one timezone to another |
CONVERT() | Cast a value as a certain type |
COS() | Return the cosine |
COT() | Return the cotangent |
COUNT(DISTINCT) | Return the count of a number of different values |
COUNT() | Return a count of the number of rows returned |
CRC32() | Compute a cyclic redundancy check value |
CURDATE() | Return the current date |
CURRENT_DATE() , CURRENT_DATE | Synonyms for CURDATE() |
CURRENT_TIME() , CURRENT_TIME | Synonyms for CURTIME() |
CURRENT_TIMESTAMP() , CURRENT_TIMESTAMP | Synonyms for NOW() |
CURRENT_USER() , CURRENT_USER | The authenticated user name and host name |
CURTIME() | Return the current time |
DATABASE() | Return the default (current) database name |
DATE_ADD() | Add time values (intervals) to a date value |
DATE_FORMAT() | Format date as specified |
DATE_SUB() | Subtract a time value (interval) from a date |
DATE() | Extract the date part of a date or datetime expression |
DATEDIFF() | Subtract two dates |
DAY() | Synonym for DAYOFMONTH() |
DAYNAME() | Return the name of the weekday |
DAYOFMONTH() | Return the day of the month (0-31) |
DAYOFWEEK() | Return the weekday index of the argument |
DAYOFYEAR() | Return the day of the year (1-366) |
DECODE() | Decodes a string encrypted using ENCODE() |
DEFAULT() | Return the default value for a table column |
DEGREES() | Convert radians to degrees |
DES_DECRYPT() | Decrypt a string |
DES_ENCRYPT() | Encrypt a string |
DIV | Integer division |
/ | Division operator |
ELT() | Return string at index number |
ENCODE() | Encode a string |
ENCRYPT() | Encrypt a string |
<=> | NULL-safe equal to operator |
= | Equal operator |
EXP() | Raise to the power of |
EXPORT_SET() | Return a string such that for every bit set in the value bits, you get an on string and for every unset bit, you get an off string |
EXTRACT() | Extract part of a date |
ExtractValue() | Extracts a value from an XML string using XPath notation |
FIELD() | Return the index (position) of the first argument in the subsequent arguments |
FIND_IN_SET() | Return the index position of the first argument within the second argument |
FLOOR() | Return the largest integer value not greater than the argument |
FORMAT() | Return a number formatted to specified number of decimal places |
FOUND_ROWS() | For a SELECT with a LIMIT clause, the number of rows that would be returned were there no LIMIT clause |
FROM_BASE64() | Decode to a base-64 string and return result |
FROM_DAYS() | Convert a day number to a date |
FROM_UNIXTIME() | Format UNIX timestamp as a date |
GET_FORMAT() | Return a date format string |
GET_LOCK() | Get a named lock |
>= | Greater than or equal operator |
> | Greater than operator |
GREATEST() | Return the largest argument |
GROUP_CONCAT() | Return a concatenated string |
GTID_SUBSET() | Return true if all GTIDs in subset are also in set; otherwise false. |
GTID_SUBTRACT() | Return all GTIDs in set that are not in subset. |
HEX() | Return a hexadecimal representation of a decimal or string value |
HOUR() | Extract the hour |
IF() | If/else construct |
IFNULL() | Null if/else construct |
IN() | Check whether a value is within a set of values |
INET_ATON() | Return the numeric value of an IP address |
INET_NTOA() | Return the IP address from a numeric value |
INET6_ATON() | Return the numeric value of an IPv6 address |
INET6_NTOA() | Return the IPv6 address from a numeric value |
INSERT() | Insert a substring at the specified position up to the specified number of characters |
INSTR() | Return the index of the first occurrence of substring |
INTERVAL() | Return the index of the argument that is less than the first argument |
IS_FREE_LOCK() | Checks whether the named lock is free |
IS_IPV4_COMPAT() | Return true if argument is an IPv4-compatible address |
IS_IPV4_MAPPED() | Return true if argument is an IPv4-mapped address |
IS_IPV4() | Return true if argument is an IPv4 address |
IS_IPV6() | Return true if argument is an IPv6 address |
IS NOT NULL | NOT NULL value test |
IS NOT | Test a value against a boolean |
IS NULL | NULL value test |
IS_USED_LOCK() | Checks whether the named lock is in use. Return connection identifier if true. |
IS | Test a value against a boolean |
ISNULL() | Test whether the argument is NULL |
LAST_DAY | Return the last day of the month for the argument |
LAST_INSERT_ID() | Value of the AUTOINCREMENT column for the last INSERT |
LCASE() | Synonym for LOWER() |
LEAST() | Return the smallest argument |
<< | Left shift |
LEFT() | Return the leftmost number of characters as specified |
LENGTH() | Return the length of a string in bytes |
<= | Less than or equal operator |
< | Less than operator |
LIKE | Simple pattern matching |
LN() | Return the natural logarithm of the argument |
LOAD_FILE() | Load the named file |
LOCALTIME() , LOCALTIME | Synonym for NOW() |
LOCALTIMESTAMP , LOCALTIMESTAMP() | Synonym for NOW() |
LOCATE() | Return the position of the first occurrence of substring |
LOG10() | Return the base-10 logarithm of the argument |
LOG2() | Return the base-2 logarithm of the argument |
LOG() | Return the natural logarithm of the first argument |
LOWER() | Return the argument in lowercase |
LPAD() | Return the string argument, left-padded with the specified string |
LTRIM() | Remove leading spaces |
MAKE_SET() | Return a set of comma-separated strings that have the corresponding bit in bits set |
MAKEDATE() | Create a date from the year and day of year |
MAKETIME | MAKETIME() |
MASTER_POS_WAIT() | Block until the slave has read and applied all updates up to the specified position |
MATCH | Perform full-text search |
MAX() | Return the maximum value |
MD5() | Calculate MD5 checksum |
MICROSECOND() | Return the microseconds from argument |
MID() | Return a substring starting from the specified position |
MIN() | Return the minimum value |
- | Minus operator |
MINUTE() | Return the minute from the argument |
MOD() | Return the remainder |
% or MOD | Modulo operator |
MONTH() | Return the month from the date passed |
MONTHNAME() | Return the name of the month |
NAME_CONST() | Causes the column to have the given name |
NOT BETWEEN ... AND ... | Check whether a value is not within a range of values |
!= , <> | Not equal operator |
NOT IN() | Check whether a value is not within a set of values |
NOT LIKE | Negation of simple pattern matching |
NOT REGEXP | Negation of REGEXP |
NOT , ! | Negates value |
NOW() | Return the current date and time |
NULLIF() | Return NULL if expr1 = expr2 |
OCT() | Return a string containing octal representation of a number |
OCTET_LENGTH() | A synonym for LENGTH() |
OLD_PASSWORD() | Return the value of the pre-4.1 implementation of PASSWORD |
|| , OR | Logical OR |
ORD() | Return character code for leftmost character of the argument |
PASSWORD() | Calculate and return a password string |
PERIOD_ADD() | Add a period to a year-month |
PERIOD_DIFF() | Return the number of months between periods |
PI() | Return the value of pi |
+ | Addition operator |
POSITION() | A synonym for LOCATE() |
POW() | Return the argument raised to the specified power |
POWER() | Return the argument raised to the specified power |
PROCEDURE ANALYSE() | Analyze the results of a query |
QUARTER() | Return the quarter from a date argument |
QUOTE() | Escape the argument for use in an SQL statement |
RADIANS() | Return argument converted to radians |
RAND() | Return a random floating-point value |
REGEXP | Pattern matching using regular expressions |
RELEASE_LOCK() | Releases the named lock |
REPEAT() | Repeat a string the specified number of times |
REPLACE() | Replace occurrences of a specified string |
REVERSE() | Reverse the characters in a string |
>> | Right shift |
RIGHT() | Return the specified rightmost number of characters |
RLIKE | Synonym for REGEXP |
ROUND() | Round the argument |
ROW_COUNT() | The number of rows updated |
RPAD() | Append string the specified number of times |
RTRIM() | Remove trailing spaces |
SCHEMA() | A synonym for DATABASE() |
SEC_TO_TIME() | Converts seconds to 'HH:MM:SS' format |
SECOND() | Return the second (0-59) |
SESSION_USER() | Synonym for USER() |
SHA1() , SHA() | Calculate an SHA-1 160-bit checksum |
SHA2() | Calculate an SHA-2 checksum |
SIGN() | Return the sign of the argument |
SIN() | Return the sine of the argument |
SLEEP() | Sleep for a number of seconds |
SOUNDEX() | Return a soundex string |
SOUNDS LIKE | Compare sounds |
SPACE() | Return a string of the specified number of spaces |
SQRT() | Return the square root of the argument |
STD() | Return the population standard deviation |
STDDEV_POP() | Return the population standard deviation |
STDDEV_SAMP() | Return the sample standard deviation |
STDDEV() | Return the population standard deviation |
STR_TO_DATE() | Convert a string to a date |
STRCMP() | Compare two strings |
SUBDATE() | A synonym for DATE_SUB() when invoked with three arguments |
SUBSTR() | Return the substring as specified |
SUBSTRING_INDEX() | Return a substring from a string before the specified number of occurrences of the delimiter |
SUBSTRING() | Return the substring as specified |
SUBTIME() | Subtract times |
SUM() | Return the sum |
SYSDATE() | Return the time at which the function executes |
SYSTEM_USER() | Synonym for USER() |
TAN() | Return the tangent of the argument |
TIME_FORMAT() | Format as time |
TIME_TO_SEC() | Return the argument converted to seconds |
TIME() | Extract the time portion of the expression passed |
TIMEDIFF() | Subtract time |
* | Multiplication operator |
TIMESTAMP() | With a single argument, this function returns the date or datetime expression; with two arguments, the sum of the arguments |
TIMESTAMPADD() | Add an interval to a datetime expression |
TIMESTAMPDIFF() | Subtract an interval from a datetime expression |
TO_BASE64() | Return the argument converted to a base-64 string |
TO_DAYS() | Return the date argument converted to days |
TO_SECONDS() | Return the date or datetime argument converted to seconds since Year 0 |
TRIM() | Remove leading and trailing spaces |
TRUNCATE() | Truncate to specified number of decimal places |
UCASE() | Synonym for UPPER() |
- | Change the sign of the argument |
UNCOMPRESS() | Uncompress a string compressed |
UNCOMPRESSED_LENGTH() | Return the length of a string before compression |
UNHEX() | Return a string containing hex representation of a number |
UNIX_TIMESTAMP() | Return a UNIX timestamp |
UpdateXML() | Return replaced XML fragment |
UPPER() | Convert to uppercase |
USER() | The user name and host name provided by the client |
UTC_DATE() | Return the current UTC date |
UTC_TIME() | Return the current UTC time |
UTC_TIMESTAMP() | Return the current UTC date and time |
UUID_SHORT() | Return an integer-valued universal identifier |
UUID() | Return a Universal Unique Identifier (UUID) |
VALIDATE_PASSWORD_STRENGTH() | Determine strength of password |
VALUES() | Defines the values to be used during an INSERT |
VAR_POP() | Return the population standard variance |
VAR_SAMP() | Return the sample variance |
VARIANCE() | Return the population standard variance |
VERSION() | Returns a string that indicates the MySQL server version |
WAIT_UNTIL_SQL_THREAD_AFTER_GTIDS() | Wait until the slave SQL thread has executed all the given GTIDs. Returns: the number of events that were executed (or NULL, if GTID mode is not enabled). |
WEEK() | Return the week number |
WEEKDAY() | Return the weekday index |
WEEKOFYEAR() | Return the calendar week of the date (0-53) |
WEIGHT_STRING() | Return the weight string for a string |
XOR | Logical XOR |
YEAR() | Return the year |
YEARWEEK() | Return the year and week |
When an operator is used with operands of different types, type conversion occurs to make the operands compatible. Some conversions occur implicitly. For example, MySQL automatically converts numbers to strings as necessary, and vice versa.
mysql>SELECT 1+'1';
-> 2 mysql>SELECT CONCAT(2,' test');
-> '2 test'
It is also possible to convert a number to a string explicitly
using the CAST()
function.
Conversion occurs implicitly with the
CONCAT()
function because it
expects string arguments.
mysql>SELECT 38.8, CAST(38.8 AS CHAR);
-> 38.8, '38.8' mysql>SELECT 38.8, CONCAT(38.8);
-> 38.8, '38.8'
See later in this section for information about the character set of implicit number-to-string conversions.
The following rules describe how conversion occurs for comparison operations:
If one or both arguments are NULL
, the
result of the comparison is NULL
, except
for the NULL
-safe
<=>
equality comparison operator. For NULL <=>
NULL
, the result is true. No conversion is needed.
If both arguments in a comparison operation are strings, they are compared as strings.
If both arguments are integers, they are compared as integers.
Hexadecimal values are treated as binary strings if not compared to a number.
If one of the arguments is a
TIMESTAMP
or
DATETIME
column and the other
argument is a constant, the constant is converted to a
timestamp before the comparison is performed. This is done to
be more ODBC-friendly. Note that this is not done for the
arguments to IN()
! To be safe,
always use complete datetime, date, or time strings when doing
comparisons. For example, to achieve best results when using
BETWEEN
with date or time values,
use CAST()
to explicitly
convert the values to the desired data type.
A single-row subquery from a table or tables is not considered
a constant. For example, if a subquery returns an integer to
be compared to a DATETIME
value, the comparison is done as two integers. The integer is
not converted to a temporal value. To compare the operands as
DATETIME
values, use
CAST()
to explicitly convert
the subquery value to DATETIME
.
If one of the arguments is a decimal value, comparison depends on the other argument. The arguments are compared as decimal values if the other argument is a decimal or integer value, or as floating-point values if the other argument is a floating-point value.
In all other cases, the arguments are compared as floating-point (real) numbers.
For information about conversion of values from one temporal type to another, see Section 11.3.7, “Conversion Between Date and Time Types”.
The following examples illustrate conversion of strings to numbers for comparison operations:
mysql>SELECT 1 > '6x';
-> 0 mysql>SELECT 7 > '6x';
-> 1 mysql>SELECT 0 > 'x6';
-> 0 mysql>SELECT 0 = 'x6';
-> 1
For comparisons of a string column with a number, MySQL cannot use
an index on the column to look up the value quickly. If
str_col
is an indexed string column,
the index cannot be used when performing the lookup in the
following statement:
SELECT * FROMtbl_name
WHEREstr_col
=1;
The reason for this is that there are many different strings that
may convert to the value 1
, such as
'1'
, ' 1'
, or
'1a'
.
Comparisons that use floating-point numbers (or values that are converted to floating-point numbers) are approximate because such numbers are inexact. This might lead to results that appear inconsistent:
mysql>SELECT '18015376320243458' = 18015376320243458;
-> 1 mysql>SELECT '18015376320243459' = 18015376320243459;
-> 0
Such results can occur because the values are converted to floating-point numbers, which have only 53 bits of precision and are subject to rounding:
mysql> SELECT '18015376320243459'+0.0;
-> 1.8015376320243e+16
Furthermore, the conversion from string to floating-point and from integer to floating-point do not necessarily occur the same way. The integer may be converted to floating-point by the CPU, whereas the string is converted digit by digit in an operation that involves floating-point multiplications.
The results shown will vary on different systems, and can be
affected by factors such as computer architecture or the compiler
version or optimization level. One way to avoid such problems is
to use CAST()
so that a value will
not be converted implicitly to a float-point number:
mysql> SELECT CAST('18015376320243459' AS UNSIGNED) = 18015376320243459;
-> 1
For more information about floating-point comparisons, see Section C.5.5.8, “Problems with Floating-Point Values”.
In MySQL 5.7, the server includes
dtoa
, a conversion library that provides the
basis for improved conversion between string or
DECIMAL
values and
approximate-value
(FLOAT
/DOUBLE
)
numbers:
Consistent conversion results across platforms, which eliminates, for example, Unix versus Windows conversion differences.
Accurate representation of values in cases where results previously did not provide sufficient precision, such as for values close to IEEE limits.
Conversion of numbers to string format with the best possible
precision. The precision of dtoa
is always
the same or better than that of the standard C library
functions.
Because the conversions produced by this library differ in some cases from previous results, the potential exists for incompatibilities in applications that rely on previous results. For example, applications that depend on a specific exact result from previous conversions might need adjustment to accommodate additional precision.
The dtoa
library provides conversions with the
following properties. D
represents a
value with a DECIMAL
or string
representation, and F
represents a
floating-point number in native binary (IEEE) format.
F
->
D
conversion is done with the best
possible precision, returning D
as
the shortest string that yields F
when read back in and rounded to the nearest value in native
binary format as specified by IEEE.
D
->
F
conversion is done such that
F
is the nearest native binary
number to the input decimal string
D
.
These properties imply that F
->
D
-> F
conversions are lossless unless F
is
-inf
, +inf
, or
NaN
. The latter values are not supported
because the SQL standard defines them as invalid values for
FLOAT
or
DOUBLE
.
For D
->
F
-> D
conversions, a sufficient condition for losslessness is that
D
uses 15 or fewer digits of precision,
is not a denormal value, -inf
,
+inf
, or NaN
. In some cases,
the conversion is lossless even if D
has more than 15 digits of precision, but this is not always the
case.
In MySQL 5.7, implicit conversion of a numeric or
temporal value to string produces a value that has a character set
and collation determined by the
character_set_connection
and
collation_connection
system
variables. (These variables commonly are set with
SET
NAMES
. For information about connection character sets,
see Section 10.1.4, “Connection Character Sets and Collations”.)
This means that such a conversion results in a character
(nonbinary) string (a CHAR
,
VARCHAR
, or
LONGTEXT
value), except in the case
that the connection character set is set to
binary
. In that case, the conversion result is
a binary string (a BINARY
,
VARBINARY
, or
LONGBLOB
value).
Table 12.2. Operators
Name | Description |
---|---|
AND , && | Logical AND |
= | Assign a value (as part of a SET statement, or as part of the SET clause in an UPDATE statement) |
:= | Assign a value |
BETWEEN ... AND ... | Check whether a value is within a range of values |
BINARY | Cast a string to a binary string |
& | Bitwise AND |
~ | Invert bits |
| | Bitwise OR |
^ | Bitwise XOR |
CASE | Case operator |
DIV | Integer division |
/ | Division operator |
<=> | NULL-safe equal to operator |
= | Equal operator |
>= | Greater than or equal operator |
> | Greater than operator |
IS NOT NULL | NOT NULL value test |
IS NOT | Test a value against a boolean |
IS NULL | NULL value test |
IS | Test a value against a boolean |
<< | Left shift |
<= | Less than or equal operator |
< | Less than operator |
LIKE | Simple pattern matching |
- | Minus operator |
% or MOD | Modulo operator |
NOT BETWEEN ... AND ... | Check whether a value is not within a range of values |
!= , <> | Not equal operator |
NOT LIKE | Negation of simple pattern matching |
NOT REGEXP | Negation of REGEXP |
NOT , ! | Negates value |
|| , OR | Logical OR |
+ | Addition operator |
REGEXP | Pattern matching using regular expressions |
>> | Right shift |
RLIKE | Synonym for REGEXP |
SOUNDS LIKE | Compare sounds |
* | Multiplication operator |
- | Change the sign of the argument |
XOR | Logical XOR |
Operator precedences are shown in the following list, from highest precedence to the lowest. Operators that are shown together on a line have the same precedence.
INTERVAL BINARY, COLLATE ! - (unary minus), ~ (unary bit inversion) ^ *, /, DIV, %, MOD -, + <<, >> & | = (comparison), <=>, >=, >, <=, <, <>, !=, IS, LIKE, REGEXP, IN BETWEEN, CASE, WHEN, THEN, ELSE NOT &&, AND XOR ||, OR = (assignment), :=
The precedence of =
depends on whether it is
used as a comparison operator
(=
) or as an
assignment operator
(=
). When
used as a comparison operator, it has the same precedence as
<=>
,
>=
,
>
,
<=
,
<
,
<>
,
!=
,
IS
,
LIKE
,
REGEXP
, and
IN
. When used as an assignment
operator, it has the same precedence as
:=
.
Section 13.7.4, “SET
Syntax”, and
Section 9.4, “User-Defined Variables”, explain how MySQL determines
which interpretation of =
should apply.
The meaning of some operators depends on the SQL mode:
By default, ||
is a logical OR
operator. With
PIPES_AS_CONCAT
enabled,
||
is string
concatenation, with a precedence between
^
and
the unary operators.
By default, !
has a higher precedence than NOT
. With
HIGH_NOT_PRECEDENCE
enabled, !
and
NOT
have the same precedence.
See Section 5.1.7, “Server SQL Modes”.
The precedence of operators determines the order of evaluation of terms in an expression. To override this order and group terms explicitly, use parentheses. For example:
mysql>SELECT 1+2*3;
-> 7 mysql>SELECT (1+2)*3;
-> 9
Table 12.3. Comparison Operators
Name | Description |
---|---|
BETWEEN ... AND ... | Check whether a value is within a range of values |
COALESCE() | Return the first non-NULL argument |
<=> | NULL-safe equal to operator |
= | Equal operator |
>= | Greater than or equal operator |
> | Greater than operator |
GREATEST() | Return the largest argument |
IN() | Check whether a value is within a set of values |
INTERVAL() | Return the index of the argument that is less than the first argument |
IS NOT NULL | NOT NULL value test |
IS NOT | Test a value against a boolean |
IS NULL | NULL value test |
IS | Test a value against a boolean |
ISNULL() | Test whether the argument is NULL |
LEAST() | Return the smallest argument |
<= | Less than or equal operator |
< | Less than operator |
LIKE | Simple pattern matching |
NOT BETWEEN ... AND ... | Check whether a value is not within a range of values |
!= , <> | Not equal operator |
NOT IN() | Check whether a value is not within a set of values |
NOT LIKE | Negation of simple pattern matching |
STRCMP() | Compare two strings |
Comparison operations result in a value of 1
(TRUE
), 0
(FALSE
), or NULL
. These
operations work for both numbers and strings. Strings are
automatically converted to numbers and numbers to strings as
necessary.
The following relational comparison operators can be used to compare not only scalar operands, but row operands:
= > < >= <= <> !=
For examples of row comparisons, see Section 13.2.10.5, “Row Subqueries”.
Some of the functions in this section return values other than
1
(TRUE
),
0
(FALSE
), or
NULL
. For example,
LEAST()
and
GREATEST()
. However, the value
they return is based on comparison operations performed
according to the rules described in
Section 12.2, “Type Conversion in Expression Evaluation”.
To convert a value to a specific type for comparison purposes,
you can use the CAST()
function.
String values can be converted to a different character set
using CONVERT()
. See
Section 12.10, “Cast Functions and Operators”.
By default, string comparisons are not case sensitive and use
the current character set. The default is
latin1
(cp1252 West European), which also
works well for English.
Equal:
mysql>SELECT 1 = 0;
-> 0 mysql>SELECT '0' = 0;
-> 1 mysql>SELECT '0.0' = 0;
-> 1 mysql>SELECT '0.01' = 0;
-> 0 mysql>SELECT '.01' = 0.01;
-> 1
NULL
-safe equal. This operator performs
an equality comparison like the
=
operator,
but returns 1
rather than
NULL
if both operands are
NULL
, and 0
rather
than NULL
if one operand is
NULL
.
mysql>SELECT 1 <=> 1, NULL <=> NULL, 1 <=> NULL;
-> 1, 1, 0 mysql>SELECT 1 = 1, NULL = NULL, 1 = NULL;
-> 1, NULL, NULL
Not equal:
mysql>SELECT '.01' <> '0.01';
-> 1 mysql>SELECT .01 <> '0.01';
-> 0 mysql>SELECT 'zapp' <> 'zappp';
-> 1
Less than or equal:
mysql> SELECT 0.1 <= 2;
-> 1
Less than:
mysql> SELECT 2 < 2;
-> 0
Greater than or equal:
mysql> SELECT 2 >= 2;
-> 1
Greater than:
mysql> SELECT 2 > 2;
-> 0
Tests a value against a boolean value, where
boolean_value
can be
TRUE
, FALSE
, or
UNKNOWN
.
mysql> SELECT 1 IS TRUE, 0 IS FALSE, NULL IS UNKNOWN;
-> 1, 1, 1
Tests a value against a boolean value, where
boolean_value
can be
TRUE
, FALSE
, or
UNKNOWN
.
mysql> SELECT 1 IS NOT UNKNOWN, 0 IS NOT UNKNOWN, NULL IS NOT UNKNOWN;
-> 1, 1, 0
Tests whether a value is NULL
.
mysql> SELECT 1 IS NULL, 0 IS NULL, NULL IS NULL;
-> 0, 0, 1
To work well with ODBC programs, MySQL supports the
following extra features when using IS
NULL
:
If sql_auto_is_null
variable is set to 1, then after a statement that
successfully inserts an automatically generated
AUTO_INCREMENT
value, you can find
that value by issuing a statement of the following form:
SELECT * FROMtbl_name
WHEREauto_col
IS NULL
If the statement returns a row, the value returned is
the same as if you invoked the
LAST_INSERT_ID()
function. For details, including the return value after
a multiple-row insert, see
Section 12.14, “Information Functions”. If no
AUTO_INCREMENT
value was successfully
inserted, the SELECT
statement returns no row.
The behavior of retrieving an
AUTO_INCREMENT
value by using an
IS NULL
comparison can be
disabled by setting
sql_auto_is_null = 0
.
See Section 5.1.4, “Server System Variables”.
The default value of
sql_auto_is_null
is 0
in MySQL 5.7.
For DATE
and
DATETIME
columns that are
declared as NOT NULL
, you can find
the special date '0000-00-00'
by
using a statement like this:
SELECT * FROMtbl_name
WHEREdate_column
IS NULL
This is needed to get some ODBC applications to work
because ODBC does not support a
'0000-00-00'
date value.
See
Section 21.1.7.1.1, “Obtaining Auto-Increment Values”,
and the description for the
FLAG_AUTO_IS_NULL
option at
Section 21.1.4.2, “Connector/ODBC Connection Parameters”.
Tests whether a value is not NULL
.
mysql> SELECT 1 IS NOT NULL, 0 IS NOT NULL, NULL IS NOT NULL;
-> 1, 1, 0
If expr
is greater than or equal
to min
and
expr
is less than or equal to
max
,
BETWEEN
returns
1
, otherwise it returns
0
. This is equivalent to the expression
(
if all the
arguments are of the same type. Otherwise type conversion
takes place according to the rules described in
Section 12.2, “Type Conversion in Expression Evaluation”, but applied to all the
three arguments.
min
<=
expr
AND
expr
<=
max
)
mysql>SELECT 2 BETWEEN 1 AND 3, 2 BETWEEN 3 and 1;
-> 1, 0 mysql>SELECT 1 BETWEEN 2 AND 3;
-> 0 mysql>SELECT 'b' BETWEEN 'a' AND 'c';
-> 1 mysql>SELECT 2 BETWEEN 2 AND '3';
-> 1 mysql>SELECT 2 BETWEEN 2 AND 'x-3';
-> 0
For best results when using
BETWEEN
with date or time
values, use CAST()
to
explicitly convert the values to the desired data type.
Examples: If you compare a
DATETIME
to two
DATE
values, convert the
DATE
values to
DATETIME
values. If you use a
string constant such as '2001-1-1'
in a
comparison to a DATE
, cast
the string to a DATE
.
This is the same as NOT
(
.
expr
BETWEEN
min
AND
max
)
Returns the first non-NULL
value in the
list, or NULL
if there are no
non-NULL
values.
mysql>SELECT COALESCE(NULL,1);
-> 1 mysql>SELECT COALESCE(NULL,NULL,NULL);
-> NULL
With two or more arguments, returns the largest
(maximum-valued) argument. The arguments are compared using
the same rules as for
LEAST()
.
mysql>SELECT GREATEST(2,0);
-> 2 mysql>SELECT GREATEST(34.0,3.0,5.0,767.0);
-> 767.0 mysql>SELECT GREATEST('B','A','C');
-> 'C'
GREATEST()
returns
NULL
if any argument is
NULL
.
Returns 1
if
expr
is equal to any of the
values in the IN
list, else returns
0
. If all values are constants, they are
evaluated according to the type of
expr
and sorted. The search for
the item then is done using a binary search. This means
IN
is very quick if the
IN
value list consists entirely of
constants. Otherwise, type conversion takes place according
to the rules described in Section 12.2, “Type Conversion in Expression Evaluation”,
but applied to all the arguments.
mysql>SELECT 2 IN (0,3,5,7);
-> 0 mysql>SELECT 'wefwf' IN ('wee','wefwf','weg');
-> 1
You should never mix quoted and unquoted values in an
IN
list because the comparison rules for
quoted values (such as strings) and unquoted values (such as
numbers) differ. Mixing types may therefore lead to
inconsistent results. For example, do not write an
IN
expression like this:
SELECT val1 FROM tbl1 WHERE val1 IN (1,2,'a');
Instead, write it like this:
SELECT val1 FROM tbl1 WHERE val1 IN ('1','2','a');
The number of values in the IN
list is
only limited by the
max_allowed_packet
value.
To comply with the SQL standard, IN
returns NULL
not only if the expression
on the left hand side is NULL
, but also
if no match is found in the list and one of the expressions
in the list is NULL
.
IN()
syntax can also be used to write
certain types of subqueries. See
Section 13.2.10.3, “Subqueries with ANY
, IN
, or
SOME
”.
This is the same as NOT
(
.
expr
IN
(value
,...))
If expr
is
NULL
,
ISNULL()
returns
1
, otherwise it returns
0
.
mysql>SELECT ISNULL(1+1);
-> 0 mysql>SELECT ISNULL(1/0);
-> 1
ISNULL()
can be used instead
of =
to test
whether a value is NULL
. (Comparing a
value to NULL
using
=
always
yields false.)
The ISNULL()
function shares
some special behaviors with the
IS NULL
comparison operator. See the description of
IS NULL
.
Returns 0
if N
< N1
, 1
if
N
<
N2
and so on or
-1
if N
is
NULL
. All arguments are treated as
integers. It is required that N1
< N2
<
N3
< ...
< Nn
for this function to work
correctly. This is because a binary search is used (very
fast).
mysql>SELECT INTERVAL(23, 1, 15, 17, 30, 44, 200);
-> 3 mysql>SELECT INTERVAL(10, 1, 10, 100, 1000);
-> 2 mysql>SELECT INTERVAL(22, 23, 30, 44, 200);
-> 0
With two or more arguments, returns the smallest (minimum-valued) argument. The arguments are compared using the following rules:
If any argument is NULL
, the result
is NULL
. No comparison is needed.
If the return value is used in an
INTEGER
context or all
arguments are integer-valued, they are compared as
integers.
If the return value is used in a
REAL
context or all
arguments are real-valued, they are compared as reals.
If the arguments comprise a mix of numbers and strings, they are compared as numbers.
If any argument is a nonbinary (character) string, the arguments are compared as nonbinary strings.
In all other cases, the arguments are compared as binary strings.
mysql>SELECT LEAST(2,0);
-> 0 mysql>SELECT LEAST(34.0,3.0,5.0,767.0);
-> 3.0 mysql>SELECT LEAST('B','A','C');
-> 'A'
Note that the preceding conversion rules can produce strange results in some borderline cases:
mysql> SELECT CAST(LEAST(3600, 9223372036854775808.0) as SIGNED);
-> -9223372036854775808
This happens because MySQL reads
9223372036854775808.0
in an integer
context. The integer representation is not good enough to
hold the value, so it wraps to a signed integer.
In SQL, all logical operators evaluate to
TRUE
, FALSE
, or
NULL
(UNKNOWN
). In MySQL,
these are implemented as 1 (TRUE
), 0
(FALSE
), and NULL
. Most of
this is common to different SQL database servers, although some
servers may return any nonzero value for
TRUE
.
MySQL evaluates any nonzero, non-NULL
value
to TRUE
. For example, the following
statements all assess to TRUE
:
mysql>SELECT 10 IS TRUE;
-> 1 mysql>SELECT -10 IS TRUE;
-> 1 mysql>SELECT 'string' IS NOT NULL;
-> 1
Logical NOT. Evaluates to 1
if the
operand is 0
, to 0
if
the operand is nonzero, and NOT NULL
returns NULL
.
mysql>SELECT NOT 10;
-> 0 mysql>SELECT NOT 0;
-> 1 mysql>SELECT NOT NULL;
-> NULL mysql>SELECT ! (1+1);
-> 0 mysql>SELECT ! 1+1;
-> 1
The last example produces 1
because the
expression evaluates the same way as
(!1)+1
.
Logical AND. Evaluates to 1
if all
operands are nonzero and not NULL
, to
0
if one or more operands are
0
, otherwise NULL
is
returned.
mysql>SELECT 1 && 1;
-> 1 mysql>SELECT 1 && 0;
-> 0 mysql>SELECT 1 && NULL;
-> NULL mysql>SELECT 0 && NULL;
-> 0 mysql>SELECT NULL && 0;
-> 0
Logical OR. When both operands are
non-NULL
, the result is
1
if any operand is nonzero, and
0
otherwise. With a
NULL
operand, the result is
1
if the other operand is nonzero, and
NULL
otherwise. If both operands are
NULL
, the result is
NULL
.
mysql>SELECT 1 || 1;
-> 1 mysql>SELECT 1 || 0;
-> 1 mysql>SELECT 0 || 0;
-> 0 mysql>SELECT 0 || NULL;
-> NULL mysql>SELECT 1 || NULL;
-> 1
Logical XOR. Returns NULL
if either
operand is NULL
. For
non-NULL
operands, evaluates to
1
if an odd number of operands is
nonzero, otherwise 0
is returned.
mysql>SELECT 1 XOR 1;
-> 0 mysql>SELECT 1 XOR 0;
-> 1 mysql>SELECT 1 XOR NULL;
-> NULL mysql>SELECT 1 XOR 1 XOR 1;
-> 1
a XOR b
is mathematically equal to
(a AND (NOT b)) OR ((NOT a) and b)
.
Assignment operator. Causes the user variable on the left
hand side of the operator to take on the value to its right.
The value on the right hand side may be a literal value,
another variable storing a value, or any legal expression
that yields a scalar value, including the result of a query
(provided that this value is a scalar value). You can
perform multiple assignments in the same
SET
statement. You can perform multiple assignments in the same
statement-
Unlike
=
, the
:=
operator is never interpreted as a comparison operator. This
means you can use
:=
in
any valid SQL statement (not just in
SET
statements) to assign a value to a variable.
mysql>SELECT @var1, @var2;
-> NULL, NULL mysql>SELECT @var1 := 1, @var2;
-> 1, NULL mysql>SELECT @var1, @var2;
-> 1, NULL mysql>SELECT @var1, @var2 := @var1;
-> 1, 1 mysql>SELECT @var1, @var2;
-> 1, 1 mysql>SELECT @var1:=COUNT(*) FROM t1;
-> 4 mysql>SELECT @var1;
-> 4
You can make value assignments using
:=
in
other statements besides
SELECT
, such as
UPDATE
, as shown here:
mysql>SELECT @var1;
-> 4 mysql>SELECT * FROM t1;
-> 1, 3, 5, 7 mysql>UPDATE t1 SET c1 = 2 WHERE c1 = @var1:= 1;
Query OK, 1 row affected (0.00 sec) Rows matched: 1 Changed: 1 Warnings: 0 mysql>SELECT @var1;
-> 1 mysql>SELECT * FROM t1;
-> 2, 3, 5, 7
While it is also possible both to set and to read the value
of the same variable in a single SQL statement using the
:=
operator, this is not recommended.
Section 9.4, “User-Defined Variables”, explains why you should
avoid doing this.
This operator is used to perform value assignments in two cases, described in the next two paragraphs.
Within a
SET
statement, =
is treated as an assignment
operator that causes the user variable on the left hand side
of the operator to take on the value to its right. (In other
words, when used in a
SET
statement, =
is treated identically to
:=
.)
The value on the right hand side may be a literal value,
another variable storing a value, or any legal expression
that yields a scalar value, including the result of a query
(provided that this value is a scalar value). You can
perform multiple assignments in the same
SET
statement.
In the SET
clause of an
UPDATE
statement,
=
also acts as an assignment operator; in
this case, however, it causes the column named on the left
hand side of the operator to assume the value given to the
right, provided any WHERE
conditions that
are part of the UPDATE
are
met. You can make multiple assignments in the same
SET
clause of an
UPDATE
statement.
In any other context, =
is treated as a
comparison operator.
mysql>SELECT @var1, @var2;
-> NULL, NULL mysql>SELECT @var1 := 1, @var2;
-> 1, NULL mysql>SELECT @var1, @var2;
-> 1, NULL mysql>SELECT @var1, @var2 := @var1;
-> 1, 1 mysql>SELECT @var1, @var2;
-> 1, 1
For more information, see Section 13.7.4, “SET
Syntax”,
Section 13.2.11, “UPDATE
Syntax”, and Section 13.2.10, “Subquery Syntax”.
CASE
value
WHEN
[compare_value
] THEN
result
[WHEN
[compare_value
] THEN
result
...] [ELSE
result
] END
CASE WHEN
[
condition
] THEN
result
[WHEN
[condition
] THEN
result
...] [ELSE
result
] END
The first version returns the
result
where
.
The second version returns the result for the first condition
that is true. If there was no matching result value, the
result after value
=compare_value
ELSE
is returned, or
NULL
if there is no ELSE
part.
mysql>SELECT CASE 1 WHEN 1 THEN 'one'
->WHEN 2 THEN 'two' ELSE 'more' END;
-> 'one' mysql>SELECT CASE WHEN 1>0 THEN 'true' ELSE 'false' END;
-> 'true' mysql>SELECT CASE BINARY 'B'
->WHEN 'a' THEN 1 WHEN 'b' THEN 2 END;
-> NULL
The return type of a CASE
expression is the compatible aggregated type of all return
values, but also depends on the context in which it is used.
If used in a string context, the result is returned as a
string. If used in a numeric context, the result is returned
as a decimal, real, or integer value.
The syntax of the CASE
expression shown here differs slightly
from that of the SQL CASE
statement described in
Section 13.6.5.1, “CASE
Syntax”, for use inside stored programs. The
CASE
statement cannot have an
ELSE NULL
clause, and it is terminated
with END CASE
instead of
END
.
If expr1
is TRUE
(
and expr1
<>
0
) then
expr1
<> NULLIF()
returns
expr2
; otherwise it returns
expr3
.
IF()
returns a numeric or
string value, depending on the context in which it is used.
mysql>SELECT IF(1>2,2,3);
-> 3 mysql>SELECT IF(1<2,'yes','no');
-> 'yes' mysql>SELECT IF(STRCMP('test','test1'),'no','yes');
-> 'no'
If only one of expr2
or
expr3
is explicitly
NULL
, the result type of the
IF()
function is the type of
the non-NULL
expression.
The default return type of IF()
(which may matter when it is stored into a temporary table) is
calculated as follows.
Expression | Return Value |
---|---|
expr2 or expr3
returns a string | string |
expr2 or expr3
returns a floating-point value | floating-point |
expr2 or expr3
returns an integer | integer |
If expr2
and
expr3
are both strings, the result
is case sensitive if either string is case sensitive.
There is also an IF
statement, which differs from the
IF()
function described here. See
Section 13.6.5.2, “IF
Syntax”.
If expr1
is not
NULL
,
IFNULL()
returns
expr1
; otherwise it returns
expr2
.
IFNULL()
returns a numeric or
string value, depending on the context in which it is used.
mysql>SELECT IFNULL(1,0);
-> 1 mysql>SELECT IFNULL(NULL,10);
-> 10 mysql>SELECT IFNULL(1/0,10);
-> 10 mysql>SELECT IFNULL(1/0,'yes');
-> 'yes'
The default result value of
IFNULL(
is the more “general” of the two expressions, in
the order expr1
,expr2
)STRING
,
REAL
, or
INTEGER
. Consider the case of a
table based on expressions or where MySQL must internally
store a value returned by
IFNULL()
in a temporary table:
mysql>CREATE TABLE tmp SELECT IFNULL(1,'test') AS test;
mysql>DESCRIBE tmp;
+-------+--------------+------+-----+---------+-------+ | Field | Type | Null | Key | Default | Extra | +-------+--------------+------+-----+---------+-------+ | test | varbinary(4) | NO | | | | +-------+--------------+------+-----+---------+-------+
In this example, the type of the test
column is VARBINARY(4)
.
Returns NULL
if
is true, otherwise
returns expr1
=
expr2
expr1
. This is the same as
CASE WHEN
.
expr1
=
expr2
THEN NULL ELSE
expr1
END
mysql>SELECT NULLIF(1,1);
-> NULL mysql>SELECT NULLIF(1,2);
-> 1
Note that MySQL evaluates expr1
twice if the arguments are not equal.
Table 12.7. String Operators
Name | Description |
---|---|
ASCII() | Return numeric value of left-most character |
BIN() | Return a string containing binary representation of a number |
BIT_LENGTH() | Return length of argument in bits |
CHAR_LENGTH() | Return number of characters in argument |
CHAR() | Return the character for each integer passed |
CHARACTER_LENGTH() | A synonym for CHAR_LENGTH() |
CONCAT_WS() | Return concatenate with separator |
CONCAT() | Return concatenated string |
ELT() | Return string at index number |
EXPORT_SET() | Return a string such that for every bit set in the value bits, you get an on string and for every unset bit, you get an off string |
FIELD() | Return the index (position) of the first argument in the subsequent arguments |
FIND_IN_SET() | Return the index position of the first argument within the second argument |
FORMAT() | Return a number formatted to specified number of decimal places |
FROM_BASE64() | Decode to a base-64 string and return result |
HEX() | Return a hexadecimal representation of a decimal or string value |
INSERT() | Insert a substring at the specified position up to the specified number of characters |
INSTR() | Return the index of the first occurrence of substring |
LCASE() | Synonym for LOWER() |
LEFT() | Return the leftmost number of characters as specified |
LENGTH() | Return the length of a string in bytes |
LIKE | Simple pattern matching |
LOAD_FILE() | Load the named file |
LOCATE() | Return the position of the first occurrence of substring |
LOWER() | Return the argument in lowercase |
LPAD() | Return the string argument, left-padded with the specified string |
LTRIM() | Remove leading spaces |
MAKE_SET() | Return a set of comma-separated strings that have the corresponding bit in bits set |
MATCH | Perform full-text search |
MID() | Return a substring starting from the specified position |
NOT LIKE | Negation of simple pattern matching |
NOT REGEXP | Negation of REGEXP |
OCT() | Return a string containing octal representation of a number |
OCTET_LENGTH() | A synonym for LENGTH() |
ORD() | Return character code for leftmost character of the argument |
POSITION() | A synonym for LOCATE() |
QUOTE() | Escape the argument for use in an SQL statement |
REGEXP | Pattern matching using regular expressions |
REPEAT() | Repeat a string the specified number of times |
REPLACE() | Replace occurrences of a specified string |
REVERSE() | Reverse the characters in a string |
RIGHT() | Return the specified rightmost number of characters |
RLIKE | Synonym for REGEXP |
RPAD() | Append string the specified number of times |
RTRIM() | Remove trailing spaces |
SOUNDEX() | Return a soundex string |
SOUNDS LIKE | Compare sounds |
SPACE() | Return a string of the specified number of spaces |
STRCMP() | Compare two strings |
SUBSTR() | Return the substring as specified |
SUBSTRING_INDEX() | Return a substring from a string before the specified number of occurrences of the delimiter |
SUBSTRING() | Return the substring as specified |
TO_BASE64() | Return the argument converted to a base-64 string |
TRIM() | Remove leading and trailing spaces |
UCASE() | Synonym for UPPER() |
UNHEX() | Return a string containing hex representation of a number |
UPPER() | Convert to uppercase |
WEIGHT_STRING() | Return the weight string for a string |
String-valued functions return NULL
if the
length of the result would be greater than the value of the
max_allowed_packet
system
variable. See Section 8.11.2, “Tuning Server Parameters”.
For functions that operate on string positions, the first position is numbered 1.
For functions that take length arguments, noninteger arguments are rounded to the nearest integer.
Returns the numeric value of the leftmost character of the
string str
. Returns
0
if str
is the
empty string. Returns NULL
if
str
is NULL
.
ASCII()
works for 8-bit
characters.
mysql>SELECT ASCII('2');
-> 50 mysql>SELECT ASCII(2);
-> 50 mysql>SELECT ASCII('dx');
-> 100
See also the ORD()
function.
Returns a string representation of the binary value of
N
, where
N
is a longlong
(BIGINT
) number. This is
equivalent to
CONV(
.
Returns N
,10,2)NULL
if
N
is NULL
.
mysql> SELECT BIN(12);
-> '1100'
Returns the length of the string
str
in bits.
mysql> SELECT BIT_LENGTH('text');
-> 32
CHAR(
N
,...
[USING charset_name
])
CHAR()
interprets each argument
N
as an integer and returns a
string consisting of the characters given by the code values
of those integers. NULL
values are skipped.
mysql>SELECT CHAR(77,121,83,81,'76');
-> 'MySQL' mysql>SELECT CHAR(77,77.3,'77.3');
-> 'MMM'
CHAR()
arguments larger than
255 are converted into multiple result bytes. For example,
CHAR(256)
is equivalent to
CHAR(1,0)
, and
CHAR(256*256)
is equivalent to
CHAR(1,0,0)
:
mysql>SELECT HEX(CHAR(1,0)), HEX(CHAR(256));
+----------------+----------------+ | HEX(CHAR(1,0)) | HEX(CHAR(256)) | +----------------+----------------+ | 0100 | 0100 | +----------------+----------------+ mysql>SELECT HEX(CHAR(1,0,0)), HEX(CHAR(256*256));
+------------------+--------------------+ | HEX(CHAR(1,0,0)) | HEX(CHAR(256*256)) | +------------------+--------------------+ | 010000 | 010000 | +------------------+--------------------+
By default, CHAR()
returns a
binary string. To produce a string in a given character set,
use the optional USING
clause:
mysql> SELECT CHARSET(CHAR(0x65)), CHARSET(CHAR(0x65 USING utf8));
+---------------------+--------------------------------+
| CHARSET(CHAR(0x65)) | CHARSET(CHAR(0x65 USING utf8)) |
+---------------------+--------------------------------+
| binary | utf8 |
+---------------------+--------------------------------+
If USING
is given and the result string is
illegal for the given character set, a warning is issued.
Also, if strict SQL mode is enabled, the result from
CHAR()
becomes
NULL
.
Returns the length of the string
str
, measured in characters. A
multi-byte character counts as a single character. This means
that for a string containing five 2-byte characters,
LENGTH()
returns
10
, whereas
CHAR_LENGTH()
returns
5
.
CHARACTER_LENGTH()
is a synonym
for CHAR_LENGTH()
.
Returns the string that results from concatenating the arguments. May have one or more arguments. If all arguments are nonbinary strings, the result is a nonbinary string. If the arguments include any binary strings, the result is a binary string. A numeric argument is converted to its equivalent nonbinary string form.
CONCAT()
returns
NULL
if any argument is
NULL
.
mysql>SELECT CONCAT('My', 'S', 'QL');
-> 'MySQL' mysql>SELECT CONCAT('My', NULL, 'QL');
-> NULL mysql>SELECT CONCAT(14.3);
-> '14.3'
For quoted strings, concatenation can be performed by placing the strings next to each other:
mysql> SELECT 'My' 'S' 'QL';
-> 'MySQL'
CONCAT_WS(
separator
,str1
,str2
,...)
CONCAT_WS()
stands for
Concatenate With Separator and is a special form of
CONCAT()
. The first argument is
the separator for the rest of the arguments. The separator is
added between the strings to be concatenated. The separator
can be a string, as can the rest of the arguments. If the
separator is NULL
, the result is
NULL
.
mysql>SELECT CONCAT_WS(',','First name','Second name','Last Name');
-> 'First name,Second name,Last Name' mysql>SELECT CONCAT_WS(',','First name',NULL,'Last Name');
-> 'First name,Last Name'
CONCAT_WS()
does not skip empty
strings. However, it does skip any NULL
values after the separator argument.
ELT()
returns the
N
th element of the list of strings:
str1
if
N
= 1
,
str2
if
N
= 2
, and so
on. Returns NULL
if
N
is less than 1
or greater than the number of arguments.
ELT()
is the complement of
FIELD()
.
mysql>SELECT ELT(1, 'ej', 'Heja', 'hej', 'foo');
-> 'ej' mysql>SELECT ELT(4, 'ej', 'Heja', 'hej', 'foo');
-> 'foo'
EXPORT_SET(
bits
,on
,off
[,separator
[,number_of_bits
]])
Returns a string such that for every bit set in the value
bits
, you get an
on
string and for every bit not set
in the value, you get an off
string. Bits in bits
are examined
from right to left (from low-order to high-order bits).
Strings are added to the result from left to right, separated
by the separator
string (the
default being the comma character
“,
”). The number of bits
examined is given by
number_of_bits
, which has a default
of 64 if not specified.
number_of_bits
is silently clipped
to 64 if larger than 64. It is treated as an unsigned integer,
so a value of –1 is effectively the same as 64.
mysql>SELECT EXPORT_SET(5,'Y','N',',',4);
-> 'Y,N,Y,N' mysql>SELECT EXPORT_SET(6,'1','0',',',10);
-> '0,1,1,0,0,0,0,0,0,0'
Returns the index (position) of str
in the str1
,
str2
,
str3
, ...
list.
Returns 0
if str
is not found.
If all arguments to FIELD()
are
strings, all arguments are compared as strings. If all
arguments are numbers, they are compared as numbers.
Otherwise, the arguments are compared as double.
If str
is NULL
,
the return value is 0
because
NULL
fails equality comparison with any
value. FIELD()
is the
complement of ELT()
.
mysql>SELECT FIELD('ej', 'Hej', 'ej', 'Heja', 'hej', 'foo');
-> 2 mysql>SELECT FIELD('fo', 'Hej', 'ej', 'Heja', 'hej', 'foo');
-> 0
Returns a value in the range of 1 to
N
if the string
str
is in the string list
strlist
consisting of
N
substrings. A string list is a
string composed of substrings separated by
“,
” characters. If the first
argument is a constant string and the second is a column of
type SET
, the
FIND_IN_SET()
function is
optimized to use bit arithmetic. Returns 0
if str
is not in
strlist
or if
strlist
is the empty string.
Returns NULL
if either argument is
NULL
. This function does not work properly
if the first argument contains a comma
(“,
”) character.
mysql> SELECT FIND_IN_SET('b','a,b,c,d');
-> 2
Formats the number X
to a format
like '#,###,###.##'
, rounded to
D
decimal places, and returns the
result as a string. If D
is
0
, the result has no decimal point or
fractional part.
The optional third parameter enables a locale to be specified
to be used for the result number's decimal point, thousands
separator, and grouping between separators. Permissible locale
values are the same as the legal values for the
lc_time_names
system variable
(see Section 10.7, “MySQL Server Locale Support”). If no locale is
specified, the default is 'en_US'
.
mysql>SELECT FORMAT(12332.123456, 4);
-> '12,332.1235' mysql>SELECT FORMAT(12332.1,4);
-> '12,332.1000' mysql>SELECT FORMAT(12332.2,0);
-> '12,332' mysql>SELECT FORMAT(12332.2,2,'de_DE');
-> '12.332,20'
Takes a string encoded with the base-64 encoded rules used by
TO_BASE64()
and returns the
decoded result as a binary string. The result is
NULL
if the argument is
NULL
or not a valid base-64 string. See the
description of TO_BASE64()
for
details about the encoding and decoding rules.
mysql> SELECT TO_BASE64('abc'), FROM_BASE64(TO_BASE64('abc'));
-> 'JWJj', 'abc'
For a string argument str
,
HEX()
returns a hexadecimal
string representation of str
where
each character in str
is converted
to two hexadecimal digits. The inverse of this operation is
performed by the UNHEX()
function.
For a numeric argument N
,
HEX()
returns a hexadecimal
string representation of the value of
N
treated as a longlong
(BIGINT
) number. This is
equivalent to
CONV(
.
The inverse of this operation is performed by
N
,10,16)CONV(HEX(
.
N
),16,10)
mysql>SELECT 0x616263, HEX('abc'), UNHEX(HEX('abc'));
-> 'abc', 616263, 'abc' mysql>SELECT HEX(255), CONV(HEX(255),16,10);
-> 'FF', 255
Returns the string str
, with the
substring beginning at position pos
and len
characters long replaced by
the string newstr
. Returns the
original string if pos
is not
within the length of the string. Replaces the rest of the
string from position pos
if
len
is not within the length of the
rest of the string. Returns NULL
if any
argument is NULL
.
mysql>SELECT INSERT('Quadratic', 3, 4, 'What');
-> 'QuWhattic' mysql>SELECT INSERT('Quadratic', -1, 4, 'What');
-> 'Quadratic' mysql>SELECT INSERT('Quadratic', 3, 100, 'What');
-> 'QuWhat'
This function is multi-byte safe.
Returns the position of the first occurrence of substring
substr
in string
str
. This is the same as the
two-argument form of LOCATE()
,
except that the order of the arguments is reversed.
mysql>SELECT INSTR('foobarbar', 'bar');
-> 4 mysql>SELECT INSTR('xbar', 'foobar');
-> 0
This function is multi-byte safe, and is case sensitive only if at least one argument is a binary string.
LCASE()
is a synonym for
LOWER()
.
In MySQL 5.7, LCASE()
used in
a view is rewritten as LOWER()
when storing
the view's definition. (Bug #12844279)
Returns the leftmost len
characters
from the string str
, or
NULL
if any argument is
NULL
.
mysql> SELECT LEFT('foobarbar', 5);
-> 'fooba'
This function is multi-byte safe.
Returns the length of the string
str
, measured in bytes. A
multi-byte character counts as multiple bytes. This means that
for a string containing five 2-byte characters,
LENGTH()
returns
10
, whereas
CHAR_LENGTH()
returns
5
.
mysql> SELECT LENGTH('text');
-> 4
Reads the file and returns the file contents as a string. To
use this function, the file must be located on the server
host, you must specify the full path name to the file, and you
must have the FILE
privilege.
The file must be readable by all and its size less than
max_allowed_packet
bytes. If
the secure_file_priv
system
variable is set to a nonempty directory name, the file to be
loaded must be located in that directory.
If the file does not exist or cannot be read because one of
the preceding conditions is not satisfied, the function
returns NULL
.
The character_set_filesystem
system variable controls interpretation of file names that are
given as literal strings.
mysql>UPDATE t
SET blob_col=LOAD_FILE('/tmp/picture')
WHERE id=1;
LOCATE(
,
substr
,str
)LOCATE(
substr
,str
,pos
)
The first syntax returns the position of the first occurrence
of substring substr
in string
str
. The second syntax returns the
position of the first occurrence of substring
substr
in string
str
, starting at position
pos
. Returns 0
if substr
is not in
str
.
mysql>SELECT LOCATE('bar', 'foobarbar');
-> 4 mysql>SELECT LOCATE('xbar', 'foobar');
-> 0 mysql>SELECT LOCATE('bar', 'foobarbar', 5);
-> 7
This function is multi-byte safe, and is case-sensitive only if at least one argument is a binary string.
Returns the string str
with all
characters changed to lowercase according to the current
character set mapping. The default is
latin1
(cp1252 West European).
mysql> SELECT LOWER('QUADRATICALLY');
-> 'quadratically'
LOWER()
(and
UPPER()
) are ineffective when
applied to binary strings
(BINARY
,
VARBINARY
,
BLOB
). To perform lettercase
conversion, convert the string to a nonbinary string:
mysql>SET @str = BINARY 'New York';
mysql>SELECT LOWER(@str), LOWER(CONVERT(@str USING latin1));
+-------------+-----------------------------------+ | LOWER(@str) | LOWER(CONVERT(@str USING latin1)) | +-------------+-----------------------------------+ | New York | new york | +-------------+-----------------------------------+
For Unicode character sets,
LOWER()
and
UPPER()
work accounting to
Unicode Collation Algorithm (UCA) 5.2.0 for
collations and for language-specific collations that are
derived from them. For other Unicode collations,
xxx
_unicode_520_ciLOWER()
and
UPPER()
work accounting to
Unicode Collation Algorithm (UCA) 4.0.0. See
Section 10.1.14.1, “Unicode Character Sets”.
This function is multi-byte safe.
In previous versions of MySQL, LOWER()
used
within a view was rewritten as
LCASE()
when storing the
view's definition. In MySQL 5.7,
LOWER()
is never rewritten in such cases,
but LCASE()
used within views is instead
rewritten as LOWER()
. (Bug #12844279)
Returns the string str
, left-padded
with the string padstr
to a length
of len
characters. If
str
is longer than
len
, the return value is shortened
to len
characters.
mysql>SELECT LPAD('hi',4,'??');
-> '??hi' mysql>SELECT LPAD('hi',1,'??');
-> 'h'
Returns the string str
with leading
space characters removed.
mysql> SELECT LTRIM(' barbar');
-> 'barbar'
This function is multi-byte safe.
Returns a set value (a string containing substrings separated
by “,
” characters) consisting
of the strings that have the corresponding bit in
bits
set.
str1
corresponds to bit 0,
str2
to bit 1, and so on.
NULL
values in
str1
,
str2
, ...
are
not appended to the result.
mysql>SELECT MAKE_SET(1,'a','b','c');
-> 'a' mysql>SELECT MAKE_SET(1 | 4,'hello','nice','world');
-> 'hello,world' mysql>SELECT MAKE_SET(1 | 4,'hello','nice',NULL,'world');
-> 'hello' mysql>SELECT MAKE_SET(0,'a','b','c');
-> ''
MID(
is a synonym for
str
,pos
,len
)SUBSTRING(
.
str
,pos
,len
)
Returns a string representation of the octal value of
N
, where
N
is a longlong
(BIGINT
) number. This is
equivalent to
CONV(
.
Returns N
,10,8)NULL
if
N
is NULL
.
mysql> SELECT OCT(12);
-> '14'
OCTET_LENGTH()
is a synonym for
LENGTH()
.
If the leftmost character of the string
str
is a multi-byte character,
returns the code for that character, calculated from the
numeric values of its constituent bytes using this formula:
(1st byte code) + (2nd byte code * 256) + (3rd byte code * 2562) ...
If the leftmost character is not a multi-byte character,
ORD()
returns the same value as
the ASCII()
function.
mysql> SELECT ORD('2');
-> 50
POSITION(
is a synonym for
substr
IN str
)LOCATE(
.
substr
,str
)
Quotes a string to produce a result that can be used as a
properly escaped data value in an SQL statement. The string is
returned enclosed by single quotation marks and with each
instance of backslash (“\
”),
single quote (“'
”), ASCII
NUL
, and Control+Z preceded by a backslash.
If the argument is NULL
, the return value
is the word “NULL” without enclosing single
quotation marks.
mysql>SELECT QUOTE('Don\'t!');
-> 'Don\'t!' mysql>SELECT QUOTE(NULL);
-> NULL
For comparison, see the quoting rules for literal strings and
within the C API in Section 9.1.1, “String Literals”, and
Section 21.8.7.54, “mysql_real_escape_string()
”.
Returns a string consisting of the string
str
repeated
count
times. If
count
is less than 1, returns an
empty string. Returns NULL
if
str
or
count
are NULL
.
mysql> SELECT REPEAT('MySQL', 3);
-> 'MySQLMySQLMySQL'
Returns the string str
with all
occurrences of the string from_str
replaced by the string to_str
.
REPLACE()
performs a
case-sensitive match when searching for
from_str
.
mysql> SELECT REPLACE('www.mysql.com', 'w', 'Ww');
-> 'WwWwWw.mysql.com'
This function is multi-byte safe.
Returns the string str
with the
order of the characters reversed.
mysql> SELECT REVERSE('abc');
-> 'cba'
This function is multi-byte safe.
Returns the rightmost len
characters from the string str
, or
NULL
if any argument is
NULL
.
mysql> SELECT RIGHT('foobarbar', 4);
-> 'rbar'
This function is multi-byte safe.
Returns the string str
,
right-padded with the string padstr
to a length of len
characters. If
str
is longer than
len
, the return value is shortened
to len
characters.
mysql>SELECT RPAD('hi',5,'?');
-> 'hi???' mysql>SELECT RPAD('hi',1,'?');
-> 'h'
This function is multi-byte safe.
Returns the string str
with
trailing space characters removed.
mysql> SELECT RTRIM('barbar ');
-> 'barbar'
This function is multi-byte safe.
Returns a soundex string from str
.
Two strings that sound almost the same should have identical
soundex strings. A standard soundex string is four characters
long, but the SOUNDEX()
function returns an arbitrarily long string. You can use
SUBSTRING()
on the result to
get a standard soundex string. All nonalphabetic characters in
str
are ignored. All international
alphabetic characters outside the A-Z range are treated as
vowels.
When using SOUNDEX()
, you
should be aware of the following limitations:
This function, as currently implemented, is intended to work well with strings that are in the English language only. Strings in other languages may not produce reliable results.
This function is not guaranteed to provide consistent
results with strings that use multi-byte character sets,
including utf-8
.
We hope to remove these limitations in a future release. See Bug #22638 for more information.
mysql>SELECT SOUNDEX('Hello');
-> 'H400' mysql>SELECT SOUNDEX('Quadratically');
-> 'Q36324'
This function implements the original Soundex algorithm, not the more popular enhanced version (also described by D. Knuth). The difference is that original version discards vowels first and duplicates second, whereas the enhanced version discards duplicates first and vowels second.
This is the same as
SOUNDEX(
.
expr1
)
= SOUNDEX(expr2
)
Returns a string consisting of N
space characters.
mysql> SELECT SPACE(6);
-> ' '
SUBSTR(
,
str
,pos
)SUBSTR(
,
str
FROM pos
)SUBSTR(
,
str
,pos
,len
)SUBSTR(
str
FROM pos
FOR
len
)
SUBSTR()
is a synonym for
SUBSTRING()
.
SUBSTRING(
,
str
,pos
)SUBSTRING(
,
str
FROM pos
)SUBSTRING(
,
str
,pos
,len
)SUBSTRING(
str
FROM pos
FOR
len
)
The forms without a len
argument
return a substring from string str
starting at position pos
. The forms
with a len
argument return a
substring len
characters long from
string str
, starting at position
pos
. The forms that use
FROM
are standard SQL syntax. It is also
possible to use a negative value for
pos
. In this case, the beginning of
the substring is pos
characters
from the end of the string, rather than the beginning. A
negative value may be used for pos
in any of the forms of this function.
For all forms of SUBSTRING()
,
the position of the first character in the string from which
the substring is to be extracted is reckoned as
1
.
mysql>SELECT SUBSTRING('Quadratically',5);
-> 'ratically' mysql>SELECT SUBSTRING('foobarbar' FROM 4);
-> 'barbar' mysql>SELECT SUBSTRING('Quadratically',5,6);
-> 'ratica' mysql>SELECT SUBSTRING('Sakila', -3);
-> 'ila' mysql>SELECT SUBSTRING('Sakila', -5, 3);
-> 'aki' mysql>SELECT SUBSTRING('Sakila' FROM -4 FOR 2);
-> 'ki'
This function is multi-byte safe.
If len
is less than 1, the result
is the empty string.
SUBSTRING_INDEX(
str
,delim
,count
)
Returns the substring from string
str
before
count
occurrences of the delimiter
delim
. If
count
is positive, everything to
the left of the final delimiter (counting from the left) is
returned. If count
is negative,
everything to the right of the final delimiter (counting from
the right) is returned.
SUBSTRING_INDEX()
performs a
case-sensitive match when searching for
delim
.
mysql>SELECT SUBSTRING_INDEX('www.mysql.com', '.', 2);
-> 'www.mysql' mysql>SELECT SUBSTRING_INDEX('www.mysql.com', '.', -2);
-> 'mysql.com'
This function is multi-byte safe.
Converts the string argument to base-64 encoded form and
returns the result as a character string with the connection
character set and collation. If the argument is not a string,
it is converted to a string before conversion takes place. The
result is NULL
if the argument is
NULL
. Base-64 encoded strings can be
decoded using the the
FROM_BASE64()
function.
mysql> SELECT TO_BASE64('abc'), FROM_BASE64(TO_BASE64('abc'));
-> 'JWJj', 'abc'
Different base-64 encoding schemes exist. These are the
encoding and decoding rules used by
TO_BASE64()
and
FROM_BASE64()
:
The encoding for alphabet value 62 is
'+'
.
The encoding for alphabet value 63 is
'/'
.
Encoded output consists of groups of 4 printable
characters. Each 3 bytes of the input data are encoded
using 4 characters. If the last group is incomplete, it is
padded with '='
characters to a length
of 4.
A newline is added after each 76 characters of encoded output to divide long output into multiple lines.
Decoding recognizes and ignores newline, carriage return, tab, and space.
TRIM([{BOTH | LEADING | TRAILING}
[
,
remstr
] FROM]
str
)TRIM([
remstr
FROM] str
)
Returns the string str
with all
remstr
prefixes or suffixes
removed. If none of the specifiers BOTH
,
LEADING
, or TRAILING
is
given, BOTH
is assumed.
remstr
is optional and, if not
specified, spaces are removed.
mysql>SELECT TRIM(' bar ');
-> 'bar' mysql>SELECT TRIM(LEADING 'x' FROM 'xxxbarxxx');
-> 'barxxx' mysql>SELECT TRIM(BOTH 'x' FROM 'xxxbarxxx');
-> 'bar' mysql>SELECT TRIM(TRAILING 'xyz' FROM 'barxxyz');
-> 'barx'
This function is multi-byte safe.
UCASE()
is a synonym for
UPPER()
.
In previous versions of MySQL, UPPER()
used
within a view was rewritten as
UCASE()
when storing the
view's definition. In MySQL 5.7,
UPPER()
is never rewritten in such cases,
but UCASE()
used within views is instead
rewritten as UPPER()
. (Bug #12844279)
In MySQL 5.7, UCASE()
used in
a view is rewritten as UPPER()
when storing
the view's definition. (Bug #12844279)
For a string argument str
,
UNHEX(
performs the inverse operation of
str
)HEX(
.
That is, it interprets each pair of characters in the argument
as a hexadecimal number and converts it to the character
represented by the number. The return value is a binary
string.
str
)
mysql>SELECT UNHEX('4D7953514C');
-> 'MySQL' mysql>SELECT 0x4D7953514C;
-> 'MySQL' mysql>SELECT UNHEX(HEX('string'));
-> 'string' mysql>SELECT HEX(UNHEX('1267'));
-> '1267'
The characters in the argument string must be legal
hexadecimal digits: '0'
..
'9'
, 'A'
..
'F'
, 'a'
..
'f'
. If the argument contains any
nonhexadecimal digits, the result is NULL
:
mysql> SELECT UNHEX('GG');
+-------------+
| UNHEX('GG') |
+-------------+
| NULL |
+-------------+
A NULL
result can occur if the argument to
UNHEX()
is a
BINARY
column, because values
are padded with 0x00 bytes when stored but those bytes are not
stripped on retrieval. For example, '41'
is
stored into a CHAR(3)
column as
'41 '
and retrieved as
'41'
(with the trailing pad space
stripped), so UNHEX()
for the
column value returns 'A'
. By contrast
'41'
is stored into a
BINARY(3)
column as
'41\0'
and retrieved as
'41\0'
(with the trailing pad
0x00
byte not stripped).
'\0'
is not a legal hexadecimal digit, so
UNHEX()
for the column value
returns NULL
.
For a numeric argument N
, the
inverse of
HEX(
is not performed by N
)UNHEX()
.
Use
CONV(HEX(
instead. See the description of
N
),16,10)HEX()
.
Returns the string str
with all
characters changed to uppercase according to the current
character set mapping. The default is
latin1
(cp1252 West European).
mysql> SELECT UPPER('Hej');
-> 'HEJ'
See the description of LOWER()
for information that also applies to
UPPER()
. This included
information about how to perform lettercase conversion of
binary strings (BINARY
,
VARBINARY
,
BLOB
) for which these functions
are ineffective, and information about case folding for
Unicode character sets.
This function is multi-byte safe.
In previous versions of MySQL, UPPER()
used
within a view was rewritten as
UCASE()
when storing the
view's definition. In MySQL 5.7,
UPPER()
is never rewritten in such cases,
but UCASE()
used within views is instead
rewritten as UPPER()
. (Bug #12844279)
WEIGHT_STRING(
str
[AS {CHAR|BINARY}(N
)] [LEVEL
levels
]
[flags
])
levels
:
N
[ASC|DESC|REVERSE] [,
N
[ASC|DESC|REVERSE]] ...
This function returns the weight string for the input string. The return value is a binary string that represents the sorting and comparison value of the string. It has these properties:
If
WEIGHT_STRING(
=
str1
)WEIGHT_STRING(
,
then str2
)
(str1
=
str2
str1
and
str2
are considered equal)
If
WEIGHT_STRING(
<
str1
)WEIGHT_STRING(
,
then str2
)
(str1
<
str2
str1
sorts before
str2
)
WEIGHT_STRING()
can be used for
testing and debugging of collations, especially if you are
adding a new collation. See
Section 10.4, “Adding a Collation to a Character Set”.
The input string, str
, is a string
expression. If the input is a nonbinary (character) string
such as a CHAR
,
VARCHAR
, or
TEXT
value, the return value
contains the collation weights for the string. If the input is
a binary (byte) string such as a
BINARY
,
VARBINARY
, or
BLOB
value, the return value is
the same as the input (the weight for each byte in a binary
string is the byte value). If the input is
NULL
,
WEIGHT_STRING()
returns
NULL
.
Examples:
mysql>SET @s = _latin1 'AB' COLLATE latin1_swedish_ci;
mysql>SELECT @s, HEX(@s), HEX(WEIGHT_STRING(@s));
+------+---------+------------------------+ | @s | HEX(@s) | HEX(WEIGHT_STRING(@s)) | +------+---------+------------------------+ | AB | 4142 | 4142 | +------+---------+------------------------+
mysql>SET @s = _latin1 'ab' COLLATE latin1_swedish_ci;
mysql>SELECT @s, HEX(@s), HEX(WEIGHT_STRING(@s));
+------+---------+------------------------+ | @s | HEX(@s) | HEX(WEIGHT_STRING(@s)) | +------+---------+------------------------+ | ab | 6162 | 4142 | +------+---------+------------------------+
mysql>SET @s = CAST('AB' AS BINARY);
mysql>SELECT @s, HEX(@s), HEX(WEIGHT_STRING(@s));
+------+---------+------------------------+ | @s | HEX(@s) | HEX(WEIGHT_STRING(@s)) | +------+---------+------------------------+ | AB | 4142 | 4142 | +------+---------+------------------------+
mysql>SET @s = CAST('ab' AS BINARY);
mysql>SELECT @s, HEX(@s), HEX(WEIGHT_STRING(@s));
+------+---------+------------------------+ | @s | HEX(@s) | HEX(WEIGHT_STRING(@s)) | +------+---------+------------------------+ | ab | 6162 | 6162 | +------+---------+------------------------+
The preceding examples use
HEX()
to display the
WEIGHT_STRING()
result. Because
the result is a binary value,
HEX()
can be especially useful
when the result contains nonprinting values, to display it in
printable form:
mysql>SET @s = CONVERT(0xC39F USING utf8) COLLATE utf8_czech_ci;
mysql>SELECT HEX(WEIGHT_STRING(@s));
+------------------------+ | HEX(WEIGHT_STRING(@s)) | +------------------------+ | 0FEA0FEA | +------------------------+
For non-NULL
return values, the data type
of the value is VARBINARY
if
its length is within the maximum length for
VARBINARY
, otherwise the data
type is BLOB
.
The AS
clause may be given to cast the
input string to a nonbinary or binary string and to force it
to a given length:
AS CHAR(
casts the string to a nonbinary string and pads it on the
right with spaces to a length of
N
)N
characters.
N
must be at least 1. If
N
is less than the length of
the input string, the string is truncated to
N
characters. No warning occurs
for truncation.
AS BINARY(
is similar but casts the string to a binary string,
N
)N
is measured in bytes (not
characters), and padding uses 0x00
bytes (not spaces).
mysql> SELECT HEX(WEIGHT_STRING('ab' AS CHAR(4)));
+-------------------------------------+
| HEX(WEIGHT_STRING('ab' AS CHAR(4))) |
+-------------------------------------+
| 41422020 |
+-------------------------------------+
mysql> SELECT HEX(WEIGHT_STRING('ab' AS BINARY(4)));
+---------------------------------------+
| HEX(WEIGHT_STRING('ab' AS BINARY(4))) |
+---------------------------------------+
| 61620000 |
+---------------------------------------+
The LEVEL
clause may be given to specify
that the return value should contain weights for specific
collation levels.
The levels
specifier following the
LEVEL
keyword may be given either as a list
of one or more integers separated by commas, or as a range of
two integers separated by a dash. Whitespace around the
punctuation characters does not matter.
Examples:
LEVEL 1 LEVEL 2, 3, 5 LEVEL 1-3
Any level less than 1 is treated as 1. Any level greater than the maximum for the input string collation is treated as maximum for the collation. The maximum varies per collation, but is never greater than 6.
In a list of levels, levels must be given in increasing order. In a range of levels, if the second number is less than the first, it is treated as the first number (for example, 4-2 is the same as 4-4).
If the LEVEL
clause is omitted, MySQL
assumes LEVEL 1 -
, where
max
max
is the maximum level for the
collation.
If LEVEL
is specified using list syntax
(not range syntax), any level number can be followed by these
modifiers:
ASC
: Return the weights without
modification. This is the default.
DESC
: Return bitwise-inverted weights
(for example, 0x78f0 DESC
=
0x870f
).
REVERSE
: Return the weights in reverse
order (that is,the weights for the reversed string, with
the first character last and the last first).
Examples:
mysql> SELECT HEX(WEIGHT_STRING(0x007fff LEVEL 1));
+--------------------------------------+
| HEX(WEIGHT_STRING(0x007fff LEVEL 1)) |
+--------------------------------------+
| 007FFF |
+--------------------------------------+
mysql> SELECT HEX(WEIGHT_STRING(0x007fff LEVEL 1 DESC));
+-------------------------------------------+
| HEX(WEIGHT_STRING(0x007fff LEVEL 1 DESC)) |
+-------------------------------------------+
| FF8000 |
+-------------------------------------------+
mysql> SELECT HEX(WEIGHT_STRING(0x007fff LEVEL 1 REVERSE));
+----------------------------------------------+
| HEX(WEIGHT_STRING(0x007fff LEVEL 1 REVERSE)) |
+----------------------------------------------+
| FF7F00 |
+----------------------------------------------+
mysql> SELECT HEX(WEIGHT_STRING(0x007fff LEVEL 1 DESC REVERSE));
+---------------------------------------------------+
| HEX(WEIGHT_STRING(0x007fff LEVEL 1 DESC REVERSE)) |
+---------------------------------------------------+
| 0080FF |
+---------------------------------------------------+
The flags
clause currently is
unused.
If a string function is given a binary string as an argument, the resulting string is also a binary string. A number converted to a string is treated as a binary string. This affects only comparisons.
Normally, if any expression in a string comparison is case sensitive, the comparison is performed in case-sensitive fashion.
expr
LIKE pat
[ESCAPE
'escape_char
']
Pattern matching using SQL simple regular expression
comparison. Returns 1
(TRUE
) or 0
(FALSE
). If either
expr
or
pat
is NULL
,
the result is NULL
.
The pattern need not be a literal string. For example, it can be specified as a string expression or table column.
Per the SQL standard, LIKE
performs matching on a per-character basis, thus it can
produce results different from the
=
comparison
operator:
mysql>SELECT 'ä' LIKE 'ae' COLLATE latin1_german2_ci;
+-----------------------------------------+ | 'ä' LIKE 'ae' COLLATE latin1_german2_ci | +-----------------------------------------+ | 0 | +-----------------------------------------+ mysql>SELECT 'ä' = 'ae' COLLATE latin1_german2_ci;
+--------------------------------------+ | 'ä' = 'ae' COLLATE latin1_german2_ci | +--------------------------------------+ | 1 | +--------------------------------------+
In particular, trailing spaces are significant, which is not
true for CHAR
or
VARCHAR
comparisons performed
with the =
operator:
mysql> SELECT 'a' = 'a ', 'a' LIKE 'a ';
+------------+---------------+
| 'a' = 'a ' | 'a' LIKE 'a ' |
+------------+---------------+
| 1 | 0 |
+------------+---------------+
1 row in set (0.00 sec)
With LIKE
you can use the
following two wildcard characters in the pattern.
Character | Description |
---|---|
% | Matches any number of characters, even zero characters |
_ | Matches exactly one character |
mysql>SELECT 'David!' LIKE 'David_';
-> 1 mysql>SELECT 'David!' LIKE '%D%v%';
-> 1
To test for literal instances of a wildcard character,
precede it by the escape character. If you do not specify
the ESCAPE
character,
“\
” is assumed.
String | Description |
---|---|
\% | Matches one “% ” character |
\_ | Matches one “_ ” character |
mysql>SELECT 'David!' LIKE 'David\_';
-> 0 mysql>SELECT 'David_' LIKE 'David\_';
-> 1
To specify a different escape character, use the
ESCAPE
clause:
mysql> SELECT 'David_' LIKE 'David|_' ESCAPE '|';
-> 1
The escape sequence should be empty or one character long.
The expression must evaluate as a constant at execution
time. If the
NO_BACKSLASH_ESCAPES
SQL
mode is enabled, the sequence cannot be empty.
The following two statements illustrate that string comparisons are not case sensitive unless one of the operands is a binary string:
mysql>SELECT 'abc' LIKE 'ABC';
-> 1 mysql>SELECT 'abc' LIKE BINARY 'ABC';
-> 0
In MySQL, LIKE
is permitted on
numeric expressions. (This is an extension to the standard
SQL LIKE
.)
mysql> SELECT 10 LIKE '1%';
-> 1
Because MySQL uses C escape syntax in strings (for
example, “\n
” to represent
a newline character), you must double any
“\
” that you use in
LIKE
strings. For example, to
search for “\n
”, specify
it as “\\n
”. To search for
“\
”, specify it as
“\\\\
”; this is because
the backslashes are stripped once by the parser and again
when the pattern match is made, leaving a single backslash
to be matched against.
Exception: At the end of the pattern string, backslash can
be specified as “\\
”. At
the end of the string, backslash stands for itself because
there is nothing following to escape. Suppose that a table
contains the following values:
mysql> SELECT filename FROM t1;
+--------------+
| filename |
+--------------+
| C: |
| C:\ |
| C:\Programs |
| C:\Programs\ |
+--------------+
To test for values that end with backslash, you can match the values using either of the following patterns:
mysql>SELECT filename, filename LIKE '%\\' FROM t1;
+--------------+---------------------+ | filename | filename LIKE '%\\' | +--------------+---------------------+ | C: | 0 | | C:\ | 1 | | C:\Programs | 0 | | C:\Programs\ | 1 | +--------------+---------------------+ mysql>SELECT filename, filename LIKE '%\\\\' FROM t1;
+--------------+-----------------------+ | filename | filename LIKE '%\\\\' | +--------------+-----------------------+ | C: | 0 | | C:\ | 1 | | C:\Programs | 0 | | C:\Programs\ | 1 | +--------------+-----------------------+
expr
NOT LIKE pat
[ESCAPE
'escape_char
']
This is the same as NOT
(
.
expr
LIKE
pat
[ESCAPE
'escape_char
'])
Aggregate queries involving NOT
LIKE
comparisons with columns containing
NULL
may yield unexpected results. For
example, consider the following table and data:
CREATE TABLE foo (bar VARCHAR(10)); INSERT INTO foo VALUES (NULL), (NULL);
The query SELECT COUNT(*) FROM foo WHERE bar LIKE
'%baz%';
returns 0
. You might
assume that SELECT COUNT(*) FROM foo WHERE bar
NOT LIKE '%baz%';
would return
2
. However, this is not the case: The
second query returns 0
. This is because
NULL NOT LIKE
always returns
expr
NULL
, regardless of the value of
expr
. The same is true for
aggregate queries involving NULL
and
comparisons using
NOT
RLIKE
or NOT
REGEXP
. In such cases, you must test explicitly
for NOT NULL
using
OR
(and not
AND
), as shown here:
SELECT COUNT(*) FROM foo WHERE bar NOT LIKE '%baz%' OR bar IS NULL;
STRCMP()
returns
0
if the strings are the same,
-1
if the first argument is smaller than
the second according to the current sort order, and
1
otherwise.
mysql>SELECT STRCMP('text', 'text2');
-> -1 mysql>SELECT STRCMP('text2', 'text');
-> 1 mysql>SELECT STRCMP('text', 'text');
-> 0
STRCMP()
performs the
comparison using the collation of the arguments.
mysql>SET @s1 = _latin1 'x' COLLATE latin1_general_ci;
mysql>SET @s2 = _latin1 'X' COLLATE latin1_general_ci;
mysql>SET @s3 = _latin1 'x' COLLATE latin1_general_cs;
mysql>SET @s4 = _latin1 'X' COLLATE latin1_general_cs;
mysql>SELECT STRCMP(@s1, @s2), STRCMP(@s3, @s4);
+------------------+------------------+ | STRCMP(@s1, @s2) | STRCMP(@s3, @s4) | +------------------+------------------+ | 0 | 1 | +------------------+------------------+
If the collations are incompatible, one of the arguments must be converted to be compatible with the other. See Section 10.1.7.5, “Collation of Expressions”.
mysql>SELECT STRCMP(@s1, @s3);
ERROR 1267 (HY000) at line 10: Illegal mix of collations (latin1_general_ci,IMPLICIT) and (latin1_general_cs,IMPLICIT) for operation 'strcmp' mysql>SELECT STRCMP(@s1, @s3 COLLATE latin1_general_ci);
+--------------------------------------------+ | STRCMP(@s1, @s3 COLLATE latin1_general_ci) | +--------------------------------------------+ | 0 | +--------------------------------------------+
Table 12.9. String Regular Expression Operators
Name | Description |
---|---|
NOT REGEXP | Negation of REGEXP |
REGEXP | Pattern matching using regular expressions |
RLIKE | Synonym for REGEXP |
A regular expression is a powerful way of specifying a pattern for a complex search.
MySQL uses Henry Spencer's implementation of regular
expressions, which is aimed at conformance with POSIX 1003.2.
MySQL uses the extended version to support pattern-matching
operations performed with the
REGEXP
operator in SQL statements.
This section summarizes, with examples, the special characters
and constructs that can be used in MySQL for
REGEXP
operations. It does not
contain all the details that can be found in Henry Spencer's
regex(7)
manual page. That manual page is
included in MySQL source distributions, in the
regex.7
file under the
regex
directory. See also
Section 3.3.4.7, “Pattern Matching”.
,
expr
NOT REGEXP pat
expr
NOT RLIKE pat
This is the same as NOT
(
.
expr
REGEXP
pat
)
,
expr
REGEXP pat
expr
RLIKE pat
Performs a pattern match of a string expression
expr
against a pattern
pat
. The pattern can be an
extended regular expression. The syntax for regular
expressions is discussed in Section 12.5.2, “Regular Expressions”.
Returns 1
if
expr
matches
pat
; otherwise it returns
0
. If either
expr
or
pat
is NULL
,
the result is NULL
.
RLIKE
is a
synonym for REGEXP
, provided
for mSQL
compatibility.
The pattern need not be a literal string. For example, it can be specified as a string expression or table column.
Because MySQL uses the C escape syntax in strings (for
example, “\n
” to represent
the newline character), you must double any
“\
” that you use in your
REGEXP
strings.
REGEXP
is not case sensitive,
except when used with binary strings.
mysql>SELECT 'Monty!' REGEXP '.*';
-> 1 mysql>SELECT 'new*\n*line' REGEXP 'new\\*.\\*line';
-> 1 mysql>SELECT 'a' REGEXP 'A', 'a' REGEXP BINARY 'A';
-> 1 0 mysql>SELECT 'a' REGEXP '^[a-d]';
-> 1
REGEXP
and
RLIKE
use
the character set and collations of the arguments when
deciding the type of a character and performing the
comparison. If the arguments have different character sets
or collations, coercibility rules apply as described in
Section 10.1.7.5, “Collation of Expressions”.
The REGEXP
and
RLIKE
operators work in byte-wise fashion, so they are not
multi-byte safe and may produce unexpected results with
multi-byte character sets. In addition, these operators
compare characters by their byte values and accented
characters may not compare as equal even if a given
collation treats them as equal.
A regular expression describes a set of strings. The simplest
regular expression is one that has no special characters in it.
For example, the regular expression hello
matches hello
and nothing else.
Nontrivial regular expressions use certain special constructs so
that they can match more than one string. For example, the
regular expression hello|word
matches either
the string hello
or the string
word
.
As a more complex example, the regular expression
B[an]*s
matches any of the strings
Bananas
, Baaaaas
,
Bs
, and any other string starting with a
B
, ending with an s
, and
containing any number of a
or
n
characters in between.
A regular expression for the REGEXP
operator may use any of the following special characters and
constructs:
^
Match the beginning of a string.
mysql>SELECT 'fo\nfo' REGEXP '^fo$';
-> 0 mysql>SELECT 'fofo' REGEXP '^fo';
-> 1
$
Match the end of a string.
mysql>SELECT 'fo\no' REGEXP '^fo\no$';
-> 1 mysql>SELECT 'fo\no' REGEXP '^fo$';
-> 0
.
Match any character (including carriage return and newline).
mysql>SELECT 'fofo' REGEXP '^f.*$';
-> 1 mysql>SELECT 'fo\r\nfo' REGEXP '^f.*$';
-> 1
a*
Match any sequence of zero or more a
characters.
mysql>SELECT 'Ban' REGEXP '^Ba*n';
-> 1 mysql>SELECT 'Baaan' REGEXP '^Ba*n';
-> 1 mysql>SELECT 'Bn' REGEXP '^Ba*n';
-> 1
a+
Match any sequence of one or more a
characters.
mysql>SELECT 'Ban' REGEXP '^Ba+n';
-> 1 mysql>SELECT 'Bn' REGEXP '^Ba+n';
-> 0
a?
Match either zero or one a
character.
mysql>SELECT 'Bn' REGEXP '^Ba?n';
-> 1 mysql>SELECT 'Ban' REGEXP '^Ba?n';
-> 1 mysql>SELECT 'Baan' REGEXP '^Ba?n';
-> 0
de|abc
Match either of the sequences de
or
abc
.
mysql>SELECT 'pi' REGEXP 'pi|apa';
-> 1 mysql>SELECT 'axe' REGEXP 'pi|apa';
-> 0 mysql>SELECT 'apa' REGEXP 'pi|apa';
-> 1 mysql>SELECT 'apa' REGEXP '^(pi|apa)$';
-> 1 mysql>SELECT 'pi' REGEXP '^(pi|apa)$';
-> 1 mysql>SELECT 'pix' REGEXP '^(pi|apa)$';
-> 0
(abc)*
Match zero or more instances of the sequence
abc
.
mysql>SELECT 'pi' REGEXP '^(pi)*$';
-> 1 mysql>SELECT 'pip' REGEXP '^(pi)*$';
-> 0 mysql>SELECT 'pipi' REGEXP '^(pi)*$';
-> 1
{1}
, {2,3}
{n}
or {m,n}
notation
provides a more general way of writing regular expressions
that match many occurrences of the previous atom (or
“piece”) of the pattern. m
and n
are integers.
a*
Can be written as a{0,}
.
a+
Can be written as a{1,}
.
a?
Can be written as a{0,1}
.
To be more precise, a{n}
matches exactly
n
instances of a
.
a{n,}
matches n
or
more instances of a
.
a{m,n}
matches m
through n
instances of
a
, inclusive.
m
and n
must be in the
range from 0
to
RE_DUP_MAX
(default 255), inclusive. If
both m
and n
are
given, m
must be less than or equal to
n
.
mysql>SELECT 'abcde' REGEXP 'a[bcd]{2}e';
-> 0 mysql>SELECT 'abcde' REGEXP 'a[bcd]{3}e';
-> 1 mysql>SELECT 'abcde' REGEXP 'a[bcd]{1,10}e';
-> 1
[a-dX]
, [^a-dX]
Matches any character that is (or is not, if ^ is used)
either a
, b
,
c
, d
or
X
. A -
character
between two other characters forms a range that matches all
characters from the first character to the second. For
example, [0-9]
matches any decimal digit.
To include a literal ]
character, it must
immediately follow the opening bracket [
.
To include a literal -
character, it must
be written first or last. Any character that does not have a
defined special meaning inside a []
pair
matches only itself.
mysql>SELECT 'aXbc' REGEXP '[a-dXYZ]';
-> 1 mysql>SELECT 'aXbc' REGEXP '^[a-dXYZ]$';
-> 0 mysql>SELECT 'aXbc' REGEXP '^[a-dXYZ]+$';
-> 1 mysql>SELECT 'aXbc' REGEXP '^[^a-dXYZ]+$';
-> 0 mysql>SELECT 'gheis' REGEXP '^[^a-dXYZ]+$';
-> 1 mysql>SELECT 'gheisa' REGEXP '^[^a-dXYZ]+$';
-> 0
[.characters.]
Within a bracket expression (written using
[
and ]
), matches the
sequence of characters of that collating element.
characters
is either a single character
or a character name like newline
. The
following table lists the permissible character names.
The following table shows the permissible character names and the characters that they match. For characters given as numeric values, the values are represented in octal.
Name | Character | Name | Character |
---|---|---|---|
NUL | 0 | SOH | 001 |
STX | 002 | ETX | 003 |
EOT | 004 | ENQ | 005 |
ACK | 006 | BEL | 007 |
alert | 007 | BS | 010 |
backspace | '\b' | HT | 011 |
tab | '\t' | LF | 012 |
newline | '\n' | VT | 013 |
vertical-tab | '\v' | FF | 014 |
form-feed | '\f' | CR | 015 |
carriage-return | '\r' | SO | 016 |
SI | 017 | DLE | 020 |
DC1 | 021 | DC2 | 022 |
DC3 | 023 | DC4 | 024 |
NAK | 025 | SYN | 026 |
ETB | 027 | CAN | 030 |
EM | 031 | SUB | 032 |
ESC | 033 | IS4 | 034 |
FS | 034 | IS3 | 035 |
GS | 035 | IS2 | 036 |
RS | 036 | IS1 | 037 |
US | 037 | space | ' ' |
exclamation-mark | '!' | quotation-mark | '"' |
number-sign | '#' | dollar-sign | '$' |
percent-sign | '%' | ampersand | '&' |
apostrophe | '\'' | left-parenthesis | '(' |
right-parenthesis | ')' | asterisk | '*' |
plus-sign | '+' | comma | ',' |
hyphen | '-' | hyphen-minus | '-' |
period | '.' | full-stop | '.' |
slash | '/' | solidus | '/' |
zero | '0' | one | '1' |
two | '2' | three | '3' |
four | '4' | five | '5' |
six | '6' | seven | '7' |
eight | '8' | nine | '9' |
colon | ':' | semicolon | ';' |
less-than-sign | '<' | equals-sign | '=' |
greater-than-sign | '>' | question-mark | '?' |
commercial-at | '@' | left-square-bracket | '[' |
backslash | '\\' | reverse-solidus | '\\' |
right-square-bracket | ']' | circumflex | '^' |
circumflex-accent | '^' | underscore | '_' |
low-line | '_' | grave-accent | '`' |
left-brace | '{' | left-curly-bracket | '{' |
vertical-line | '|' | right-brace | '}' |
right-curly-bracket | '}' | tilde | '~' |
DEL | 177 |
mysql>SELECT '~' REGEXP '[[.~.]]';
-> 1 mysql>SELECT '~' REGEXP '[[.tilde.]]';
-> 1
[=character_class=]
Within a bracket expression (written using
[
and ]
),
[=character_class=]
represents an
equivalence class. It matches all characters with the same
collation value, including itself. For example, if
o
and (+)
are the
members of an equivalence class, [[=o=]]
,
[[=(+)=]]
, and [o(+)]
are all synonymous. An equivalence class may not be used as
an endpoint of a range.
[:character_class:]
Within a bracket expression (written using
[
and ]
),
[:character_class:]
represents a
character class that matches all characters belonging to
that class. The following table lists the standard class
names. These names stand for the character classes defined
in the ctype(3)
manual page. A particular
locale may provide other class names. A character class may
not be used as an endpoint of a range.
Character Class Name | Meaning |
---|---|
alnum | Alphanumeric characters |
alpha | Alphabetic characters |
blank | Whitespace characters |
cntrl | Control characters |
digit | Digit characters |
graph | Graphic characters |
lower | Lowercase alphabetic characters |
print | Graphic or space characters |
punct | Punctuation characters |
space | Space, tab, newline, and carriage return |
upper | Uppercase alphabetic characters |
xdigit | Hexadecimal digit characters |
mysql>SELECT 'justalnums' REGEXP '[[:alnum:]]+';
-> 1 mysql>SELECT '!!' REGEXP '[[:alnum:]]+';
-> 0
[[:<:]]
, [[:>:]]
These markers stand for word boundaries. They match the
beginning and end of words, respectively. A word is a
sequence of word characters that is not preceded by or
followed by word characters. A word character is an
alphanumeric character in the alnum
class
or an underscore (_
).
mysql>SELECT 'a word a' REGEXP '[[:<:]]word[[:>:]]';
-> 1 mysql>SELECT 'a xword a' REGEXP '[[:<:]]word[[:>:]]';
-> 0
To use a literal instance of a special character in a regular
expression, precede it by two backslash (\) characters. The
MySQL parser interprets one of the backslashes, and the regular
expression library interprets the other. For example, to match
the string 1+2
that contains the special
+
character, only the last of the following
regular expressions is the correct one:
mysql>SELECT '1+2' REGEXP '1+2';
-> 0 mysql>SELECT '1+2' REGEXP '1\+2';
-> 0 mysql>SELECT '1+2' REGEXP '1\\+2';
-> 1
Table 12.10. Numeric Functions and Operators
Name | Description |
---|---|
ABS() | Return the absolute value |
ACOS() | Return the arc cosine |
ASIN() | Return the arc sine |
ATAN2() , ATAN() | Return the arc tangent of the two arguments |
ATAN() | Return the arc tangent |
CEIL() | Return the smallest integer value not less than the argument |
CEILING() | Return the smallest integer value not less than the argument |
CONV() | Convert numbers between different number bases |
COS() | Return the cosine |
COT() | Return the cotangent |
CRC32() | Compute a cyclic redundancy check value |
DEGREES() | Convert radians to degrees |
DIV | Integer division |
/ | Division operator |
EXP() | Raise to the power of |
FLOOR() | Return the largest integer value not greater than the argument |
LN() | Return the natural logarithm of the argument |
LOG10() | Return the base-10 logarithm of the argument |
LOG2() | Return the base-2 logarithm of the argument |
LOG() | Return the natural logarithm of the first argument |
- | Minus operator |
MOD() | Return the remainder |
% or MOD | Modulo operator |
PI() | Return the value of pi |
+ | Addition operator |
POW() | Return the argument raised to the specified power |
POWER() | Return the argument raised to the specified power |
RADIANS() | Return argument converted to radians |
RAND() | Return a random floating-point value |
ROUND() | Round the argument |
SIGN() | Return the sign of the argument |
SIN() | Return the sine of the argument |
SQRT() | Return the square root of the argument |
TAN() | Return the tangent of the argument |
* | Multiplication operator |
TRUNCATE() | Truncate to specified number of decimal places |
- | Change the sign of the argument |
The usual arithmetic operators are available. The result is determined according to the following rules:
In the case of
-
,
+
, and
*
, the result
is calculated with BIGINT
(64-bit) precision if both operands are integers.
If both operands are integers and any of them are unsigned,
the result is an unsigned integer. For subtraction, if the
NO_UNSIGNED_SUBTRACTION
SQL mode is enabled, the result is signed even if any
operand is unsigned.
If any of the operands of a
+
,
-
,
/
,
*
,
%
is a real or
string value, the precision of the result is the precision
of the operand with the maximum precision.
In division performed with
/
, the scale
of the result when using two exact-value operands is the
scale of the first operand plus the value of the
div_precision_increment
system variable (which is 4 by default). For example, the
result of the expression 5.05 / 0.014
has
a scale of six decimal places
(360.714286
).
These rules are applied for each operation, such that nested
calculations imply the precision of each component. Hence,
(14620 / 9432456) / (24250 / 9432456)
,
resolves first to (0.0014) / (0.0026)
, with
the final result having 8 decimal places
(0.60288653
).
Because of these rules and the way they are applied, care should be taken to ensure that components and subcomponents of a calculation use the appropriate level of precision. See Section 12.10, “Cast Functions and Operators”.
For information about handling of overflow in numeric expression evaluation, see Section 11.2.6, “Out-of-Range and Overflow Handling”.
Arithmetic operators apply to numbers. For other types of
values, alternative operations may be available. For example, to
add date values, use DATE_ADD()
;
see Section 12.7, “Date and Time Functions”.
Addition:
mysql> SELECT 3+5;
-> 8
Subtraction:
mysql> SELECT 3-5;
-> -2
Unary minus. This operator changes the sign of the operand.
mysql> SELECT - 2;
-> -2
Multiplication:
mysql>SELECT 3*5;
-> 15 mysql>SELECT 18014398509481984*18014398509481984.0;
-> 324518553658426726783156020576256.0 mysql>SELECT 18014398509481984*18014398509481984;
-> out-of-range error
The last expression produces an error because the result of
the integer multiplication exceeds the 64-bit range of
BIGINT
calculations. (See
Section 11.2, “Numeric Types”.)
Division:
mysql> SELECT 3/5;
-> 0.60
Division by zero produces a NULL
result:
mysql> SELECT 102/(1-1);
-> NULL
A division is calculated with
BIGINT
arithmetic only if
performed in a context where its result is converted to an
integer.
Integer division. Similar to
FLOOR()
, but is safe with
BIGINT
values.
In MySQL 5.7, if either operand has a
noninteger type, the operands are converted to
DECIMAL
and divided using
DECIMAL
arithmetic before
converting the result to
BIGINT
. If the result exceeds
BIGINT
range, an error occurs.
mysql> SELECT 5 DIV 2;
-> 2
Modulo operation. Returns the remainder of
N
divided by
M
. For more information, see the
description for the MOD()
function in Section 12.6.2, “Mathematical Functions”.
Table 12.12. Mathematical Functions
Name | Description |
---|---|
ABS() | Return the absolute value |
ACOS() | Return the arc cosine |
ASIN() | Return the arc sine |
ATAN2() , ATAN() | Return the arc tangent of the two arguments |
ATAN() | Return the arc tangent |
CEIL() | Return the smallest integer value not less than the argument |
CEILING() | Return the smallest integer value not less than the argument |
CONV() | Convert numbers between different number bases |
COS() | Return the cosine |
COT() | Return the cotangent |
CRC32() | Compute a cyclic redundancy check value |
DEGREES() | Convert radians to degrees |
EXP() | Raise to the power of |
FLOOR() | Return the largest integer value not greater than the argument |
LN() | Return the natural logarithm of the argument |
LOG10() | Return the base-10 logarithm of the argument |
LOG2() | Return the base-2 logarithm of the argument |
LOG() | Return the natural logarithm of the first argument |
MOD() | Return the remainder |
PI() | Return the value of pi |
POW() | Return the argument raised to the specified power |
POWER() | Return the argument raised to the specified power |
RADIANS() | Return argument converted to radians |
RAND() | Return a random floating-point value |
ROUND() | Round the argument |
SIGN() | Return the sign of the argument |
SIN() | Return the sine of the argument |
SQRT() | Return the square root of the argument |
TAN() | Return the tangent of the argument |
TRUNCATE() | Truncate to specified number of decimal places |
All mathematical functions return NULL
in the
event of an error.
Returns the absolute value of X
.
mysql>SELECT ABS(2);
-> 2 mysql>SELECT ABS(-32);
-> 32
This function is safe to use with
BIGINT
values.
Returns the arc cosine of X
, that
is, the value whose cosine is X
.
Returns NULL
if
X
is not in the range
-1
to 1
.
mysql>SELECT ACOS(1);
-> 0 mysql>SELECT ACOS(1.0001);
-> NULL mysql>SELECT ACOS(0);
-> 1.5707963267949
Returns the arc sine of X
, that
is, the value whose sine is X
.
Returns NULL
if
X
is not in the range
-1
to 1
.
mysql>SELECT ASIN(0.2);
-> 0.20135792079033 mysql>SELECT ASIN('foo');
+-------------+ | ASIN('foo') | +-------------+ | 0 | +-------------+ 1 row in set, 1 warning (0.00 sec) mysql>SHOW WARNINGS;
+---------+------+-----------------------------------------+ | Level | Code | Message | +---------+------+-----------------------------------------+ | Warning | 1292 | Truncated incorrect DOUBLE value: 'foo' | +---------+------+-----------------------------------------+
Returns the arc tangent of X
,
that is, the value whose tangent is
X
.
mysql>SELECT ATAN(2);
-> 1.1071487177941 mysql>SELECT ATAN(-2);
-> -1.1071487177941
Returns the arc tangent of the two variables
X
and
Y
. It is similar to calculating
the arc tangent of
, except that the
signs of both arguments are used to determine the quadrant
of the result.
Y
/
X
mysql>SELECT ATAN(-2,2);
-> -0.78539816339745 mysql>SELECT ATAN2(PI(),0);
-> 1.5707963267949
Returns the smallest integer value not less than
X
.
mysql>SELECT CEILING(1.23);
-> 2 mysql>SELECT CEILING(-1.23);
-> -1
For exact-value numeric arguments, the return value has an exact-value numeric type. For string or floating-point arguments, the return value has a floating-point type.
Converts numbers between different number bases. Returns a
string representation of the number
N
, converted from base
from_base
to base
to_base
. Returns
NULL
if any argument is
NULL
. The argument
N
is interpreted as an integer,
but may be specified as an integer or a string. The minimum
base is 2
and the maximum base is
36
. If to_base
is a negative number, N
is
regarded as a signed number. Otherwise,
N
is treated as unsigned.
CONV()
works with 64-bit
precision.
mysql>SELECT CONV('a',16,2);
-> '1010' mysql>SELECT CONV('6E',18,8);
-> '172' mysql>SELECT CONV(-17,10,-18);
-> '-H' mysql>SELECT CONV(10+'10'+'10'+0xa,10,10);
-> '40'
Returns the cosine of X
, where
X
is given in radians.
mysql> SELECT COS(PI());
-> -1
Returns the cotangent of X
.
mysql>SELECT COT(12);
-> -1.5726734063977 mysql>SELECT COT(0);
-> NULL
Computes a cyclic redundancy check value and returns a
32-bit unsigned value. The result is NULL
if the argument is NULL
. The argument is
expected to be a string and (if possible) is treated as one
if it is not.
mysql>SELECT CRC32('MySQL');
-> 3259397556 mysql>SELECT CRC32('mysql');
-> 2501908538
Returns the argument X
, converted
from radians to degrees.
mysql>SELECT DEGREES(PI());
-> 180 mysql>SELECT DEGREES(PI() / 2);
-> 90
Returns the value of e (the base of
natural logarithms) raised to the power of
X
. The inverse of this function
is LOG()
(using a single
argument only) or LN()
.
mysql>SELECT EXP(2);
-> 7.3890560989307 mysql>SELECT EXP(-2);
-> 0.13533528323661 mysql>SELECT EXP(0);
-> 1
Returns the largest integer value not greater than
X
.
mysql>SELECT FLOOR(1.23);
-> 1 mysql>SELECT FLOOR(-1.23);
-> -2
For exact-value numeric arguments, the return value has an exact-value numeric type. For string or floating-point arguments, the return value has a floating-point type.
Formats the number X
to a format
like '#,###,###.##'
, rounded to
D
decimal places, and returns the
result as a string. For details, see
Section 12.5, “String Functions”.
This function can be used to obtain a hexadecimal representation of a decimal number or a string; the manner in which it does so varies according to the argument's type. See this function's description in Section 12.5, “String Functions”, for details.
Returns the natural logarithm of
X
; that is, the
base-e logarithm of
X
. If
X
is less than or equal to 0,
then NULL
is returned.
mysql>SELECT LN(2);
-> 0.69314718055995 mysql>SELECT LN(-2);
-> NULL
This function is synonymous with
LOG(
.
The inverse of this function is the
X
)EXP()
function.
If called with one parameter, this function returns the
natural logarithm of X
. If
X
is less than or equal to 0,
then NULL
is returned.
The inverse of this function (when called with a single
argument) is the EXP()
function.
mysql>SELECT LOG(2);
-> 0.69314718055995 mysql>SELECT LOG(-2);
-> NULL
If called with two parameters, this function returns the
logarithm of X
to the base
B
. If
X
is less than or equal to 0, or
if B
is less than or equal to 1,
then NULL
is returned.
mysql>SELECT LOG(2,65536);
-> 16 mysql>SELECT LOG(10,100);
-> 2 mysql>SELECT LOG(1,100);
-> NULL
LOG(
is equivalent to
B
,X
)LOG(
.
X
) /
LOG(B
)
Returns the base-2 logarithm of
.
X
mysql>SELECT LOG2(65536);
-> 16 mysql>SELECT LOG2(-100);
-> NULL
LOG2()
is useful for finding
out how many bits a number requires for storage. This
function is equivalent to the expression
LOG(
.
X
) /
LOG(2)
Returns the base-10 logarithm of
X
.
mysql>SELECT LOG10(2);
-> 0.30102999566398 mysql>SELECT LOG10(100);
-> 2 mysql>SELECT LOG10(-100);
-> NULL
Modulo operation. Returns the remainder of
N
divided by
M
.
mysql>SELECT MOD(234, 10);
-> 4 mysql>SELECT 253 % 7;
-> 1 mysql>SELECT MOD(29,9);
-> 2 mysql>SELECT 29 MOD 9;
-> 2
This function is safe to use with
BIGINT
values.
MOD()
also works on values
that have a fractional part and returns the exact remainder
after division:
mysql> SELECT MOD(34.5,3);
-> 1.5
MOD(
returns N
,0)NULL
.
Returns the value of π (pi). The default number of decimal places displayed is seven, but MySQL uses the full double-precision value internally.
mysql>SELECT PI();
-> 3.141593 mysql>SELECT PI()+0.000000000000000000;
-> 3.141592653589793116
Returns the value of X
raised to
the power of Y
.
mysql>SELECT POW(2,2);
-> 4 mysql>SELECT POW(2,-2);
-> 0.25
This is a synonym for POW()
.
Returns the argument X
, converted
from degrees to radians. (Note that π radians equals 180
degrees.)
mysql> SELECT RADIANS(90);
-> 1.5707963267949
Returns a random floating-point value
v
in the range
0
<= v
<
1.0
. If a constant integer argument
N
is specified, it is used as the
seed value, which produces a repeatable sequence of column
values. In the following example, note that the sequences of
values produced by RAND(3)
is the same
both places where it occurs.
mysql>CREATE TABLE t (i INT);
Query OK, 0 rows affected (0.42 sec) mysql>INSERT INTO t VALUES(1),(2),(3);
Query OK, 3 rows affected (0.00 sec) Records: 3 Duplicates: 0 Warnings: 0 mysql>SELECT i, RAND() FROM t;
+------+------------------+ | i | RAND() | +------+------------------+ | 1 | 0.61914388706828 | | 2 | 0.93845168309142 | | 3 | 0.83482678498591 | +------+------------------+ 3 rows in set (0.00 sec) mysql>SELECT i, RAND(3) FROM t;
+------+------------------+ | i | RAND(3) | +------+------------------+ | 1 | 0.90576975597606 | | 2 | 0.37307905813035 | | 3 | 0.14808605345719 | +------+------------------+ 3 rows in set (0.00 sec) mysql>SELECT i, RAND() FROM t;
+------+------------------+ | i | RAND() | +------+------------------+ | 1 | 0.35877890638893 | | 2 | 0.28941420772058 | | 3 | 0.37073435016976 | +------+------------------+ 3 rows in set (0.00 sec) mysql>SELECT i, RAND(3) FROM t;
+------+------------------+ | i | RAND(3) | +------+------------------+ | 1 | 0.90576975597606 | | 2 | 0.37307905813035 | | 3 | 0.14808605345719 | +------+------------------+ 3 rows in set (0.01 sec)
With a constant initializer, the seed is initialized once
when the statement is compiled, prior to execution. If a
nonconstant initializer (such as a column name) is used as
the argument, the seed is initialized with the value for
each invocation of RAND()
.
(One implication of this is that for equal argument values,
RAND()
will return the same
value each time.)
To obtain a random integer R
in
the range i
<=
R
<
j
, use the expression
FLOOR(
– i
+ RAND() * (j
.
For example, to obtain a random integer in the range the
range i
))7
<=
R
< 12
, you
could use the following statement:
SELECT FLOOR(7 + (RAND() * 5));
RAND()
in a
WHERE
clause is re-evaluated every time
the WHERE
is executed.
You cannot use a column with
RAND()
values in an
ORDER BY
clause, because ORDER
BY
would evaluate the column multiple times.
However, you can retrieve rows in random order like this:
mysql> SELECT * FROM tbl_name
ORDER BY RAND();
ORDER BY RAND()
combined with
LIMIT
is useful for selecting a random
sample from a set of rows:
mysql>SELECT * FROM table1, table2 WHERE a=b AND c<d
->ORDER BY RAND() LIMIT 1000;
RAND()
is not meant to be a
perfect random generator. It is a fast way to generate
random numbers on demand that is portable between platforms
for the same MySQL version.
This function is unsafe for statement-based replication. A
warning is logged if you use this function when
binlog_format
is set to
STATEMENT
. (Bug #49222)
Rounds the argument X
to
D
decimal places. The rounding
algorithm depends on the data type of
X
. D
defaults to 0 if not specified. D
can be negative to cause D
digits
left of the decimal point of the value
X
to become zero.
mysql>SELECT ROUND(-1.23);
-> -1 mysql>SELECT ROUND(-1.58);
-> -2 mysql>SELECT ROUND(1.58);
-> 2 mysql>SELECT ROUND(1.298, 1);
-> 1.3 mysql>SELECT ROUND(1.298, 0);
-> 1 mysql>SELECT ROUND(23.298, -1);
-> 20
The return type is the same type as that of the first argument (assuming that it is integer, double, or decimal). This means that for an integer argument, the result is an integer (no decimal places):
mysql> SELECT ROUND(150.000,2), ROUND(150,2);
+------------------+--------------+
| ROUND(150.000,2) | ROUND(150,2) |
+------------------+--------------+
| 150.00 | 150 |
+------------------+--------------+
ROUND()
uses the following
rules depending on the type of the first argument:
For exact-value numbers,
ROUND()
uses the
“round half away from zero” or “round
toward nearest” rule: A value with a fractional
part of .5 or greater is rounded up to the next integer
if positive or down to the next integer if negative. (In
other words, it is rounded away from zero.) A value with
a fractional part less than .5 is rounded down to the
next integer if positive or up to the next integer if
negative.
For approximate-value numbers, the result depends on the
C library. On many systems, this means that
ROUND()
uses the "round
to nearest even" rule: A value with any fractional part
is rounded to the nearest even integer.
The following example shows how rounding differs for exact and approximate values:
mysql> SELECT ROUND(2.5), ROUND(25E-1);
+------------+--------------+
| ROUND(2.5) | ROUND(25E-1) |
+------------+--------------+
| 3 | 2 |
+------------+--------------+
For more information, see Section 12.19, “Precision Math”.
Returns the sign of the argument as -1
,
0
, or 1
, depending on
whether X
is negative, zero, or
positive.
mysql>SELECT SIGN(-32);
-> -1 mysql>SELECT SIGN(0);
-> 0 mysql>SELECT SIGN(234);
-> 1
Returns the sine of X
, where
X
is given in radians.
mysql>SELECT SIN(PI());
-> 1.2246063538224e-16 mysql>SELECT ROUND(SIN(PI()));
-> 0
Returns the square root of a nonnegative number
X
.
mysql>SELECT SQRT(4);
-> 2 mysql>SELECT SQRT(20);
-> 4.4721359549996 mysql>SELECT SQRT(-16);
-> NULL
Returns the tangent of X
, where
X
is given in radians.
mysql>SELECT TAN(PI());
-> -1.2246063538224e-16 mysql>SELECT TAN(PI()+1);
-> 1.5574077246549
Returns the number X
, truncated
to D
decimal places. If
D
is 0
, the
result has no decimal point or fractional part.
D
can be negative to cause
D
digits left of the decimal
point of the value X
to become
zero.
mysql>SELECT TRUNCATE(1.223,1);
-> 1.2 mysql>SELECT TRUNCATE(1.999,1);
-> 1.9 mysql>SELECT TRUNCATE(1.999,0);
-> 1 mysql>SELECT TRUNCATE(-1.999,1);
-> -1.9 mysql>SELECT TRUNCATE(122,-2);
-> 100 mysql>SELECT TRUNCATE(10.28*100,0);
-> 1028
All numbers are rounded toward zero.
This section describes the functions that can be used to manipulate temporal values. See Section 11.3, “Date and Time Types”, for a description of the range of values each date and time type has and the valid formats in which values may be specified.
Table 12.13. Date/Time Functions
Name | Description |
---|---|
ADDDATE() | Add time values (intervals) to a date value |
ADDTIME() | Add time |
CONVERT_TZ() | Convert from one timezone to another |
CURDATE() | Return the current date |
CURRENT_DATE() , CURRENT_DATE | Synonyms for CURDATE() |
CURRENT_TIME() , CURRENT_TIME | Synonyms for CURTIME() |
CURRENT_TIMESTAMP() , CURRENT_TIMESTAMP | Synonyms for NOW() |
CURTIME() | Return the current time |
DATE_ADD() | Add time values (intervals) to a date value |
DATE_FORMAT() | Format date as specified |
DATE_SUB() | Subtract a time value (interval) from a date |
DATE() | Extract the date part of a date or datetime expression |
DATEDIFF() | Subtract two dates |
DAY() | Synonym for DAYOFMONTH() |
DAYNAME() | Return the name of the weekday |
DAYOFMONTH() | Return the day of the month (0-31) |
DAYOFWEEK() | Return the weekday index of the argument |
DAYOFYEAR() | Return the day of the year (1-366) |
EXTRACT() | Extract part of a date |
FROM_DAYS() | Convert a day number to a date |
FROM_UNIXTIME() | Format UNIX timestamp as a date |
GET_FORMAT() | Return a date format string |
HOUR() | Extract the hour |
LAST_DAY | Return the last day of the month for the argument |
LOCALTIME() , LOCALTIME | Synonym for NOW() |
LOCALTIMESTAMP , LOCALTIMESTAMP() | Synonym for NOW() |
MAKEDATE() | Create a date from the year and day of year |
MAKETIME | MAKETIME() |
MICROSECOND() | Return the microseconds from argument |
MINUTE() | Return the minute from the argument |
MONTH() | Return the month from the date passed |
MONTHNAME() | Return the name of the month |
NOW() | Return the current date and time |
PERIOD_ADD() | Add a period to a year-month |
PERIOD_DIFF() | Return the number of months between periods |
QUARTER() | Return the quarter from a date argument |
SEC_TO_TIME() | Converts seconds to 'HH:MM:SS' format |
SECOND() | Return the second (0-59) |
STR_TO_DATE() | Convert a string to a date |
SUBDATE() | A synonym for DATE_SUB() when invoked with three arguments |
SUBTIME() | Subtract times |
SYSDATE() | Return the time at which the function executes |
TIME_FORMAT() | Format as time |
TIME_TO_SEC() | Return the argument converted to seconds |
TIME() | Extract the time portion of the expression passed |
TIMEDIFF() | Subtract time |
TIMESTAMP() | With a single argument, this function returns the date or datetime expression; with two arguments, the sum of the arguments |
TIMESTAMPADD() | Add an interval to a datetime expression |
TIMESTAMPDIFF() | Subtract an interval from a datetime expression |
TO_DAYS() | Return the date argument converted to days |
TO_SECONDS() | Return the date or datetime argument converted to seconds since Year 0 |
UNIX_TIMESTAMP() | Return a UNIX timestamp |
UTC_DATE() | Return the current UTC date |
UTC_TIME() | Return the current UTC time |
UTC_TIMESTAMP() | Return the current UTC date and time |
WEEK() | Return the week number |
WEEKDAY() | Return the weekday index |
WEEKOFYEAR() | Return the calendar week of the date (0-53) |
YEAR() | Return the year |
YEARWEEK() | Return the year and week |
Here is an example that uses date functions. The following query
selects all rows with a date_col
value
from within the last 30 days:
mysql>SELECT
->something
FROMtbl_name
WHERE DATE_SUB(CURDATE(),INTERVAL 30 DAY) <=
date_col
;
The query also selects rows with dates that lie in the future.
Functions that expect date values usually accept datetime values and ignore the time part. Functions that expect time values usually accept datetime values and ignore the date part.
Functions that return the current date or time each are evaluated
only once per query at the start of query execution. This means
that multiple references to a function such as
NOW()
within a single query always
produce the same result. (For our purposes, a single query also
includes a call to a stored program (stored routine, trigger, or
event) and all subprograms called by that program.) This principle
also applies to CURDATE()
,
CURTIME()
,
UTC_DATE()
,
UTC_TIME()
,
UTC_TIMESTAMP()
, and to any of
their synonyms.
The CURRENT_TIMESTAMP()
,
CURRENT_TIME()
,
CURRENT_DATE()
, and
FROM_UNIXTIME()
functions return
values in the connection's current time zone, which is available
as the value of the time_zone
system variable. In addition,
UNIX_TIMESTAMP()
assumes that its
argument is a datetime value in the current time zone. See
Section 10.6, “MySQL Server Time Zone Support”.
Some date functions can be used with “zero” dates or
incomplete dates such as '2001-11-00'
, whereas
others cannot. Functions that extract parts of dates typically
work with incomplete dates and thus can return 0 when you might
otherwise expect a nonzero value. For example:
mysql> SELECT DAYOFMONTH('2001-11-00'), MONTH('2005-00-00');
-> 0, 0
Other functions expect complete dates and return
NULL
for incomplete dates. These include
functions that perform date arithmetic or that map parts of dates
to names. For example:
mysql>SELECT DATE_ADD('2006-05-00',INTERVAL 1 DAY);
-> NULL mysql>SELECT DAYNAME('2006-05-00');
-> NULL
Several functions are more strict when passed a
DATE()
function value as their
argument and reject incomplete dates with a day part of zero.
These functions are affected:
CONVERT_TZ()
,
DATE_ADD()
,
DATE_SUB()
,
DAYOFYEAR()
,
LAST_DAY()
(permits a day part of
zero), TIMESTAMPDIFF()
,
TO_DAYS()
,
TO_SECONDS()
,
WEEK()
,
WEEKDAY()
,
WEEKOFYEAR()
,
YEARWEEK()
.
Fractional seconds for TIME
,
DATETIME
, and TIMESTAMP
values are supported, with up to microsecond precision. Functions
that take temporal arguments accept values with fractional
seconds. Return values from temporal functions include fractional
seconds as appropriate.
ADDDATE(
,
date
,INTERVAL
expr
unit
)ADDDATE(
expr
,days
)
When invoked with the INTERVAL
form of the
second argument, ADDDATE()
is a
synonym for DATE_ADD()
. The
related function SUBDATE()
is a
synonym for DATE_SUB()
. For
information on the INTERVAL
unit
argument, see the discussion
for DATE_ADD()
.
mysql>SELECT DATE_ADD('2008-01-02', INTERVAL 31 DAY);
-> '2008-02-02' mysql>SELECT ADDDATE('2008-01-02', INTERVAL 31 DAY);
-> '2008-02-02'
When invoked with the days
form of
the second argument, MySQL treats it as an integer number of
days to be added to expr
.
mysql> SELECT ADDDATE('2008-01-02', 31);
-> '2008-02-02'
ADDTIME()
adds
expr2
to
expr1
and returns the result.
expr1
is a time or datetime
expression, and expr2
is a time
expression.
mysql>SELECT ADDTIME('2007-12-31 23:59:59.999999', '1 1:1:1.000002');
-> '2008-01-02 01:01:01.000001' mysql>SELECT ADDTIME('01:00:00.999999', '02:00:00.999998');
-> '03:00:01.999997'
CONVERT_TZ()
converts a
datetime value dt
from the time
zone given by from_tz
to the time
zone given by to_tz
and returns the
resulting value. Time zones are specified as described in
Section 10.6, “MySQL Server Time Zone Support”. This function returns
NULL
if the arguments are invalid.
If the value falls out of the supported range of the
TIMESTAMP
type when converted
from from_tz
to UTC, no conversion
occurs. The TIMESTAMP
range is
described in Section 11.1.2, “Date and Time Type Overview”.
mysql>SELECT CONVERT_TZ('2004-01-01 12:00:00','GMT','MET');
-> '2004-01-01 13:00:00' mysql>SELECT CONVERT_TZ('2004-01-01 12:00:00','+00:00','+10:00');
-> '2004-01-01 22:00:00'
To use named time zones such as 'MET'
or
'Europe/Moscow'
, the time zone tables
must be properly set up. See
Section 10.6, “MySQL Server Time Zone Support”, for instructions.
Returns the current date as a value in
'YYYY-MM-DD'
or YYYYMMDD
format, depending on whether the function is used in a string
or numeric context.
mysql>SELECT CURDATE();
-> '2008-06-13' mysql>SELECT CURDATE() + 0;
-> 20080613
CURRENT_DATE
and
CURRENT_DATE()
are synonyms for
CURDATE()
.
CURRENT_TIME
,
CURRENT_TIME([
fsp
])
CURRENT_TIME
and
CURRENT_TIME()
are synonyms for
CURTIME()
.
CURRENT_TIMESTAMP
,
CURRENT_TIMESTAMP([
fsp
])
CURRENT_TIMESTAMP
and
CURRENT_TIMESTAMP()
are
synonyms for NOW()
.
Returns the current time as a value in
'HH:MM:SS'
or HHMMSS
format, depending on whether the function is used in a string
or numeric context. The value is expressed in the current time
zone.
If the fsp
argument is given to
specify a fractional seconds precision from 0 to 6, the return
value includes a fractional seconds part of that many digits.
mysql>SELECT CURTIME();
-> '23:50:26' mysql>SELECT CURTIME() + 0;
-> 235026.000000
Extracts the date part of the date or datetime expression
expr
.
mysql> SELECT DATE('2003-12-31 01:02:03');
-> '2003-12-31'
DATEDIFF()
returns
expr1
–
expr2
expressed as a value in days
from one date to the other. expr1
and expr2
are date or date-and-time
expressions. Only the date parts of the values are used in the
calculation.
mysql>SELECT DATEDIFF('2007-12-31 23:59:59','2007-12-30');
-> 1 mysql>SELECT DATEDIFF('2010-11-30 23:59:59','2010-12-31');
-> -31
DATE_ADD(
,
date
,INTERVAL
expr
unit
)DATE_SUB(
date
,INTERVAL
expr
unit
)
These functions perform date arithmetic. The
date
argument specifies the
starting date or datetime value.
expr
is an expression specifying
the interval value to be added or subtracted from the starting
date. expr
is a string; it may
start with a “-
” for negative
intervals. unit
is a keyword
indicating the units in which the expression should be
interpreted.
The INTERVAL
keyword and the
unit
specifier are not case
sensitive.
The following table shows the expected form of the
expr
argument for each
unit
value.
unit Value | Expected expr Format |
---|---|
MICROSECOND | MICROSECONDS |
SECOND | SECONDS |
MINUTE | MINUTES |
HOUR | HOURS |
DAY | DAYS |
WEEK | WEEKS |
MONTH | MONTHS |
QUARTER | QUARTERS |
YEAR | YEARS |
SECOND_MICROSECOND | 'SECONDS.MICROSECONDS' |
MINUTE_MICROSECOND | 'MINUTES:SECONDS.MICROSECONDS' |
MINUTE_SECOND | 'MINUTES:SECONDS' |
HOUR_MICROSECOND | 'HOURS:MINUTES:SECONDS.MICROSECONDS' |
HOUR_SECOND | 'HOURS:MINUTES:SECONDS' |
HOUR_MINUTE | 'HOURS:MINUTES' |
DAY_MICROSECOND | 'DAYS HOURS:MINUTES:SECONDS.MICROSECONDS' |
DAY_SECOND | 'DAYS HOURS:MINUTES:SECONDS' |
DAY_MINUTE | 'DAYS HOURS:MINUTES' |
DAY_HOUR | 'DAYS HOURS' |
YEAR_MONTH | 'YEARS-MONTHS' |
The return value depends on the arguments:
To ensure that the result is
DATETIME
, you can use
CAST()
to convert the first
argument to DATETIME
.
MySQL permits any punctuation delimiter in the
expr
format. Those shown in the
table are the suggested delimiters. If the
date
argument is a
DATE
value and your
calculations involve only YEAR
,
MONTH
, and DAY
parts
(that is, no time parts), the result is a
DATE
value. Otherwise, the
result is a DATETIME
value.
Date arithmetic also can be performed using
INTERVAL
together with the
+
or
-
operator:
date
+ INTERVALexpr
unit
date
- INTERVALexpr
unit
INTERVAL
is permitted on
either side of the
expr
unit
+
operator if
the expression on the other side is a date or datetime value.
For the -
operator, INTERVAL
is permitted only on
the right side, because it makes no sense to subtract a date
or datetime value from an interval.
expr
unit
mysql>SELECT '2008-12-31 23:59:59' + INTERVAL 1 SECOND;
-> '2009-01-01 00:00:00' mysql>SELECT INTERVAL 1 DAY + '2008-12-31';
-> '2009-01-01' mysql>SELECT '2005-01-01' - INTERVAL 1 SECOND;
-> '2004-12-31 23:59:59' mysql>SELECT DATE_ADD('2000-12-31 23:59:59',
->INTERVAL 1 SECOND);
-> '2001-01-01 00:00:00' mysql>SELECT DATE_ADD('2010-12-31 23:59:59',
->INTERVAL 1 DAY);
-> '2011-01-01 23:59:59' mysql>SELECT DATE_ADD('2100-12-31 23:59:59',
->INTERVAL '1:1' MINUTE_SECOND);
-> '2101-01-01 00:01:00' mysql>SELECT DATE_SUB('2005-01-01 00:00:00',
->INTERVAL '1 1:1:1' DAY_SECOND);
-> '2004-12-30 22:58:59' mysql>SELECT DATE_ADD('1900-01-01 00:00:00',
->INTERVAL '-1 10' DAY_HOUR);
-> '1899-12-30 14:00:00' mysql>SELECT DATE_SUB('1998-01-02', INTERVAL 31 DAY);
-> '1997-12-02' mysql>SELECT DATE_ADD('1992-12-31 23:59:59.000002',
->INTERVAL '1.999999' SECOND_MICROSECOND);
-> '1993-01-01 00:00:01.000001'
If you specify an interval value that is too short (does not
include all the interval parts that would be expected from the
unit
keyword), MySQL assumes that
you have left out the leftmost parts of the interval value.
For example, if you specify a unit
of DAY_SECOND
, the value of
expr
is expected to have days,
hours, minutes, and seconds parts. If you specify a value like
'1:10'
, MySQL assumes that the days and
hours parts are missing and the value represents minutes and
seconds. In other words, '1:10' DAY_SECOND
is interpreted in such a way that it is equivalent to
'1:10' MINUTE_SECOND
. This is analogous to
the way that MySQL interprets
TIME
values as representing
elapsed time rather than as a time of day.
Because expr
is treated as a
string, be careful if you specify a nonstring value with
INTERVAL
. For example, with an interval
specifier of HOUR_MINUTE
,
6/4
evaluates to 1.5000
and is treated as 1 hour, 5000 minutes:
mysql>SELECT 6/4;
-> 1.5000 mysql>SELECT DATE_ADD('2009-01-01', INTERVAL 6/4 HOUR_MINUTE);
-> '2009-01-04 12:20:00'
To ensure interpretation of the interval value as you expect,
a CAST()
operation may be used.
To treat 6/4
as 1 hour, 5 minutes, cast it
to a DECIMAL
value with a
single fractional digit:
mysql>SELECT CAST(6/4 AS DECIMAL(3,1));
-> 1.5 mysql>SELECT DATE_ADD('1970-01-01 12:00:00',
->INTERVAL CAST(6/4 AS DECIMAL(3,1)) HOUR_MINUTE);
-> '1970-01-01 13:05:00'
If you add to or subtract from a date value something that contains a time part, the result is automatically converted to a datetime value:
mysql>SELECT DATE_ADD('2013-01-01', INTERVAL 1 DAY);
-> '2013-01-02' mysql>SELECT DATE_ADD('2013-01-01', INTERVAL 1 HOUR);
-> '2013-01-01 01:00:00'
If you add MONTH
,
YEAR_MONTH
, or YEAR
and
the resulting date has a day that is larger than the maximum
day for the new month, the day is adjusted to the maximum days
in the new month:
mysql> SELECT DATE_ADD('2009-01-30', INTERVAL 1 MONTH);
-> '2009-02-28'
Date arithmetic operations require complete dates and do not
work with incomplete dates such as
'2006-07-00'
or badly malformed dates:
mysql>SELECT DATE_ADD('2006-07-00', INTERVAL 1 DAY);
-> NULL mysql>SELECT '2005-03-32' + INTERVAL 1 MONTH;
-> NULL
Formats the date
value according to
the format
string.
The following specifiers may be used in the
format
string. The
“%
” character is required
before format specifier characters.
Specifier | Description |
---|---|
%a | Abbreviated weekday name
(Sun ..Sat ) |
%b | Abbreviated month name (Jan ..Dec ) |
%c | Month, numeric (0 ..12 ) |
%D | Day of the month with English suffix (0th ,
1st , 2nd ,
3rd , …) |
%d | Day of the month, numeric (00 ..31 ) |
%e | Day of the month, numeric (0 ..31 ) |
%f | Microseconds (000000 ..999999 ) |
%H | Hour (00 ..23 ) |
%h | Hour (01 ..12 ) |
%I | Hour (01 ..12 ) |
%i | Minutes, numeric (00 ..59 ) |
%j | Day of year (001 ..366 ) |
%k | Hour (0 ..23 ) |
%l | Hour (1 ..12 ) |
%M | Month name (January ..December ) |
%m | Month, numeric (00 ..12 ) |
%p | AM or PM |
%r | Time, 12-hour (hh:mm:ss followed by
AM or PM ) |
%S | Seconds (00 ..59 ) |
%s | Seconds (00 ..59 ) |
%T | Time, 24-hour (hh:mm:ss ) |
%U | Week (00 ..53 ), where Sunday is the
first day of the week |
%u | Week (00 ..53 ), where Monday is the
first day of the week |
%V | Week (01 ..53 ), where Sunday is the
first day of the week; used with %X |
%v | Week (01 ..53 ), where Monday is the
first day of the week; used with %x |
%W | Weekday name (Sunday ..Saturday ) |
%w | Day of the week
(0 =Sunday..6 =Saturday) |
%X | Year for the week where Sunday is the first day of the week, numeric,
four digits; used with %V |
%x | Year for the week, where Monday is the first day of the week, numeric,
four digits; used with %v |
%Y | Year, numeric, four digits |
%y | Year, numeric (two digits) |
%% | A literal “% ” character |
% | x , for any
“x ” not listed
above |
Ranges for the month and day specifiers begin with zero due to
the fact that MySQL permits the storing of incomplete dates
such as '2014-00-00'
.
The language used for day and month names and abbreviations is
controlled by the value of the
lc_time_names
system variable
(Section 10.7, “MySQL Server Locale Support”).
DATE_FORMAT()
returns a string
with a character set and collation given by
character_set_connection
and
collation_connection
so that
it can return month and weekday names containing non-ASCII
characters.
mysql>SELECT DATE_FORMAT('2009-10-04 22:23:00', '%W %M %Y');
-> 'Sunday October 2009' mysql>SELECT DATE_FORMAT('2007-10-04 22:23:00', '%H:%i:%s');
-> '22:23:00' mysql>SELECT DATE_FORMAT('1900-10-04 22:23:00',
->'%D %y %a %d %m %b %j');
-> '4th 00 Thu 04 10 Oct 277' mysql>SELECT DATE_FORMAT('1997-10-04 22:23:00',
->'%H %k %I %r %T %S %w');
-> '22 22 10 10:23:00 PM 22:23:00 00 6' mysql>SELECT DATE_FORMAT('1999-01-01', '%X %V');
-> '1998 52' mysql>SELECT DATE_FORMAT('2006-06-00', '%d');
-> '00'
DATE_SUB(
date
,INTERVAL
expr
unit
)
See the description for
DATE_ADD()
.
DAY()
is a synonym for
DAYOFMONTH()
.
Returns the name of the weekday for
date
. The language used for the
name is controlled by the value of the
lc_time_names
system variable
(Section 10.7, “MySQL Server Locale Support”).
mysql> SELECT DAYNAME('2007-02-03');
-> 'Saturday'
Returns the day of the month for
date
, in the range
1
to 31
, or
0
for dates such as
'0000-00-00'
or
'2008-00-00'
that have a zero day part.
mysql> SELECT DAYOFMONTH('2007-02-03');
-> 3
Returns the weekday index for date
(1
= Sunday, 2
= Monday,
…, 7
= Saturday). These index values
correspond to the ODBC standard.
mysql> SELECT DAYOFWEEK('2007-02-03');
-> 7
Returns the day of the year for
date
, in the range
1
to 366
.
mysql> SELECT DAYOFYEAR('2007-02-03');
-> 34
The EXTRACT()
function uses the
same kinds of unit specifiers as
DATE_ADD()
or
DATE_SUB()
, but extracts parts
from the date rather than performing date arithmetic.
mysql>SELECT EXTRACT(YEAR FROM '2009-07-02');
-> 2009 mysql>SELECT EXTRACT(YEAR_MONTH FROM '2009-07-02 01:02:03');
-> 200907 mysql>SELECT EXTRACT(DAY_MINUTE FROM '2009-07-02 01:02:03');
-> 20102 mysql>SELECT EXTRACT(MICROSECOND
->FROM '2003-01-02 10:30:00.000123');
-> 123
Given a day number N
, returns a
DATE
value.
mysql> SELECT FROM_DAYS(730669);
-> '2007-07-03'
Use FROM_DAYS()
with caution on
old dates. It is not intended for use with values that precede
the advent of the Gregorian calendar (1582). See
Section 12.8, “What Calendar Is Used By MySQL?”.
FROM_UNIXTIME(
,
unix_timestamp
)FROM_UNIXTIME(
unix_timestamp
,format
)
Returns a representation of the
unix_timestamp
argument as a value
in 'YYYY-MM-DD HH:MM:SS'
or
YYYYMMDDHHMMSS
format, depending on whether
the function is used in a string or numeric context. The value
is expressed in the current time zone.
unix_timestamp
is an internal
timestamp value such as is produced by the
UNIX_TIMESTAMP()
function.
If format
is given, the result is
formatted according to the format
string, which is used the same way as listed in the entry for
the DATE_FORMAT()
function.
mysql>SELECT FROM_UNIXTIME(1196440219);
-> '2007-11-30 10:30:19' mysql>SELECT FROM_UNIXTIME(1196440219) + 0;
-> 20071130103019.000000 mysql>SELECT FROM_UNIXTIME(UNIX_TIMESTAMP(),
->'%Y %D %M %h:%i:%s %x');
-> '2007 30th November 10:30:59 2007'
Note: If you use
UNIX_TIMESTAMP()
and
FROM_UNIXTIME()
to convert
between TIMESTAMP
values and
Unix timestamp values, the conversion is lossy because the
mapping is not one-to-one in both directions. For details, see
the description of the
UNIX_TIMESTAMP()
function.
GET_FORMAT({DATE|TIME|DATETIME},
{'EUR'|'USA'|'JIS'|'ISO'|'INTERNAL'})
Returns a format string. This function is useful in
combination with the
DATE_FORMAT()
and the
STR_TO_DATE()
functions.
The possible values for the first and second arguments result
in several possible format strings (for the specifiers used,
see the table in the
DATE_FORMAT()
function
description). ISO format refers to ISO 9075, not ISO 8601.
Function Call | Result |
---|---|
GET_FORMAT(DATE,'USA') | '%m.%d.%Y' |
GET_FORMAT(DATE,'JIS') | '%Y-%m-%d' |
GET_FORMAT(DATE,'ISO') | '%Y-%m-%d' |
GET_FORMAT(DATE,'EUR') | '%d.%m.%Y' |
GET_FORMAT(DATE,'INTERNAL') | '%Y%m%d' |
GET_FORMAT(DATETIME,'USA') | '%Y-%m-%d %H.%i.%s' |
GET_FORMAT(DATETIME,'JIS') | '%Y-%m-%d %H:%i:%s' |
GET_FORMAT(DATETIME,'ISO') | '%Y-%m-%d %H:%i:%s' |
GET_FORMAT(DATETIME,'EUR') | '%Y-%m-%d %H.%i.%s' |
GET_FORMAT(DATETIME,'INTERNAL') | '%Y%m%d%H%i%s' |
GET_FORMAT(TIME,'USA') | '%h:%i:%s %p' |
GET_FORMAT(TIME,'JIS') | '%H:%i:%s' |
GET_FORMAT(TIME,'ISO') | '%H:%i:%s' |
GET_FORMAT(TIME,'EUR') | '%H.%i.%s' |
GET_FORMAT(TIME,'INTERNAL') | '%H%i%s' |
TIMESTAMP
can also be used as
the first argument to
GET_FORMAT()
, in which case the
function returns the same values as for
DATETIME
.
mysql>SELECT DATE_FORMAT('2003-10-03',GET_FORMAT(DATE,'EUR'));
-> '03.10.2003' mysql>SELECT STR_TO_DATE('10.31.2003',GET_FORMAT(DATE,'USA'));
-> '2003-10-31'
Returns the hour for time
. The
range of the return value is 0
to
23
for time-of-day values. However, the
range of TIME
values actually
is much larger, so HOUR
can return values
greater than 23
.
mysql>SELECT HOUR('10:05:03');
-> 10 mysql>SELECT HOUR('272:59:59');
-> 272
Takes a date or datetime value and returns the corresponding
value for the last day of the month. Returns
NULL
if the argument is invalid.
mysql>SELECT LAST_DAY('2003-02-05');
-> '2003-02-28' mysql>SELECT LAST_DAY('2004-02-05');
-> '2004-02-29' mysql>SELECT LAST_DAY('2004-01-01 01:01:01');
-> '2004-01-31' mysql>SELECT LAST_DAY('2003-03-32');
-> NULL
LOCALTIME
and
LOCALTIME()
are synonyms for
NOW()
.
LOCALTIMESTAMP
,
LOCALTIMESTAMP([
fsp
])
LOCALTIMESTAMP
and
LOCALTIMESTAMP()
are synonyms
for NOW()
.
Returns a date, given year and day-of-year values.
dayofyear
must be greater than 0 or
the result is NULL
.
mysql>SELECT MAKEDATE(2011,31), MAKEDATE(2011,32);
-> '2011-01-31', '2011-02-01' mysql>SELECT MAKEDATE(2011,365), MAKEDATE(2014,365);
-> '2011-12-31', '2014-12-31' mysql>SELECT MAKEDATE(2011,0);
-> NULL
Returns a time value calculated from the
hour
,
minute
, and
second
arguments.
The second
argument can have a
fractional part.
mysql> SELECT MAKETIME(12,15,30);
-> '12:15:30'
Returns the microseconds from the time or datetime expression
expr
as a number in the range from
0
to 999999
.
mysql>SELECT MICROSECOND('12:00:00.123456');
-> 123456 mysql>SELECT MICROSECOND('2009-12-31 23:59:59.000010');
-> 10
Returns the minute for time
, in the
range 0
to 59
.
mysql> SELECT MINUTE('2008-02-03 10:05:03');
-> 5
Returns the month for date
, in the
range 1
to 12
for
January to December, or 0
for dates such as
'0000-00-00'
or
'2008-00-00'
that have a zero month part.
mysql> SELECT MONTH('2008-02-03');
-> 2
Returns the full name of the month for
date
. The language used for the
name is controlled by the value of the
lc_time_names
system variable
(Section 10.7, “MySQL Server Locale Support”).
mysql> SELECT MONTHNAME('2008-02-03');
-> 'February'
Returns the current date and time as a value in
'YYYY-MM-DD HH:MM:SS'
or
YYYYMMDDHHMMSS
format, depending on whether
the function is used in a string or numeric context. The value
is expressed in the current time zone.
If the fsp
argument is given to
specify a fractional seconds precision from 0 to 6, the return
value includes a fractional seconds part of that many digits.
mysql>SELECT NOW();
-> '2007-12-15 23:50:26' mysql>SELECT NOW() + 0;
-> 20071215235026.000000
NOW()
returns a constant time
that indicates the time at which the statement began to
execute. (Within a stored function or trigger,
NOW()
returns the time at which
the function or triggering statement began to execute.) This
differs from the behavior for
SYSDATE()
, which returns the
exact time at which it executes.
mysql>SELECT NOW(), SLEEP(2), NOW();
+---------------------+----------+---------------------+ | NOW() | SLEEP(2) | NOW() | +---------------------+----------+---------------------+ | 2006-04-12 13:47:36 | 0 | 2006-04-12 13:47:36 | +---------------------+----------+---------------------+ mysql>SELECT SYSDATE(), SLEEP(2), SYSDATE();
+---------------------+----------+---------------------+ | SYSDATE() | SLEEP(2) | SYSDATE() | +---------------------+----------+---------------------+ | 2006-04-12 13:47:44 | 0 | 2006-04-12 13:47:46 | +---------------------+----------+---------------------+
In addition, the SET TIMESTAMP
statement
affects the value returned by
NOW()
but not by
SYSDATE()
. This means that
timestamp settings in the binary log have no effect on
invocations of SYSDATE()
.
Setting the timestamp to a nonzero value causes each
subsequent invocation of NOW()
to return that value. Setting the timestamp to zero cancels
this effect so that NOW()
once
again returns the current date and time.
See the description for
SYSDATE()
for additional
information about the differences between the two functions.
Adds N
months to period
P
(in the format
YYMM
or YYYYMM
). Returns
a value in the format YYYYMM
. Note that the
period argument P
is
not a date value.
mysql> SELECT PERIOD_ADD(200801,2);
-> 200803
Returns the number of months between periods
P1
and
P2
. P1
and P2
should be in the format
YYMM
or YYYYMM
. Note
that the period arguments P1
and
P2
are not
date values.
mysql> SELECT PERIOD_DIFF(200802,200703);
-> 11
Returns the quarter of the year for
date
, in the range
1
to 4
.
mysql> SELECT QUARTER('2008-04-01');
-> 2
Returns the second for time
, in the
range 0
to 59
.
mysql> SELECT SECOND('10:05:03');
-> 3
Returns the seconds
argument,
converted to hours, minutes, and seconds, as a
TIME
value. The range of the
result is constrained to that of the
TIME
data type. A warning
occurs if the argument corresponds to a value outside that
range.
mysql>SELECT SEC_TO_TIME(2378);
-> '00:39:38' mysql>SELECT SEC_TO_TIME(2378) + 0;
-> 3938
This is the inverse of the
DATE_FORMAT()
function. It
takes a string str
and a format
string format
.
STR_TO_DATE()
returns a
DATETIME
value if the format
string contains both date and time parts, or a
DATE
or
TIME
value if the string
contains only date or time parts. If the date, time, or
datetime value extracted from str
is illegal, STR_TO_DATE()
returns NULL
and produces a warning.
The server scans str
attempting to
match format
to it. The format
string can contain literal characters and format specifiers
beginning with %
. Literal characters in
format
must match literally in
str
. Format specifiers in
format
must match a date or time
part in str
. For the specifiers
that can be used in format
, see the
DATE_FORMAT()
function
description.
mysql>SELECT STR_TO_DATE('01,5,2013','%d,%m,%Y');
-> '2013-05-01' mysql>SELECT STR_TO_DATE('May 1, 2013','%M %d,%Y');
-> '2013-05-01'
Scanning starts at the beginning of
str
and fails if
format
is found not to match. Extra
characters at the end of str
are
ignored.
mysql>SELECT STR_TO_DATE('a09:30:17','a%h:%i:%s');
-> '09:30:17' mysql>SELECT STR_TO_DATE('a09:30:17','%h:%i:%s');
-> NULL mysql>SELECT STR_TO_DATE('09:30:17a','%h:%i:%s');
-> '09:30:17'
Unspecified date or time parts have a value of 0, so
incompletely specified values in
str
produce a result with some or
all parts set to 0:
mysql>SELECT STR_TO_DATE('abc','abc');
-> '0000-00-00' mysql>SELECT STR_TO_DATE('9','%m');
-> '0000-09-00' mysql>SELECT STR_TO_DATE('9','%s');
-> '00:00:09'
Range checking on the parts of date values is as described in
Section 11.3.1, “The DATE
, DATETIME
, and
TIMESTAMP
Types”. This means, for example, that
“zero” dates or dates with part values of 0 are
permitted unless the SQL mode is set to disallow such values.
mysql>SELECT STR_TO_DATE('00/00/0000', '%m/%d/%Y');
-> '0000-00-00' mysql>SELECT STR_TO_DATE('04/31/2004', '%m/%d/%Y');
-> '2004-04-31'
You cannot use format "%X%V"
to convert a
year-week string to a date because the combination of a year
and week does not uniquely identify a year and month if the
week crosses a month boundary. To convert a year-week to a
date, you should also specify the weekday:
mysql> SELECT STR_TO_DATE('200442 Monday', '%X%V %W');
-> '2004-10-18'
SUBDATE(
,
date
,INTERVAL
expr
unit
)SUBDATE(
expr
,days
)
When invoked with the INTERVAL
form of the
second argument, SUBDATE()
is a
synonym for DATE_SUB()
. For
information on the INTERVAL
unit
argument, see the discussion
for DATE_ADD()
.
mysql>SELECT DATE_SUB('2008-01-02', INTERVAL 31 DAY);
-> '2007-12-02' mysql>SELECT SUBDATE('2008-01-02', INTERVAL 31 DAY);
-> '2007-12-02'
The second form enables the use of an integer value for
days
. In such cases, it is
interpreted as the number of days to be subtracted from the
date or datetime expression expr
.
mysql> SELECT SUBDATE('2008-01-02 12:00:00', 31);
-> '2007-12-02 12:00:00'
SUBTIME()
returns
expr1
–
expr2
expressed as a value in the
same format as expr1
.
expr1
is a time or datetime
expression, and expr2
is a time
expression.
mysql>SELECT SUBTIME('2007-12-31 23:59:59.999999','1 1:1:1.000002');
-> '2007-12-30 22:58:58.999997' mysql>SELECT SUBTIME('01:00:00.999999', '02:00:00.999998');
-> '-00:59:59.999999'
Returns the current date and time as a value in
'YYYY-MM-DD HH:MM:SS'
or
YYYYMMDDHHMMSS
format, depending on whether
the function is used in a string or numeric context.
SYSDATE()
returns the time at
which it executes. This differs from the behavior for
NOW()
, which returns a constant
time that indicates the time at which the statement began to
execute. (Within a stored function or trigger,
NOW()
returns the time at which
the function or triggering statement began to execute.)
mysql>SELECT NOW(), SLEEP(2), NOW();
+---------------------+----------+---------------------+ | NOW() | SLEEP(2) | NOW() | +---------------------+----------+---------------------+ | 2006-04-12 13:47:36 | 0 | 2006-04-12 13:47:36 | +---------------------+----------+---------------------+ mysql>SELECT SYSDATE(), SLEEP(2), SYSDATE();
+---------------------+----------+---------------------+ | SYSDATE() | SLEEP(2) | SYSDATE() | +---------------------+----------+---------------------+ | 2006-04-12 13:47:44 | 0 | 2006-04-12 13:47:46 | +---------------------+----------+---------------------+
In addition, the SET TIMESTAMP
statement
affects the value returned by
NOW()
but not by
SYSDATE()
. This means that
timestamp settings in the binary log have no effect on
invocations of SYSDATE()
.
Because SYSDATE()
can return
different values even within the same statement, and is not
affected by SET TIMESTAMP
, it is
nondeterministic and therefore unsafe for replication if
statement-based binary logging is used. If that is a problem,
you can use row-based logging.
Alternatively, you can use the
--sysdate-is-now
option to
cause SYSDATE()
to be an alias
for NOW()
. This works if the
option is used on both the master and the slave.
The nondeterministic nature of
SYSDATE()
also means that
indexes cannot be used for evaluating expressions that refer
to it.
Extracts the time part of the time or datetime expression
expr
and returns it as a string.
This function is unsafe for statement-based replication. In
MySQL 5.7, a warning is logged if you use this
function when binlog_format
is set to STATEMENT
. (Bug #47995)
mysql>SELECT TIME('2003-12-31 01:02:03');
-> '01:02:03' mysql>SELECT TIME('2003-12-31 01:02:03.000123');
-> '01:02:03.000123'
TIMEDIFF()
returns
expr1
–
expr2
expressed as a time value.
expr1
and
expr2
are time or date-and-time
expressions, but both must be of the same type.
The result returned by TIMEDIFF()
is
limited to the range allowed for
TIME
values. Alternatively, you
can use either of the functions
TIMESTAMPDIFF()
and
UNIX_TIMESTAMP()
, both of which
return integers.
mysql>SELECT TIMEDIFF('2000:01:01 00:00:00',
->'2000:01:01 00:00:00.000001');
-> '-00:00:00.000001' mysql>SELECT TIMEDIFF('2008-12-31 23:59:59.000001',
->'2008-12-30 01:01:01.000002');
-> '46:58:57.999999'
TIMESTAMP(
,
expr
)TIMESTAMP(
expr1
,expr2
)
With a single argument, this function returns the date or
datetime expression expr
as a
datetime value. With two arguments, it adds the time
expression expr2
to the date or
datetime expression expr1
and
returns the result as a datetime value.
mysql>SELECT TIMESTAMP('2003-12-31');
-> '2003-12-31 00:00:00' mysql>SELECT TIMESTAMP('2003-12-31 12:00:00','12:00:00');
-> '2004-01-01 00:00:00'
TIMESTAMPADD(
unit
,interval
,datetime_expr
)
Adds the integer expression
interval
to the date or datetime
expression datetime_expr
. The unit
for interval
is given by the
unit
argument, which should be one
of the following values: MICROSECOND
(microseconds), SECOND
,
MINUTE
, HOUR
,
DAY
, WEEK
,
MONTH
, QUARTER
, or
YEAR
.
The unit
value may be specified
using one of keywords as shown, or with a prefix of
SQL_TSI_
. For example,
DAY
and SQL_TSI_DAY
both
are legal.
mysql>SELECT TIMESTAMPADD(MINUTE,1,'2003-01-02');
-> '2003-01-02 00:01:00' mysql>SELECT TIMESTAMPADD(WEEK,1,'2003-01-02');
-> '2003-01-09'
TIMESTAMPDIFF(
unit
,datetime_expr1
,datetime_expr2
)
Returns datetime_expr2
–
datetime_expr1
, where
datetime_expr1
and
datetime_expr2
are date or datetime
expressions. One expression may be a date and the other a
datetime; a date value is treated as a datetime having the
time part '00:00:00'
where necessary. The
unit for the result (an integer) is given by the
unit
argument. The legal values for
unit
are the same as those listed
in the description of the
TIMESTAMPADD()
function.
mysql>SELECT TIMESTAMPDIFF(MONTH,'2003-02-01','2003-05-01');
-> 3 mysql>SELECT TIMESTAMPDIFF(YEAR,'2002-05-01','2001-01-01');
-> -1 mysql>SELECT TIMESTAMPDIFF(MINUTE,'2003-02-01','2003-05-01 12:05:55');
-> 128885
The order of the date or datetime arguments for this
function is the opposite of that used with the
TIMESTAMP()
function when
invoked with 2 arguments.
This is used like the
DATE_FORMAT()
function, but the
format
string may contain format
specifiers only for hours, minutes, seconds, and microseconds.
Other specifiers produce a NULL
value or
0
.
If the time
value contains an hour
part that is greater than 23
, the
%H
and %k
hour format
specifiers produce a value larger than the usual range of
0..23
. The other hour format specifiers
produce the hour value modulo 12.
mysql> SELECT TIME_FORMAT('100:00:00', '%H %k %h %I %l');
-> '100 100 04 04 4'
Returns the time
argument,
converted to seconds.
mysql>SELECT TIME_TO_SEC('22:23:00');
-> 80580 mysql>SELECT TIME_TO_SEC('00:39:38');
-> 2378
Given a date date
, returns a day
number (the number of days since year 0).
mysql>SELECT TO_DAYS(950501);
-> 728779 mysql>SELECT TO_DAYS('2007-10-07');
-> 733321
TO_DAYS()
is not intended for
use with values that precede the advent of the Gregorian
calendar (1582), because it does not take into account the
days that were lost when the calendar was changed. For dates
before 1582 (and possibly a later year in other locales),
results from this function are not reliable. See
Section 12.8, “What Calendar Is Used By MySQL?”, for details.
Remember that MySQL converts two-digit year values in dates to
four-digit form using the rules in
Section 11.3, “Date and Time Types”. For example,
'2008-10-07'
and
'08-10-07'
are seen as identical dates:
mysql> SELECT TO_DAYS('2008-10-07'), TO_DAYS('08-10-07');
-> 733687, 733687
In MySQL, the zero date is defined as
'0000-00-00'
, even though this date is
itself considered invalid. This means that, for
'0000-00-00'
and
'0000-01-01'
,
TO_DAYS()
returns the values
shown here:
mysql>SELECT TO_DAYS('0000-00-00');
+-----------------------+ | to_days('0000-00-00') | +-----------------------+ | NULL | +-----------------------+ 1 row in set, 1 warning (0.00 sec) mysql>SHOW WARNINGS;
+---------+------+----------------------------------------+ | Level | Code | Message | +---------+------+----------------------------------------+ | Warning | 1292 | Incorrect datetime value: '0000-00-00' | +---------+------+----------------------------------------+ 1 row in set (0.00 sec) mysql>SELECT TO_DAYS('0000-01-01');
+-----------------------+ | to_days('0000-01-01') | +-----------------------+ | 1 | +-----------------------+ 1 row in set (0.00 sec)
This is true whether or not the
ALLOW_INVALID_DATES
SQL
server mode is enabled.
Given a date or datetime expr
,
returns a the number of seconds since the year 0. If
expr
is not a valid date or
datetime value, returns NULL
.
mysql>SELECT TO_SECONDS(950501);
-> 62966505600 mysql>SELECT TO_SECONDS('2009-11-29');
-> 63426672000 mysql>SELECT TO_SECONDS('2009-11-29 13:43:32');
-> 63426721412 mysql>SELECT TO_SECONDS( NOW() );
-> 63426721458
Like TO_DAYS()
,
TO_SECONDS()
is not intended for use with
values that precede the advent of the Gregorian calendar
(1582), because it does not take into account the days that
were lost when the calendar was changed. For dates before 1582
(and possibly a later year in other locales), results from
this function are not reliable. See
Section 12.8, “What Calendar Is Used By MySQL?”, for details.
Like TO_DAYS()
,
TO_SECONDS()
, converts two-digit year
values in dates to four-digit form using the rules in
Section 11.3, “Date and Time Types”.
In MySQL, the zero date is defined as
'0000-00-00'
, even though this date is
itself considered invalid. This means that, for
'0000-00-00'
and
'0000-01-01'
,
TO_SECONDS()
returns the values
shown here:
mysql>SELECT TO_SECONDS('0000-00-00');
+--------------------------+ | TO_SECONDS('0000-00-00') | +--------------------------+ | NULL | +--------------------------+ 1 row in set, 1 warning (0.00 sec) mysql>SHOW WARNINGS;
+---------+------+----------------------------------------+ | Level | Code | Message | +---------+------+----------------------------------------+ | Warning | 1292 | Incorrect datetime value: '0000-00-00' | +---------+------+----------------------------------------+ 1 row in set (0.00 sec) mysql>SELECT TO_SECONDS('0000-01-01');
+--------------------------+ | TO_SECONDS('0000-01-01') | +--------------------------+ | 86400 | +--------------------------+ 1 row in set (0.00 sec)
This is true whether or not the
ALLOW_INVALID_DATES
SQL
server mode is enabled.
UNIX_TIMESTAMP()
,
UNIX_TIMESTAMP(
date
)
If called with no argument, returns a Unix timestamp (seconds
since '1970-01-01 00:00:00'
UTC) as an
unsigned integer. If
UNIX_TIMESTAMP()
is called with
a date
argument, it returns the
value of the argument as seconds since '1970-01-01
00:00:00'
UTC. date
may
be a DATE
string, a
DATETIME
string, a
TIMESTAMP
, or a number in the
format YYMMDD
or
YYYYMMDD
. The server interprets
date
as a value in the current time
zone and converts it to an internal value in UTC. Clients can
set their time zone as described in
Section 10.6, “MySQL Server Time Zone Support”.
mysql>SELECT UNIX_TIMESTAMP();
-> 1196440210 mysql>SELECT UNIX_TIMESTAMP('2007-11-30 10:30:19');
-> 1196440219
When UNIX_TIMESTAMP()
is used
on a TIMESTAMP
column, the
function returns the internal timestamp value directly, with
no implicit “string-to-Unix-timestamp”
conversion. If you pass an out-of-range date to
UNIX_TIMESTAMP()
, it returns
0
.
Note: If you use
UNIX_TIMESTAMP()
and
FROM_UNIXTIME()
to convert
between TIMESTAMP
values and
Unix timestamp values, the conversion is lossy because the
mapping is not one-to-one in both directions. For example, due
to conventions for local time zone changes, it is possible for
two UNIX_TIMESTAMP()
to map two
TIMESTAMP
values to the same
Unix timestamp value.
FROM_UNIXTIME()
will map that
value back to only one of the original
TIMESTAMP
values. Here is an
example, using TIMESTAMP
values
in the CET
time zone:
mysql>SELECT UNIX_TIMESTAMP('2005-03-27 03:00:00');
+---------------------------------------+ | UNIX_TIMESTAMP('2005-03-27 03:00:00') | +---------------------------------------+ | 1111885200 | +---------------------------------------+ mysql>SELECT UNIX_TIMESTAMP('2005-03-27 02:00:00');
+---------------------------------------+ | UNIX_TIMESTAMP('2005-03-27 02:00:00') | +---------------------------------------+ | 1111885200 | +---------------------------------------+ mysql>SELECT FROM_UNIXTIME(1111885200);
+---------------------------+ | FROM_UNIXTIME(1111885200) | +---------------------------+ | 2005-03-27 03:00:00 | +---------------------------+
If you want to subtract
UNIX_TIMESTAMP()
columns, you
might want to cast the result to signed integers. See
Section 12.10, “Cast Functions and Operators”.
Returns the current UTC date as a value in
'YYYY-MM-DD'
or YYYYMMDD
format, depending on whether the function is used in a string
or numeric context.
mysql> SELECT UTC_DATE(), UTC_DATE() + 0;
-> '2003-08-14', 20030814
Returns the current UTC time as a value in
'HH:MM:SS'
or HHMMSS
format, depending on whether the function is used in a string
or numeric context.
If the fsp
argument is given to
specify a fractional seconds precision from 0 to 6, the return
value includes a fractional seconds part of that many digits.
mysql> SELECT UTC_TIME(), UTC_TIME() + 0;
-> '18:07:53', 180753.000000
UTC_TIMESTAMP
,
UTC_TIMESTAMP([
fsp
])
Returns the current UTC date and time as a value in
'YYYY-MM-DD HH:MM:SS'
or
YYYYMMDDHHMMSS
format, depending on whether
the function is used in a string or numeric context.
If the fsp
argument is given to
specify a fractional seconds precision from 0 to 6, the return
value includes a fractional seconds part of that many digits.
mysql> SELECT UTC_TIMESTAMP(), UTC_TIMESTAMP() + 0;
-> '2003-08-14 18:08:04', 20030814180804.000000
This function returns the week number for
date
. The two-argument form of
WEEK()
enables you to specify
whether the week starts on Sunday or Monday and whether the
return value should be in the range from 0
to 53
or from 1
to
53
. If the mode
argument is omitted, the value of the
default_week_format
system
variable is used. See
Section 5.1.4, “Server System Variables”.
The following table describes how the
mode
argument works.
Mode | First day of week | Range | Week 1 is the first week … |
---|---|---|---|
0 | Sunday | 0-53 | with a Sunday in this year |
1 | Monday | 0-53 | with more than 3 days this year |
2 | Sunday | 1-53 | with a Sunday in this year |
3 | Monday | 1-53 | with more than 3 days this year |
4 | Sunday | 0-53 | with more than 3 days this year |
5 | Monday | 0-53 | with a Monday in this year |
6 | Sunday | 1-53 | with more than 3 days this year |
7 | Monday | 1-53 | with a Monday in this year |
mysql>SELECT WEEK('2008-02-20');
-> 7 mysql>SELECT WEEK('2008-02-20',0);
-> 7 mysql>SELECT WEEK('2008-02-20',1);
-> 8 mysql>SELECT WEEK('2008-12-31',1);
-> 53
Note that if a date falls in the last week of the previous
year, MySQL returns 0
if you do not use
2
, 3
,
6
, or 7
as the optional
mode
argument:
mysql> SELECT YEAR('2000-01-01'), WEEK('2000-01-01',0);
-> 2000, 0
One might argue that MySQL should return 52
for the WEEK()
function,
because the given date actually occurs in the 52nd week of
1999. We decided to return 0
instead
because we want the function to return “the week number
in the given year.” This makes use of the
WEEK()
function reliable when
combined with other functions that extract a date part from a
date.
If you would prefer the result to be evaluated with respect to
the year that contains the first day of the week for the given
date, use 0
, 2
,
5
, or 7
as the optional
mode
argument.
mysql> SELECT WEEK('2000-01-01',2);
-> 52
Alternatively, use the
YEARWEEK()
function:
mysql>SELECT YEARWEEK('2000-01-01');
-> 199952 mysql>SELECT MID(YEARWEEK('2000-01-01'),5,2);
-> '52'
Returns the weekday index for date
(0
= Monday, 1
=
Tuesday, … 6
= Sunday).
mysql>SELECT WEEKDAY('2008-02-03 22:23:00');
-> 6 mysql>SELECT WEEKDAY('2007-11-06');
-> 1
Returns the calendar week of the date as a number in the range
from 1
to 53
.
WEEKOFYEAR()
is a compatibility
function that is equivalent to
WEEK(
.
date
,3)
mysql> SELECT WEEKOFYEAR('2008-02-20');
-> 8
Returns the year for date
, in the
range 1000
to 9999
, or
0
for the “zero” date.
mysql> SELECT YEAR('1987-01-01');
-> 1987
YEARWEEK(
,
date
)YEARWEEK(
date
,mode
)
Returns year and week for a date. The
mode
argument works exactly like
the mode
argument to
WEEK()
. The year in the result
may be different from the year in the date argument for the
first and the last week of the year.
mysql> SELECT YEARWEEK('1987-01-01');
-> 198653
Note that the week number is different from what the
WEEK()
function would return
(0
) for optional arguments
0
or 1
, as
WEEK()
then returns the week in
the context of the given year.
MySQL uses what is known as a proleptic Gregorian calendar.
Every country that has switched from the Julian to the Gregorian calendar has had to discard at least ten days during the switch. To see how this works, consider the month of October 1582, when the first Julian-to-Gregorian switch occurred.
Monday | Tuesday | Wednesday | Thursday | Friday | Saturday | Sunday |
---|---|---|---|---|---|---|
1 | 2 | 3 | 4 | 15 | 16 | 17 |
18 | 19 | 20 | 21 | 22 | 23 | 24 |
25 | 26 | 27 | 28 | 29 | 30 | 31 |
There are no dates between October 4 and October 15. This discontinuity is called the cutover. Any dates before the cutover are Julian, and any dates following the cutover are Gregorian. Dates during a cutover are nonexistent.
A calendar applied to dates when it was not actually in use is
called proleptic. Thus, if we assume there
was never a cutover and Gregorian rules always rule, we have a
proleptic Gregorian calendar. This is what is used by MySQL, as is
required by standard SQL. For this reason, dates prior to the
cutover stored as MySQL DATE
or
DATETIME
values must be adjusted to
compensate for the difference. It is important to realize that the
cutover did not occur at the same time in all countries, and that
the later it happened, the more days were lost. For example, in
Great Britain, it took place in 1752, when Wednesday September 2
was followed by Thursday September 14. Russia remained on the
Julian calendar until 1918, losing 13 days in the process, and
what is popularly referred to as its “October
Revolution” occurred in November according to the Gregorian
calendar.
MATCH
(
col1
,col2
,...)
AGAINST (expr
[search_modifier
])
search_modifier:
{
IN NATURAL LANGUAGE MODE
| IN NATURAL LANGUAGE MODE WITH QUERY EXPANSION
| IN BOOLEAN MODE
| WITH QUERY EXPANSION
}
MySQL has support for full-text indexing and searching:
A full-text index in MySQL is an index of type
FULLTEXT
.
Full-text indexes can be used only with
InnoDB
or
MyISAM
tables, and can be created
only for CHAR
,
VARCHAR
, or
TEXT
columns.
A FULLTEXT
index definition can be given in
the CREATE TABLE
statement when
a table is created, or added later using
ALTER TABLE
or
CREATE INDEX
.
For large data sets, it is much faster to load your data into
a table that has no FULLTEXT
index and then
create the index after that, than to load data into a table
that has an existing FULLTEXT
index.
Full-text searching is performed using
MATCH() ... AGAINST
syntax.
MATCH()
takes a comma-separated
list that names the columns to be searched.
AGAINST
takes a string to search for, and an
optional modifier that indicates what type of search to perform.
The search string must be a string value that is constant during
query evaluation. This rules out, for example, a table column
because that can differ for each row.
There are three types of full-text searches:
A natural language search interprets the search string as a
phrase in natural human language (a phrase in free text).
There are no special operators. The stopword list applies,
controlled by
innodb_ft_enable_stopword
,
innodb_ft_server_stopword_table
,
and
innodb_ft_user_stopword_table
for InnoDB
search indexes, and
ft_stopword_file
for
MyISAM
ones.
Full-text searches are natural language searches if the
IN NATURAL LANGUAGE MODE
modifier is given
or if no modifier is given. For more information, see
Section 12.9.1, “Natural Language Full-Text Searches”.
A boolean search interprets the search string using the rules
of a special query language. The string contains the words to
search for. It can also contain operators that specify
requirements such that a word must be present or absent in
matching rows, or that it should be weighted higher or lower
than usual. Certain common words (stopwords) are omitted from
the search index and do not match if present in the search
string. The IN BOOLEAN MODE
modifier
specifies a boolean search. For more information, see
Section 12.9.2, “Boolean Full-Text Searches”.
A query expansion search is a modification of a natural
language search. The search string is used to perform a
natural language search. Then words from the most relevant
rows returned by the search are added to the search string and
the search is done again. The query returns the rows from the
second search. The IN NATURAL LANGUAGE MODE WITH
QUERY EXPANSION
or WITH QUERY
EXPANSION
modifier specifies a query expansion
search. For more information, see
Section 12.9.3, “Full-Text Searches with Query Expansion”.
For information about FULLTEXT
query
performance, see Section 8.3.4, “Column Indexes”.
For more technical details about processing for
InnoDB
FULLTEXT
indexes, see
Section 14.2.3.13.3, “FULLTEXT
Indexes”.
Constraints on full-text searching are listed in Section 12.9.5, “Full-Text Restrictions”.
The myisam_ftdump utility dumps the contents of
a MyISAM
full-text index. This may be helpful
for debugging full-text queries. See
Section 4.6.2, “myisam_ftdump — Display Full-Text Index information”.
By default or with the IN NATURAL LANGUAGE
MODE
modifier, the
MATCH()
function performs a
natural language search for a string against a text
collection. A collection is a set of one or more
columns included in a FULLTEXT
index. The
search string is given as the argument to
AGAINST()
. For each row in the table,
MATCH()
returns a relevance
value; that is, a similarity measure between the search string
and the text in that row in the columns named in the
MATCH()
list.
mysql>CREATE TABLE articles (
id INT UNSIGNED AUTO_INCREMENT NOT NULL PRIMARY KEY,
title VARCHAR(200),
body TEXT,
FULLTEXT (title,body)
) ENGINE=InnoDB;
Query OK, 0 rows affected (0.00 sec) mysql>INSERT INTO articles (title,body) VALUES
('MySQL Tutorial','DBMS stands for DataBase ...'),
('How To Use MySQL Well','After you went through a ...'),
('Optimizing MySQL','In this tutorial we will show ...'),
('1001 MySQL Tricks','1. Never run mysqld as root. 2. ...'),
('MySQL vs. YourSQL','In the following database comparison ...'),
('MySQL Security','When configured properly, MySQL ...');
Query OK, 6 rows affected (0.00 sec) Records: 6 Duplicates: 0 Warnings: 0 mysql>SELECT * FROM articles
WHERE MATCH (title,body)
AGAINST ('database' IN NATURAL LANGUAGE MODE);
+----+-------------------+------------------------------------------+ | id | title | body | +----+-------------------+------------------------------------------+ | 1 | MySQL Tutorial | DBMS stands for DataBase ... | | 5 | MySQL vs. YourSQL | In the following database comparison ... | +----+-------------------+------------------------------------------+ 2 rows in set (0.00 sec)
By default, the search is performed in case-insensitive fashion.
To perform a case-sensitive full-text search, use a binary
collation for the indexed columns. For example, a column that
uses the latin1
character set of can be
assigned a collation of latin1_bin
to make it
case sensitive for full-text searches.
When MATCH()
is used in a
WHERE
clause, as in the example shown
earlier, the rows returned are automatically sorted with the
highest relevance first. Relevance values are nonnegative
floating-point numbers. Zero relevance means no similarity.
Relevance is computed based on the number of words in the row,
the number of unique words in that row, the total number of
words in the collection, and the number of documents (rows) that
contain a particular word.
To simply count matches, you could use a query like this:
mysql>SELECT COUNT(*) FROM articles
WHERE MATCH (title,body)
AGAINST ('database' IN NATURAL LANGUAGE MODE);
+----------+ | COUNT(*) | +----------+ | 2 | +----------+ 1 row in set (0.00 sec)
You might find it quicker to rewrite the query as follows:
mysql>SELECT
COUNT(IF(MATCH (title,body) AGAINST ('database' IN NATURAL LANGUAGE MODE), 1, NULL))
AS count
FROM articles;
+-------+ | count | +-------+ | 2 | +-------+ 1 row in set (0.03 sec)
The first query does some extra work (sorting the results by
relevance) but also can use an index lookup based on the
WHERE
clause. The index lookup might make the
first query faster if the search matches few rows. The second
query performs a full table scan, which might be faster than the
index lookup if the search term was present in most rows.
For natural-language full-text searches, the columns named in
the MATCH()
function must be the
same columns included in some FULLTEXT
index
in your table. For the preceding query, note that the columns
named in the MATCH()
function
(title
and body
) are the
same as those named in the definition of the
article
table's FULLTEXT
index. To search the title
or
body
separately, you would create separate
FULLTEXT
indexes for each column.
You can also perform a boolean search or a search with query expansion. These search types are described in Section 12.9.2, “Boolean Full-Text Searches”, and Section 12.9.3, “Full-Text Searches with Query Expansion”.
A full-text search that uses an index can name columns only from
a single table in the MATCH()
clause because an index cannot span multiple tables. For
MyISAM
tables, a boolean search can be done
in the absence of an index (albeit more slowly), in which case
it is possible to name columns from multiple tables.
The preceding example is a basic illustration that shows how to
use the MATCH()
function where
rows are returned in order of decreasing relevance. The next
example shows how to retrieve the relevance values explicitly.
Returned rows are not ordered because the
SELECT
statement includes neither
WHERE
nor ORDER BY
clauses:
mysql>SELECT id, MATCH (title,body)
AGAINST ('Tutorial' IN NATURAL LANGUAGE MODE) AS score
FROM articles;
+----+---------------------+ | id | score | +----+---------------------+ | 1 | 0.22764469683170319 | | 2 | 0 | | 3 | 0.22764469683170319 | | 4 | 0 | | 5 | 0 | | 6 | 0 | +----+---------------------+ 6 rows in set (0.00 sec)
The following example is more complex. The query returns the
relevance values and it also sorts the rows in order of
decreasing relevance. To achieve this result, specify
MATCH()
twice: once in the
SELECT
list and once in the
WHERE
clause. This causes no additional
overhead, because the MySQL optimizer notices that the two
MATCH()
calls are identical and
invokes the full-text search code only once.
mysql>SELECT id, body, MATCH (title,body) AGAINST
('Security implications of running MySQL as root'
IN NATURAL LANGUAGE MODE) AS score
FROM articles WHERE MATCH (title,body) AGAINST
('Security implications of running MySQL as root'
IN NATURAL LANGUAGE MODE);
+----+-----------------------+------------------------------------------+ | id | title | body | +----+-----------------------+------------------------------------------+ | 5 | MySQL vs. YourSQL | In the following database comparison ... | | 1 | MySQL Tutorial | DBMS stands for DataBase ... | | 3 | Optimizing MySQL | In this tutorial we will show ... | | 6 | MySQL Security | When configured properly, MySQL ... | | 2 | How To Use MySQL Well | After you went through a ... | | 4 | 1001 MySQL Tricks | 1. Never run mysqld as root. 2. ... | +----+-----------------------+------------------------------------------+ 6 rows in set (0.00 sec)
The MySQL FULLTEXT
implementation regards any
sequence of true word characters (letters, digits, and
underscores) as a word. That sequence may also contain
apostrophes (“'
”), but not more
than one in a row. This means that aaa'bbb
is
regarded as one word, but aaa''bbb
is
regarded as two words. Apostrophes at the beginning or the end
of a word are stripped by the FULLTEXT
parser; 'aaa'bbb'
would be parsed as
aaa'bbb
.
The FULLTEXT
parser determines where words
start and end by looking for certain delimiter characters; for
example, “
” (space),
“,
” (comma), and
“.
” (period). If words are not
separated by delimiters (as in, for example, Chinese), the
FULLTEXT
parser cannot determine where a word
begins or ends. To be able to add words or other indexed terms
in such languages to a FULLTEXT
index, you
must preprocess them so that they are separated by some
arbitrary delimiter such as “"
”.
In MySQL 5.7, it is possible to write a plugin that
replaces the built-in full-text parser. For details, see
Section 22.2, “The MySQL Plugin API”. For example parser plugin source
code, see the plugin/fulltext
directory of
a MySQL source distribution.
Some words are ignored in full-text searches:
Any word that is too short is ignored. The default minimum
length of words that are found by full-text searches is
three characters for InnoDB
search
indexes, or four characters for MyISAM
.
You can control the cutoff by setting a configuration option
before creating the index:
innodb_ft_min_token_size
configuration option for InnoDB
search
indexes, or ft_min_word_len
for MyISAM
.
Words in the stopword list are ignored. A stopword is a word
such as “the” or “some” that is so
common that it is considered to have zero semantic value.
There is a built-in stopword list, but it can be overridden
by a user-defined list. The stopword lists and related
configuration options are different for
InnoDB
search indexes and
MyISAM
ones. Stopword processing is
controlled by the configuration options
innodb_ft_enable_stopword
,
innodb_ft_server_stopword_table
,
and
innodb_ft_user_stopword_table
for InnoDB
search indexes, and
ft_stopword_file
for
MyISAM
ones.
The default stopword lists are shown in Section 12.9.4, “Full-Text Stopwords”. The default minimum word length and stopword list can be changed as described in Section 12.9.6, “Fine-Tuning MySQL Full-Text Search”.
Every correct word in the collection and in the query is weighted according to its significance in the collection or query. Thus, a word that is present in many documents has a lower weight, because it has lower semantic value in this particular collection. Conversely, if the word is rare, it receives a higher weight. The weights of the words are combined to compute the relevance of the row. This technique works best with large collections.
For very small tables, word distribution does not adequately
reflect their semantic value, and this model may sometimes
produce bizarre results for search indexes on
MyISAM
tables. For example, although the
word “MySQL” is present in every row of the
articles
table shown earlier, a search for
the word in a MyISAM
search index produces
no results:
mysql>SELECT * FROM articles
WHERE MATCH (title,body)
AGAINST ('MySQL' IN NATURAL LANGUAGE MODE);
Empty set (0.00 sec)
The search result is empty because the word “MySQL” is present in at least 50% of the rows, and so is effectively treated as a stopword. This filtering technique is more suitable for large data sets, where you might not want the result set to return every second row from a 1GB table, than for small data sets where it might cause poor results for popular terms.
The 50% threshold can surprise you when you first try
full-text searching to see how it works, and makes
InnoDB
tables more suited to
experimentation with full-text searches. If you create a
MyISAM
table and insert only one or two
rows of text into it, every word in the text occurs in at
least 50% of the rows. As a result, no search returns any
results until the table contains more rows. Users who need to
bypass the 50% limitation can build search indexes on
InnoDB
tables, or the boolean search mode
explained in Section 12.9.2, “Boolean Full-Text Searches”.
MySQL can perform boolean full-text searches using the
IN BOOLEAN MODE
modifier. With this modifier,
certain characters have special meaning at the beginning or end
of words in the search string. In the following query, the
+
and -
operators indicate
that a word must be present or absent, respectively, for a match
to occur. Thus, the query retrieves all the rows that contain
the word “MySQL” but that do
not contain the word
“YourSQL”:
mysql>SELECT * FROM articles WHERE MATCH (title,body)
AGAINST ('+MySQL -YourSQL' IN BOOLEAN MODE);
+----+-----------------------+-------------------------------------+ | id | title | body | +----+-----------------------+-------------------------------------+ | 1 | MySQL Tutorial | DBMS stands for DataBase ... | | 2 | How To Use MySQL Well | After you went through a ... | | 3 | Optimizing MySQL | In this tutorial we will show ... | | 4 | 1001 MySQL Tricks | 1. Never run mysqld as root. 2. ... | | 6 | MySQL Security | When configured properly, MySQL ... | +----+-----------------------+-------------------------------------+
In implementing this feature, MySQL uses what is sometimes referred to as implied Boolean logic, in which
+
stands for AND
-
stands for NOT
[no operator] implies
OR
Boolean full-text searches have these characteristics:
They do not use the 50% threshold that applies to
MyISAM
search indexes.
They do not automatically sort rows in order of decreasing relevance.
Boolean queries against a MyISAM
search
index can work even without a FULLTEXT
index, although a search executed in this fashion would be
quite slow. InnoDB
tables require a
FULLTEXT
index on all columns of the
MATCH()
expression to perform
boolean queries.
The minimum and maximum word length full-text parameters
apply:
innodb_ft_min_token_size
and
innodb_ft_max_token_size
for InnoDB
search indexes, and
ft_min_word_len
and
ft_max_word_len
for
MyISAM
ones.
The stopword list applies, controlled by
innodb_ft_enable_stopword
,
innodb_ft_server_stopword_table
,
and
innodb_ft_user_stopword_table
for InnoDB
search indexes, and
ft_stopword_file
for
MyISAM
ones.
InnoDB
full-text search does not support
the use of multiple operators on a single search word, as in
this example: '++apple'.
Use of multiple
operators on a single search word returns a syntax error to
standard out. MyISAM full-text search will successfully
process the same search ignoring all operators except for
the operator immediately adjacent to the search word.
The boolean full-text search capability supports the following operators:
+
A leading plus sign indicates that this word must be present in each row that is returned.
-
A leading minus sign indicates that this word must not be present in any of the rows that are returned.
Note: The -
operator acts only to exclude
rows that are otherwise matched by other search terms. Thus,
a boolean-mode search that contains only terms preceded by
-
returns an empty result. It does not
return “all rows except those containing any of the
excluded terms.”
(no operator)
By default (when neither +
nor
-
is specified), the word is optional,
but the rows that contain it are rated higher. This mimics
the behavior of MATCH() ...
AGAINST()
without the IN BOOLEAN
MODE
modifier.
@
distance
This operator works on InnoDB
tables
only. It tests whether two or more words all start within a
specified distance from each other, measured in words.
Specify the search words within a double-quoted string
immediately before the
@
operator, for example, distance
MATCH(col1) AGAINST('"word1
word2 word3" @8' IN BOOLEAN MODE)
> <
These two operators are used to change a word's contribution
to the relevance value that is assigned to a row. The
>
operator increases the contribution
and the <
operator decreases it. See
the example following this list.
( )
Parentheses group words into subexpressions. Parenthesized groups can be nested.
~
A leading tilde acts as a negation operator, causing the
word's contribution to the row's relevance to be negative.
This is useful for marking “noise” words. A row
containing such a word is rated lower than others, but is
not excluded altogether, as it would be with the
-
operator.
*
The asterisk serves as the truncation (or wildcard)
operator. Unlike the other operators, it is
appended to the word to be affected.
Words match if they begin with the word preceding the
*
operator.
If a word is specified with the truncation operator, it is
not stripped from a boolean query, even if it is too short
or a stopword. Whether a word is too short is determined
from the
innodb_ft_min_token_size
setting for InnoDB
tables, or
ft_min_word_len
for
MyISAM
tables. The wildcarded word is
considered as a prefix that must be present at the start of
one or more words. If the minimum word length is 4, a search
for '+
could return fewer rows than a search for
word
+the*''+
,
because the second query ignores the too-short search term
word
+the'the
.
"
A phrase that is enclosed within double quote
(“"
”) characters matches
only rows that contain the phrase literally, as it
was typed. The full-text engine splits the phrase
into words and performs a search in the
FULLTEXT
index for the words. Nonword
characters need not be matched exactly: Phrase searching
requires only that matches contain exactly the same words as
the phrase and in the same order. For example,
"test phrase"
matches "test,
phrase"
.
If the phrase contains no words that are in the index, the result is empty. The words might not be in the index because of a combination of factors: if they do not exist in the text, are stopwords, or are shorter than the minimum length of indexed words.
The following examples demonstrate some search strings that use boolean full-text operators:
'apple banana'
Find rows that contain at least one of the two words.
'+apple +juice'
Find rows that contain both words.
'+apple macintosh'
Find rows that contain the word “apple”, but rank rows higher if they also contain “macintosh”.
'+apple -macintosh'
Find rows that contain the word “apple” but not “macintosh”.
'+apple ~macintosh'
Find rows that contain the word “apple”, but if
the row also contains the word “macintosh”,
rate it lower than if row does not. This is
“softer” than a search for '+apple
-macintosh'
, for which the presence of
“macintosh” causes the row not to be returned
at all.
'+apple +(>turnover <strudel)'
Find rows that contain the words “apple” and “turnover”, or “apple” and “strudel” (in any order), but rank “apple turnover” higher than “apple strudel”.
'apple*'
Find rows that contain words such as “apple”, “apples”, “applesauce”, or “applet”.
'"some words"'
Find rows that contain the exact phrase “some
words” (for example, rows that contain “some
words of wisdom” but not “some noise
words”). Note that the
“"
” characters that enclose
the phrase are operator characters that delimit the phrase.
They are not the quotation marks that enclose the search
string itself.
Full-text search supports query expansion (and in particular, its variant “blind query expansion”). This is generally useful when a search phrase is too short, which often means that the user is relying on implied knowledge that the full-text search engine lacks. For example, a user searching for “database” may really mean that “MySQL”, “Oracle”, “DB2”, and “RDBMS” all are phrases that should match “databases” and should be returned, too. This is implied knowledge.
Blind query expansion (also known as automatic relevance
feedback) is enabled by adding WITH QUERY
EXPANSION
or IN NATURAL LANGUAGE MODE WITH
QUERY EXPANSION
following the search phrase. It works
by performing the search twice, where the search phrase for the
second search is the original search phrase concatenated with
the few most highly relevant documents from the first search.
Thus, if one of these documents contains the word
“databases” and the word “MySQL”, the
second search finds the documents that contain the word
“MySQL” even if they do not contain the word
“database”. The following example shows this
difference:
mysql>SELECT * FROM articles
WHERE MATCH (title,body)
AGAINST ('database' IN NATURAL LANGUAGE MODE);
+----+-------------------+------------------------------------------+ | id | title | body | +----+-------------------+------------------------------------------+ | 1 | MySQL Tutorial | DBMS stands for DataBase ... | | 5 | MySQL vs. YourSQL | In the following database comparison ... | +----+-------------------+------------------------------------------+ 2 rows in set (0.00 sec) mysql>SELECT * FROM articles
WHERE MATCH (title,body)
AGAINST ('database' WITH QUERY EXPANSION);
+----+-----------------------+------------------------------------------+ | id | title | body | +----+-----------------------+------------------------------------------+ | 5 | MySQL vs. YourSQL | In the following database comparison ... | | 1 | MySQL Tutorial | DBMS stands for DataBase ... | | 3 | Optimizing MySQL | In this tutorial we will show ... | | 6 | MySQL Security | When configured properly, MySQL ... | | 2 | How To Use MySQL Well | After you went through a ... | | 4 | 1001 MySQL Tricks | 1. Never run mysqld as root. 2. ... | +----+-----------------------+------------------------------------------+ 6 rows in set (0.00 sec)
Another example could be searching for books by Georges Simenon about Maigret, when a user is not sure how to spell “Maigret”. A search for “Megre and the reluctant witnesses” finds only “Maigret and the Reluctant Witnesses” without query expansion. A search with query expansion finds all books with the word “Maigret” on the second pass.
Because blind query expansion tends to increase noise significantly by returning nonrelevant documents, use it only when a search phrase is short.
The stopword list is loaded and searched for full-text queries
using the server character set and collation (the values of the
character_set_server
and
collation_server
system variables). False
hits or misses might occur for stopword lookups if the stopword
file or columns used for full-text indexing or searches have a
character set or collation different from
character_set_server
or
collation_server
.
Case sensitivity of stopword lookups depends on the server
collation. For example, lookups are case insensitive if the
collation is latin1_swedish_ci
, whereas
lookups are case sensitive if the collation is
latin1_general_cs
or
latin1_bin
.
InnoDB
Search Indexes
InnoDB
has a relatively short list of default
stopwords, because documents from technical, literary, and so on
sources often use short words as keywords or in significant
phrases. For example, you might search for “to be or not
to be” and expect to get a sensible result, rather than
having all those words ignored.
To see the list, query the table
information_schema.innodb_ft_default_stopword
.
To define your own stopword list used for all
InnoDB
tables, define a table with the same
structure as
innodb_ft_default_stopword
, fill it
with the desired stopwords, and set the value of the
innodb_ft_server_stopword_table
option to a value of the form
before creating the search index. To create special stopword
lists on a table-by-table basis, define other tables to hold
these lists and specify the appropriate one in the
db_name
/table_name
innodb_ft_user_stopword_table
option before creating the search index.
MyISAM
Search Indexes
In MySQL 5.7, the stopword file is loaded and
searched using latin1
if
character_set_server
is
ucs2
, utf16
,
utf16le
, or utf32
.
The following table shows the default list of stopwords for
MyISAM
search indexes. In a MySQL source
distribution, you can find this list in the
storage/myisam/ft_static.c
file.
a's | able | about | above | according |
accordingly | across | actually | after | afterwards |
again | against | ain't | all | allow |
allows | almost | alone | along | already |
also | although | always | am | among |
amongst | an | and | another | any |
anybody | anyhow | anyone | anything | anyway |
anyways | anywhere | apart | appear | appreciate |
appropriate | are | aren't | around | as |
aside | ask | asking | associated | at |
available | away | awfully | be | became |
because | become | becomes | becoming | been |
before | beforehand | behind | being | believe |
below | beside | besides | best | better |
between | beyond | both | brief | but |
by | c'mon | c's | came | can |
can't | cannot | cant | cause | causes |
certain | certainly | changes | clearly | co |
com | come | comes | concerning | consequently |
consider | considering | contain | containing | contains |
corresponding | could | couldn't | course | currently |
definitely | described | despite | did | didn't |
different | do | does | doesn't | doing |
don't | done | down | downwards | during |
each | edu | eg | eight | either |
else | elsewhere | enough | entirely | especially |
et | etc | even | ever | every |
everybody | everyone | everything | everywhere | ex |
exactly | example | except | far | few |
fifth | first | five | followed | following |
follows | for | former | formerly | forth |
four | from | further | furthermore | get |
gets | getting | given | gives | go |
goes | going | gone | got | gotten |
greetings | had | hadn't | happens | hardly |
has | hasn't | have | haven't | having |
he | he's | hello | help | hence |
her | here | here's | hereafter | hereby |
herein | hereupon | hers | herself | hi |
him | himself | his | hither | hopefully |
how | howbeit | however | i'd | i'll |
i'm | i've | ie | if | ignored |
immediate | in | inasmuch | inc | indeed |
indicate | indicated | indicates | inner | insofar |
instead | into | inward | is | isn't |
it | it'd | it'll | it's | its |
itself | just | keep | keeps | kept |
know | known | knows | last | lately |
later | latter | latterly | least | less |
lest | let | let's | like | liked |
likely | little | look | looking | looks |
ltd | mainly | many | may | maybe |
me | mean | meanwhile | merely | might |
more | moreover | most | mostly | much |
must | my | myself | name | namely |
nd | near | nearly | necessary | need |
needs | neither | never | nevertheless | new |
next | nine | no | nobody | non |
none | noone | nor | normally | not |
nothing | novel | now | nowhere | obviously |
of | off | often | oh | ok |
okay | old | on | once | one |
ones | only | onto | or | other |
others | otherwise | ought | our | ours |
ourselves | out | outside | over | overall |
own | particular | particularly | per | perhaps |
placed | please | plus | possible | presumably |
probably | provides | que | quite | qv |
rather | rd | re | really | reasonably |
regarding | regardless | regards | relatively | respectively |
right | said | same | saw | say |
saying | says | second | secondly | see |
seeing | seem | seemed | seeming | seems |
seen | self | selves | sensible | sent |
serious | seriously | seven | several | shall |
she | should | shouldn't | since | six |
so | some | somebody | somehow | someone |
something | sometime | sometimes | somewhat | somewhere |
soon | sorry | specified | specify | specifying |
still | sub | such | sup | sure |
t's | take | taken | tell | tends |
th | than | thank | thanks | thanx |
that | that's | thats | the | their |
theirs | them | themselves | then | thence |
there | there's | thereafter | thereby | therefore |
therein | theres | thereupon | these | they |
they'd | they'll | they're | they've | think |
third | this | thorough | thoroughly | those |
though | three | through | throughout | thru |
thus | to | together | too | took |
toward | towards | tried | tries | truly |
try | trying | twice | two | un |
under | unfortunately | unless | unlikely | until |
unto | up | upon | us | use |
used | useful | uses | using | usually |
value | various | very | via | viz |
vs | want | wants | was | wasn't |
way | we | we'd | we'll | we're |
we've | welcome | well | went | were |
weren't | what | what's | whatever | when |
whence | whenever | where | where's | whereafter |
whereas | whereby | wherein | whereupon | wherever |
whether | which | while | whither | who |
who's | whoever | whole | whom | whose |
why | will | willing | wish | with |
within | without | won't | wonder | would |
wouldn't | yes | yet | you | you'd |
you'll | you're | you've | your | yours |
yourself | yourselves | zero |
Full-text searches are supported for
InnoDB
and
MyISAM
tables only.
Full-text searches are not supported for partitioned tables. See Section 17.6, “Restrictions and Limitations on Partitioning”.
Full-text searches can be used with most multi-byte
character sets. The exception is that for Unicode, the
utf8
character set can be used, but not
the ucs2
character set. Although
FULLTEXT
indexes on
ucs2
columns cannot be used, you can
perform IN BOOLEAN MODE
searches on a
ucs2
column that has no such index.
The remarks for utf8
also apply to
utf8mb4
, and the remarks for
ucs2
also apply to
utf16
, utf16le
, and
utf32
.
Ideographic languages such as Chinese and Japanese do not
have word delimiters. Therefore, the
FULLTEXT
parser cannot
determine where words begin and end in these and other such
languages. The implications of this and some
workarounds for the problem are described in
Section 12.9, “Full-Text Search Functions”.
Although the use of multiple character sets within a single
table is supported, all columns in a
FULLTEXT
index must use the same
character set and collation.
The MATCH()
column list must
match exactly the column list in some
FULLTEXT
index definition for the table,
unless this MATCH()
is
IN BOOLEAN MODE
on a
MyISAM
table. For
MyISAM
tables, boolean-mode searches can
be done on nonindexed columns, although they are likely to
be slow.
The argument to AGAINST()
must be a
string value that is constant during query evaluation. This
rules out, for example, a table column because that can
differ for each row.
Index hints are more limited for FULLTEXT
searches than for non-FULLTEXT
searches.
See Section 13.2.9.3, “Index Hint Syntax”.
MySQL's full-text search capability has few user-tunable parameters. You can exert more control over full-text searching behavior if you have a MySQL source distribution because some changes require source code modifications. See Section 2.9, “Installing MySQL from Source”.
Note that full-text search is carefully tuned for effectiveness. Modifying the default behavior in most cases can actually decrease effectiveness. Do not alter the MySQL sources unless you know what you are doing.
Most full-text variables described in this section must be set at server startup time. A server restart is required to change them; they cannot be modified while the server is running.
Some variable changes require that you rebuild the
FULLTEXT
indexes in your tables. Instructions
for doing so are given later in this section.
The minimum and maximum lengths of words to be indexed are
defined by the
innodb_ft_min_token_size
and
innodb_ft_max_token_size
for InnoDB
search indexes, and
ft_min_word_len
and
ft_max_word_len
for
MyISAM
ones. After changing any of these
options, rebuild your FULLTEXT
indexes
for the change to take effect. For example, to make
two-character words searchable, you could put the following
lines in an option file:
[mysqld] innodb_ft_min_token_size=2 ft_min_word_len=2
Then restart the server and rebuild your
FULLTEXT
indexes. For
MyISAM
tables, note particularly the
remarks regarding myisamchk in the
instructions following this list.
To override the default stopword list, set the
ft_stopword_file
system
variable. (See Section 5.1.4, “Server System Variables”.)
The variable value should be the path name of the file
containing the stopword list, or the empty string to disable
stopword filtering. The server looks for the file in the
data directory unless an absolute path name is given to
specify a different directory. After changing the value of
this variable or the contents of the stopword file, restart
the server and rebuild your FULLTEXT
indexes.
The stopword list is free-form, separating stopwords with
any nonalphanumeric character such as newline, space, or
comma. Exceptions are the underscore character
(“_
”) and a single
apostrophe (“'
”) which are
treated as part of a word. The character set of the stopword
list is the server's default character set; see
Section 10.1.3.1, “Server Character Set and Collation”.
The 50% threshold for natural language searches is
determined by the particular weighting scheme chosen. To
disable it, look for the following line in
storage/myisam/ftdefs.h
:
#define GWS_IN_USE GWS_PROB
Change that line to this:
#define GWS_IN_USE GWS_FREQ
Then recompile MySQL. There is no need to rebuild the indexes in this case.
By making this change, you severely
decrease MySQL's ability to provide adequate relevance
values for the MATCH()
function. If you really need to search for such common
words, it would be better to search using IN
BOOLEAN MODE
instead, which does not observe the
50% threshold.
To change the operators used for boolean full-text searches
on MyISAM
tables, set the
ft_boolean_syntax
system
variable. (InnoDB
does not have an
equivalent setting.) This variable can be changed while the
server is running, but you must have the
SUPER
privilege to do so. No
rebuilding of indexes is necessary in this case. See
Section 5.1.4, “Server System Variables”, which describes
the rules governing how to set this variable.
You can change the set of characters that are considered
word characters in several ways, as described in the
following list. After making the modification, rebuild the
indexes for each table that contains any
FULLTEXT
indexes. Suppose that you want
to treat the hyphen character ('-') as a word character. Use
one of these methods:
Modify the MySQL source: In
storage/myisam/ftdefs.h
, see the
true_word_char()
and
misc_word_char()
macros. Add
'-'
to one of those macros and
recompile MySQL.
Modify a character set file: This requires no
recompilation. The true_word_char()
macro uses a “character type” table to
distinguish letters and numbers from other characters. .
You can edit the contents of the
<ctype><map>
array in one
of the character set XML files to specify that
'-'
is a “letter.” Then
use the given character set for your
FULLTEXT
indexes. For information
about the <ctype><map>
array format, see Section 10.3.1, “Character Definition Arrays”.
Add a new collation for the character set used by the indexed columns, and alter the columns to use that collation. For general information about adding collations, see Section 10.4, “Adding a Collation to a Character Set”. For an example specific to full-text indexing, see Section 12.9.7, “Adding a Collation for Full-Text Indexing”.
If you modify full-text variables that affect indexing
(innodb_ft_min_token_size
,
innodb_ft_max_token_size
,
innodb_ft_server_stopword_table
,
innodb_ft_user_stopword_table
,
innodb_ft_enable_stopword
,
ft_min_word_len
,
ft_max_word_len
, or
ft_stopword_file
), or if you
change the stopword file itself, you must rebuild your
FULLTEXT
indexes after making the changes and
restarting the server.
To rebuild the FULLTEXT
indexes for an
InnoDB
table, use ALTER
TABLE
with the DROP INDEX
and
ADD INDEX
options to drop and re-create each
index.
To rebuild the FULLTEXT
indexes for a
MyISAM
table, it is sufficient to do a
QUICK
repair operation:
mysql> REPAIR TABLE tbl_name
QUICK;
Alternatively, use ALTER TABLE
as
just described. In some cases, this may be faster than a repair
operation.
Each table that contains any FULLTEXT
index
must be repaired as just shown. Otherwise, queries for the table
may yield incorrect results, and modifications to the table will
cause the server to see the table as corrupt and in need of
repair.
Note that if you use myisamchk to perform an
operation that modifies MyISAM
table indexes
(such as repair or analyze), the FULLTEXT
indexes are rebuilt using the default
full-text parameter values for minimum word length, maximum word
length, and stopword file unless you specify otherwise. This can
result in queries failing.
The problem occurs because these parameters are known only by
the server. They are not stored in MyISAM
index files. To avoid the problem if you have modified the
minimum or maximum word length or stopword file values used by
the server, specify the same
ft_min_word_len
,
ft_max_word_len
, and
ft_stopword_file
values for
myisamchk that you use for
mysqld. For example, if you have set the
minimum word length to 3, you can repair a table with
myisamchk like this:
shell> myisamchk --recover --ft_min_word_len=3 tbl_name
.MYI
To ensure that myisamchk and the server use
the same values for full-text parameters, place each one in both
the [mysqld]
and
[myisamchk]
sections of an option file:
[mysqld] ft_min_word_len=3 [myisamchk] ft_min_word_len=3
An alternative to using myisamchk for
MyISAM
table index modification is to use the
REPAIR TABLE
,
ANALYZE TABLE
,
OPTIMIZE TABLE
, or
ALTER TABLE
statements. These
statements are performed by the server, which knows the proper
full-text parameter values to use.
This section describes how to add a new collation for full-text
searches. The sample collation is like
latin1_swedish_ci
but treats the
'-'
character as a letter rather than as a
punctuation character so that it can be indexed as a word
character. General information about adding collations is given
in Section 10.4, “Adding a Collation to a Character Set”; it is assumed that you
have read it and are familiar with the files involved.
To add a collation for full-text indexing, use this procedure:
Add a collation to the Index.xml
file.
The collation ID must be unused, so choose a value different
from 1000 if that ID is already taken on your system.
<charset name="latin1"> ... <collation name="latin1_fulltext_ci" id="1000"/> </charset>
Declare the sort order for the collation in the
latin1.xml
file. In this case, the
order can be copied from
latin1_swedish_ci
:
<collation name="latin1_fulltext_ci"> <map> 00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F 10 11 12 13 14 15 16 17 18 19 1A 1B 1C 1D 1E 1F 20 21 22 23 24 25 26 27 28 29 2A 2B 2C 2D 2E 2F 30 31 32 33 34 35 36 37 38 39 3A 3B 3C 3D 3E 3F 40 41 42 43 44 45 46 47 48 49 4A 4B 4C 4D 4E 4F 50 51 52 53 54 55 56 57 58 59 5A 5B 5C 5D 5E 5F 60 41 42 43 44 45 46 47 48 49 4A 4B 4C 4D 4E 4F 50 51 52 53 54 55 56 57 58 59 5A 7B 7C 7D 7E 7F 80 81 82 83 84 85 86 87 88 89 8A 8B 8C 8D 8E 8F 90 91 92 93 94 95 96 97 98 99 9A 9B 9C 9D 9E 9F A0 A1 A2 A3 A4 A5 A6 A7 A8 A9 AA AB AC AD AE AF B0 B1 B2 B3 B4 B5 B6 B7 B8 B9 BA BB BC BD BE BF 41 41 41 41 5C 5B 5C 43 45 45 45 45 49 49 49 49 44 4E 4F 4F 4F 4F 5D D7 D8 55 55 55 59 59 DE DF 41 41 41 41 5C 5B 5C 43 45 45 45 45 49 49 49 49 44 4E 4F 4F 4F 4F 5D F7 D8 55 55 55 59 59 DE FF </map> </collation>
Modify the ctype
array in
latin1.xml
. Change the value
corresponding to 0x2D (which is the code for the
'-'
character) from 10 (punctuation) to
01 (small letter). In the following array, this is the
element in the fourth row down, third value from the end.
<ctype>
<map>
00
20 20 20 20 20 20 20 20 20 28 28 28 28 28 20 20
20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20
48 10 10 10 10 10 10 10 10 10 10 10 10 01 10 10
84 84 84 84 84 84 84 84 84 84 10 10 10 10 10 10
10 81 81 81 81 81 81 01 01 01 01 01 01 01 01 01
01 01 01 01 01 01 01 01 01 01 01 10 10 10 10 10
10 82 82 82 82 82 82 02 02 02 02 02 02 02 02 02
02 02 02 02 02 02 02 02 02 02 02 10 10 10 10 20
10 00 10 02 10 10 10 10 10 10 01 10 01 00 01 00
00 10 10 10 10 10 10 10 10 10 02 10 02 00 02 01
48 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10
10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10
01 01 01 01 01 01 01 01 01 01 01 01 01 01 01 01
01 01 01 01 01 01 01 10 01 01 01 01 01 01 01 02
02 02 02 02 02 02 02 02 02 02 02 02 02 02 02 02
02 02 02 02 02 02 02 10 02 02 02 02 02 02 02 02
</map>
</ctype>
Restart the server.
To employ the new collation, include it in the definition of columns that are to use it:
mysql>DROP TABLE IF EXISTS t1;
Query OK, 0 rows affected (0.13 sec) mysql>CREATE TABLE t1 (
a TEXT CHARACTER SET latin1 COLLATE latin1_fulltext_ci,
FULLTEXT INDEX(a)
) ENGINE=InnoDB;
Query OK, 0 rows affected (0.47 sec)
Test the collation to verify that hyphen is considered as a word character:
mysql>INSERT INTO t1 VALUEs ('----'),('....'),('abcd');
Query OK, 3 rows affected (0.22 sec) Records: 3 Duplicates: 0 Warnings: 0 mysql>SELECT * FROM t1 WHERE MATCH a AGAINST ('----' IN BOOLEAN MODE);
+------+ | a | +------+ | ---- | +------+ 1 row in set (0.00 sec)
The BINARY
operator casts the
string following it to a binary string. This is an easy way to
force a column comparison to be done byte by byte rather than
character by character. This causes the comparison to be case
sensitive even if the column is not defined as
BINARY
or
BLOB
.
BINARY
also causes trailing
spaces to be significant.
mysql>SELECT 'a' = 'A';
-> 1 mysql>SELECT BINARY 'a' = 'A';
-> 0 mysql>SELECT 'a' = 'a ';
-> 1 mysql>SELECT BINARY 'a' = 'a ';
-> 0
In a comparison, BINARY
affects
the entire operation; it can be given before either operand
with the same result.
BINARY
is shorthand for
str
CAST(
.
str
AS
BINARY)
Note that in some contexts, if you cast an indexed column to
BINARY
, MySQL is not able to use the index
efficiently.
The CAST()
function takes an
expression of any type and produces a result value of a
specified type, similar to
CONVERT()
. See the description
of CONVERT()
for more
information.
CONVERT(
,
expr
,type
)CONVERT(
expr
USING transcoding_name
)
The CONVERT()
and
CAST()
functions take an
expression of any type and produce a result value of a
specified type.
The type
for the result can be one
of the following values:
BINARY
produces a string with
the BINARY
data type. See
Section 11.4.2, “The BINARY
and VARBINARY
Types” for a description of how
this affects comparisons. If the optional length
N
is given,
BINARY(
causes
the cast to use no more than N
)N
bytes of the argument. Values shorter than
N
bytes are padded with
0x00
bytes to a length of
N
.
CHAR(
causes the cast to use no more than
N
)N
characters of the argument.
CAST()
and
CONVERT(... USING ...)
are
standard SQL syntax. The non-USING
form of
CONVERT()
is ODBC syntax.
CONVERT()
with
USING
is used to convert data between
different character sets. In MySQL, transcoding names are the
same as the corresponding character set names. For example,
this statement converts the string 'abc'
in
the default character set to the corresponding string in the
utf8
character set:
SELECT CONVERT('abc' USING utf8);
Normally, you cannot compare a BLOB
value or other binary string in case-insensitive fashion because
binary strings have no character set, and thus no concept of
lettercase. To perform a case-insensitive comparison, use the
CONVERT()
function to convert the
value to a nonbinary string. Comparisons of the result use the
string collation. For example, if the character set of the result
has a case-insensitive collation, a
LIKE
operation is not case sensitive:
SELECT 'A' LIKE CONVERT(blob_col
USING latin1) FROMtbl_name
;
To use a different character set, substitute its name for
latin1
in the preceding statement. To specify a
particular collation for the converted string, use a
COLLATE
clause following the
CONVERT()
call, as described in
Section 10.1.9.2, “CONVERT()
and
CAST()
”. For example, to use
latin1_german1_ci
:
SELECT 'A' LIKE CONVERT(blob_col
USING latin1) COLLATE latin1_german1_ci FROMtbl_name
;
CONVERT()
can be used more
generally for comparing strings that are represented in different
character sets.
LOWER()
(and
UPPER()
) are ineffective when
applied to binary strings (BINARY
,
VARBINARY
,
BLOB
). To perform lettercase
conversion, convert the string to a nonbinary string:
mysql>SET @str = BINARY 'New York';
mysql>SELECT LOWER(@str), LOWER(CONVERT(@str USING latin1));
+-------------+-----------------------------------+ | LOWER(@str) | LOWER(CONVERT(@str USING latin1)) | +-------------+-----------------------------------+ | New York | new york | +-------------+-----------------------------------+
The cast functions are useful when you want to create a column
with a specific type in a
CREATE TABLE ...
SELECT
statement:
CREATE TABLE new_table SELECT CAST('2000-01-01' AS DATE);
The functions also can be useful for sorting
ENUM
columns in lexical order.
Normally, sorting of ENUM
columns
occurs using the internal numeric values. Casting the values to
CHAR
results in a lexical sort:
SELECTenum_col
FROMtbl_name
ORDER BY CAST(enum_col
AS CHAR);
CAST(
is the same thing as
str
AS
BINARY)BINARY
.
str
CAST(
treats the expression as a string with the default
character set.
expr
AS
CHAR)
CAST()
also changes the result if
you use it as part of a more complex expression such as
CONCAT('Date: ',CAST(NOW() AS
DATE))
.
You should not use CAST()
to
extract data in different formats but instead use string functions
like LEFT()
or
EXTRACT()
. See
Section 12.7, “Date and Time Functions”.
To cast a string to a numeric value in numeric context, you normally do not have to do anything other than to use the string value as though it were a number:
mysql> SELECT 1+'1';
-> 2
If you use a string in an arithmetic operation, it is converted to a floating-point number during expression evaluation.
If you use a number in string context, the number automatically is converted to a string:
mysql> SELECT CONCAT('hello you ',2);
-> 'hello you 2'
For information about implicit conversion of numbers to strings, see Section 12.2, “Type Conversion in Expression Evaluation”.
MySQL supports arithmetic with both signed and unsigned 64-bit
values. If you are using numeric operators (such as
+
or
-
) and one of the
operands is an unsigned integer, the result is unsigned by default
(see Section 12.6.1, “Arithmetic Operators”). You can override
this by using the SIGNED
or
UNSIGNED
cast operator to cast a value to a
signed or unsigned 64-bit integer, respectively.
mysql>SELECT CAST(1-2 AS UNSIGNED)
-> 18446744073709551615 mysql>SELECT CAST(CAST(1-2 AS UNSIGNED) AS SIGNED);
-> -1
If either operand is a floating-point value, the result is a
floating-point value and is not affected by the preceding rule.
(In this context, DECIMAL
column
values are regarded as floating-point values.)
mysql> SELECT CAST(1 AS UNSIGNED) - 2.0;
-> -1.0
The SQL mode affects the result of conversion operations. Examples:
If you convert a “zero” date string to a date,
CONVERT()
and
CAST()
return
NULL
and produce a warning when the
NO_ZERO_DATE
SQL mode is
enabled.
For integer subtraction, if the
NO_UNSIGNED_SUBTRACTION
SQL
mode is enabled, the subtraction result is signed even if any
operand is unsigned.
For more information, see Section 5.1.7, “Server SQL Modes”.
Table 12.15. XML Functions
Name | Description |
---|---|
ExtractValue() | Extracts a value from an XML string using XPath notation |
UpdateXML() | Return replaced XML fragment |
This section discusses XML and related functionality in MySQL.
It is possible to obtain XML-formatted output from MySQL in the
mysql and mysqldump
clients by invoking them with the
--xml
option. See
Section 4.5.1, “mysql — The MySQL Command-Line Tool”, and Section 4.5.4, “mysqldump — A Database Backup Program”.
Two functions providing basic XPath 1.0 (XML Path Language, version 1.0) capabilities are available. Some basic information about XPath syntax and usage is provided later in this section; however, an in-depth discussion of these topics is beyond the scope of this Manual, and you should refer to the XML Path Language (XPath) 1.0 standard for definitive information. A useful resource for those new to XPath or who desire a refresher in the basics is the Zvon.org XPath Tutorial, which is available in several languages.
These functions remain under development. We continue to improve these and other aspects of XML and XPath functionality in MySQL 5.7 and onwards. You may discuss these, ask questions about them, and obtain help from other users with them in the MySQL XML User Forum.
XPath expressions used with these functions support user variables and local stored program variables. User variables are weakly checked; variables local to stored programs are strongly checked (see also Bug #26518):
User variables (weak checking).
Variables using the syntax
$@
(that is, user variables) are not checked. No warnings or
errors are issued by the server if a variable has the wrong
type or has previously not been assigned a value. This also
means the user is fully responsible for any typographical
errors, since no warnings will be given if (for example)
variable_name
$@myvariable
is used where
$@myvariable
was intended.
Example:
mysql>SET @xml = '<a><b>X</b><b>Y</b></a>';
Query OK, 0 rows affected (0.00 sec) mysql>SET @i =1, @j = 2;
Query OK, 0 rows affected (0.00 sec) mysql>SELECT @i, ExtractValue(@xml, '//b[$@i]');
+------+--------------------------------+ | @i | ExtractValue(@xml, '//b[$@i]') | +------+--------------------------------+ | 1 | X | +------+--------------------------------+ 1 row in set (0.00 sec) mysql>SELECT @j, ExtractValue(@xml, '//b[$@j]');
+------+--------------------------------+ | @j | ExtractValue(@xml, '//b[$@j]') | +------+--------------------------------+ | 2 | Y | +------+--------------------------------+ 1 row in set (0.00 sec) mysql>SELECT @k, ExtractValue(@xml, '//b[$@k]');
+------+--------------------------------+ | @k | ExtractValue(@xml, '//b[$@k]') | +------+--------------------------------+ | NULL | | +------+--------------------------------+ 1 row in set (0.00 sec)
Variables in stored programs (strong checking).
Variables using the syntax
$
can be declared and used with these functions when they are
called inside stored programs. Such variables are local to
the stored program in which they are defined, and are
strongly checked for type and value.
variable_name
Example:
mysql>DELIMITER |
mysql>CREATE PROCEDURE myproc ()
->BEGIN
->DECLARE i INT DEFAULT 1;
->DECLARE xml VARCHAR(25) DEFAULT '<a>X</a><a>Y</a><a>Z</a>';
-> ->WHILE i < 4 DO
->SELECT xml, i, ExtractValue(xml, '//a[$i]');
->SET i = i+1;
->END WHILE;
->END |
Query OK, 0 rows affected (0.01 sec) mysql>DELIMITER ;
mysql>CALL myproc;
+--------------------------+---+------------------------------+ | xml | i | ExtractValue(xml, '//a[$i]') | +--------------------------+---+------------------------------+ | <a>X</a><a>Y</a><a>Z</a> | 1 | X | +--------------------------+---+------------------------------+ 1 row in set (0.00 sec) +--------------------------+---+------------------------------+ | xml | i | ExtractValue(xml, '//a[$i]') | +--------------------------+---+------------------------------+ | <a>X</a><a>Y</a><a>Z</a> | 2 | Y | +--------------------------+---+------------------------------+ 1 row in set (0.01 sec) +--------------------------+---+------------------------------+ | xml | i | ExtractValue(xml, '//a[$i]') | +--------------------------+---+------------------------------+ | <a>X</a><a>Y</a><a>Z</a> | 3 | Z | +--------------------------+---+------------------------------+ 1 row in set (0.01 sec)
Parameters. Variables used in XPath expressions inside stored routines that are passed in as parameters are also subject to strong checking.
Expressions containing user variables or variables local to stored programs must otherwise (except for notation) conform to the rules for XPath expressions containing variables as given in the XPath 1.0 specification.
Currently, a user variable used to store an XPath expression is treated as an empty string. Because of this, it is not possible to store an XPath expression as a user variable. (Bug #32911)
ExtractValue(
xml_frag
,
xpath_expr
)
ExtractValue()
takes two string
arguments, a fragment of XML markup
xml_frag
and an XPath expression
xpath_expr
(also known as a
locator); it returns the text
(CDATA
) of the first text node which is a
child of the elements or elements matched by the XPath
expression.
Using this function is the equivalent of performing a match
using the xpath_expr
after
appending /text()
. In other words,
ExtractValue('<a><b>Sakila</b></a>',
'/a/b')
and
ExtractValue('<a><b>Sakila</b></a>',
'/a/b/text()')
produce the same result.
If multiple matches are found, the content of the first child text node of each matching element is returned (in the order matched) as a single, space-delimited string.
If no matching text node is found for the expression
(including the implicit /text()
)—for
whatever reason, as long as
xpath_expr
is valid, and
xml_frag
consists of elements which
are properly nested and closed—an empty string is
returned. No distinction is made between a match on an empty
element and no match at all. This is by design.
If you need to determine whether no matching element was found
in xml_frag
or such an element was
found but contained no child text nodes, you should test the
result of an expression that uses the XPath
count()
function. For example, both of
these statements return an empty string, as shown here:
mysql>SELECT ExtractValue('<a><b/></a>', '/a/b');
+-------------------------------------+ | ExtractValue('<a><b/></a>', '/a/b') | +-------------------------------------+ | | +-------------------------------------+ 1 row in set (0.00 sec) mysql>SELECT ExtractValue('<a><c/></a>', '/a/b');
+-------------------------------------+ | ExtractValue('<a><c/></a>', '/a/b') | +-------------------------------------+ | | +-------------------------------------+ 1 row in set (0.00 sec)
However, you can determine whether there was actually a matching element using the following:
mysql>SELECT ExtractValue('<a><b/></a>', 'count(/a/b)');
+-------------------------------------+ | ExtractValue('<a><b/></a>', 'count(/a/b)') | +-------------------------------------+ | 1 | +-------------------------------------+ 1 row in set (0.00 sec) mysql>SELECT ExtractValue('<a><c/></a>', 'count(/a/b)');
+-------------------------------------+ | ExtractValue('<a><c/></a>', 'count(/a/b)') | +-------------------------------------+ | 0 | +-------------------------------------+ 1 row in set (0.01 sec)
ExtractValue()
returns only
CDATA
, and does not return any tags that
might be contained within a matching tag, nor any of their
content (see the result returned as val1
in the following example).
mysql>SELECT
->ExtractValue('<a>ccc<b>ddd</b></a>', '/a') AS val1,
->ExtractValue('<a>ccc<b>ddd</b></a>', '/a/b') AS val2,
->ExtractValue('<a>ccc<b>ddd</b></a>', '//b') AS val3,
->ExtractValue('<a>ccc<b>ddd</b></a>', '/b') AS val4,
->ExtractValue('<a>ccc<b>ddd</b><b>eee</b></a>', '//b') AS val5;
+------+------+------+------+---------+ | val1 | val2 | val3 | val4 | val5 | +------+------+------+------+---------+ | ccc | ddd | ddd | | ddd eee | +------+------+------+------+---------+
This function uses the current SQL collation for making
comparisons with contains()
, performing the
same collation aggregation as other string functions (such as
CONCAT()
), in taking into
account the collation coercibility of their arguments; see
Section 10.1.7.5, “Collation of Expressions”, for an
explanation of the rules governing this behavior.
(Previously, binary—that is, case-sensitive—comparison was always used.)
NULL
is returned if
xml_frag
contains elements which
are not properly nested or closed, and a warning is generated,
as shown in this example:
mysql>SELECT ExtractValue('<a>c</a><b', '//a');
+-----------------------------------+ | ExtractValue('<a>c</a><b', '//a') | +-----------------------------------+ | NULL | +-----------------------------------+ 1 row in set, 1 warning (0.00 sec) mysql>SHOW WARNINGS;
+---------+------+-------------------------------------------------------------------------------------------+ | Level | Code | Message | +---------+------+-------------------------------------------------------------------------------------------+ | Warning | 1523 | Incorrect XML value: 'parse error at line 1 pos 11: END-OF-INPUT unexpected ('>' wanted)' | +---------+------+-------------------------------------------------------------------------------------------+ 1 row in set (0.00 sec) mysql>SELECT ExtractValue('<a>c</a><b/>', '//a');
+-------------------------------------+ | ExtractValue('<a>c</a><b/>', '//a') | +-------------------------------------+ | c | +-------------------------------------+ 1 row in set (0.00 sec)
UpdateXML(
xml_target
,
xpath_expr
,
new_xml
)
This function replaces a single portion of a given fragment of
XML markup xml_target
with a new
XML fragment new_xml
, and then
returns the changed XML. The portion of
xml_target
that is replaced matches
an XPath expression xpath_expr
supplied by the user.
If no expression matching
xpath_expr
is found, or if multiple
matches are found, the function returns the original
xml_target
XML fragment. All three
arguments should be strings.
mysql>SELECT
->UpdateXML('<a><b>ccc</b><d></d></a>', '/a', '<e>fff</e>') AS val1,
->UpdateXML('<a><b>ccc</b><d></d></a>', '/b', '<e>fff</e>') AS val2,
->UpdateXML('<a><b>ccc</b><d></d></a>', '//b', '<e>fff</e>') AS val3,
->UpdateXML('<a><b>ccc</b><d></d></a>', '/a/d', '<e>fff</e>') AS val4,
->UpdateXML('<a><d></d><b>ccc</b><d></d></a>', '/a/d', '<e>fff</e>') AS val5
->\G
*************************** 1. row *************************** val1: <e>fff</e> val2: <a><b>ccc</b><d></d></a> val3: <a><e>fff</e><d></d></a> val4: <a><b>ccc</b><e>fff</e></a> val5: <a><d></d><b>ccc</b><d></d></a>
A discussion in depth of XPath syntax and usage are beyond the scope of this Manual. Please see the XML Path Language (XPath) 1.0 specification for definitive information. A useful resource for those new to XPath or who are wishing a refresher in the basics is the Zvon.org XPath Tutorial, which is available in several languages.
Descriptions and examples of some basic XPath expressions follow:
/
tag
Matches
<
if
and only if
tag
/><
is
the root element.
tag
/>
Example: /a
has a match in
<a><b/></a>
because it
matches the outermost (root) tag. It does not match the inner
a
element in
<b><a/></b>
because in
this instance it is the child of another element.
/
tag1
/tag2
Matches
<
if
and only if it is a child of
tag2
/><
,
and
tag1
/><
is
the root element.
tag1
/>
Example: /a/b
matches the
b
element in the XML fragment
<a><b/></a>
because it is
a child of the root element a
. It
does not have a match in
<b><a/></b>
because in
this case, b
is the root element
(and hence the child of no other element). Nor does the XPath
expression have a match in
<a><c><b/></c></a>
;
here, b
is a descendant of
a
, but not actually a child of
a
.
This construct is extendable to three or more elements. For
example, the XPath expression /a/b/c
matches the c
element in the
fragment
<a><b><c/></b></a>
.
//
tag
Matches any instance of
<
.
tag
>
Example: //a
matches the
a
element in any of the following:
<a><b><c/></b></a>
;
<c><a><b/></a></b>
;
<c><b><a/></b></c>
.
//
can be combined with
/
. For example, //a/b
matches the b
element in either of
the fragments <a><b/></a>
or
<a><b><c/></b></a>
//
is the
equivalent of
tag
/descendant-or-self::*/
.
A common error is to confuse this with
tag
/descendant-or-self::
,
although the latter expression can actually lead to very
different results, as can be seen here:
tag
mysql>SET @xml = '<a><b><c>w</c><b>x</b><d>y</d>z</b></a>';
Query OK, 0 rows affected (0.00 sec) mysql>SELECT @xml;
+-----------------------------------------+ | @xml | +-----------------------------------------+ | <a><b><c>w</c><b>x</b><d>y</d>z</b></a> | +-----------------------------------------+ 1 row in set (0.00 sec) mysql>SELECT ExtractValue(@xml, '//b[1]');
+------------------------------+ | ExtractValue(@xml, '//b[1]') | +------------------------------+ | x z | +------------------------------+ 1 row in set (0.00 sec) mysql>SELECT ExtractValue(@xml, '//b[2]');
+------------------------------+ | ExtractValue(@xml, '//b[2]') | +------------------------------+ | | +------------------------------+ 1 row in set (0.01 sec) mysql>SELECT ExtractValue(@xml, '/descendant-or-self::*/b[1]');
+---------------------------------------------------+ | ExtractValue(@xml, '/descendant-or-self::*/b[1]') | +---------------------------------------------------+ | x z | +---------------------------------------------------+ 1 row in set (0.06 sec) mysql>SELECT ExtractValue(@xml, '/descendant-or-self::*/b[2]');
+---------------------------------------------------+ | ExtractValue(@xml, '/descendant-or-self::*/b[2]') | +---------------------------------------------------+ | | +---------------------------------------------------+ 1 row in set (0.00 sec) mysql>SELECT ExtractValue(@xml, '/descendant-or-self::b[1]');
+-------------------------------------------------+ | ExtractValue(@xml, '/descendant-or-self::b[1]') | +-------------------------------------------------+ | z | +-------------------------------------------------+ 1 row in set (0.00 sec) mysql>SELECT ExtractValue(@xml, '/descendant-or-self::b[2]');
+-------------------------------------------------+ | ExtractValue(@xml, '/descendant-or-self::b[2]') | +-------------------------------------------------+ | x | +-------------------------------------------------+ 1 row in set (0.00 sec)
The *
operator acts as a
“wildcard” that matches any element. For example,
the expression /*/b
matches the
b
element in either of the XML
fragments <a><b/></a>
or
<c><b/></c>
. However, the
expression does not produce a match in the fragment
<b><a/></b>
because
b
must be a child of some other
element. The wildcard may be used in any position: The
expression /*/b/*
will match any child of a
b
element that is itself not the
root element.
You can match any of several locators using the
|
(UNION
)
operator. For example, the expression
//b|//c
matches all
b
and c
elements in the XML target.
It is also possible to match an element based on the value of
one or more of its attributes. This done using the syntax
.
For example, the expression tag
[@attribute
="value
"]//b[@id="idB"]
matches the second b
element in the
fragment <a><b id="idA"/><c/><b
id="idB"/></a>
. To match against
any element having
,
use the XPath expression
attribute
="value
"//*[
.
attribute
="value
"]
To filter multiple attribute values, simply use multiple
attribute-comparison clauses in succession. For example, the
expression //b[@c="x"][@d="y"]
matches the
element <b c="x" d="y"/>
occurring
anywhere in a given XML fragment.
To find elements for which the same attribute matches any of
several values, you can use multiple locators joined by the
|
operator. For example, to match all
b
elements whose
c
attributes have either of the
values 23 or 17, use the expression
//b[@c="23"]|//b[@c="17"]
. You can also use
the logical or
operator for this purpose:
//b[@c="23" or @c="17"]
.
The difference between or
and
|
is that or
joins
conditions, while |
joins result sets.
XPath Limitations. The XPath syntax supported by these functions is currently subject to the following limitations:
Nodeset-to-nodeset comparison (such as
'/a/b[@c=@d]'
) is not supported.
All of the standard XPath comparison operators are supported. (Bug #22823)
Relative locator expressions are resolved in the context of the root node. For example, consider the following query and result:
mysql>SELECT ExtractValue(
->'<a><b c="1">X</b><b c="2">Y</b></a>',
->'a/b'
->) AS result;
+--------+ | result | +--------+ | X Y | +--------+ 1 row in set (0.03 sec)
In this case, the locator a/b
resolves to
/a/b
.
Relative locators are also supported within predicates. In the
following example, d[../@c="1"]
is resolved
as /a/b[@c="1"]/d
:
mysql>SELECT ExtractValue(
->'<a>
-><b c="1"><d>X</d></b>
-><b c="2"><d>X</d></b>
-></a>',
->'a/b/d[../@c="1"]')
->AS result;
+--------+ | result | +--------+ | X | +--------+ 1 row in set (0.00 sec)
Locators prefixed with expressions that evaluate as scalar values—including variable references, literals, numbers, and scalar function calls—are not permitted, and their use results in an error.
The ::
operator is not supported in
combination with node types such as the following:
axis
::comment()
axis
::text()
axis
::processing-instructions()
axis
::node()
However, name tests (such as
and axis
::name
) are
supported, as shown in these examples:
axis
::*
mysql>SELECT ExtractValue('<a><b>x</b><c>y</c></a>','/a/child::b');
+-------------------------------------------------------+ | ExtractValue('<a><b>x</b><c>y</c></a>','/a/child::b') | +-------------------------------------------------------+ | x | +-------------------------------------------------------+ 1 row in set (0.02 sec) mysql>SELECT ExtractValue('<a><b>x</b><c>y</c></a>','/a/child::*');
+-------------------------------------------------------+ | ExtractValue('<a><b>x</b><c>y</c></a>','/a/child::*') | +-------------------------------------------------------+ | x y | +-------------------------------------------------------+ 1 row in set (0.01 sec)
“Up-and-down” navigation is not supported in cases where the path would lead “above” the root element. That is, you cannot use expressions which match on descendants of ancestors of a given element, where one or more of the ancestors of the current element is also an ancestor of the root element (see Bug #16321).
The following XPath functions are not supported, or have known issues as indicated:
id()
lang()
local-name()
name()
namespace-uri()
normalize-space()
starts-with()
string()
substring-after()
substring-before()
translate()
The following axes are not supported:
following-sibling
following
preceding-sibling
preceding
XPath expressions passed as arguments to
ExtractValue()
and
UpdateXML()
may contain the colon
character (“:
”) in element
selectors, which enables their use with markup employing XML
namespaces notation. For example:
mysql>SET @xml = '<a>111<b:c>222<d>333</d><e:f>444</e:f></b:c></a>';
Query OK, 0 rows affected (0.00 sec) mysql>SELECT ExtractValue(@xml, '//e:f');
+-----------------------------+ | ExtractValue(@xml, '//e:f') | +-----------------------------+ | 444 | +-----------------------------+ 1 row in set (0.00 sec) mysql>SELECT UpdateXML(@xml, '//b:c', '<g:h>555</g:h>');
+--------------------------------------------+ | UpdateXML(@xml, '//b:c', '<g:h>555</g:h>') | +--------------------------------------------+ | <a>111<g:h>555</g:h></a> | +--------------------------------------------+ 1 row in set (0.00 sec)
This is similar in some respects to what is permitted by
Apache Xalan and
some other parsers, and is much simpler than requiring namespace
declarations or the use of the namespace-uri()
and local-name()
functions.
Error handling.
For both ExtractValue()
and
UpdateXML()
, the XPath locator
used must be valid and the XML to be searched must consist of
elements which are properly nested and closed. If the locator is
invalid, an error is generated:
mysql> SELECT ExtractValue('<a>c</a><b/>', '/&a');
ERROR 1105 (HY000): XPATH syntax error: '&a'
If xml_frag
does not consist of
elements which are properly nested and closed,
NULL
is returned and a warning is generated, as
shown in this example:
mysql>SELECT ExtractValue('<a>c</a><b', '//a');
+-----------------------------------+ | ExtractValue('<a>c</a><b', '//a') | +-----------------------------------+ | NULL | +-----------------------------------+ 1 row in set, 1 warning (0.00 sec) mysql>SHOW WARNINGS;
+---------+------+-------------------------------------------------------------------------------------------+ | Level | Code | Message | +---------+------+-------------------------------------------------------------------------------------------+ | Warning | 1523 | Incorrect XML value: 'parse error at line 1 pos 11: END-OF-INPUT unexpected ('>' wanted)' | +---------+------+-------------------------------------------------------------------------------------------+ 1 row in set (0.00 sec) mysql>SELECT ExtractValue('<a>c</a><b/>', '//a');
+-------------------------------------+ | ExtractValue('<a>c</a><b/>', '//a') | +-------------------------------------+ | c | +-------------------------------------+ 1 row in set (0.00 sec)
The replacement XML used as the third argument to
UpdateXML()
is
not checked to determine whether it
consists solely of elements which are properly nested and
closed.
XPath Injection. code injection occurs when malicious code is introduced into the system to gain unauthorized access to privileges and data. It is based on exploiting assumptions made by developers about the type and content of data input from users. XPath is no exception in this regard.
A common scenario in which this can happen is the case of application which handles authorization by matching the combination of a login name and password with those found in an XML file, using an XPath expression like this one:
//user[login/text()='neapolitan' and password/text()='1c3cr34m']/attribute::id
This is the XPath equivalent of an SQL statement like this one:
SELECT id FROM users WHERE login='neapolitan' AND password='1c3cr34m';
A PHP application employing XPath might handle the login process like this:
<?php $file = "users.xml"; $login = $POST["login"]; $password = $POST["password"]; $xpath = "//user[login/text()=$login and password/text()=$password]/attribute::id"; if( file_exists($file) ) { $xml = simplexml_load_file($file); if($result = $xml->xpath($xpath)) echo "You are now logged in as user $result[0]."; else echo "Invalid login name or password."; } else exit("Failed to open $file."); ?>
No checks are performed on the input. This means that a malevolent
user can “short-circuit” the test by entering
' or 1=1
for both the login name and password,
resulting in $xpath
being evaluated as shown
here:
//user[login/text()='' or 1=1 and password/text()='' or 1=1]/attribute::id
Since the expression inside the square brackets always evaluates
as true
, it is effectively the same as this
one, which matches the id
attribute of every
user
element in the XML document:
//user/attribute::id
One way in which this particular attack can be circumvented is
simply by quoting the variable names to be interpolated in the
definition of $xpath
, forcing the values passed
from a Web form to be converted to strings:
$xpath = "//user[login/text()='$login' and password/text()='$password']/attribute::id";
This is the same strategy that is often recommended for preventing SQL injection attacks. In general, the practices you should follow for preventing XPath injection attacks are the same as for preventing SQL injection:
Never accepted untested data from users in your application.
Check all user-submitted data for type; reject or convert data that is of the wrong type
Test numeric data for out of range values; truncate, round, or reject values that are out of range. Test strings for illegal characters and either strip them out or reject input containing them.
Do not output explicit error messages that might provide an unauthorized user with clues that could be used to compromise the system; log these to a file or database table instead.
Just as SQL injection attacks can be used to obtain information about database schemas, so can XPath injection be used to traverse XML files to uncover their structure, as discussed in Amit Klein's paper Blind XPath Injection (PDF file, 46KB).
It is also important to check the output being sent back to the
client. Consider what can happen when we use the MySQL
ExtractValue()
function:
mysql>SELECT ExtractValue(
->LOAD_FILE('users.xml'),
->'//user[login/text()="" or 1=1 and password/text()="" or 1=1]/attribute::id'
->) AS id;
+-------------------------------+ | id | +-------------------------------+ | 00327 13579 02403 42354 28570 | +-------------------------------+ 1 row in set (0.01 sec)
Because ExtractValue()
returns
multiple matches as a single space-delimited string, this
injection attack provides every valid ID contained within
users.xml
to the user as a single row of
output. As an extra safeguard, you should also test output before
returning it to the user. Here is a simple example:
mysql>SELECT @id = ExtractValue(
->LOAD_FILE('users.xml'),
->'//user[login/text()="" or 1=1 and password/text()="" or 1=1]/attribute::id'
->);
Query OK, 0 rows affected (0.00 sec) mysql>SELECT IF(
->INSTR(@id, ' ') = 0,
->@id,
->'Unable to retrieve user ID')
->AS singleID;
+----------------------------+ | singleID | +----------------------------+ | Unable to retrieve user ID | +----------------------------+ 1 row in set (0.00 sec)
In general, the guidelines for returning data to users securely are the same as for accepting user input. These can be summed up as:
Always test outgoing data for type and permissible values.
Never permit unauthorized users to view error messages that might provide information about the application that could be used to exploit it.
MySQL uses BIGINT
(64-bit)
arithmetic for bit operations, so these operators have a maximum
range of 64 bits.
Bitwise OR:
mysql> SELECT 29 | 15;
-> 31
The result is an unsigned 64-bit integer.
Bitwise AND:
mysql> SELECT 29 & 15;
-> 13
The result is an unsigned 64-bit integer.
Bitwise XOR:
mysql>SELECT 1 ^ 1;
-> 0 mysql>SELECT 1 ^ 0;
-> 1 mysql>SELECT 11 ^ 3;
-> 8
The result is an unsigned 64-bit integer.
Shifts a longlong (BIGINT
)
number to the left.
mysql> SELECT 1 << 2;
-> 4
The result is an unsigned 64-bit integer. The value is truncated to 64 bits. In particular, if the shift count is greater or equal to the width of an unsigned 64-bit number, the result is zero.
Shifts a longlong (BIGINT
)
number to the right.
mysql> SELECT 4 >> 2;
-> 1
The result is an unsigned 64-bit integer. The value is truncated to 64 bits. In particular, if the shift count is greater or equal to the width of an unsigned 64-bit number, the result is zero.
Invert all bits.
mysql> SELECT 5 & ~1;
-> 4
The result is an unsigned 64-bit integer.
Returns the number of bits that are set in the argument
N
.
mysql> SELECT BIT_COUNT(29), BIT_COUNT(b'101010');
-> 4, 3
Table 12.17. Encryption Functions
Name | Description |
---|---|
AES_DECRYPT() | Decrypt using AES |
AES_ENCRYPT() | Encrypt using AES |
COMPRESS() | Return result as a binary string |
DECODE() | Decodes a string encrypted using ENCODE() |
DES_DECRYPT() | Decrypt a string |
DES_ENCRYPT() | Encrypt a string |
ENCODE() | Encode a string |
ENCRYPT() | Encrypt a string |
MD5() | Calculate MD5 checksum |
OLD_PASSWORD() | Return the value of the pre-4.1 implementation of PASSWORD |
PASSWORD() | Calculate and return a password string |
SHA1() , SHA() | Calculate an SHA-1 160-bit checksum |
SHA2() | Calculate an SHA-2 checksum |
UNCOMPRESS() | Uncompress a string compressed |
UNCOMPRESSED_LENGTH() | Return the length of a string before compression |
VALIDATE_PASSWORD_STRENGTH() | Determine strength of password |
Many encryption and compression functions return strings for which
the result might contain arbitrary byte values. If you want to
store these results, use a column with a
VARBINARY
or
BLOB
binary string data type. This
will avoid potential problems with trailing space removal or
character set conversion that would change data values, such as
may occur if you use a nonbinary string data type
(CHAR
,
VARCHAR
,
TEXT
).
Some encryption functions return strings of ASCII characters:
MD5()
,
OLD_PASSWORD()
,
PASSWORD()
,
SHA()
,
SHA1()
,
SHA2()
. In MySQL 5.7,
their return value is a nonbinary string that has a character set
and collation determined by the
character_set_connection
and
collation_connection
system
variables.
For versions in which functions such as MD5()
or SHA1()
return a string of hex digits as a
binary string, the return value cannot be converted to uppercase
or compared in case-insensitive fashion as is. You must convert
the value to a nonbinary string. See the discussion of binary
string conversion in Section 12.10, “Cast Functions and Operators”.
If an application stores values from a function such as
MD5()
or
SHA1()
that returns a string of hex
digits, more efficient storage and comparisons can be obtained by
converting the hex representation to binary using
UNHEX()
and storing the result in a
BINARY(
column. Each pair of hex digits requires one byte in binary form,
so the value of N
)N
depends on the length
of the hex string. N
is 16 for an
MD5()
value and 20 for a
SHA1()
value. For
SHA2()
,
N
ranges from 28 to 32 depending on the
argument specifying the desired bit length of the result.
The size penalty for storing the hex string in a
CHAR
column is at least two times,
up to eight times if the value is stored in a column that uses the
utf8
character set (where each character uses 4
bytes). Storing the string also results in slower comparisons
because of the larger values and the need to take character set
collation rules into account.
Suppose that an application stores
MD5()
string values in a
CHAR(32)
column:
CREATE TABLE md5_tbl (md5_val CHAR(32), ...); INSERT INTO md5_tbl (md5_val, ...) VALUES(MD5('abcdef'), ...);
To convert hex strings to more compact form, modify the
application to use UNHEX()
and
BINARY(16)
instead as follows:
CREATE TABLE md5_tbl (md5_val BINARY(16), ...); INSERT INTO md5_tbl (md5_val, ...) VALUES(UNHEX(MD5('abcdef')), ...);
Applications should be prepared to handle the very rare case that a hashing function produces the same value for two different input values. One way to make collisions detectable is to make the hash column a primary key.
Exploits for the MD5 and SHA-1 algorithms have become known. You
may wish to consider using one of the other encryption functions
described in this section instead, such as
SHA2()
.
Passwords or other sensitive values supplied as arguments to encryption functions are sent in plaintext to the MySQL server unless an SSL connection is used. Also, such values will appear in any MySQL logs to which they are written. To avoid these types of exposure, applications can encrypt sensitive values on the client side before sending them to the server. The same considerations apply to encryption keys. To avoid exposing these, applications can use stored procedures to encrypt and decrypt values on the server side.
AES_DECRYPT(
crypt_str
,key_str
)
This function decrypts data using the official AES (Advanced
Encryption Standard) algorithm. For more information, see the
description of AES_ENCRYPT()
.
AES_ENCRYPT()
and
AES_DECRYPT()
implement
encryption and decryption of data using the official AES
(Advanced Encryption Standard) algorithm, previously known as
“Rijndael.” The AES standard allows various key
sizes. These functions implement AES with a 128-bit key
length, but you can extend them to 256 bits by modifying the
source. The key length is a trade off between performance and
security.
AES_ENCRYPT()
encrypts the
string str
using a key
key_str
and returns a binary string
containing the encrypted output.
AES_DECRYPT()
decrypts the
encrypted string crypt_str
using a
key key_str
and returns the
original plaintext string. If either argument is
NULL
, the results of these functions are
also NULL
.
The str
and
crypt_str
arguments can be any
length, and padding is automatically added to
str
so it is a multiple of a block
as required by block-based algorithms such as AES. This
padding is automatically removed by the
AES_DECRYPT()
function. The
length of crypt_str
can be
calculated using this formula:
16 * (trunc(string_length
/ 16) + 1)
The most secure way to pass a key to the
key_str
argument is to create a
truly random 128-bit value and pass it as a binary value. For
example:
INSERT INTO t VALUES (1,AES_ENCRYPT('text',UNHEX('F3229A0B371ED2D9441B830D21A390C3')));
A passphrase can be used to generate an AES key by hashing the passphrase. For example:
INSERT INTO t VALUES (1,AES_ENCRYPT('text', SHA2('My secret passphrase',512)));
Do not pass a password or passphrase directly to
crypt_str
, hash it first. Previous
versions of this documentation suggested the former approach,
but it is no longer recommended as the examples shown here are
more secure.
If AES_DECRYPT()
detects
invalid data or incorrect padding, it returns
NULL
. However, it is possible for
AES_DECRYPT()
to return a
non-NULL
value (possibly garbage) if the
input data or the key is invalid.
Compresses a string and returns the result as a binary string.
This function requires MySQL to have been compiled with a
compression library such as zlib
.
Otherwise, the return value is always NULL
.
The compressed string can be uncompressed with
UNCOMPRESS()
.
mysql>SELECT LENGTH(COMPRESS(REPEAT('a',1000)));
-> 21 mysql>SELECT LENGTH(COMPRESS(''));
-> 0 mysql>SELECT LENGTH(COMPRESS('a'));
-> 13 mysql>SELECT LENGTH(COMPRESS(REPEAT('a',16)));
-> 15
The compressed string contents are stored the following way:
Empty strings are stored as empty strings.
Nonempty strings are stored as a 4-byte length of the
uncompressed string (low byte first), followed by the
compressed string. If the string ends with space, an extra
“.
” character is added to
avoid problems with endspace trimming should the result be
stored in a CHAR
or
VARCHAR
column. (However,
use of nonbinary string data types such as
CHAR
or
VARCHAR
to store compressed
strings is not recommended anyway because character set
conversion may occur. Use a
VARBINARY
or
BLOB
binary string column
instead.)
DECODE()
decrypts the encrypted
string crypt_str
using
pass_str
as the password.
crypt_str
should be a string
returned from ENCODE()
.
DES_DECRYPT(
crypt_str
[,key_str
])
Decrypts a string encrypted with
DES_ENCRYPT()
. If an error
occurs, this function returns NULL
.
This function works only if MySQL has been configured with SSL support. See Section 6.3.9, “Using SSL for Secure Connections”.
If no key_str
argument is given,
DES_DECRYPT()
examines the
first byte of the encrypted string to determine the DES key
number that was used to encrypt the original string, and then
reads the key from the DES key file to decrypt the message.
For this to work, the user must have the
SUPER
privilege. The key file
can be specified with the
--des-key-file
server option.
If you pass this function a key_str
argument, that string is used as the key for decrypting the
message.
If the crypt_str
argument does not
appear to be an encrypted string, MySQL returns the given
crypt_str
.
DES_ENCRYPT(
str
[,{key_num
|key_str
}])
Encrypts the string with the given key using the Triple-DES algorithm.
This function works only if MySQL has been configured with SSL support. See Section 6.3.9, “Using SSL for Secure Connections”.
The encryption key to use is chosen based on the second
argument to DES_ENCRYPT()
, if
one was given. With no argument, the first key from the DES
key file is used. With a key_num
argument, the given key number (0 to 9) from the DES key file
is used. With a key_str
argument,
the given key string is used to encrypt
str
.
The key file can be specified with the
--des-key-file
server option.
The return string is a binary string where the first character
is CHAR(128 |
. If an error
occurs, key_num
)DES_ENCRYPT()
returns
NULL
.
The 128 is added to make it easier to recognize an encrypted
key. If you use a string key,
key_num
is 127.
The string length for the result is given by this formula:
new_len
=orig_len
+ (8 - (orig_len
% 8)) + 1
Each line in the DES key file has the following format:
key_num
des_key_str
Each key_num
value must be a number
in the range from 0
to
9
. Lines in the file may be in any order.
des_key_str
is the string that is
used to encrypt the message. There should be at least one
space between the number and the key. The first key is the
default key that is used if you do not specify any key
argument to DES_ENCRYPT()
.
You can tell MySQL to read new key values from the key file
with the FLUSH
DES_KEY_FILE
statement. This requires the
RELOAD
privilege.
One benefit of having a set of default keys is that it gives applications a way to check for the existence of encrypted column values, without giving the end user the right to decrypt those values.
mysql>SELECT customer_address FROM customer_table
>WHERE crypted_credit_card = DES_ENCRYPT('credit_card_number');
ENCODE()
encrypts
str
using
pass_str
as the password. The
result is a binary string of the same length as
str
. To decrypt the result, use
DECODE()
.
The ENCODE()
and
DECODE()
functions are
deprecated in MySQL 5.7, will be removed in a
future MySQL release, and should no longer be used.
If you still need to use
ENCODE()
, a salt value must be
used with it to reduce risk. For example:
ENCODE('plaintext', CONCAT('my_random_salt','my_secret_password'))
A new random salt value must be used whenever a password is updated.
Encrypts str
using the Unix
crypt()
system call and returns a binary
string. The salt
argument must be a
string with at least two characters or the result will be
NULL
. If no salt
argument is given, a random value is used.
mysql> SELECT ENCRYPT('hello');
-> 'VxuFAJXVARROc'
ENCRYPT()
ignores all but the
first eight characters of str
, at
least on some systems. This behavior is determined by the
implementation of the underlying crypt()
system call.
The use of ENCRYPT()
with the
ucs2
, utf16
,
utf16le
, or utf32
multi-byte character sets is not recommended because the
system call expects a string terminated by a zero byte.
If crypt()
is not available on your
system (as is the case with Windows),
ENCRYPT()
always returns
NULL
.
Calculates an MD5 128-bit checksum for the string. The value
is returned as a string of 32 hex digits, or
NULL
if the argument was
NULL
. The return value can, for example, be
used as a hash key. See the notes at the beginning of this
section about storing hash values efficiently.
The return value is a nonbinary string in the connection character set.
mysql> SELECT MD5('testing');
-> 'ae2b1fca515949e5d54fb22b8ed95575'
This is the “RSA Data Security, Inc. MD5 Message-Digest Algorithm.”
See the note regarding the MD5 algorithm at the beginning this section.
OLD_PASSWORD()
was added when
the implementation of
PASSWORD()
was changed in MySQL
4.1 to improve security.
OLD_PASSWORD()
returns the
value of the pre-4.1 implementation of
PASSWORD()
as a string, and is
intended to permit you to reset passwords for any pre-4.1
clients that need to connect to your version 5.7
MySQL server without locking them out. See
Section 6.1.2.4, “Password Hashing in MySQL”.
The return value is a nonbinary string in the connection character set.
Passwords that use the pre-4.1 hashing method are less secure than passwords that use the native password hashing method and should be avoided. Pre-4.1 passwords are deprecated and support for them will be removed in a future MySQL release.
Calculates and returns a hashed password string from the
cleartext password str
and returns
a nonbinary string in the connection character set, or
NULL
if the argument is
NULL
. This function is the SQL interface to
the algorithm used by the server to encrypt MySQL passwords
for storage in the mysql.user
grant table.
The old_passwords
system
variable controls the password hashing method used by
PASSWORD()
(as well as for the
IDENTIFIED BY
clause of the
CREATE USER
and
GRANT
statements).
The following table shows the permitted values of
old_passwords
, the password
hashing method for each value, and which authentication
plugins use passwords hashed with each method.
Value | Password Hashing Format | Intended Use |
---|---|---|
0 | MySQL 4.1 native hashing | Accounts that authenticate with the
mysql_native_password plugin |
1 | Pre-4.1 (“old”) hashing | Accounts that authenticate with the
mysql_old_password plugin |
2 | SHA-256 hashing | Accounts that authenticate with the sha256_password
plugin |
If old_passwords=1
,
PASSWORD('
returns the same value as
str
')OLD_PASSWORD('
.
The latter function is not affected by the value of
str
')old_passwords
.
Examples:
mysql>SET old_passwords = 0;
mysql>SELECT PASSWORD('mypass');
+-------------------------------------------+ | PASSWORD('mypass') | +-------------------------------------------+ | *6C8989366EAF75BB670AD8EA7A7FC1176A95CEF4 | +-------------------------------------------+ mysql>SET old_passwords = 1;
mysql>SELECT PASSWORD('mypass');
+--------------------+ | PASSWORD('mypass') | +--------------------+ | 6f8c114b58f2ce9e | +--------------------+
SHA-256 password hashing
(old_passwords=2
) uses a
random salt value, which makes the result from
PASSWORD()
nondeterministic.
Consequently, statements that use this function are not safe
for statement-based replication and cannot be stored in the
query cache.
Encryption performed by
PASSWORD()
is one-way (not
reversible). It is not the same type of encryption as used for
Unix passwords; for that, use
ENCRYPT()
.
The PASSWORD()
function is
used by the authentication system in MySQL Server; you
should not use it in your own
applications. For that purpose, consider
MD5()
or
SHA2()
instead. Also see
RFC 2195,
section 2 (Challenge-Response Authentication Mechanism
(CRAM)), for more information about handling
passwords and authentication securely in your applications.
Passwords that use the pre-4.1 hashing method are less secure than passwords that use the native password hashing method and should be avoided. Pre-4.1 passwords are deprecated and support for them will be removed in a future MySQL release.
Statements that invoke
PASSWORD()
may be recorded in
server logs or in a history file such as
~/.mysql_history
, which means that
cleartext passwords may be read by anyone having read access
to that information. See
Section 6.1.2, “Keeping Passwords Secure”.
Calculates an SHA-1 160-bit checksum for the string, as
described in RFC 3174 (Secure Hash Algorithm). The value is
returned as a string of 40 hex digits, or
NULL
if the argument was
NULL
. One of the possible uses for this
function is as a hash key. See the notes at the beginning of
this section about storing hash values efficiently. You can
also use SHA1()
as a
cryptographic function for storing passwords.
SHA()
is
synonymous with SHA1()
.
The return value is a nonbinary string in the connection character set.
mysql> SELECT SHA1('abc');
-> 'a9993e364706816aba3e25717850c26c9cd0d89d'
SHA1()
can be considered a
cryptographically more secure equivalent of
MD5()
. However, see the note
regarding the MD5 and SHA-1 algorithms at the beginning this
section.
Calculates the SHA-2 family of hash functions (SHA-224,
SHA-256, SHA-384, and SHA-512). The first argument is the
cleartext string to be hashed. The second argument indicates
the desired bit length of the result, which must have a value
of 224, 256, 384, 512, or 0 (which is equivalent to 256). If
either argument is NULL
or the hash length
is not one of the permitted values, the return value is
NULL
. Otherwise, the function result is a
hash value containing the desired number of bits. See the
notes at the beginning of this section about storing hash
values efficiently.
The return value is a nonbinary string in the connection character set.
mysql> SELECT SHA2('abc', 224);
-> '23097d223405d8228642a477bda255b32aadbce4bda0b3f7e36c9da7'
This function works only if MySQL has been configured with SSL support. See Section 6.3.9, “Using SSL for Secure Connections”.
SHA2()
can be considered
cryptographically more secure than
MD5()
or
SHA1()
.
UNCOMPRESS(
string_to_uncompress
)
Uncompresses a string compressed by the
COMPRESS()
function. If the
argument is not a compressed value, the result is
NULL
. This function requires MySQL to have
been compiled with a compression library such as
zlib
. Otherwise, the return value is always
NULL
.
mysql>SELECT UNCOMPRESS(COMPRESS('any string'));
-> 'any string' mysql>SELECT UNCOMPRESS('any string');
-> NULL
UNCOMPRESSED_LENGTH(
compressed_string
)
Returns the length that the compressed string had before being compressed.
mysql> SELECT UNCOMPRESSED_LENGTH(COMPRESS(REPEAT('a',30)));
-> 30
VALIDATE_PASSWORD_STRENGTH(
str
)
Given an argument representing a cleartext password, this function returns an integer to indicate how strong the password is. The return value ranges from 0 (weak) to 100 (strong).
The password is subjected to increasingly strict tests and the return value reflects which tests were satisfied, as shown in the following table.
Password Test | Return Value |
---|---|
Length < 4 | 0 |
Length ≥ 4 and <
validate_password_length | 25 |
Satisfies policy 1 (LOW ) | 50 |
Satisfies policy 2 (MEDIUM ) | 75 |
Satisfies policy 3 (STRONG ) | 100 |
Password assessment by
VALIDATE_PASSWORD_STRENGTH()
is
done by the validate_password
plugin. If
that plugin is not installed, the function always returns 0.
For information about installing the
validate_password
plugin, see
Section 6.1.2.6, “The Password Validation Plugin”. To examine or
configure the parameters that affect password testing, check
or set the system variables implemented by
validate_password
plugin. See
Section 6.1.2.6.2, “Password Validation Plugin Options and Variables”.
Table 12.18. Information Functions
Name | Description |
---|---|
BENCHMARK() | Repeatedly execute an expression |
CHARSET() | Return the character set of the argument |
COERCIBILITY() | Return the collation coercibility value of the string argument |
COLLATION() | Return the collation of the string argument |
CONNECTION_ID() | Return the connection ID (thread ID) for the connection |
CURRENT_USER() , CURRENT_USER | The authenticated user name and host name |
DATABASE() | Return the default (current) database name |
FOUND_ROWS() | For a SELECT with a LIMIT clause, the number of rows that would be returned were there no LIMIT clause |
LAST_INSERT_ID() | Value of the AUTOINCREMENT column for the last INSERT |
ROW_COUNT() | The number of rows updated |
SCHEMA() | A synonym for DATABASE() |
SESSION_USER() | Synonym for USER() |
SYSTEM_USER() | Synonym for USER() |
USER() | The user name and host name provided by the client |
VERSION() | Returns a string that indicates the MySQL server version |
The BENCHMARK()
function
executes the expression expr
repeatedly count
times. It may be
used to time how quickly MySQL processes the expression. The
result value is always 0
. The intended use
is from within the mysql client, which
reports query execution times:
mysql> SELECT BENCHMARK(1000000,ENCODE('hello','goodbye'));
+----------------------------------------------+
| BENCHMARK(1000000,ENCODE('hello','goodbye')) |
+----------------------------------------------+
| 0 |
+----------------------------------------------+
1 row in set (4.74 sec)
The time reported is elapsed time on the client end, not CPU
time on the server end. It is advisable to execute
BENCHMARK()
several times, and
to interpret the result with regard to how heavily loaded the
server machine is.
BENCHMARK()
is intended for
measuring the runtime performance of scalar expressions, which
has some significant implications for the way that you use it
and interpret the results:
Only scalar expressions can be used. Although the
expression can be a subquery, it must return a single
column and at most a single row. For example,
BENCHMARK(10, (SELECT * FROM
t))
will fail if the table t
has more than one column or more than one row.
Executing a SELECT
statement
expr
N
times differs from executing
SELECT BENCHMARK(
in terms of the
amount of overhead involved. The two have very different
execution profiles and you should not expect them to take
the same amount of time. The former involves the parser,
optimizer, table locking, and runtime evaluation
N
,
expr
)N
times each. The latter
involves only runtime evaluation
N
times, and all the other
components just once. Memory structures already allocated
are reused, and runtime optimizations such as local
caching of results already evaluated for aggregate
functions can alter the results. Use of
BENCHMARK()
thus measures
performance of the runtime component by giving more weight
to that component and removing the “noise”
introduced by the network, parser, optimizer, and so
forth.
Returns the character set of the string argument.
mysql>SELECT CHARSET('abc');
-> 'latin1' mysql>SELECT CHARSET(CONVERT('abc' USING utf8));
-> 'utf8' mysql>SELECT CHARSET(USER());
-> 'utf8'
Returns the collation coercibility value of the string argument.
mysql>SELECT COERCIBILITY('abc' COLLATE latin1_swedish_ci);
-> 0 mysql>SELECT COERCIBILITY(USER());
-> 3 mysql>SELECT COERCIBILITY('abc');
-> 4
The return values have the meanings shown in the following table. Lower values have higher precedence.
Coercibility | Meaning | Example |
---|---|---|
0 | Explicit collation | Value with COLLATE clause |
1 | No collation | Concatenation of strings with different collations |
2 | Implicit collation | Column value, stored routine parameter or local variable |
3 | System constant | USER() return value |
4 | Coercible | Literal string |
5 | Ignorable | NULL or an expression derived from
NULL |
Returns the collation of the string argument.
mysql>SELECT COLLATION('abc');
-> 'latin1_swedish_ci' mysql>SELECT COLLATION(_utf8'abc');
-> 'utf8_general_ci'
Returns the connection ID (thread ID) for the connection. Every connection has an ID that is unique among the set of currently connected clients.
mysql> SELECT CONNECTION_ID();
-> 23786
Returns the user name and host name combination for the MySQL
account that the server used to authenticate the current
client. This account determines your access privileges. The
return value is a string in the utf8
character set.
The value of CURRENT_USER()
can
differ from the value of
USER()
.
mysql>SELECT USER();
-> 'davida@localhost' mysql>SELECT * FROM mysql.user;
ERROR 1044: Access denied for user ''@'localhost' to database 'mysql' mysql>SELECT CURRENT_USER();
-> '@localhost'
The example illustrates that although the client specified a
user name of davida
(as indicated by the
value of the USER()
function),
the server authenticated the client using an anonymous user
account (as seen by the empty user name part of the
CURRENT_USER()
value). One way
this might occur is that there is no account listed in the
grant tables for davida
.
Within a stored program or view,
CURRENT_USER()
returns the
account for the user who defined the object (as given by its
DEFINER
value) unless defined with the
SQL SECURITY INVOKER
characteristic. In the
latter case, CURRENT_USER()
returns the object's invoker.
Triggers and events have no option to define the SQL
SECURITY
characteristic, so for these objects,
CURRENT_USER()
returns the
account for the user who defined the object. To return the
invoker, use USER()
or
SESSION_USER()
.
The following statements support use of the
CURRENT_USER()
function to take
the place of the name of (and, possibly, a host for) an
affected user or a definer; in such cases,
CURRENT_USER()
is expanded
where and as needed:
For information about the implications that this expansion of
CURRENT_USER()
has for
replication in different releases of MySQL 5.7,
see Section 16.4.1.7, “Replication of CURRENT_USER()
”.
Returns the default (current) database name as a string in the
utf8
character set. If there is no default
database, DATABASE()
returns
NULL
. Within a stored routine, the default
database is the database that the routine is associated with,
which is not necessarily the same as the database that is the
default in the calling context.
mysql> SELECT DATABASE();
-> 'test'
If there is no default database,
DATABASE()
returns
NULL
.
A SELECT
statement may include
a LIMIT
clause to restrict the number of
rows the server returns to the client. In some cases, it is
desirable to know how many rows the statement would have
returned without the LIMIT
, but without
running the statement again. To obtain this row count, include
a SQL_CALC_FOUND_ROWS
option in the
SELECT
statement, and then
invoke FOUND_ROWS()
afterward:
mysql>SELECT SQL_CALC_FOUND_ROWS * FROM
->tbl_name
WHERE id > 100 LIMIT 10;
mysql>SELECT FOUND_ROWS();
The second SELECT
returns a
number indicating how many rows the first
SELECT
would have returned had
it been written without the LIMIT
clause.
In the absence of the SQL_CALC_FOUND_ROWS
option in the most recent successful
SELECT
statement,
FOUND_ROWS()
returns the number
of rows in the result set returned by that statement. If the
statement includes a LIMIT
clause,
FOUND_ROWS()
returns the number
of rows up to the limit. For example,
FOUND_ROWS()
returns 10 or 60,
respectively, if the statement includes LIMIT
10
or LIMIT 50, 10
.
The row count available through
FOUND_ROWS()
is transient and
not intended to be available past the statement following the
SELECT SQL_CALC_FOUND_ROWS
statement. If
you need to refer to the value later, save it:
mysql>SELECT SQL_CALC_FOUND_ROWS * FROM ... ;
mysql>SET @rows = FOUND_ROWS();
If you are using SELECT
SQL_CALC_FOUND_ROWS
, MySQL must calculate how many
rows are in the full result set. However, this is faster than
running the query again without LIMIT
,
because the result set need not be sent to the client.
SQL_CALC_FOUND_ROWS
and
FOUND_ROWS()
can be useful in
situations when you want to restrict the number of rows that a
query returns, but also determine the number of rows in the
full result set without running the query again. An example is
a Web script that presents a paged display containing links to
the pages that show other sections of a search result. Using
FOUND_ROWS()
enables you to
determine how many other pages are needed for the rest of the
result.
The use of SQL_CALC_FOUND_ROWS
and
FOUND_ROWS()
is more complex
for UNION
statements than for
simple SELECT
statements,
because LIMIT
may occur at multiple places
in a UNION
. It may be applied
to individual SELECT
statements
in the UNION
, or global to the
UNION
result as a whole.
The intent of SQL_CALC_FOUND_ROWS
for
UNION
is that it should return
the row count that would be returned without a global
LIMIT
. The conditions for use of
SQL_CALC_FOUND_ROWS
with
UNION
are:
The SQL_CALC_FOUND_ROWS
keyword must
appear in the first SELECT
of the UNION
.
The value of FOUND_ROWS()
is exact only if
UNION ALL
is used. If UNION
without
ALL
is used, duplicate removal occurs
and the value of
FOUND_ROWS()
is only
approximate.
If no LIMIT
is present in the
UNION
,
SQL_CALC_FOUND_ROWS
is ignored and
returns the number of rows in the temporary table that is
created to process the
UNION
.
Beyond the cases described here, the behavior of
FOUND_ROWS()
is undefined (for
example, its value following a
SELECT
statement that fails
with an error).
FOUND_ROWS()
is not
replicated reliably using statement-based replication. This
function is automatically replicated using row-based
replication.
LAST_INSERT_ID()
,
LAST_INSERT_ID(
expr
)
With no argument,
LAST_INSERT_ID()
returns a
BIGINT UNSIGNED
(64-bit) value representing
the first automatically generated value successfully inserted
for an AUTO_INCREMENT
column as a result of
the most recently executed
INSERT
statement. The value of
LAST_INSERT_ID()
remains
unchanged if no rows are successfully inserted.
With an argument,
LAST_INSERT_ID()
returns an
unsigned integer.
For example, after inserting a row that generates an
AUTO_INCREMENT
value, you can get the value
like this:
mysql> SELECT LAST_INSERT_ID();
-> 195
The currently executing statement does not affect the value of
LAST_INSERT_ID()
. Suppose that
you generate an AUTO_INCREMENT
value with
one statement, and then refer to
LAST_INSERT_ID()
in a
multiple-row INSERT
statement
that inserts rows into a table with its own
AUTO_INCREMENT
column. The value of
LAST_INSERT_ID()
will remain
stable in the second statement; its value for the second and
later rows is not affected by the earlier row insertions.
(However, if you mix references to
LAST_INSERT_ID()
and
LAST_INSERT_ID(
,
the effect is undefined.)
expr
)
If the previous statement returned an error, the value of
LAST_INSERT_ID()
is undefined.
For transactional tables, if the statement is rolled back due
to an error, the value of
LAST_INSERT_ID()
is left
undefined. For manual
ROLLBACK
,
the value of LAST_INSERT_ID()
is not restored to that before the transaction; it remains as
it was at the point of the
ROLLBACK
.
Within the body of a stored routine (procedure or function) or
a trigger, the value of
LAST_INSERT_ID()
changes the
same way as for statements executed outside the body of these
kinds of objects. The effect of a stored routine or trigger
upon the value of
LAST_INSERT_ID()
that is seen
by following statements depends on the kind of routine:
If a stored procedure executes statements that change the
value of LAST_INSERT_ID()
,
the changed value is seen by statements that follow the
procedure call.
For stored functions and triggers that change the value, the value is restored when the function or trigger ends, so following statements will not see a changed value.
The ID that was generated is maintained in the server on a
per-connection basis. This means that the
value returned by the function to a given client is the first
AUTO_INCREMENT
value generated for most
recent statement affecting an
AUTO_INCREMENT
column by that
client. This value cannot be affected by other
clients, even if they generate
AUTO_INCREMENT
values of their own. This
behavior ensures that each client can retrieve its own ID
without concern for the activity of other clients, and without
the need for locks or transactions.
The value of LAST_INSERT_ID()
is not changed if you set the
AUTO_INCREMENT
column of a row to a
non-“magic” value (that is, a value that is not
NULL
and not 0
).
If you insert multiple rows using a single
INSERT
statement,
LAST_INSERT_ID()
returns the
value generated for the first inserted
row only. The reason for this is to
make it possible to reproduce easily the same
INSERT
statement against some
other server.
For example:
mysql>USE test;
Database changed mysql>CREATE TABLE t (
->id INT AUTO_INCREMENT NOT NULL PRIMARY KEY,
->name VARCHAR(10) NOT NULL
->);
Query OK, 0 rows affected (0.09 sec) mysql>INSERT INTO t VALUES (NULL, 'Bob');
Query OK, 1 row affected (0.01 sec) mysql>SELECT * FROM t;
+----+------+ | id | name | +----+------+ | 1 | Bob | +----+------+ 1 row in set (0.01 sec) mysql>SELECT LAST_INSERT_ID();
+------------------+ | LAST_INSERT_ID() | +------------------+ | 1 | +------------------+ 1 row in set (0.00 sec) mysql>INSERT INTO t VALUES
->(NULL, 'Mary'), (NULL, 'Jane'), (NULL, 'Lisa');
Query OK, 3 rows affected (0.00 sec) Records: 3 Duplicates: 0 Warnings: 0 mysql> SELECT * FROM t; +----+------+ | id | name | +----+------+ | 1 | Bob | | 2 | Mary | | 3 | Jane | | 4 | Lisa | +----+------+ 4 rows in set (0.01 sec) mysql>SELECT LAST_INSERT_ID();
+------------------+ | LAST_INSERT_ID() | +------------------+ | 2 | +------------------+ 1 row in set (0.00 sec)
Although the second INSERT
statement inserted three new rows into t
,
the ID generated for the first of these rows was
2
, and it is this value that is returned by
LAST_INSERT_ID()
for the
following SELECT
statement.
If you use INSERT
IGNORE
and the row is ignored, the
AUTO_INCREMENT
counter is not incremented
and LAST_INSERT_ID()
returns
0
, which reflects that no row was inserted.
If expr
is given as an argument to
LAST_INSERT_ID()
, the value of
the argument is returned by the function and is remembered as
the next value to be returned by
LAST_INSERT_ID()
. This can be
used to simulate sequences:
Create a table to hold the sequence counter and initialize it:
mysql>CREATE TABLE sequence (id INT NOT NULL);
mysql>INSERT INTO sequence VALUES (0);
Use the table to generate sequence numbers like this:
mysql>UPDATE sequence SET id=LAST_INSERT_ID(id+1);
mysql>SELECT LAST_INSERT_ID();
The UPDATE
statement
increments the sequence counter and causes the next call
to LAST_INSERT_ID()
to
return the updated value. The
SELECT
statement retrieves
that value. The
mysql_insert_id()
C API
function can also be used to get the value. See
Section 21.8.7.37, “mysql_insert_id()
”.
You can generate sequences without calling
LAST_INSERT_ID()
, but the
utility of using the function this way is that the ID value is
maintained in the server as the last automatically generated
value. It is multi-user safe because multiple clients can
issue the UPDATE
statement and
get their own sequence value with the
SELECT
statement (or
mysql_insert_id()
), without
affecting or being affected by other clients that generate
their own sequence values.
Note that mysql_insert_id()
is
only updated after INSERT
and
UPDATE
statements, so you
cannot use the C API function to retrieve the value for
LAST_INSERT_ID(
after executing other SQL statements like
expr
)SELECT
or
SET
.
In MySQL 5.7, ROW_COUNT()
returns a value as follows:
DDL statements: 0. This applies to statements such as
CREATE TABLE
or
DROP TABLE
.
DML statements other than
SELECT
: The number of
affected rows. This applies to statements such as
UPDATE
,
INSERT
, or
DELETE
(as before), but now
also to statements such as ALTER
TABLE
and
LOAD DATA
INFILE
.
SELECT
: -1 if the statement
returns a result set, or the number of rows
“affected” if it does not. For example, for
SELECT * FROM t1
,
ROW_COUNT()
returns -1. For
SELECT * FROM t1 INTO OUTFILE
'
,
file_name
'ROW_COUNT()
returns the
number of rows written to the file.
SIGNAL
statements: 0.
For UPDATE
statements, the
affected-rows value by default is the number of rows actually
changed. If you specify the
CLIENT_FOUND_ROWS
flag to
mysql_real_connect()
when
connecting to mysqld, the affected-rows
value is the number of rows “found”; that is,
matched by the WHERE
clause.
For REPLACE
statements, the
affected-rows value is 2 if the new row replaced an old row,
because in this case, one row was inserted after the duplicate
was deleted.
For
INSERT
... ON DUPLICATE KEY UPDATE
statements, the
affected-rows value per row is 1 if the row is inserted as a
new row, 2 if an existing row is updated, and 0 if an existing
row is set to its current values. If you specify the
CLIENT_FOUND_ROWS
flag, the affected-rows
value is 1 (not 0) if an existing row is set to its current
values.
The ROW_COUNT()
value is
similar to the value from the
mysql_affected_rows()
C API
function and the row count that the mysql
client displays following statement execution.
mysql>INSERT INTO t VALUES(1),(2),(3);
Query OK, 3 rows affected (0.00 sec) Records: 3 Duplicates: 0 Warnings: 0 mysql>SELECT ROW_COUNT();
+-------------+ | ROW_COUNT() | +-------------+ | 3 | +-------------+ 1 row in set (0.00 sec) mysql>DELETE FROM t WHERE i IN(1,2);
Query OK, 2 rows affected (0.00 sec) mysql>SELECT ROW_COUNT();
+-------------+ | ROW_COUNT() | +-------------+ | 2 | +-------------+ 1 row in set (0.00 sec)
ROW_COUNT()
is not replicated
reliably using statement-based replication. This function is
automatically replicated using row-based replication.
This function is a synonym for
DATABASE()
.
SESSION_USER()
is a synonym for
USER()
.
SYSTEM_USER()
is a synonym for
USER()
.
Returns the current MySQL user name and host name as a string
in the utf8
character set.
mysql> SELECT USER();
-> 'davida@localhost'
The value indicates the user name you specified when
connecting to the server, and the client host from which you
connected. The value can be different from that of
CURRENT_USER()
.
You can extract only the user name part like this:
mysql> SELECT SUBSTRING_INDEX(USER(),'@',1);
-> 'davida'
Returns a string that indicates the MySQL server version. The
string uses the utf8
character set. The
value might have a suffix in addition to the version number.
See the description of the
version
system variable in
Section 5.1.4, “Server System Variables”.
This function is unsafe for statement-based replication. In
MySQL 5.7, a warning is logged if you use this
function when binlog_format
is set to STATEMENT
. (Bug #47995)
mysql> SELECT VERSION();
-> '5.7.3-standard'
The functions described in this section are used with GTID-based replication. For more information about GTIDs and how these functions are used, see Section 16.1.3, “Replication with Global Transaction Identifiers”.
The union of two GTID sets is simply their string representations joined together with an interposed comma. In other words, you can define a very simple function for obtaining the union of two GTID sets like this:
CREATE FUNCTION GTID_UNION(g1 TEXT, g2 TEXT) RETURNS TEXT DETERMINISTIC RETURN CONCAT(g1,',',g2);
Table 12.19. GTID Functions
Name | Description |
---|---|
GTID_SUBSET() | Return true if all GTIDs in subset are also in set; otherwise false. |
GTID_SUBTRACT() | Return all GTIDs in set that are not in subset. |
WAIT_UNTIL_SQL_THREAD_AFTER_GTIDS() | Wait until the slave SQL thread has executed all the given GTIDs. Returns: the number of events that were executed (or NULL, if GTID mode is not enabled). |
Given two sets of global transaction IDs
subset
and
set
, returns true if all GTIDs in
subset
are also in
set
. Returns false otherwise.
Given two sets of global transaction IDs
subset
and
set
, returns only those GTIDs from
set
that are not in
subset
.
WAIT_UNTIL_SQL_THREAD_AFTER_GTIDS(
gtid_set
[,
timeout
])
Wait until the slave SQL thread has executed all of the
transactions whose global transaction identifiers are
contained in gtid_set
(see
Section 16.1.3.1, “GTID Concepts”, for a definition
of “GTID sets”), or until
timeout
seconds have elapsed,
whichever occurs first. timeout
is
optional; the default timeout is 0 seconds, in which case the
master simply waits until all of the transactions in the GTID
set have been executed.
For more information, see Section 16.1.3, “Replication with Global Transaction Identifiers”.
mysql> SELECT WAIT_UNTIL_SQL_THREAD_AFTER_GTIDS('3E11FA47-71CA-11E1-9E33-C80AA9429562:1-5');
-> 5
The return value is the number of transactional events that
were executed. If GTID-based replication is not active (that
is, if the value of the
gtid_mode
variable is
OFF
), then this value is undefined and thus
WAIT_UNTIL_SQL_THREAD_AFTER_GTIDS()
returns
NULL
.
Table 12.20. Miscellaneous Functions
Name | Description |
---|---|
DEFAULT() | Return the default value for a table column |
GET_LOCK() | Get a named lock |
INET_ATON() | Return the numeric value of an IP address |
INET_NTOA() | Return the IP address from a numeric value |
INET6_ATON() | Return the numeric value of an IPv6 address |
INET6_NTOA() | Return the IPv6 address from a numeric value |
IS_FREE_LOCK() | Checks whether the named lock is free |
IS_IPV4_COMPAT() | Return true if argument is an IPv4-compatible address |
IS_IPV4_MAPPED() | Return true if argument is an IPv4-mapped address |
IS_IPV4() | Return true if argument is an IPv4 address |
IS_IPV6() | Return true if argument is an IPv6 address |
IS_USED_LOCK() | Checks whether the named lock is in use. Return connection identifier if true. |
MASTER_POS_WAIT() | Block until the slave has read and applied all updates up to the specified position |
NAME_CONST() | Causes the column to have the given name |
RAND() | Return a random floating-point value |
RELEASE_LOCK() | Releases the named lock |
SLEEP() | Sleep for a number of seconds |
UUID_SHORT() | Return an integer-valued universal identifier |
UUID() | Return a Universal Unique Identifier (UUID) |
VALUES() | Defines the values to be used during an INSERT |
Returns the default value for a table column. An error results if the column has no default value.
mysql> UPDATE t SET i = DEFAULT(i)+1 WHERE id < 100;
Formats the number X
to a format
like '#,###,###.##'
, rounded to
D
decimal places, and returns the
result as a string. For details, see
Section 12.5, “String Functions”.
Tries to obtain a lock with a name given by the string
str
, using a timeout of
timeout
seconds. Returns
1
if the lock was obtained successfully,
0
if the attempt timed out (for example,
because another client has previously locked the name), or
NULL
if an error occurred (such as running
out of memory or the thread was killed with
mysqladmin kill). If you have a lock
obtained with GET_LOCK()
, it is
released when you execute
RELEASE_LOCK()
, execute a new
GET_LOCK()
, or your connection
terminates (either normally or abnormally). Locks obtained
with GET_LOCK()
do not interact
with transactions. That is, committing a transaction does not
release any such locks obtained during the transaction.
This function can be used to implement application locks or to
simulate record locks. Names are locked on a server-wide
basis. If a name has been locked by one client,
GET_LOCK()
blocks any request
by another client for a lock with the same name. This enables
clients that agree on a given lock name to use the name to
perform cooperative advisory locking. But be aware that it
also enables a client that is not among the set of cooperating
clients to lock a name, either inadvertently or deliberately,
and thus prevent any of the cooperating clients from locking
that name. One way to reduce the likelihood of this is to use
lock names that are database-specific or application-specific.
For example, use lock names of the form
db_name.str
or
app_name.str
.
mysql>SELECT GET_LOCK('lock1',10);
-> 1 mysql>SELECT IS_FREE_LOCK('lock2');
-> 1 mysql>SELECT GET_LOCK('lock2',10);
-> 1 mysql>SELECT RELEASE_LOCK('lock2');
-> 1 mysql>SELECT RELEASE_LOCK('lock1');
-> NULL
The second RELEASE_LOCK()
call
returns NULL
because the lock
'lock1'
was automatically released by the
second GET_LOCK()
call.
If multiple clients are waiting for a lock, the order in which they will acquire it is undefined and depends on factors such as the thread library in use. In particular, applications should not assume that clients will acquire the lock in the same order that they issued the lock requests.
This function is unsafe for statement-based replication. In
MySQL 5.7, a warning is logged if you use this
function when binlog_format
is set to STATEMENT
. (Bug #47995)
Given the dotted-quad representation of an IPv4 network
address as a string, returns an integer that represents the
numeric value of the address in network byte order (big
endian). INET_ATON()
returns
NULL
if it does not understand its
argument.
mysql> SELECT INET_ATON('10.0.5.9');
-> 167773449
For this example, the return value is calculated as 10×2563 + 0×2562 + 5×256 + 9.
INET_ATON()
may or may not
return a non-NULL
result for short-form IP
addresses (such as '127.1'
as a
representation of '127.0.0.1'
). Because of
this, INET_ATON()
a should not
be used for such addresses.
To store values generated by
INET_ATON()
, use an
INT UNSIGNED
column rather than
INT
, which is signed. If you
use a signed column, values corresponding to IP addresses
for which the first octet is greater than 127 cannot be
stored correctly. See
Section 11.2.6, “Out-of-Range and Overflow Handling”.
Given a numeric IPv4 network address in network byte order,
returns the dotted-quad string representation of the address
as a nonbinary string in the connection character set.
INET_NTOA()
returns
NULL
if it does not understand its
argument.
mysql> SELECT INET_NTOA(167773449);
-> '10.0.5.9'
Given an IPv6 or IPv4 network address as a string, returns a
binary string that represents the numeric value of the address
in network byte order (big endian). Because numeric-format
IPv6 addresses require more bytes than the largest integer
type, the representation returned by this function has the
VARBINARY
data type:
VARBINARY(16)
for IPv6
addresses and VARBINARY(4)
for
IPv4 addresses. If the argument is not a valid address,
INET6_ATON()
returns
NULL
.
The following examples use
HEX()
to display the
INET6_ATON()
result in
printable form:
mysql>SELECT HEX(INET6_ATON('fdfe::5a55:caff:fefa:9089'));
-> 'FDFE0000000000005A55CAFFFEFA9089' mysql>SELECT HEX(INET6_ATON('10.0.5.9'));
-> '0A000509'
INET6_ATON()
observes several constraints
on valid arguments. These are given in the following list
along with examples.
A trailing zone ID is not permitted, as in
fe80::3%1
or
fe80::3%eth0
.
A trailing network mask is not permitted, as in
2001:45f:3:ba::/64
or
192.168.1.0/24
.
For values representing IPv4 addresses, only classless
addresses are supported. Classful addresses such as
192.168.1
are rejected. A trailing port
number is not permitted, as in
192.168.1.2:8080
. Hexadecimal numbers
in address components are not permitted, as in
192.0xa0.1.2
. Octal numbers are not
supported: 192.168.010.1
is treated as
192.168.10.1
, not
192.168.8.1
. These IPv4 constraints
also apply to IPv6 addresses that have IPv4 address parts,
such as IPv4-compatible or IPv4-mapped addresses.
To convert an IPv4 address expr
represented in numeric form as an
INT
value to an IPv6 address
represented in numeric form as a
VARBINARY
value, use this
expression:
INET6_ATON(INET_NTOA(expr
))
For example:
mysql> SELECT HEX(INET6_ATON(INET_NTOA(167773449)));
-> '0A000509'
Given an IPv6 or IPv4 network address represented in numeric
form as a binary string, returns the string representation of
the address as a nonbinary string in the connection character
set. If the argument is not a valid address,
INET6_NTOA()
returns
NULL
.
INET6_NTOA()
has these
properties:
It does not use operating system functions to perform conversions, thus the output string is platform independent.
The return string has a maximum length of 39 (4 x 8 + 7). Given this statement:
CREATE TABLE t AS SELECT INET6_NTOA(expr
) AS c1;
The resulting table would have this definition:
CREATE TABLE t (c1 VARCHAR(39) CHARACTER SET utf8 DEFAULT NULL);
The return string uses lowercase letters for IPv6 addresses.
mysql>SELECT INET6_NTOA(INET6_ATON('fdfe::5a55:caff:fefa:9089'));
-> 'fdfe::5a55:caff:fefa:9089' mysql>SELECT INET6_NTOA(INET6_ATON('10.0.5.9'));
-> '10.0.5.9' mysql>SELECT INET6_NTOA(UNHEX('FDFE0000000000005A55CAFFFEFA9089'));
-> 'fdfe::5a55:caff:fefa:9089' mysql>SELECT INET6_NTOA(UNHEX('0A000509'));
-> '10.0.5.9'
Checks whether the lock named str
is free to use (that is, not locked). Returns
1
if the lock is free (no one is using the
lock), 0
if the lock is in use, and
NULL
if an error occurs (such as an
incorrect argument).
This function is unsafe for statement-based replication. In
MySQL 5.7, a warning is logged if you use this
function when binlog_format
is set to STATEMENT
. (Bug #47995)
Returns 1 if the argument is a valid IPv4 address specified as a string, 0 otherwise.
mysql> SELECT IS_IPV4('10.0.5.9'), IS_IPV4('10.0.5.256');
-> 1, 0
For a given argument, if
IS_IPV4()
returns 1,
INET_ATON()
(and
INET6_ATON()
) will return
non-NULL
. The converse statement is not
true: In some cases,
INET_ATON()
returns
non-NULL
when
IS_IPV4()
returns 0.
As implied by the preceding remarks,
IS_IPV4()
is more strict than
INET_ATON()
about what
constitutes a valid IPv4 address, so it may be useful for
applications that need to perform strong checks against
invalid values. Alternatively, use
INET6_ATON()
to convert IPv4
addresses to internal form and check for a
NULL
result (which indicates an invalid
address). INET6_ATON()
is
equally strong as IS_IPV4()
about checking IPv4 addresses.
This function takes an IPv6 address represented in numeric
form as a binary string, as returned by
INET6_ATON()
. It returns 1 if
the argument is a valid IPv4-compatible IPv6 address, 0
otherwise. IPv4-compatible addresses have the form
::
.
ipv4_address
mysql>SELECT IS_IPV4_COMPAT(INET6_ATON('::10.0.5.9'));
-> 1 mysql>SELECT IS_IPV4_COMPAT(INET6_ATON('::ffff:10.0.5.9'));
-> 0
The IPv4 part of an IPv4-compatible address can also be
represented using hexadecimal notation. For example,
192.168.0.1
has this raw hexadecimal value:
mysql> SELECT HEX(INET6_ATON('192.168.0.1'));
-> 'C0A80001'
Expressed in IPv4-compatible form,
::192.168.0.1
is equivalent to
::c0a8:0001
or (without leading zeros)
::c0a8:1
mysql>SELECT
->IS_IPV4_COMPAT(INET6_ATON('::192.168.0.1')),
->IS_IPV4_COMPAT(INET6_ATON('::c0a8:0001')),
->IS_IPV4_COMPAT(INET6_ATON('::c0a8:1'));
-> 1, 1, 1
This function takes an IPv6 address represented in numeric
form as a binary string, as returned by
INET6_ATON()
. It returns 1 if
the argument is a valid IPv4-mapped IPv6 address, 0 otherwise.
IPv4-mapped addresses have the form
::ffff:
.
ipv4_address
mysql>SELECT IS_IPV4_MAPPED(INET6_ATON('::10.0.5.9'));
-> 0 mysql>SELECT IS_IPV4_MAPPED(INET6_ATON('::ffff:10.0.5.9'));
-> 1
As with IS_IPV4_COMPAT()
the IPv4 part of
an IPv4-mapped address can also be represented using
hexadecimal notation:
mysql>SELECT
->IS_IPV4_MAPPED(INET6_ATON('::ffff:192.168.0.1')),
->IS_IPV4_MAPPED(INET6_ATON('::ffff:c0a8:0001')),
->IS_IPV4_MAPPED(INET6_ATON('::ffff:c0a8:1'));
-> 1, 1, 1
Returns 1 if the argument is a valid IPv6 address specified as a string, 0 otherwise. This function does not consider IPv4 addresses to be valid IPv6 addresses.
mysql> SELECT IS_IPV6('10.0.5.9'), IS_IPV6('::1');
-> 0, 1
For a given argument, if
IS_IPV6()
returns 1,
INET6_ATON()
will return
non-NULL
.
Checks whether the lock named str
is in use (that is, locked). If so, it returns the connection
identifier of the client that holds the lock. Otherwise, it
returns NULL
.
This function is unsafe for statement-based replication. In
MySQL 5.7, a warning is logged if you use this
function when binlog_format
is set to STATEMENT
. (Bug #47995)
MASTER_POS_WAIT(
log_name
,log_pos
[,timeout
])
This function is useful for control of master/slave
synchronization. It blocks until the slave has read and
applied all updates up to the specified position in the master
log. The return value is the number of log events the slave
had to wait for to advance to the specified position. The
function returns NULL
if the slave SQL
thread is not started, the slave's master information is not
initialized, the arguments are incorrect, or an error occurs.
It returns -1
if the timeout has been
exceeded. If the slave SQL thread stops while
MASTER_POS_WAIT()
is waiting,
the function returns NULL
. If the slave is
past the specified position, the function returns immediately.
If a timeout
value is specified,
MASTER_POS_WAIT()
stops waiting
when timeout
seconds have elapsed.
timeout
must be greater than 0; a
zero or negative timeout
means no
timeout.
This function is unsafe for statement-based replication. In
MySQL 5.7, a warning is logged if you use this
function when binlog_format
is set to STATEMENT
. (Bug #47995)
Returns the given value. When used to produce a result set
column, NAME_CONST()
causes the
column to have the given name. The arguments should be
constants.
mysql> SELECT NAME_CONST('myname', 14);
+--------+
| myname |
+--------+
| 14 |
+--------+
This function is for internal use only. The server uses it when writing statements from stored programs that contain references to local program variables, as described in Section 18.7, “Binary Logging of Stored Programs”, You might see this function in the output from mysqlbinlog.
Releases the lock named by the string
str
that was obtained with
GET_LOCK()
. Returns
1
if the lock was released,
0
if the lock was not established by this
thread (in which case the lock is not released), and
NULL
if the named lock did not exist. The
lock does not exist if it was never obtained by a call to
GET_LOCK()
or if it has
previously been released.
The DO
statement is convenient
to use with RELEASE_LOCK()
. See
Section 13.2.3, “DO
Syntax”.
This function is unsafe for statement-based replication. In
MySQL 5.7, a warning is logged if you use this
function when binlog_format
is set to STATEMENT
. (Bug #47995)
Sleeps (pauses) for the number of seconds given by the
duration
argument, then returns 0.
If SLEEP()
is interrupted, it
returns 1. The duration may have a fractional part. If the
argument is NULL
or negative,
SLEEP()
produces a warning, or
an error in strict SQL mode.
This function is unsafe for statement-based replication. In
MySQL 5.7, a warning is logged if you use this
function when binlog_format
is set to STATEMENT
. (Bug #47995)
Returns a Universal Unique Identifier (UUID) generated according to “DCE 1.1: Remote Procedure Call” (Appendix A) CAE (Common Applications Environment) Specifications published by The Open Group in October 1997 (Document Number C706, http://www.opengroup.org/public/pubs/catalog/c706.htm).
A UUID is designed as a number that is globally unique in
space and time. Two calls to
UUID()
are expected to generate
two different values, even if these calls are performed on two
separate computers that are not connected to each other.
A UUID is a 128-bit number represented by a
utf8
string of five hexadecimal numbers in
aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee
format:
The first three numbers are generated from a timestamp.
The fourth number preserves temporal uniqueness in case the timestamp value loses monotonicity (for example, due to daylight saving time).
The fifth number is an IEEE 802 node number that provides spatial uniqueness. A random number is substituted if the latter is not available (for example, because the host computer has no Ethernet card, or we do not know how to find the hardware address of an interface on your operating system). In this case, spatial uniqueness cannot be guaranteed. Nevertheless, a collision should have very low probability.
Currently, the MAC address of an interface is taken into account only on FreeBSD and Linux. On other operating systems, MySQL uses a randomly generated 48-bit number.
mysql> SELECT UUID();
-> '6ccd780c-baba-1026-9564-0040f4311e29'
Although UUID()
values are
intended to be unique, they are not necessarily unguessable
or unpredictable. If unpredictability is required, UUID
values should be generated some other way.
UUID()
does not work with
statement-based replication.
Returns a “short” universal identifier as a
64-bit unsigned integer (rather than a string-form 128-bit
identifier as returned by the
UUID()
function).
The value of UUID_SHORT()
is
guaranteed to be unique if the following conditions hold:
The server_id
of the
current host is unique among your set of master and slave
servers
server_id
is between 0
and 255
You do not set back your system time for your server between mysqld restarts
You do not invoke
UUID_SHORT()
on average
more than 16 million times per second between
mysqld restarts
The UUID_SHORT()
return value
is constructed this way:
(server_id & 255) << 56 + (server_startup_time_in_seconds << 24) + incremented_variable++;
mysql> SELECT UUID_SHORT();
-> 92395783831158784
Note that UUID_SHORT()
does not
work with statement-based replication.
In an
INSERT
... ON DUPLICATE KEY UPDATE
statement, you can use
the
VALUES(
function in the col_name
)UPDATE
clause
to refer to column values from the
INSERT
portion of the
statement. In other words,
VALUES(
in the col_name
)UPDATE
clause refers to
the value of col_name
that would be
inserted, had no duplicate-key conflict occurred. This
function is especially useful in multiple-row inserts. The
VALUES()
function is meaningful
only in the ON DUPLICATE KEY UPDATE
clause
of INSERT
statements and
returns NULL
otherwise. See
Section 13.2.5.3, “INSERT ... ON
DUPLICATE KEY UPDATE
Syntax”.
mysql>INSERT INTO table (a,b,c) VALUES (1,2,3),(4,5,6)
->ON DUPLICATE KEY UPDATE c=VALUES(a)+VALUES(b);
Table 12.21. Aggregate (GROUP BY
)
Functions
Name | Description |
---|---|
AVG() | Return the average value of the argument |
BIT_AND() | Return bitwise and |
BIT_OR() | Return bitwise or |
BIT_XOR() | Return bitwise xor |
COUNT(DISTINCT) | Return the count of a number of different values |
COUNT() | Return a count of the number of rows returned |
GROUP_CONCAT() | Return a concatenated string |
MAX() | Return the maximum value |
MIN() | Return the minimum value |
STD() | Return the population standard deviation |
STDDEV_POP() | Return the population standard deviation |
STDDEV_SAMP() | Return the sample standard deviation |
STDDEV() | Return the population standard deviation |
SUM() | Return the sum |
VAR_POP() | Return the population standard variance |
VAR_SAMP() | Return the sample variance |
VARIANCE() | Return the population standard variance |
This section describes group (aggregate) functions that operate
on sets of values. Unless otherwise stated, group functions
ignore NULL
values.
If you use a group function in a statement containing no
GROUP BY
clause, it is equivalent to grouping
on all rows. For more information, see
Section 12.17.3, “MySQL Extensions to GROUP BY
”.
For numeric arguments, the variance and standard deviation
functions return a DOUBLE
value.
The SUM()
and
AVG()
functions return a
DECIMAL
value for exact-value
arguments (integer or DECIMAL
),
and a DOUBLE
value for
approximate-value arguments
(FLOAT
or
DOUBLE
).
The SUM()
and
AVG()
aggregate functions do not
work with temporal values. (They convert the values to numbers,
losing everything after the first nonnumeric character.) To work
around this problem, convert to numeric units, perform the
aggregate operation, and convert back to a temporal value.
Examples:
SELECT SEC_TO_TIME(SUM(TIME_TO_SEC(time_col
))) FROMtbl_name
; SELECT FROM_DAYS(SUM(TO_DAYS(date_col
))) FROMtbl_name
;
Functions such as SUM()
or
AVG()
that expect a numeric
argument cast the argument to a number if necessary. For
SET
or
ENUM
values, the cast operation
causes the underlying numeric value to be used.
Returns the average value of
. The
expr
DISTINCT
option can be used to return the
average of the distinct values of
expr
.
AVG()
returns
NULL
if there were no matching rows.
mysql>SELECT student_name, AVG(test_score)
->FROM student
->GROUP BY student_name;
Returns the bitwise AND
of all bits in
expr
. The calculation is
performed with 64-bit
(BIGINT
) precision.
This function returns
18446744073709551615
if there were no
matching rows. (This is the value of an unsigned
BIGINT
value with all bits
set to 1.)
Returns the bitwise OR
of all bits in
expr
. The calculation is
performed with 64-bit
(BIGINT
) precision.
This function returns 0
if there were no
matching rows.
Returns the bitwise XOR
of all
bits in expr
. The calculation is
performed with 64-bit
(BIGINT
) precision.
This function returns 0
if there were no
matching rows.
Returns a count of the number of non-NULL
values of expr
in the rows
retrieved by a SELECT
statement. The result is a
BIGINT
value.
COUNT()
returns
0
if there were no matching rows.
mysql>SELECT student.student_name,COUNT(*)
->FROM student,course
->WHERE student.student_id=course.student_id
->GROUP BY student_name;
COUNT(*)
is somewhat
different in that it returns a count of the number of rows
retrieved, whether or not they contain
NULL
values.
COUNT(*)
is optimized to
return very quickly if the
SELECT
retrieves from one
table, no other columns are retrieved, and there is no
WHERE
clause. For example:
mysql> SELECT COUNT(*) FROM student;
This optimization applies only to MyISAM
tables only, because an exact row count is stored for this
storage engine and can be accessed very quickly. For
transactional storage engines such as
InnoDB
, storing an exact row count is
more problematic because multiple transactions may be
occurring, each of which may affect the count.
COUNT(DISTINCT
expr
,[expr
...])
Returns a count of the number of rows with different
non-NULL
expr
values.
COUNT(DISTINCT)
returns
0
if there were no matching rows.
mysql> SELECT COUNT(DISTINCT results) FROM student;
In MySQL, you can obtain the number of distinct expression
combinations that do not contain NULL
by
giving a list of expressions. In standard SQL, you would
have to do a concatenation of all expressions inside
COUNT(DISTINCT ...)
.
This function returns a string result with the concatenated
non-NULL
values from a group. It returns
NULL
if there are no
non-NULL
values. The full syntax is as
follows:
GROUP_CONCAT([DISTINCT]expr
[,expr
...] [ORDER BY {unsigned_integer
|col_name
|expr
} [ASC | DESC] [,col_name
...]] [SEPARATORstr_val
])
mysql>SELECT student_name,
->GROUP_CONCAT(test_score)
->FROM student
->GROUP BY student_name;
Or:
mysql>SELECT student_name,
->GROUP_CONCAT(DISTINCT test_score
->ORDER BY test_score DESC SEPARATOR ' ')
->FROM student
->GROUP BY student_name;
In MySQL, you can get the concatenated values of expression
combinations. To eliminate duplicate values, use the
DISTINCT
clause. To sort values in the
result, use the ORDER BY
clause. To sort
in reverse order, add the DESC
(descending) keyword to the name of the column you are
sorting by in the ORDER BY
clause. The
default is ascending order; this may be specified explicitly
using the ASC
keyword. The default
separator between values in a group is comma
(“,
”). To specify a
separator explicitly, use SEPARATOR
followed by the string literal value that should be inserted
between group values. To eliminate the separator altogether,
specify SEPARATOR ''
.
The result is truncated to the maximum length that is given
by the group_concat_max_len
system variable, which has a default value of 1024. The
value can be set higher, although the effective maximum
length of the return value is constrained by the value of
max_allowed_packet
. The
syntax to change the value of
group_concat_max_len
at
runtime is as follows, where val
is an unsigned integer:
SET [GLOBAL | SESSION] group_concat_max_len = val
;
The return value is a nonbinary or binary string, depending
on whether the arguments are nonbinary or binary strings.
The result type is TEXT
or
BLOB
unless
group_concat_max_len
is
less than or equal to 512, in which case the result type is
VARCHAR
or
VARBINARY
.
See also CONCAT()
and
CONCAT_WS()
:
Section 12.5, “String Functions”.
Returns the maximum value of
expr
.
MAX()
may take a string
argument; in such cases, it returns the maximum string
value. See Section 8.3.1, “How MySQL Uses Indexes”. The
DISTINCT
keyword can be used to find the
maximum of the distinct values of
expr
, however, this produces the
same result as omitting DISTINCT
.
MAX()
returns
NULL
if there were no matching rows.
mysql>SELECT student_name, MIN(test_score), MAX(test_score)
->FROM student
->GROUP BY student_name;
For MAX()
, MySQL currently
compares ENUM
and
SET
columns by their string
value rather than by the string's relative position in the
set. This differs from how ORDER BY
compares them. This is expected to be rectified in a future
MySQL release.
Returns the minimum value of
expr
.
MIN()
may take a string
argument; in such cases, it returns the minimum string
value. See Section 8.3.1, “How MySQL Uses Indexes”. The
DISTINCT
keyword can be used to find the
minimum of the distinct values of
expr
, however, this produces the
same result as omitting DISTINCT
.
MIN()
returns
NULL
if there were no matching rows.
mysql>SELECT student_name, MIN(test_score), MAX(test_score)
->FROM student
->GROUP BY student_name;
For MIN()
, MySQL currently
compares ENUM
and
SET
columns by their string
value rather than by the string's relative position in the
set. This differs from how ORDER BY
compares them. This is expected to be rectified in a future
MySQL release.
Returns the population standard deviation of
expr
. This is an extension to
standard SQL. The standard SQL function
STDDEV_POP()
can be used
instead.
This function returns NULL
if there were
no matching rows.
Returns the population standard deviation of
expr
. This function is provided
for compatibility with Oracle. The standard SQL function
STDDEV_POP()
can be used
instead.
This function returns NULL
if there were
no matching rows.
Returns the population standard deviation of
expr
(the square root of
VAR_POP()
). You can also use
STD()
or
STDDEV()
, which are
equivalent but not standard SQL.
STDDEV_POP()
returns
NULL
if there were no matching rows.
Returns the sample standard deviation of
expr
(the square root of
VAR_SAMP()
.
STDDEV_SAMP()
returns
NULL
if there were no matching rows.
Returns the sum of expr
. If the
return set has no rows, SUM()
returns NULL
. The
DISTINCT
keyword can be used to sum only
the distinct values of expr
.
SUM()
returns
NULL
if there were no matching rows.
Returns the population standard variance of
expr
. It considers rows as the
whole population, not as a sample, so it has the number of
rows as the denominator. You can also use
VARIANCE()
, which is
equivalent but is not standard SQL.
VAR_POP()
returns
NULL
if there were no matching rows.
Returns the sample variance of
expr
. That is, the denominator is
the number of rows minus one.
VAR_SAMP()
returns
NULL
if there were no matching rows.
Returns the population standard variance of
expr
. This is an extension to
standard SQL. The standard SQL function
VAR_POP()
can be used
instead.
VARIANCE()
returns
NULL
if there were no matching rows.
The GROUP BY
clause permits a WITH
ROLLUP
modifier that causes extra rows to be added to
the summary output. These rows represent higher-level (or
super-aggregate) summary operations. ROLLUP
thus enables you to answer questions at multiple levels of
analysis with a single query. It can be used, for example, to
provide support for OLAP (Online Analytical Processing)
operations.
Suppose that a table named sales
has
year
, country
,
product
, and profit
columns for recording sales profitability:
CREATE TABLE sales ( year INT NOT NULL, country VARCHAR(20) NOT NULL, product VARCHAR(32) NOT NULL, profit INT );
The table's contents can be summarized per year with a simple
GROUP BY
like this:
mysql> SELECT year, SUM(profit) FROM sales GROUP BY year;
+------+-------------+
| year | SUM(profit) |
+------+-------------+
| 2000 | 4525 |
| 2001 | 3010 |
+------+-------------+
This output shows the total profit for each year, but if you also want to determine the total profit summed over all years, you must add up the individual values yourself or run an additional query.
Or you can use ROLLUP
, which provides both
levels of analysis with a single query. Adding a WITH
ROLLUP
modifier to the GROUP BY
clause causes the query to produce another row that shows the
grand total over all year values:
mysql> SELECT year, SUM(profit) FROM sales GROUP BY year WITH ROLLUP;
+------+-------------+
| year | SUM(profit) |
+------+-------------+
| 2000 | 4525 |
| 2001 | 3010 |
| NULL | 7535 |
+------+-------------+
The grand total super-aggregate line is identified by the value
NULL
in the year
column.
ROLLUP
has a more complex effect when there
are multiple GROUP BY
columns. In this case,
each time there is a “break” (change in value) in
any but the last grouping column, the query produces an extra
super-aggregate summary row.
For example, without ROLLUP
, a summary on the
sales
table based on year
,
country
, and product
might
look like this:
mysql>SELECT year, country, product, SUM(profit)
->FROM sales
->GROUP BY year, country, product;
+------+---------+------------+-------------+ | year | country | product | SUM(profit) | +------+---------+------------+-------------+ | 2000 | Finland | Computer | 1500 | | 2000 | Finland | Phone | 100 | | 2000 | India | Calculator | 150 | | 2000 | India | Computer | 1200 | | 2000 | USA | Calculator | 75 | | 2000 | USA | Computer | 1500 | | 2001 | Finland | Phone | 10 | | 2001 | USA | Calculator | 50 | | 2001 | USA | Computer | 2700 | | 2001 | USA | TV | 250 | +------+---------+------------+-------------+
The output indicates summary values only at the
year/country/product level of analysis. When
ROLLUP
is added, the query produces several
extra rows:
mysql>SELECT year, country, product, SUM(profit)
->FROM sales
->GROUP BY year, country, product WITH ROLLUP;
+------+---------+------------+-------------+ | year | country | product | SUM(profit) | +------+---------+------------+-------------+ | 2000 | Finland | Computer | 1500 | | 2000 | Finland | Phone | 100 | | 2000 | Finland | NULL | 1600 | | 2000 | India | Calculator | 150 | | 2000 | India | Computer | 1200 | | 2000 | India | NULL | 1350 | | 2000 | USA | Calculator | 75 | | 2000 | USA | Computer | 1500 | | 2000 | USA | NULL | 1575 | | 2000 | NULL | NULL | 4525 | | 2001 | Finland | Phone | 10 | | 2001 | Finland | NULL | 10 | | 2001 | USA | Calculator | 50 | | 2001 | USA | Computer | 2700 | | 2001 | USA | TV | 250 | | 2001 | USA | NULL | 3000 | | 2001 | NULL | NULL | 3010 | | NULL | NULL | NULL | 7535 | +------+---------+------------+-------------+
For this query, adding ROLLUP
causes the
output to include summary information at four levels of
analysis, not just one. Here is how to interpret the
ROLLUP
output:
Following each set of product rows for a given year and
country, an extra summary row is produced showing the total
for all products. These rows have the
product
column set to
NULL
.
Following each set of rows for a given year, an extra
summary row is produced showing the total for all countries
and products. These rows have the country
and products
columns set to
NULL
.
Finally, following all other rows, an extra summary row is
produced showing the grand total for all years, countries,
and products. This row has the year
,
country
, and products
columns set to NULL
.
ROLLUP
The following items list some behaviors specific to the MySQL
implementation of ROLLUP
.
When you use ROLLUP
, you cannot also use an
ORDER BY
clause to sort the results. In other
words, ROLLUP
and ORDER BY
are mutually exclusive. However, you still have some control
over sort order. GROUP BY
in MySQL sorts
results, and you can use explicit ASC
and
DESC
keywords with columns named in the
GROUP BY
list to specify sort order for
individual columns. (The higher-level summary rows added by
ROLLUP
still appear after the rows from which
they are calculated, regardless of the sort order.)
LIMIT
can be used to restrict the number of
rows returned to the client. LIMIT
is applied
after ROLLUP
, so the limit applies against
the extra rows added by ROLLUP
. For example:
mysql>SELECT year, country, product, SUM(profit)
->FROM sales
->GROUP BY year, country, product WITH ROLLUP
->LIMIT 5;
+------+---------+------------+-------------+ | year | country | product | SUM(profit) | +------+---------+------------+-------------+ | 2000 | Finland | Computer | 1500 | | 2000 | Finland | Phone | 100 | | 2000 | Finland | NULL | 1600 | | 2000 | India | Calculator | 150 | | 2000 | India | Computer | 1200 | +------+---------+------------+-------------+
Using LIMIT
with ROLLUP
may produce results that are more difficult to interpret,
because you have less context for understanding the
super-aggregate rows.
The NULL
indicators in each super-aggregate
row are produced when the row is sent to the client. The server
looks at the columns named in the GROUP BY
clause following the leftmost one that has changed value. For
any column in the result set with a name that is a lexical match
to any of those names, its value is set to
NULL
. (If you specify grouping columns by
column number, the server identifies which columns to set to
NULL
by number.)
Because the NULL
values in the
super-aggregate rows are placed into the result set at such a
late stage in query processing, you cannot test them as
NULL
values within the query itself. For
example, you cannot add HAVING product IS
NULL
to the query to eliminate from the output all but
the super-aggregate rows.
On the other hand, the NULL
values do appear
as NULL
on the client side and can be tested
as such using any MySQL client programming interface.
MySQL permits a column that does not appear in the
GROUP BY
list to be named in the select list.
In this case, the server is free to choose any value from this
nonaggregated column in summary rows, and this includes the
extra rows added by WITH ROLLUP
. For example,
in the following query, country
is a
nonaggregated column that does not appear in the GROUP
BY
list and values chosen for this column are
indeterminate:
mysql>SELECT year, country, SUM(profit)
->FROM sales GROUP BY year WITH ROLLUP;
+------+---------+-------------+ | year | country | SUM(profit) | +------+---------+-------------+ | 2000 | India | 4525 | | 2001 | USA | 3010 | | NULL | USA | 7535 | +------+---------+-------------+
This behavior occurs if the
ONLY_FULL_GROUP_BY
SQL mode is
not enabled. If that mode is enabled, the server rejects the
query as illegal because country
is not
listed in the GROUP BY
clause. For more
information about nonaggregated columns and GROUP
BY
, see Section 12.17.3, “MySQL Extensions to GROUP BY
”.
In standard SQL, a query that includes a GROUP
BY
clause cannot refer to nonaggregated columns in the
select list that are not named in the GROUP
BY
clause. For example, this query is illegal in
standard SQL because the name
column in the
select list does not appear in the GROUP BY
:
SELECT o.custid, c.name, MAX(o.payment) FROM orders AS o, customers AS c WHERE o.custid = c.custid GROUP BY o.custid;
For the query to be legal, the name
column
must be omitted from the select list or named in the
GROUP BY
clause.
MySQL extends the use of GROUP BY
so that the
select list can refer to nonaggregated columns not named in the
GROUP BY
clause. This means that the
preceding query is legal in MySQL. You can use this feature to
get better performance by avoiding unnecessary column sorting
and grouping. However, this is useful primarily when all values
in each nonaggregated column not named in the GROUP
BY
are the same for each group. The server is free to
choose any value from each group, so unless they are the same,
the values chosen are indeterminate. Furthermore, the selection
of values from each group cannot be influenced by adding an
ORDER BY
clause. Sorting of the result set
occurs after values have been chosen, and ORDER
BY
does not affect which values within each group the
server chooses.
A similar MySQL extension applies to the
HAVING
clause. In standard SQL, a query that
includes a GROUP BY
clause cannot refer to
nonaggregated columns in the HAVING
clause
that are not named in the GROUP BY
clause. A
MySQL extension permits references to such columns to simplify
calculations. This extension assumes that the nongrouped columns
will have the same group-wise values. Otherwise, the result is
indeterminate.
To disable the MySQL GROUP BY
extension,
enable the ONLY_FULL_GROUP_BY
SQL mode. This enables standard SQL behavior: Columns not named
in the GROUP BY
clause cannot be used in the
select list or HAVING
clause unless enclosed
in an aggregate function.
ONLY_FULL_GROUP_BY
also
affects use of aliases in the HAVING
clauses.
For example, the following query returns name
values that occur only once in table orders
:
SELECT name, COUNT(name) FROM orders GROUP BY name HAVING COUNT(name) = 1;
MySQL extends this behavior to permit the use of an alias in the
HAVING
clause for the aggregated column:
SELECT name, COUNT(name) AS c FROM orders GROUP BY name HAVING c = 1;
Enabling ONLY_FULL_GROUP_BY
disables this MySQL extension and a non-grouping field
'c' is used in HAVING clause
error occurs because the
column c
in the HAVING
clause is not enclosed in an aggregate function (instead, it
is an aggregate function).
The select list extension also applies to ORDER
BY
. That is, you can refer to nonaggregated columns in
the ORDER BY
clause that do not appear in the
GROUP BY
clause. (However, as mentioned
previously, ORDER BY
does not affect which
values are chosen from nonaggregated columns; it only sorts them
after they have been chosen.) This extension does not apply if
the ONLY_FULL_GROUP_BY
SQL
mode is enabled.
In some cases, you can use MIN()
and MAX()
to obtain a specific
column value even if it is not unique. If the
sort
column contains integers no larger than
6 digits, the following query gives the value of
column
from the row containing the smallest
sort
value:
SUBSTR(MIN(CONCAT(LPAD(sort,6,'0'),column)),7)
See Section 3.6.4, “The Rows Holding the Group-wise Maximum of a Certain Column”.
If you are trying to follow standard SQL, you cannot use
expressions in GROUP BY
clauses. As a
workaround, use an alias for the expression:
SELECT id, FLOOR(value/100) AS val
FROM tbl_name
GROUP BY id, val;
MySQL permits expressions in GROUP BY
clauses, so the alias is unnecessary:
SELECT id, FLOOR(value/100)
FROM tbl_name
GROUP BY id, FLOOR(value/100);
MySQL supports spatial extensions to enable the generation, storage,
and analysis of geographic features. These features are available
for MyISAM
, InnoDB
,
NDB
, and ARCHIVE
tables.
For spatial columns, MyISAM
supports both
SPATIAL
and non-SPATIAL
indexes. Other storage engines support
non-SPATIAL
indexes, as described in
Section 13.1.11, “CREATE INDEX
Syntax”.
This chapter covers the following topics:
The basis of these spatial extensions in the OpenGIS geometry model
Data formats for representing spatial data
How to use spatial data in MySQL
Use of indexing for spatial data
MySQL differences from the OpenGIS specification
The Open Geospatial Consortium publishes the OpenGIS® Simple Features Specifications For SQL, a document that proposes several conceptual ways for extending an SQL RDBMS to support spatial data. This specification is available from the OGC Web site at http://www.opengis.org/docs/99-049.pdf.
If you have questions or concerns about the use of the spatial extensions to MySQL, you can discuss them in the GIS forum: http://forums.mysql.com/list.php?23.
MySQL implements spatial extensions following the specification of the Open Geospatial Consortium (OGC). This is an international consortium of more than 250 companies, agencies, and universities participating in the development of publicly available conceptual solutions that can be useful with all kinds of applications that manage spatial data. The OGC maintains a Web site at http://www.opengis.org/.
In 1997, the Open Geospatial Consortium published the OpenGIS® Simple Features Specifications For SQL, a document that proposes several conceptual ways for extending an SQL RDBMS to support spatial data. This specification is available from the OGC Web site at http://www.opengis.org/docs/99-049.pdf. It contains additional information relevant to this chapter.
MySQL implements a subset of the SQL with Geometry Types environment proposed by OGC. This term refers to an SQL environment that has been extended with a set of geometry types. A geometry-valued SQL column is implemented as a column that has a geometry type. The specification describe a set of SQL geometry types, as well as functions on those types to create and analyze geometry values.
A geographic feature is anything in the world that has a location. A feature can be:
An entity. For example, a mountain, a pond, a city.
A space. For example, town district, the tropics.
A definable location. For example, a crossroad, as a particular place where two streets intersect.
Some documents use the term geospatial feature to refer to geographic features.
Geometry is another word that denotes a geographic feature. Originally the word geometry meant measurement of the earth. Another meaning comes from cartography, referring to the geometric features that cartographers use to map the world.
This chapter uses all of these terms synonymously: geographic feature, geospatial feature, feature, or geometry. Here, the term most commonly used is geometry, defined as a point or an aggregate of points representing anything in the world that has a location.
Geometry
Point
Curve
LineString
Surface
Polygon
GeometryCollection
MultiPoint
MultiCurve
MultiLineString
MultiSurface
MultiPolygon
The set of geometry types proposed by OGC's SQL with Geometry Types environment is based on the OpenGIS Geometry Model. In this model, each geometric object has the following general properties:
It is associated with a Spatial Reference System, which describes the coordinate space in which the object is defined.
It belongs to some geometry class.
The geometry classes define a hierarchy as follows:
Geometry
(noninstantiable)
Point
(instantiable)
Curve
(noninstantiable)
LineString
(instantiable)
Line
LinearRing
Surface
(noninstantiable)
Polygon
(instantiable)
GeometryCollection
(instantiable)
MultiPoint
(instantiable)
MultiCurve
(noninstantiable)
MultiLineString
(instantiable)
MultiSurface
(noninstantiable)
MultiPolygon
(instantiable)
It is not possible to create objects in noninstantiable classes. It is possible to create objects in instantiable classes. All classes have properties, and instantiable classes may also have assertions (rules that define valid class instances).
Geometry
is the base class. It is an abstract
class. The instantiable subclasses of
Geometry
are restricted to zero-, one-, and
two-dimensional geometric objects that exist in two-dimensional
coordinate space. All instantiable geometry classes are defined
so that valid instances of a geometry class are topologically
closed (that is, all defined geometries include their boundary).
The base Geometry
class has subclasses for
Point
, Curve
,
Surface
, and
GeometryCollection
:
Point
represents zero-dimensional
objects.
Curve
represents one-dimensional objects,
and has subclass LineString
, with
sub-subclasses Line
and
LinearRing
.
Surface
is designed for two-dimensional
objects and has subclass Polygon
.
GeometryCollection
has specialized zero-,
one-, and two-dimensional collection classes named
MultiPoint
,
MultiLineString
, and
MultiPolygon
for modeling geometries
corresponding to collections of Points
,
LineStrings
, and
Polygons
, respectively.
MultiCurve
and
MultiSurface
are introduced as abstract
superclasses that generalize the collection interfaces to
handle Curves
and
Surfaces
.
Geometry
, Curve
,
Surface
, MultiCurve
, and
MultiSurface
are defined as noninstantiable
classes. They define a common set of methods for their
subclasses and are included for extensibility.
Point
, LineString
,
Polygon
,
GeometryCollection
,
MultiPoint
,
MultiLineString
, and
MultiPolygon
are instantiable classes.
Geometry
is the root class of the hierarchy.
It is a noninstantiable class but has a number of properties
that are common to all geometry values created from any of the
Geometry
subclasses. These properties are
described in the following list. Particular subclasses have
their own specific properties, described later.
Geometry Properties
A geometry value has the following properties:
Its type. Each geometry belongs to one of the instantiable classes in the hierarchy.
Its SRID, or Spatial Reference Identifier. This value identifies the geometry's associated Spatial Reference System that describes the coordinate space in which the geometry object is defined.
In MySQL, the SRID value is just an integer associated with the geometry value. All calculations are done assuming Euclidean (planar) geometry.
Its coordinates in its Spatial Reference System, represented as double-precision (8-byte) numbers. All nonempty geometries include at least one pair of (X,Y) coordinates. Empty geometries contain no coordinates.
Coordinates are related to the SRID. For example, in different coordinate systems, the distance between two objects may differ even when objects have the same coordinates, because the distance on the planar coordinate system and the distance on the geocentric system (coordinates on the Earth's surface) are different things.
Its interior, boundary, and exterior.
Every geometry occupies some position in space. The exterior of a geometry is all space not occupied by the geometry. The interior is the space occupied by the geometry. The boundary is the interface between the geometry's interior and exterior.
Its MBR (Minimum Bounding Rectangle), or Envelope. This is the bounding geometry, formed by the minimum and maximum (X,Y) coordinates:
((MINX MINY, MAXX MINY, MAXX MAXY, MINX MAXY, MINX MINY))
Whether the value is simple
or nonsimple. Geometry
values of types (LineString
,
MultiPoint
,
MultiLineString
) are either simple or
nonsimple. Each type determines its own assertions for being
simple or nonsimple.
Whether the value is closed
or not closed. Geometry
values of types (LineString
,
MultiString
) are either closed or not
closed. Each type determines its own assertions for being
closed or not closed.
Whether the value is empty
or nonempty A geometry is
empty if it does not have any points. Exterior, interior,
and boundary of an empty geometry are not defined (that is,
they are represented by a NULL
value). An
empty geometry is defined to be always simple and has an
area of 0.
Its dimension. A geometry can have a dimension of –1, 0, 1, or 2:
–1 for an empty geometry.
0 for a geometry with no length and no area.
1 for a geometry with nonzero length and zero area.
2 for a geometry with nonzero area.
Point
objects have a dimension of zero.
LineString
objects have a dimension of 1.
Polygon
objects have a dimension of 2.
The dimensions of MultiPoint
,
MultiLineString
, and
MultiPolygon
objects are the same as the
dimensions of the elements they consist of.
A Point
is a geometry that represents a
single location in coordinate space.
Point
Examples
Imagine a large-scale map of the world with many cities. A
Point
object could represent each city.
On a city map, a Point
object could
represent a bus stop.
Point
Properties
X-coordinate value.
Y-coordinate value.
Point
is defined as a zero-dimensional
geometry.
The boundary of a Point
is the empty set.
A Curve
is a one-dimensional geometry,
usually represented by a sequence of points. Particular
subclasses of Curve
define the type of
interpolation between points. Curve
is a
noninstantiable class.
Curve
Properties
A Curve
has the coordinates of its
points.
A Curve
is defined as a one-dimensional
geometry.
A Curve
is simple if it does not pass
through the same point twice.
A Curve
is closed if its start point is
equal to its endpoint.
The boundary of a closed Curve
is empty.
The boundary of a nonclosed Curve
consists of its two endpoints.
A Curve
that is simple and closed is a
LinearRing
.
A LineString
is a Curve
with linear interpolation between points.
LineString
Examples
On a world map, LineString
objects could
represent rivers.
In a city map, LineString
objects could
represent streets.
LineString
Properties
A LineString
has coordinates of segments,
defined by each consecutive pair of points.
A LineString
is a Line
if it consists of exactly two points.
A LineString
is a
LinearRing
if it is both closed and
simple.
A Surface
is a two-dimensional geometry. It
is a noninstantiable class. Its only instantiable subclass is
Polygon
.
Surface
Properties
A Surface
is defined as a two-dimensional
geometry.
The OpenGIS specification defines a simple
Surface
as a geometry that consists of a
single “patch” that is associated with a single
exterior boundary and zero or more interior boundaries.
The boundary of a simple Surface
is the
set of closed curves corresponding to its exterior and
interior boundaries.
A Polygon
is a planar
Surface
representing a multisided geometry.
It is defined by a single exterior boundary and zero or more
interior boundaries, where each interior boundary defines a hole
in the Polygon
.
Polygon
Examples
On a region map, Polygon
objects could
represent forests, districts, and so on.
Polygon
Assertions
The boundary of a Polygon
consists of a
set of LinearRing
objects (that is,
LineString
objects that are both simple
and closed) that make up its exterior and interior
boundaries.
A Polygon
has no rings that cross. The
rings in the boundary of a Polygon
may
intersect at a Point
, but only as a
tangent.
A Polygon
has no lines, spikes, or
punctures.
A Polygon
has an interior that is a
connected point set.
A Polygon
may have holes. The exterior of
a Polygon
with holes is not connected.
Each hole defines a connected component of the exterior.
The preceding assertions make a Polygon
a
simple geometry.
A GeometryCollection
is a geometry that is a
collection of one or more geometries of any class.
All the elements in a GeometryCollection
must
be in the same Spatial Reference System (that is, in the same
coordinate system). There are no other constraints on the
elements of a GeometryCollection
, although
the subclasses of GeometryCollection
described in the following sections may restrict membership.
Restrictions may be based on:
Element type (for example, a MultiPoint
may contain only Point
elements)
Dimension
Constraints on the degree of spatial overlap between elements
A MultiPoint
is a geometry collection
composed of Point
elements. The points are
not connected or ordered in any way.
MultiPoint
Examples
On a world map, a MultiPoint
could
represent a chain of small islands.
On a city map, a MultiPoint
could
represent the outlets for a ticket office.
MultiPoint
Properties
A MultiPoint
is a zero-dimensional
geometry.
A MultiPoint
is simple if no two of its
Point
values are equal (have identical
coordinate values).
The boundary of a MultiPoint
is the empty
set.
A MultiCurve
is a geometry collection
composed of Curve
elements.
MultiCurve
is a noninstantiable class.
MultiCurve
Properties
A MultiCurve
is a one-dimensional
geometry.
A MultiCurve
is simple if and only if all
of its elements are simple; the only intersections between
any two elements occur at points that are on the boundaries
of both elements.
A MultiCurve
boundary is obtained by
applying the “mod 2 union rule” (also known as
the “odd-even rule”): A point is in the
boundary of a MultiCurve
if it is in the
boundaries of an odd number of MultiCurve
elements.
A MultiCurve
is closed if all of its
elements are closed.
The boundary of a closed MultiCurve
is
always empty.
A MultiLineString
is a
MultiCurve
geometry collection composed of
LineString
elements.
MultiLineString
Examples
On a region map, a MultiLineString
could
represent a river system or a highway system.
A MultiSurface
is a geometry collection
composed of surface elements. MultiSurface
is
a noninstantiable class. Its only instantiable subclass is
MultiPolygon
.
MultiSurface
Assertions
Two MultiSurface
surfaces have no
interiors that intersect.
Two MultiSurface
elements have boundaries
that intersect at most at a finite number of points.
A MultiPolygon
is a
MultiSurface
object composed of
Polygon
elements.
MultiPolygon
Examples
On a region map, a MultiPolygon
could
represent a system of lakes.
MultiPolygon
Assertions
A MultiPolygon
has no two
Polygon
elements with interiors that
intersect.
A MultiPolygon
has no two
Polygon
elements that cross (crossing is
also forbidden by the previous assertion), or that touch at
an infinite number of points.
A MultiPolygon
may not have cut lines,
spikes, or punctures. A MultiPolygon
is a
regular, closed point set.
A MultiPolygon
that has more than one
Polygon
has an interior that is not
connected. The number of connected components of the
interior of a MultiPolygon
is equal to
the number of Polygon
values in the
MultiPolygon
.
MultiPolygon
Properties
A MultiPolygon
is a two-dimensional
geometry.
A MultiPolygon
boundary is a set of
closed curves (LineString
values)
corresponding to the boundaries of its
Polygon
elements.
Each Curve
in the boundary of the
MultiPolygon
is in the boundary of
exactly one Polygon
element.
Every Curve
in the boundary of an
Polygon
element is in the boundary of the
MultiPolygon
.
This section describes the standard spatial data formats that are used to represent geometry objects in queries. They are:
Well-Known Text (WKT) format
Well-Known Binary (WKB) format
Internally, MySQL stores geometry values in a format that is not identical to either WKT or WKB format.
The Well-Known Text (WKT) representation of Geometry is designed to exchange geometry data in ASCII form. For a Backus-Naur grammar that specifies the formal production rules for writing WKT values, see the OpenGIS specification document referenced in Section 12.18, “Spatial Extensions”.
Examples of WKT representations of geometry objects:
A Point
:
POINT(15 20)
Note that point coordinates are specified with no separating
comma. This differs from the syntax for the SQL
POINT()
function, which
requires a comma between the coordinates. Take care to use
the syntax appropriate to the context of a given spatial
operation. For example, the following statements both
extract the X-coordinate from a Point
object. The first produces the object directly using the
POINT()
function. The second
uses a WKT representation converted to a
Point
with
GeomFromText()
.
mysql>SELECT X(POINT(15, 20));
+------------------+ | X(POINT(15, 20)) | +------------------+ | 15 | +------------------+ mysql>SELECT X(GeomFromText('POINT(15 20)'));
+---------------------------------+ | X(GeomFromText('POINT(15 20)')) | +---------------------------------+ | 15 | +---------------------------------+
A LineString
with four points:
LINESTRING(0 0, 10 10, 20 25, 50 60)
Note that point coordinate pairs are separated by commas.
A Polygon
with one exterior ring and one
interior ring:
POLYGON((0 0,10 0,10 10,0 10,0 0),(5 5,7 5,7 7,5 7, 5 5))
A MultiPoint
with three
Point
values:
MULTIPOINT(0 0, 20 20, 60 60)
A MultiLineString
with two
LineString
values:
MULTILINESTRING((10 10, 20 20), (15 15, 30 15))
A MultiPolygon
with two
Polygon
values:
MULTIPOLYGON(((0 0,10 0,10 10,0 10,0 0)),((5 5,7 5,7 7,5 7, 5 5)))
A GeometryCollection
consisting of two
Point
values and one
LineString
:
GEOMETRYCOLLECTION(POINT(10 10), POINT(30 30), LINESTRING(15 15, 20 20))
The Well-Known Binary (WKB) representation for geometric values is defined by the OpenGIS specification. It is also defined in the ISO SQL/MM Part 3: Spatial standard.
WKB is used to exchange geometry data as binary streams
represented by BLOB
values
containing geometric WKB information.
WKB uses 1-byte unsigned integers, 4-byte unsigned integers, and 8-byte double-precision numbers (IEEE 754 format). A byte is eight bits.
For example, a WKB value that corresponds to POINT(1
1)
consists of this sequence of 21 bytes (each
represented here by two hex digits):
0101000000000000000000F03F000000000000F03F
The sequence may be broken down into these components:
Byte order : 01 WKB type : 01000000 X : 000000000000F03F Y : 000000000000F03F
Component representation is as follows:
The byte order may be either 1 or 0 to indicate little-endian or big-endian storage. The little-endian and big-endian byte orders are also known as Network Data Representation (NDR) and External Data Representation (XDR), respectively.
The WKB type is a code that indicates the geometry type.
Values from 1 through 7 indicate Point
,
LineString
, Polygon
,
MultiPoint
,
MultiLineString
,
MultiPolygon
, and
GeometryCollection
.
A Point
value has X and Y coordinates,
each represented as a double-precision value.
WKB values for more complex geometry values are represented by more complex data structures, as detailed in the OpenGIS specification.
This section describes the data types you can use for representing spatial data in MySQL, and the functions available for creating and retrieving spatial values.
MySQL has data types that correspond to OpenGIS classes. Some of these types hold single geometry values:
GEOMETRY
POINT
LINESTRING
POLYGON
GEOMETRY
can store geometry values of any
type. The other single-value types (POINT
,
LINESTRING
, and POLYGON
)
restrict their values to a particular geometry type.
The other data types hold collections of values:
MULTIPOINT
MULTILINESTRING
MULTIPOLYGON
GEOMETRYCOLLECTION
GEOMETRYCOLLECTION
can store a collection of
objects of any type. The other collection types
(MULTIPOINT
,
MULTILINESTRING
,
MULTIPOLYGON
, and
GEOMETRYCOLLECTION
) restrict collection
members to those having a particular geometry type.
This section describes how to create spatial values using Well-Known Text and Well-Known Binary functions that are defined in the OpenGIS standard, and using MySQL-specific functions.
MySQL provides a number of functions that take as arguments a Well-Known Text representation and, optionally, a spatial reference system identifier (SRID). They return the corresponding geometry.
GeomFromText()
accepts a WKT of
any geometry type as its first argument. An implementation
also provides type-specific construction functions for
construction of geometry values of each geometry type.
GeomCollFromText(
,
wkt
[,srid
])GeometryCollectionFromText(
wkt
[,srid
])
Constructs a GEOMETRYCOLLECTION
value
using its WKT representation and SRID.
GeomFromText(
,
wkt
[,srid
])GeometryFromText(
wkt
[,srid
])
Constructs a geometry value of any type using its WKT representation and SRID.
LineFromText(
,
wkt
[,srid
])LineStringFromText(
wkt
[,srid
])
Constructs a LINESTRING
value using its
WKT representation and SRID.
MLineFromText(
,
wkt
[,srid
])MultiLineStringFromText(
wkt
[,srid
])
Constructs a MULTILINESTRING
value
using its WKT representation and SRID.
MPointFromText(
,
wkt
[,srid
])MultiPointFromText(
wkt
[,srid
])
Constructs a MULTIPOINT
value using its
WKT representation and SRID.
MPolyFromText(
,
wkt
[,srid
])MultiPolygonFromText(
wkt
[,srid
])
Constructs a MULTIPOLYGON
value using
its WKT representation and SRID.
Constructs a POINT
value using its WKT
representation and SRID.
PolyFromText(
,
wkt
[,srid
])PolygonFromText(
wkt
[,srid
])
Constructs a POLYGON
value using its
WKT representation and SRID.
The OpenGIS specification also defines the following optional
functions, which MySQL does not implement. These functions
construct Polygon
or
MultiPolygon
values based on the WKT
representation of a collection of rings or closed
LineString
values. These values may
intersect.
Constructs a MultiPolygon
value from a
MultiLineString
value in WKT format
containing an arbitrary collection of closed
LineString
values.
Constructs a Polygon
value from a
MultiLineString
value in WKT format
containing an arbitrary collection of closed
LineString
values.
MySQL provides a number of functions that take as arguments a
BLOB
containing a Well-Known
Binary representation and, optionally, a spatial reference
system identifier (SRID). They return the corresponding
geometry.
These functions also accept geometry objects for compatibility with the return value of the functions in Section 12.18.4.2.3, “Creating Geometry Values Using MySQL-Specific Functions”. Thus, those functions may be used to provide the first argument to the functions in this section.
GeomCollFromWKB(
,
wkb
[,srid
])GeometryCollectionFromWKB(
wkb
[,srid
])
Constructs a GEOMETRYCOLLECTION
value
using its WKB representation and SRID.
GeomFromWKB(
,
wkb
[,srid
])GeometryFromWKB(
wkb
[,srid
])
Constructs a geometry value of any type using its WKB representation and SRID.
LineFromWKB(
,
wkb
[,srid
])LineStringFromWKB(
wkb
[,srid
])
Constructs a LINESTRING
value using its
WKB representation and SRID.
MLineFromWKB(
,
wkb
[,srid
])MultiLineStringFromWKB(
wkb
[,srid
])
Constructs a MULTILINESTRING
value
using its WKB representation and SRID.
MPointFromWKB(
,
wkb
[,srid
])MultiPointFromWKB(
wkb
[,srid
])
Constructs a MULTIPOINT
value using its
WKB representation and SRID.
MPolyFromWKB(
,
wkb
[,srid
])MultiPolygonFromWKB(
wkb
[,srid
])
Constructs a MULTIPOLYGON
value using
its WKB representation and SRID.
Constructs a POINT
value using its WKB
representation and SRID.
PolyFromWKB(
,
wkb
[,srid
])PolygonFromWKB(
wkb
[,srid
])
Constructs a POLYGON
value using its
WKB representation and SRID.
The OpenGIS specification also describes optional functions
for constructing Polygon
or
MultiPolygon
values based on the WKB
representation of a collection of rings or closed
LineString
values. These values may
intersect. MySQL does not implement these functions:
Constructs a MultiPolygon
value from a
MultiLineString
value in WKB format
containing an arbitrary collection of closed
LineString
values.
Constructs a Polygon
value from a
MultiLineString
value in WKB format
containing an arbitrary collection of closed
LineString
values.
MySQL provides a set of useful nonstandard functions for creating geometry values. The functions described in this section are MySQL extensions to the OpenGIS specification.
These functions produce geometry objects from either WKB
values or geometry objects as arguments. If any argument is
not a proper WKB or geometry representation of the proper
object type, the return value is NULL
.
For example, you can insert the geometry return value from
Point()
directly into a
Point
column:
INSERT INTO t1 (pt_col) VALUES(Point(1,2));
Constructs a GeometryCollection
.
Constructs a LineString
value from a
number of Point
or WKB
Point
arguments. If the number of
arguments is less than two, the return value is
NULL
.
Constructs a MultiLineString
value
using LineString
or WKB
LineString
arguments.
Constructs a MultiPoint
value using
Point
or WKB Point
arguments.
Constructs a MultiPolygon
value from a
set of Polygon
or WKB
Polygon
arguments.
Constructs a Point
using its
coordinates.
Constructs a Polygon
value from a
number of LineString
or WKB
LineString
arguments. If any argument
does not represent a LinearRing
(that
is, not a closed and simple
LineString
), the return value is
NULL
.
MySQL provides a standard way of creating spatial columns for
geometry types, for example, with CREATE
TABLE
or ALTER TABLE
.
Currently, spatial columns are supported for
MyISAM
, InnoDB
,
NDB
, and ARCHIVE
tables. See also the annotations about spatial indexes under
Section 12.18.6.1, “Creating Spatial Indexes”.
Use the CREATE TABLE
statement to create a table with a spatial column:
CREATE TABLE geom (g GEOMETRY);
Use the ALTER TABLE
statement
to add or drop a spatial column to or from an existing
table:
ALTER TABLE geom ADD pt POINT; ALTER TABLE geom DROP pt;
After you have created spatial columns, you can populate them with spatial data.
Values should be stored in internal geometry format, but you can convert them to that format from either Well-Known Text (WKT) or Well-Known Binary (WKB) format. The following examples demonstrate how to insert geometry values into a table by converting WKT values into internal geometry format:
The following examples insert more complex geometries into the table:
SET @g = 'LINESTRING(0 0,1 1,2 2)'; INSERT INTO geom VALUES (GeomFromText(@g)); SET @g = 'POLYGON((0 0,10 0,10 10,0 10,0 0),(5 5,7 5,7 7,5 7, 5 5))'; INSERT INTO geom VALUES (GeomFromText(@g)); SET @g = 'GEOMETRYCOLLECTION(POINT(1 1),LINESTRING(0 0,1 1,2 2,3 3,4 4))'; INSERT INTO geom VALUES (GeomFromText(@g));
The preceding examples all use
GeomFromText()
to create geometry
values. You can also use type-specific functions:
SET @g = 'POINT(1 1)'; INSERT INTO geom VALUES (PointFromText(@g)); SET @g = 'LINESTRING(0 0,1 1,2 2)'; INSERT INTO geom VALUES (LineStringFromText(@g)); SET @g = 'POLYGON((0 0,10 0,10 10,0 10,0 0),(5 5,7 5,7 7,5 7, 5 5))'; INSERT INTO geom VALUES (PolygonFromText(@g)); SET @g = 'GEOMETRYCOLLECTION(POINT(1 1),LINESTRING(0 0,1 1,2 2,3 3,4 4))'; INSERT INTO geom VALUES (GeomCollFromText(@g));
Note that if a client application program wants to use WKB representations of geometry values, it is responsible for sending correctly formed WKB in queries to the server. However, there are several ways of satisfying this requirement. For example:
Inserting a POINT(1 1)
value with hex
literal syntax:
mysql>INSERT INTO geom VALUES
->(GeomFromWKB(0x0101000000000000000000F03F000000000000F03F));
An ODBC application can send a WKB representation, binding
it to a placeholder using an argument of
BLOB
type:
INSERT INTO geom VALUES (GeomFromWKB(?))
Other programming interfaces may support a similar placeholder mechanism.
In a C program, you can escape a binary value using
mysql_real_escape_string()
and include the result in a query string that is sent to the
server. See Section 21.8.7.54, “mysql_real_escape_string()
”.
Geometry values stored in a table can be fetched in internal format. You can also convert them into WKT or WKB format.
Fetching spatial data in internal format:
Fetching geometry values using internal format can be useful in table-to-table transfers:
CREATE TABLE geom2 (g GEOMETRY) SELECT g FROM geom;
Fetching spatial data in WKT format:
The AsText()
function
converts a geometry from internal format into a WKT string.
SELECT AsText(g) FROM geom;
Fetching spatial data in WKB format:
The AsBinary()
function
converts a geometry from internal format into a
BLOB
containing the WKB
value.
SELECT AsBinary(g) FROM geom;
After populating spatial columns with values, you are ready to query and analyze them. MySQL provides a set of functions to perform various operations on spatial data. These functions can be grouped into four major categories according to the type of operation they perform:
Functions that convert geometries between various formats
Functions that provide access to qualitative or quantitative properties of a geometry
Functions that describe relations between two geometries
Functions that create new geometries from existing ones
Spatial analysis functions can be used in many contexts, such as:
Any interactive SQL program, such as mysql.
Application programs written in any language that supports a MySQL client API
MySQL supports the following functions for converting geometry values between internal format and either WKT or WKB format:
Converts a value in internal geometry format to its WKB representation and returns the binary result.
SELECT AsBinary(g) FROM geom;
Converts a value in internal geometry format to its WKT representation and returns the string result.
mysql>SET @g = 'LineString(1 1,2 2,3 3)';
mysql>SELECT AsText(GeomFromText(@g));
+--------------------------+ | AsText(GeomFromText(@g)) | +--------------------------+ | LINESTRING(1 1,2 2,3 3) | +--------------------------+
Converts a string value from its WKT representation into
internal geometry format and returns the result. A number of
type-specific functions are also supported, such as
PointFromText()
and
LineFromText()
. See
Section 12.18.4.2.1, “Creating Geometry Values Using WKT Functions”.
Converts a binary value from its WKB representation into
internal geometry format and returns the result. A number of
type-specific functions are also supported, such as
PointFromWKB()
and
LineFromWKB()
. See
Section 12.18.4.2.2, “Creating Geometry Values Using WKB Functions”.
Each function that belongs to this group takes a geometry value
as its argument and returns some quantitative or qualitative
property of the geometry. Some functions restrict their argument
type. Such functions return NULL
if the
argument is of an incorrect geometry type. For example,
Area()
returns
NULL
if the object type is neither
Polygon
nor MultiPolygon
.
The functions listed in this section do not restrict their argument and accept a geometry value of any type.
Returns the inherent dimension of the geometry value
g
. The result can be –1,
0, 1, or 2. The meaning of these values is given in
Section 12.18.2.2, “Class Geometry
”.
mysql> SELECT Dimension(GeomFromText('LineString(1 1,2 2)'));
+------------------------------------------------+
| Dimension(GeomFromText('LineString(1 1,2 2)')) |
+------------------------------------------------+
| 1 |
+------------------------------------------------+
Returns the Minimum Bounding Rectangle (MBR) for the
geometry value g
. The result is
returned as a Polygon
value.
The polygon is defined by the corner points of the bounding box:
POLYGON((MINX MINY, MAXX MINY, MAXX MAXY, MINX MAXY, MINX MINY))
mysql> SELECT AsText(Envelope(GeomFromText('LineString(1 1,2 2)')));
+-------------------------------------------------------+
| AsText(Envelope(GeomFromText('LineString(1 1,2 2)'))) |
+-------------------------------------------------------+
| POLYGON((1 1,2 1,2 2,1 2,1 1)) |
+-------------------------------------------------------+
Returns as a binary string the name of the geometry type
of which the geometry instance
g
is a member. The name
corresponds to one of the instantiable
Geometry
subclasses.
mysql> SELECT GeometryType(GeomFromText('POINT(1 1)'));
+------------------------------------------+
| GeometryType(GeomFromText('POINT(1 1)')) |
+------------------------------------------+
| POINT |
+------------------------------------------+
Returns 1 if the geometry value
g
has no anomalous geometric
points, such as self-intersection or self-tangency.
IsSimple()
returns 0 if the
argument is not simple, and NULL
if it
is NULL
.
The description of each instantiable geometric class given earlier in the chapter includes the specific conditions that cause an instance of that class to be classified as not simple. (See Section 12.18.2.1, “The Geometry Class Hierarchy”.)
Returns an integer indicating the Spatial Reference System
ID for the geometry value g
.
In MySQL, the SRID value is just an integer associated with the geometry value. All calculations are done assuming Euclidean (planar) geometry.
mysql> SELECT SRID(GeomFromText('LineString(1 1,2 2)',101));
+-----------------------------------------------+
| SRID(GeomFromText('LineString(1 1,2 2)',101)) |
+-----------------------------------------------+
| 101 |
+-----------------------------------------------+
The OpenGIS specification also defines the following functions, which MySQL does not implement:
Returns a geometry that is the closure of the
combinatorial boundary of the geometry value
g
.
This function is a placeholder that returns 0 for any
valid geometry value, 1 for any invalid geometry value or
NULL
.
MySQL does not support GIS EMPTY
values
such as POINT EMPTY
.
A Point
consists of X and Y coordinates,
which may be obtained using the following functions:
Returns the X-coordinate value for the
Point
object
p
as a double-precision number.
mysql> SELECT X(POINT(56.7, 53.34));
+-----------------------+
| X(POINT(56.7, 53.34)) |
+-----------------------+
| 56.7 |
+-----------------------+
Returns the Y-coordinate value for the
Point
object
p
as a double-precision number.
mysql> SELECT Y(POINT(56.7, 53.34));
+-----------------------+
| Y(POINT(56.7, 53.34)) |
+-----------------------+
| 53.34 |
+-----------------------+
A LineString
consists of
Point
values. You can extract particular
points of a LineString
, count the number of
points that it contains, or obtain its length.
Returns the Point
that is the endpoint
of the LineString
value
ls
.
mysql>SET @ls = 'LineString(1 1,2 2,3 3)';
mysql>SELECT AsText(EndPoint(GeomFromText(@ls)));
+-------------------------------------+ | AsText(EndPoint(GeomFromText(@ls))) | +-------------------------------------+ | POINT(3 3) | +-------------------------------------+
Returns as a double-precision number the length of the
LineString
value
ls
in its associated spatial
reference.
mysql>SET @ls = 'LineString(1 1,2 2,3 3)';
mysql>SELECT GLength(GeomFromText(@ls));
+----------------------------+ | GLength(GeomFromText(@ls)) | +----------------------------+ | 2.8284271247462 | +----------------------------+
GLength()
is a nonstandard
name. It corresponds to the OpenGIS
Length()
function.
Returns 1 if the LineString
value
ls
is closed (that is, its
StartPoint()
and
EndPoint()
values are the
same) and is simple (does not pass through the same point
more than once). Returns 0 if
ls
is not closed, and –1
if it is NULL
.
mysql>SET @ls1 = '
Query OK, 0 rows affected (0.00 sec) mysql>LineString
(1 1,2 2,3 3,2 2)';SET @ls2 = 'LineString(1 1,2 2,3 3,1 1)';
Query OK, 0 rows affected (0.00 sec) mysql>SELECT IsClosed(
+------------------------------+ | IsClosed(GeomFromText(@ls1)) | +------------------------------+ | 0 | +------------------------------+ 1 row in set (0.00 sec) mysql>GeomFromText
(@ls1));SELECT IsClosed(GeomFromText(@ls2));
+------------------------------+ | IsClosed(GeomFromText(@ls2)) | +------------------------------+ | 1 | +------------------------------+ 1 row in set (0.00 sec)
Returns the number of Point
objects in
the LineString
value
ls
.
mysql>SET @ls = 'LineString(1 1,2 2,3 3)';
mysql>SELECT NumPoints(GeomFromText(@ls));
+------------------------------+ | NumPoints(GeomFromText(@ls)) | +------------------------------+ | 3 | +------------------------------+
Returns the N
-th
Point
in the
Linestring
value
ls
. Points are numbered
beginning with 1.
mysql>SET @ls = 'LineString(1 1,2 2,3 3)';
mysql>SELECT AsText(PointN(GeomFromText(@ls),2));
+-------------------------------------+ | AsText(PointN(GeomFromText(@ls),2)) | +-------------------------------------+ | POINT(2 2) | +-------------------------------------+
Returns the Point
that is the start
point of the LineString
value
ls
.
mysql>SET @ls = 'LineString(1 1,2 2,3 3)';
mysql>SELECT AsText(StartPoint(GeomFromText(@ls)));
+---------------------------------------+ | AsText(StartPoint(GeomFromText(@ls))) | +---------------------------------------+ | POINT(1 1) | +---------------------------------------+
These functions return properties of
MultiLineString
values.
Returns as a double-precision number the length of the
MultiLineString
value
mls
. The length of
mls
is equal to the sum of the
lengths of its elements.
mysql>SET @mls = 'MultiLineString((1 1,2 2,3 3),(4 4,5 5))';
mysql>SELECT GLength(GeomFromText(@mls));
+-----------------------------+ | GLength(GeomFromText(@mls)) | +-----------------------------+ | 4.2426406871193 | +-----------------------------+
GLength()
is a nonstandard
name. It corresponds to the OpenGIS
Length()
function.
Returns 1 if the MultiLineString
value
mls
is closed (that is, the
StartPoint()
and
EndPoint()
values are the
same for each LineString
in
mls
). Returns 0 if
mls
is not closed, and –1
if it is NULL
.
mysql>SET @mls = 'MultiLineString((1 1,2 2,3 3),(4 4,5 5))';
mysql>SELECT IsClosed(GeomFromText(@mls));
+------------------------------+ | IsClosed(GeomFromText(@mls)) | +------------------------------+ | 0 | +------------------------------+
These functions return properties of
Polygon
values.
Returns as a double-precision number the area of the
Polygon
value
poly
, as measured in its
spatial reference system.
mysql>SET @poly = 'Polygon((0 0,0 3,3 0,0 0),(1 1,1 2,2 1,1 1))';
mysql>SELECT Area(GeomFromText(@poly));
+---------------------------+ | Area(GeomFromText(@poly)) | +---------------------------+ | 4 | +---------------------------+
Returns the exterior ring of the
Polygon
value
poly
as a
LineString
.
mysql>SET @poly =
->'Polygon((0 0,0 3,3 3,3 0,0 0),(1 1,1 2,2 2,2 1,1 1))';
mysql>SELECT AsText(ExteriorRing(GeomFromText(@poly)));
+-------------------------------------------+ | AsText(ExteriorRing(GeomFromText(@poly))) | +-------------------------------------------+ | LINESTRING(0 0,0 3,3 3,3 0,0 0) | +-------------------------------------------+
Returns the N
-th interior ring
for the Polygon
value
poly
as a
LineString
. Rings are numbered
beginning with 1.
mysql>SET @poly =
->'Polygon((0 0,0 3,3 3,3 0,0 0),(1 1,1 2,2 2,2 1,1 1))';
mysql>SELECT AsText(InteriorRingN(GeomFromText(@poly),1));
+----------------------------------------------+ | AsText(InteriorRingN(GeomFromText(@poly),1)) | +----------------------------------------------+ | LINESTRING(1 1,1 2,2 2,2 1,1 1) | +----------------------------------------------+
Returns the number of interior rings in the
Polygon
value
poly
.
mysql>SET @poly =
->'Polygon((0 0,0 3,3 3,3 0,0 0),(1 1,1 2,2 2,2 1,1 1))';
mysql>SELECT NumInteriorRings(GeomFromText(@poly));
+---------------------------------------+ | NumInteriorRings(GeomFromText(@poly)) | +---------------------------------------+ | 1 | +---------------------------------------+
These functions return properties of
MultiPolygon
values.
Returns as a double-precision number the area of the
MultiPolygon
value
mpoly
, as measured in its
spatial reference system.
mysql>SET @mpoly =
->'MultiPolygon(((0 0,0 3,3 3,3 0,0 0),(1 1,1 2,2 2,2 1,1 1)))';
mysql>SELECT Area(GeomFromText(@mpoly));
+----------------------------+ | Area(GeomFromText(@mpoly)) | +----------------------------+ | 8 | +----------------------------+
Returns the mathematical centroid for the
MultiPolygon
value
mpoly
as a
Point
. The result is not guaranteed to
be on the MultiPolygon
.
mysql>SET @poly =
->GeomFromText('POLYGON((0 0,10 0,10 10,0 10,0 0),(5 5,7 5,7 7,5 7,5 5))');
mysql>SELECT GeometryType(@poly),AsText(Centroid(@poly));
+---------------------+--------------------------------------------+ | GeometryType(@poly) | AsText(Centroid(@poly)) | +---------------------+--------------------------------------------+ | POLYGON | POINT(4.958333333333333 4.958333333333333) | +---------------------+--------------------------------------------+
The OpenGIS specification also defines the following function, which MySQL does not implement:
These functions return properties of
GeometryCollection
values.
Returns the N
-th geometry in
the GeometryCollection
value
gc
. Geometries are numbered
beginning with 1.
mysql>SET @gc = 'GeometryCollection(Point(1 1),LineString(2 2, 3 3))';
mysql>SELECT AsText(GeometryN(GeomFromText(@gc),1));
+----------------------------------------+ | AsText(GeometryN(GeomFromText(@gc),1)) | +----------------------------------------+ | POINT(1 1) | +----------------------------------------+
Returns the number of geometries in the
GeometryCollection
value
gc
.
mysql>SET @gc = 'GeometryCollection(Point(1 1),LineString(2 2, 3 3))';
mysql>SELECT NumGeometries(GeomFromText(@gc));
+----------------------------------+ | NumGeometries(GeomFromText(@gc)) | +----------------------------------+ | 2 | +----------------------------------+
The following sections describe functions that take geometry values as arguments and return new geometry values.
Section 12.18.5.2, “Geometry
Property Functions”, discusses
several functions that construct new geometries from existing
ones. See that section for descriptions of these functions:
OpenGIS proposes a number of other functions that can produce geometries. They are designed to implement spatial operators.
Returns a geometry that represents all points whose
distance from the geometry value
g
is less than or equal to a
distance of d
.
Buffer()
supports negative
distances for polygons, multipolygons, and geometry
collections containing polygons or multipolygons. For
point, multipoint, linestring, multilinestring, and
geometry collections not containing any polygons or
multipolygons, Buffer()
with a negative distance returns NULL
.
The OpenGIS specification also defines the following functions, which MySQL does not implement:
Returns a geometry that represents the convex hull of the
geometry value g
.
Returns a geometry that represents the point set
difference of the geometry value
g1
with
g2
.
Returns a geometry that represents the point set
intersection of the geometry values
g1
with
g2
.
Returns a geometry that represents the point set symmetric
difference of the geometry value
g1
with
g2
.
Returns a geometry that represents the point set union of
the geometry values g1
and
g2
.
The functions described in these sections take two geometries as input parameters and return a qualitative or quantitative relation between them.
MySQL provides several functions that test relations between
minimal bounding rectangles of two geometries
g1
and g2
. The return
values 1 and 0 indicate true and false, respectively.
Returns 1 or 0 to indicate whether the Minimum Bounding
Rectangle of g1
contains the
Minimum Bounding Rectangle of
g2
. This tests the opposite
relationship as
MBRWithin()
.
mysql>SET @g1 = GeomFromText('Polygon((0 0,0 3,3 3,3 0,0 0))');
mysql>SET @g2 = GeomFromText('Point(1 1)');
mysql>SELECT MBRContains(@g1,@g2), MBRContains(@g2,@g1);
----------------------+----------------------+ | MBRContains(@g1,@g2) | MBRContains(@g2,@g1) | +----------------------+----------------------+ | 1 | 0 | +----------------------+----------------------+
Returns 1 or 0 to indicate whether the Minimum Bounding
Rectangles of the two geometries
g1
and
g2
are disjoint (do not
intersect).
Returns 1 or 0 to indicate whether the Minimum Bounding
Rectangles of the two geometries
g1
and
g2
are the same.
Returns 1 or 0 to indicate whether the Minimum Bounding
Rectangles of the two geometries
g1
and
g2
intersect.
Returns 1 or 0 to indicate whether the Minimum Bounding
Rectangles of the two geometries
g1
and
g2
overlap. The term
spatially overlaps is used if two
geometries intersect and their intersection results in a
geometry of the same dimension but not equal to either of
the given geometries.
Returns 1 or 0 to indicate whether the Minimum Bounding
Rectangles of the two geometries
g1
and
g2
touch. Two geometries
spatially touch if the interiors of
the geometries do not intersect, but the boundary of one
of the geometries intersects either the boundary or the
interior of the other.
Returns 1 or 0 to indicate whether the Minimum Bounding
Rectangle of g1
is within the
Minimum Bounding Rectangle of
g2
. This tests the opposite
relationship as
MBRContains()
.
mysql>SET @g1 = GeomFromText('Polygon((0 0,0 3,3 3,3 0,0 0))');
mysql>SET @g2 = GeomFromText('Polygon((0 0,0 5,5 5,5 0,0 0))');
mysql>SELECT MBRWithin(@g1,@g2), MBRWithin(@g2,@g1);
+--------------------+--------------------+ | MBRWithin(@g1,@g2) | MBRWithin(@g2,@g1) | +--------------------+--------------------+ | 1 | 0 | +--------------------+--------------------+
The OpenGIS specification defines the following functions.
They test the relationship between two geometry values
g1
and g2
.
The return values 1 and 0 indicate true and false, respectively.
MySQL originally implemented these functions such that they
used object bounding rectangles and returned the same result
as the corresponding MBR-based functions. Corresponding
versions are available that use precise object shapes. These
versions are named with an ST_
prefix.
For example, Contains()
uses
object bounding rectangles, whereas
ST_Contains()
uses object
shapes.
There are also ST_
aliases for existing
spatial functions that were already exact. For example,
ST_IsEmpty()
is an alias for
IsEmpty()
Returns 1 or 0 to indicate whether
g1
completely contains
g2
. This tests the opposite
relationship as
ST_Within()
.
Returns 1 if g1
spatially
crosses g2
. Returns
NULL
if g1
is a
Polygon
or a
MultiPolygon
, or if
g2
is a
Point
or a
MultiPoint
. Otherwise, returns 0.
The term spatially crosses denotes a spatial relation between two given geometries that has the following properties:
The two geometries intersect
Their intersection results in a geometry that has a dimension that is one less than the maximum dimension of the two given geometries
Their intersection is not equal to either of the two given geometries
Returns 1 or 0 to indicate whether
g1
is spatially disjoint from
(does not intersect) g2
.
Returns 1 or 0 to indicate whether
g1
is spatially equal to
g2
.
Returns 1 or 0 to indicate whether
g1
spatially intersects
g2
.
Returns 1 or 0 to indicate whether
g1
spatially overlaps
g2
. The term
spatially overlaps is used if two
geometries intersect and their intersection results in a
geometry of the same dimension but not equal to either of
the given geometries.
Returns 1 or 0 to indicate whether
g1
spatially touches
g2
. Two geometries
spatially touch if the interiors of
the geometries do not intersect, but the boundary of one
of the geometries intersects either the boundary or the
interior of the other.
Returns 1 or 0 to indicate whether
g1
is spatially within
g2
. This tests the opposite
relationship as
ST_Contains()
.
Returns 1 or 0 to indicate whether
g1
completely contains
g2
. This tests the opposite
relationship as Within()
.
Returns 1 if g1
spatially
crosses g2
. Returns
NULL
if g1
is a
Polygon
or a
MultiPolygon
, or if
g2
is a
Point
or a
MultiPoint
. Otherwise, returns 0.
The term spatially crosses denotes a spatial relation between two given geometries that has the following properties:
The two geometries intersect
Their intersection results in a geometry that has a dimension that is one less than the maximum dimension of the two given geometries
Their intersection is not equal to either of the two given geometries
Returns 1 or 0 to indicate whether
g1
is spatially disjoint from
(does not intersect) g2
.
Returns 1 or 0 to indicate whether
g1
is spatially equal to
g2
.
Returns 1 or 0 to indicate whether
g1
spatially intersects
g2
.
Returns 1 or 0 to indicate whether
g1
spatially overlaps
g2
. The term
spatially overlaps is used if two
geometries intersect and their intersection results in a
geometry of the same dimension but not equal to either of
the given geometries.
Returns 1 or 0 to indicate whether
g1
spatially touches
g2
. Two geometries
spatially touch if the interiors of
the geometries do not intersect, but the boundary of one
of the geometries intersects either the boundary or the
interior of the other.
Returns 1 or 0 to indicate whether
g1
is spatially within
g2
. This tests the opposite
relationship as Contains()
.
For MyISAM
tables, Search operations
in nonspatial databases can be optimized using
SPATIAL
indexes. This is true for spatial
databases as well. With the help of a great variety of
multi-dimensional indexing methods that have previously been
designed, it is possible to optimize spatial searches. The most
typical of these are:
Point queries that search for all objects that contain a given point
Region queries that search for all objects that overlap a given region
MySQL uses R-Trees with quadratic
splitting for SPATIAL
indexes on
spatial columns. A SPATIAL
index is built using
the MBR of a geometry. For most geometries, the MBR is a minimum
rectangle that surrounds the geometries. For a horizontal or a
vertical linestring, the MBR is a rectangle degenerated into the
linestring. For a point, the MBR is a rectangle degenerated into
the point.
It is also possible to create normal indexes on spatial columns.
In a non-SPATIAL
index, you must declare a
prefix for any spatial column except for POINT
columns.
MyISAM
supports both SPATIAL
and non-SPATIAL
indexes. Other storage engines
support non-SPATIAL
indexes, as described in
Section 13.1.11, “CREATE INDEX
Syntax”.
For MyISAM
tables, MySQL can create
spatial indexes using syntax similar to that for creating
regular indexes, but extended with the
SPATIAL
keyword. Currently, columns in
spatial indexes must be declared NOT NULL
.
The following examples demonstrate how to create spatial
indexes:
With CREATE TABLE
:
CREATE TABLE geom (g GEOMETRY NOT NULL, SPATIAL INDEX(g)) ENGINE=MyISAM;
With ALTER TABLE
:
ALTER TABLE geom ADD SPATIAL INDEX(g);
With CREATE INDEX
:
CREATE SPATIAL INDEX sp_index ON geom (g);
For MyISAM
tables, SPATIAL
INDEX
creates an R-tree index. For storage engines
that support nonspatial indexing of spatial columns, the engine
creates a B-tree index. A B-tree index on spatial values will be
useful for exact-value lookups, but not for range scans.
For more information on indexing spatial columns, see
Section 13.1.11, “CREATE INDEX
Syntax”.
To drop spatial indexes, use ALTER
TABLE
or DROP INDEX
:
With ALTER TABLE
:
ALTER TABLE geom DROP INDEX g;
With DROP INDEX
:
DROP INDEX sp_index ON geom;
Example: Suppose that a table geom
contains
more than 32,000 geometries, which are stored in the column
g
of type GEOMETRY
. The
table also has an AUTO_INCREMENT
column
fid
for storing object ID values.
mysql>DESCRIBE geom;
+-------+----------+------+-----+---------+----------------+ | Field | Type | Null | Key | Default | Extra | +-------+----------+------+-----+---------+----------------+ | fid | int(11) | | PRI | NULL | auto_increment | | g | geometry | | | | | +-------+----------+------+-----+---------+----------------+ 2 rows in set (0.00 sec) mysql>SELECT COUNT(*) FROM geom;
+----------+ | count(*) | +----------+ | 32376 | +----------+ 1 row in set (0.00 sec)
To add a spatial index on the column g
, use
this statement:
mysql> ALTER TABLE geom ADD SPATIAL INDEX(g);
Query OK, 32376 rows affected (4.05 sec)
Records: 32376 Duplicates: 0 Warnings: 0
The optimizer investigates whether available spatial indexes can
be involved in the search for queries that use a function such
as MBRContains()
or
MBRWithin()
in the
WHERE
clause. The following query finds all
objects that are in the given rectangle:
mysql>SET @poly =
->'Polygon((30000 15000, 31000 15000, 31000 16000, 30000 16000, 30000 15000))';
mysql>SELECT fid,AsText(g) FROM geom WHERE
->MBRContains(GeomFromText(@poly),g);
+-----+---------------------------------------------------------------+ | fid | AsText(g) | +-----+---------------------------------------------------------------+ | 21 | LINESTRING(30350.4 15828.8,30350.6 15845,30333.8 15845,30 ... | | 22 | LINESTRING(30350.6 15871.4,30350.6 15887.8,30334 15887.8, ... | | 23 | LINESTRING(30350.6 15914.2,30350.6 15930.4,30334 15930.4, ... | | 24 | LINESTRING(30290.2 15823,30290.2 15839.4,30273.4 15839.4, ... | | 25 | LINESTRING(30291.4 15866.2,30291.6 15882.4,30274.8 15882. ... | | 26 | LINESTRING(30291.6 15918.2,30291.6 15934.4,30275 15934.4, ... | | 249 | LINESTRING(30337.8 15938.6,30337.8 15946.8,30320.4 15946. ... | | 1 | LINESTRING(30250.4 15129.2,30248.8 15138.4,30238.2 15136. ... | | 2 | LINESTRING(30220.2 15122.8,30217.2 15137.8,30207.6 15136, ... | | 3 | LINESTRING(30179 15114.4,30176.6 15129.4,30167 15128,3016 ... | | 4 | LINESTRING(30155.2 15121.4,30140.4 15118.6,30142 15109,30 ... | | 5 | LINESTRING(30192.4 15085,30177.6 15082.2,30179.2 15072.4, ... | | 6 | LINESTRING(30244 15087,30229 15086.2,30229.4 15076.4,3024 ... | | 7 | LINESTRING(30200.6 15059.4,30185.6 15058.6,30186 15048.8, ... | | 10 | LINESTRING(30179.6 15017.8,30181 15002.8,30190.8 15003.6, ... | | 11 | LINESTRING(30154.2 15000.4,30168.6 15004.8,30166 15014.2, ... | | 13 | LINESTRING(30105 15065.8,30108.4 15050.8,30118 15053,3011 ... | | 154 | LINESTRING(30276.2 15143.8,30261.4 15141,30263 15131.4,30 ... | | 155 | LINESTRING(30269.8 15084,30269.4 15093.4,30258.6 15093,30 ... | | 157 | LINESTRING(30128.2 15011,30113.2 15010.2,30113.6 15000.4, ... | +-----+---------------------------------------------------------------+ 20 rows in set (0.00 sec)
Use EXPLAIN
to check the way this
query is executed:
mysql>SET @poly =
->'Polygon((30000 15000, 31000 15000, 31000 16000, 30000 16000, 30000 15000))';
mysql>EXPLAIN SELECT fid,AsText(g) FROM geom WHERE
->MBRContains(GeomFromText(@poly),g)\G
*************************** 1. row *************************** id: 1 select_type: SIMPLE table: geom type: range possible_keys: g key: g key_len: 32 ref: NULL rows: 50 Extra: Using where 1 row in set (0.00 sec)
Check what would happen without a spatial index:
mysql>SET @poly =
->'Polygon((30000 15000, 31000 15000, 31000 16000, 30000 16000, 30000 15000))';
mysql>EXPLAIN SELECT fid,AsText(g) FROM g IGNORE INDEX (g) WHERE
->MBRContains(GeomFromText(@poly),g)\G
*************************** 1. row *************************** id: 1 select_type: SIMPLE table: geom type: ALL possible_keys: NULL key: NULL key_len: NULL ref: NULL rows: 32376 Extra: Using where 1 row in set (0.00 sec)
Executing the SELECT
statement
without the spatial index yields the same result but causes the
execution time to rise from 0.00 seconds to 0.46 seconds:
mysql>SET @poly =
->'Polygon((30000 15000, 31000 15000, 31000 16000, 30000 16000, 30000 15000))';
mysql>SELECT fid,AsText(g) FROM geom IGNORE INDEX (g) WHERE
->MBRContains(GeomFromText(@poly),g);
+-----+---------------------------------------------------------------+ | fid | AsText(g) | +-----+---------------------------------------------------------------+ | 1 | LINESTRING(30250.4 15129.2,30248.8 15138.4,30238.2 15136. ... | | 2 | LINESTRING(30220.2 15122.8,30217.2 15137.8,30207.6 15136, ... | | 3 | LINESTRING(30179 15114.4,30176.6 15129.4,30167 15128,3016 ... | | 4 | LINESTRING(30155.2 15121.4,30140.4 15118.6,30142 15109,30 ... | | 5 | LINESTRING(30192.4 15085,30177.6 15082.2,30179.2 15072.4, ... | | 6 | LINESTRING(30244 15087,30229 15086.2,30229.4 15076.4,3024 ... | | 7 | LINESTRING(30200.6 15059.4,30185.6 15058.6,30186 15048.8, ... | | 10 | LINESTRING(30179.6 15017.8,30181 15002.8,30190.8 15003.6, ... | | 11 | LINESTRING(30154.2 15000.4,30168.6 15004.8,30166 15014.2, ... | | 13 | LINESTRING(30105 15065.8,30108.4 15050.8,30118 15053,3011 ... | | 21 | LINESTRING(30350.4 15828.8,30350.6 15845,30333.8 15845,30 ... | | 22 | LINESTRING(30350.6 15871.4,30350.6 15887.8,30334 15887.8, ... | | 23 | LINESTRING(30350.6 15914.2,30350.6 15930.4,30334 15930.4, ... | | 24 | LINESTRING(30290.2 15823,30290.2 15839.4,30273.4 15839.4, ... | | 25 | LINESTRING(30291.4 15866.2,30291.6 15882.4,30274.8 15882. ... | | 26 | LINESTRING(30291.6 15918.2,30291.6 15934.4,30275 15934.4, ... | | 154 | LINESTRING(30276.2 15143.8,30261.4 15141,30263 15131.4,30 ... | | 155 | LINESTRING(30269.8 15084,30269.4 15093.4,30258.6 15093,30 ... | | 157 | LINESTRING(30128.2 15011,30113.2 15010.2,30113.6 15000.4, ... | | 249 | LINESTRING(30337.8 15938.6,30337.8 15946.8,30320.4 15946. ... | +-----+---------------------------------------------------------------+ 20 rows in set (0.46 sec)
MySQL does not yet implement the following GIS features:
Additional Metadata Views
OpenGIS specifications propose several additional metadata
views. For example, a system view named
GEOMETRY_COLUMNS
contains a description of
geometry columns, one row for each geometry column in the
database.
The OpenGIS function Length()
on LineString
and
MultiLineString
currently should be called
in MySQL as GLength()
The problem is that there is an existing SQL function
Length()
that calculates the
length of string values, and sometimes it is not possible to
distinguish whether the function is called in a textual or
spatial context. We need either to solve this somehow, or
decide on another function name.
MySQL 5.7 provides support for precision math: numeric value handling that results in extremely accurate results and a high degree control over invalid values. Precision math is based on these two features:
SQL modes that control how strict the server is about accepting or rejecting invalid data.
The MySQL library for fixed-point arithmetic.
These features have several implications for numeric operations and provide a high degree of compliance with standard SQL:
Precise calculations: For
exact-value numbers, calculations do not introduce
floating-point errors. Instead, exact precision is used. For
example, MySQL treats a number such as .0001
as an exact value rather than as an approximation, and summing
it 10,000 times produces a result of exactly
1
, not a value that is merely
“close” to 1.
Well-defined rounding behavior:
For exact-value numbers, the result of
ROUND()
depends on its argument,
not on environmental factors such as how the underlying C
library works.
Platform independence: Operations on exact numeric values are the same across different platforms such as Windows and Unix.
Control over handling of invalid
values: Overflow and division by zero are detectable
and can be treated as errors. For example, you can treat a value
that is too large for a column as an error rather than having
the value truncated to lie within the range of the column's data
type. Similarly, you can treat division by zero as an error
rather than as an operation that produces a result of
NULL
. The choice of which approach to take is
determined by the setting of the server SQL mode.
The following discussion covers several aspects of how precision math works, including possible incompatibilities with older applications. At the end, some examples are given that demonstrate how MySQL 5.7 handles numeric operations precisely. For information about controlling the SQL mode, see Section 5.1.7, “Server SQL Modes”.
The scope of precision math for exact-value operations includes
the exact-value data types (DECIMAL
and integer types) and exact-value numeric literals.
Approximate-value data types and numeric literals are handled as
floating-point numbers.
Exact-value numeric literals have an integer part or fractional
part, or both. They may be signed. Examples: 1
,
.2
, 3.4
,
-5
, -6.78
,
+9.10
.
Approximate-value numeric literals are represented in scientific
notation with a mantissa and exponent. Either or both parts may be
signed. Examples: 1.2E3
,
1.2E-3
, -1.2E3
,
-1.2E-3
.
Two numbers that look similar may be treated differently. For
example, 2.34
is an exact-value (fixed-point)
number, whereas 2.34E0
is an approximate-value
(floating-point) number.
The DECIMAL
data type is a
fixed-point type and calculations are exact. In MySQL, the
DECIMAL
type has several synonyms:
NUMERIC
,
DEC
,
FIXED
. The integer types also are
exact-value types.
The FLOAT
and
DOUBLE
data types are
floating-point types and calculations are approximate. In MySQL,
types that are synonymous with
FLOAT
or
DOUBLE
are
DOUBLE PRECISION
and
REAL
.
This section discusses the characteristics of the
DECIMAL
data type (and its
synonyms) in MySQL 5.7, with particular regard to the
following topics:
Maximum number of digits
Storage format
Storage requirements
The nonstandard MySQL extension to the upper range of
DECIMAL
columns
Possible incompatibilities with applications that are written for older versions of MySQL (prior to 5.0.3) are noted throughout this section.
The declaration syntax for a
DECIMAL
column is
DECIMAL(
.
The ranges of values for the arguments in MySQL 5.7
are as follows:
M
,D
)
M
is the maximum number of digits
(the precision). It has a range of 1 to 65. (Older versions of
MySQL permitted a range of 1 to 254.)
D
is the number of digits to the
right of the decimal point (the scale). It has a range of 0 to
30 and must be no larger than M
.
The maximum value of 65 for M
means
that calculations on DECIMAL
values
are accurate up to 65 digits. This limit of 65 digits of precision
also applies to exact-value numeric literals, so the maximum range
of such literals differs from before. (In older versions of MySQL,
decimal values could have up to 254 digits. However, calculations
were done using floating-point and thus were approximate, not
exact.)
Values for DECIMAL
columns in MySQL
5.7 are stored using a binary format that packs nine
decimal digits into 4 bytes. The storage requirements for the
integer and fractional parts of each value are determined
separately. Each multiple of nine digits requires 4 bytes, and any
remaining digits left over require some fraction of 4 bytes. The
storage required for remaining digits is given by the following
table.
Leftover Digits | Number of Bytes |
---|---|
0 | 0 |
1–2 | 1 |
3–4 | 2 |
5–6 | 3 |
7–9 | 4 |
For example, a DECIMAL(18,9)
column has nine
digits on either side of the decimal point, so the integer part
and the fractional part each require 4 bytes. A
DECIMAL(20,6)
column has fourteen integer
digits and six fractional digits. The integer digits require four
bytes for nine of the digits and 3 bytes for the remaining five
digits. The six fractional digits require 3 bytes.
Unlike some older versions of MySQL,
DECIMAL
columns in MySQL
5.7 do not store a leading +
character or -
character or leading
0
digits. If you insert
+0003.1
into a DECIMAL(5,1)
column, it is stored as 3.1
. For negative
numbers, a literal -
character is not stored.
Applications that rely on the older behavior must be modified to
account for this change.
DECIMAL
columns in MySQL
5.7 do not permit values larger than the range
implied by the column definition. For example, a
DECIMAL(3,0)
column supports a range of
-999
to 999
. A
DECIMAL(
column permits at most M
,D
)M
-
D
digits to the left of the decimal
point. This is not compatible with applications relying on older
versions of MySQL that permitted storing an extra digit in lieu of
a +
sign.
The SQL standard requires that the precision of
NUMERIC(
be exactly M
,D
)M
digits. For
DECIMAL(
,
the standard requires a precision of at least
M
,D
)M
digits but permits more. In MySQL,
DECIMAL(
and
M
,D
)NUMERIC(
are the same, and both have a precision of exactly
M
,D
)M
digits.
For more detailed information about porting applications that rely
on the old treatment of the DECIMAL
data type, see the MySQL 5.0 Reference
Manual.
With precision math, exact-value numbers are used as given
whenever possible. For example, numbers in comparisons are used
exactly as given without a change in value. In strict SQL mode,
for INSERT
into a column with an
exact data type (DECIMAL
or
integer), a number is inserted with its exact value if it is
within the column range. When retrieved, the value should be the
same as what was inserted. (Without strict mode, truncation for
INSERT
is permissible.)
Handling of a numeric expression depends on what kind of values the expression contains:
If any approximate values are present, the expression is approximate and is evaluated using floating-point arithmetic.
If no approximate values are present, the expression contains
only exact values. If any exact value contains a fractional
part (a value following the decimal point), the expression is
evaluated using DECIMAL
exact
arithmetic and has a precision of 65 digits. The term
“exact” is subject to the limits of what can be
represented in binary. For example, 1.0/3.0
can be approximated in decimal notation as
.333...
, but not written as an exact
number, so (1.0/3.0)*3.0
does not evaluate
to exactly 1.0
.
Otherwise, the expression contains only integer values. The
expression is exact and is evaluated using integer arithmetic
and has a precision the same as
BIGINT
(64 bits).
If a numeric expression contains any strings, they are converted to double-precision floating-point values and the expression is approximate.
Inserts into numeric columns are affected by the SQL mode, which
is controlled by the sql_mode
system variable. (See Section 5.1.7, “Server SQL Modes”.) The
following discussion mentions strict mode (selected by the
STRICT_ALL_TABLES
or
STRICT_TRANS_TABLES
mode values)
and ERROR_FOR_DIVISION_BY_ZERO
.
To turn on all restrictions, you can simply use
TRADITIONAL
mode, which includes
both strict mode values and
ERROR_FOR_DIVISION_BY_ZERO
:
mysql> SET sql_mode='TRADITIONAL';
If a number is inserted into an exact type column
(DECIMAL
or integer), it is
inserted with its exact value if it is within the column range.
If the value has too many digits in the fractional part, rounding occurs and a warning is generated. Rounding is done as described in Section 12.19.4, “Rounding Behavior”.
If the value has too many digits in the integer part, it is too large and is handled as follows:
If strict mode is not enabled, the value is truncated to the nearest legal value and a warning is generated.
If strict mode is enabled, an overflow error occurs.
Underflow is not detected, so underflow handling is undefined.
By default, division by zero produces a result of
NULL
and no warning. With the
ERROR_FOR_DIVISION_BY_ZERO
SQL
mode enabled, MySQL handles division by zero differently:
If strict mode is not enabled, a warning occurs.
If strict mode is enabled, inserts and updates involving division by zero are prohibited, and an error occurs.
In other words, inserts and updates involving expressions that
perform division by zero can be treated as errors, but this
requires
ERROR_FOR_DIVISION_BY_ZERO
in
addition to strict mode.
Suppose that we have this statement:
INSERT INTO t SET i = 1/0;
This is what happens for combinations of strict and
ERROR_FOR_DIVISION_BY_ZERO
modes.
sql_mode Value | Result |
---|---|
'' (Default) | No warning, no error; i is set to
NULL . |
strict | No warning, no error; i is set to
NULL . |
ERROR_FOR_DIVISION_BY_ZERO | Warning, no error; i is set to
NULL . |
strict,ERROR_FOR_DIVISION_BY_ZERO | Error condition; no row is inserted. |
For inserts of strings into numeric columns, conversion from string to number is handled as follows if the string has nonnumeric contents:
A string that does not begin with a number cannot be used as a number and produces an error in strict mode, or a warning otherwise. This includes the empty string.
A string that begins with a number can be converted, but the trailing nonnumeric portion is truncated. If the truncated portion contains anything other than spaces, this produces an error in strict mode, or a warning otherwise.
This section discusses precision math rounding for the
ROUND()
function and for inserts
into columns with exact-value types
(DECIMAL
and integer).
The ROUND()
function rounds
differently depending on whether its argument is exact or
approximate:
For exact-value numbers,
ROUND()
uses the “round
half up” rule: A value with a fractional part of .5 or
greater is rounded up to the next integer if positive or down
to the next integer if negative. (In other words, it is
rounded away from zero.) A value with a fractional part less
than .5 is rounded down to the next integer if positive or up
to the next integer if negative.
For approximate-value numbers, the result depends on the C
library. On many systems, this means that
ROUND()
uses the “round
to nearest even” rule: A value with any fractional part
is rounded to the nearest even integer.
The following example shows how rounding differs for exact and approximate values:
mysql> SELECT ROUND(2.5), ROUND(25E-1);
+------------+--------------+
| ROUND(2.5) | ROUND(25E-1) |
+------------+--------------+
| 3 | 2 |
+------------+--------------+
For inserts into a DECIMAL
or
integer column, the target is an exact data type, so rounding uses
“round half away from zero,” regardless of whether
the value to be inserted is exact or approximate:
mysql>CREATE TABLE t (d DECIMAL(10,0));
Query OK, 0 rows affected (0.00 sec) mysql>INSERT INTO t VALUES(2.5),(2.5E0);
Query OK, 2 rows affected, 2 warnings (0.00 sec) Records: 2 Duplicates: 0 Warnings: 2 mysql>SELECT d FROM t;
+------+ | d | +------+ | 3 | | 3 | +------+
This section provides some examples that show precision math query results in MySQL 5.7. These examples demonstrate the principles described in Section 12.19.3, “Expression Handling”, and Section 12.19.4, “Rounding Behavior”.
Example 1. Numbers are used with their exact value as given when possible:
mysql> SELECT (.1 + .2) = .3;
+----------------+
| (.1 + .2) = .3 |
+----------------+
| 1 |
+----------------+
For floating-point values, results are inexact:
mysql> SELECT (.1E0 + .2E0) = .3E0;
+----------------------+
| (.1E0 + .2E0) = .3E0 |
+----------------------+
| 0 |
+----------------------+
Another way to see the difference in exact and approximate value
handling is to add a small number to a sum many times. Consider
the following stored procedure, which adds
.0001
to a variable 1,000 times.
CREATE PROCEDURE p () BEGIN DECLARE i INT DEFAULT 0; DECLARE d DECIMAL(10,4) DEFAULT 0; DECLARE f FLOAT DEFAULT 0; WHILE i < 10000 DO SET d = d + .0001; SET f = f + .0001E0; SET i = i + 1; END WHILE; SELECT d, f; END;
The sum for both d
and f
logically should be 1, but that is true only for the decimal
calculation. The floating-point calculation introduces small
errors:
+--------+------------------+ | d | f | +--------+------------------+ | 1.0000 | 0.99999999999991 | +--------+------------------+
Example 2. Multiplication is
performed with the scale required by standard SQL. That is, for
two numbers X1
and
X2
that have scale
S1
and S2
,
the scale of the result is
:
S1
+ S2
mysql> SELECT .01 * .01;
+-----------+
| .01 * .01 |
+-----------+
| 0.0001 |
+-----------+
Example 3. Rounding behavior for exact-value numbers is well-defined:
Rounding behavior (for example, with the
ROUND()
function) is independent of
the implementation of the underlying C library, which means that
results are consistent from platform to platform.
Rounding for exact-value columns
(DECIMAL
and integer) and
exact-valued numbers uses the “round half away from
zero” rule. Values with a fractional part of .5 or
greater are rounded away from zero to the nearest integer, as
shown here:
mysql> SELECT ROUND(2.5), ROUND(-2.5);
+------------+-------------+
| ROUND(2.5) | ROUND(-2.5) |
+------------+-------------+
| 3 | -3 |
+------------+-------------+
Rounding for floating-point values uses the C library, which on many systems uses the “round to nearest even” rule. Values with any fractional part on such systems are rounded to the nearest even integer:
mysql> SELECT ROUND(2.5E0), ROUND(-2.5E0);
+--------------+---------------+
| ROUND(2.5E0) | ROUND(-2.5E0) |
+--------------+---------------+
| 2 | -2 |
+--------------+---------------+
Example 4. In strict mode, inserting a value that is out of range for a column causes an error, rather than truncation to a legal value.
When MySQL is not running in strict mode, truncation to a legal value occurs:
mysql>SET sql_mode='';
Query OK, 0 rows affected (0.00 sec) mysql>CREATE TABLE t (i TINYINT);
Query OK, 0 rows affected (0.01 sec) mysql>INSERT INTO t SET i = 128;
Query OK, 1 row affected, 1 warning (0.00 sec) mysql>SELECT i FROM t;
+------+ | i | +------+ | 127 | +------+ 1 row in set (0.00 sec)
However, an error occurs if strict mode is in effect:
mysql>SET sql_mode='STRICT_ALL_TABLES';
Query OK, 0 rows affected (0.00 sec) mysql>CREATE TABLE t (i TINYINT);
Query OK, 0 rows affected (0.00 sec) mysql>INSERT INTO t SET i = 128;
ERROR 1264 (22003): Out of range value adjusted for column 'i' at row 1 mysql>SELECT i FROM t;
Empty set (0.00 sec)
Example 5: In strict mode and
with ERROR_FOR_DIVISION_BY_ZERO
set, division by zero causes an error, not a result of
NULL
.
In nonstrict mode, division by zero has a result of
NULL
:
mysql>SET sql_mode='';
Query OK, 0 rows affected (0.01 sec) mysql>CREATE TABLE t (i TINYINT);
Query OK, 0 rows affected (0.00 sec) mysql>INSERT INTO t SET i = 1 / 0;
Query OK, 1 row affected (0.00 sec) mysql>SELECT i FROM t;
+------+ | i | +------+ | NULL | +------+ 1 row in set (0.03 sec)
However, division by zero is an error if the proper SQL modes are in effect:
mysql>SET sql_mode='STRICT_ALL_TABLES,ERROR_FOR_DIVISION_BY_ZERO';
Query OK, 0 rows affected (0.00 sec) mysql>CREATE TABLE t (i TINYINT);
Query OK, 0 rows affected (0.00 sec) mysql>INSERT INTO t SET i = 1 / 0;
ERROR 1365 (22012): Division by 0 mysql>SELECT i FROM t;
Empty set (0.01 sec)
Example 6. Exact-value literals are evaluated as exact values.
Prior to MySQL 5.0.3, exact-value and approximate-value literals both are evaluated as double-precision floating-point values:
mysql>SELECT VERSION();
+------------+ | VERSION() | +------------+ | 4.1.18-log | +------------+ 1 row in set (0.01 sec) mysql>CREATE TABLE t SELECT 2.5 AS a, 25E-1 AS b;
Query OK, 1 row affected (0.07 sec) Records: 1 Duplicates: 0 Warnings: 0 mysql>DESCRIBE t;
+-------+-------------+------+-----+---------+-------+ | Field | Type | Null | Key | Default | Extra | +-------+-------------+------+-----+---------+-------+ | a | double(3,1) | | | 0.0 | | | b | double | | | 0 | | +-------+-------------+------+-----+---------+-------+ 2 rows in set (0.04 sec)
As of MySQL 5.0.3, the approximate-value literal is evaluated
using floating point, but the exact-value literal is handled as
DECIMAL
:
mysql>SELECT VERSION();
+-----------------+ | VERSION() | +-----------------+ | 5.1.6-alpha-log | +-----------------+ 1 row in set (0.11 sec) mysql>CREATE TABLE t SELECT 2.5 AS a, 25E-1 AS b;
Query OK, 1 row affected (0.01 sec) Records: 1 Duplicates: 0 Warnings: 0 mysql>DESCRIBE t;
+-------+-----------------------+------+-----+---------+-------+ | Field | Type | Null | Key | Default | Extra | +-------+-----------------------+------+-----+---------+-------+ | a | decimal(2,1) unsigned | NO | | 0.0 | | | b | double | NO | | 0 | | +-------+-----------------------+------+-----+---------+-------+ 2 rows in set (0.01 sec)
Example 7. If the argument to an aggregate function is an exact numeric type, the result is also an exact numeric type, with a scale at least that of the argument.
Consider these statements:
mysql>CREATE TABLE t (i INT, d DECIMAL, f FLOAT);
mysql>INSERT INTO t VALUES(1,1,1);
mysql>CREATE TABLE y SELECT AVG(i), AVG(d), AVG(f) FROM t;
Before MySQL 5.0.3, the result is a double no matter the argument type:
mysql> DESCRIBE y;
+--------+--------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+--------+--------------+------+-----+---------+-------+
| AVG(i) | double(17,4) | YES | | NULL | |
| AVG(d) | double(17,4) | YES | | NULL | |
| AVG(f) | double | YES | | NULL | |
+--------+--------------+------+-----+---------+-------+
As of MySQL 5.0.3, the result is a double only for the floating-point argument. For exact type arguments, the result is also an exact type:
mysql> DESCRIBE y;
+--------+---------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+--------+---------------+------+-----+---------+-------+
| AVG(i) | decimal(14,4) | YES | | NULL | |
| AVG(d) | decimal(14,4) | YES | | NULL | |
| AVG(f) | double | YES | | NULL | |
+--------+---------------+------+-----+---------+-------+
The result is a double only for the floating-point argument. For exact type arguments, the result is also an exact type.