MySQL supports the following JOIN syntax for the table_references part of SELECT statements and multiple-table DELETE and UPDATE statements: table_references: escaped_table_reference [, escaped_table_reference] ... escaped_table_reference: { table_reference | { OJ table_reference } } table_reference: { table_factor | joined_table } table_factor: { tbl_name [PARTITION (partition_names)] [[AS] alias] [index_hint_list] | table_subquery [AS] alias | ( table_references ) } joined_table: { table_reference [INNER | CROSS] JOIN table_factor [join_specification] | table_reference STRAIGHT_JOIN table_factor | table_reference STRAIGHT_JOIN table_factor ON search_condition | table_reference {LEFT|RIGHT} [OUTER] JOIN table_reference join_specification | table_reference NATURAL [{LEFT|RIGHT} [OUTER]] JOIN table_factor } join_specification: { ON search_condition | USING (join_column_list) } join_column_list: column_name [, column_name] ... index_hint_list: index_hint [, index_hint] ... index_hint: { USE {INDEX|KEY} [FOR {JOIN|ORDER BY|GROUP BY}] ([index_list]) | {IGNORE|FORCE} {INDEX|KEY} [FOR {JOIN|ORDER BY|GROUP BY}] (index_list) } index_list: index_name [, index_name] ...A table reference is also known as a join expression. A table reference (when it refers to a partitioned table) may contain a PARTITION clause, including a list of comma-separated partitions, subpartitions, or both. This option follows the name of the table and precedes any alias declaration. The effect of this option is that rows are selected only from the listed partitions or subpartitions. Any partitions or subpartitions not named in the list are ignored. For more information and examples, see Section 19.5, “Partition Selection”. The syntax of table_factor is extended in MySQL in comparison with standard SQL. The standard accepts only table_reference, not a list of them inside a pair of parentheses. This is a conservative extension if each comma in a list of table_reference items is considered as equivalent to an inner join. For example: SELECT * FROM t1 LEFT JOIN (t2, t3, t4) ON (t2.a = t1.a AND t3.b = t1.b AND t4.c = t1.c)is equivalent to: SELECT * FROM t1 LEFT JOIN (t2 CROSS JOIN t3 CROSS JOIN t4) ON (t2.a = t1.a AND t3.b = t1.b AND t4.c = t1.c)In MySQL, JOIN, CROSS JOIN, and INNER JOIN are syntactic equivalents (they can replace each other). In standard SQL, they are not equivalent. INNER JOIN is used with an ON clause, CROSS JOIN is used otherwise. In general, parentheses can be ignored in join expressions containing only inner join operations. MySQL also supports nested joins. See Section 8.2.1.7, “Nested Join Optimization”. Index hints can be specified to affect how the MySQL optimizer makes use of indexes. For more information, see Section 8.9.3, “Index Hints”. The optimizer_switch system variable is another way to influence optimizer use of indexes. See Section 8.9.2, “Switchable Optimizations”. The following list describes general factors to take into account when writing joins:
Some join examples: SELECT * FROM table1, table2; SELECT * FROM table1 INNER JOIN table2 ON table1.id = table2.id; SELECT * FROM table1 LEFT JOIN table2 ON table1.id = table2.id; SELECT * FROM table1 LEFT JOIN table2 USING (id); SELECT * FROM table1 LEFT JOIN table2 ON table1.id = table2.id LEFT JOIN table3 ON table2.id = table3.id;Natural joins and joins with USING, including outer join variants, are processed according to the SQL:2003 standard:
Page 2SELECT ... UNION [ALL | DISTINCT] SELECT ... [UNION [ALL | DISTINCT] SELECT ...] UNION combines the result from multiple SELECT statements into a single result set. The result set column names are taken from the column names of the first SELECT statement. Selected columns listed in corresponding positions of each SELECT statement should have the same data type. For example, the first column selected by the first statement should have the same type as the first column selected by the other statements. If the data types of corresponding SELECT columns do not match, the types and lengths of the columns in the UNION result take into account the values retrieved by all the SELECT statements. For example, consider the following, where the column length is not constrained to the length of the value from the first SELECT: mysql> SELECT REPEAT('a',1) UNION SELECT REPEAT('b',20); +----------------------+ | REPEAT('a',1) | +----------------------+ | a | | bbbbbbbbbbbbbbbbbbbb | +----------------------+UNION DISTINCT and UNION ALLBy default, duplicate rows are removed from UNION results. The optional DISTINCT keyword has the same effect but makes it explicit. With the optional ALL keyword, duplicate-row removal does not occur and the result includes all matching rows from all the SELECT statements. You can mix UNION ALL and UNION DISTINCT in the same query. Mixed UNION types are treated such that a DISTINCT union overrides any ALL union to its left. A DISTINCT union can be produced explicitly by using UNION DISTINCT or implicitly by using UNION with no following DISTINCT or ALL keyword. ORDER BY and LIMIT in UnionsTo apply an ORDER BY or LIMIT clause to an individual SELECT, parenthesize the SELECT and place the clause inside the parentheses: Use of ORDER BY for individual SELECT statements implies nothing about the order in which the rows appear in the final result because UNION by default produces an unordered set of rows. Therefore, ORDER BY in this context typically is used in conjunction with LIMIT, to determine the subset of the selected rows to retrieve for the SELECT, even though it does not necessarily affect the order of those rows in the final UNION result. If ORDER BY appears without LIMIT in a SELECT, it is optimized away because it has no effect. To use an ORDER BY or LIMIT clause to sort or limit the entire UNION result, parenthesize the individual SELECT statements and place the ORDER BY or LIMIT after the last one: (SELECT a FROM t1 WHERE a=10 AND B=1) UNION (SELECT a FROM t2 WHERE a=11 AND B=2) ORDER BY a LIMIT 10;A statement without parentheses is equivalent to one parenthesized as just shown. This kind of ORDER BY cannot use column references that include a table name (that is, names in tbl_name.col_name format). Instead, provide a column alias in the first SELECT statement and refer to the alias in the ORDER BY. (Alternatively, refer to the column in the ORDER BY using its column position. However, use of column positions is deprecated.) Also, if a column to be sorted is aliased, the ORDER BY clause must refer to the alias, not the column name. The first of the following statements is permitted, but the second fails with an Unknown column 'a' in 'order clause' error: (SELECT a AS b FROM t) UNION (SELECT ...) ORDER BY b; (SELECT a AS b FROM t) UNION (SELECT ...) ORDER BY a;To cause rows in a UNION result to consist of the sets of rows retrieved by each SELECT one after the other, select an additional column in each SELECT to use as a sort column and add an ORDER BY that sorts on that column following the last SELECT: (SELECT 1 AS sort_col, col1a, col1b, ... FROM t1) UNION (SELECT 2, col2a, col2b, ... FROM t2) ORDER BY sort_col;To additionally maintain sort order within individual SELECT results, add a secondary column to the ORDER BY clause: (SELECT 1 AS sort_col, col1a, col1b, ... FROM t1) UNION (SELECT 2, col2a, col2b, ... FROM t2) ORDER BY sort_col, col1a;Use of an additional column also enables you to determine which SELECT each row comes from. Extra columns can provide other identifying information as well, such as a string that indicates a table name. UNION RestrictionsIn a UNION, the SELECT statements are normal select statements, but with the following restrictions:
Page 3
A subquery is a SELECT statement within another statement. All subquery forms and operations that the SQL standard requires are supported, as well as a few features that are MySQL-specific. Here is an example of a subquery: SELECT * FROM t1 WHERE column1 = (SELECT column1 FROM t2);In this example, SELECT * FROM t1 ... is the outer query (or outer statement), and (SELECT column1 FROM t2) is the subquery. We say that the subquery is nested within the outer query, and in fact it is possible to nest subqueries within other subqueries, to a considerable depth. A subquery must always appear within parentheses. The main advantages of subqueries are:
Here is an example statement that shows the major points about subquery syntax as specified by the SQL standard and supported in MySQL: DELETE FROM t1 WHERE s11 > ANY (SELECT COUNT(*) /* no hint */ FROM t2 WHERE NOT EXISTS (SELECT * FROM t3 WHERE ROW(5*t2.s1,77)= (SELECT 50,11*s1 FROM t4 UNION SELECT 50,77 FROM (SELECT * FROM t5) AS t5)));A subquery can return a scalar (a single value), a single row, a single column, or a table (one or more rows of one or more columns). These are called scalar, column, row, and table subqueries. Subqueries that return a particular kind of result often can be used only in certain contexts, as described in the following sections. There are few restrictions on the type of statements in which subqueries can be used. A subquery can contain many of the keywords or clauses that an ordinary SELECT can contain: DISTINCT, GROUP BY, ORDER BY, LIMIT, joins, index hints, UNION constructs, comments, functions, and so on. A subquery's outer statement can be any one of: SELECT, INSERT, UPDATE, DELETE, SET, or DO. In MySQL, you cannot modify a table and select from the same table in a subquery. This applies to statements such as DELETE, INSERT, REPLACE, UPDATE, and (because subqueries can be used in the SET clause) LOAD DATA. For information about how the optimizer handles subqueries, see Section 8.2.2, “Optimizing Subqueries and Derived Tables”. For a discussion of restrictions on subquery use, including performance issues for certain forms of subquery syntax, see Section 13.2.10.12, “Restrictions on Subqueries”. Page 4
13.2.10.1 The Subquery as Scalar OperandIn its simplest form, a subquery is a scalar subquery that returns a single value. A scalar subquery is a simple operand, and you can use it almost anywhere a single column value or literal is legal, and you can expect it to have those characteristics that all operands have: a data type, a length, an indication that it can be NULL, and so on. For example: CREATE TABLE t1 (s1 INT, s2 CHAR(5) NOT NULL); INSERT INTO t1 VALUES(100, 'abcde'); SELECT (SELECT s2 FROM t1);The subquery in this SELECT returns a single value ('abcde') that has a data type of CHAR, a length of 5, a character set and collation equal to the defaults in effect at CREATE TABLE time, and an indication that the value in the column can be NULL. Nullability of the value selected by a scalar subquery is not copied because if the subquery result is empty, the result is NULL. For the subquery just shown, if t1 were empty, the result would be NULL even though s2 is NOT NULL. There are a few contexts in which a scalar subquery cannot be used. If a statement permits only a literal value, you cannot use a subquery. For example, LIMIT requires literal integer arguments, and LOAD DATA requires a literal string file name. You cannot use subqueries to supply these values. When you see examples in the following sections that contain the rather spartan construct (SELECT column1 FROM t1), imagine that your own code contains much more diverse and complex constructions. Suppose that we make two tables: CREATE TABLE t1 (s1 INT); INSERT INTO t1 VALUES (1); CREATE TABLE t2 (s1 INT); INSERT INTO t2 VALUES (2);Then perform a SELECT: SELECT (SELECT s1 FROM t2) FROM t1;The result is 2 because there is a row in t2 containing a column s1 that has a value of 2. A scalar subquery can be part of an expression, but remember the parentheses, even if the subquery is an operand that provides an argument for a function. For example: SELECT UPPER((SELECT s1 FROM t1)) FROM t2;Page 5
13.2.10.2 Comparisons Using SubqueriesThe most common use of a subquery is in the form: non_subquery_operand comparison_operator (subquery)Where comparison_operator is one of these operators: = > < >= <= <> != <=>For example: ... WHERE 'a' = (SELECT column1 FROM t1)MySQL also permits this construct: non_subquery_operand LIKE (subquery)At one time the only legal place for a subquery was on the right side of a comparison, and you might still find some old DBMSs that insist on this. Here is an example of a common-form subquery comparison that you cannot do with a join. It finds all the rows in table t1 for which the column1 value is equal to a maximum value in table t2: SELECT * FROM t1 WHERE column1 = (SELECT MAX(column2) FROM t2);Here is another example, which again is impossible with a join because it involves aggregating for one of the tables. It finds all rows in table t1 containing a value that occurs twice in a given column: SELECT * FROM t1 AS t WHERE 2 = (SELECT COUNT(*) FROM t1 WHERE t1.id = t.id);For a comparison of the subquery to a scalar, the subquery must return a scalar. For a comparison of the subquery to a row constructor, the subquery must be a row subquery that returns a row with the same number of values as the row constructor. See Section 13.2.10.5, “Row Subqueries”. Page 6
13.2.10.3 Subqueries with ANY, IN, or SOMESyntax: operand comparison_operator ANY (subquery) operand IN (subquery) operand comparison_operator SOME (subquery)Where comparison_operator is one of these operators: = > < >= <= <> !=The ANY keyword, which must follow a comparison operator, means “return TRUE if the comparison is TRUE for ANY of the values in the column that the subquery returns.” For example: SELECT s1 FROM t1 WHERE s1 > ANY (SELECT s1 FROM t2);Suppose that there is a row in table t1 containing (10). The expression is TRUE if table t2 contains (21,14,7) because there is a value 7 in t2 that is less than 10. The expression is FALSE if table t2 contains (20,10), or if table t2 is empty. The expression is unknown (that is, NULL) if table t2 contains (NULL,NULL,NULL). When used with a subquery, the word IN is an alias for = ANY. Thus, these two statements are the same: SELECT s1 FROM t1 WHERE s1 = ANY (SELECT s1 FROM t2); SELECT s1 FROM t1 WHERE s1 IN (SELECT s1 FROM t2);IN and = ANY are not synonyms when used with an expression list. IN can take an expression list, but = ANY cannot. See Section 12.4.2, “Comparison Functions and Operators”. NOT IN is not an alias for <> ANY, but for <> ALL. See Section 13.2.10.4, “Subqueries with ALL”. The word SOME is an alias for ANY. Thus, these two statements are the same: SELECT s1 FROM t1 WHERE s1 <> ANY (SELECT s1 FROM t2); SELECT s1 FROM t1 WHERE s1 <> SOME (SELECT s1 FROM t2);Use of the word SOME is rare, but this example shows why it might be useful. To most people, the English phrase “a is not equal to any b” means “there is no b which is equal to a,” but that is not what is meant by the SQL syntax. The syntax means “there is some b to which a is not equal.” Using <> SOME instead helps ensure that everyone understands the true meaning of the query. Page 7
13.2.10.4 Subqueries with ALLSyntax: operand comparison_operator ALL (subquery)The word ALL, which must follow a comparison operator, means “return TRUE if the comparison is TRUE for ALL of the values in the column that the subquery returns.” For example: SELECT s1 FROM t1 WHERE s1 > ALL (SELECT s1 FROM t2);Suppose that there is a row in table t1 containing (10). The expression is TRUE if table t2 contains (-5,0,+5) because 10 is greater than all three values in t2. The expression is FALSE if table t2 contains (12,6,NULL,-100) because there is a single value 12 in table t2 that is greater than 10. The expression is unknown (that is, NULL) if table t2 contains (0,NULL,1). Finally, the expression is TRUE if table t2 is empty. So, the following expression is TRUE when table t2 is empty: SELECT * FROM t1 WHERE 1 > ALL (SELECT s1 FROM t2);But this expression is NULL when table t2 is empty: SELECT * FROM t1 WHERE 1 > (SELECT s1 FROM t2);In addition, the following expression is NULL when table t2 is empty: SELECT * FROM t1 WHERE 1 > ALL (SELECT MAX(s1) FROM t2);In general, tables containing NULL values and empty tables are “edge cases.” When writing subqueries, always consider whether you have taken those two possibilities into account. NOT IN is an alias for <> ALL. Thus, these two statements are the same: SELECT s1 FROM t1 WHERE s1 <> ALL (SELECT s1 FROM t2); SELECT s1 FROM t1 WHERE s1 NOT IN (SELECT s1 FROM t2);Page 8
Scalar or column subqueries return a single value or a column of values. A row subquery is a subquery variant that returns a single row and can thus return more than one column value. Legal operators for row subquery comparisons are: = > < >= <= <> != <=>Here are two examples: SELECT * FROM t1 WHERE (col1,col2) = (SELECT col3, col4 FROM t2 WHERE id = 10); SELECT * FROM t1 WHERE ROW(col1,col2) = (SELECT col3, col4 FROM t2 WHERE id = 10);For both queries, if the table t2 contains a single row with id = 10, the subquery returns a single row. If this row has col3 and col4 values equal to the col1 and col2 values of any rows in t1, the WHERE expression is TRUE and each query returns those t1 rows. If the t2 row col3 and col4 values are not equal the col1 and col2 values of any t1 row, the expression is FALSE and the query returns an empty result set. The expression is unknown (that is, NULL) if the subquery produces no rows. An error occurs if the subquery produces multiple rows because a row subquery can return at most one row. For information about how each operator works for row comparisons, see Section 12.4.2, “Comparison Functions and Operators”. The expressions (1,2) and ROW(1,2) are sometimes called row constructors. The two are equivalent. The row constructor and the row returned by the subquery must contain the same number of values. A row constructor is used for comparisons with subqueries that return two or more columns. When a subquery returns a single column, this is regarded as a scalar value and not as a row, so a row constructor cannot be used with a subquery that does not return at least two columns. Thus, the following query fails with a syntax error: SELECT * FROM t1 WHERE ROW(1) = (SELECT column1 FROM t2)Row constructors are legal in other contexts. For example, the following two statements are semantically equivalent (and are handled in the same way by the optimizer): SELECT * FROM t1 WHERE (column1,column2) = (1,1); SELECT * FROM t1 WHERE column1 = 1 AND column2 = 1;The following query answers the request, “find all rows in table t1 that also exist in table t2”: SELECT column1,column2,column3 FROM t1 WHERE (column1,column2,column3) IN (SELECT column1,column2,column3 FROM t2);For more information about the optimizer and row constructors, see Section 8.2.1.18, “Row Constructor Expression Optimization” Page 9
13.2.10.6 Subqueries with EXISTS or NOT EXISTSIf a subquery returns any rows at all, EXISTS subquery is TRUE, and NOT EXISTS subquery is FALSE. For example: SELECT column1 FROM t1 WHERE EXISTS (SELECT * FROM t2);Traditionally, an EXISTS subquery starts with SELECT *, but it could begin with SELECT 5 or SELECT column1 or anything at all. MySQL ignores the SELECT list in such a subquery, so it makes no difference. For the preceding example, if t2 contains any rows, even rows with nothing but NULL values, the EXISTS condition is TRUE. This is actually an unlikely example because a [NOT] EXISTS subquery almost always contains correlations. Here are some more realistic examples:
The last example is a double-nested NOT EXISTS query. That is, it has a NOT EXISTS clause within a NOT EXISTS clause. Formally, it answers the question “does a city exist with a store that is not in Stores”? But it is easier to say that a nested NOT EXISTS answers the question “is x TRUE for all y?” Page 10
13.2.10.7 Correlated SubqueriesA correlated subquery is a subquery that contains a reference to a table that also appears in the outer query. For example: SELECT * FROM t1 WHERE column1 = ANY (SELECT column1 FROM t2 WHERE t2.column2 = t1.column2);Notice that the subquery contains a reference to a column of t1, even though the subquery's FROM clause does not mention a table t1. So, MySQL looks outside the subquery, and finds t1 in the outer query. Suppose that table t1 contains a row where column1 = 5 and column2 = 6; meanwhile, table t2 contains a row where column1 = 5 and column2 = 7. The simple expression ... WHERE column1 = ANY (SELECT column1 FROM t2) would be TRUE, but in this example, the WHERE clause within the subquery is FALSE (because (5,6) is not equal to (5,7)), so the expression as a whole is FALSE. Scoping rule: MySQL evaluates from inside to outside. For example: SELECT column1 FROM t1 AS x WHERE x.column1 = (SELECT column1 FROM t2 AS x WHERE x.column1 = (SELECT column1 FROM t3 WHERE x.column2 = t3.column1));In this statement, x.column2 must be a column in table t2 because SELECT column1 FROM t2 AS x ... renames t2. It is not a column in table t1 because SELECT column1 FROM t1 ... is an outer query that is farther out. For subqueries in HAVING or ORDER BY clauses, MySQL also looks for column names in the outer select list. For certain cases, a correlated subquery is optimized. For example: val IN (SELECT key_val FROM tbl_name WHERE correlated_condition)Otherwise, they are inefficient and likely to be slow. Rewriting the query as a join might improve performance. Aggregate functions in correlated subqueries may contain outer references, provided the function contains nothing but outer references, and provided the function is not contained in another function or expression. Page 11
A derived table is an expression that generates a table within the scope of a query FROM clause. For example, a subquery in a SELECT statement FROM clause is a derived table: SELECT ... FROM (subquery) [AS] tbl_name ...The [AS] tbl_name clause is mandatory because every table in a FROM clause must have a name. Any columns in the derived table must have unique names. For the sake of illustration, assume that you have this table: CREATE TABLE t1 (s1 INT, s2 CHAR(5), s3 FLOAT);Here is how to use a subquery in the FROM clause, using the example table: INSERT INTO t1 VALUES (1,'1',1.0); INSERT INTO t1 VALUES (2,'2',2.0); SELECT sb1,sb2,sb3 FROM (SELECT s1 AS sb1, s2 AS sb2, s3*2 AS sb3 FROM t1) AS sb WHERE sb1 > 1;Result: +------+------+------+ | sb1 | sb2 | sb3 | +------+------+------+ | 2 | 2 | 4 | +------+------+------+Here is another example: Suppose that you want to know the average of a set of sums for a grouped table. This does not work: SELECT AVG(SUM(column1)) FROM t1 GROUP BY column1;However, this query provides the desired information: SELECT AVG(sum_column1) FROM (SELECT SUM(column1) AS sum_column1 FROM t1 GROUP BY column1) AS t1;Notice that the column name used within the subquery (sum_column1) is recognized in the outer query. A derived table can return a scalar, column, row, or table. Derived tables are subject to these restrictions:
The optimizer determines information about derived tables in such a way that EXPLAIN does not need to materialize them. See Section 8.2.2.4, “Optimizing Derived Tables”. It is possible under certain circumstances that using EXPLAIN SELECT modifies table data. This can occur if the outer query accesses any tables and an inner query invokes a stored function that changes one or more rows of a table. Suppose that there are two tables t1 and t2 in database d1, and a stored function f1 that modifies t2, created as shown here: CREATE DATABASE d1; USE d1; CREATE TABLE t1 (c1 INT); CREATE TABLE t2 (c1 INT); CREATE FUNCTION f1(p1 INT) RETURNS INT BEGIN INSERT INTO t2 VALUES (p1); RETURN p1; END;Referencing the function directly in an EXPLAIN SELECT has no effect on t2, as shown here: mysql> SELECT * FROM t2; Empty set (0.02 sec) mysql> EXPLAIN SELECT f1(5)\G *************************** 1. row *************************** id: 1 select_type: SIMPLE table: NULL type: NULL possible_keys: NULL key: NULL key_len: NULL ref: NULL rows: NULL Extra: No tables used 1 row in set (0.01 sec) mysql> SELECT * FROM t2; Empty set (0.01 sec)This is because the SELECT statement did not reference any tables, as can be seen in the table and Extra columns of the output. This is also true of the following nested SELECT: However, if the outer SELECT references any tables, the optimizer executes the statement in the subquery as well, with the result that t2 is modified: mysql> EXPLAIN SELECT * FROM t1 AS a1, (SELECT f1(5)) AS a2\G *************************** 1. row *************************** id: 1 select_type: PRIMARY table: <derived2> type: system possible_keys: NULL key: NULL key_len: NULL ref: NULL rows: 1 Extra: NULL *************************** 2. row *************************** id: 1 select_type: PRIMARY table: a1 type: ALL possible_keys: NULL key: NULL key_len: NULL ref: NULL rows: 1 Extra: NULL *************************** 3. row *************************** id: 2 select_type: DERIVED table: NULL type: NULL possible_keys: NULL key: NULL key_len: NULL ref: NULL rows: NULL Extra: No tables used 3 rows in set (0.00 sec) mysql> SELECT * FROM t2; +------+ | c1 | +------+ | 5 | +------+ 1 row in set (0.00 sec)Page 12
13.2.10.9 Subquery ErrorsThere are some errors that apply only to subqueries. This section describes them.
For transactional storage engines, the failure of a subquery causes the entire statement to fail. For nontransactional storage engines, data modifications made before the error was encountered are preserved. Page 13
13.2.10.10 Optimizing SubqueriesDevelopment is ongoing, so no optimization tip is reliable for the long term. The following list provides some interesting tricks that you might want to play with. See also Section 8.2.2, “Optimizing Subqueries and Derived Tables”.
These tricks might cause programs to go faster or slower. Using MySQL facilities like the BENCHMARK() function, you can get an idea about what helps in your own situation. See Section 12.16, “Information Functions”. Some optimizations that MySQL itself makes are:
See also MySQL Internals: How MySQL Transforms Subqueries. Page 14
13.2.10.11 Rewriting Subqueries as JoinsSometimes there are other ways to test membership in a set of values than by using a subquery. Also, on some occasions, it is not only possible to rewrite a query without a subquery, but it can be more efficient to make use of some of these techniques rather than to use subqueries. One of these is the IN() construct: For example, this query: SELECT * FROM t1 WHERE id IN (SELECT id FROM t2);Can be rewritten as: SELECT DISTINCT t1.* FROM t1, t2 WHERE t1.id=t2.id;The queries: SELECT * FROM t1 WHERE id NOT IN (SELECT id FROM t2); SELECT * FROM t1 WHERE NOT EXISTS (SELECT id FROM t2 WHERE t1.id=t2.id);Can be rewritten as: SELECT table1.* FROM table1 LEFT JOIN table2 ON table1.id=table2.id WHERE table2.id IS NULL;A LEFT [OUTER] JOIN can be faster than an equivalent subquery because the server might be able to optimize it better—a fact that is not specific to MySQL Server alone. Prior to SQL-92, outer joins did not exist, so subqueries were the only way to do certain things. Today, MySQL Server and many other modern database systems offer a wide range of outer join types. MySQL Server supports multiple-table DELETE statements that can be used to efficiently delete rows based on information from one table or even from many tables at the same time. Multiple-table UPDATE statements are also supported. See Section 13.2.2, “DELETE Statement”, and Section 13.2.11, “UPDATE Statement”. Page 15
13.2.10.12 Restrictions on Subqueries
Page 16
UPDATE is a DML statement that modifies rows in a table. Single-table syntax: UPDATE [LOW_PRIORITY] [IGNORE] table_reference SET assignment_list [WHERE where_condition] [ORDER BY ...] [LIMIT row_count] value: {expr | DEFAULT} assignment: col_name = value assignment_list: assignment [, assignment] ...Multiple-table syntax: UPDATE [LOW_PRIORITY] [IGNORE] table_references SET assignment_list [WHERE where_condition]For the single-table syntax, the UPDATE statement updates columns of existing rows in the named table with new values. The SET clause indicates which columns to modify and the values they should be given. Each value can be given as an expression, or the keyword DEFAULT to set a column explicitly to its default value. The WHERE clause, if given, specifies the conditions that identify which rows to update. With no WHERE clause, all rows are updated. If the ORDER BY clause is specified, the rows are updated in the order that is specified. The LIMIT clause places a limit on the number of rows that can be updated. For the multiple-table syntax, UPDATE updates rows in each table named in table_references that satisfy the conditions. Each matching row is updated once, even if it matches the conditions multiple times. For multiple-table syntax, ORDER BY and LIMIT cannot be used. For partitioned tables, both the single-single and multiple-table forms of this statement support the use of a PARTITION clause as part of a table reference. This option takes a list of one or more partitions or subpartitions (or both). Only the partitions (or subpartitions) listed are checked for matches, and a row that is not in any of these partitions or subpartitions is not updated, whether it satisfies the where_condition or not.
Note Unlike the case when using PARTITION with an INSERT or REPLACE statement, an otherwise valid UPDATE ... PARTITION statement is considered successful even if no rows in the listed partitions (or subpartitions) match the where_condition. For more information and examples, see Section 19.5, “Partition Selection”. where_condition is an expression that evaluates to true for each row to be updated. For expression syntax, see Section 9.5, “Expressions”. table_references and where_condition are specified as described in Section 13.2.9, “SELECT Statement”. You need the UPDATE privilege only for columns referenced in an UPDATE that are actually updated. You need only the SELECT privilege for any columns that are read but not modified. The UPDATE statement supports the following modifiers:
UPDATE IGNORE statements, including those having an ORDER BY clause, are flagged as unsafe for statement-based replication. (This is because the order in which the rows are updated determines which rows are ignored.) Such statements produce a warning in the error log when using statement-based mode and are written to the binary log using the row-based format when using MIXED mode. (Bug #11758262, Bug #50439) See Section 17.1.2.3, “Determination of Safe and Unsafe Statements in Binary Logging”, for more information. If you access a column from the table to be updated in an expression, UPDATE uses the current value of the column. For example, the following statement sets col1 to one more than its current value: UPDATE t1 SET col1 = col1 + 1;The second assignment in the following statement sets col2 to the current (updated) col1 value, not the original col1 value. The result is that col1 and col2 have the same value. This behavior differs from standard SQL. UPDATE t1 SET col1 = col1 + 1, col2 = col1;Single-table UPDATE assignments are generally evaluated from left to right. For multiple-table updates, there is no guarantee that assignments are carried out in any particular order. If you set a column to the value it currently has, MySQL notices this and does not update it. If you update a column that has been declared NOT NULL by setting to NULL, an error occurs if strict SQL mode is enabled; otherwise, the column is set to the implicit default value for the column data type and the warning count is incremented. The implicit default value is 0 for numeric types, the empty string ('') for string types, and the “zero” value for date and time types. See Section 11.5, “Data Type Default Values”. UPDATE returns the number of rows that were actually changed. The mysql_info() C API function returns the number of rows that were matched and updated and the number of warnings that occurred during the UPDATE. You can use LIMIT row_count to restrict the scope of the UPDATE. A LIMIT clause is a rows-matched restriction. The statement stops as soon as it has found row_count rows that satisfy the WHERE clause, whether or not they actually were changed. If an UPDATE statement includes an ORDER BY clause, the rows are updated in the order specified by the clause. This can be useful in certain situations that might otherwise result in an error. Suppose that a table t contains a column id that has a unique index. The following statement could fail with a duplicate-key error, depending on the order in which rows are updated: UPDATE t SET id = id + 1;For example, if the table contains 1 and 2 in the id column and 1 is updated to 2 before 2 is updated to 3, an error occurs. To avoid this problem, add an ORDER BY clause to cause the rows with larger id values to be updated before those with smaller values: UPDATE t SET id = id + 1 ORDER BY id DESC;You can also perform UPDATE operations covering multiple tables. However, you cannot use ORDER BY or LIMIT with a multiple-table UPDATE. The table_references clause lists the tables involved in the join. Its syntax is described in Section 13.2.9.2, “JOIN Clause”. Here is an example: UPDATE items,month SET items.price=month.price WHERE items.id=month.id;The preceding example shows an inner join that uses the comma operator, but multiple-table UPDATE statements can use any type of join permitted in SELECT statements, such as LEFT JOIN. If you use a multiple-table UPDATE statement involving InnoDB tables for which there are foreign key constraints, the MySQL optimizer might process tables in an order that differs from that of their parent/child relationship. In this case, the statement fails and rolls back. Instead, update a single table and rely on the ON UPDATE capabilities that InnoDB provides to cause the other tables to be modified accordingly. See Section 13.1.17.5, “FOREIGN KEY Constraints”. You cannot update a table and select directly from the same table in a subquery. You can work around this by using a multi-table update in which one of the tables is derived from the table that you actually wish to update, and referring to the derived table using an alias. Suppose you wish to update a table named items which is defined using the statement shown here: CREATE TABLE items ( id BIGINT NOT NULL AUTO_INCREMENT PRIMARY KEY, wholesale DECIMAL(6,2) NOT NULL DEFAULT 0.00, retail DECIMAL(6,2) NOT NULL DEFAULT 0.00, quantity BIGINT NOT NULL DEFAULT 0 );To reduce the retail price of any items for which the markup is 30% or greater and of which you have fewer than one hundred in stock, you might try to use an UPDATE statement such as the one following, which uses a subquery in the WHERE clause. As shown here, this statement does not work: mysql> UPDATE items > SET retail = retail * 0.9 > WHERE id IN > (SELECT id FROM items > WHERE retail / wholesale >= 1.3 AND quantity > 100); ERROR 1093 (HY000): You can't specify target table 'items' for update in FROM clauseInstead, you can employ a multi-table update in which the subquery is moved into the list of tables to be updated, using an alias to reference it in the outermost WHERE clause, like this: UPDATE items, (SELECT id FROM items WHERE id IN (SELECT id FROM items WHERE retail / wholesale >= 1.3 AND quantity < 100)) AS discounted SET items.retail = items.retail * 0.9 WHERE items.id = discounted.id;Because the optimizer tries by default to merge the derived table discounted into the outermost query block, this works only if you force materialization of the derived table. You can do this by setting the derived_merge flag of the optimizer_switch system variable to off before running the update, or by using the NO_MERGE optimizer hint, as shown here: UPDATE /*+ NO_MERGE(discounted) */ items, (SELECT id FROM items WHERE retail / wholesale >= 1.3 AND quantity < 100) AS discounted SET items.retail = items.retail * 0.9 WHERE items.id = discounted.id;The advantage of using the optimizer hint in such a case is that it applies only within the query block where it is used, so that it is not necessary to change the value of optimizer_switch again after executing the UPDATE. Another possibility is to rewrite the subquery so that it does not use IN or EXISTS, like this: UPDATE items, (SELECT id, retail / wholesale AS markup, quantity FROM items) AS discounted SET items.retail = items.retail * 0.9 WHERE discounted.markup >= 1.3 AND discounted.quantity < 100 AND items.id = discounted.id;In this case, the subquery is materialized by default rather than merged, so it is not necessary to disable merging of the derived table. An UPDATE on a partitioned table using a storage engine such as MyISAM that employs table-level locks locks only those partitions containing rows that match the UPDATE statement WHERE clause, as long as none of the table partitioning columns are updated. (For storage engines such as InnoDB that employ row-level locking, no locking of partitions takes place.) For more information, see Section 19.6.4, “Partitioning and Locking”. Page 17
13.3 Transactional and Locking StatementsMySQL supports local transactions (within a given client session) through statements such as SET autocommit, START TRANSACTION, COMMIT, and ROLLBACK. See Section 13.3.1, “START TRANSACTION, COMMIT, and ROLLBACK Statements”. XA transaction support enables MySQL to participate in distributed transactions as well. See Section 13.3.7, “XA Transactions”. |