
Use The Index, Luke - Notes

  1. The ebook discusses indexes, more specifically the B-tree index, from a developer's perspective.

Anatomy of an Index

  1. An index is a distinct structure in the database that is built with the CREATE INDEX statement; it requires its own disk space, holds a copy of the indexed table data, and is pure redundancy -> it doesn't change the original table data; it creates a new data structure that refers to the table.
  2. Searching in a database index is like searching a telephone book -> the entries are ordered -> so it's fast.
  3. Unlike a printed phone book, the index must stay up to date with every write (insert/update) -> physically re-sorting the whole structure each time would be far too slow, so the index needs a smarter layout.

The Leaf Nodes - A doubly linked list

  1. The purpose of the index is to provide an ordered representation of the indexed data
  2. But it cannot be stored physically ordered, because every insert would have to move all following data to make space.
  3. The solution is to establish a logical order that is independent of the physical order in memory.
  4. The logical order is established via a doubly linked list. Each node holds links to its two neighbors -> inserting a node is a small change -> only the pointers are updated.
  5. The index nodes are called leaf nodes -> each one is stored in a database block or page (the database's smallest storage unit) -> normally a few KB.
  6. The database stores as many index entries as possible in each page.
  7. Order is maintained not only among the leaf nodes but also among the index entries within each node.

(figure: index leaf nodes connected as a doubly linked list)
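The pointer trick behind the doubly linked list can be sketched in a few lines. This is a minimal illustration, not the book's code; the class and function names are made up for the example.

```python
# Sketch: inserting a node between two neighbors in a doubly linked list
# only touches pointers -- no existing data is moved.

class LeafNode:
    def __init__(self, keys):
        self.keys = keys   # sorted index entries inside this "page"
        self.prev = None   # link to the previous leaf node
        self.next = None   # link to the next leaf node

def link(a, b):
    """Connect two leaf nodes as neighbors."""
    a.next, b.prev = b, a

def insert_between(a, b, node):
    """Splice a new node between a and b: four pointer updates, zero copies."""
    a.next, node.prev = node, a
    node.next, b.prev = b, node

# Usage: the logical order survives without physically moving any page.
left, right = LeafNode([11, 13]), LeafNode([57, 83])
link(left, right)
middle = LeafNode([46, 53])
insert_between(left, right, middle)

chain = []
cur = left
while cur:
    chain.extend(cur.keys)
    cur = cur.next
print(chain)  # [11, 13, 46, 53, 57, 83]
```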

The B-Tree - A balanced Tree

  1. To find the right leaf node, the database uses a balanced search tree -> the B-tree.

  2. (figure: B-tree structure with root, branch and leaf nodes)

  3. The branch nodes and the root:

    1. Searching starts at the root node.
    2. Branch nodes are the non-leaf nodes. Each branch entry holds a key of one of its child nodes -> in the figure above, the maximum entry of each leaf.
    3. A branch layer is built until all the leaf nodes are covered.
  4. The next branch layer is built the same way, until only one node remains -> the root.

  5. It's not a binary tree; it's a balanced tree with many entries per node.

  6. The database maintains the indexes automatically (the tree)

  7. The traversal is very efficient:

    1. It reaches every leaf in the same number of steps -> the tree is balanced, so the depth is equal everywhere.
    2. The tree depth grows very slowly compared to the number of leaf nodes -> logarithmic growth.

    (figure: B-tree traversal for key 57)
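The logarithmic growth of the tree depth can be checked with a little arithmetic. A sketch under the simplifying assumption that every branch node holds `fanout` entries; the function name is illustrative.

```python
# Sketch: a balanced tree with fanout f reaches f**d leaf entries at
# depth d, so the depth grows like log_f(N) -- very slowly.

def tree_depth(n_entries, fanout):
    """Smallest depth d such that fanout**d >= n_entries."""
    depth, reach = 1, fanout
    while reach < n_entries:
        depth += 1
        reach *= fanout
    return depth

print(tree_depth(1_000, 100))      # 2
print(tree_depth(1_000_000, 100))  # 3 -> a thousand times more entries, one level deeper
```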

Slow Indexes, Part I - Two ingredients that make a lookup slow

  1. An index lookup can be slow under some conditions.
  2. The first ingredient of a slow lookup is following the leaf node chain.
    1. In the traversal of Figure 1.3, the search does not stop at the tree -> it continues along the leaf node chain. There are two entries with key 57, so the scan must go on to the next page to see whether there are further matching entries.
  3. The second ingredient is accessing the table.
    1. The matching table rows are usually scattered across many table blocks -> many table accesses.
    2. An index lookup therefore requires three steps:
      1. the tree traversal (the only step with an upper bound on the number of accessed blocks -> the tree depth);
      2. following the leaf node chain;
      3. fetching the table data.
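The leaf-node-chain ingredient can be simulated without a database. A minimal sketch, with made-up page contents matching the key-57 example above:

```python
# Sketch of slow-index ingredient #1: after the tree traversal lands on a
# leaf page, the scan must keep following the leaf chain as long as
# entries match, because equal keys may continue on the next page.

pages = [[46, 53, 57], [57, 83, 91]]   # leaf pages, logically chained in order

def find_all(pages, key):
    """Count matches for key, scanning pages in chain order."""
    hits = 0
    for page in pages:          # stands in for following the next-pointers
        for entry in page:
            if entry == key:
                hits += 1
            elif entry > key:
                return hits     # the chain is ordered -> safe to stop here
    return hits

print(find_all(pages, 57))  # 2 -> the scan had to cross a page boundary
```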

The Where Clause

  1. The where clause defines the search condition of an SQL statement, and it thus falls into the core functional domain of an index: finding data quickly.
  2. This chapter explains how different operators affect index usage and how to make sure that an index is usable for as many queries as possible.

The Equals Operator - Exact key lookup

Primary Keys - Verifying index usage

  1. We analyze a use case, using this table:

    ```sql
    CREATE TABLE employees (
      employee_id   NUMBER         NOT NULL,
      first_name    VARCHAR2(1000) NOT NULL,
      last_name     VARCHAR2(1000) NOT NULL,
      date_of_birth DATE           NOT NULL,
      phone_number  VARCHAR2(1000) NOT NULL,
      CONSTRAINT employees_pk PRIMARY KEY (employee_id)
    )
    ```

    The database automatically creates an index for the primary key. That means there is an index on the EMPLOYEE_ID column, even though there is no create index statement.

  2. The following query uses the primary key to retrieve an employee's name:

    ```sql
    SELECT first_name, last_name
      FROM employees
     WHERE employee_id = 123
    ```

    The where clause cannot match multiple rows because the primary key constraint ensures uniqueness of the EMPLOYEE_ID values. The database doesn't need to follow the index leaf nodes -> it's enough to traverse the index tree.

    Use the execution plan for analysis:

    ```bash
    +----+-----------+-------+---------+---------+------+-------+
    | id | table     | type  | key     | key_len | rows | Extra |
    +----+-----------+-------+---------+---------+------+-------+
    |  1 | employees | const | PRIMARY | 5       | 1    |       |
    +----+-----------+-------+---------+---------+------+-------+
    ```

  3. After accessing the index, the database must do one more access to fetch the table row, but that is not a bottleneck -> the uniqueness constraint guarantees at most one such access.
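As a quick cross-check, the same pattern can be observed with SQLite's `EXPLAIN QUERY PLAN` (a stand-in here; the notes use Oracle/MySQL plans). A hedged sketch:

```python
import sqlite3

# Sketch (SQLite stand-in): a primary-key lookup shows up as a keyed
# SEARCH, never a full-table SCAN.
con = sqlite3.connect(":memory:")
con.execute("""
    CREATE TABLE employees (
        employee_id INTEGER PRIMARY KEY,
        first_name  TEXT NOT NULL,
        last_name   TEXT NOT NULL
    )
""")
con.execute("INSERT INTO employees VALUES (123, 'Markus', 'Winand')")

plan = con.execute(
    "EXPLAIN QUERY PLAN SELECT first_name, last_name "
    "FROM employees WHERE employee_id = 123"
).fetchall()
detail = plan[0][-1]
print(detail)  # e.g. 'SEARCH employees USING INTEGER PRIMARY KEY (rowid=?)'
```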

Concatenated Keys - Multi-column indexes

  1. A concatenated index is also called a multi-column, composite or combined index.

  2. The column order of a concatenated index has a great impact on its usability, so it must be chosen carefully.

  3. Say there is a company merger -> EMPLOYEE_ID values are no longer unique across companies, so we introduce a new column SUBSIDIARY_ID to re-establish uniqueness.

  4. The index for the new primary key is defined as follows:

    ```sql
    CREATE UNIQUE INDEX employees_pk
        ON employees (employee_id, subsidiary_id)
    ```

  5. A query for a particular employee must take the SUBSIDIARY_ID column into account:

    ```sql
    SELECT first_name, last_name
      FROM employees
     WHERE employee_id = 123
       AND subsidiary_id = 30
    ```

  6. But what if a query filters only on SUBSIDIARY_ID and omits EMPLOYEE_ID, the leading index column -> it becomes a full table scan.

  7. The SUBSIDIARY_ID entries are not sorted next to each other in the index, and the branch nodes do not order by them, so the search cannot leverage the tree:

    (figure: index ordered by EMPLOYEE_ID first -> SUBSIDIARY_ID values scattered across the whole index)

  8. We could add another index on SUBSIDIARY_ID to improve the speed, but searching on EMPLOYEE_ID alone doesn't make sense anymore.

  9. We can take advantage of the fact that the first index column is always usable for searching. The trick is to reverse the index column order so that SUBSIDIARY_ID is in the first position:

    ```sql
    CREATE UNIQUE INDEX EMPLOYEES_PK
        ON EMPLOYEES (SUBSIDIARY_ID, EMPLOYEE_ID)
    ```

  10. SUBSIDIARY_ID becomes the first column in the index -> the primary sort criterion -> it determines the B-tree order.

  11. The most important consideration when defining a concatenated index is how to choose the column order so it can be used as often as possible.

  12. The query now uses SUBSIDIARY_ID, but SUBSIDIARY_ID is not unique, so the database must follow the leaf node chain to find all matching entries:

    ```bash
    +----+-----------+------+---------+---------+------+-------+
    | id | table     | type | key     | key_len | rows | Extra |
    +----+-----------+------+---------+---------+------+-------+
    |  1 | employees | ref  | PRIMARY | 5       | 123  |       |
    +----+-----------+------+---------+---------+------+-------+
    ```

  13. Two indexes deliver better select performance, but one index saves storage and has less maintenance overhead -> the fewer indexes a table has, the better its insert, delete and update performance.

  14. To define an optimal index you must know not only how an index works but also how the data is queried (the conditions in the where clause) -> the developer knows best.
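The leading-column rule above can be verified hands-on. A hedged sketch using SQLite's `EXPLAIN QUERY PLAN` as a stand-in for the book's Oracle/MySQL plans; table and column names follow the example:

```python
import sqlite3

# Sketch (SQLite stand-in): with a concatenated index, only queries that
# filter on the leading column can use the B-tree; filtering on the
# second column alone falls back to a full scan.
con = sqlite3.connect(":memory:")
con.execute(
    "CREATE TABLE employees (employee_id INT, subsidiary_id INT, last_name TEXT)"
)
con.execute(
    "CREATE UNIQUE INDEX employees_pk ON employees (subsidiary_id, employee_id)"
)

def plan(where):
    rows = con.execute(
        "EXPLAIN QUERY PLAN SELECT * FROM employees WHERE " + where
    ).fetchall()
    return rows[0][-1]

print(plan("subsidiary_id = 30"))                      # SEARCH ... USING INDEX employees_pk
print(plan("subsidiary_id = 30 AND employee_id = 1"))  # SEARCH with both columns
print(plan("employee_id = 1"))                         # SCAN -> leading column missing
```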

Slow Indexes, Part II - The first ingredient, revisited

This chapter explains how the database picks an index and the side effects of changing existing indexes.

  1. An index can still be used when the query contains additional conditions on other columns -> several indexes may qualify for one query;

  2. the query optimizer then picks the one it estimates to be best.

  3. Example query:

    ```sql
    SELECT first_name, last_name, subsidiary_id, phone_number
      FROM employees
     WHERE last_name = 'WINAND'
       AND subsidiary_id = 30
    ```

    the execution plan is:

    ```bash
    ----------------------------------------------------------------
    | Id | Operation                   | Name         | Rows | Cost |
    ----------------------------------------------------------------
    |  0 | SELECT STATEMENT            |              |    1 |   30 |
    |* 1 |  TABLE ACCESS BY INDEX ROWID| EMPLOYEES    |    1 |   30 |
    |* 2 |   INDEX RANGE SCAN          | EMPLOYEES_PK |   40 |    2 |
    ----------------------------------------------------------------

    Predicate Information (identified by operation id):
      1 - filter("LAST_NAME"='WINAND')
      2 - access("SUBSIDIARY_ID"=30)
    ```

  4. When we deliberately avoid the index, the execution plan is:

    ```bash
    ----------------------------------------------------
    | Id | Operation         | Name      | Rows | Cost |
    ----------------------------------------------------
    |  0 | SELECT STATEMENT  |           |    1 |  477 |
    |* 1 |  TABLE ACCESS FULL| EMPLOYEES |    1 |  477 |
    ----------------------------------------------------

    Predicate Information (identified by operation id):
      1 - filter("LAST_NAME"='WINAND' AND "SUBSIDIARY_ID"=30)
    ```

    Only one row, and the full scan is faster than using the index -> unusual!

  5. So we debug step by step:

    1. In the index case, step Id 2 traverses the tree using SUBSIDIARY_ID, finds a match, and keeps scanning the leaf chain for all matching entries -> 40 rows.
    2. Then it fetches those 40 table rows individually, one by one -> that's the culprit.
    3. The full table scan reads the data in large chunks (one shot), while the index case does 40 separate reads.
  6. Choosing the best execution plan also depends on the table's data distribution, so the optimizer uses statistics about the content of the database.

  7. If there are no statistics (e.g. they were deleted), the database uses default values.

  8. With proper statistics, the optimizer estimates the index plan for the previous query like this:

    ```bash
    ----------------------------------------------------------------
    | Id | Operation                   | Name         | Rows | Cost |
    ----------------------------------------------------------------
    |  0 | SELECT STATEMENT            |              |    1 |  680 |
    |* 1 |  TABLE ACCESS BY INDEX ROWID| EMPLOYEES    |    1 |  680 |
    |* 2 |   INDEX RANGE SCAN          | EMPLOYEES_PK | 1000 |    4 |
    ----------------------------------------------------------------

    Predicate Information (identified by operation id):
      1 - filter("LAST_NAME"='WINAND')
      2 - access("SUBSIDIARY_ID"=30)
    ```

    The cost of 680 is higher than that of the full table scan (477), so the query optimizer chooses the latter.

  9. We add a new index on the last name column:

    ```sql
    CREATE INDEX emp_name
        ON employees (last_name)
    ```

  10. The execution plan becomes:

    ```bash
    --------------------------------------------------------------
    | Id | Operation                   | Name      | Rows | Cost |
    --------------------------------------------------------------
    |  0 | SELECT STATEMENT            |           |    1 |    3 |
    |* 1 |  TABLE ACCESS BY INDEX ROWID| EMPLOYEES |    1 |    3 |
    |* 2 |   INDEX RANGE SCAN          | EMP_NAME  |    1 |    1 |
    --------------------------------------------------------------

    Predicate Information (identified by operation id):
      1 - filter("SUBSIDIARY_ID"=30)
      2 - access("LAST_NAME"='WINAND')
    ```

    This is faster than the full table scan -> only one row is fetched from the index.

  11. Conclusion: the database doesn't always use the best index automatically.
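The "adding an index changes the chosen plan" effect can be reproduced in miniature. A hedged sketch with SQLite as a stand-in (its cost model differs from Oracle's, so it only shows that the plan is re-evaluated, not which index wins):

```python
import sqlite3

# Sketch (SQLite stand-in): the optimizer re-picks among usable indexes
# when a new one appears; names follow the notes' example.
con = sqlite3.connect(":memory:")
con.execute(
    "CREATE TABLE employees (subsidiary_id INT, last_name TEXT, phone TEXT)"
)
con.execute("CREATE INDEX employees_pk ON employees (subsidiary_id)")

q = ("EXPLAIN QUERY PLAN SELECT * FROM employees "
     "WHERE last_name = 'WINAND' AND subsidiary_id = 30")
before = con.execute(q).fetchall()[0][-1]

con.execute("CREATE INDEX emp_name ON employees (last_name)")
after = con.execute(q).fetchall()[0][-1]

print(before)  # SEARCH ... USING INDEX employees_pk (subsidiary_id=?)
print(after)   # still a SEARCH -- possibly via emp_name now
```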

Functions - Using functions in the where clause

You may need to search regardless of character case (upper/lower).

This chapter explains how to do that without harming performance. (Note: whether comparisons are case sensitive depends on the collation; MySQL's default collations compare strings case-insensitively.)

Case-insensitive Search - UPPER and LOWER

  1. You can use UPPER or LOWER to ignore the case:

    ```sql
    SELECT first_name, last_name, phone_number
      FROM employees
     WHERE UPPER(last_name) = UPPER('winand')
    ```

  2. The execution plan is not what we'd like:

    ```bash
    ----------------------------------------------------
    | Id | Operation         | Name      | Rows | Cost |
    ----------------------------------------------------
    |  0 | SELECT STATEMENT  |           |   10 |  477 |
    |* 1 |  TABLE ACCESS FULL| EMPLOYEES |   10 |  477 |
    ----------------------------------------------------

    Predicate Information (identified by operation id):
      1 - filter(UPPER("LAST_NAME")='WINAND')
    ```

    A full table scan! Although there is an index on LAST_NAME, the search is on UPPER(LAST_NAME), not LAST_NAME -> to the database they are entirely different things.

  3. To support such queries, we need an index built on the function -> an FBI (function-based index).

    1. Instead of copying the column data directly into the index, the index applies the function first and puts the result into the index -> the index stores the derived (all-capitalized) data.

    2. The execution plan now becomes:

      ```bash
      ----------------------------------------------------------------
      | Id | Operation                   | Name        | Rows | Cost |
      ----------------------------------------------------------------
      |  0 | SELECT STATEMENT            |             |  100 |   41 |
      |  1 |  TABLE ACCESS BY INDEX ROWID| EMPLOYEES   |  100 |   41 |
      |* 2 |   INDEX RANGE SCAN          | EMP_UP_NAME |   40 |    1 |
      ----------------------------------------------------------------

      Predicate Information (identified by operation id):
        2 - access(UPPER("LAST_NAME")='WINAND')
      ```

    3. The row estimate looks strange -> 100 rows to fetch?? -> this is a statistics issue -> after updating the statistics, the execution plan becomes:

      ```bash
      ----------------------------------------------------------------
      | Id | Operation                   | Name        | Rows | Cost |
      ----------------------------------------------------------------
      |  0 | SELECT STATEMENT            |             |    1 |    3 |
      |  1 |  TABLE ACCESS BY INDEX ROWID| EMPLOYEES   |    1 |    3 |
      |* 2 |   INDEX RANGE SCAN          | EMP_UP_NAME |    1 |    1 |
      ----------------------------------------------------------------

      Predicate Information (identified by operation id):
        2 - access(UPPER("LAST_NAME")='WINAND')
      ```

  4. MySQL and SQL Server don't support function-based indexes directly, but there is a workaround: add a redundant (generated) column to the table and create the index on that column. Since MySQL 5.7 you can index a generated column as follows:

    ```sql
    ALTER TABLE employees
      ADD COLUMN last_name_up VARCHAR(255) AS (UPPER(last_name));
    CREATE INDEX emp_up_name ON employees (last_name_up);
    ```

User-Defined Functions - Limitations of function-based indexes

CREATE FUNCTION get_age(date_of_birth DATE) 
RETURN NUMBER
AS
BEGIN
  RETURN 
    TRUNC(MONTHS_BETWEEN(SYSDATE, date_of_birth)/12);
END
SELECT first_name, last_name, get_age(date_of_birth)
  FROM employees
 WHERE get_age(date_of_birth) = 42
  1. For such query above we cannot use a FBI index like previous section because the get_age is not deterministic . -> the stored index will not be periodically updated.
  2. Date, random, or functions depend on environment variables.

Over-Indexing - Avoid redundancy

  1. Don't try to index everything -> it causes maintenance overhead.
  2. Always try to index the original data -> it is the most useful form.
  3. Unify the access path so that one index can be used by several queries.
  4. Some ORM tools use UPPER and LOWER without the developer's knowledge -> Hibernate injects an implicit LOWER for case-insensitive searches -> good to know.

Parameterized Queries / Bind Variables - For Security and performance

  1. Security: bind variables are the best way to prevent SQL injection.

  2. Performance: databases with an execution plan cache reuse a cached plan when the SQL text is exactly the same -> a parameterized statement stays textually identical across executions -> the database doesn't have to rebuild the execution plan.

  3. The exception: sometimes the affected data volume depends on the actual values:

    ```sql
    SELECT first_name, last_name
      FROM employees
     WHERE subsidiary_id = 20
    ```

    ```bash
    99 rows selected.

    ----------------------------------------------------------------
    | Id | Operation                   | Name         | Rows | Cost |
    ----------------------------------------------------------------
    |  0 | SELECT STATEMENT            |              |   99 |   70 |
    |  1 |  TABLE ACCESS BY INDEX ROWID| EMPLOYEES    |   99 |   70 |
    |* 2 |   INDEX RANGE SCAN          | EMPLOYEES_PK |   99 |    2 |
    ----------------------------------------------------------------

    Predicate Information (identified by operation id):
      2 - access("SUBSIDIARY_ID"=20)
    ```

    This plan is good.

    But for another value it is not:

    ```sql
    SELECT first_name, last_name
      FROM employees
     WHERE subsidiary_id = 30
    ```

    ```bash
    1000 rows selected.

    ----------------------------------------------------
    | Id | Operation         | Name      | Rows | Cost |
    ----------------------------------------------------
    |  0 | SELECT STATEMENT  |           | 1000 |  478 |
    |* 1 |  TABLE ACCESS FULL| EMPLOYEES | 1000 |  478 |
    ----------------------------------------------------

    Predicate Information (identified by operation id):
      1 - filter("SUBSIDIARY_ID"=30)
    ```

  4. The histogram on SUBSIDIARY_ID fulfills its purpose. The optimizer uses it to determine the frequency of the subsidiary ID mentioned in the SQL query. Consequently it gets a different row count estimate for each of the two queries.

  5. The cost of TABLE ACCESS BY INDEX ROWID is very sensitive to the row count estimate -> the more rows the optimizer expects, the higher the cost of using the index -> at some point the index plan costs more than a full table scan -> the optimizer selects a different plan.

  6. With a bind parameter, all executions look like the same query -> the optimizer assumes an average (equal) distribution -> the database always selects the same plan.

  7. Without bind parameters, the optimizer inspects the actual value each time and can always opt for the best execution plan.

  8. This is a dilemma. As developers, the rule of thumb is: always use bind parameters, except for values that shall influence the execution plan.

  9. A Java example:

    ```java
    // Without bind parameters (vulnerable to SQL injection, and every
    // distinct value produces a new statement text):
    int subsidiaryId = 30;
    Statement command = connection.createStatement();
    ResultSet result = command.executeQuery(
        "select first_name, last_name"
      + "  from employees"
      + " where subsidiary_id = " + subsidiaryId);

    // Using a bind parameter:
    PreparedStatement prepared = connection.prepareStatement(
        "select first_name, last_name"
      + "  from employees"
      + " where subsidiary_id = ?");
    prepared.setInt(1, subsidiaryId);
    ResultSet result2 = prepared.executeQuery();
    // See also: PreparedStatement class documentation.
    ```

  10. The question mark (?) is the only placeholder character in the SQL standard -> positional parameters, bound by position number. (Many databases also offer named parameters such as @name or :name.)

  11. Note: bind parameters cannot change the structure of an SQL statement -> you cannot use bind parameters for table or column names.

Searching for Ranges - Beyond equality

Range queries can leverage an index, but you need to take care with the order of columns in multi-column indexes.

Greater, Less and BETWEEN - The column order revisited

  1. The biggest performance risk of an index range scan is the leaf node traversal.

  2. The golden rule of indexing: keep the scanned index range as small as possible.

    ```sql
    SELECT first_name, last_name, date_of_birth
      FROM employees
     WHERE date_of_birth >= TO_DATE(?, 'YYYY-MM-DD')
       AND date_of_birth <= TO_DATE(?, 'YYYY-MM-DD')
       AND subsidiary_id  = ?
    ```

  3. Ideally we have one index that covers both columns -> but in which order?

  4. (figures: index on (DATE_OF_BIRTH, SUBSIDIARY_ID) vs. index on (SUBSIDIARY_ID, DATE_OF_BIRTH))

  5. The difference: the equals operator limits the first index column to a single value. Within the range for that value (SUBSIDIARY_ID 27) the index is sorted by the second column (the date of birth), so there is no need to visit the first leaf node -> the branch node already indicates that no employee of subsidiary 27 born after June 25th 1969 can be in it. -> With the reversed column order, the scan is limited only by the date range, and SUBSIDIARY_ID becomes useless (it doesn't narrow the scanned range).

  6. 
    ```bash
    --------------------------------------------------------------
    | Id | Operation                    | Name      | Rows | Cost |
    --------------------------------------------------------------
    |  0 | SELECT STATEMENT             |           |    1 |    4 |
    |* 1 |  FILTER                      |           |      |      |
    |  2 |   TABLE ACCESS BY INDEX ROWID| EMPLOYEES |    1 |    4 |
    |* 3 |    INDEX RANGE SCAN          | EMP_TEST  |    2 |    2 |
    --------------------------------------------------------------

    Predicate Information (identified by operation id):
      1 - filter(:END_DT >= :START_DT)
      3 - access(DATE_OF_BIRTH >= :START_DT AND DATE_OF_BIRTH <= :END_DT)
          filter(SUBSIDIARY_ID = :SUBS_ID)
    ```

  7. We can see that with EMP_TEST, DATE_OF_BIRTH is used for the INDEX RANGE SCAN (access predicate) while SUBSIDIARY_ID is used only as a filter.

  8. If we reverse the order of the index columns, both conditions become access predicates:

    ```bash
    --------------------------------------------------------------
    | Id | Operation                    | Name      | Rows | Cost |
    --------------------------------------------------------------
    |  0 | SELECT STATEMENT             |           |    1 |    3 |
    |* 1 |  FILTER                      |           |      |      |
    |  2 |   TABLE ACCESS BY INDEX ROWID| EMPLOYEES |    1 |    3 |
    |* 3 |    INDEX RANGE SCAN          | EMP_TEST2 |    1 |    2 |
    --------------------------------------------------------------

    Predicate Information (identified by operation id):
      1 - filter(:END_DT >= :START_DT)
      3 - access(SUBSIDIARY_ID = :SUBS_ID
             AND DATE_OF_BIRTH >= :START_DT
             AND DATE_OF_BIRTH <= :END_DT)
    ```

  9. The BETWEEN operator is shorthand for specifying a range with >= and <=.

Indexing LIKE Filters - Wildcard searches

  1. 
    ```sql
    SELECT first_name, last_name, date_of_birth
      FROM employees
     WHERE UPPER(last_name) LIKE 'WIN%D'
    ```

    ```bash
    +----+-----------+-------+----------+---------+------+-------------+
    | id | table     | type  | key      | key_len | rows | Extra       |
    +----+-----------+-------+----------+---------+------+-------------+
    |  1 | employees | range | emp_name | 767     | 2    | Using where |
    +----+-----------+-------+----------+---------+------+-------------+
    ```

  2. LIKE filters can only use the characters before the first wild card during tree traversal.

  3. The more selective the prefix before the first wildcard is, the smaller the scanned index range becomes.

    (figure: the prefix before the first wildcard determines the scanned index range)
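The "prefix acts as a range" idea can be shown directly. A hedged sketch with SQLite: since SQLite's LIKE optimization depends on collation settings, the prefix is rewritten as the equivalent explicit range, which is exactly what the tree traversal does with a LIKE prefix:

```python
import sqlite3

# Sketch (SQLite stand-in): a LIKE prefix such as 'WIN%' is equivalent to
# a range condition on the prefix -- only the part before the first
# wildcard can narrow the scanned index range.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE employees (last_name TEXT)")
con.execute("CREATE INDEX emp_name ON employees (last_name)")

q = ("EXPLAIN QUERY PLAN SELECT last_name FROM employees "
     "WHERE last_name >= 'WIN' AND last_name < 'WIO'")
detail = con.execute(q).fetchall()[0][-1]
print(detail)  # SEARCH ... USING COVERING INDEX emp_name (last_name>? AND last_name<?)
```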

Index Combine - Why not one index per column

  1. In most cases, one concatenated index over multiple columns is better than a separate index on each column.

Partial Indexes - Indexing Selected Rows

  1. PostgreSQL supports partial indexes -> index only a subset of the rows.

  2. E.g. a query that fetches all unprocessed messages of a recipient:

    ```sql
    SELECT message
      FROM messages
     WHERE processed = 'N'
       AND receiver  = ?
    ```

  3. We could create an index like:

    ```sql
    CREATE INDEX messages_todo
              ON messages (receiver, processed)
    ```

    But many unwanted rows are indexed -> wasted disk space -> so we can create the index like this instead:

    ```sql
    CREATE INDEX messages_todo
              ON messages (receiver)
           WHERE processed = 'N'
    ```

    Limitation: you can only use deterministic functions in the WHERE clause, as is the case everywhere in an index definition.

NULL in the Oracle Database - An Important curiosity

Skipped

Obfuscated Conditions - Common anti-patterns

Dates - Pay special attention to DATE types

  1. Truncating the date/time with a function, or comparing against a date string (via a function) -> the index might not be used.
  2. Always use an explicit range condition when comparing dates.

Numeric Strings - Don't mix types

  1. A number compared against a text column does not automatically use the index -> the database converts the column values rather than the literal -> turn the literal into a string instead.

    1. The database doesn't convert the string side because that could lead to ambiguous results:

      42
      042
      0042
      ...

  2. Using numeric strings generally causes performance problems due to the conversion, and can raise errors when invalid numbers are encountered.

  3. This works (uses the index):

    ```sql
    SELECT ...
      FROM ...
     WHERE numeric_number = '42'
    ```

Combining Columns - use redundant where clauses

  1. 
    ```sql
    SELECT ...
      FROM ...
     WHERE ADDTIME(date_column, time_column)
         > DATE_ADD(now(), INTERVAL -1 DAY)
    ```

    This might not use an index, because the search is based on derived data.

  2. You can avoid the function by combining the two columns into one:

    ```sql
    SELECT ...
      FROM ...
     WHERE datetime_column > DATE_ADD(now(), INTERVAL -1 DAY)
    ```

    But this changes the table (the source data).

  3. Or you could use a function-based index (but MySQL doesn't have them).

  4. It is still possible to write the query so that the database can use a concatenated index on DATE_COLUMN, TIME_COLUMN with an access predicate—at least partially. For that, we add an extra condition on the DATE_COLUMN.

    ```sql
    WHERE ADDTIME(date_column, time_column)
        > DATE_ADD(now(), INTERVAL -1 DAY)
      AND date_column
       >= DATE(DATE_ADD(now(), INTERVAL -1 DAY))
    ```

    The new condition is absolutely redundant, but it is a straight filter on DATE_COLUMN that can be used as an access predicate. Even though this technique is not perfect, it is usually a good enough approximation.
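The redundant-condition trick can be reproduced with SQLite's date functions (a stand-in; MySQL's ADDTIME/DATE_ADD are replaced by SQLite's `datetime()`/`date()`, and the bound value stands in for the computed date):

```python
import sqlite3

# Sketch (SQLite stand-in): the derived-value condition alone forces a
# scan; a redundant straight condition on date_column restores an
# access predicate on the (date_column, time_column) index.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE t (date_column TEXT, time_column TEXT)")
con.execute("CREATE INDEX t_dt ON t (date_column, time_column)")

slow = ("EXPLAIN QUERY PLAN SELECT * FROM t "
        "WHERE datetime(date_column || ' ' || time_column)"
        "    > datetime('now', '-1 day')")
slow_detail = con.execute(slow).fetchall()[0][-1]
print(slow_detail)  # SCAN ...

fast = slow + " AND date_column >= ?"
fast_detail = con.execute(fast).fetchall()[0][-1]
print(fast_detail)  # SEARCH ... (date_column>?)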

Smart Logic - The smartest way to make SQL slow

Skipped (MySQL doesn't have an execution plan cache).

Math - Databases don't solve equations

  1. The database will not use an index for a query like this:

    ```sql
    SELECT a, b
      FROM table_name
     WHERE 3*a + 5 = b
    ```

  2. But we can leverage a function-based index if we need it.
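Both halves of this point can be demonstrated with SQLite's expression indexes (a stand-in for a function-based index); the constant 17 and the index names are illustrative:

```python
import sqlite3

# Sketch (SQLite stand-in): the database will not rearrange "3*a + 5 = b"
# to use the index on a -- but an index on the expression itself makes
# the untouched equation searchable when the right side is a constant.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE t (a INT, b INT)")
con.execute("CREATE INDEX t_a ON t (a)")

q = "EXPLAIN QUERY PLAN SELECT * FROM t WHERE 3*a + 5 = 17"
before_detail = con.execute(q).fetchall()[0][-1]
print(before_detail)  # SCAN ... -> the index on a is useless here

con.execute("CREATE INDEX t_expr ON t (3*a + 5)")
after_detail = con.execute(q).fetchall()[0][-1]
print(after_detail)   # SEARCH ... USING INDEX t_expr (<expr>=?)
```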

Appendix

How to use execution plan in MySQL?

  1. Just put EXPLAIN before the query SQL.
  2. The most important information is in the TYPE column -> it specifies how the data is accessed.

Operations

  1. Index and table access types:
    1. eq_ref, const: performs a B-tree traversal to find one row, and fetches additional columns from the table if needed -> the database uses this when a primary key or unique constraint ensures that the search criteria will match no more than one entry.
    2. ref, range: performs a B-tree traversal, walks through the leaf nodes to find all matching index entries, and fetches additional columns from the primary table store if needed.
    3. index: reads the entire index -> all rows -> in index order.
    4. ALL: reads the entire table -> all rows and columns -> as stored on disk.
    5. "Using index" (in the Extra column): the table is not accessed because the index has all the required data. (However, if a clustered index is used (e.g. the PRIMARY index with InnoDB), "Using index" doesn't appear in the Extra column, although it is technically an index-only scan.)
  2. Sorting and grouping:
    1. "Using filesort" (in the Extra column): indicates an explicit sort operation -> no matter where the sort takes place (memory or disk) -> it needs large amounts of memory to materialize the intermediate result.
  3. Top-N queries:
    1. Implicit -> no "Using filesort" in the Extra column: a MySQL execution plan does not show a top-N query explicitly. If you use the LIMIT syntax and don't see "Using filesort" in the Extra column, the query is executed in a pipelined manner.

Distinguishing Access and Filter-predicates

Three different ways to evaluate where clauses :

  1. Access predicates ("key_len", "ref" columns): the access predicates express the start and stop conditions of the leaf node traversal.

  2. Index filter predicate("Using index condition", since MySQL 5.6): Index filter predicates are applied during the leaf node traversal only. They do not contribute to the start and stop conditions and do not narrow the scanned range.

  3. Table level filter predicate (“Using where” in the “Extra” column): Predicates on columns which are not part of the index are evaluated on the table level. For that to happen, the database must load the row from the table first.

  4. MySQL execution plans do not show which predicate types are used for each condition -> they just list the predicate types in use. In the following example, the entire where clause is used as access predicate:

    ```sql
    CREATE TABLE demo (id1 NUMERIC, id2 NUMERIC, id3 NUMERIC, val NUMERIC);
    INSERT INTO demo VALUES (1,1,1,1);
    INSERT INTO demo VALUES (2,2,2,2);
    CREATE INDEX demo_idx ON demo (id1, id2, id3);
    EXPLAIN SELECT * FROM demo WHERE id1=1 AND id2=1;
    ```

    ```bash
    +------+----------+---------+-------------+------+-------+
    | type | key      | key_len | ref         | rows | Extra |
    +------+----------+---------+-------------+------+-------+
    | ref  | demo_idx | 12      | const,const | 1    |       |
    +------+----------+---------+-------------+------+-------+
    ```

    There is no "Using where" or "Using index condition" shown in the "Extra" column. The index is, however, used (type=ref, key=demo_idx), so you can assume that the entire where clause qualifies as access predicate.

  5. Please also note that the ref column indicates that two columns are used from the index (both are query constants in this example). Another way to confirm which part of the index is used is the key_len value: It shows that the query uses the first 12 bytes of the index definition. To map this to column names, you “just” need to know how much storage space each column needs (see “Data Type Storage Requirements” in the MySQL documentation). In absence of a NOT NULL constraint, MySQL needs an extra byte for each column. After all, each NUMERIC column needs 6 bytes in the example. Therefore, the key length of 12 confirms that the first two index columns are used as access predicates.

    When filtering with the ID3 column (instead of ID2), MySQL 5.6 and later use an index filter predicate ("Using index condition"):

    ```sql
    EXPLAIN SELECT * FROM demo WHERE id1=1 AND id3=1
    ```

    ```bash
    +------+----------+---------+-------+------+-----------------------+
    | type | key      | key_len | ref   | rows | Extra                 |
    +------+----------+---------+-------+------+-----------------------+
    | ref  | demo_idx | 6       | const | 1    | Using index condition |
    +------+----------+---------+-------+------+-----------------------+
    ```

    In this case, key_len=6 and only one "const" in the ref column mean that only one index column is used as access predicate.

    Previous versions of MySQL used a table level filter predicate for this query, identified by "Using where" in the "Extra" column:

    ```bash
    +------+----------+---------+-------+------+-------------+
    | type | key      | key_len | ref   | rows | Extra       |
    +------+----------+---------+-------+------+-------------+
    | ref  | demo_idx | 6       | const | 1    | Using where |
    +------+----------+---------+-------+------+-------------+
    ```
