Problem: We are experiencing a problem with large tables (+40M rows) having frequently updated their indexed columns with low selectivity. It seems that when the column is often updated/deleted after some time optimizer starts to choose the wrong execution plan. In my opinion the issue could be in outdated statistics (or not) provided to the optimizer (e.g. rec_per_key). Here is an example:
MariaDB [test]> select version(); +----------------------------------+ | version() | +----------------------------------+ | 10.0.33-MariaDB-0ubuntu0.16.04.1 | +----------------------------------+ MariaDB [test]> show create table t1 \G; *************************** 1. row *************************** Table: t1 Create Table: CREATE TABLE `t1` ( `id` int(11) NOT NULL AUTO_INCREMENT, `status` enum('a','b','c','d','e') DEFAULT 'a', PRIMARY KEY (`id`), KEY `status` (`status`) ) ENGINE=InnoDB AUTO_INCREMENT=12001 DEFAULT CHARSET=utf8mb4 MariaDB [test]> show index from t1 \G; *************************** 1. row *************************** Table: t1 Non_unique: 0 Key_name: PRIMARY Seq_in_index: 1 Column_name: id Collation: A Cardinality: 11980 Sub_part: NULL Packed: NULL Null: Index_type: BTREE Comment: Index_comment: *************************** 2. row *************************** Table: t1 Non_unique: 1 Key_name: status Seq_in_index: 1 Column_name: status Collation: A Cardinality: 15 Sub_part: NULL Packed: NULL Null: YES Index_type: BTREE Comment: Index_comment: MariaDB [test]> select index_name,last_update,stat_name,stat_value,sample_size,stat_description from mysql.innodb_index_stats where database_name='test' AND table_name='t1'; +------------+---------------------+--------------+------------+-------------+-----------------------------------+ | index_name | last_update | stat_name | stat_value | sample_size | stat_description | +------------+---------------------+--------------+------------+-------------+-----------------------------------+ | PRIMARY | 2018-01-30 18:31:07 | n_diff_pfx01 | 11555 | 20 | id | | PRIMARY | 2018-01-30 18:31:07 | n_leaf_pages | 20 | NULL | Number of leaf pages in the index | | PRIMARY | 2018-01-30 18:31:07 | size | 21 | NULL | Number of pages in the index | | status | 2018-01-30 18:31:07 | n_diff_pfx01 | 4 | 10 | status | | status | 2018-01-30 18:31:07 | n_diff_pfx02 | 11575 | 10 | status,id | | status | 2018-01-30 18:31:07 | n_leaf_pages | 10 | NULL | Number of leaf pages in the index | | status | 2018-01-30 18:31:07 | size | 11 | NULL | Number of pages in the index | +------------+---------------------+--------------+------------+-------------+-----------------------------------+ MariaDB [test]> select count(distinct(status)) from t1; +-------------------------+ | count(distinct(status)) | +-------------------------+ | 4 | +-------------------------+ MariaDB [test]> explain select * from t1 where status='a'; +------+-------------+-------+------+---------------+--------+---------+-------+------+--------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +------+-------------+-------+------+---------------+--------+---------+-------+------+--------------------------+ | 1 | SIMPLE | t1 | ref | status | status | 2 | const | 5038 | Using where; Using index | +------+-------------+-------+------+---------------+--------+---------+-------+------+--------------------------+ MariaDB [test]> show status like '%handler%'; +----------------------------+-------+ | Variable_name | Value | +----------------------------+-------+ | Handler_commit | 1 | | Handler_delete | 0 | | Handler_discover | 0 | | Handler_external_lock | 0 | | Handler_icp_attempts | 0 | | Handler_icp_match | 0 | | Handler_mrr_init | 0 | | Handler_mrr_key_refills | 0 | | Handler_mrr_rowid_refills | 0 | | Handler_prepare | 0 | | Handler_read_first | 0 | | Handler_read_key | 1 | | Handler_read_last | 0 | | Handler_read_next | 5039 | | Handler_read_prev | 0 | | Handler_read_retry | 0 | | Handler_read_rnd | 0 | | Handler_read_rnd_deleted | 0 | | Handler_read_rnd_next | 0 | | Handler_rollback | 0 | | Handler_savepoint | 0 | | Handler_savepoint_rollback | 0 | | Handler_tmp_update | 0 | | Handler_tmp_write | 0 | | Handler_update | 0 | | Handler_write | 0 | +----------------------------+-------+ MariaDB [test]> select count(*) from t1 where status='a'; +----------+ | count(*) | +----------+ | 5039 | +----------+ MariaDB [test]> update t1 set status='b' where status='a' limit 1000; Query OK, 1000 rows affected (0.22 sec) Rows matched: 1000 Changed: 1000 Warnings: 0 MariaDB [test]> select count(*) from t1 where status='a'; +----------+ | count(*) | +----------+ | 4039 | +----------+ 1 row in set (0.01 sec) MariaDB [test]> explain select * from t1 where status='a'; +------+-------------+-------+------+---------------+--------+---------+-------+------+--------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +------+-------------+-------+------+---------------+--------+---------+-------+------+--------------------------+ | 1 | SIMPLE | t1 | ref | status | status | 2 | const | 5038 | Using where; Using index | +------+-------------+-------+------+---------------+--------+---------+-------+------+--------------------------+ 1 row in set (0.00 sec) MariaDB [test]> show status like '%handler%'; +----------------------------+-------+ | Variable_name | Value | +----------------------------+-------+ | Handler_commit | 1 | | Handler_delete | 0 | | Handler_discover | 0 | | Handler_external_lock | 0 | | Handler_icp_attempts | 0 | | Handler_icp_match | 0 | | Handler_mrr_init | 0 | | Handler_mrr_key_refills | 0 | | Handler_mrr_rowid_refills | 0 | | Handler_prepare | 0 | | Handler_read_first | 0 | | Handler_read_key | 1 | | Handler_read_last | 0 | | Handler_read_next | 4039 | | Handler_read_prev | 0 | | Handler_read_retry | 0 | | Handler_read_rnd | 0 | | Handler_read_rnd_deleted | 0 | | Handler_read_rnd_next | 0 | | Handler_rollback | 0 | | Handler_savepoint | 0 | | Handler_savepoint_rollback | 0 | | Handler_tmp_update | 0 | | Handler_tmp_write | 0 | | Handler_update | 0 | | Handler_write | 0 | +----------------------------+-------+ 26 rows in set (0.01 sec) MariaDB [test]> select index_name,last_update,stat_name,stat_value,sample_size,stat_description from mysql.innodb_index_stats where database_name='test' AND table_name='t1'; +------------+---------------------+--------------+------------+-------------+-----------------------------------+ | index_name | last_update | stat_name | stat_value | sample_size | stat_description | +------------+---------------------+--------------+------------+-------------+-----------------------------------+ | PRIMARY | 2018-01-30 18:52:25 | n_diff_pfx01 | 11980 | 20 | id | | PRIMARY | 2018-01-30 18:52:25 | n_leaf_pages | 20 | NULL | Number of leaf pages in the index | | PRIMARY | 2018-01-30 18:52:25 | size | 21 | NULL | Number of pages in the index | | status | 2018-01-30 18:52:25 | n_diff_pfx01 | 4 | 12 | status | | status | 2018-01-30 18:52:25 | n_diff_pfx02 | 12058 | 12 | status,id | | status | 2018-01-30 18:52:25 | n_leaf_pages | 12 | NULL | Number of leaf pages in the index | | status | 2018-01-30 18:52:25 | size | 13 | NULL | Number of pages in the index | +------------+---------------------+--------------+------------+-------------+-----------------------------------+ 7 rows in set (0.00 sec) MariaDB [test]> show index from t1 \G; *************************** 1. row *************************** Table: t1 Non_unique: 0 Key_name: PRIMARY Seq_in_index: 1 Column_name: id Collation: A Cardinality: 11980 Sub_part: NULL Packed: NULL Null: Index_type: BTREE Comment: Index_comment: *************************** 2. row *************************** Table: t1 Non_unique: 1 Key_name: status Seq_in_index: 1 Column_name: status Collation: A Cardinality: 15 Sub_part: NULL Packed: NULL Null: YES Index_type: BTREE Comment: Index_comment: MariaDB [test]> show variables like 'innodb_stat%'; +--------------------------------------+-------------+ | Variable_name | Value | +--------------------------------------+-------------+ | innodb_stats_auto_recalc | ON | | innodb_stats_include_delete_marked | OFF | | innodb_stats_method | nulls_equal | | innodb_stats_modified_counter | 0 | | innodb_stats_on_metadata | OFF | | innodb_stats_persistent | ON | | innodb_stats_persistent_sample_pages | 20 | | innodb_stats_sample_pages | 8 | | innodb_stats_traditional | ON | | innodb_stats_transient_sample_pages | 8 | | innodb_status_output | OFF | | innodb_status_output_locks | OFF | +--------------------------------------+-------------+
Analyze table doesn’t help. Not even changing innodb_stats_persistent_sample_pages to higher values. Only solution so far are rebuilding the table (alter, mysqldump/recover) or use/force index.
In our real world scenario optimizer (at least show by explained) assumes it has to examine 1 600 000 rows, when in reality the handler statistic shows that only 400 rows have been checked.
Any help would be really appreciated.