You are here: Foswiki>Main Web>EinsteinAtHomeDatabaseOptimizations (13 Jul 2018, OliverBock)Edit Attach

Maintenance

There are a few housekeeping tasks that should be done on a regular base:
  • To update index key distributions: run ANALYZE TABLE once a day (e.g. right after daily backup-dump) on all InnoDB and MyISAM tables
    • Tables will be locked during optimization (MyISAM: read-lock, InnoDB: write-lock)!
    • Optimization will be replicated to slaves (good) unless told otherwise
  • To defrag data, recover space and update index statistics: run OPTIMIZE TABLE once a month (e.g. right after backup-dump) on all InnoDB and MyISAM tables
    • Tables will be locked during optimization!
    • Optimization will be replicated to slaves (good) unless told otherwise

Index / Query structure

Looking at the index structure of the most important tables, three different classes of problems could be determined:

Missing indexes:

Analyzing the long running queries (i.e. using innotop) we found a number of them that could benefit from extra (or even dedicated) indexes. Of course one has to take into account the overhead to maintain these indexs during DML-statements like INSERT, UPDATE, DELETE. However, a BOINC project, in particular when using locality scheduling, is more often reading (SELECT) than writing, so the extra overhead is usually compensated by the improved read performance. One also has to consider that many columns without indexes only contain state information, hence can be indexed rather cheaply (index with small cardinality).

Index redundancy:

There were a couple of redundant indexes which means they could be dropped (or changed, see below) without affecting the optimizer negatively. Redundant indexes simply add index maintenance overhead without contributing any advantage. Consider the following example: index1(a) and index2(a,b). In this case the index1 can be dropped because index2 already covers column a. Important: for this to work, the respective column(s) has(have) to be the leftmost column(s)! An additional advantage could be that index2 could serve as a so called covering index when used in statement like this: SELECT a, b FROM x WHERE a = m AND b = n . In this case no data would have to be fetched from table x because index2 already containes it - it covers the queried columns.

Index column order:

There were a number of multi-column indexes that shared the very same first column. Consider the following scenario of existing indexes: index1(a,b), index2(a,c), index3(a,d). These three indexes most likely have been created because column a will be used in conjunction with either column b, c or d in different queries. However, one needs to know that the optimizer doesn't care about the column order in this case. That means that the optimizer will use index1 for a query like SELECT x FROM y WHERE a = m AND b = n, regardless of the column order (index1(a,b) or index1(b,a)). Therefore it's better to reorder the remaining indexes index2 and index3, hence avoiding index redundancy on column a and providing indexes of columns c and d without additional overhead. Please note that it can sometimes even be better to swap all three indexes and create a dedicated index for column a, but that obviously depends on the actual queries being issued (which columns are used in which contexts) and the actual cardinalities of the columns concerned.

Table: result

Original index structure

+--------+------------+-------------------------+--------------+-------------------+-----------+-------------+----------+--------+------+------------+---------+
| Table  | Non_unique | Key_name                | Seq_in_index | Column_name       | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |
+--------+------------+-------------------------+--------------+-------------------+-----------+-------------+----------+--------+------+------------+---------+
| result |          0 | PRIMARY                 |            1 | id                | A         |     1454559 |     NULL | NULL   |      | BTREE      |         |
| result |          0 | name                    |            1 | name              | A         |     1454559 |     NULL | NULL   |      | BTREE      |         |
| result |          1 | res_wuid                |            1 | workunitid        | A         |     1454559 |     NULL | NULL   |      | BTREE      |         |
| result |          1 | res_app_state           |            1 | appid             | A         |           9 |     NULL | NULL   |      | BTREE      |         |
| result |          1 | res_app_state           |            2 | server_state      | A         |           9 |     NULL | NULL   |      | BTREE      |         |
| result |          1 | res_filedel             |            1 | file_delete_state | A         |           9 |     NULL | NULL   |      | BTREE      |         |
| result |          1 | res_userid_id           |            1 | userid            | A         |       69264 |     NULL | NULL   |      | BTREE      |         |
| result |          1 | res_userid_id           |            2 | id                | A         |     1454559 |     NULL | NULL   |      | BTREE      |         |
| result |          1 | res_userid_val          |            1 | userid            | A         |       63451 |     NULL | NULL   |      | BTREE      |         |
| result |          1 | res_userid_val          |            2 | validate_state    | A         |      132232 |     NULL | NULL   |      | BTREE      |         |
| result |          1 | res_hostid_id           |            1 | hostid            | A         |      161617 |     NULL | NULL   |      | BTREE      |         |
| result |          1 | res_hostid_id           |            2 | id                | A         |     1454559 |     NULL | NULL   |      | BTREE      |         |
| result |          1 | res_wu_user             |            1 | workunitid        | A         |      727279 |     NULL | NULL   |      | BTREE      |         |
| result |          1 | res_wu_user             |            2 | userid            | A         |     1454559 |     NULL | NULL   |      | BTREE      |         |
| result |          1 | ind_res_st              |            1 | server_state      | A         |           9 |     NULL | NULL   |      | BTREE      |         |
| result |          1 | ind_res_st              |            2 | priority          | A         |           9 |     NULL | NULL   |      | BTREE      |         |
| result |          1 | res_st_create           |            1 | server_state      | A         |           9 |     NULL | NULL   |      | BTREE      |         |
| result |          1 | res_st_create           |            2 | create_time       | A         |      727279 |     NULL | NULL   |      | BTREE      |         |
| result |          1 | res_st_random           |            1 | server_state      | A         |           9 |     NULL | NULL   |      | BTREE      |         |
| result |          1 | res_st_random           |            2 | random            | A         |     1454559 |     NULL | NULL   |      | BTREE      |         |
| result |          1 | res_host_state          |            1 | hostid            | A         |       96970 |     NULL | NULL   |      | BTREE      |         |
| result |          1 | res_host_state          |            2 | server_state      | A         |      161617 |     NULL | NULL   |      | BTREE      |         |
+--------+------------+-------------------------+--------------+-------------------+-----------+-------------+----------+--------+------+------------+---------+

Optimized index structure

Added three indexes:
  • outcome (small cardinality)
  • client_state (small cardinality)
  • received_time (overhead compensated by drop of res_wuid, see below)
Dropped redundant indexes:
  • res_wuid (covered by res_wu_user)
Changed index column order:
  • res_userid_val(userid,validate_state) right res_val_userid(validate_state,userid)
    (userid redundant and covered by res_userid_id, important "free" additional index for validate_state)
  • res_st_create(server_state,create_time) right res_create_server_state(create_time, server_state)
    (server_state redundant and covered by ind_res_st, important "free" additional index for create_time)
Possible changes (being conservative, not yet implemented):
  • add exit_status
  • swap column order of
    • res_st_random
      (server_state redundant and covered by ind_res_st)
      Note: why index a column with random content, just adds overhead without benefit?! maybe it could even be dropped!?
    • ind_res_st
      (this would require to also swap res_host_state to have one index still covering server_state! That'd be ok as hostid is redundant anyway and covered by res_hostid_id)
      Important: ind_res_st is used in an optimizer hint (FORCE INDEX) in boinc_db.cpp, is this really necessary (bad design, error-prone)?

+--------+------------+-------------------------+--------------+-------------------+-----------+-------------+----------+--------+------+------------+---------+
| Table  | Non_unique | Key_name                | Seq_in_index | Column_name       | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |
+--------+------------+-------------------------+--------------+-------------------+-----------+-------------+----------+--------+------+------------+---------+
| result |          0 | PRIMARY                 |            1 | id                | A         |     1454559 |     NULL | NULL   |      | BTREE      |         |
| result |          0 | name                    |            1 | name              | A         |     1454559 |     NULL | NULL   |      | BTREE      |         |
| result |          1 | res_app_state           |            1 | appid             | A         |           9 |     NULL | NULL   |      | BTREE      |         |
| result |          1 | res_app_state           |            2 | server_state      | A         |           9 |     NULL | NULL   |      | BTREE      |         |
| result |          1 | res_filedel             |            1 | file_delete_state | A         |           9 |     NULL | NULL   |      | BTREE      |         |
| result |          1 | res_userid_id           |            1 | userid            | A         |       69264 |     NULL | NULL   |      | BTREE      |         |
| result |          1 | res_userid_id           |            2 | id                | A         |     1454559 |     NULL | NULL   |      | BTREE      |         |
| result |          1 | res_val_userid          |            1 | validate_state    | A         |           9 |     NULL | NULL   |      | BTREE      |         |
| result |          1 | res_val_userid          |            2 | userid            | A         |      132232 |     NULL | NULL   |      | BTREE      |         |
| result |          1 | res_hostid_id           |            1 | hostid            | A         |      161617 |     NULL | NULL   |      | BTREE      |         |
| result |          1 | res_hostid_id           |            2 | id                | A         |     1454559 |     NULL | NULL   |      | BTREE      |         |
| result |          1 | res_wu_user             |            1 | workunitid        | A         |      727279 |     NULL | NULL   |      | BTREE      |         |
| result |          1 | res_wu_user             |            2 | userid            | A         |     1454559 |     NULL | NULL   |      | BTREE      |         |
| result |          1 | ind_res_st              |            1 | server_state      | A         |           9 |     NULL | NULL   |      | BTREE      |         |
| result |          1 | ind_res_st              |            2 | priority          | A         |           9 |     NULL | NULL   |      | BTREE      |         |
| result |          1 | res_create_server_state |            1 | create_time       | A         |      727279 |     NULL | NULL   |      | BTREE      |         |
| result |          1 | res_create_server_state |            2 | server_state      | A         |      727279 |     NULL | NULL   |      | BTREE      |         |
| result |          1 | res_st_random           |            1 | server_state      | A         |           9 |     NULL | NULL   |      | BTREE      |         |
| result |          1 | res_st_random           |            2 | random            | A         |     1454559 |     NULL | NULL   |      | BTREE      |         |
| result |          1 | res_host_state          |            1 | hostid            | A         |       96970 |     NULL | NULL   |      | BTREE      |         |
| result |          1 | res_host_state          |            2 | server_state      | A         |      161617 |     NULL | NULL   |      | BTREE      |         |
| result |          1 | res_outcome             |            1 | outcome           | A         |           9 |     NULL | NULL   |      | BTREE      |         |
| result |          1 | res_client_state        |            1 | client_state      | A         |           9 |     NULL | NULL   |      | BTREE      |         |
| result |          1 | res_recv_time           |            1 | received_time     | A         |      727279 |     NULL | NULL   |      | BTREE      |         |
+--------+------------+-------------------------+--------------+-------------------+-----------+-------------+----------+--------+------+------------+---------+

Table: workunit

Original index structure

+----------+------------+---------------+--------------+--------------------+-----------+-------------+----------+--------+------+------------+---------+
| Table    | Non_unique | Key_name      | Seq_in_index | Column_name        | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |
+----------+------------+---------------+--------------+--------------------+-----------+-------------+----------+--------+------+------------+---------+
| workunit |          0 | PRIMARY       |            1 | id                 | A         |      706229 |     NULL | NULL   |      | BTREE      |         |
| workunit |          0 | name          |            1 | name               | A         |      706229 |     NULL | NULL   |      | BTREE      |         |
| workunit |          1 | wu_val        |            1 | appid              | A         |          19 |     NULL | NULL   |      | BTREE      |         |
| workunit |          1 | wu_val        |            2 | need_validate      | A         |          19 |     NULL | NULL   |      | BTREE      |         |
| workunit |          1 | wu_timeout    |            1 | transition_time    | A         |      353114 |     NULL | NULL   |      | BTREE      |        |
| workunit |          1 | wu_assim      |            1 | appid              | A         |          19 |     NULL | NULL   |      | BTREE      |         |
| workunit |          1 | wu_assim      |            2 | assimilate_state   | A         |          19 |     NULL | NULL   |      | BTREE      |         |
| workunit |          1 | wu_filedel    |            1 | file_delete_state  | A         |          19 |     NULL | NULL   |      | BTREE      |         |
+----------+------------+---------------+--------------+--------------------+-----------+-------------+----------+--------+------+------------+---------+

Optimized index structure

Added three indexes:
  • canonical_resultid
  • mod_time
  • app_id (required for following changes of index orders, small cardinality)
Changed index column order (app_id redundant, now covered by wu_appid):
  • wu_val(appid,need_validate) right wu_val(need_validate,appid)
    ("free" additional index for need_validate)
  • wu_assim(appid,assimilate_state) right wu_assim(assimilate_state, appid)
    ("free" additional index for assimilate_state)
+----------+------------+---------------+--------------+--------------------+-----------+-------------+----------+--------+------+------------+---------+
| Table    | Non_unique | Key_name      | Seq_in_index | Column_name        | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |
+----------+------------+---------------+--------------+--------------------+-----------+-------------+----------+--------+------+------------+---------+
| workunit |          0 | PRIMARY       |            1 | id                 | A         |      706229 |     NULL | NULL   |      | BTREE      |         |
| workunit |          0 | name          |            1 | name               | A         |      706229 |     NULL | NULL   |      | BTREE      |         |
| workunit |          1 | wu_val        |            1 | need_validate      | A         |          19 |     NULL | NULL   |      | BTREE      |         |
| workunit |          1 | wu_val        |            2 | appid              | A         |          19 |     NULL | NULL   |      | BTREE      |         |
| workunit |          1 | wu_timeout    |            1 | transition_time    | A         |      353114 |     NULL | NULL   |      | BTREE      |         |
| workunit |          1 | wu_assim      |            1 | assimilate_state   | A         |          19 |     NULL | NULL   |      | BTREE      |         |
| workunit |          1 | wu_assim      |            2 | appid              | A         |          19 |     NULL | NULL   |      | BTREE      |         |
| workunit |          1 | wu_filedel    |            1 | file_delete_state  | A         |          19 |     NULL | NULL   |      | BTREE      |         |
| workunit |          1 | wu_cano_resid |            1 | canonical_resultid | A         |      706229 |     NULL | NULL   |      | BTREE      |         |
| workunit |          1 | wu_modtime    |            1 | mod_time           | A         |      706229 |     NULL | NULL   |      | BTREE      |         |
| workunit |          1 | wu_appid      |            1 | appid              | A         |          16 |     NULL | NULL   |      | BTREE      |         |
+----------+------------+---------------+--------------+--------------------+-----------+-------------+----------+--------+------+------------+---------+

Table: host

Original index structure

+-------+------------+------------------+--------------+---------------+-----------+-------------+----------+--------+------+------------+---------+
| Table | Non_unique | Key_name         | Seq_in_index | Column_name   | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |
+-------+------------+------------------+--------------+---------------+-----------+-------------+----------+--------+------+------------+---------+
| host  |          0 | PRIMARY          |            1 | id            | A         |     1739946 |     NULL | NULL   |      | BTREE      |         |
| host  |          1 | host_user        |            1 | userid        | A         |      869973 |     NULL | NULL   |      | BTREE      |         |
| host  |          1 | host_avg         |            1 | expavg_credit | A         |     1739946 |     NULL | NULL   |      | BTREE      |         |
| host  |          1 | host_tot         |            1 | total_credit  | A         |     1739946 |     NULL | NULL   |      | BTREE      |         |
+-------+------------+------------------+--------------+---------------+-----------+-------------+----------+--------+------+------------+---------+

Optimized index structure

Added three indexes:
  • rpc_time
  • create_time
  • p_fpops
Modified index (added second column, helps scheduler):
  • host_user(userid) right host_user_cpid(userid, host_cpid)
+-------+------------+------------------+--------------+---------------+-----------+-------------+----------+--------+------+------------+---------+
| Table | Non_unique | Key_name         | Seq_in_index | Column_name   | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |
+-------+------------+------------------+--------------+---------------+-----------+-------------+----------+--------+------+------------+---------+
| host  |          0 | PRIMARY          |            1 | id            | A         |     2263839 |     NULL | NULL   |      | BTREE      |         | 
| host  |          1 | host_avg         |            1 | expavg_credit | A         |     1131919 |     NULL | NULL   |      | BTREE      |         | 
| host  |          1 | host_tot         |            1 | total_credit  | A         |     2263839 |     NULL | NULL   |      | BTREE      |         | 
| host  |          1 | host_rpc_time    |            1 | rpc_time      | A         |     2263839 |     NULL | NULL   |      | BTREE      |         | 
| host  |          1 | host_create_time |            1 | create_time   | A         |     2263839 |     NULL | NULL   |      | BTREE      |         | 
| host  |          1 | host_fpops       |            1 | p_fpops       | A         |     2263839 |     NULL | NULL   |      | BTREE      |         | 
| host  |          1 | host_user_cpid   |            1 | userid        | A         |     2263839 |     NULL | NULL   |      | BTREE      |         | 
| host  |          1 | host_user_cpid   |            2 | host_cpid     | A         |     2263839 |     NULL | NULL   | YES  | BTREE      |         | 
+-------+------------+------------------+--------------+---------------+-----------+-------------+----------+--------+------+------------+---------+

Table: user

Original index structure

+-------+------------+---------------+--------------+---------------+-----------+-------------+----------+--------+------+------------+---------+
| Table | Non_unique | Key_name      | Seq_in_index | Column_name   | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |
+-------+------------+---------------+--------------+---------------+-----------+-------------+----------+--------+------+------------+---------+
| user  |          0 | PRIMARY       |            1 | id            | A         |      423498 |     NULL | NULL   |      | BTREE      |         |
| user  |          0 | email_addr    |            1 | email_addr    | A         |      423498 |     NULL | NULL   |      | BTREE      |         |
| user  |          0 | authenticator |            1 | authenticator | A         |      423498 |     NULL | NULL   | YES  | BTREE      |         |
| user  |          1 | ind_tid       |            1 | teamid        | A         |         138 |     NULL | NULL   |      | BTREE      |         |
| user  |          1 | user_name     |            1 | name          | A         |      423498 |     NULL | NULL   | YES  | BTREE      |         |
| user  |          1 | user_tot      |            1 | total_credit  | A         |      423498 |     NULL | NULL   |      | BTREE      |         |
| user  |          1 | user_avg      |            1 | expavg_credit | A         |      423498 |     NULL | NULL   |      | BTREE      |         |
+-------+------------+---------------+--------------+---------------+-----------+-------------+----------+--------+------+------------+---------+

Optimized index structure

No changes so far...

Topic revision: r25 - 13 Jul 2018, OliverBock
This site is powered by FoswikiCopyright © by the contributing authors. All material on this collaboration platform is the property of the contributing authors.
Ideas, requests, problems regarding Foswiki? Send feedback