Optimize Discuz with MySQL! Top posts page turning tips

  • 2020-09-28 09:12:40
  • OfStack

Write first: discuz! As a first-descendent 1 refers to the community system, for the majority of webmasters to provide a site type solution, and is open source (although part of the code is encrypted), it has made a great contribution to the development of this vertical industry. Still, discuz! System source code, or more or less some pit. One of the most famous is the default use of MyISAM engine, and based on the MyISAM engine building grab function, session table using memory engine, etc., you can refer to the following several historical articles. This time we're going to talk about discuz! Another problem in dealing with the logic of turning pages in hot posts.

In our environment, version ES10en-5.6.6 is used.

When you view a post and turn the page, SQL looks something like this:


mysql> desc SELECT * FROM pre_forum_post WHERE
 tid=8201301 AND `invisible` IN('0','-2') ORDER BY dateline DESC LIMIT 15\G
 *************************** 1. row ***************************
 id: 1
 select_type: SIMPLE
 table: pre_forum_post
 type: ref
 possible_keys: tid,displayorder,first
 key: displayorder
 key_len: 3
 ref: const
 rows: 593371
 Extra: Using index condition; Using where; Using filesort

The cost of this SQL implementation is:

-- According to the number of times the row is accessed by the index, it is generally in a good state


| Handler_read_key   | 16  |

-- The number of times the next row of records are accessed in index order, usually because the range scan, or full index scan, is generally in a good state


| Handler_read_next   | 329881 |

-- The total number of times a row record is read in order of 1. If you need to sort the results, this value is usually large. This value is also large when a full table scan occurs or when indexes are not available for multi-table join


| Handler_read_rnd   | 15  |

And when it comes to top posts, you need to turn a lot of pages later, for example:


mysql> desc SELECT * FROM pre_forum_post WHERE
 tid=8201301 AND `invisible` IN('0','-2') ORDER BY dateline LIMIT 129860, 15\G
 *************************** 1. row ***************************
 id: 1
 select_type: SIMPLE
 table: pre_forum_post
 type: ref
 possible_keys: displayorder
 key: displayorder
 key_len: 3
 ref: const
 rows: 593371
 Extra: Using where; Using filesort

The cost of this SQL execution becomes (as you can see, Handler_read_key, Handler_read_rnd are much larger) :

| Handler_read_key | 129876 | -- because there are a lot of rows to skip
| Handler_read_next | 329881 | -- ibid
| Handler_read_rnd | 129875 | -- because you need to sort a very large result set first

As you can see, the cost of SQL is very high when top posts is encountered. It is easy to overwhelm the database server directly if the top posts is flooded with access history replies, or if it is repeatedly requested by search engine 1 and has a history reply page.

Summary: The reason this SQL cannot sort with 'displayorder' is that the second column of the index 'invisible' USES a range query (RANGE), so it is not possible to continue to sort with the 'dateline' field using the joint index (if it is WHERE tid =? AND invisible IN(? , & # 63;) AND dateline =? In this case, the entire joint index can be used. Note the difference between the two.

Knowing this reason, the corresponding optimization solution is also clear:
Create a new index, idx_tid_dateline, which includes only two columns, tid and dateline (item_type and item_id, according to statistics from other indexes, have too low a cardinality to be included in the joint index. Of course, you can also consider 1 and add).

Let's take a look at the execution plan after adopting the new index:


mysql> desc SELECT * FROM pre_forum_post WHERE
 tid=8201301 AND `invisible` IN('0','-2') ORDER BY dateline LIMIT 15\G
 *************************** 1. row ***************************
 id: 1
 select_type: SIMPLE
 table: pre_forum_post
 type: ref
 possible_keys: tid,displayorder,first,idx_tid_dateline
 key: idx_tid_dateline
 key_len: 3
 ref: const
 rows: 703892
 Extra: Using where

As you can see, the existing Using filesort disappears and the sorting can be done directly by index.

However, if the top posts flipped to an older historical reply, the corresponding SQL would still not be able to use the new index:


mysql> desc SELECT * FROM pre_forum_post WHERE
 tid=8201301 AND `invisible` IN('0','-2') ORDER BY dateline LIMIT 129860,15\G
 *************************** 1. row ***************************
 id: 1
 select_type: SIMPLE
 table: pre_forum_post
 type: ref
 possible_keys: tid,displayorder,first,idx_tid_dateline
 key: displayorder
 key_len: 3
 ref: const
 rows: 593371
 Extra: Using where; Using filesort

Compare what the execution plan would look like if the optimizer were recommended to use the new index:


mysql> desc SELECT * FROM pre_forum_post use index(idx_tid_dateline) WHERE
 tid=8201301 AND `invisible` IN('0','-2') ORDER BY dateline LIMIT 129860,15\G
 *************************** 1. row ***************************
 id: 1
 select_type: SIMPLE
 table: pre_forum_post
 type: ref
 possible_keys: idx_tid_dateline
 key: idx_tid_dateline
 key_len: 3
 ref: const
 rows: 703892
 Extra: Using where

As you can see, the query optimizer considers the latter to be more efficient because it considers the number of rows to be scanned far more than the former by more than 110,000.

In fact, in this example, sorting is more expensive, so we want to eliminate sorting first, so we should force the new index, which is to specify the index in the corresponding program using the following execution plan.

Finally, let's look at the comparison of profiling statistics for the two execution plans when top posts turns to the very old historical reply:

1. Use the old index (displayorder) :


mysql> SELECT * FROM pre_forum_post WHERE
 tid=8201301 AND `invisible` IN('0','-2') ORDER BY dateline LIMIT 129860,15;

# To view profiling The results of 
 | starting    | 0.020203 |
 | checking permissions | 0.000026 |
 | Opening tables  | 0.000036 |
 | init     | 0.000099 |
 | System lock   | 0.000092 |
 | optimizing   | 0.000038 |
 | statistics   | 0.000123 |
 | preparing   | 0.000043 |
 | Sorting result  | 0.000025 |
 | executing   | 0.000023 |
 | Sending data   | 0.000045 |
 | Creating sort index | 0.941434 |
 | end     | 0.000077 |
 | query end   | 0.000044 |
 | closing tables  | 0.000038 |
 | freeing items  | 0.000056 |
 | cleaning up   | 0.000040 |

2. If the new index is adopted (idx_tid_dateline) :


mysql> SELECT * FROM pre_forum_post use index(idx_tid_dateline) WHERE
 tid=8201301 AND `invisible` IN('0','-2') ORDER BY dateline LIMIT 129860,15;

# Contrast to view profiling The results of 
 | starting    | 0.000151 |
 | checking permissions | 0.000033 |
 | Opening tables  | 0.000040 |
 | init     | 0.000105 |
 | System lock   | 0.000044 |
 | optimizing   | 0.000038 |
 | statistics   | 0.000188 |
 | preparing   | 0.000044 |
 | Sorting result  | 0.000024 |
 | executing   | 0.000023 |
 | Sending data   | 0.917035 |
 | end     | 0.000074 |
 | query end   | 0.000030 |
 | closing tables  | 0.000036 |
 | freeing items  | 0.000049 |
 | cleaning up   | 0.000032 |

As you can see, there is a definite improvement in efficiency, but it is not obvious, because the amount of data that does need to be scanned is larger, so the Sending data phase takes more time.

At this point, we can refer back to the previous optimization: the [MySQL optimization Case] series - paging optimization

You can then rewrite SQL as follows:


| Handler_read_key   | 16  |
0

Take a look at the profiling statistics for SQL:


| Handler_read_key   | 16  |
1

As you can see, the efficiency has more than doubled, which is pretty good.

As a final note, this problem only occurs when top posts turns the page, and it's fine if the original execution plan is still in place for posts with only 1 or 2 pages of replies.

Therefore, discuz! Officially modify or add a new index, and determine in the code whether top posts page turns or not, if so, force the new index to avoid performance issues.


Related articles: