A Brief Analysis of Database Table Splitting: Horizontal Splitting, Vertical Splitting, and Database/Table Hashing

  • 2021-06-28 09:51:29
  • OfStack

1. Horizontal Division
What is horizontal division? As an analogy, picture a dining hall with only one serving window. The queue grows so long that it snakes back and forth in an S shape, which easily makes the people waiting anxious and confused. At this point a manager steps in, opens more serving windows, and breaks the one long queue into several shorter ones. Put more vividly: you take a "scalpel" and slice a large table across several times, so that the one large table becomes several small tables.

Horizontal splitting places data in two or more separate tables based on certain conditions. In other words, the table is divided by record: different records are stored separately, and every sub-table has the same columns but fewer rows. For example, a table with 1 billion rows can be horizontally divided into 12 tables, each of which holds the data for one month of a specific year. Any query that needs data for a specific month only has to reference the table for that month.
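
The month-based routing described above can be sketched in a few lines of Python. This is only a minimal illustration: the base table name `sales` and the `<base>_<year>_<month>` naming scheme are assumptions made for the example, not something the article prescribes.

```python
from datetime import date
from typing import List


def monthly_table(base: str, day: date) -> str:
    """Return the name of the monthly sub-table that holds rows for `day`."""
    return f"{base}_{day.year:04d}_{day.month:02d}"


def tables_for_range(base: str, start: date, end: date) -> List[str]:
    """List every monthly sub-table a query over [start, end] must touch."""
    names = []
    year, month = start.year, start.month
    while (year, month) <= (end.year, end.month):
        names.append(f"{base}_{year:04d}_{month:02d}")
        month += 1
        if month > 12:  # roll over into January of the next year
            year, month = year + 1, 1
    return names


# A query for a single month references exactly one sub-table.
print(monthly_table("sales", date(2021, 6, 28)))  # sales_2021_06
# A query spanning several months has to reference one sub-table per month.
print(tables_for_range("sales", date(2020, 11, 15), date(2021, 2, 1)))
```

A query confined to one month touches a single small table, while a range query has to visit one table per month covered.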

Tables are commonly split horizontally along the date/time dimension or the region dimension, and of course along other business dimensions as well. Here are a few examples:
Case 1: A company's sales record data is so large that we can divide it horizontally by month, with each month's sales records stored in a separate table.
Case 2: A group has branches in each region, and the group's order data table has grown too large. We can split it horizontally by the region where each branch is located.
Case 3: A telecommunication company split its call records table horizontally by date and city, found that the amount of data was still too large, and then split it further by brand and number segment.
Horizontal splitting is often used in the following situations:
(1) The table holds a very large amount of data. Splitting it reduces the number of data and index pages a query must read, reduces the number of index levels, and speeds up queries.
(2) The data in the table is inherently independent, for example when each region or each time period is recorded separately, and especially when some of the data is used frequently while the rest is not.
(3) Data needs to be stored on multiple media.
(4) It is necessary to separate historical data from current data.
Advantage:
1: It reduces the number of data and index pages that a query has to read, and it also reduces the number of index levels, which speeds up queries.
Disadvantage:
1: Horizontal splitting adds complexity to the application. Queries usually have to name several tables, and querying all of the data requires a UNION. In many database applications this complexity outweighs the benefit, because as long as the index key is small, growing the table to two or three times its size adds only about one more index level of disk reads to a query that uses the index.
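
The extra work that horizontal splitting pushes onto the application can be illustrated with a small sketch using Python's built-in `sqlite3` module. The table names (`sales_2021_01` and so on) and the `amount` column are hypothetical; the point is only that a query over all of the data has to name every sub-table and combine them with `UNION ALL`.

```python
import sqlite3

# Hypothetical monthly sub-table suffixes, fixed in code (never user input).
MONTHS = ["2021_01", "2021_02", "2021_03"]


def create_split_tables(conn: sqlite3.Connection) -> None:
    """Create one identically structured sub-table per month."""
    for suffix in MONTHS:
        conn.execute(
            f"CREATE TABLE sales_{suffix} (id INTEGER PRIMARY KEY, amount REAL)"
        )


def total_sales(conn: sqlite3.Connection) -> float:
    """Sum `amount` over every monthly sub-table with UNION ALL."""
    union = " UNION ALL ".join(f"SELECT amount FROM sales_{s}" for s in MONTHS)
    row = conn.execute(f"SELECT COALESCE(SUM(amount), 0) FROM ({union})").fetchone()
    return row[0]


conn = sqlite3.connect(":memory:")
create_split_tables(conn)
conn.execute("INSERT INTO sales_2021_01 (amount) VALUES (10.0)")
conn.execute("INSERT INTO sales_2021_03 (amount) VALUES (5.5)")

# A whole-dataset query must visit all three sub-tables.
print(total_sales(conn))  # 15.5
```

A query for a single month would read just one of these tables, which is where the split pays off; the `UNION ALL` path is the cost paid for whole-dataset queries.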

2. Vertical Split
What is vertical division? As an analogy, picture a small company that grows into a multinational enterprise within a few years. The old department structure clearly can no longer support the current business, so the CEO divides the company into a finance department, a personnel department, a production department, a sales department, and so on, with a number of sub-departments set up under each one. Quite a vivid picture, isn't it? Ha-ha.
To split a table vertically (without violating third normal form), you put the primary key and some of the columns in one table, and the primary key and the remaining columns in another table. The original table is thus split into multiple tables with fewer columns each. Vertical splitting is appropriate when some columns of a table are used frequently and others are not.
Advantages:
1: Vertical splitting makes each row smaller, so one data block can hold more rows, and a query reads fewer blocks, which reduces I/O.
2: A vertically split table makes better use of the cache.
Disadvantages:
1: After the table is split vertically, the primary key is duplicated in every resulting table, and these redundant columns have to be managed.
2: Reading columns from more than one of the split tables requires a JOIN (which increases CPU overhead); this should be avoided where possible through business-level design.
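
As a rough illustration of the trade-off above, the following sketch uses Python's built-in `sqlite3` module to split a hypothetical user table into a narrow, frequently read table and a second table for rarely read columns. The table and column names (`user_core`, `user_profile`, `bio`, `avatar`) are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Frequently read columns stay in a narrow table.
conn.execute(
    "CREATE TABLE user_core ("
    " user_id INTEGER PRIMARY KEY, name TEXT, email TEXT)"
)
# Rarely read, bulky columns move to a second table that repeats the primary key.
conn.execute(
    "CREATE TABLE user_profile ("
    " user_id INTEGER PRIMARY KEY REFERENCES user_core(user_id),"
    " bio TEXT, avatar BLOB)"
)

conn.execute("INSERT INTO user_core VALUES (1, 'alice', 'alice@example.com')")
conn.execute("INSERT INTO user_profile VALUES (1, 'long biography text', NULL)")


def get_login_info(conn: sqlite3.Connection, user_id: int):
    """Common path: reads only the narrow table, no JOIN needed."""
    return conn.execute(
        "SELECT name, email FROM user_core WHERE user_id = ?", (user_id,)
    ).fetchone()


def get_full_user(conn: sqlite3.Connection, user_id: int):
    """Rare path: columns from both tables require a JOIN on the shared key."""
    return conn.execute(
        "SELECT c.name, c.email, p.bio"
        " FROM user_core AS c JOIN user_profile AS p ON p.user_id = c.user_id"
        " WHERE c.user_id = ?",
        (user_id,),
    ).fetchone()


print(get_login_info(conn, 1))
print(get_full_user(conn, 1))
```

The split helps only if the common path really does stay within the narrow table; if most queries end up calling something like `get_full_user`, the JOIN cost is paid on every request.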

3. Database and Table Hashing
Table hashing is similar to horizontal splitting, but it has no explicit split boundary of the kind horizontal splitting uses. Instead, a hash algorithm spreads the data across the tables so that I/O is more evenly balanced. Generally speaking, we first separate databases by business or functional module, so that different modules correspond to different databases or tables, and then apply some strategy to hash the data behind a particular page or function into smaller databases or tables. For example, the user table can be hashed by user ID into 128 tables. This can improve system performance at low cost and with good scalability.
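
A minimal sketch of the user-ID example follows. The article does not say which hash function is used, so a plain modulo on integer IDs is assumed here, with a CRC32 checksum as a stand-in for non-integer keys; the table naming scheme `user_NNN` is also hypothetical.

```python
import zlib
from collections import Counter

TABLE_COUNT = 128  # number of hash tables, as in the user-table example


def table_index(key) -> int:
    """Map an integer or string key to a table index in [0, TABLE_COUNT)."""
    if isinstance(key, int):
        return key % TABLE_COUNT
    # Python's built-in hash() for strings is randomized per process, so a
    # stable checksum is used to keep routing consistent across restarts.
    return zlib.crc32(str(key).encode("utf-8")) % TABLE_COUNT


def user_table(user_id) -> str:
    """Return the name of the sub-table that stores this user's row."""
    return f"user_{table_index(user_id):03d}"


print(user_table(300))  # user_044

# Sequential IDs spread evenly: 12800 users land as 100 rows in each table.
counts = Counter(table_index(uid) for uid in range(1, 12801))
print(len(counts), min(counts.values()), max(counts.values()))  # 128 100 100
```

One consequence of this scheme is that the table count is baked into the routing: changing `TABLE_COUNT` later maps most keys to a different table, so existing rows would have to be moved.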
