Some pitfalls in MongoDB (best avoided)

  • 2020-05-13 03:44:49
  • OfStack

MongoDB is a popular NoSQL document database with some great features: automatic failover, automatic sharding, a schemaless data model, and, for the most part, excellent performance. However, we (the Mint team) ran into quite a few problems as our use of MongoDB deepened. Below is a summary of several pitfalls we encountered. Note: we currently run MongoDB 2.4.10; we upgraded to 2.6.0, but the problems persisted, so we rolled back to 2.4.10.

MongoDB database-level locks

Pitfall severity: 5 stars (max 5 stars)

The locking mechanism of MongoDB differs substantially from that of relational databases such as MySQL (InnoDB) and Oracle. InnoDB and Oracle provide row-level locks, while MongoDB (as of the 2.x series) only provides database-level locks. This means that while one of MongoDB's write locks is held, all other read and write operations on that database must wait.

At first glance, database-level locks seem like a serious problem under heavy concurrency, yet MongoDB still sustains high concurrency and high performance. Although its lock granularity is coarse, its lock handling differs substantially from that of relational databases, mainly in the following ways:

  • MongoDB has no full transaction support; atomicity extends only to a single document, so each operation is usually small.
  • The time a MongoDB lock is actually held covers only in-memory computation and changes, which is usually very fast.
  • MongoDB locks have a temporary-yield mechanism: when an operation must wait on slow disk IO, the lock can be yielded temporarily and reacquired after the IO completes.

That it usually works fine does not mean it always will. An ill-suited operation can still hold the write lock for a long time, such as the foreground index build described below. When that happens, the entire database is completely blocked and cannot serve any reads or writes, which is very serious.

The solution is to avoid holding the write lock for long periods wherever possible. If some collection's operations genuinely make that hard to avoid, consider moving that collection into a separate MongoDB database: locks on different databases are isolated from each other, so separating the collection prevents one collection's operations from blocking the whole database.
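Since 2.x locks are per-database, a write-heavy collection can be isolated by giving it its own database. A minimal mongo-shell sketch (the database and collection names here are hypothetical):

```javascript
// Hypothetical names: "app" is the main database, "app_logs" holds the
// write-heavy collection. Separate databases mean separate locks.
var mainDb = db.getSiblingDB("app");
var logsDb = db.getSiblingDB("app_logs");

// Heavy writes here only take the write lock on "app_logs"...
logsDb.events.insert({type: "click", at: new Date()});

// ...so reads and writes in the main database are not blocked by them.
mainDb.users.findOne({name: "alice"});
```

The trade-off is that cross-database queries and references become manual, so this is worth doing only for collections whose write load demonstrably blocks others.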

Building indexes causes the database to block

Pitfall severity: 3 stars

The problem of MongoDB's database-level lock was described above, and building an index is a prime cause of a long-held write lock: by default, MongoDB holds the write lock for the entire index build (and will not yield it temporarily).

The solution is simple. MongoDB offers two ways to build an index: background, which does not hold the write lock for long stretches, and foreground (the default), which does. Using background solves the problem. For example, to index the oversized posts collection, never use


db.posts.ensureIndex({user_id: 1})

You should use


db.posts.ensureIndex({user_id: 1}, {background: true})
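One way to confirm a background build is actually in progress is to inspect the running operations from another shell (the "Index Build" message text below is from the 2.x shell and may differ in other versions):

```javascript
// List in-progress index builds; in MongoDB 2.x the operation's "msg"
// field looks like "Index Build (background) 1234/567890 0%".
db.currentOp().inprog.forEach(function (op) {
  if (op.msg && op.msg.indexOf("Index Build") !== -1) {
    print(op.opid + ": " + op.msg);
  }
});
```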

Improper use of embedded documents

Pitfall severity: 5 stars

Embedded documents are one of the ways MongoDB differs markedly from relational databases: one document can embed child documents, so parent and children live together in a single collection, which makes retrieval and modification convenient.

For example, in Mint's application there is a Group document, and a user's application to join a Group is modeled as a GroupRequest document. At first we used embedding to put GroupRequest inside Group. The Ruby code is shown below (using the Mongoid ORM):


class Group
  include Mongoid::Document
  ...
  embeds_many :group_requests
  ...
end

class GroupRequest
  include Mongoid::Document
  ...
  embedded_in :group
  ...
end


This approach landed us in a pit we could barely climb out of. It caused nearly two weeks of system trouble: during peak hours the system would frequently freeze for several minutes, and in the worst case MongoDB went down entirely.

After careful analysis, we found that for some active Groups, group_requests grew (with each new application) and changed (when an application was approved or rejected) unusually often, and these operations frequently held the write lock for a long time, blocking the entire database. The root cause: when a group_request was added and the space preallocated for the Group document was insufficient, the document had to be relocated (in both memory and on disk), which is slow. In addition, Group has many indexes, and moving the document triggers a large number of index updates, which also takes a long time.

The fix was to replace the embedded association with an ordinary foreign-key association, much as one would in a relational database. That way, adding or modifying a group_request only touches the GroupRequest document itself, which is small and fast, and avoids long-held write locks. When the associated data is unbounded in size or changes frequently, you must avoid embedded associations, or you will suffer badly.
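A minimal sketch of the foreign-key version in Mongoid (the `state` field is a hypothetical illustration, not from our actual schema):

```ruby
# Sketch of the refactor: GroupRequest gets its own collection, so
# creating or updating a request writes a small document of its own
# instead of growing (and possibly relocating) the parent Group.
class Group
  include Mongoid::Document
  has_many :group_requests
end

class GroupRequest
  include Mongoid::Document
  belongs_to :group
  field :state, type: String, default: "pending" # hypothetical field
end

# group.group_requests.create(state: "pending") now inserts into the
# group_requests collection; the Group document is never rewritten.
```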

Improper use of the Array field

Pitfall severity: 4 stars

MongoDB's Array field is a distinctive feature that lets you store a simple one-to-many relationship inside a single document.

Mint hit a serious performance problem in one application scenario, shown in the following code:


class User
  include Mongoid::Document
  ...
  field :follower_user_ids, type: Array, default: []
  ...
end

In User, an Array field, follower_user_ids, stores the ids of a user's followers. The follower count per user ranges from tens up to about 3000 and changes frequently, causing problems similar to the embedding case above: the growing array forces the User document to be relocated.

Our solution: we moved follower_user_ids into Redis, an in-memory database, which avoids the frequent modifications to the User document in MongoDB and completely solved the problem. If you don't use Redis, you could instead create a UserFollower collection and associate it via a foreign key.
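A sketch of the Redis approach, assuming the `redis` gem and key names of our own invention:

```ruby
require "redis"

# Followers live in a Redis set; adding or removing one is an O(1)
# set operation and never touches the User document in MongoDB.
redis = Redis.new

def follow(redis, user_id, follower_id)
  redis.sadd("user:#{user_id}:follower_ids", follower_id)
end

def unfollow(redis, user_id, follower_id)
  redis.srem("user:#{user_id}:follower_ids", follower_id)
end

def follower_ids(redis, user_id)
  redis.smembers("user:#{user_id}:follower_ids")
end
```

A Redis set also gives cheap membership tests (`SISMEMBER`) and counts (`SCARD`), which a MongoDB Array field cannot match without reading the whole document.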

Those are the traps we have run into; none of them is easy to spot in advance. Take extra care to steer clear of them when using MongoDB.

