Unique Indexes in MongoDB

  • 2020-11-30 08:38:04
  • OfStack

Foreword

MongoDB supports many types of indexes, such as single-field indexes, compound indexes, multikey indexes, TTL indexes, text indexes, geospatial indexes, and so on. An index can also carry the unique property, which makes it a unique index. A unique index ensures that the indexed field does not store duplicate values, that is, it enforces uniqueness on the indexed field. By default, MongoDB automatically creates a unique index on the _id field when a collection is created. This article focuses on the use of unique indexes.

I'm not going to explain what an index is or what a unique index is; if you don't know, you can look it up on Google or Baidu. What prompted this article was a problem in a previous project.

We use MongoDB to store user information. Users originally registered with their mobile phone number, so it was reasonable to add a unique index to the phone number field, and there was nothing wrong with that. Later the requirements changed: users could register with either a mobile phone number or an email address. At that point, the unique index on the phone number started to cause problems.


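func init() {
	// Unique index on phone: no two documents may share the same phone value.
	phoneIndex := mgo.Index{
		Key:    []string{"phone"},
		Unique: true,
	}

	col := db.Collection(&User{})
	// EnsureIndex returns an error; do not silently ignore it.
	if err := col.EnsureIndex(phoneIndex); err != nil {
		panic(err)
	}
}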

The problem is easy to foresee. When a user registers with an email address, the phone number is left blank. The first such registration succeeds, but the next one fails with an error saying that the value in the phone index is duplicated. That is entirely expected, because we inserted two empty values. Note that these are empty strings, not null.

Since MongoDB is a flexible document database, documents in a collection do not all need the same fields, so we tried changing how the Phone field of the User entity is stored: when phone is an empty string, we do not insert the field at all. To do this we added the omitempty tag to the phone field (our microservice is written in Go). The following is part of the User struct:


type User struct {
	Email      string `bson:"email"`
	Salt       string `bson:"salt"`
	Phone      string `bson:"phone,omitempty"`
	IDCard     string `bson:"idcard"`
	RealName   string `bson:"realname"`
	AuthStatus int    `bson:"auth_status"`
}

You can see that the phone field carries the omitempty tag, which means the field is not inserted when it is empty. This still fails, so why did we try it? It stems from experience with MySQL: we assumed that MongoDB, like MySQL, does not apply the unique constraint to null values. In MySQL, a unique index allows multiple rows whose phone is NULL.
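To see what omitempty does in isolation, here is a minimal sketch. It uses the standard library's encoding/json as a stand-in, on the assumption that the bson tag treats empty strings the same way; the account type is a hypothetical, trimmed-down version of User.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// account is a hypothetical, trimmed-down stand-in for the User struct.
// The json tags mirror the bson tags used in the article.
type account struct {
	Email string `json:"email"`
	Phone string `json:"phone,omitempty"`
}

// encode serializes an account so we can see which fields are written.
func encode(a account) string {
	b, err := json.Marshal(a)
	if err != nil {
		panic(err)
	}
	return string(b)
}

func main() {
	// Phone is empty, so the phone field is left out of the output entirely.
	fmt.Println(encode(account{Email: "a@example.com"}))
	// Phone is set, so the field is written as usual.
	fmt.Println(encode(account{Email: "a@example.com", Phone: "13800000000"}))
}
```

The point is that an omitted field is absent from the stored document; it is not stored as an empty string.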

This approach still fails with a duplicate-value error, except that the duplicated value is now null instead of an empty string. So don't carry MySQL habits over: what works in MySQL does not work in MongoDB. MongoDB still indexes the document even if the field is not inserted.

I like to read the official documentation. Here is what the MongoDB documentation says:

[

If a document does not have a value for the indexed field in a unique
index, the index will store a null value for this document. Because of
the unique constraint, MongoDB will only permit one document that
lacks the indexed field. If there is more than one document without a
value for the indexed field or is missing the indexed field, the index
build will fail with a duplicate key error.

]

The passage is clear enough to follow even with only a little English. In other words:

[

If a document has no value for the indexed field of a unique index, the index stores a null value for that document. Because of the unique constraint, MongoDB allows only one document that lacks the indexed field. If more than one document has no value for the indexed field or is missing the field, the index build fails with a duplicate key error.

]

That is to say, even if the field is absent from the document, the index stores a null value for it, and two null values cannot exist in the index at the same time. This is why the approach above still does not work. It also breaks the data structure: even when a user has no phone number, the document arguably should not lack the field. MongoDB is not a relational database, but the business design still has to be considered.
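The rule quoted from the documentation can be illustrated with a toy model. This is only a sketch of the described behavior, not how MongoDB implements indexes; doc, indexKey and buildUniqueIndex are hypothetical names, and documents are simplified to string fields.

```go
package main

import "fmt"

// doc is a simplified document: field name -> string value.
type doc map[string]string

// indexKey returns the key a non-sparse index would store for d.
// A missing field is recorded as a null key, as the documentation describes.
func indexKey(d doc, field string) string {
	if v, ok := d[field]; ok {
		return "string:" + v
	}
	return "null"
}

// buildUniqueIndex scans the documents and fails on the first repeated key.
func buildUniqueIndex(docs []doc, field string) error {
	seen := make(map[string]int) // key -> position of the first document using it
	for i, d := range docs {
		k := indexKey(d, field)
		if j, dup := seen[k]; dup {
			return fmt.Errorf("duplicate key %q in documents %d and %d", k, j, i)
		}
		seen[k] = i
	}
	return nil
}

func main() {
	docs := []doc{
		{"email": "a@example.com", "phone": "13800000000"},
		{"email": "b@example.com"}, // phone omitted
		{"email": "c@example.com"}, // phone omitted again: collides on null
	}
	fmt.Println(buildUniqueIndex(docs, "phone"))
}
```

In this model, one document without a phone is accepted, while a second one collides on the null key, which matches the error we saw after adding omitempty.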

The solution

Is there no solution? There is: MongoDB provides sparse indexes. Here is an example of creating a sparse index:


db.getCollection("test").createIndex( { "phone": 1 }, { sparse: true })

After executing the statement above, documents that do not have the phone field are not indexed. In other words, a document is indexed only if the field exists, which makes sparse useful in combination with unique: unique enforces uniqueness, and sparse indexes only the documents that contain the field. So when phone or email is empty, we can simply leave it out of the document and the insert succeeds.


db.getCollection("test").createIndex( { "phone": 1 }, { sparse: true, unique: true } )

The above is mongo shell syntax. We usually create the index in code, modified as follows (the omitempty tag on the Phone field of the User struct is still needed):


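func init() {
	// Unique + sparse: documents without a phone field are left out of the
	// index, so any number of users may register without a phone number.
	phoneIndex := mgo.Index{
		Key:    []string{"phone"},
		Unique: true,
		Sparse: true,
	}

	col := db.Collection(&User{})
	// EnsureIndex returns an error; do not silently ignore it.
	if err := col.EnsureIndex(phoneIndex); err != nil {
		panic(err)
	}
}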

But again, as we said earlier, this breaks the data structure. You cannot have everything. Of course, we could also solve the problem at the business level, for example by querying for an existing record at registration time, but that costs some performance. Whether you trade space for time or time for space, there is a price to pay; don't be too greedy.
