MongoDB query performance optimization and validation

  • 2020-06-01 11:14:56
  • OfStack

Conclusion:

1. With 2 million (200w) records and a reasonable index, where a single stationId covers about 40,000 (4w) records, MongoDB's query and sort performance is good: without a regex the client completes in roughly 600 ms at 300+ QPS; with a regex the client completes in roughly 1300 ms at 140+ QPS.

2. MongoDB's count performance is poor: the client completes in about 330 ms without concurrency, and needs 1-3 s under concurrency. Consider estimating the total instead (see the sketch after scenario 3 below): http://blog.sina.com.cn/s/blog_56545fd30101442b.html

Test environment: MongoDB replica set with 1 primary and 2 secondaries, 96 GB of memory, version 2.6.5

Memory consumption (4 collections of 2 million records each):


Disk space consumption (the collection finally selected for the test data):


JVM: -Xms2G -Xmx2G

Ping latency: 33 ms

Read preference: ReadPreference.secondaryPreferred()

Without regex

1. Compound index on stationId + firmId (2 million-record collection, 12 fields)

Number of queries: 20000

Query: multi-condition query returning 10 records, fetched one by one


// The regex pattern is built but not applied in this (non-regex) scenario.
String key = " Clear spring " + r.nextInt(1000);
Pattern pattern = Pattern.compile(key);
BasicDBObject queryObject = new BasicDBObject("stationId",
        new BasicDBObject("$in", new Integer[]{20}))
        .append("firmId", new BasicDBObject("$gt", 5000))
        .append("dealCount", new BasicDBObject("$gt", r.nextInt(1000000)));
DBCursor cursor = collection.find(queryObject).limit(10).skip(2);

Concurrency: 200

Total time: 61566 ms

Single query time (server): 124 ms

QPS: 324.85

2. Compound index on stationId + firmId (2 million-record collection, 12 fields)

Number of queries: 20000

Query: multi-condition query with a sort, returning 10 records, fetched one by one


// Same query shape as scenario 1, plus a sort on firmId; the pattern is still unused.
String key = " Clear spring " + r.nextInt(100);
Pattern pattern = Pattern.compile(key);
BasicDBObject queryObject = new BasicDBObject("stationId",
        new BasicDBObject("$in", new Integer[]{4, 20}))
        .append("firmId", new BasicDBObject("$gt", 5000))
        .append("dealCount", new BasicDBObject("$gt", r.nextInt(1000000)));
DBCursor cursor = collection.find(queryObject)
        .sort(new BasicDBObject("firmId", 1)).limit(10).skip(2);

Concurrency: 200

Total time: 63187 ms

Single query time (server): 119 ms

QPS: 316.52

3. Compound index on stationId + firmId (2 million-record collection, 12 fields)

Number of queries: 2000

Query: count of records matching multiple conditions


// Count of matching documents; the pattern is unused in this (non-regex) scenario.
String key = " Clear spring " + r.nextInt(100);
Pattern pattern = Pattern.compile(key);
BasicDBObject queryObject = new BasicDBObject("stationId",
        new BasicDBObject("$in", new Integer[]{4, 20}))
        .append("firmId", new BasicDBObject("$gt", 5000))
        .append("dealCount", new BasicDBObject("$gt", r.nextInt(1000000)));
long count = collection.count(queryObject);

Concurrency: 200

Total time: 21887 ms

Single query time (client): 280 ms

QPS: 91.38
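
Since count() turned out to be the weakest point in these tests, one commonly used alternative is to read the collection-level document count from the collection statistics (collStats), which returns immediately. A minimal sketch against the same 2.x Java driver and the same collection object as above; note this counts the whole collection rather than the filtered subset, so it only works as an estimate of the total:

// Approximate the total from collection metadata instead of a filtered count().
// The "count" field of collStats is maintained by MongoDB and returns immediately,
// but it covers the entire collection, not the query's filter.
CommandResult stats = collection.getStats();   // runs the collStats command
long approxTotal = stats.getLong("count");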

With regex

4. Compound index on stationId + firmId (2 million-record collection, 12 fields)

Number of queries: 20000

Query: multi-condition query returning 10 records, fetched one by one


// Same query shape as scenario 1, plus a regex condition on firmName.
String key = " Clear spring " + r.nextInt(1000);
Pattern pattern = Pattern.compile(key);
BasicDBObject queryObject = new BasicDBObject("stationId",
        new BasicDBObject("$in", new Integer[]{20}))
        .append("firmId", new BasicDBObject("$gt", 5000))
        .append("dealCount", new BasicDBObject("$gt", r.nextInt(1000000)))
        .append("firmName", pattern);
DBCursor cursor = collection.find(queryObject).limit(10).skip(2);

Concurrency: 200

Total time: 137673 ms

Single query time (server): 225 ms

QPS: 145.27

5. Compound index on stationId + firmId (2 million-record collection, 12 fields)

Number of queries: 20000

Query: multi-condition query with a sort, returning 10 records, fetched one by one


// Regex condition on firmName plus a sort on firmId.
String key = " Clear spring " + r.nextInt(1000);
Pattern pattern = Pattern.compile(key);
BasicDBObject queryObject = new BasicDBObject("stationId",
        new BasicDBObject("$in", new Integer[]{4, 20}))
        .append("firmId", new BasicDBObject("$gt", 5000))
        .append("dealCount", new BasicDBObject("$gt", r.nextInt(1000000)))
        .append("firmName", pattern);
DBCursor cursor = collection.find(queryObject)
        .sort(new BasicDBObject("firmId", 1)).limit(10).skip(2);

Concurrency: 200

Total time: 138673 ms

Single query time (server): 230 ms

QPS: 144.22

6. Compound index on stationId + firmId (2 million-record collection, 12 fields)

Number of queries: 2000

Query: count of records matching multiple conditions


// Count with the regex condition included.
String key = " Clear spring " + r.nextInt(1000);
Pattern pattern = Pattern.compile(key);
BasicDBObject queryObject = new BasicDBObject("stationId",
        new BasicDBObject("$in", new Integer[]{4, 20}))
        .append("firmId", new BasicDBObject("$gt", 5000))
        .append("dealCount", new BasicDBObject("$gt", r.nextInt(1000000)))
        .append("firmName", pattern);
long count = collection.count(queryObject);

Concurrency: 200

Total time: 23155 ms

Single query time (client): 330 ms

QPS: 86.37
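
Where the data allows it, one way to reduce the regex cost seen in scenarios 4-6 is to anchor the pattern to the beginning of the string: MongoDB can use an index for a case-sensitive prefix regex. A minimal sketch, assuming (hypothetically) an index on firmName, which the scenarios above do not state:

// Prefix-anchored, case-sensitive regex: MongoDB can walk an index on firmName.
// (Assumption: such an index exists; only the stationId + firmId compound index
// is mentioned above.)
String key = " Clear spring " + r.nextInt(1000);
Pattern prefixPattern = Pattern.compile("^" + key);
BasicDBObject queryObject = new BasicDBObject("firmName", prefixPattern)
        .append("stationId", new BasicDBObject("$in", new Integer[]{4, 20}));
DBCursor cursor = collection.find(queryObject).limit(10);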

MongoDB index features

1. A compound index must be hit on its first (leading) field, otherwise it is not used; the remaining fields can be hit in any order (a minimal sketch follows this list).

2. The more fields a compound index contains, the more space it takes, but the extra fields have little impact on query performance (array/multikey indexes are the exception).

3. The index is selected according to the sort field, which takes priority over non-leading fields of a compound index.

4. When the compound index is hit and the matched data volume is under 100,000 (10w) records, filtering on non-indexed fields is still reasonably efficient.

5. Full-text search performance is poor: when 500,000 of the 2 million records match, full-text search takes 10+ seconds, while a regex takes about 1 second.
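
As a concrete illustration of point 1, a minimal sketch of creating and hitting the compound index used in the scenarios above (same 2.x driver API):

// Compound index on stationId + firmId, as used by the scenarios above.
collection.createIndex(new BasicDBObject("stationId", 1).append("firmId", 1));

// Uses the index: the leading field stationId appears in the query.
collection.find(new BasicDBObject("stationId", 20)
        .append("firmId", new BasicDBObject("$gt", 5000)));

// Cannot use the index: the leading field stationId is missing.
collection.find(new BasicDBObject("firmId", new BasicDBObject("$gt", 5000)));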

The MongoDB client configuration below can be extracted into a Spring bean, where the maximum number of connections and other options are set.


MongoClientOptions options = MongoClientOptions.builder()
        .maxWaitTime(1000 * 60 * 2)
        .connectionsPerHost(500)
        .build();
mongoClient = new MongoClient(Arrays.asList(
        new ServerAddress("10.205.68.57", 8700),
        new ServerAddress("10.205.68.15", 8700),
        new ServerAddress("10.205.69.13", 8700)), options);
mongoClient.setReadPreference(ReadPreference.secondaryPreferred());
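
As noted above, this setup can be pulled out into a Spring configuration so the client is injected rather than built inline. A minimal sketch assuming Spring Java config (@Configuration/@Bean) and the same 2.x driver classes; the class and method names are illustrative only:

// Hypothetical Spring Java-config wrapper around the same client setup.
@Configuration
public class MongoConfig {

    @Bean(destroyMethod = "close")
    public MongoClient mongoClient() throws UnknownHostException {
        MongoClientOptions options = MongoClientOptions.builder()
                .maxWaitTime(1000 * 60 * 2)
                .connectionsPerHost(500)
                .build();
        MongoClient client = new MongoClient(Arrays.asList(
                new ServerAddress("10.205.68.57", 8700),
                new ServerAddress("10.205.68.15", 8700),
                new ServerAddress("10.205.69.13", 8700)), options);
        client.setReadPreference(ReadPreference.secondaryPreferred());
        return client;
    }
}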

docx contains the test data for the final scenarios, split into regex and non-regex cases.

mongoDB survey _remote.docx contains data collected during testing and verification, which may have been affected by caching.

What are the query optimization principles of MongoDB? The details are as follows:

1. Creating indexes on the fields used in query conditions, sort conditions and statistics can significantly improve query efficiency.

2. With $or, put the condition that matches the most results first; with $and, put the condition that matches the fewest results first.

3. Use limit() to limit the size of the returned result set to reduce the resource consumption of the database server and the amount of data transferred over the network.

4. Use $in sparingly; break it into individual single queries instead. On sharded clusters especially, $in forces the query to be run on every shard and, where needed, to use the index on every shard.

5. Try not to use fuzzy matching queries; replace them with exact matching queries such as $in and $nin where possible.

6. When the query volume and concurrency are high, add a cache in front of the database.

7. Skip safe mode for operations that do not need it, so the client does not have to wait for the database to return a result or handle exceptions; this is an order of magnitude faster.

8. MongoDB's query optimizer works at the granularity of the query condition; skip and limit are not part of its decision. When paging to the last few pages, use a reverse sort instead of a large skip.

9. Reduce cross-shard queries as much as possible, and keep the number of balancer migrations low.

10. Query only the fields you use, not all fields (as shown in the sketch after this list).

11. When incrementing a field's value, $inc is more efficient than replacing the document with update.

12. Capped collections are more efficient for reads and writes than regular collections.

13. Server-side processing is similar to a stored procedure in SQL and can reduce network communication overhead.

14. Use hint() when necessary to force an index query.

15. If you have your own primary key, use it as _id; this saves space and avoids creating an extra index.

16. Use explain() and optimize according to the explain plan (see the sketch after this list).

17. For range queries, try to use $in and $nin.

18. Review the database query log and analyze inefficient operations specifically.

19. MongoDB has a database optimization tool, the database profiler, which records the performance of database operations so that inefficient queries or writes can be found and optimized.

20. Put as many operations as possible on the client side; this is, after all, part of MongoDB's design philosophy.
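
To make points 10, 14 and 16 concrete, a small sketch using the same 2.x driver API and the same collection as in the scenarios above:

// Point 10: project only the fields that are needed (_id is returned by default).
BasicDBObject query = new BasicDBObject("stationId", 20);
BasicDBObject fields = new BasicDBObject("firmName", 1).append("dealCount", 1);
DBCursor cursor = collection.find(query, fields).limit(10);

// Point 14: force a specific index when the optimizer picks a poor one.
cursor = collection.find(query, fields)
        .hint(new BasicDBObject("stationId", 1).append("firmId", 1))
        .limit(10);

// Point 16: inspect the plan (index used, nscanned, etc.) and optimize from it.
System.out.println(cursor.explain());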

