OpenStack Ceilometer uses MongoDB: solving the problem of excessive disk usage

  • 2020-05-30 21:17:50
  • OfStack


Background: Ceilometer uses MongoDB as its database and samples continuously, so the data keeps growing and takes up too much disk space.

Knowledge background

1. Database file types

1.1. Journal files

Unlike the logs of some traditional databases, MongoDB's journal files are used only to recover in-memory data that had not yet been synchronized to disk when the system went down. The journal files are stored in a separate directory. At startup, MongoDB automatically pre-creates three journal files (initially empty) of 1 GB each.

1.2. Namespace file dbname.ns

This file stores the names of all collections and indexes in the database. It is small: at the default size of 16 MB it can hold about 24,000 collection or index names, together with the locations of those collections and indexes in the data files. MongoDB uses this file to find out where to start looking for, or inserting, collection data or index data.

1.3. Data files dbname.0, dbname.1, ... dbname.n

MongoDB stores its data and indexes in one or more data files. The first data file is named "<database name>.0", for example my-db.0, and its default size is 64 MB. Before those 64 MB are used up, MongoDB creates the next data file in advance, for example my-db.1. Each new data file is twice the size of the previous one: the second file is 128 MB and the third is 256 MB. The doubling stops once a file reaches 2 GB; after that, every new file is 2 GB.
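The doubling rule can be worked through in a few lines. The following is a minimal Python sketch, not MongoDB code: the function name data_file_sizes_mb is hypothetical, and the 64 MB starting size and 2 GB cap are simply the defaults described above.

```python
def data_file_sizes_mb(file_count, first_mb=64, cap_mb=2048):
    """Return the sizes in MB of dbname.0 .. dbname.(file_count - 1)."""
    sizes = []
    size = first_mb
    for _ in range(file_count):
        sizes.append(size)
        # Each new file is twice as large as the previous one, up to the cap.
        size = min(size * 2, cap_mb)
    return sizes


sizes = data_file_sizes_mb(8)
print(sizes)
print("total allocated (MB):", sum(sizes))
```

With these defaults the sixth file is the first to reach 2 GB, which is why a database holding only a few gigabytes of samples can already occupy noticeably more space on disk.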

2. Database size parameters

2.1. dataSize

dataSize is the parameter closest to the real data size, so you can use it to check how much data you have. It is the sum of the sizes of all records in the database (or collection). Note that each record carries a header and padding in addition to the BSON document itself, so dataSize is slightly larger than the raw data.

2.2. storageSize

This parameter is the sum of all data extents used by the database or collection. It is larger than dataSize because extents contain fragments left behind by deleted documents. If a newly inserted document is no larger than such a fragment, MongoDB reuses the fragment to store it; until then the fragment stays in place and takes up space. For this reason, storageSize does not shrink when you delete documents.

2.3. fileSize

This parameter applies only to a database and is the size of the files actually used on the file system. It includes all data extents, all index extents, and some unallocated space. As mentioned earlier, MongoDB pre-allocates database files when it creates them: the minimum is 64 MB even if you only have a few hundred KB of data. So fileSize can be much larger than the actual data size. The extra unused space lets MongoDB allocate new extents quickly when new data is written, avoiding the delay caused by disk space allocation.
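To see how the three parameters relate, the sketch below takes a dictionary shaped like the output of db.stats() (sizes in bytes) and derives the overheads described above. The function name space_report and the sample numbers are made up for illustration; they are not measurements from a real Ceilometer deployment.

```python
def space_report(stats):
    """Summarise dataSize, storageSize and fileSize from a db.stats()-style dict."""
    data = stats["dataSize"]
    storage = stats["storageSize"]
    file_size = stats["fileSize"]
    return {
        # Space inside data extents that is not occupied by live records.
        "fragment_bytes": storage - data,
        # Space in the files outside data extents (index extents, pre-allocation).
        "outside_data_extents_bytes": file_size - storage,
        # Share of the on-disk files that holds actual records.
        "data_to_file_ratio": data / file_size if file_size else 0.0,
    }


# Hypothetical figures, chosen only to make the arithmetic easy to follow.
sample = {"dataSize": 100, "storageSize": 300, "fileSize": 1000}
print(space_report(sample))
```

A low data_to_file_ratio after a large deletion is the typical sign that the solutions in the next section are worth applying.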

The solution

1. Reduce the size of pre-allocation (or disable pre-allocation)

Given MongoDB's pre-allocation mechanism, you can reduce the pre-allocation size or disable pre-allocation altogether. However, this can affect write performance, because new space then has to be allocated at write time. It is a reasonable choice if the database does not receive frequent large writes.
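As an illustration, older mongod releases that use the file layout described above accept the --smallfiles option (smaller initial and maximum data file sizes) and the --noprealloc option (no pre-allocation of data files). The Python sketch below only assembles such a command line; the helper name mongod_args and the example data path are assumptions, and you should check that your mongod version still supports these options before relying on them.

```python
def mongod_args(dbpath, small_files=True, no_prealloc=False):
    """Build a mongod command line with reduced or disabled pre-allocation."""
    args = ["mongod", "--dbpath", dbpath]
    if small_files:
        # Use smaller data files instead of the default 64 MB .. 2 GB sequence.
        args.append("--smallfiles")
    if no_prealloc:
        # Do not create the next data file ahead of time.
        args.append("--noprealloc")
    return args


# "/var/lib/mongodb" is only an example path.
print(" ".join(mongod_args("/var/lib/mongodb")))
```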

2. Data compression

The compact command compacts a collection, reducing the size of its data.


db.runCommand({compact:'CollectionName'})

Points to note:

1) while the operation is in progress, the collection being compacted is locked;
2) compact does not return disk space to the operating system, but later writes will reuse the space it has cleaned up;
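If you prefer to run compact from a script rather than the mongo shell, the following sketch shows one way to do it with a pymongo Database object. The helper name compact_collections is hypothetical, and the database name "ceilometer" and collection name "meter" in the usage comment are assumptions about a typical deployment, so substitute your own. Because each collection is locked while it is compacted, the helper handles them one at a time.

```python
def compact_collections(db, collection_names):
    """Run the compact command on each named collection, one after another.

    `db` is expected to behave like a pymongo Database object.
    """
    results = {}
    for name in collection_names:
        # Equivalent to db.runCommand({compact: name}) in the mongo shell.
        results[name] = db.command("compact", name)
    return results


# Usage (requires pymongo and a reachable mongod; names are assumptions):
#   from pymongo import MongoClient
#   db = MongoClient("mongodb://localhost:27017")["ceilometer"]
#   print(compact_collections(db, ["meter"]))
```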

3. Export and import data

mongodump exports a database and mongorestore imports the exported data, rebuilding the indexes in the process. Therefore, if data was deleted from the database earlier and the space was never released, that space is released after the import.
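The export and import round trip can be scripted. The sketch below builds the two command lines using the --db and --out options of mongodump and the --db option of mongorestore. The helper names, the database name and the dump directory are placeholders. It also assumes, which the text above does not state, that the old database is dropped (or a fresh data directory is used) between the two steps, since otherwise the old files stay on disk.

```python
import subprocess


def dump_and_restore_commands(db_name, dump_dir):
    """Return the (mongodump, mongorestore) command lines for one database."""
    dump = ["mongodump", "--db", db_name, "--out", dump_dir]
    # mongodump writes the database into <dump_dir>/<db_name>.
    restore = ["mongorestore", "--db", db_name, f"{dump_dir}/{db_name}"]
    return dump, restore


def run(cmd):
    """Run one command and raise if it exits with a non-zero status."""
    subprocess.run(cmd, check=True)


# Usage (placeholder names; drop the old database between the two steps):
#   dump, restore = dump_and_restore_commands("ceilometer", "/backup/mongo")
#   run(dump)
#   run(restore)
```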

4. Regularly delete unused data

In practice, the samples collected by Ceilometer can be discarded once they have passed their expiry time.

Steps:

1) set the time_to_live parameter in /etc/ceilometer/ceilometer.conf to specify how long samples remain valid;
2) run ceilometer-expirer to delete the expired samples;
3) restart the openstack-ceilometer-collector service;
4) after the deletion, run repairDatabase to repair the database.
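Step 1 can be sketched with Python's configparser. The example assumes that time_to_live lives in the [database] section and is expressed in seconds, which is worth checking against the sample configuration of your Ceilometer release; the function name set_time_to_live is hypothetical. Note that configparser does not preserve comments, so treat this as an illustration rather than a tool for editing a production file in place.

```python
import configparser
import io


def set_time_to_live(conf_text, seconds, section="database"):
    """Return conf_text with time_to_live set in the given section."""
    # strict=False tolerates repeated options, interpolation=None keeps '%' literal.
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(conf_text)
    if not parser.has_section(section):
        parser.add_section(section)
    parser.set(section, "time_to_live", str(seconds))
    out = io.StringIO()
    parser.write(out)
    return out.getvalue()


# Example: keep samples for 7 days (the connection URL is a placeholder).
original = "[database]\nconnection = mongodb://localhost:27017/ceilometer\n"
print(set_time_to_live(original, 7 * 24 * 3600))
```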

Note that the repairDatabase operation requires free disk space equal to the current total amount of data plus 2 GB. If the current partition does not have enough space, you can point the repair at a sufficiently large partition with the --repairpath option.
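Before starting the repair, it is worth checking the free space rule from the note above. The sketch below does so with shutil.disk_usage from the Python standard library; the function names are hypothetical, and the amount of data is passed in by the caller (for example, taken from db.stats()).

```python
import shutil

TWO_GB = 2 * 1024 ** 3


def repair_space_needed(data_bytes):
    """Free space required by repairDatabase: current data plus 2 GB."""
    return data_bytes + TWO_GB


def can_repair(data_bytes, path):
    """Return True if the partition holding `path` has enough free space."""
    free = shutil.disk_usage(path).free
    return free >= repair_space_needed(data_bytes)


# Example: 10 GB of data, checked against the partition of the current directory.
print(can_repair(10 * 1024 ** 3, "."))
```

If the check fails for the data partition, run it again against the partition you intend to use as the repair path.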
