7 Reasons Why Docker Is Not Suitable for Deploying Databases

2021-10-11 20:03:20
OfStack

Docker has been very popular over the past two years. Developers are eager to deploy every application and piece of software in Docker containers, but are you sure you want to deploy databases in containers as well? The question is a real one, because many operation manuals and video tutorials on the Internet show how to do it. The reasons why databases are poorly suited to containerization are collected here for your reference, in the hope that you will be cautious when running databases in containers.

So far, containerizing a database is still hard to justify. Even so, I believe every developer has tasted the benefits of containers, and I hope a better solution will appear as the technology matures.

7 Reasons Docker Is Not Suitable for Deploying Databases

1. Data security issues

Do not store data in containers; this is one of Docker's own official usage tips. A container can be stopped or deleted at any time, and when it is removed with rm, the data inside it is lost. To avoid data loss, users can mount a data volume to store the data. However, volumes only bypass the Union FS image layers to provide persistent storage, and data safety is still not guaranteed: if the container crashes suddenly and the database does not shut down properly, the data may be corrupted. In addition, sharing volume groups among containers takes a heavy toll on the physical machine's hardware. Even storing Docker data on the host does not guarantee that no data is lost, because Docker's current storage drivers still carry a risk of unreliability.
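The volume approach described above can be sketched as follows. This is a minimal sketch, not a guarantee of data safety: the volume name mysql-data, the placeholder password, and the 60-second timeout are assumptions for illustration. By default the helper only prints the commands; set DRY_RUN=0 to actually run them.

```shell
#!/bin/sh
# Sketch only: volume name, password placeholder, and timeout are assumptions.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, execute it otherwise.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Keep the data directory on a named volume, outside the container's
# writable layer, so removing the container does not remove the data.
start_mysql() {
  run docker volume create mysql-data
  run docker run -d --name mysql \
    -v mysql-data:/var/lib/mysql \
    -e MYSQL_ROOT_PASSWORD="${MYSQL_ROOT_PASSWORD:-change-me}" \
    mysql:5.7.22
}

# Stop gracefully: give mysqld up to 60 seconds to shut down before
# Docker falls back to SIGKILL (the default wait is 10 seconds).
stop_mysql() {
  run docker stop -t 60 mysql
}

start_mysql
stop_mysql
```

A longer stop timeout helps only with orderly shutdowns; it does nothing for a container or host that crashes outright, which is the case the paragraph above warns about.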

2. Performance issues

As we all know, MySQL is a relational database with high IO requirements. When one physical machine runs multiple instances, their IO adds up, creating an IO bottleneck that greatly reduces MySQL's read and write performance. In a special session on the top 10 difficulties in applying Docker, an architect at a state-owned bank put it this way: "The performance bottleneck of a database is IO. If we follow Docker's approach, the IO requests of many containers will all end up on the same storage again. Most Internet databases today use a share-nothing architecture, which may be one reason migration to Docker is not being considered."

Some readers may already have workarounds for these performance problems:
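A rough way to see how much synchronous write throughput a data directory actually delivers is to time small flushed writes with dd. The sketch below assumes GNU dd (for oflag=dsync); the function name, block count, and probe file name are illustrative. The 16 KiB block size matches InnoDB's default page size. This is a crude probe, not a substitute for a real database benchmark.

```shell
#!/bin/sh
# Sketch only: assumes GNU dd; names and sizes are illustrative.

# Write 200 blocks of 16 KiB into the given directory, forcing each block
# to stable storage before the next one, and print dd's summary line.
sync_write_test() {
  dir="${1:-.}"
  file="$dir/.io_probe.$$"
  LC_ALL=C dd if=/dev/zero of="$file" bs=16k count=200 oflag=dsync 2>&1 | tail -n 1
  rm -f "$file"
}
```

Running `sync_write_test /usr/local/docker/mysql/data` on the host while other containers are idle, and again while they are busy, gives a first impression of how much IO the database would have to share.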

(1) Separation of database program and data

If you run MySQL in Docker, separate the database program from its data: keep the data on shared storage and put only the program in the container. If the container or the MySQL service fails, a brand-new container can then be started automatically. It is also recommended not to store the data on the host machine, because the host and the container would then share a volume group, which takes a heavy toll on the host.

(2) Run lightweight or distributed databases

When deploying lightweight or distributed databases in Docker, follow Docker's own recommendation: when a service fails, automatically start a new container rather than repeatedly restarting the failed one.

(3) Reasonable layout and application

For applications or services with high IO requirements, it is more appropriate to deploy the database on physical machines or KVM. At present, Tencent Cloud's TDSQL and Alibaba's OceanBase are deployed directly on physical machines rather than in Docker.
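Option (1) above, with data on shared storage and a replaceable container, might look like the following sketch. The NFS server address, the export path, the volume name mysql-shared, and the placeholder password are all hypothetical; whether NFS is fast enough for a given database workload is a separate question. By default the helper only prints the commands; set DRY_RUN=0 to run them.

```shell
#!/bin/sh
# Sketch only: NFS address, export path, and volume name are hypothetical.
DRY_RUN="${DRY_RUN:-1}"
NFS_SERVER="${NFS_SERVER:-nfs.example.internal}"
NFS_EXPORT="${NFS_EXPORT:-/exports/mysql}"

# Print the command in dry-run mode, execute it otherwise.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Define a volume backed by an NFS export rather than the host's disk.
create_shared_volume() {
  run docker volume create --driver local \
    --opt type=nfs \
    --opt o="addr=${NFS_SERVER},rw" \
    --opt device=":${NFS_EXPORT}" \
    mysql-shared
}

# Discard the failed container and start a brand-new one on the same data.
replace_mysql_container() {
  run docker rm -f mysql || true
  run docker run -d --name mysql \
    -v mysql-shared:/var/lib/mysql \
    -e MYSQL_ROOT_PASSWORD="${MYSQL_ROOT_PASSWORD:-change-me}" \
    mysql:5.7.22
}

create_shared_volume
replace_mysql_container
```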

3. Network problems

To understand Docker networking, you need an in-depth understanding of network virtualization. A database needs dedicated, sustained throughput to handle high load. Docker's open networking problems were still unresolved as of version 1.9. Taken together, these problems make containerized databases difficult to manage. How much time do you want to spend solving Docker networking problems? Wouldn't it be better to put the database in a dedicated environment and save that time for the business goals that really matter?

4. State

Packaging stateless services in Docker is great: it enables container orchestration and removes single points of failure. But what about the database? Placing the database in the same environment makes that environment stateful and widens the scope of system failures. The next time an application instance crashes, the database may be affected. Note: horizontal scaling in Docker works only for stateless compute services, not for databases. Docker's rapid scaling depends on statelessness, so services that hold data are not suited to running directly in Docker. If a database is installed in Docker, the storage service has to be provided separately. At present, Tencent Cloud's TDSQL (a financial-grade distributed database) and Alibaba Cloud's OceanBase (a distributed database system) both run directly on physical machines rather than on Docker, even though Docker is easier to manage.

5. Resource isolation

In terms of resource isolation, Docker falls short of KVM virtual machines. Docker uses cgroups to enforce resource limits, which can only cap a container's maximum resource consumption; they cannot stop other programs from taking resources the container needs. If other applications on the physical machine consume excessive resources, MySQL's read and write performance in the container suffers. The more isolation you need, the more resource overhead you pay. Compared with a dedicated environment, easy horizontal scaling is one of Docker's major advantages. However, in Docker horizontal scaling works only for stateless compute services, not for databases. Containers provide no isolation that benefits the database, so why put it in one?

6. Poor fit with cloud platforms

Most people start projects on a shared cloud. The cloud hides the complexity of operating and replacing virtual machines, so there is no need to test a new hardware environment at night or on weekends when nobody is working. When we can start an instance quickly, why should we worry about the environment it runs in? That is why we pay cloud providers substantial fees. Once we put a database container on the instance, these conveniences disappear: because the data does not match, a new instance is not compatible with the old one. If the instance is going to be restricted to a single-machine service anyway, the DB should run in a non-containerized environment. We only need to keep elastic scaling for the compute service layer.
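The cgroup limits mentioned above are exposed as docker run flags. A minimal sketch follows; the specific values, the volume name mysql-data, and the placeholder password are assumptions for illustration. These flags cap what this container may consume; they do not stop other processes on the same host from competing for the same disks and memory bandwidth. By default the helper only prints the command; set DRY_RUN=0 to run it.

```shell
#!/bin/sh
# Sketch only: limit values and volume name are assumptions.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, execute it otherwise.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Start MySQL with cgroup-based caps on memory and CPU.
#   --memory              hard memory limit for the container
#   --memory-reservation  soft limit applied when the host is under pressure
#   --cpus                maximum CPU time, in units of cores
#   --cpuset-cpus         restrict the container to the listed cores
start_mysql_limited() {
  run docker run -d --name mysql \
    --memory 4g \
    --memory-reservation 2g \
    --cpus 2 \
    --cpuset-cpus 0,1 \
    -v mysql-data:/var/lib/mysql \
    -e MYSQL_ROOT_PASSWORD="${MYSQL_ROOT_PASSWORD:-change-me}" \
    mysql:5.7.22
}

start_mysql_limited
```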

7. Environmental requirements for running the database

It is common to see a DBMS container running on the same host as other services, yet the hardware requirements of these services are very different. Databases (especially relational databases) have high IO requirements. Generally, a database engine uses a dedicated environment to avoid contention for resources. If you put your database in a container, you waste your project's resources, because you have to provision a lot of extra resources for the instance. In the public cloud, when you need 34 GB of memory, the instance you start must have 64 GB. In practice, these resources are not fully used. How can this be solved? You can design in tiers and use fixed resources to start multiple instances at different tiers. Horizontal scaling is always better than vertical scaling.

Summary

Does all of the above mean that databases should never be deployed in containers? The answer is: no.

For businesses that are not sensitive to data loss (such as search and event tracking), the database can be containerized, using database sharding to increase the number of instances and thereby the throughput.

Docker is suitable for running lightweight or distributed databases. When a Docker service fails, a new container is started automatically instead of repeatedly restarting the failed one. A database that uses middleware and a containerization system to scale automatically, recover from disasters, fail over, and run on multiple nodes can also be containerized.

Supplement: Deploying a Database in Docker

How to Deploy a Database in Docker

cd /usr/local/docker/tomcat/

docker pull mysql:5.7.22


docker run -p 3306:3306 --name mysql \
-v /usr/local/docker/mysql/conf:/etc/mysql \
-v /usr/local/docker/mysql/logs:/var/log/mysql \
-v /usr/local/docker/mysql/data:/var/lib/mysql \
-e MYSQL_ROOT_PASSWORD=123456 \
-d mysql:5.7.22

cd /usr/local/docker/mysql/

docker run -it --rm mysql:5.7.22 bash
ls -al
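
After the docker run step above, the server needs some time to initialize before it accepts connections. A minimal sketch of a readiness check follows; the retry helper and its arguments are illustrative and not part of the original steps.

```shell
#!/bin/sh
# Sketch only: a generic retry helper, hypothetical and not from the original.

# retry ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Run COMMAND until it succeeds or ATTEMPTS is exhausted.
# Returns 0 on the first success, 1 if every attempt failed.
retry() {
  attempts="$1"
  delay="$2"
  shift 2
  n=1
  while :; do
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    if [ "$n" -ge "$attempts" ]; then
      return 1
    fi
    n=$((n + 1))
    sleep "$delay"
  done
}
```

For example, `retry 30 2 docker exec mysql mysqladmin -h 127.0.0.1 ping --silent` polls the container named mysql over TCP every 2 seconds, up to 30 times. mysqladmin ping reports success as soon as the server answers, even without valid credentials, so this shows only that the server is up, not that the data is intact.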