C++ copy-on-write: implementation principle and example analysis

2020-10-07 18:46:56
OfStack

1. What is copy-on-write

Copy-on-write (COW) is an optimization strategy in computer programming. Its core idea is that when multiple callers request the same resource at the same time (such as data in memory or on disk), they all receive a pointer to the same shared resource. Only when one caller tries to modify the contents of the resource does the system actually create a private copy for that caller, while the resource seen by the other callers remains unchanged.
This process is transparent to the other callers.

The main advantage of this approach is that no copy is ever created if no caller modifies the resource, so multiple callers can share the same resource as long as they only read it.
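The idea described above can be sketched in a few lines of C++. This is only a minimal illustration, not the article's implementation: the class name CowValue and its members read(), write() and shares_with() are hypothetical, and the check based on std::shared_ptr::use_count() is only reliable in single-threaded code.

```cpp
#include <memory>
#include <string>
#include <utility>

// Hypothetical helper (not from the article): a value wrapper that shares
// its payload between copies and clones it only right before a write.
// Assumption: single-threaded use, because use_count() is not a safe
// synchronization point when several threads hold copies.
template <typename T>
class CowValue {
public:
    explicit CowValue(T value)
        : data_(std::make_shared<T>(std::move(value))) {}

    // Read access never copies the payload.
    const T& read() const { return *data_; }

    // Write access clones the payload first if another wrapper still shares it.
    T& write() {
        if (data_.use_count() > 1) {
            data_ = std::make_shared<T>(*data_);
        }
        return *data_;
    }

    // True when both wrappers point at the same underlying object.
    bool shares_with(const CowValue& other) const {
        return data_ == other.data_;
    }

private:
    std::shared_ptr<T> data_;
};
```

Copying a CowValue only copies the shared pointer, so two copies report shares_with() as true until one of them calls write(). After that call the writer owns a private payload and the other copy still reads the original value.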

Copy-on-write is an important optimization technique. Its essence is lazy handling of resource requests: at first the resource is simply shared among all the entities that requested it.

A private copy is actually allocated to an entity only when that entity needs to modify the resource. Copy-on-write has both advantages and disadvantages:

1. Copy-on-write reduces the immediate delay caused by allocating and copying a large amount of resources at once, but in effect it shifts that delay to subsequent operations.

2. Copy-on-write reduces unnecessary resource allocation. For example, when a process is forked, not every page needs to be copied: the parent process's code segment and read-only data segment cannot be modified, so they never need to be copied.

2. Application of copy-on-write technology

1. Copy-on-write in virtual memory management

Typically, pages that are shared in this way are marked read-only. When a task attempts to write to such a page, the memory management unit (MMU) raises an exception (a page fault). The kernel handles the exception by allocating a new piece of physical memory for the task and copying the data into it.

The write is then re-issued to the MMU, and this time it succeeds for that task.

For example, Linux's fork() uses copy-on-write pages when it creates a new process. Copy-on-write is a technique that can delay or even avoid copying data. At first the kernel does not duplicate the entire address space; instead it lets the parent and child share it. The address space is copied only when it is written to, and from that point parent and child each have their own copy. In other words, resources are copied only when a write requires it; until then they are shared read-only with the parent process. In scenarios where the pages are never written, for example when exec() is called immediately after fork(), no copy of the address space is needed. The actual overhead of fork() is then just copying the parent's page tables and creating a process descriptor for the child. Content is copied for the child only when a segment of the process's memory actually changes, which greatly improves efficiency.
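The visible effect of this behaviour can be checked with a small POSIX program. The sketch below assumes a Linux or other POSIX system; the names ForkResult and fork_and_modify() are made up for this example. Note that a program can only observe that parent and child end up with separate values. Whether the kernel copied the page lazily or eagerly is not visible from user space.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Illustrative helper (hypothetical name): forks, lets the child overwrite
// a variable, and reports what each process ended up seeing.
struct ForkResult {
    int parent_value;  // value in the parent after the child has exited
    int child_value;   // value the child reported through its exit status
};

inline ForkResult fork_and_modify() {
    int value = 42;    // after fork() both processes start with this value
    ForkResult result{value, -1};

    pid_t pid = fork();
    if (pid < 0) {
        return result;             // fork failed; child_value stays -1
    }
    if (pid == 0) {
        // Child: on a copy-on-write kernel, a write to a page that is still
        // shared makes the kernel give this process a private copy first.
        value = 99;
        _exit(value);              // report the value (must fit in 0..255)
    }

    // Parent: wait for the child and read its exit status.
    int status = 0;
    waitpid(pid, &status, 0);
    if (WIFEXITED(status)) {
        result.child_value = WEXITSTATUS(status);
    }
    result.parent_value = value;   // not affected by the child's write
    return result;
}
```

Calling fork_and_modify() gives parent_value 42 and child_value 99 when fork() succeeds: the child's write never reaches the parent's memory.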

2. Copy-on-write in data storage

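File I/O on Linux and similar systems uses a related lazy strategy. Strictly speaking it is write buffering rather than copying, but the idea of deferring expensive work until it is really needed is the same.

For example, suppose we have a program that continually writes to a file based on data arriving from the network. It would be a huge performance loss if every fwrite or fprintf had to perform a disk I/O operation.

So the common practice is to write the data first into a memory block of a fixed size (a buffer, or disk cache) and to write it to disk only when the buffer fills up, is flushed explicitly, or the file is closed. This is why written data may be lost if the program terminates abnormally before the file is closed.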
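The buffering described above can be observed with the C standard I/O functions. This is a rough sketch under stated assumptions: file_size(), BufferDemo and buffered_write_demo() are hypothetical names, the 4096-byte buffer size is arbitrary, and the size seen before the flush depends on the C library (it is usually 0 for such a small write, but this is not guaranteed).

```cpp
#include <cstdio>
#include <fstream>
#include <string>

// Hypothetical helper: size of a file as seen by a fresh reader, or -1.
inline long file_size(const std::string& path) {
    std::ifstream in(path, std::ios::binary | std::ios::ate);
    return in ? static_cast<long>(in.tellg()) : -1;
}

struct BufferDemo {
    long size_before_flush;  // usually 0, depends on the C library
    long size_after_flush;   // 5 once the buffer has been flushed
};

inline BufferDemo buffered_write_demo(const std::string& path) {
    static char buffer[4096];
    std::FILE* fp = std::fopen(path.c_str(), "wb");
    if (!fp) {
        return {-1, -1};
    }
    // Request full buffering before any other operation on the stream.
    std::setvbuf(fp, buffer, _IOFBF, sizeof buffer);

    std::fputs("hello", fp);        // stored in the user-space buffer
    long before = file_size(path);  // the bytes have usually not left the buffer

    std::fflush(fp);                // hand the buffered bytes to the OS
    long after = file_size(path);   // the file now holds the 5 bytes

    std::fclose(fp);
    std::remove(path.c_str());      // clean up the temporary file
    return {before, after};
}
```

If the process were killed between fputs() and fflush(), the five bytes would still be sitting in the process's own buffer and would never reach the file.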

3. Copy-on-write in software applications

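The string class of the C++ standard library (STL), which we use all the time, is a well-known example. To improve performance, some pre-C++11 implementations of std::string, such as the one in older versions of GCC's libstdc++, used copy-on-write. The C++11 standard, however, effectively rules this strategy out for std::string, in part because it behaves poorly in multithreaded code. The simplified String class below illustrates the technique: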
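Whether a given standard library shares the buffer between a string and its copy can be probed with a small check. This is only a sketch: copy_shares_buffer() and modify_copy() are hypothetical helper names, and the result of the first function depends on the library and ABI in use. A C++11-conforming std::string is expected to report false, while an old copy-on-write implementation would report true.

```cpp
#include <string>

// Hypothetical helper: reports whether a fresh copy of a std::string still
// points at the same character buffer as its source. Both strings are const
// so that data() cannot trigger an unsharing copy on a COW implementation.
inline bool copy_shares_buffer(const std::string& text) {
    const std::string original = text;
    const std::string copy = original;
    return original.data() == copy.data();
}

// Hypothetical helper: copies a string and overwrites the first character
// of the copy. The original is never affected, with or without COW.
inline std::string modify_copy(const std::string& original) {
    std::string copy = original;
    if (!copy.empty()) {
        copy[0] = '#';
    }
    return copy;
}
```

Either way the observable values are the same: modifying the copy never changes the original. Copy-on-write only changes when the character data is actually duplicated.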


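#include <cstddef>
#include <cstring>

class String
{
public:
  // Constructor: allocate a new buffer
  // (characters + terminating '\0' + 1 extra byte for the reference count)
  String(const char* tmp = "")
  {
    _Len = std::strlen(tmp);
    _Ptr = new char[_Len + 1 + 1];
    std::strcpy(_Ptr, tmp);
    // The reference count is stored at the end of the array.
    // It starts at 1 because this object owns the buffer.
    // Note: a char counter limits the design to 127 sharing objects.
    _Ptr[_Len + 1] = 1;
  }

  // Destructor: drop this object's reference
  ~String()
  {
    release();
  }

  // Copy constructor (shared memory): no characters are copied,
  // both objects point at the same buffer
  String(const String& str)
    : _Len(str._Len), _Ptr(str._Ptr)
  {
    _Ptr[_Len + 1]++; // reference count plus 1
  }

  // Copy assignment (shared memory)
  String& operator=(const String& str)
  {
    if (_Ptr != str._Ptr)
    {
      release();
      _Len = str._Len;
      _Ptr = str._Ptr;
      _Ptr[_Len + 1]++; // reference count plus 1
    }
    return *this;
  }

  // operator[] may be used to write to the string,
  // so the copy is made here, when writing starts
  char& operator[](unsigned int idx)
  {
    if (idx >= _Len)
    {
      static char nullchar = 0;
      nullchar = 0;
      return nullchar;
    }
    // Copy only if the buffer is still shared with another object
    if (_Ptr[_Len + 1] > 1)
    {
      // Reference count of the old shared buffer minus 1
      _Ptr[_Len + 1]--;

      char* tmp = new char[_Len + 1 + 1];
      std::memcpy(tmp, _Ptr, _Len + 1);
      _Ptr = tmp;
      // Set the reference count of the new private buffer
      _Ptr[_Len + 1] = 1;
    }
    return _Ptr[idx];
  }

  // Read-only access to the characters (never copies)
  const char* c_str() const { return _Ptr; }

private:
  // Reference count minus 1; free the memory when it reaches 0
  void release()
  {
    if (--_Ptr[_Len + 1] == 0)
    {
      delete[] _Ptr;
    }
  }

  std::size_t _Len;
  char* _Ptr;
};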