Details on byte alignment in C++ memory

  • 2020-04-01 23:36:28
  • OfStack

I. What is byte alignment
Memory in a computer is divided into bytes. In theory a variable of any type could be accessed at any address, but in practice a variable of a given type can often only be accessed at particular addresses. Data of each type therefore has to be laid out in memory according to certain rules, rather than simply placed one item after another. This is alignment.

II. Purpose of and reasons for alignment:
1. Platform reasons (portability): not every hardware platform can access any data at any address. Some platforms can only fetch certain types of data at certain addresses and raise a hardware exception otherwise. For example, on architectures where the CPU faults when it accesses a misaligned variable, programs must guarantee byte alignment.

2. Performance reasons: if data is not aligned as the platform requires, access efficiency suffers. Data structures (especially stacks) should be aligned on natural boundaries wherever possible, because the processor may need two memory accesses to read unaligned data, whereas an aligned access needs only one. For example, some platforms always start a read at an even address. If an int (32 bits on a 32-bit system) is stored at an even address, one read cycle fetches all 32 bits; if it is stored at an odd address, two read cycles are needed, and the high and low parts of the two results must be combined to obtain the 32-bit value. Read efficiency clearly drops.

III. Alignment rules
Each compiler on a particular platform has its own default "alignment factor" (also called the alignment modulus). Programmers can change this factor with the preprocessor directive #pragma pack(n), n = 1, 2, 4, 8, 16, where n is the alignment factor you specify.
Rules:
1. Data member alignment: for the data members of a struct (or union), the first data member is placed at offset 0, and each subsequent data member is aligned to the smaller of the value specified by #pragma pack and the size of the data member itself.
2. Overall alignment of the struct (or union): after the data members have been aligned individually, the struct (or union) itself is aligned to the smaller of the value specified by #pragma pack and the size of its largest data member.
3. When the value n of #pragma pack is greater than or equal to the size of every data member, the value of n has no effect.

IV. Consider the following structure:
struct MyStruct
{
  double dda1;
  char dda;
  int type;
};
What do you get if you apply sizeof to the structure MyStruct? You might compute sizeof(MyStruct) = sizeof(double) + sizeof(char) + sizeof(int) = 13.
But if you test the size of this structure in VC, you will find that sizeof(MyStruct) is 16. Why does VC give this result?
This is the result of VC's special handling of variable storage. To speed up the CPU's memory accesses, VC aligns the starting addresses of variables. By default, VC requires the offset of each member variable from the start of the structure to be a multiple of the number of bytes occupied by the variable's type. The common types are aligned as follows (VC6.0, 32-bit system).

Type: alignment (offset of the variable's starting address from the start of the structure)
  char: the offset must be a multiple of sizeof(char), that is, 1
  int: the offset must be a multiple of sizeof(int), that is, 4
  float: the offset must be a multiple of sizeof(float), that is, 4
  double: the offset must be a multiple of sizeof(double), that is, 8
  short: the offset must be a multiple of sizeof(short), that is, 2

Member variables are allocated space one after another in the order in which they appear in the structure, their positions are adjusted according to the alignment above, and VC automatically fills the gaps with padding bytes. VC also ensures that the size of the structure is a multiple of the structure's byte boundary (that is, the number of bytes occupied by the largest type in the structure), so after space has been allocated for the last member variable, padding bytes are added as needed.

Let's use the previous example to show how VC actually stores the structure.
struct MyStruct
{
  double dda1;
  char dda;
  int type;
};
When allocating space for this structure, VC first allocates space for the first member, dda1, following the order and alignment of the member variables. Its starting address is the same as that of the structure (offset 0 is a multiple of sizeof(double)), and it occupies sizeof(double) = 8 bytes. Next comes the second member, dda. The next available address is at offset 8 from the start of the structure, which is a multiple of sizeof(char) = 1, so storing dda at offset 8 satisfies the alignment; it occupies sizeof(char) = 1 byte. Then comes the third member, type. The next available address is at offset 9, which is not a multiple of sizeof(int) = 4, so to satisfy the alignment constraint VC automatically adds 3 padding bytes (which hold nothing). The next available offset is then 12, a multiple of sizeof(int) = 4, so type is stored at offset 12 and occupies sizeof(int) = 4 bytes. All member variables have now been allocated, and the total space used is 8 + 1 + 3 + 4 = 16 bytes, which is a multiple of the structure's byte boundary (the size of the largest type in the structure, sizeof(double) = 8), so no trailing padding is needed. The size of the whole structure is therefore sizeof(MyStruct) = 8 + 1 + 3 + 4 = 16, of which 3 bytes are padding added by VC that hold nothing meaningful.

As another example, swap the positions of the member variables of MyStruct so that it becomes the following:
struct MyStruct
{
  char dda;
  double dda1;
  int type;
};

How much space does this structure take up? In the VC6.0 environment, sizeof(MyStruct) is 24. Using the allocation principles described above, here is a brief analysis of how VC allocates space for this structure.
struct MyStruct
{
  char dda;     // offset 0 satisfies the alignment; dda occupies 1 byte.
  double dda1;  // the next available offset is 1, which is not a multiple of sizeof(double) = 8, so VC automatically adds 7 padding bytes to bring the offset to 8; dda1 is stored at offset 8 and occupies 8 bytes.
  int type;     // the next available offset is 16, a multiple of sizeof(int) = 4, so no padding is needed; type is stored at offset 16 and occupies 4 bytes.
};
All member variables have now been allocated, and the total space used is 1 + 7 + 8 + 4 = 20 bytes. This is not a multiple of the structure's byte boundary (the size of the largest type in the structure, sizeof(double) = 8), so 4 more padding bytes are added to make the size of the structure a multiple of 8. The total size of the structure is therefore sizeof(MyStruct) = 1 + 7 + 8 + 4 + 4 = 24, of which 7 + 4 = 11 bytes are padding added by VC that hold nothing meaningful.

VC's special handling of structure storage does speed up the CPU's access to variables, but it sometimes causes trouble. The default alignment can be overridden: VC provides #pragma pack(n) to set the alignment to n bytes. With n-byte alignment, the offset of a variable's starting address falls into two cases: first, if n is greater than or equal to the number of bytes occupied by the variable, the offset must satisfy the default alignment; second, if n is less than the number of bytes occupied by the variable's type, the offset must be a multiple of n and need not satisfy the default alignment. The total size of the structure is constrained in a similar way: if n is greater than or equal to the number of bytes occupied by every member variable's type, the total size must be a multiple of the size of the largest member; otherwise it must be a multiple of n.

The following example illustrates its usage.
#pragma pack(push) // save the current alignment
#pragma pack(4)    // set 4-byte alignment
struct test
{
  char m1;
  double m4;
  int m3;
};
#pragma pack(pop)  // restore the saved alignment
The size of this structure is 16. Here is how it is stored. First, space is allocated for m1 at offset 0, which satisfies the alignment we set (4-byte alignment); m1 occupies 1 byte. Next comes m4. The next available offset is 1, so 3 padding bytes are needed to make the offset a multiple of n = 4 (because sizeof(double) is greater than n); m4 is stored at offset 4 and occupies 8 bytes. Then comes m3. The offset is now 12, a multiple of 4, and m3 occupies 4 bytes. All member variables have now been allocated, 16 bytes in total, which is a multiple of n. If #pragma pack(4) above is changed to #pragma pack(16), the size of the structure becomes 24. (Work through this one yourself.)

Consider the following example:

#pragma pack(push) // save the current alignment, so that the pop below has something to restore
#pragma pack(8)
struct S1
{
char a;
long b;
};
struct S2
{ 
char c;
struct S1 d;
long long e;
};
#pragma pack ( pop ) 

sizeof(S2) is 24 (in VC, where long is 4 bytes). An important point about member alignment is that each member is aligned separately, in its own way. Although 8-byte alignment is specified above, not every member is aligned to 8 bytes. The rule is that each member is aligned to the smaller of its type's own alignment parameter (usually the size of the type) and the specified alignment parameter (8 bytes here). In addition, the length of the structure must be an integer multiple of every alignment parameter used.

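In S1, member a is 1 byte and its default alignment is 1; the specified alignment parameter is 8, so the smaller value applies and a is 1-byte aligned. Member b is 4 bytes and its default alignment is 4, so it is 4-byte aligned. sizeof(S1) is therefore 8.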

In S2, c is 1-byte aligned, just like a in S1. d is a structure of 8 bytes; how is it aligned? The default alignment of a structure is the largest of the alignment parameters used by its members, which for S1 is 4, so member d is 4-byte aligned and is placed at offset 4. Member e is 8 bytes and its default alignment is 8, the same as the specified value, so it must be placed on an 8-byte boundary. At this point 12 bytes have been used, so 4 padding bytes are added and e is placed starting at offset 16. The length is then 24, which is divisible by 8 (the alignment of member e). In total, 24 bytes are used.
                      a     b
Memory layout of S1:  1***, 1111
                      c     S1.a  S1.b  pad   e
Memory layout of S2:  1***, 1***, 1111, ****, 11111111
(1 = byte in use, * = padding byte)

Three points are important here:
1. Each member is aligned in its own way, which keeps the length to a minimum.
2. The default alignment of a complex type (such as a structure) is the alignment of its longest member, so that the length is also kept to a minimum when a member is a complex type.
3. The aligned length must be an integer multiple of the largest alignment parameter among the members, so that every element of an array is aligned on its boundary.

VI. sizeof examples (note: the following examples have been tested and verified)


#include <cstddef>
#include <ctime>
#include <iostream>

// The sizes in the comments are from a 32-bit build; pointers, long and
// class B can be larger on other platforms.
struct teststruct {};
class testclass {};

class A
{
    char c;
    int val;
    short sh;
};

class B
{
public:
    char c;
    int val;
    short sh;
    void func1(void) {}
    virtual void func2(void) {} // a virtual function typically adds a hidden pointer to the object
};

int main()
{
    std::cout << "void* size " << sizeof(void*) << std::endl; // 4
    std::cout << "char size " << sizeof(char) << std::endl; // 1
    std::cout << "unsigned char size " << sizeof(unsigned char) << std::endl; // 1
    std::cout << "short size " << sizeof(short) << std::endl; // 2
    std::cout << "int size " << sizeof(int) << std::endl; // 4
    std::cout << "unsigned int size " << sizeof(unsigned int) << std::endl; // 4
    std::cout << "long size " << sizeof(long) << std::endl; // 4
    std::cout << "long int size " << sizeof(long int) << std::endl; // 4
    std::cout << "long long size " << sizeof(long long) << std::endl; // 8
    std::cout << "float size " << sizeof(float) << std::endl; // 4
    std::cout << "double size " << sizeof(double) << std::endl; // 8
    std::cout << "time_t size " << sizeof(time_t) << std::endl; // 8
    char bufc[32];
    std::cout << "bufc size " << sizeof(bufc) << std::endl; // 32 (the whole array)

    // Empty types still occupy 1 byte so that each object has its own address.
    std::cout << "struct size " << sizeof(teststruct) << std::endl; // 1
    std::cout << "class size " << sizeof(testclass) << std::endl; // 1

    std::cout << "class size A " << sizeof(A) << std::endl; // 12
    std::cout << "class size B " << sizeof(B) << std::endl; // 16

    char* p = NULL;
    p = new char[100];
    std::cout << "size p " << sizeof(p) << std::endl; // 4 (the pointer, not the array)
    delete[] p; // release the array allocated above
    return 0;
}

VII. Other
1. When writing code, you can use #pragma pack(n), n = 1, 2, 4, 8, 16, to control the memory alignment factor flexibly. To remove padding altogether, use #pragma pack(1); #pragma pack() with no argument restores the compiler's default alignment.
2. Points to note
Memory alignment can greatly improve the CPU's access speed, but it is not always wanted, and unexpected errors can occur if you are not careful. The most typical case is network communication code: be sure to remove padding with #pragma pack(1) before defining a struct or union that is sent over the network. The remote host usually does not know which alignment the other side uses, and the byte stream received through the socket is parsed byte by byte, so if padding is present the remote host is likely to get the wrong result. I have run into this in practice; it is a fairly well-hidden error, and it took a long time of debugging to trace the problem to this point.
3. Optimizing structures
Although memory alignment improves performance, it wastes memory. On modern PCs this small amount of space is usually not a concern, but on embedded devices with little memory it may matter. In practice, the order of the members can often be adjusted to reduce the waste caused by padding without affecting functionality.

