A deep understanding of C and C++ memory alignment

  • 2020-04-02 02:09:50
  • OfStack

To improve program performance, data structures (especially stacks) should be aligned on natural boundaries as much as possible. The reason is that the processor may need two memory accesses to read unaligned memory, whereas an aligned access needs only one.
Memory alignment serves both efficiency (the CPU accesses memory faster, so the program runs faster) and correctness (under some conditions, unaligned data causes data to get out of sync). It depends on the CPU, the platform, and the compiler. Some CPUs are strict about it (this wording is not exact, but the requirements do differ from CPU to CPU), some platforms have already optimized the alignment problem away, and different compilers use different alignment moduli. On the whole, memory alignment is the compiler's job.

In general you do not need to worry about memory alignment; that is the compiler's business. You still need to understand the concept when you run into certain problems, though. After all, C and C++ are languages that operate on memory directly, so you need to understand how a program is laid out and runs in memory.

The bottom line is: don't let your code depend on memory alignment.


1. Reason: why memory alignment is required.
1. Platform reason (portability): not every hardware platform can access any data at any address. Some platforms can only fetch certain types of data at certain addresses and raise a hardware exception otherwise.

2. Performance reason: data structures (especially stacks) should be aligned on natural boundaries as much as possible. To access unaligned memory the processor may need two memory accesses, whereas an aligned access needs only one.


2. Memory alignment rules and examples
Before we talk about memory alignment, let's look at the sizes of the various types, which depend on the compiler and the machine word length.
For details, see this post: http://blog.csdn.net/lyl0625/article/details/7350045
Allocation rule for members: starting from the first address of the structure, each member is placed at the first address x that satisfies x % N = 0, where N is that member's alignment parameter. The length of the whole structure must be the smallest integer multiple of the largest alignment parameter used by any member that is still large enough to hold all members.
The largest alignment parameter N among all members of the structure is called the alignment parameter of the structure.

1. Data member alignment rule: for the data members of a struct (or union), the first member is placed at offset 0. Each later member is aligned to the smaller of the value specified by #pragma pack (or the default value) and the length of the member's own type. It is placed at the first address after the previous member that is divisible by this alignment value.
2. Overall alignment rule for the struct (or union): after the data members have been aligned, the struct (or union) itself must be aligned. In practice this decides whether padding bytes are appended after the last member. The struct is aligned to the smaller of the value specified by #pragma pack (or the default value) and the length of its largest data member type, and its total size is rounded up to a multiple of that value.
3. Corollary of rules 1 and 2: when the n given to #pragma pack equals or exceeds the length of every data member type, the value of n has no effect.
Two things need special attention: arrays and nested structures.
Array:
The alignment value is min(length of the array element type, specified alignment length). The elements of the array are stored contiguously, so the array occupies its actual length.
For example, char t[9] has an alignment length of 1 and occupies 9 consecutive bytes. The alignment length of the next member then determines how many padding bytes are inserted before it.
Nested structures:
Suppose we have
struct A
{
    ...
    struct B b;
    ...
};
For the member b, the alignment length inside A is min(alignment length of structure B, specified alignment length).
The alignment length of structure B is the one given by rule 2 above, the overall alignment rule for structures.

Example:
In VC++ 6.0 the default n is 8 bytes. You can change the alignment parameter in the compiler settings,
or control it with the directive #pragma pack(xx).

1. Basic examples


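#include <stdio.h>

#pragma pack(n)   // n is replaced by 1, 2, 4, 8 or 16 in the cases below
struct A
{
    char   c;  // 1 byte
    double d;  // 8 bytes
    short  s;  // 2 bytes
    int    i;  // 4 bytes
};

int main(int argc, char* argv[])
{
    A strua;
    printf("len:%d\n", (int)sizeof(A));
    // Print the address of each member in decimal.
    printf("%lu,%lu,%lu,%lu\n",
           (unsigned long)&strua.c, (unsigned long)&strua.d,
           (unsigned long)&strua.s, (unsigned long)&strua.i);
    return 0;
}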

1) When n is set to 8 bytes
Result: len:24
1245032,1245040,1245048,1245052
The members are laid out in memory as follows:
strua.c is placed at 1245032, an address that is an integer multiple of 8 (why? Think about it first; it becomes clear after reading on). strua.d is allocated next. A double is 8 bytes, so N = min(8, 8) = 8 and d is 8-byte aligned: searching forward from strua.c, the first address divisible by 8 is 1245032 + 8 = 1245040. strua.s is 2 bytes, which is less than 8, so N = min(2, 8) = 2 and s is 2-byte aligned: the address right after strua.d is 1245048, which is divisible by 2, so strua.s follows immediately. strua.i is an int of 4 bytes, less than the specified 8 bytes, so N = min(4, 8) = 4: the first address after strua.s that is divisible by 4 is 1245048 + 4 = 1245052, so strua.i is placed there and the gap in between is padding. Finally, the largest N among all members is 8, so the length of the whole structure is the smallest multiple of 8 that holds everything, namely 24 bytes, and the remaining bytes are padding.
So the alignment parameter of this structure is 8 bytes.
2) When the alignment parameter n is set to 16 bytes, the result is the same as above, so it is not analyzed again.
3) When the alignment parameter is set to 4 bytes
Result: len:20
1245036,1245040,1245048,1245052
The members are laid out in memory as follows:
strua.c starts at 1245036, an address that is an integer multiple of 4. strua.d is allocated next. Its length is 8 bytes, greater than the 4-byte alignment parameter, so N = min(8, 4) = 4: searching forward, the first address divisible by 4 is 1245036 + 4 = 1245040, which becomes the address of strua.d. strua.s is 2 bytes, less than 4, so N = min(2, 4) = 2: the address right after strua.d is 1245048, which is divisible by 2, so strua.s follows immediately.
strua.i is 4 bytes, so N = min(4, 4) = 4. Searching forward from strua.s, the first address divisible by 4 is 1245048 + 4 = 1245052, and strua.i is placed there. The largest N is 4, so the length of the whole structure is the smallest sufficient multiple of 4, which is 20 bytes.
4) When the alignment parameter is set to 2 bytes
Result: len:16
1245040,1245042,1245050,1245052
After strua.c is allocated, strua.d is stored at the next address divisible by 2, and so on.
5) 1-byte alignment:
Result: len:15
1245040,1245041,1245049,1245051
Here N = min(sizeof(member), 1) = 1 for every member. Since every integer is divisible by 1, the members are allocated one after another with no gaps.
6) When a structure member is an array, the array is not treated as a single member for alignment purposes; each element is allocated as if it were a member of its own. The other rules are unchanged. For example, change the structure above to:
struct A
{
    char   c;        // 1 byte
    double d;        // 8 bytes
    short  s;        // 2 bytes
    char   szBuf[5];
};
With the alignment parameter set to 8 bytes, the result is:
len:24
1245032,1245040,1245048,1245050
After strua.s is allocated, the array strua.szBuf is allocated. Each element of szBuf is allocated separately. Since the element type is char, N = min(1, 8) = 1, so the elements of szBuf are placed one after another without gaps. The members end at offset 23, and the total is rounded up to 24, the next multiple of 8.

After working through the examples above, the basic allocation rules and methods should be clear.
The most important thing is to write programs of your own and check the results.

2. Arrays and nested structures
Test environment: 64-bit Ubuntu; g++ (Ubuntu/Linaro 4.6.3-1ubuntu5) 4.6.3


#include <iostream>
#include <cstdio>
using namespace std;
#pragma pack(8)
struct Args
{
        char ch; 
        double d;
        short st; 
        char rs[9];
        int i;
} args;
struct Argsa
{
        char ch; 
        Args test;
        char jd[10];
        int i;
}arga;
int main()
{
        // cout << sizeof(char) << " " << sizeof(double) << " " << sizeof(short) << " " << sizeof(int) << endl;
        // cout << sizeof(long) << " " << sizeof(long long) << " " << sizeof(float) << endl;
        cout << "Args:" << sizeof(args) << endl;
        cout << "" << (unsigned long)&args.i - (unsigned long)&args.rs << endl;
        cout << "Argsa:" << sizeof(arga) << endl;
        cout << "Argsa:" << (unsigned long)&arga.i - (unsigned long)&arga.jd << endl;
        cout << "Argsa:" << (unsigned long)&arga.jd - (unsigned long)&arga.test << endl;
        return 0;
}

Output:
Args:32
10
Argsa:56
Argsa:12
Argsa:32

The length of struct Args is 32 and the length of struct Argsa is 56.
If I change it to #pragma pack(16), I get the same result.
This example demonstrates three things:
When the specified alignment length exceeds the length of the longest type in the struct, the setting has no effect.
An array is aligned according to the length of its element type.
A structure nested inside another structure is aligned according to its own alignment length.

3. Pointers. The results differ mainly because 32-bit and 64-bit machines use different address widths.
Test environment: same as in 2 (64-bit system).


#include <iostream>
#include <cstdio>
#pragma pack(4)
using namespace std;
struct Args
{
        int i;
        double d;
        char *p; 
        char ch; 
        int *pi;
}args;
int main()
{    
        cout<<"args length:"<<sizeof(args)<<endl;
        cout<<(unsigned long)&args.ch-(unsigned long)&args.p<<endl;
        cout<<(unsigned long)&args.pi-(unsigned long)&args.ch<<endl;
        return 0;
}

With pack set to 4:
args length:32
8
4

With pack set to 8:
args length:40
8
8
After reading the above, you should be able to work out why these are the results.

3. Memory alignment in different compilers
VC 6.0 defaults to 8 bytes.

GCC defaults to 8 bytes. Tested version: gcc (Ubuntu/Linaro 4.6.3-1ubuntu5) 4.6.3
G++ defaults to 8 bytes. Tested version: g++ (Ubuntu/Linaro 4.6.3-1ubuntu5) 4.6.3
According to the reference material, GCC defaults to 4 and does not support setting the parameter with a pragma. In my tests, however, GCC aligned to 8 bytes by default and did support the pragma parameter.
Two different examples were tested, and the results were the same.

4. When the memory alignment needs to be set explicitly
In general there is no need to change the compiler's memory alignment rules, because doing so can degrade program performance. The two exceptions are:

(1) the structure needs to be written directly to a file;

(2) the structure needs to be sent to another program over the network.

