C memory alignment example details

  • 2020-04-02 02:42:25
  • OfStack

This article describes the concept and use of memory alignment in the C programming language in detail. It is shared here for your reference. The details are as follows:

I. Basic concepts of byte alignment

Memory in modern computers is divided into bytes. In theory, a variable of any type could be accessed starting at any address. In practice, variables of a particular type are often accessed only at particular memory addresses, which requires data of each type to be arranged in memory according to certain rules rather than simply placed one after another. This is alignment.

What alignment does and why: the way storage is handled varies greatly from one hardware platform to another. Some platforms can only access certain types of data from certain addresses. For example, on some CPU architectures, accessing a misaligned variable raises an error, so code for those architectures must guarantee byte alignment. Other platforms do not fault, but most commonly, data that is not aligned as the platform requires is accessed less efficiently. For example, some platforms always start a read at an even address. If an int (assumed to be 32 bits) is stored starting at an even address, a single read cycle fetches all 32 bits. If it starts at an odd address, two read cycles are needed, and the high and low bytes of the two results must be combined to obtain the 32-bit value. Read efficiency obviously drops considerably.

Look at the structure below:


struct struct1 
{ 
  double dda; 
  char cda; 
  int ida; 
}; 
sizeof(struct1) = ?

Wrong method:


sizeof(struct1)=sizeof(double)+sizeof(char)+sizeof(int)=13

But when you run the following test code:


#include<stdio.h>
struct mystruct
{
  double dda;
  char cda;
  int ida;
};
int main()
{
  struct mystruct ss;
  printf("%zu\n", sizeof(ss)); /* %zu is the conversion for size_t */
  return 0;
}

The result is: 16

In fact, this results from the way the compiler lays out variables in memory. To speed up the CPU's memory accesses, the compiler aligns the starting addresses of certain variables. By default, the compiler requires the offset of each member variable's starting address, relative to the starting address of the structure, to be a multiple of the number of bytes occupied by that member's type. The following table lists the alignment of common types:

Type      Alignment (offset of the member's starting address relative to the starting address of the structure)

char      The offset must be a multiple of sizeof(char), which is 1

int       The offset must be a multiple of sizeof(int), which is 4

float     The offset must be a multiple of sizeof(float), which is 4

double    The offset must be a multiple of sizeof(double), which is 8

short     The offset must be a multiple of sizeof(short), which is 2

Members are allocated space in the order in which they are declared in the structure, and each one's position is adjusted to meet the alignment above; the compiler automatically fills any resulting gaps with padding bytes. In addition, the compiler ensures that the total size of the structure is a multiple of the structure's byte boundary (that is, the size of the member type that occupies the most space in the structure), so after space has been allocated for the last member, it adds padding bytes at the end as needed.

Now let's analyze how the compiler stores the structure:


struct struct1 
{ 
  double dda; 
  char cda; 
  int ida; 
}; 

The first member, dda, is allocated space first. Its starting address is the same as the starting address of the structure (offset 0 is a multiple of sizeof(double)), and it occupies sizeof(double)=8 bytes.

Next, space is allocated for the second member, cda. The next available address is at offset 8 from the start of the structure, which is a multiple of sizeof(char)=1, so storing cda at offset 8 satisfies the alignment; it occupies sizeof(char)=1 byte.

Next, space is allocated for the third member, ida. The next available address is at offset 9, which is not a multiple of sizeof(int)=4. To satisfy the alignment constraint, VC automatically fills 3 bytes (these three bytes hold nothing meaningful), so that the next available offset becomes 12, which is a multiple of sizeof(int)=4. ida is therefore stored at offset 12 and occupies sizeof(int)=4 bytes.

At this point, all the members of the structure have been allocated space. The total is 8+1+3+4=16, which is already a multiple of the structure's byte boundary (sizeof(double)=8, the size of the type that occupies the most space in the structure), so no trailing padding is needed. The size of the whole structure is therefore sizeof(struct1)=8+1+3+4=16, of which 3 bytes are padding filled in automatically by VC.

For another example, swap the position of the member variable of the struct1 above to make it the following:


struct mystruct2
{
  char cda;
  double dda;
  int ida;
};

For this structure, sizeof gives 24. The comments below analyze its layout:


struct mystruct2
{
  char cda;   //Offset 0 satisfies the alignment; cda occupies 1 byte.
  double dda; //The next available offset is 1, which is not a multiple of
              //sizeof(double)=8. VC automatically fills 7 bytes so that the
              //offset becomes 8 (which satisfies the alignment), and dda is
              //stored at offset 8, occupying 8 bytes.

  int ida;    //The next available offset is 16, a multiple of sizeof(int)=4,
              //which satisfies the alignment of int, so no padding is needed;
              //ida is stored at offset 16 and occupies 4 bytes.

  //All members now have space. The total so far is 1+7+8+4=20, which is not
  //a multiple of the structure's byte boundary (sizeof(double)=8, the size of
  //the type that occupies the most space in the structure), so 4 more bytes
  //are filled at the end to make the size a multiple of 8.
};

So the total size of the structure is sizeof(struct mystruct2) = 1+7+8+4+4 = 24, of which 7+4=11 bytes are padding filled in automatically by VC and hold nothing meaningful.

II. Using #pragma pack(n) to set n-byte alignment

VC's default treatment of structure storage does speed up the CPU's access to variables, but it sometimes causes trouble as well. The default alignment can be overridden so that we set the alignment ourselves. VC provides #pragma pack(n) to set n-byte alignment. With n-byte alignment, the offset of a variable's starting address is determined in one of two ways:

First, if n is greater than or equal to the number of bytes occupied by the variable, the offset must satisfy the default alignment.

Second, if n is less than the number of bytes occupied by the type of the variable, the offset is a multiple of n, not satisfying the default alignment.

The total size of the structure is also constrained, again in two cases: if n is greater than or equal to the size of every member type, the total size of the structure must be a multiple of the size of the largest member type; otherwise it must be a multiple of n. The following example illustrates the usage:


#pragma pack(push) //Save alignment
#pragma pack(4)//Set to 4-byte alignment
struct test 
{ 
  char m1; 
  double m4; 
  int m3; 
}; 
#pragma pack(pop)//Restore alignment

The size of the above structure is 16. Its storage is analyzed as follows. First, space is allocated for m1 at offset 0, which satisfies the alignment we set (4-byte alignment); m1 occupies 1 byte. Next, space is allocated for m4. The next available offset is 1, so 3 bytes of padding are added to make the offset a multiple of n=4 (because sizeof(double) is greater than n); m4 is stored at offset 4 and occupies 8 bytes. Then space is allocated for m3. The offset is now 12, a multiple of 4, and m3 occupies 4 bytes. All members have now been allocated space, for a total of 1+3+8+4=16 bytes, which is a multiple of n. If #pragma pack(4) above is changed to #pragma pack(16), the size of the structure becomes 24.

Here's another example:


#pragma pack(8)
struct S1{
  char a;
  long b;
};
struct S2 {
  char c;
  struct S1 d;
  long long e;
};
#pragma pack()

One important rule of member alignment is that each member is aligned separately, in its own way.

In other words, although 8-byte alignment is specified above, not every member is aligned to 8 bytes. Each member is aligned to the smaller of its own alignment parameter (usually the size of its type) and the specified alignment parameter (8 bytes here). The length of the structure must be an integer multiple of every alignment parameter actually used (and therefore of the largest one), with padding bytes added if it falls short.

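In S1, member a is 1 byte and by default is aligned to 1 byte. The specified alignment parameter is 8, and the smaller of the two is 1, so a is aligned to 1 byte. Member b is 4 bytes (long is 4 bytes under VC) and by default is aligned to 4 bytes, so it is aligned to 4 bytes, and sizeof(S1) is 8.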

In S2, c is aligned to 1 byte, just like a in S1. d is a structure of 8 bytes; what is it aligned to? For a structure, the default alignment is the largest alignment parameter used by any of its members, which for S1 is 4, so member d is aligned to 4 bytes. Member e is 8 bytes and by default is aligned to 8 bytes, the same as the specified value, so it must be placed on an 8-byte boundary. At this point 12 bytes have been used, so 4 bytes of padding are added and e is placed starting at offset 16. The length is now 24, which is already divisible by 8 (member e is aligned to 8 bytes). Thus sizeof(S2) is 24 bytes.

Here are three important points:

1. Each member is aligned in its own way, which keeps the length to a minimum.

2. The default alignment of a compound type (such as a structure) is the alignment of its member with the largest alignment, so the length is also kept to a minimum when a member is a compound type.

3. The aligned length must be an integer multiple of the largest alignment parameter among the members, so that every element is aligned on its boundary when the type is used in an array.

III. Alignment in the stdarg.h file of minix

In the stdarg.h file of minix, the following macro is defined:




#define __va_rounded_size(TYPE) \
 (((sizeof (TYPE) + sizeof (int) - 1) / sizeof (int)) * sizeof (int))

As the macro's name (and the comment that accompanies it in the source) suggests, this is about memory alignment. Recall the earlier rule for memory alignment in C:

N-byte alignment means that the offset of a variable's starting address is determined in one of two ways:

First, if n is greater than or equal to the number of bytes occupied by the variable, the offset must satisfy the default alignment (the offset of the starting address of each member variable relative to the starting address of the structure must be a multiple of the number of bytes occupied by the type of the variable);

Second, if n is less than the number of bytes occupied by the type of the variable, the offset is a multiple of n and need not satisfy the default alignment.

Here n = sizeof(int) = 4, so sizeof(int) - 1 = 3, and sizeof(TYPE) is always a natural number. The macro rounds sizeof(TYPE) up to a multiple of 4.

Consider first the two simplest cases:

(1) When sizeof(TYPE) >= 4 and is already a multiple of 4, the offset = (sizeof(TYPE) / 4) * 4, which is sizeof(TYPE) itself.

(2) When sizeof(TYPE) < 4, that is, sizeof(TYPE) = 1, 2 or 3, the offset is 4, because (sizeof(TYPE) + 3) / 4 = 1 under integer division.

To unify these cases, and to round up sizes that are larger than 4 but not a multiple of 4 as well, the offset = ((sizeof(TYPE) + 3) / 4) * 4

In some source code, the same memory alignment macro is implemented with bit operations. The code is as follows:


#define __va_rounded_size(TYPE) \
  ((sizeof(TYPE)+sizeof(int)-1)&~(sizeof(int)-1))

Since ~(sizeof(int) - 1) = ~(4 - 1) = ~(00000011B) = 11111100B, the AND operation clears the lowest two bits.

Adding sizeof(int) - 1 to sizeof(TYPE) raises a number greater than 4m but less than or equal to 4(m+1) to one greater than or equal to 4(m+1) but less than 4(m+2). Clearing the low two bits then yields exactly 4(m+1), so the original length is padded up to a multiple of 4.

I hope this article is of some reference value for your study of C programming.

