Introduction to unions in C programming

  • 2020-05-07 20:05:18
  • OfStack

A union is a special data type in C that allows different types of data to be stored in the same memory location. A union can be defined with many members, but only one member can hold a value at any given time. Unions provide an efficient way to use the same memory location for multiple purposes.

Defining a union
To define a union, use the union statement, which is very similar to defining a structure. The union statement defines a new data type with more than one member. The format of the union statement is as follows:


union [union tag]
{
  member definition;
  member definition;
  ...
  member definition;
} [one or more union variables]; 

The union tag is optional, and each member definition is a normal variable definition, such as int i; or float f; or any other valid variable definition. At the end of the union definition, before the final semicolon, you can specify one or more union variables, but this is also optional. Here we define a union type named Data with three members i, f, and str:


union Data
{
  int i;
  float f;
  char str[20];
} data; 

Now, a variable of type Data can store an integer, a floating-point number, or a string of characters. This means that a single variable, that is, the same memory location, can be used to store multiple types of data. Any built-in or user-defined data type can be used inside a union as needed.

The memory occupied by a union is large enough to hold its largest member. In the example above, the Data type occupies 20 bytes of storage, because that is the most space needed by any member, namely the character array. The following example displays the total memory size occupied by the above union:


#include <stdio.h>
#include <string.h>
 
union Data
{
  int i;
  float f;
  char str[20];
};
 
int main( )
{
  union Data data;    

  printf( "Memory size occupied by data : %zu\n", sizeof(data));

  return 0;
}

Let's compile and run the above program, which will produce the following results:


Memory size occupied by data : 20

Accessing union members
To access any member of a union, use the member access operator (.). The member access operator is written between the union variable name and the member to be accessed, and the union keyword is used to define variables of the union type. The following example shows how unions are used:


#include <stdio.h>
#include <string.h>
 
union Data
{
  int i;
  float f;
  char str[20];
};
 
int main( )
{
  union Data data;    

  data.i = 10;
  data.f = 220.5;
  strcpy( data.str, "C Programming");

  printf( "data.i : %d\n", data.i);
  printf( "data.f : %f\n", data.f);
  printf( "data.str : %s\n", data.str);

  return 0;
}

Let's compile and run the above program, which will produce the following results:


data.i : 1917853763
data.f : 4122360580327794860452759994368.000000
data.str : C Programming

Here we can see that the values of the i and f members are corrupted, because the last value assigned to the variable occupies the shared memory location. This is also why the value of the str member prints correctly. Now let's look at the same example again, but this time using one member at a time, which is the main purpose of a union:


#include <stdio.h>
#include <string.h>
 
union Data
{
  int i;
  float f;
  char str[20];
};
 
int main( )
{
  union Data data;    

  data.i = 10;
  printf( "data.i : %d\n", data.i);
  
  data.f = 220.5;
  printf( "data.f : %f\n", data.f);
  
  strcpy( data.str, "C Programming");
  printf( "data.str : %s\n", data.str);

  return 0;
}

Let's compile and run the above program, which will produce the following results:


data.i : 10
data.f : 220.500000
data.str : C Programming

Here, all the members print correctly, because only one member is in use at a time.

Applications
Unions can be used when several pieces of data need to share memory, or when only one of several pieces of data is needed at any given time. In the book The C Programming Language, a union is described as follows:

        1) a union is, in effect, a structure;

        2) all its members have an offset of 0 relative to the base address;

        3) the structure is large enough to hold the "widest" member;

        4) its alignment is suitable for all its members.

These four points are explained below:

Since all members of a union share one block of memory, the starting address of each member is at offset 0 from the base address of the union variable, that is, the starting address of every member is the same. For all members to share one block of memory, the space must be large enough to hold the widest of them. "Alignment suitable for all members" means that the union must satisfy the alignment requirement of each of its members.

Here is an example. Consider the following union:


union U
{
  char s[9];
  int n;
  double d;
};

On a typical platform, s takes 9 bytes, n takes 4 bytes, and d takes 8 bytes, so the union needs at least 9 bytes of space. However, its actual size is not 9: measured with the sizeof operator, it is 16. This is because of byte alignment, since 9 is divisible by neither 4 nor 8. Bytes are therefore padded up to 16 so that all members are properly aligned. This shows that the space occupied by a union depends not only on its widest member but on all of its members. Its size must meet two conditions: 1) it is large enough to hold the widest member; 2) it is divisible by the size of every basic data type the union contains (more precisely, by their alignment requirements, which on most platforms equal those sizes).

Test program:


/* Test of union size and layout, 2011.10.3 */

#include <stdio.h>

union U1
{
  char s[9];
  int n;
  double d;
};

union U2
{
  char s[5];
  int n;
  double d;
};

int main(void)
{
  union U1 u1;
  union U2 u2;
  unsigned char *p;
  int k;

  /* sizes of the two unions */
  printf("%zu\n", sizeof(u1));
  printf("%zu\n", sizeof(u2));

  /* address of the union and of each of its members */
  printf("%p\n", (void *)&u1);
  printf("%p\n", (void *)&u1.s);
  printf("%p\n", (void *)&u1.n);
  printf("%p\n", (void *)&u1.d);

  /* write through n, then read the same storage through s and d */
  u1.n = 1;
  printf("%d\n", u1.s[0]);
  printf("%f\n", u1.d);

  /* dump the first 8 bytes; bytes 4-7 were never assigned, so they
     hold whatever happened to be in memory */
  p = (unsigned char *)&u1;
  for (k = 0; k < 8; k++)
    printf("%d\n", p[k]);
  return 0;
}

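The output (the four address lines are identical to one another, and the actual address varies from run to run) is:


16
8
<address>
<address>
<address>
<address>
1
0.000000
1
0
0
0
48
204
64
0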

sizeof(u1) = 16. In u1, s takes 9 bytes, n takes 4 bytes, and d takes 8 bytes, so at least 9 bytes are required. The basic data types it contains are char, int and double, which take 1, 4 and 8 bytes respectively. To make the space occupied by u1 divisible by 1, 4 and 8, it must be padded to 16 bytes, so sizeof(u1) = 16.

sizeof(u2) = 8. In u2, s takes 5 bytes, n takes 4 bytes, and d takes 8 bytes, so at least 8 bytes are required. The basic data types it contains are char, int and double, which take 1, 4 and 8 bytes respectively. No padding is needed to make the space occupied by u2 divisible by 1, 4 and 8, because 8 already meets the requirement. Thus sizeof(u2) = 8.

As the printed addresses show, every member of the union has the same starting address, equal to the address of the union variable itself.

As for u1.n = 1: after assigning 1 to the n member of u1, the first 4 bytes of this block of memory hold 00000001 00000000 00000000 00000000 (on a little-endian machine the least significant byte comes first).

Therefore, reading s[0] reads the first byte, whose integer value is 1, so the printed result is 1.

As for why d prints as 0.000000: the first 4 bytes of the block hold 00000001 00000000 00000000 00000000, and, as the printed values 48, 204, 64, 0 show, the next 4 bytes hold 00110000 11001100 01000000 00000000. These last 4 bytes were never assigned, so they contain whatever was left in memory. Written from the most significant byte to the least significant, the 8 bytes are:

00000000 01000000 11001100 00110000 00000000 00000000 00000000 00000001

For a double, bit 63 (0) is the sign bit, bits 62-52 (00000000100) are the exponent, and bits 51-0 (0000 11001100 00110000 00000000 00000000 00000000 00000001) are the mantissa. The exponent is 4-1023=-1019, and the mantissa adds only a small fraction, so the number represented is approximately 1.0*2^(-1019)=0.00000000000......, and the output is 0.000000.

