C++ union usage examples

  • 2020-04-02 03:07:15
  • OfStack

This article illustrates how to use unions in C++ through examples, shared here for your reference. The details are as follows:

We should use unions according to the C convention; that is the point of this article. Although C++ lets us extend unions with new features, I suggest you don't, and after reading this article I think you will probably agree.

C has no concept of a class, and every type can be regarded as a combination of basic types, so it is quite natural for a union to contain structs. In C++, a struct is generally considered to be essentially equivalent to a class, so can a union have class members? Let's take a look at the following code:


struct TestUnion
{
    TestUnion() {}
};

typedef union
{
    TestUnion obj;
} UT;

int main (void)
{
    return 0;
}

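When we compile this program, the compiler (Visual C++ here) reports:
error C2620: union '__unnamed' : member 'obj' has user-defined constructor or non-trivial default constructor

If you remove the constructor that doesn't do anything, everything is OK. (This describes the rules before C++11. C++11 and later accept such a member, but the union's default constructor is then implicitly deleted, so the concerns discussed below still apply.)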

Why doesn't the compiler allow our union members to have constructors? I can't find a more authoritative explanation for this question, but here is my explanation:

If the C++ standard allowed a union member to have a constructor, should that constructor run when space for the union is allocated? If the answer is yes, and the TestUnion constructor allocates memory or otherwise changes the state of the application, that might make sense if I go on to use obj, but what if I never use obj at all? A change to system state caused merely by declaring obj is obviously unreasonable. If the answer is no, then when we later choose to operate on obj, none of its data has been initialized (for a plain struct that is fine, but what if it has virtual functions?). Going further, suppose our union holds not only TestUnion obj but also TestUnion2 obj2, both with constructors that allocate memory (or do many other things). If obj is constructed first and obj2 afterwards, the result will almost certainly be a memory leak.

Given all of the above (and possibly more), the compiler only allocates space when a union is constructed and performs no additional initialization; to keep things simple, it reports the error above whenever a member provides a constructor.

Similarly, besides a constructor, the member's class cannot have a destructor, copy constructor, or assignment operator either.

In addition, if our class contains any virtual functions, we will receive the following error message at compile time:
error C2621: union '__unnamed' : member 'obj' has copy constructor

So give up the idea of putting a class member with a constructor, destructor, copy constructor, assignment operator, or virtual function into a union, and stick to C-style structs.
However, it is fine to define ordinary member functions, because they do not make the class fundamentally different from a C-style struct: you can simply think of it as a C-style struct plus n global functions.

Now let's look at what changes when a class contains a nested union. Take a look at the following program, and pay attention to the comment in it:


class TestUnion
{
    union DataUnion
    {
      DataUnion(const char*);
      DataUnion(long);
      const char* ch_;
      long l_;
    } data_;

  public:
    TestUnion(const char* ch);
    TestUnion(long l);
};

// To initialize a nested union member in the member initializer list,
// the union must not be anonymous and must have a constructor.
TestUnion::TestUnion(const char* ch) : data_(ch)
{}

TestUnion::TestUnion(long l) : data_(l)
{}

TestUnion::DataUnion::DataUnion(const char* ch) : ch_(ch)
{}

TestUnion::DataUnion::DataUnion(long l) : l_(l)
{}

int main (void)
{
    return 0;
}

As the program above shows, a union in C++ can also have constructors. Although the language supports this, it is poor programming practice, so I won't go into much detail about the program above. I recommend the following programming style instead:


class TestUnion
{
    union DataUnion
    {
      const char* ch_;
      long l_;
    } data_;
  
  public:
    TestUnion(const char* ch);
    TestUnion(long l);
};

TestUnion::TestUnion(const char* ch)
{
    data_.ch_ = ch;
}

TestUnion::TestUnion(long l)
{
    data_.l_ = l;
}

int main (void)
{
    return 0;
}

This is completely C-style.

So, accept the conclusion:

Follow the convention in C to use a union and try not to use any additional C++ features.

A union is a good thing. A union is like a struct whose members all share one block of memory; its size is determined by its largest member. In game development, unions are useful in the following areas:

1. Renaming:


struct Rename
{
  public:
    union
    {
      struct 
      {
        float x,y,z,w;
      };
      struct
      {
        float vec[4];
      };
    };
};

In this way, we can access the variables by their specific meaning (x, y, z, w) or loop over them like an array (vec).

2. Compression:


struct Compression
{
 public:
   bool operator==(const Compression& arg) const { return value == arg.value; }
   union
   {
     struct 
     {
       char a,b,c,d,e,f,g,h; // eight chars, so every byte of the 8-byte value is covered
     };
     struct
     {
       long long value;
     };
   };
};

This way, operations on the whole object, such as ==, become much more efficient: on a 64-bit machine a single comparison is enough. Transmitting the data, and packing and unpacking it, is also very convenient.

3. Risks:

Anonymous structs inside a union, as used above, are not standard C++ (anonymous unions are standard, anonymous structs are a compiler extension), so check that your compiler supports them; portability across compilers is poor.
The size and representation of data differ across machines and operating systems, so using unions is dangerous, especially when porting.
However, if the system and compiler stay the same, using a union in the right place is feasible.

Unions are not common in C/C++ code, but they appear frequently where memory constraints are especially tight. So what is a union, how is it used, and what should you watch out for? I will try to give some simple answers to these questions; there are bound to be some mistakes in them.

1. What is a union?

A union is a special kind of class and a constructed data type. Members of many different data types can be defined in a union, and a variable of the union type can hold a value of any one of those member types; the members share the same memory, which saves space (bit fields are another space-saving feature). Sharing memory is what makes unions special and is their defining characteristic. Also, as with structs, the default access of union members is public, and a union can have member functions.

2. What is the difference between a union and a struct?

Unions and structs have some similarities, but there are fundamental differences. In a struct each member has its own memory, and the total size of a struct variable is the sum of the sizes of its members (ignoring empty structs and alignment padding). In a union the members share one block of memory, and the size of a union variable equals the size of its largest member. Note that sharing does not mean several members are stored in a union variable at the same time; it means that the union variable can hold a value of any member, but only one at a time.

Here is an example to deepen the understanding of unions.

Example:


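#include <stdio.h>
int main()
{
  union number
  { 
   int i;
   struct
   { 
    char first;   /* lowest-addressed byte of i */
    char second;  /* next byte of i */
   }half;
  }num;
  num.i=0x4241;   /* 0x41 is 'A', 0x42 is 'B' */
  printf("%c%c\n", num.half.first, num.half.second);
  num.half.first='a'; 
  num.half.second='b';
  printf("%x\n", num.i);
  getchar();      /* wait for a key press before exiting */
  return 0;
}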

The output result is:

AB
6261

As the example shows, when i is assigned a value, its low 16 bits supply first and second: on a little-endian machine such as x86, the lowest byte (0x41) becomes first and the next byte (0x42) becomes second. Likewise, when first and second are assigned characters, their ASCII codes become the lowest byte and the next byte of i. On a big-endian machine the output would differ.

3. How to define it?

For example:


union test
{
  test() { }
  int office;
  char teacher[5];
};

This defines a union type named test with two members: an integer named office and a character array named teacher. Once the union is defined, variables of type test can be declared, and each can hold either the integer office or the character array teacher.

4. How to declare variables?

A union variable can be declared in three ways: define the type first and declare the variables afterwards, define the type and declare the variables at the same time, or declare the variables directly with an unnamed union.

Taking the test type as an example:
1)


union test
{
  int office;
  char teacher[5];
}; 
union test a,b; 

2)


union test
{
  int office;
  char teacher[5];
} a,b;

3)


union 
{
  int office;
  char teacher[5];
} a,b;

After these declarations, a and b are union variables (of type test in forms 1 and 2; in form 3 the union type has no name). The size of a and b is at least the size of the largest member, the 5-byte teacher array; because the union must also satisfy the alignment of int, sizeof usually reports 8 on platforms where int is 4 bytes. When a or b holds the integer, only 4 bytes are used, while the character array can use 5 bytes.

5. How to use it?

Values are normally assigned to a union variable through one of its members. A member of a union variable is written as:
union-variable-name.member-name
For example, after a is declared as a variable of type test, you can use a.office and a.teacher.
The union variable name alone cannot be used in arithmetic or be assigned a member's value directly (copying from another variable of the same union type is the exception). Brace initialization at the point of declaration sets the first member; the other members are assigned in the program.
Again, a union variable can hold only one member value at a time. In other words, the value of a union variable is the value of one of its members.

6. Anonymous unions

An anonymous union simply tells the compiler that its member variables share one address; the variables themselves are referenced directly, without the usual dot operator syntax.
For example:


#include <iostream>
int main()
{
  union{ 
  int test;
  char c; 
  }; 
  test=5;
  c='a';   // overwrites part of the storage shared with test
  std::cout<<test<<" "<<c;
}

As you can see, the members of the union are referenced in statements like ordinary local variables; as far as the program is concerned, that is exactly how they are used. In addition, although they are declared inside a union, they have the same scope level as any other local variable in the same block. This means that the member names of an anonymous union must not conflict with other identifiers in the same scope.
Anonymous unions are also subject to the following restrictions:
Because anonymous unions do not use the dot operator, their members must all be data: member functions are not allowed, and neither are private or protected members. Also, a global anonymous union must be declared static, or it must be placed in an unnamed namespace.

7. Several points to discuss:

1) What can't be stored in a union?

As we know, everything in a union shares memory, so static data members and reference members cannot be used, because they cannot share memory.

2) Can a class be put into a union?

Let's start with an example:


class Test
{
  public:
  Test():data(0) { }
  private:
  int data;
};
typedef union _test
{
  Test test; 
}UI; 

It won't compile. Why?
Because a union is not allowed to hold a class with a constructor, destructor, copy assignment operator, and so on: since the members share memory, the compiler cannot guarantee that such objects are not overwritten by another member, nor that their destructors are called when the union goes away.

3) Anonymous again?
Let's look at the next piece of code:


class test
{
  public:
  test(const char* p);
  test(int in);
  operator const char*() const {return datatype.ch;}
  operator int() const {return datatype.i;}
  private:
  enum type {Int, String };
  union 
  {
   const char* ch;
   int i;
  }datatype;
  type stype;
  test(test&);
  test& operator=(const test&);
};
// The member initializers datatype.ch(p) and datatype.i(in) are the problem
test::test(const char *p):stype(String),datatype.ch(p) { }
test::test(int in):stype(Int),datatype.i(in) { }

See the problem? It does not compile. Why? What is wrong with datatype.ch(p) and datatype.i(in)?
Let's see what happens when a test object is constructed. Creating the object calls its constructor, and the constructor in turn initializes each member named in the initializer list by calling that member's constructor. The list may name only direct members of the class, so datatype.ch and datatype.i are not valid there; and initializing datatype itself would require a constructor of the union, which this union does not have. So the compiler reports an error.
Note that this is not an anonymous union, because the closing brace is followed by a variable name (datatype)!

4) How to effectively prevent access errors?

Using a union saves memory, but there is a risk of reading the current value through the wrong data member, for example by interleaving accesses to ch and i above.

To prevent such errors, we must define an additional object that keeps track of the type of the value currently stored in the union; this object is called the discriminant of the union.

A good rule of thumb when a union object is a member of a class is to provide a set of access functions for all of the union's data types.

I hope this article helps you with your C++ programming.

