An in-depth explanation of the role of the volatile modifier in C programming

  • 2020-05-14 05:00:20
  • OfStack

volatile tells the compiler that the variable declared with it can change at any time, so every time the compiled program needs to store or read the variable, it accesses the data directly at the variable's address. Without the volatile keyword, the compiler may optimize the reads and stores and temporarily use a value held in a register; if the variable is then updated by another program, the two values become inconsistent. Here's an example. In DSP development, you often need to wait for an event to trigger, so you often write programs like this:


short flag;
void test()
{
do1();
while(flag==0);
do2();
}

This program waits for the value of the in-memory variable flag to become 1 (that is, nonzero) before running do2(). The value of flag is changed by other code, which may be a hardware interrupt service routine. For example, pressing a button interrupts the DSP, and the button's interrupt routine sets flag to 1 so that the program above can continue. However, the compiler does not know that flag will be modified by other code, so when it optimizes, it may read flag into a register once and then wait for that register to become 1. If that optimization is applied, the while loop becomes an infinite loop, because the interrupt service routine cannot modify the contents of the register. To make the program read the real flag variable each time, it needs to be declared as follows:


volatile short flag;

Note that the code may work without volatile, but then stop working after the compiler's optimization level is changed. This is a common cause of the problem where the debug build behaves normally but the release build does not. So to be on the safe side, add the volatile keyword whenever you are waiting for a variable to be modified by other code.

volatile means "changeable"
Since registers are accessed faster than RAM, compilers generally optimize to reduce accesses to external RAM. For example:


static int i=0;
int main(void)
{
...
while (1)
{
if (i) do_something();
}
}
/* Interrupt service routine. */
void ISR_2(void)
{
i=1;
}

The program intends for do_something to be called from main once the ISR_2 interrupt has occurred. However, because the compiler determines that i is not modified inside main, it may read i into a register only once, and every subsequent if test then uses only that register copy of i, so do_something is never called. If the variable is qualified with volatile, the compiler guarantees that no read or write of it is optimized away (each one must actually be performed). The variable i in this example should be declared that way.
Generally speaking, volatile is used in the following places:
1. Variables that are modified in an interrupt service routine and checked by other code need volatile;
2. Flags shared among tasks in a multitasking environment should be volatile;
3. Memory-mapped hardware registers are also usually declared volatile, because each read or write of them may have a different meaning.
In addition, these cases often also require attention to data integrity (for example, several related flags being interrupted and rewritten when only half of them have been read). In case 1 this can be achieved by disabling interrupts, in case 2 by disabling task scheduling, and in case 3 it can only rely on good hardware design.
The deeper meaning of volatile
volatile is always about optimization. Compilers use a technique called data flow analysis, which determines where variables in the program are assigned, where they are used, and where they become dead. The results can be used for constant folding, constant propagation, and other optimizations. However, sometimes these optimizations are not what the program needs. In that case, the volatile keyword can be used to prohibit them. The literal meaning of volatile is "changeable", and it has the following effects:

A volatile variable is not cached in a register between operations. In multitasking, interrupt, and even setjmp environments, a variable can be changed by other code without the compiler knowing; volatile is what tells the compiler so. The compiler also does not apply constant folding, constant propagation, or similar optimizations to it, so in code like the following:

volatile int i = 1;
if (i > 0) ...

the if condition is not treated as unconditionally true.
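The setjmp case mentioned above can be sketched as follows. In standard C, a local variable that is modified between setjmp and longjmp has a well-defined value after the jump only if it is volatile. The function name run_with_retries and the retry scenario are illustrative assumptions.

```c
#include <setjmp.h>

static jmp_buf retry_point;

/* Runs the "body" up to max_attempts times, jumping back with
   longjmp after each attempt, and returns how many times it ran.
   attempts is modified between setjmp and longjmp, so it must be
   volatile for its value to be reliable after the jump. */
int run_with_retries(int max_attempts)
{
    volatile int attempts = 0;

    (void)setjmp(retry_point);   /* longjmp resumes execution here */

    attempts++;
    if (attempts < max_attempts)
        longjmp(retry_point, 1);

    return attempts;
}
```

Without volatile, the compiler may keep attempts in a register that longjmp restores to its value at the time of the setjmp call, so the count could appear to reset.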

Reads and writes of a volatile variable are not optimized away. If you assign a value to a variable and never use it afterwards, the compiler can often omit that assignment; accesses to memory-mapped I/O, however, must not be optimized in this way.
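A minimal sketch of memory-mapped I/O access through volatile follows. The register layout, the STATUS_READY bit, and the function names are hypothetical; on real hardware they would come from the device's datasheet.

```c
#include <stdint.h>

/* Hypothetical register block: the layout is an assumption. */
typedef struct {
    volatile uint32_t status;
    volatile uint32_t data;
} device_regs;

#define STATUS_READY 0x1u   /* hypothetical "data available" bit */

/* Polls status until READY is set, then reads the data register.
   Every pass of the loop performs a real load from status. */
uint32_t device_read(device_regs *dev)
{
    while ((dev->status & STATUS_READY) == 0)
        ;
    return dev->data;
}

/* Writes 1 and then 0 to the data register. Both stores are
   emitted, even though the first value is overwritten at once. */
void device_pulse(device_regs *dev)
{
    dev->data = 1u;
    dev->data = 0u;
}
```

On a real target, dev would point at a fixed address. In a hosted test it can simply point at an ordinary device_regs variable that stands in for the hardware.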

It is not accurate to say that volatile guarantees the atomicity of memory operations. First, x86 needs the LOCK prefix to guarantee atomicity under SMP. Second, RISC architectures cannot operate on memory directly at all, so other methods, such as atomic_inc, are needed to guarantee atomicity.

For jiffies, which is already declared volatile, I think jiffies++ can be used directly; there's no need for the more complicated form, because that doesn't guarantee atomicity either.
You may not know that on Pentium and later CPUs, the following two sequences of instructions


inc jiffies 
;;
mov jiffies, %eax
inc %eax
mov %eax, jiffies

have the same effect, but the single instruction is actually slower than the three.
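atomic_inc is a Linux kernel helper. In user-space code, a comparable tool is the C11 header <stdatomic.h>, which is what this sketch swaps in. The names event_count, count_event, and events_so_far are illustrative.

```c
#include <stdatomic.h>

/* Shared counter. atomic_int makes each update indivisible,
   which volatile alone does not. */
static atomic_int event_count = 0;

/* Atomically adds one and returns the new value. */
int count_event(void)
{
    /* atomic_fetch_add returns the value held before the addition. */
    return atomic_fetch_add(&event_count, 1) + 1;
}

/* Atomically reads the current count. */
int events_so_far(void)
{
    return atomic_load(&event_count);
}
```

With a plain volatile int, counter++ is still a separate load, add, and store, so two threads incrementing at once can lose an update.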
Compiler optimization → C keyword volatile → the "memory" clobber descriptor

"memory" is special and probably the hardest part of inline assembly to understand. To explain it, we first introduce some background on compiler optimization, then look at the C keyword volatile, and finally look at the descriptor itself.
Introduction to compiler optimization
Memory access is far slower than the CPU's processing speed. To improve overall performance, a hardware cache is introduced to speed up memory access. In addition, modern CPUs do not execute instructions strictly in order: instructions with no dependency on one another can execute out of order, to make full use of the CPU's instruction pipeline and increase execution speed. These are hardware-level optimizations. At the software level there are two kinds: optimization done by the programmer when writing the code, and optimization done by the compiler. Common compiler optimizations include caching memory variables in registers and reordering instructions to make full use of the CPU pipeline, most commonly reordering reads and writes. For ordinary memory, these optimizations are transparent and efficient. The solution to the problems caused by compiler optimization or hardware reordering is to place a memory barrier between operations that must be performed in a particular order from the point of view of the hardware (or of another processor). Linux provides a macro to solve the compiler's ordering problem.
void barrier(void)
This macro tells the compiler to insert a memory barrier, but it has no effect on the hardware. The compiled code stores all modified values currently held in CPU registers back to memory and reads them from memory again when they are needed.
Memory
With the above background, the "memory" clobber descriptor is easy to understand. It tells GCC:
1) Do not reorder this inline assembly with the instructions before it; that is, the instructions ahead of the inline assembly have all been executed before it runs.
2) Do not keep variables cached in registers across it, because this code may use memory variables, and those variables may change in unpredictable ways. GCC therefore inserts the code needed to write the values of register-cached variables back to memory first, and accesses memory again if those variables are used later.
If the assembly instructions modify memory but GCC itself cannot tell, because nothing in the output section describes the modification, then "memory" needs to be added to the clobber section to tell GCC that memory has been modified. Knowing this, GCC inserts, ahead of the assembly, the instructions needed to write back to memory any variable values that optimization had cached in registers, and reads them from memory again if they are used afterwards.
The same can be achieved with volatile, but adding the keyword to every variable is less convenient than using "memory".
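A compiler-only barrier of the kind described here is commonly written as an empty inline assembly statement with a "memory" clobber. The sketch below assumes GCC or Clang. The macro name compiler_barrier and the publish/consume pair are illustrative.

```c
/* Compiler-only barrier (GCC/Clang syntax): an empty asm statement
   with a "memory" clobber. It emits no machine instruction. */
#define compiler_barrier() __asm__ __volatile__("" ::: "memory")

static int payload;
static int payload_ready;

/* Stores the payload, then the ready flag. The barrier keeps the
   compiler from reordering or merging the two stores. */
void publish(int value)
{
    payload = value;
    compiler_barrier();
    payload_ready = 1;
}

/* Returns 1 and fills *out if a payload has been published,
   otherwise returns 0. The barrier forces payload to be loaded
   from memory after the flag has been checked. */
int consume(int *out)
{
    if (!payload_ready)
        return 0;
    compiler_barrier();
    *out = payload;
    return 1;
}
```

This constrains only the compiler. As noted above, it does nothing to the hardware, so code shared between processors needs a real hardware barrier as well.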

The importance of volatile to embedded programmers is self-evident. Many companies use a candidate's understanding of volatile as a yardstick in interviews for embedded programmers. Why is volatile so important? Because embedded programmers constantly deal with interrupts, low-level hardware, and the like, all of which use volatile, so embedded programmers must master it.

Like the familiar const, volatile is a type modifier. Before moving on to volatile, let's take a look at a function that will be used below. Readers who already know how to use it can skip this explanation.

Prototype:


int gettimeofday ( struct timeval * tv , struct timezone * tz ) ; 

Header file:


#include <sys/time.h>

Function: get the current time

Return value: 0 on success, -1 on failure, error code in errno.

gettimeofday() returns the current time in the structure referred to by tv, and the local time zone information is placed in the structure referred to by tz.
timeval structure is defined as:


struct timeval{
 long tv_sec; 
 long tv_usec; 
};

timezone structure is defined as:


struct timezone{
 int tz_minuteswest; 
 int tz_dsttime; 
};

Let's start with the timeval structure: tv_sec stores seconds and tv_usec stores microseconds. The timezone structure is rarely used; gettimeofday() places the local time zone information into the structure pointed to by tz, where tz_minuteswest stores the time difference from Greenwich in minutes and tz_dsttime stores the daylight saving time state. Here we mainly care about the first parameter, tv, and do not use the second, so we pass NULL for it when calling gettimeofday(). Let's look at a simple piece of code.


#include <stdio.h>
#include <sys/time.h>

int main(int argc, char * argv[])
{
 struct timeval start,end;
 gettimeofday( &start, NULL ); /* Test start time */
 double timeuse;
 int j;
 for(j=0;j<1000000;j++)
 ;
 gettimeofday( &end, NULL ); /* Test termination time */
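 /* Elapsed microseconds: whole seconds scaled up, plus the microsecond parts */
 timeuse = 1000000 * ( end.tv_sec - start.tv_sec ) + end.tv_usec - start.tv_usec ;
 timeuse /= 1000000; /* Convert microseconds to seconds */
 printf(" The running time is: %f\n",timeuse);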

 return 0;

}
root@ubuntu:/home# ./p

The running time is:



Now a brief analysis of the code. end.tv_sec - start.tv_sec gives the interval between the end time and the start time in seconds, and end.tv_usec - start.tv_usec gives the remaining difference in microseconds. Because the units differ, we multiply (end.tv_sec - start.tv_sec) by 1000000 to convert it to microseconds before adding, and then use timeuse /= 1000000; to convert the total back to seconds. Now that you know how to measure the time between the start and the end of a piece of code with gettimeofday(), let's look at the volatile modifier.
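The calculation just described can be wrapped in a small helper. This is a sketch, and the function name elapsed_seconds is an assumption rather than part of the original program.

```c
#include <sys/time.h>

/* Elapsed time from *start to *end, in seconds.
   Whole seconds are scaled to microseconds, the microsecond parts
   are added (their difference may be negative), and the total is
   converted back to seconds. */
double elapsed_seconds(const struct timeval *start,
                       const struct timeval *end)
{
    double usec = 1000000.0 * (double)(end->tv_sec - start->tv_sec)
                + (double)(end->tv_usec - start->tv_usec);
    return usec / 1000000.0;
}
```

For example, with start at 1 s + 900000 µs and end at 3 s + 100000 µs, the seconds differ by 2 and the microseconds by -800000, giving 1.2 seconds in total.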

Usually in our code, when a variable may be changed unexpectedly, we define it as volatile, which keeps the compiler from arbitrarily "tampering with" the variable's value. More precisely, the value must be reread directly from memory each time the variable is used, rather than taken from a copy held in a register.

Before giving an example, let's outline the difference between compiling in Debug mode and in Release mode. Debug is the debug build: it contains debugging information and is not optimized, which makes it easier for programmers to debug. Release is the release build: it is usually optimized for both code size and speed so that it serves users well. Now that you know the difference between Debug and Release, let's look at a piece of code.


#include <stdio.h>

void main()
{
int a=12;
printf("a The value of :%d\n",a);
__asm {mov dword ptr [ebp-4], 0h}
int b = a;
printf("b The value of :%d\n",b);
}

Analyzing the code above, we use the single statement __asm {mov dword ptr [ebp-4], 0h} to modify the value of variable a in memory. Having explained the difference between Debug and Release, let's compare the results in the two modes. Note: this example is compiled and run with VC6; unless otherwise specified, the other examples run in a Linux environment. Don't forget to select the build mode when you compile.

The results of using Debug mode are:


a The value of :12 
b The value of :0 
Press any key to continue 

The results of using Release mode are:


a The value of :12 
b The value of :12 
Press any key to continue 

Looking at the results above, we find that b is 12 in Release mode, after optimization, but 0 in Debug mode. Why does this happen? Let's hold off on the answer and look at the following piece of code. Note: compile and run using VC6.


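#include <stdio.h>

void main()
{
int volatile a=12;
printf("a The value of :%d\n",a);
__asm {mov dword ptr [ebp-4], 0h}
int b = a;
printf("b The value of :%d\n",b);
}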

The results of using Debug mode are:


a The value of :12 
b The value of :0 
Press any key to continue 

The results of using Release mode are:


a The value of :12 
b The value of :0 
Press any key to continue 

We found that Debug mode and Release mode gave the same result in this case. Now let's analyze why the two modes differed earlier.

Let's analyze the first piece of code. In Debug mode the code is not optimized, so every use of a reads its value directly from its memory address. After the statement __asm {mov dword ptr [ebp-4], 0h} changes the value of a, the next use of a reads from memory and therefore gets the updated value. In Release mode, however, b receives the old value of a rather than the updated one. This is because of an optimization: the compiler sees that a is not assigned again after its initialization, so it keeps a copy of the value in a register, and later uses of a operate on that register instead of reading a's memory address, since reading a register is faster than reading memory. As a result, the value read for a is 12, not the updated 0.

In the second piece of code, we use the volatile modifier. In this case we get the updated value of a in either mode, because volatile tells the compiler not to optimize accesses to the variable it qualifies, but to fetch the value directly from its memory address every time. As this shows, for variables in our code that may be changed behind the compiler's back, it is best to use the volatile modifier so that every use sees the updated value. To reinforce the point, let's look at the following piece of code.


#include <stdio.h> 
#include <sys/time.h> 
 
int main(int argc, char * argv[]) 
{ 
 struct timeval start,end; 
 gettimeofday( &start, NULL ); /* Test start time */ 
 double timeuse; 
 int j; 
 for(j=0;j<10000000;j++) 
  ; 
 gettimeofday( &end, NULL ); /* Test termination time */ 
 timeuse = 1000000 * ( end.tv_sec - start.tv_sec ) + end.tv_usec -start.tv_usec; 
 timeuse /= 1000000; 
printf(" The running time is: %f\n",timeuse); 
 
 return 0; 
 
} 

Compared with the first timing program we tested earlier, we have only increased the number of iterations of the for() loop.

Let's look at the result without optimization:


root@ubuntu:/home# gcc time.c -o p 
root@ubuntu:/home# ./p 
 The running time is: 0.028260 

The result with optimization:



The results clearly differ greatly. But what happens if we change int j to int volatile j in the code above, as follows?


#include <stdio.h> 
#include <sys/time.h> 
 
int main(int argc, char * argv[]) 
{ 
 struct timeval start,end; 
 gettimeofday( &start, NULL ); /* Test start time */ 
 double timeuse; 
 int volatile j; 
 for(j=0;j<10000000;j++) 
  ; 
 gettimeofday( &end, NULL ); /* Test termination time */ 
 timeuse = 1000000 * ( end.tv_sec - start.tv_sec ) + end.tv_usec -start.tv_usec; 
 timeuse /= 1000000; 
printf(" The running time is: %f\n",timeuse); 
 
 return 0; 
 
} 

Let's look at the result without optimization:


root@ubuntu:/home# gcc time.c -o p 
root@ubuntu:/home# ./p 
 The running time is: 0.027647 

The optimized operation results are:


root@ubuntu:/home# gcc -o p time.c -O2 
root@ubuntu:/home# ./p 
 The running time is: 0.027390 

We find that this time, whether or not optimization is enabled, the running time barely changes; the slight difference comes from the machine itself. Comparing the version without volatile above with the version with volatile here shows that when the loop variable is volatile, the for() loop is not optimized even when optimization is enabled. The for() loop body is empty, so an optimizing compiler normally removes the loop entirely and it does not execute at all, just as if the compiler had set j to a number greater than or equal to 10 million at compile time. But because we used volatile, the compiler does not touch the value of j on its own, so the loop is executed. The point of this example is to remember that accesses to a variable defined as volatile are not optimized away by the compiler.

What else should be noted about volatile? Since registers are faster to access than memory, compilers generally optimize to reduce memory accesses, but if a variable is qualified with volatile, the compiler guarantees that no read or write of it is optimized away. This may seem a bit abstract, so take a look at the code below, shown here in outline only.
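This property is why crude busy-wait delays declare their counter volatile. The following is a minimal sketch; the function name spin is illustrative, and how long each iteration takes depends on the CPU and compiler.

```c
/* Busy-waits for n iterations and returns the number of
   iterations actually executed. Because i is volatile, every
   increment and comparison is performed, even at -O2. */
unsigned long spin(unsigned long n)
{
    volatile unsigned long i;

    for (i = 0; i < n; i++)
        ;   /* empty body: the loop exists only to take time */

    return i;
}
```

With a non-volatile counter, an optimizing compiler is free to replace the whole loop with the final value of i.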


int main()
{
  int i=0;
  while(i==0)
  {
      ... 
  }
}

Analyzing the code above: if the body of the while loop never changes the value of i, the compiler may keep a copy of i in a register and take the value from that register each time the condition is evaluated, so this becomes an infinite loop. But suppose we make the following change:


int main()
{
  int volatile i=0;
  while(i==0)
  {
      ... 
  }
}

We added volatile in front of i. Assuming the body of the while() loop is exactly the same as before, we can no longer say this is an infinite loop, because the compiler does not operate on a "backup" of i: each time the condition is evaluated, i is read directly from its memory address, and as soon as its value changes the loop exits.

The last point is that volatile is generally used in the following situations:

1. Variables that are modified in an interrupt service routine and checked by other code need volatile;

2. Flags shared among tasks in a multitasking environment should be volatile;

3. Memory-mapped hardware registers are also usually declared volatile, because each read or write of them may have a different meaning.

