A brief analysis of compiling C programs for the ARM platform

  • 2020-05-07 20:04:20
  • OfStack

We know that C compilers offer several common optimization options: -O0, -O1, -O2, -O3, and -Os. I had always assumed that, being optimization options, they at most streamline some logic, improve efficiency a little, or reduce the size of the program, and that they would hardly ever affect a program's final result. Then I recently tracked down a bug in a program on the ARM platform and realized that these optimizations are sometimes not so smart, or at least not so smart on ARM. The bug boils down to the following program:
         


#include<stdio.h>
#include<string.h>

int main()
{
 char buffer[1024] = {0,1,2,3,4,5,6,7};
 int iTest = 0x12345678;
 int *p = (int *)(buffer + 7);
 memcpy(p, &iTest, sizeof(iTest));
 printf("%x\n", buffer[6]); 
 printf("%x\n", buffer[9]); 
 return 0;
}

At first glance, nothing looks wrong with this program. Save it as point.c, then build it with the cross-compilation toolchain at three optimization levels:


 arm-xxx-linux-gcc point.c -o point0 -O0
 arm-xxx-linux-gcc point.c -o point1 -O1
 arm-xxx-linux-gcc point.c -o point2 -O2

Running the three programs gives results that are a bit surprising:


 ./point0
 6
 34
 ./point1
 34
 0
 ./point2
 6
 0

The results match expectations only at -O0, that is, with no optimization. The same problem does not occur on x86.
To find out what goes wrong when compiling for ARM, I generated the assembly code at each optimization level with the following commands:


 arm-xxx-linux-gcc point.c -o point0.s -O0 -S
 arm-xxx-linux-gcc point.c -o point1.s -O1 -S
 arm-xxx-linux-gcc point.c -o point2.s -O2 -S

Comparing the three assembly listings showed that the problem lies with memcpy.
In point0.s, the program faithfully calls memcpy, which copies 0x12345678 to buffer+7 byte by byte.
In point1.s, the program does not call memcpy at all. The compiler replaces the call with the following instruction:
      str               r3, [sp, #7]
Here r3 holds 0x12345678. Because the ARM platform I used is 32-bit, this word store evidently does not honor the low two bits of the address: buffer+7 is treated as buffer+4, so the bytes from buffer+4 to buffer+7 are overwritten instead of the bytes from buffer+7 to buffer+10. That is why buffer[6] reads 0x34 and buffer[9] stays 0.
In point2.s, the instructions appear to have been reordered, apparently to suit the pipeline. The code that writes the initial values into buffer now runs after str r3, [sp, #7], so the byte at buffer+6 ends up holding the correct value 6 again, while buffer[9] is still 0.
After this analysis, some readers may ask: if such a simple program gives different results depending on the optimization options, is memcpy no longer safe to use?
It is. As long as you have good programming habits, you will not run into this kind of problem. Consider the following program:


#include<stdio.h>
#include<string.h>

int main()
{
 char buffer[1024] = {0,1,2,3,4,5,6,7};
 int iTest = 0x12345678;
 char *p = buffer + 7;
 memcpy(p, &iTest, sizeof(iTest));
 printf("%x\n", buffer[6]); 
 printf("%x\n", buffer[9]); 
 return 0;
}

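This program simply changes the type of p from int * to char *. A char * carries no alignment guarantee, so the compiler can no longer assume that the destination is 4-byte aligned and has no grounds to replace memcpy with a single word store. In the first version, converting buffer + 7 to int * produced a misaligned int pointer, which C leaves undefined, so the compiler was entitled to assume it never happens. With the char * version the results are the same at every optimization level. You can see how important good programming habits are.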

