What value does an uninitialized local variable hold in C?

  • 2020-10-23 21:10:47
  • OfStack

In C, what is the value of an uninitialized local variable?

The answer is often:

It depends on the compiler. It may be initialized to 0, but that is not guaranteed. Hard to say.

In short, these are all solemn, mystical-sounding non-answers, and that's annoying.

[

If someone walks you through compilers, C libraries, and processor architectures without giving you a realistic scenario that reproduces the problem, it's probably bullshit.

]

It's Friday again, and I wrote this short piece on the bus ride home.

As a matter of fact, the question itself is asked the wrong way. Instead of talking for 100,000 words, we only need to pin down the specific behavior in a specific scenario. Of course, that requires designing an experiment and comparing the results.

Before demonstrating the behavior with actual code, one piece of background: the CPU knows nothing about variables, let alone their names. The CPU only loads a value from a specific memory location or stores a value to a specific memory location. So when you ask what the value of a variable is, you first have to know where that variable's value is stored.

Consider the following code:


#include <stdio.h>

void func1()
{
 int a;
 printf("func1:%d\n", a);
 a = 12345;
}

void func2()
{
 int b;
 printf("func2:%d\n", b);
}

void func4()
{
 int d;
 printf("func3:%d\n", d);
}

void func3()
{
 int c;
 printf("func3:%d\n", c);
 c = 54321;
 func4();
}

void test_call()
{
 func3();
}

int main(int argc, char **argv)
{
 func1();
 func2();

 test_call();
}

We have four functions, func1 through func4, each of which has an uninitialized local variable inside. What are their values?

For such local variables, the value depends on:

1. The position of the variable on the stack.
2. Whether that stack location has previously been written by a store.

As you can see, the first point identifies a memory location, and the second point is about the behavior of the code. In other words, once some code has stored to that location, and no later code has overwritten it, the location keeps the value left by that store.

The verification is very simple. Try it once and you will know:


[root@localhost test]# ./a.out
func1:0
func2:12345
func3:0
func3:0

Given how stack frames change across function calls, the local variable a in func1 and the local variable b in func2 clearly occupy the same position. When func1 is called, this is a fresh piece of memory (although a stack frame may already have reached this position before main was entered), so the value of a depends on the initial value at the corresponding offset of the page backing that memory, which in turn depends on the operating system:

When the operating system assigns pages to a program, it may hand out zero-filled pages.

[

Stack allocation does not go through the C library, so the behavior of the C library obviously plays no part here. Memory allocation with something like malloc does.

]

When the result is printed, a has the value 0, so we assume the operating system handed the application zero-filled pages. Next, func1 assigns 12345 to a and returns. Then, when func2 is called, its stack frame is built at the same position as the frame func1 has just vacated, and the corresponding location still holds 12345.

[

I did not see any instruction that clears the stack to 0 around func1's ret. For efficiency reasons, there should be no such instruction.

]

Looking at the test_call function, it is clear that func3 and func4 do not use the same stack frame: func4 is called from inside func3, so its frame is stacked on top of func3's frame. Assigning 54321 to c in func3 therefore does not affect d, which lives in func4's own frame. As a result, the initial values of both c and d are still 0. (Note that func4 prints with the label func3, which is why the output shows two func3 lines; the second one is d.)

So, what's the difference at the instruction level between initializing a local variable and not initializing a local variable?

It's easy to see it with your own eyes. First, look at func1 without initializing a local variable:


// int a;
00000000004005ad <func1>:
 4005ad: 55      push %rbp
 4005ae: 48 89 e5    mov %rsp,%rbp
 4005b1: 48 83 ec 10    sub $0x10,%rsp
 4005b5: 8b 45 fc    mov -0x4(%rbp),%eax
 4005b8: 89 c6     mov %eax,%esi
 4005ba: bf 90 07 40 00   mov $0x400790,%edi
 4005bf: b8 00 00 00 00   mov $0x0,%eax
 4005c4: e8 b7 fe ff ff   callq 400480 <printf@plt>
 4005c9: c7 45 fc 39 30 00 00 movl $0x3039,-0x4(%rbp)
 4005d0: c9      leaveq
 4005d1: c3      retq

Now consider the version that initializes the local variable a to 0:


// int a = 0;
00000000004005ad <func1>:
 4005ad: 55      push %rbp
 4005ae: 48 89 e5    mov %rsp,%rbp
 4005b1: 48 83 ec 10    sub $0x10,%rsp
 4005b5: c7 45 fc 00 00 00 00 movl $0x0,-0x4(%rbp)
 4005bc: 8b 45 fc    mov -0x4(%rbp),%eax
 4005bf: 89 c6     mov %eax,%esi
 4005c1: bf 90 07 40 00   mov $0x400790,%edi
 4005c6: b8 00 00 00 00   mov $0x0,%eax
 4005cb: e8 b0 fe ff ff   callq 400480 <printf@plt>
 4005d0: c7 45 fc 39 30 00 00 movl $0x3039,-0x4(%rbp)
 4005d7: c9      leaveq
 4005d8: c3      retq

The two versions differ by just one instruction:


 4005b5: c7 45 fc 00 00 00 00 movl $0x0,-0x4(%rbp)

Initialization is carried out by an actual instruction. Leave it out, and nothing writes to the variable's stack slot.

To sum up, when a function returns and its stack frame is popped, the data left in that frame is not cleaned up. When the next function call reuses that stack memory, its uninitialized local variables pick up the leftover data and thus become indeterminate!

So, remember to initialize your local variables. If you don't, fate will deal with you in the end.
