Introduction to the C language

  • 2020-05-09 18:56:04
  • OfStack

This seems like a serious topic, but it's really interesting.

1. A pointer points to something of a particular type. Anything that can be called a whole can have its own distinct pointer type.

2. In an expression, a function name is always converted to a pointer to that function, except when it is the operand of the address-of operator (&) or of sizeof.

3. The most obscure aspect of C is its complex declarations, for example: void (*signal(int sig, void (*func)(int)))(int); Try rewriting it in a form that is easy to understand.

4. Protect pointers with const wherever possible, both when passing them to functions and when using them yourself.
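One way to approach the rewriting exercise in point 3 is a typedef. The declaration quoted there has the shape of the standard library's signal function; the sketch below mirrors that shape with hypothetical names (handler_t, set_handler, on_event) and does not call the real signal.

```c
#include <stddef.h>

/* A handler is "pointer to function taking int, returning void". */
typedef void (*handler_t)(int);

static handler_t current_handler = NULL;
static int last_seen = 0;

/* Same shape as: void (*set_handler(int sig, void (*func)(int)))(int);
   Installs func and returns the previously installed handler.
   The sig parameter is unused in this sketch. */
handler_t set_handler(int sig, handler_t func)
{
    handler_t previous = current_handler;
    (void)sig;
    current_handler = func;
    return previous;
}

/* Example handler: records the value it was called with. */
void on_event(int sig)
{
    last_seen = sig;
}
```

With the typedef in place the declaration reads left to right: a function taking an int and a handler, returning a handler.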

Let's look at a special pointer. I say "let's call it a pointer" because its definition depends on the environment: NULL, a rather magical thing. Here are the definitions; compilers use one of two forms of NULL (each environment picks its own):


#define NULL 0
#define NULL ((void*)0)

What's the difference? In practice almost none, since both are zero. One is the integer constant 0 and the other is that constant cast to void*.

Neither produces an error or a warning when used as a pointer value; the compiler and the C standard consider both legal:


int* temp_int_1 = 0;        // no warning
int* temp_int_2 = (void*)0; // no warning
int* temp_int_3 = 10;       // warning (an error on some compilers)

Why is that? Why can 0 be assigned to a pointer but not 10? Both are constants.

Because the C standard says that an integer constant expression with the value 0 is a null pointer constant: when the compiler finds it in a pointer context, such as a pointer assignment, it converts it to a null pointer. It seems silly, but that is the rule.

Going back to the beginning, when does the difference between the two definitions of NULL matter? With strings. (Here I treat character arrays as C-style strings.)

In C, a character array is used to store a sequence of meaningful characters, and a string literal automatically has '\0' appended after its last character.

Some people are not always clear about the difference between NULL and '\0' when working with a character array. Terminating a character array with NULL is definitely wrong! Although both are essentially the constant 0, they mean different things in different places, and when NULL is defined as ((void*)0) it is not even a character.
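To make the distinction concrete, here is a minimal sketch in which NULL is used only for pointers and '\0' only for characters. The function name copy_text is hypothetical.

```c
#include <stddef.h>

/* Copies at most cap - 1 characters of src into dst and terminates dst
   with the null character '\0' (not with the null pointer NULL).
   Returns the length of the resulting string. */
size_t copy_text(char* dst, size_t cap, const char* src)
{
    size_t n = 0;
    if (dst == NULL || src == NULL || cap == 0)  /* NULL is for pointers */
        return 0;
    while (n + 1 < cap && src[n] != '\0') {      /* '\0' is for characters */
        dst[n] = src[n];
        ++n;
    }
    dst[n] = '\0';
    return n;
}
```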

The appetizers are over

When we call a function we pass it parameters. Parameters come in two kinds: formal parameters and actual arguments.


int function(int value)
{
    /*...*/
}
//...
function(11);

Here value is the formal parameter and 11 is the argument. On the surface, C has two ways of passing: pass by value and pass by address. But have you studied this carefully? Here is a fact: C only ever passes by value. So-called pass by address is just pass by value in disguise, because the value being copied happens to be an address.

Parameters and arguments are closely related. The argument always hands a copy of itself to the parameter, so the function can use the argument's value without being able to alter the original. That safety also brings us some trouble:


void swap_v1(int* val_1, int* val_2)
{
  int temp = *val_1;
  *val_1 = *val_2;
  *val_2 = temp;
}

This is what is called pass by address. In fact it just copies the value of the external pointer (the argument) into the parameters val_1 and val_2. To avoid the function call altogether, we can also use macros:


#define SWAP_V2(a, b) (a += b, b = a - b, a -= b)
#define SWAP_V3(x, y) {x ^= y; y ^= x; x ^= y;}

Try them. Isn't it neat, saving the time and space of a function call? The two macros rest on essentially the same principle.

But think about it for a second. Is there really no flaw in this way of writing? What happens if two input parameters point to the same block of memory?


...
int test_1 = 10, test_2 = 100;
SWAP_V2(test_1, test_2);          
printf("Now the test_1 is %d, test_2 is %d\n", test_1, test_2);
.../* Restore original value */
SWAP_V2(test_1, test_1);
printf("Now the test_1 is %d\n", test_1);  

What will it output?


$: Now the test_1 is 100, test_2 is 10
$: Now the test_1 is 0

Yes, it's 0. Why? When both macro arguments name the same object, a += b doubles it and b = a - b then sets that same object to 0. SWAP_V3 fails in the same way, because x ^= x is 0. The fix should be as short and concise as possible:


static inline void swap_final(int* val_1, int* val_2)
{
  if(val_1 == val_2)
    return;
  *val_1 ^= *val_2;
  *val_2 ^= *val_1;
  *val_1 ^= *val_2;
}
#define SWAP(x, y)   \
do{                  \
  if(&(x) == &(y))   \
    break;           \
  (x) ^= (y);        \
  (y) ^= (x);        \
  (x) ^= (y);        \
}while(0)

That is the best swap we have so far, and we can think a little further. How can this swap be made more general, applicable to more types? (Floating-point types need not be considered.) Hint: void* is available.
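One possible answer to the exercise, as a sketch: take void* plus a size and swap byte by byte. Note that this version swaps through a temporary byte instead of XOR, which is an assumption of mine and not the author's method; the name swap_bytes is hypothetical.

```c
#include <stddef.h>

/* Swaps size bytes between the objects at a and b.
   The objects must either be the same object or not overlap at all. */
void swap_bytes(void* a, void* b, size_t size)
{
    unsigned char* pa = a;
    unsigned char* pb = b;
    size_t i;

    if (pa == pb)          /* same object: nothing to do */
        return;
    for (i = 0; i < size; ++i) {
        unsigned char tmp = pa[i];
        pa[i] = pb[i];
        pb[i] = tmp;
    }
}
```

A call looks like swap_bytes(&x, &y, sizeof x); the caller is responsible for passing two objects of the same size.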

As the case above shows, a casual shortcut can occasionally have serious consequences:


int combine_1(int* dest, int* add)
{
  *dest += *add;
  *dest += *add;
  return *dest;
}
int combine_2(int* dest, int* add)
{
  *dest += 2 * (*add); // When unsure about precedence, parentheses are a wise choice
  return *dest;
}

Do the two functions above do the same thing? Well, it looks like they do:


int test_3 = 10, test_4 = 100;

combine_1(&test_3, &test_4);
printf("After combine_1, test_3 = %d\n",test_3);
.../* Restore original value */
combine_2(&test_3, &test_4);
printf("After combine_2, test_3 = %d\n",test_3);

The output:


$: After combine_1, test_3 = 210
$: After combine_2, test_3 = 210

What if I pass in the same object twice?


... /* Restore the original value of test_3 */
combine_1(&test_3, &test_3);
printf("After second times combine_1, test_3 = %d\n",test_3);
... /* Restore the original value of test_3 */
combine_2(&test_3, &test_3);
printf("After second times combine_2, test_3 = %d\n",test_3);

The output:


$: After second times combine_1, test_3 = 40
$: After second times combine_2, test_3 = 30

The truth is always a surprise; pointers give us so much to love and to hate.

The C99 standard added a new keyword, restrict, which qualifies pointers. It has no visible effect: add it or leave it out, and as far as you can tell the program behaves the same. Yet the standard library uses this keyword in many places. Why?

First, the keyword is written for the compiler: it promises that the object a pointer refers to is accessed only through that pointer. Second, that promise helps the compiler optimize the program. Finally, never use this keyword if you are unfamiliar with it, because breaking the promise is undefined behavior.
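As an illustration, here is the double-add function from earlier with restrict applied. This is a sketch; combine_restrict is a hypothetical name, and whether a given compiler actually optimizes it differently is not guaranteed.

```c
/* restrict promises the compiler that dest and add never refer to the
   same int, so *add may be read once and reused. Calling this function
   with dest == add would break that promise (undefined behavior). */
int combine_restrict(int* restrict dest, const int* restrict add)
{
    *dest += *add;
    *dest += *add;
    return *dest;
}
```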

All that stuff about arrays

Are arrays and pointers the same?

They are not.

Keep in mind that arrays and pointers are different things. But why is the following code correct?


int arr[10] = {0};
int* parr = arr; // no warning, although arr is an array

Again it comes down to context: the compiler sees that arr appears on the right-hand side of the assignment and silently converts it to a pointer to its element type, which is why we so often treat arr as a pointer to the start of the array's memory block.



This is where arrays and pointers differ. Inside the block of code that defines the array, the compiler knows from context that arr is an array, and &arr is a pointer to an array of 10 ints. Assigning arr directly to an int* is accepted only because the usage is so common that it became part of the standard. As the saying goes, "there was no road in the world at first; when enough people walk the same way, it becomes a road." So how should it be written "correctly"?
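The difference between the array and a pointer to its first element is easy to observe with sizeof. A minimal sketch, with hypothetical function names:

```c
#include <stddef.h>

/* Where the array is defined, sizeof reports the size of the whole array. */
size_t size_of_array(void)
{
    int arr[10] = {0};
    return sizeof arr;        /* 10 * sizeof(int) */
}

/* Once the array has decayed to a pointer, only the pointer's size is left. */
size_t size_of_pointer(void)
{
    int arr[10] = {0};
    int* parr = arr;
    return sizeof parr;       /* sizeof(int*) */
}
```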


int (*p)[10] = &arr;

At this point p is a pointer to an array of 10 ints. (*p)[0] yields arr[0], the same element as parr[0]. But what about *(p + 1)? That is not an element of arr at all; it lies past the end of the array. Why?
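One way to see the answer is to compare how far each kind of pointer moves in a single step. The sketch below only computes addresses and never dereferences past the array; the function names are hypothetical.

```c
#include <stddef.h>

/* Distance in bytes covered by one step of a pointer to int. */
size_t step_of_element_pointer(void)
{
    int arr[10] = {0};
    int* parr = arr;
    return (size_t)((char*)(parr + 1) - (char*)parr);   /* sizeof(int) */
}

/* Distance in bytes covered by one step of a pointer to the whole array. */
size_t step_of_array_pointer(void)
{
    int arr[10] = {0};
    int (*p)[10] = &arr;
    return (size_t)((char*)(p + 1) - (char*)p);         /* sizeof(int[10]) */
}
```

A pointer to an array steps over the entire array at once, so one step already lands just past its end.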

That is the difference between an array and a pointer. But why write int (*p)[10] when a plain pointer like parr will do? Here's why:

Back to the passing mechanism from the beginning: when arr is passed by value, only its value (an address) is passed and its context is lost. Inside the function it is just an ordinary pointer; only we programmers know that it refers to the start of a meaningful block of memory. To pass along all the information about the array, besides adding an extra parameter to record its length, we can also pass a pointer to the array, so that a single parameter carries everything. But there is a limit to doing this: arrays with different sizes or different element types have different pointer types.


 int arr_2[5];
 int (*p_2)[5] = &arr_2;
 float arr_3[5];
 float (*p_3)[5] = &arr_3;

As shown above, a pointer to an array must specify both the array's size and its element type, which greatly limits how such pointers can be used.
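A short sketch of a function that receives a pointer to an array, showing both the benefit (the length travels with the type) and the restriction (only one size is accepted). The name sum_five is hypothetical.

```c
#include <stddef.h>

/* The parameter type carries the length, so no separate count is needed.
   Only arrays of exactly 5 ints can be passed. */
int sum_five(int (*p)[5])
{
    size_t count = sizeof *p / sizeof (*p)[0];   /* always 5 */
    int total = 0;
    size_t i;
    for (i = 0; i < count; ++i)
        total += (*p)[i];
    return total;
}
```

Calling it looks like sum_five(&arr_2); passing the address of an int[6] or a float[5] would not match the parameter type.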

This usage shows up more with multidimensional arrays, but overall it is not common. Personally I prefer to use a one-dimensional array to represent a multidimensional one. As said earlier, C is a very concise language without a lot of frills. In essence C has no multidimensional arrays: memory is linear, so even a multidimensional array is laid out as a one-dimensional array.

A word of explanation about multidimensional arrays. A so-called multidimensional array is several arrays of one dimension lower grouped together, each of which is in turn built from arrays of one dimension lower, all the way down to one-dimensional arrays. For example:


int dou_arr[5][3];

This two-dimensional array groups 5 arrays of type int[3] together. How do we point to this array?


int (*p_dou)[3] = dou_arr; // p_dou is an illustrative name

In fact, a multidimensional array is just a combination of several arrays of one dimension lower, which makes indexing more intuitive. Once you really understand how memory is used, you may find that multidimensional arrays bring more restrictions than benefits. To restate the earlier rule: when an array name appears on the right-hand side of an assignment it becomes a pointer to the array's element type, and for a multidimensional array the element type is the array of one dimension lower, so the result is a pointer to that lower-dimensional array type. The explanation is a little convoluted; it becomes much clearer once you write it out yourself.
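Both views of a 5 x 3 table can be sketched side by side: a flat one-dimensional array indexed by hand, and a pointer to rows of 3 ints. The names ROWS, COLS, flat_get and row_get are hypothetical.

```c
#include <stddef.h>

enum { ROWS = 5, COLS = 3 };

/* Reads element (row, col) of a ROWS x COLS table stored in a flat array. */
int flat_get(const int* flat, size_t row, size_t col)
{
    return flat[row * COLS + col];
}

/* Reads the same element through a pointer to rows of COLS ints,
   which is what a two-dimensional array name decays to. */
int row_get(int (*rows)[COLS], size_t row, size_t col)
{
    return rows[row][col];
}
```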

Some operations behave so similarly that we naturally consider them together. Consider the following code:


int arr_3[5] = {1, 2, 3, 4, 5};
int* p_4     = arr_3;

printf("%d == %d == %d ?\n", arr_3[2], *(p_4 + 2), *(arr_3 + 2));

Output: 3 == 3 == 3 ? For arrays and pointers the [] operator gives the same result in most cases: for a pointer, *(p_4 + 2) is equivalent to p_4[2]. In other words, [] is syntactic sugar for pointer arithmetic. Amusingly, 2[p_4] is equivalent to p_4[2], and "Iamastring"[2] == 'm'. That is just for fun; please don't write it in practice, unless it is for an obfuscated-code contest or some special purpose. It should also be said that all of these forms execute with exactly the same efficiency. Pointer arithmetic is not faster than []; such claims belong to the last century. As times have moved on, we should care more about keeping code clean.
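The equivalences above can be checked directly. A small sketch with hypothetical function names:

```c
/* Returns 1 when all four spellings of "element i of p" agree. */
int subscript_forms_agree(const int* p, int i)
{
    return p[i] == *(p + i)
        && *(p + i) == *(i + p)
        && *(i + p) == i[p];
}

/* A string literal is an array too, so it can be indexed directly. */
char third_char(void)
{
    return "Iamastring"[2];   /* 'm' */
}
```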

Here is another strange but useful technique: using pointer arithmetic on a char array to extract data of different types, or using a char* pointer to read the contents of arrays of other types. This is pointer arithmetic put to real use. However, when you manipulate a block of memory through pointers of different types, be careful not to touch meaningless regions or go out of bounds.
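A minimal sketch of storing and extracting an int inside a raw byte buffer with explicit bounds checks. It copies with memcpy instead of casting the byte pointer to int*, a substitution of mine that sidesteps alignment problems; put_int and get_int are hypothetical names.

```c
#include <stddef.h>
#include <string.h>

/* Stores value at byte offset pos inside a raw byte buffer.
   Returns 0 on success, -1 if the value would not fit. */
int put_int(unsigned char* buf, size_t cap, size_t pos, int value)
{
    if (pos > cap || cap - pos < sizeof value)
        return -1;                       /* refuse out-of-bounds writes */
    memcpy(buf + pos, &value, sizeof value);
    return 0;
}

/* Reads an int back from byte offset pos; same bounds rule. */
int get_int(const unsigned char* buf, size_t cap, size_t pos, int* out)
{
    if (pos > cap || cap - pos < sizeof *out)
        return -1;                       /* refuse out-of-bounds reads */
    memcpy(out, buf + pos, sizeof *out);
    return 0;
}
```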

In fact, one of the simplest topics in security research is the overflow attack.

Background: on typical implementations, an array inside a function grows toward the function's return address, possibly with other automatic variables in between. We only need to keep overflowing the array a little further each time until, at some point, the function can no longer return normally. That proves we have reached the area where the function's return address is stored. From there we can do something, such as overwriting the original return address with one of our choosing. This is called an overflow attack.

Memory usage

You may have always thought you were operating on real physical memory. You are not: you are operating on virtual addresses that the operating system has allotted to you. That does not mean we can use an unlimited amount of memory (memory is expensive), and the data is still ultimately stored in physical memory. But with the operating system as mediator, different application windows (possibly of the same program) can appear to use the same region of memory, and once the running programs are short of physical memory, data that is not currently needed is written out to the hard disk and read back when it is needed again. What does this lead to? Suppose you are on Windows and open two instances of the same program:


...
int stay_here;
char tran_to_int[100];
printf("Address: %p\n", (void*)&stay_here);

fgets(tran_to_int, sizeof(tran_to_int), stdin);
sscanf(tran_to_int, "%d", &stay_here);

for(;;)
{
  printf("%d\n", stay_here);
  getchar();
  ++stay_here;
}
...

In this program (an example borrowed from Kazuya Maebashi), the value increases by 1 every time you press Enter. If you run two instances at the same time you will typically find, when address space layout randomization is not in effect, that stay_here has the same address in both, yet operating on each one gives independent results! In one respect this confirms that the addresses are virtual. The point of virtual addresses is that even if one program goes wrong and its memory is corrupted, other processes are not affected. The two read statements in the middle of the program are a good example for understanding the nature of C's input streams, and this combination is the recommended way to read input. Here is a brief explanation:

Roughly speaking, fgets reads a line of input from the stdin stream into the buffer starting at tran_to_int, storing at most sizeof(tran_to_int) - 1 characters plus the terminating '\0'. sscanf then parses the text just read according to the %d format and stores the result in stay_here. This is the stream concept that C keeps emphasizing. Combining the two statements may look like a roundabout way to read one piece of data, but there is a problem we need to know about, a problem with scanf:


 scanf("%d", &stay_here);

This statement skips leading whitespace, reads the digits of the number and stops at the first character that cannot be part of it. What does that mean? It means the newline produced by pressing Enter is left in the input stream, to be read or discarded by the next input operation. This can affect our programs and produce unexpected results. That does not happen when you use the two-statement combination.
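The two-statement combination can be wrapped into a small helper. This is a sketch under the assumption that one number per line is expected; parse_int_line and read_int are hypothetical names, and the 100-byte buffer mirrors the example above.

```c
#include <stdio.h>

/* Parses one int from an already-read line of text.
   Returns 1 on success, 0 if the line does not start with a number. */
int parse_int_line(const char* line, int* out)
{
    return sscanf(line, "%d", out) == 1;
}

/* Reads a whole line (newline included) from in, then parses it.
   Returns 1 on success, 0 on end of input or malformed text. */
int read_int(FILE* in, int* out)
{
    char line[100];
    if (fgets(line, sizeof line, in) == NULL)
        return 0;
    return parse_int_line(line, out);
}
```

Because fgets consumes the newline along with the rest of the line, nothing is left behind for the next read, as long as the line fits in the buffer.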

Functions and function Pointers

In fact, when a function name appears on the right-hand side of an assignment, it represents the function's address:


int function(int argc){ /*...*/
}
...
int (*p_fun)(int) = function;
int (*p_fuc)(int) = &function; // Same meaning as the previous line

The code above declares and initializes a function pointer, p_fun: a pointer to a function that takes one int argument and returns an int.


p_fun(11);
(*p_fun)(11);
function(11);

All three of the calls above do the same thing. We can also use the concept of an array of pointers to functions:


int (*p_func_arr[])(int) = {func1, func2,};

Here func1 and func2 are functions that take an int and return an int; we can then call them by array index.
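A minimal sketch of calling through such a table, with a bounds check on the index. The functions twice and square and the names op_table and apply_op are hypothetical stand-ins for func1 and func2.

```c
#include <stddef.h>

static int twice(int x)  { return 2 * x; }
static int square(int x) { return x * x; }

/* Table of operations; every entry takes an int and returns an int. */
static int (*const op_table[])(int) = { twice, square };

/* Calls entry index of the table, or returns fallback for a bad index. */
int apply_op(size_t index, int x, int fallback)
{
    if (index >= sizeof op_table / sizeof op_table[0])
        return fallback;
    return op_table[index](x);
}
```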

Tips: we often neglect function declarations, which is not a good habit.

In C the compiler does not look very hard at whether a function has been declared; it is even indulgent about it. Inline functions are of course not included here, since an inline function is only usable within its own file.
For example, suppose we call a function somewhere without declaring it:


 CallWithoutDeclare(100); // the argument 100 has type int

The C compiler then guesses that a function called with an int argument must take an int parameter. If the parameter list in the actual definition does not match, the arguments are passed incorrectly (the compiler always believes it is right!). C is a statically typed language, and once the types are wrong, many unexpected results (usually bugs) follow.

The same is true for function pointers.

malloc in C

We see this all the time:


int* pointer = (int*)malloc(sizeof(int));

What's so weird about that? Here's an example:


int* pointer_2 = malloc(sizeof(int));

Which is the right way to write it? Both are correct. To see why, we have to go back to the ancient days of C, when the type void* did not yet exist and malloc returned char*. Programmers then had to add a cast every time they called the function in order to use it correctly. Once standard C appeared the problem went away, because void* converts to and from any object pointer type implicitly. Standard C does not favor casts where they are not needed, so the more orthodox form in C is the second one.

Aside: in C++ the conversion from void* does require a cast, so the second form is not accepted there. But C++ has better ways of allocating memory, so this is not a problem in practice.

Tips:

C's three allocation functions, malloc, calloc and realloc, all carry considerable risk. Always remember to check their results. The best approach is to wrap them, either in macros or in functions.
realloc is one of the most criticized functions because it does too much. It can not only allocate space but also free it. Although that looks convenient, it may do unexpected things behind our back, such as leading to the same space being freed more than once. Instead, wrap it and trim its powers so that it can only grow or shrink an existing heap block.
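The wrapping advice above might look like the following sketch. The names checked_malloc and checked_resize are hypothetical, and the policy of rejecting size 0 and NULL blocks is one possible choice, not the only one.

```c
#include <stddef.h>
#include <stdlib.h>

/* Allocates size bytes; size 0 is rejected so NULL always means failure. */
void* checked_malloc(size_t size)
{
    if (size == 0)
        return NULL;
    return malloc(size);
}

/* Resizes an existing block only: it never allocates from NULL and never
   frees through size 0. On failure *block is left untouched and 0 is
   returned; on success *block is updated and 1 is returned. */
int checked_resize(void** block, size_t new_size)
{
    void* moved;
    if (block == NULL || *block == NULL || new_size == 0)
        return 0;
    moved = realloc(*block, new_size);
    if (moved == NULL)
        return 0;
    *block = moved;
    return 1;
}
```

Keeping the old pointer until realloc has succeeded also avoids the classic leak of writing p = realloc(p, n) and losing the block on failure.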

Pointers and structs


typedef struct tag{
    int value;
    long vari_store[1];
}vari_struct;

At first glance it looks like a perfectly ordinary structure:


...
vari_struct vari_1;
vari_struct* vari_p_1 = &vari_1;
vari_struct* vari_p_2 = malloc(sizeof(vari_struct));

That works as expected, but there are always people who come up with odd ways to use it:


int     what_spa_want = 10;
vari_struct* vari_p_3 = malloc(sizeof(vari_struct) + sizeof(long)*what_spa_want);

What does that mean? This is called a variable-length structure. Even when we index beyond the declared range of the structure, as long as we stay within the allocated space we are not out of bounds. what_spa_want says how much extra space is needed beyond the size of one structure; that space is used to store values of type long. Since the allocated memory is contiguous, the array vari_store can be indexed into it directly. C compilers do not check array bounds, and for an array arr of N elements the standard permits the expression &arr[N], but remember that reading or writing arr[N] itself is illegal. This usage is not a party trick: C99 made it part of the standard as the flexible array member (declared with empty brackets, long vari_store[];), and it is used in practice.
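For comparison, here is a minimal sketch of the same idea written with the C99 flexible array member. The type flex_struct and the function flex_create are hypothetical names, and storing the element count in the structure is my addition.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

typedef struct {
    size_t count;        /* number of usable elements in store */
    long   store[];      /* C99 flexible array member, must be last */
} flex_struct;

/* Allocates a flex_struct with room for count longs, all set to 0.
   Returns NULL on failure; release the result with free(). */
flex_struct* flex_create(size_t count)
{
    flex_struct* fs;
    size_t i;
    if (count > (SIZE_MAX - sizeof *fs) / sizeof fs->store[0])
        return NULL;                         /* size would overflow */
    fs = malloc(sizeof *fs + count * sizeof fs->store[0]);
    if (fs == NULL)
        return NULL;
    fs->count = count;
    for (i = 0; i < count; ++i)
        fs->store[i] = 0;
    return fs;
}
```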

Understanding of memory

In memory allocation we use malloc to allocate and free to release, but is that really what allocating and releasing mean? When malloc is called, it typically obtains a region of memory from the operating system through brk() or mmap() and hands out pieces of it where they are needed. Its counterpart is free, which works much like deleting something from a hard disk. In fact:


int* value = malloc(sizeof(int)*5);
...
free(value);
printf("%d\n", value[0]); // undefined behavior, shown only to make the point

Why does it seem that I can still use this memory after free? Because free only updates the allocator's bookkeeping to mark the block as available for reuse; it does not destroy the current contents, which stay there until something else writes over them. Accessing freed memory is nevertheless undefined behavior. This raises a couple of points:

It makes bugs even harder to find. Suppose two pointers, p1 and p2, point to the same memory. If we call free(p1); on one of them but forget that another pointer still refers to the block, we create a very serious security risk. The risk is extremely hard to track down, because the bug does not show itself at the time; it may crash your program at some unrelated point in the future.
It can appear to simplify some tasks, such as freeing the nodes of a linked list.

To sum up, C is a double-edged sword.
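On the linked-list point above: the shortcut presumably meant is reading a node's next field after the node has already been freed. A minimal sketch of releasing a list without that shortcut follows; the node type and the functions list_push and list_free are hypothetical.

```c
#include <stdlib.h>

typedef struct node {
    int value;
    struct node* next;
} node;

/* Pushes a new node holding value onto the front of *head.
   Returns 1 on success, 0 if allocation fails. */
int list_push(node** head, int value)
{
    node* n = malloc(sizeof *n);
    if (n == NULL)
        return 0;
    n->value = value;
    n->next = *head;
    *head = n;
    return 1;
}

/* Frees every node. next is saved before free(), so no freed memory is
   ever read, and *head is set to NULL so no dangling pointer remains.
   Returns the number of nodes released. */
size_t list_free(node** head)
{
    size_t freed = 0;
    node* cur = *head;
    while (cur != NULL) {
        node* next = cur->next;
        free(cur);
        cur = next;
        ++freed;
    }
    *head = NULL;
    return freed;
}
```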

