C language Struct Hack notes

  • 2020-04-02 02:18:24
  • OfStack

Recently, in a compiler code-generation experiment, I needed to translate some Java programs into C. For example:


int [] array;
array = new int[10];
System.out.println(array.length); //10

The natural idea is to translate this code into C:

int * array; // a bare "int array[];" declaration is not supported in C
array = (int*)malloc(sizeof(int)*10);
printf("%zu\n",sizeof(array)/sizeof(int)); // 1 on a 32-bit target (2 on a typical 64-bit one)

Unfortunately, this is wrong. malloc does allocate a contiguous block on the heap, but array is only a pointer to that block, so sizeof(array) is the size of the pointer itself (the same as sizeof(int) on a 32-bit target), not the size of the allocation. It's not the same as the following:

int array[10];
printf("%zu\n",sizeof(array)/sizeof(int)); // 10

Here array is a real array rather than a pointer: its type covers the entire contiguous storage for 10 elements, so sizeof yields the size of the whole region. But when the array name is passed as an argument to a function, it decays to a pointer to its first element, and we are back to the same problem.
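The decay described above can be sketched as follows. This is only an illustration: the names ARRAY_LEN, size_seen_by_callee and sum are hypothetical and do not come from the original experiment.

```c
#include <stddef.h>

/* Works only where `a` is a real array, not a pointer. */
#define ARRAY_LEN(a) (sizeof(a) / sizeof((a)[0]))

/* Inside the callee the argument is just an int*, so sizeof reports
   the pointer's size, whatever array the caller passed. */
size_t size_seen_by_callee(int *arr)
{
    return sizeof(arr);
}

/* Usual workaround: the caller passes the element count explicitly. */
long sum(const int *arr, size_t len)
{
    long total = 0;
    for (size_t i = 0; i < len; i++)
        total += arr[i];
    return total;
}
```

With int a[10] in the caller, ARRAY_LEN(a) is 10, while size_seen_by_callee(a) is sizeof(int *) regardless of how long the array is. That is why the length has to travel with the pointer in some form.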

What should we do?

A search on StackOverflow shows that ANSI C has no direct way to find the size of an allocation from a pointer to it. But Windows provides a function that reports the size of the block a pointer refers to [malloc.h]:

_msize: returns the size (in bytes) as an unsigned integer.


size_t _msize(
void *memblock
);


However, due to operating system policy, the actual size assigned may be larger than specified.

Under Linux, the allocator also records the actual size of a block, in a bookkeeping cell located just before the address that malloc returns. Let's take a look at the contents of that cell:


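//test.c
#include <stdio.h>
#include <stdlib.h>

int main(){
 int * p;
 int i;
 int size;
 for (i=1;i<11;i++) // column labels 1..10
  printf("%d ",i);
 printf("\n");
 for (i=0;i<10;i++){ // requests space for 0..9 ints
  p = (int*)malloc(sizeof(int)*i);
  // peek at the cell just before the block; this relies on the
  // allocator's internal layout and is not portable
  size = *(int*)((char*)p-sizeof(int));
  printf("size:%d ",size);
  free(p);
 }
 printf("\n");
 return 0;
}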

$gcc test.c
$./a.out
1  2  3  4  5  6  7  8  9  10
17 17 17 17 25 25 33 33 41 41


It seems that Linux's allocation strategy does not give a one-to-one correspondence between this recorded size and the number of elements: the value grows in steps and is not the size that was requested. It turns out that Linux has a function similar to _msize [malloc.h]:


int * array;
size_t size;
array = (int*)malloc(50);
size = malloc_usable_size(array);
printf("%zu\n",size); // at least 50; may be larger than requested

But malloc.h is not standard C, so we continue to look for a portable method. After a lot of searching, I finally found a coding trick called the struct hack. As mentioned earlier, declaring int a[] with no size is illegal for an ordinary variable in C, but it is allowed as the last member of a struct:


typedef struct array{
 int size;
 int free;
 int buf[];
 }array,*Tiger_array;


This feature, the flexible array member, was added to the C language in C99. With it, the size can be recorded in the same block every time space is allocated for an array. When the size is needed, simply read it back from the struct:


Tiger_array ta;
ta = (Tiger_array)malloc(sizeof(array)+100*sizeof(int));
ta->size = 100;
ta->free = 0;

Note that the size allocated should be sizeof(array) plus the space required by the elements, measured in bytes (the element count times sizeof(int)).
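Putting the pieces together, here is a minimal sketch of how the translated new int[n] and array.length might look. The helper names array_new and array_length are hypothetical (they are not part of the original experiment), and zero-filling with calloc is an assumption made to mirror Java's default initialization.

```c
#include <stdint.h>
#include <stdlib.h>

typedef struct array {
    int size;   /* number of elements in buf */
    int free;   /* kept from the original layout */
    int buf[];  /* flexible array member (C99), must be last */
} array, *Tiger_array;

/* Hypothetical helper: allocate the header and n zeroed ints in one
   block. Returns NULL if n is negative or the allocation fails. */
Tiger_array array_new(int n)
{
    if (n < 0)
        return NULL;
    /* guard against overflow of the byte count */
    if ((size_t)n > (SIZE_MAX - sizeof(array)) / sizeof(int))
        return NULL;

    /* sizeof(array) covers only the header, so add the elements */
    Tiger_array ta = calloc(1, sizeof(array) + (size_t)n * sizeof(int));
    if (ta == NULL)
        return NULL;
    ta->size = n;
    ta->free = 0;
    return ta;
}

/* Hypothetical counterpart of Java's array.length. */
int array_length(const array *a)
{
    return a->size;
}
```

A translated program would then call array_new(10) where the Java source says new int[10], read array_length(ta) where it says array.length, index elements as ta->buf[i], and release the whole block with a single free(ta).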

So much for that question.

