Implementation code and analysis based on malloc and free function

  • 2020-04-01 21:40:06
  • OfStack

The functions malloc and free for memory management should be familiar to programmers using the C language. Some time ago, I heard that some IT companies use "to realize a simple function of malloc" as the interview question. Recently, I was reviewing K&R, which was introduced above, so I spent some time to study IT carefully. After all, the problem is secondary to understand the realization of the idea, improve the technology is the main. This paper mainly introduces the realization ideas of malloc and free. The blue text is the core of my personal thinking. In addition to the code description, there are some explanations on K&R, highlighted in green.

Before looking at the implementation of K&R in chapter 8, section 5, let's look at the implementation of alloc/afree in chapter 5, section 4, although the main purpose of this code is to show the address operation.


alloc implementation  
#define ALLOCSIZE 10000
static char allocbuf[ALLOCSIZE];    
static char *allocp = allocbuf;    
char *alloc(int n)
{
    if(allocbuf+ALLOCSIZE - allocp >= n) {
        allocp += n;
        return alloc - n;
    } else
        return 0;
}
void afree(char *p)
{
    if (p >= allocbuf && p<allocbuf + ALLOCSIZE)
        allocp = p;
}

Disadvantages of this simple implementation:

1. As a memory resource, allocbuf is actually pre-allocated and may be wasted.

2. The order of allocation and release is similar to the stack, i.e. "last in, first out".

This implementation, while relatively rudimentary, still provides an idea. If these two shortcomings can be eliminated, the ideal malloc/free can be achieved.

Relying solely on address arithmetic to locate is what limits the flexibility of allocation-collection, requiring that used and unused parts be separated into two adjacent areas by a certain address. In order for the two regions to interleave each other, and even include some unallocated address Spaces, you need to use Pointers to join the same memory Spaces together to form a linked list, so that you can handle a series of memory Spaces with discontinuous addresses. But why connect only the free space and not the space in use? This question may have been asked without thinking about it because of an intuitive analogy between the two in the picture, which is easy because it is not necessary. The former are linked to each other so that they can traverse all free space at memory allocation time and reinsert it when the used space is recycled using free(). As for the Spaces in use, since we already know their addresses when we allocate them, we can tell free() directly when we recycle them instead of traversing them like malloc().

The < img border = 0 style = "DISPLAY: block; MARGIN - LEFT: auto; MARGIN - RIGHT: auto "Alt =" "SRC =" / / files.jb51.net/file_images/article/201305/201305040929168.png "width = 424 height = 287 >

Now that we're talking about linked lists, maybe someone with a little knowledge of data structures will immediately write down a struct to represent a memory region that contains a pointer to the next memory region, but what about the rest of the struct? As the memory area to be allocated, the size is variable, if it is declared as a member of the struct is obviously inappropriate; If declared as a pointer to some other region, this again seems inconsistent with the intuitive representation above. (also can be realized, of course, to do so, it seems to be between both of the above, the management structure and the actual distribution of space, in the end I will specifically discuss the implementation method) as a result, there are still separate from the control structure and the free space, but keep them in memory address, adjacent form below form, and by this characteristic, we can use pointer arithmetic of control structure pointer to locate the corresponding memory area:

The < img border = 0 style = "DISPLAY: block; MARGIN - LEFT: auto; MARGIN - RIGHT: auto "Alt =" "SRC =" / / files.jb51.net/file_images/article/201305/201305040929169.png "width = 420 height = 157 >

Correspondingly, the control information is defined as Header:


typedef long Align;
union header { 
    struct {
        union header *ptr; 
        unsigned size; 
    } s;
    Align x;
};
typedef union header Header;

The reason for using a union instead of a struct directly is for address alignment. This is a long alignment, and the union's x will never be used.

Thus, the main job of malloc is to manage these headers and the memory blocks that follow them.


malloc() 
static Header base;
static Header *freep = NULL;
void *malloc(unsigned nbytes)
{
    Header *p, *prevp;
    unsigned nunits;
    nunits = (nbytes+sizeof(Header)-1)/sizeof(Header) + 1;
    if((prevp = freep) == NULL) { 
        base.s.ptr = freep = prevp = &base;
        base.s.size = 0;
    }
    for(p = prevp->s.ptr; ;prevp = p, p= p->s.ptr) {
        if(p->s.size >= nunits) { 
            if (p->s.size == nunits)  
                prevp->s.ptr = p->s.ptr;
            else {
                p->s.size -= nunits;
                p += p->s.size;
                p->s.size = nunits;
            }
            freep = prevp;
            return (void*)(p+1);
        }
        if (p== freep) 
            if ((p = morecore(nunits)) == NULL)
                return NULL; 
    }
}

The actual space allocated is an integer multiple of the Header size, and there is an extra header-sized space for the headers. But intuitively it's not nunits = (nbytes+sizeof(Header)-1)/sizeof(Header) + 1, right? Wouldn't it be nice to use (nbytes+sizeof(Header))/sizeof(Header)+1? Actually, if you use the latter, when nbytes+sizeof(Header)%sizeof(Header) == 0, you allocate one more Header size space, so you have to subtract 1 from the parentheses to meet the requirement.

  The first call to malloc() creates a degenerate list base with only one space of size 0 and points to itself. Freep is used to identify an element of the free list, which may change every time you look it up. The intermediate lookup and allocation process is the basic linked list operation, which calls morecore() to obtain more memory space when there is no appropriate free space in the free list. The final return value is the first address of the free space, the address after the Header, which is consistent with the library function.


morecore() 
#define NALLOC 1024    
static Header *morecore(unsigned nu)
{
    char *cp;
    Header *up;
    if(nu < NALLOC)
        nu = NALLOC;
    cp = sbrk(nu * sizeof(Header));
    if(cp == (char *)-1)    
        return NULL;
    up = (Header *)cp;
    up->s.size = nu;
    free((void *)(up+1));
    return freep;
}

Morecore () requests more free space from the system and joins. Since SBRK () is called, the system overhead is large. To avoid the number of calls of morecore() itself, an NALLOC is set. If the space of each application is less than NALLOC, the space of the size of NALLOC is applied, so that subsequent malloc() need not call morecore() every time. For SBRK (), we'll look at that later.

Here's a surprise: malloc() calls morecore(), and morecore() calls free()! It may seem strange to see this for the first time, because the conventional wisdom is that malloc() and free() are supposed to be separate from each other. But think again. Free () is an extension of the free list, and malloc(), when the free list is insufficient, converts it into a part of the free list after the system requests more memory space. So it makes sense for malloc() to call free() to do the rest. Following this idea is the implementation of free(). Before that, there are a few details about morecore() itself:

1. SBRK () returns -1 if the system also has no space to allocate. Cp is of type char *. Char is unsigned on some machines. A cast is required.

2. The return value of the morecore() call looks strange. Don't worry, freep will change in free(). This return value is also used for the compactness of the judgment in malloc(), the re-assignment of p = freep.


free() 
void free(void *ap)
{
    Header *bp,*p;
    bp = (Header *)ap -1; 
    for(p=freep;!(bp>p && bp< p->s.ptr);p=p->s.ptr)
        if(p>=p->s.ptr && (bp>p || bp<p->s.ptr))
            break;    
    if (bp+bp->s.size==p->s.ptr) {    
        bp->s.size += p->s.ptr->s.size;
        bp->s.ptr = p->s.ptr->s.ptr;
    } else
        bp->s.ptr = p->s.ptr;
    if (p+p->s.size == bp) {     
        p->s.size += bp->s.size;
        p->s.ptr = bp->s.ptr;
    } else
        p->s.ptr = bp;
    freep = p;
}

  Free () first locates the bp corresponding to the ap to be freed relative to the free list, finds its nearest previous and next free space, or finds the first and last elements of the free list when it is in front or behind the entire free space. Note that because of the way malloc() is allocated and the way free() is reclaimed (mentioned in a moment) is merged, it is guaranteed that the list of the entire free space will always rise from the lower address one by one, with the free space at the highest address going back to the first free space at the lower address.

After positioning, the Header of the corresponding space is modified according to the adjacency between the space to be freed and the adjacent space. The juxtaposition of the two ifs allows bp to combine with both high and low address free space (if both are adjacent), or to merge one or the other, or not to merge.

When all three parts of the code are done (note that in the same source file, SBRK () requires #include < Unistd. H>) , you can use it. Note, of course, that the name conflicts with the function of the same name in stdlib.h and you can rename it yourself.

Looking at the source code for the first time, many implementations might not have thought of it: the structure and alignment of Header padding, the rounding of Spaces, the operation and initialization of linked lists (boundary conditions), the invocation of free() by malloc(), the order of linked list addresses implicitly guaranteed by malloc() and free(), and so on. In addition, I will attach the two questions mentioned in the previous article and some simple thoughts on supplementary questions :

1. The Header is separated from the free space, and the Header contains a pointer to its free space

This is not necessary, and the algorithm needs to be changed accordingly. At the same time, since the headers and free space are no longer adjacent, the space obtained by SBRK () should also contain portions of the headers, and the memory distribution may be more trivial. Of course, this may also have the benefit of managing the list with other data structures, such as hashing by size, which makes it faster to find.

2. About SBRK ()

SBRK () is also a library function, which can make the heap to the direction of the stack growth, specific reference: BRK (), SBRK () usage details.

3. Ways to improve

The search for free space is linear, the search process in memory allocation can be regarded as a loop first adaptive algorithm, in some cases may be slow; If you set up another data structure, such as a hash table, to index different sizes of space, you can certainly speed up the lookup itself and implement some algorithms, such as best matches. But the cost of faster lookups is that modifying the index takes extra time, which is a tradeoff.

The minimum allocation space in morecore() is the macro definition, which can be passed as a parameter in practice, and the minimum allocation limit can be set as needed.


Related articles: