
On malloc and brk/sbrk

April 8, 2018 - Tagged as: en, c.

Read about malloc() on Linux in lecture notes, or even in non-ancient books like The Linux Programming Interface (published in 2010), and you’ll see a lot of mentions of the brk() system call and its sbrk() wrapper. They then move on to mmap(), and at that point you probably start wondering how the two interact.
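For reference, the simplified picture those texts paint looks roughly like the sketch below: an allocator that hands out memory just by moving the program break. The name bump_alloc is made up for illustration, there is no free(), and this is not how any real malloc() is written.

```c
#define _DEFAULT_SOURCE // for sbrk() under strict -std=c99/c11
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>

// Hypothetical bump allocator: every allocation moves the program break up.
// Only a sketch of the textbook model, nothing is ever freed.
void* bump_alloc(size_t size)
{
    // Refuse sizes that would overflow the rounding or the intptr_t cast below
    if (size > (size_t)INTPTR_MAX - 15)
    {
        return NULL;
    }

    // Round up to a multiple of 16 so successive blocks stay 16 bytes apart
    size_t rounded = (size + 15) & ~(size_t)15;

    // sbrk() returns the previous break, which is the start of the new block
    void* block = sbrk((intptr_t)rounded);
    if (block == (void*)-1)
    {
        return NULL; // the break could not be moved
    }
    return block;
}
```

Two calls in a row, such as bump_alloc(10) followed by bump_alloc(10), return addresses 16 bytes apart as long as nothing else moves the break in between.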

The problem is, because mmap() lets you map stuff in your program’s address space, it seems like you can easily break brk/sbrk by mapping stuff right after the current program break. Here’s a program that does exactly that:

// mmap() something right after program break, then increase it by malloc-ing stuff

#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

int main()
{
    long page_size = sysconf(_SC_PAGESIZE);
    printf("Page size:             %ld\n", page_size);

    void* brk = sbrk(0);
    printf("Current program break: %p\n", brk);

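    // mmap() right after program break. Without MAP_FIXED the address is only
    // a hint, so the assert below checks that the kernel actually honored it.
    void* mmap_ret = mmap(
            brk, page_size, PROT_WRITE | PROT_READ, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    printf("mmap() returned:       %p\n", mmap_ret);

    assert(mmap_ret == brk);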

    // Allocate until brk changes
    void* new_brk = sbrk(0);
    void* ret = malloc(page_size);
    for (;;)
    {
        if (!ret)
        {
            printf("malloc() failed\n");
            break;
        }

        if (new_brk != brk)
        {
            printf("brk changed\n");
            break;
        }

        if (ret > brk)
        {
            printf("ret > brk\n");
            break;
        }

        ret = malloc(page_size);
        new_brk = sbrk(0);
    }

    printf("New brk:               %p\n", new_brk);
    printf("ret:                   %p\n", ret);

    return 0;
}

So we read the program break, mmap() a page right after it, then malloc() page-sized chunks until the program break changes, malloc() fails, or malloc() returns an address above the original break. The idea is that if malloc() actually incremented the program break, the heap would grow into the mmap()ed area.

Of course this does not happen. If you run this, you’ll see something like:

Page size:             4096
Current program break: 0x1efb000
mmap() returned:       0x1efb000
ret > brk
New brk:               0x1efb000
ret:                   0x7fdb72d8f010

So the malloc() implementation doesn’t depend on being able to move the program break. When the break is stuck, it uses mmap() instead, probably with NULL as the addr parameter, to get a fresh location in the address space.
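The kind of mmap() call I have in mind looks like the sketch below. The helper name map_anywhere is made up, and this is only a guess at the general shape of such a call, not code taken from any malloc() implementation.

```c
#define _DEFAULT_SOURCE // for MAP_ANONYMOUS under strict -std=c99/c11
#include <stddef.h>
#include <sys/mman.h>

// Hypothetical helper: ask the kernel for `size` bytes of zero-filled memory.
// Passing NULL as addr lets the kernel pick a free spot in the address space,
// so the result has nothing to do with where the program break is.
void* map_anywhere(size_t size)
{
    void* p = mmap(NULL, size, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}
```

Memory obtained this way is given back with munmap(p, size) rather than by moving the break down.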

Of course this is all obvious if you’re already familiar with this stuff. sbrk() just can’t grow the program break into a region that is already mapped. Try adding these lines to the program above, right before the return statement:

    // Try to increment program break manually
    void* sbrk_ret = sbrk(page_size);
    printf("sbrk_ret:              %p\n", sbrk_ret);
    printf("New brk:               %p\n", sbrk(0));

You’ll see that sbrk() returns (void*)-1, which printf() shows as something like 0xffffffffffffffff, and the program break does not change.
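Since (void*)-1 is the documented failure value and sbrk() sets errno to ENOMEM when it fails, the check can be wrapped up as in the sketch below. The name checked_sbrk is made up for illustration.

```c
#define _DEFAULT_SOURCE // for sbrk() under strict -std=c99/c11
#include <errno.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

// Hypothetical wrapper: try to move the program break by `increment` bytes.
// Returns the previous break on success, or NULL after reporting the error.
void* checked_sbrk(intptr_t increment)
{
    errno = 0;
    void* old_brk = sbrk(increment);
    if (old_brk == (void*)-1)
    {
        // On failure the break is left where it was and errno says why
        fprintf(stderr, "sbrk(%ld) failed: %s\n", (long)increment, strerror(errno));
        return NULL;
    }
    return old_brk;
}
```

Calling checked_sbrk(page_size) in place of the plain sbrk(page_size) above should print the reason to stderr instead of leaving you to decode the pointer value.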

Really, what confused me was all these over-simplified lecture notes and book chapters.