Post

Malloc

This post is only about high-level design discussion, not about programming sharing.

1. System Structure

When we use malloc(), free(), realloc(), and calloc() in C, we are doing space allocate status changing on heap. Let’s first have a glance of the position of heap in the main memory (Fig.1).

Fig.1 Memory Structure

The heap is a consecutive space. Its address grows from low to high when extending. When calling malloc(), we split enough space on the heap as a block for malloc(). The block is composed of elements like header and footer for the block size and extra info (block alloc status), and payload for the content. By using the header, we could distinguish blocks from blocks. Detailed reasons will be discussed in part2.

Fig.2 Heap and Block Structure

For convenience, the starting address of the splited block has an alighment, which is defined by the system and normally a multiple of 16-byte.

To further optimize for performance, especially considering SIMD (Single Instruction, Multiple Data) instructions, a 16-byte alignment is often preferred. This reduces the number of memory access operations required for certain types of data processing. (?) (From ChatGPT)

2. Implementation Strategies

2.1 Keeping Track of Block Size and Status

Alloc a header for these information in the block. Each size and status requires one byte to record, as shown in Fig.3.

Fig.3 Block (Ver.1)

But considering we only need 1 bit for current block status (0 for freed, 1 for alloced), and we could not always use up all the space alloced for block size, it is smart to spare 1 bit of block size to record the status info. Then the block structure becomes like Fig.4.

Fig.4 Block (Ver.2)

2.2 Coalescing Contiguous Free Blocks

Problem

Fig.5 Problem Need Coalescing

Solution

Every time when we free a block, we check the contiguous blocks if they are freed and coalesce with them as possible.

Another problem: It is easy to know the position of header of next block (just the next word size of current block’s end). So we could get its status. But how could we know the previous block’s status?

Answer: By adopting a footer!

Based on the Fig.4, we got this:

Fig.6 Block (Ver.3)

2.2 Keeping Track of Free Blocks

Implicit List

When the heap is divided into blocks with block size recorded, we could see it as an implicit list. The implicit list holds both free blocks and alloced blocks on the heap.

Fig.7 Implicit List

Adopt the Fig.6 Block Structure. Every block has a minimum size of 16 (header + footer) + 8 (minimum payload).

Explicit Free Lists

Use a double linked list to explicitly record all the free blocks.

Considering that a free block need space to store the pointers, so the minimum size of a block should be 16 (header + footer) + 16 (2 pointers).

Fig.8 Block Structure of Explicit List

Segregated Free Lists

A set of size-classified explicit free lists.

Fig.9 Segregated List

2.3 Split Block when Given More

When we are allocating a structure that is smaller than the free block it will be placed in, we could split the free block into a malloced one and a smaller free one.

2.4 Ways to Pick a Free Block for Allocation

  • First Fit

    Search the free list, use the first block whose space is larger than what we need.

  • Best Fit

    Search the entire free list, find the smallest block whose space is larger than what we need.

  • Better Fit

    Search for the first fit in the free list, then search the next x blocks for a better fit one if exist.

3. Performance Measurement

Time Efficiency

Throughput

Definition: Number of completed requests per unit time.

Goal: Given seuqnce of malloc and free requests, maximize throughput and peak memory utilization.

Sometimes the goal is conflicting, then find a balance based on need.

Memory Utilization

Internal Fragmentation

Caused by:

  • Overhead of maintaining heap data structures

    E.g. Header, Footer

  • Padding for alignment purposes

  • Explicit policy decisions

    E.g. To return a big block to satisfy a small request

External Fragmentation

Caused by:

Fig.10 External Fragmentation

4. Lab Grossary & Performance Improvement Strategies

In this lab, we are given the starter code of Implicit List Malloc. What we need to optimize is listed in the following table.

Fig.11 External Fragmentation

The parts concerning Explicit Free List, Segregated Free List and Better Fit Algorithm have already been discussed. Then I would like to share the motivation and strategies of implementing the eliminating footers and decreasing minimum block size.

In part2.2 of this article, I mentioned that the footer is only used when freeing a block because we need to know the alloc status of previous block. If the previous block is alloced, then there is nothing we need to do, which means that we could try to eliminate the footer in alloced blocks to decrease the internal fragmentation.

The idea comes that if we could store the alloc status of previous block in current block, then we could know the status of the previous block! It just takes 1 bit as current alloc status does. Very efficient.

Fig.12 Footer Elimination 1, Block Strcuture

Currently, we should keep the minimum size to be 16 (header + footer) + 16 ( 8 minimum payload + 8 saving space for the block to be freed).

Decreasing Minimum Blocks Size (Also Eliminating All Footers)

When a malloc request requests for less than 32 bytes, the strategy we used to adopt is expanding the block size, making it be able to store header, footer and 2 pointers. But there is an optimization.

Remember that the footer is only need when coalescing, and the footer has the same content as header. When the block size is bigger than the 32 (the old minimum block size), there is space for all elements. But when the block size is smaller than that, given that we have a fixed alignment of 16, then the block size must be 16. If known the previous block is a minimum block, then we could directly cauculate the position of the header of previous block!

There we go! If we spare a bit of the header to store if the previous block is a minimum block, then we could easily decreasing the minimum block sizes (from 32 to 16), effectively decreasing the internal fragmentation!

What’s more, we should use a single linked list to store the free minimum blocks. Because the minimum block’s structure is different from normal blocks. It has only the header and a pointer instead of header, footer and 2 pointers.

Fig.13 Min Block

This post is licensed under CC BY 4.0 by the author.