Detailed memory layout in Go

  • 2020-06-03 06:43:16
  • OfStack

1. Go language memory layout

Imagine you have a structure that looks like this.


type MyData struct {
        aByte   byte
        aShort  int16
        anInt32 int32
        aSlice  []byte
}

So what is this structure? Basically, it describes how data is laid out in memory. What does that mean, and how does the compiler see it? Let's take a look. First, let's use reflection to examine the fields in the structure.

2. On reflection

Here is some code that uses reflection to find out the field sizes and their offsets (where they are in memory relative to the beginning of the structure). Reflection can tell us what the compiler thinks about types, including structures.


// First ask Go to give us some information about the MyData type
typ := reflect.TypeOf(MyData{})
fmt.Printf("Struct is %d bytes long\n", typ.Size())
// We can run through the fields in the structure in order
n := typ.NumField()
for i := 0; i < n; i++ {
        field := typ.Field(i)
        fmt.Printf("%s at offset %v, size=%d, align=%d\n",
            field.Name, field.Offset, field.Type.Size(),
            field.Type.Align())
}

In addition to the offset and size of each field, I also printed the alignment of each field, which I will explain later. The results are as follows:


Struct is 32 bytes long
aByte at offset 0, size=1, align=1
aShort at offset 2, size=2, align=2
anInt32 at offset 4, size=4, align=4
aSlice at offset 8, size=24, align=8

aByte is the first field in our structure, at offset 0. It uses 1 byte of memory.

aShort is the second field. It uses 2 bytes of memory. The odd thing is that its offset is 2, not 1. Why is that? The answer is alignment: a CPU accesses 2-byte values best at addresses that are multiples of 2 (a "2-byte boundary"), 4-byte values best at a 4-byte boundary, and so on up to the CPU's natural integer size, which is 8 bytes (64 bits) on modern CPUs.

On some older RISC CPUs, access to misaligned numbers causes a fault: on some UNIX systems this is a SIGBUS, which stops your program (or the kernel). Some systems can catch these faults and fix up the access: your code will run, but slowly, because the operating system runs extra code to fix up each one. I believe Intel and ARM CPUs simply handle misalignment on the chip; maybe we'll test this in a future article, along with any performance impact.
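The rule above can be sketched as "round each offset up to the next multiple of the field's alignment". The following is a simplified model under that assumption, not the Go compiler's actual code; the names field, alignUp and layout are made up for illustration, and the sizes and alignments are the ones reported by the reflection output.

```go
package main

import "fmt"

// field is a hypothetical description of one struct member:
// its name, its size in bytes and its required alignment.
type field struct {
	name        string
	size, align uintptr
}

// alignUp rounds offset up to the next multiple of align.
// align must be a power of two.
func alignUp(offset, align uintptr) uintptr {
	return (offset + align - 1) &^ (align - 1)
}

// layout places each field at the next suitably aligned offset and
// returns the offsets plus the total size, which is padded to the
// largest alignment seen.
func layout(fields []field) (offsets []uintptr, total uintptr) {
	var off, maxAlign uintptr = 0, 1
	for _, f := range fields {
		off = alignUp(off, f.align)
		offsets = append(offsets, off)
		off += f.size
		if f.align > maxAlign {
			maxAlign = f.align
		}
	}
	return offsets, alignUp(off, maxAlign)
}

func main() {
	// Sizes and alignments taken from the reflection output (64-bit).
	fields := []field{
		{"aByte", 1, 1},
		{"aShort", 2, 2},
		{"anInt32", 4, 4},
		{"aSlice", 24, 8},
	}
	offsets, total := layout(fields)
	for i, f := range fields {
		fmt.Printf("%s at offset %d\n", f.name, offsets[i])
	}
	fmt.Printf("Struct is %d bytes long\n", total)
}
```

With these inputs the model gives offsets 0, 2, 4 and 8 and a total of 32 bytes, matching the reflection output: the only padding is the single byte between aByte and aShort.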

In any case, alignment is why the Go compiler skips one byte before placing the field aShort, so that it sits on a 2-byte boundary. Because of this, we can put another field into the structure without it taking up more memory. Here is a new version of our structure with a new field anotherByte immediately after aByte.


type MyData struct {
       aByte       byte
       anotherByte byte
       aShort      int16
       anInt32     int32
       aSlice      []byte
}

Running the reflection code again, we can see that anotherByte fits exactly into the free space between aByte and aShort. It sits at offset 1, and aShort is still at offset 2. Now is a good time to look at the mysterious align value I mentioned earlier. It tells us, and the Go compiler, how each field needs to be aligned.


Struct is 32 bytes long
aByte at offset 0, size=1, align=1
anotherByte at offset 1, size=1, align=1
aShort at offset 2, size=2, align=2
anInt32 at offset 4, size=4, align=4
aSlice at offset 8, size=24, align=8
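The same check can be made without reflection. The sketch below uses unsafe.Sizeof and unsafe.Offsetof to compare the two versions of the structure; the type names Original and WithExtraByte and the helper functions are made up here so that both versions can live in one program.

```go
package main

import (
	"fmt"
	"unsafe"
)

// Original is the first version of the structure (hypothetical name).
type Original struct {
	aByte   byte
	aShort  int16
	anInt32 int32
	aSlice  []byte
}

// WithExtraByte is the second version, with anotherByte placed in the
// padding after aByte (hypothetical name).
type WithExtraByte struct {
	aByte       byte
	anotherByte byte
	aShort      int16
	anInt32     int32
	aSlice      []byte
}

// sizes returns the size in bytes of each version.
func sizes() (original, withExtra uintptr) {
	return unsafe.Sizeof(Original{}), unsafe.Sizeof(WithExtraByte{})
}

// extraOffsets returns the offsets of anotherByte and aShort in the
// second version.
func extraOffsets() (anotherByte, aShort uintptr) {
	var b WithExtraByte
	return unsafe.Offsetof(b.anotherByte), unsafe.Offsetof(b.aShort)
}

func main() {
	o, w := sizes()
	fmt.Printf("Original: %d bytes, WithExtraByte: %d bytes\n", o, w)
	ab, as := extraOffsets()
	fmt.Printf("anotherByte at offset %d, aShort at offset %d\n", ab, as)
}
```

Both sizes should come out equal, because the new field occupies a byte that was previously padding.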

3. Look at memory

But what does our structure look like in memory? Let's see if we can find out. Let's start by building an instance of MyData and populating it with some values. I chose values that should be easy to find in memory.


data := MyData{
        aByte:   0x1,
        aShort:  0x0203,
        anInt32: 0x04050607,
        aSlice:  []byte{
                0x08, 0x09, 0x0a,
        },
}

Now for some code to access the bytes that make up the structure. We want to take an instance of this structure, find its address in memory, and print out the bytes in that memory.

We use the unsafe package to help us do this. It lets us bypass the Go type system and convert a pointer to our structure into a pointer to a 32-byte array, which is the in-memory data that makes up our structure.


dataBytes := (*[32]byte)(unsafe.Pointer(&data))
fmt.Printf("Bytes are %#v\n", dataBytes)

Let's run the above code. This is the result. The first field of our structure, aByte, is highlighted between ** markers. This is hopefully what you expect: the single byte aByte = 0x01 at offset 0.


Bytes are &[32]uint8{**0x1**, 0x0, 0x3, 0x2, 0x7, 0x6, 0x5, 0x4, 0x5a, 0x5, 0x1, 0x20, 0xc4, 0x0, 0x0, 0x0, 0x3, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x3, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}

Next, let's look at aShort. It is at offset 2 and its length is 2. If you remember, aShort = 0x0203, but the bytes displayed are in reverse order. This is because most modern CPUs are little-endian: the least-significant byte of the value appears first in memory.


Bytes are &[32]uint8{0x1, 0x0, **0x3, 0x2**, 0x7, 0x6, 0x5, 0x4, 0x5a, 0x5, 0x1, 0x20, 0xc4, 0x0, 0x0, 0x0, 0x3, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x3, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}

The same thing happens with anInt32 = 0x04050607. The least-significant byte comes first in memory.


Bytes are &[32]uint8{0x1, 0x0, 0x3, 0x2, **0x7, 0x6, 0x5, 0x4**, 0x5a, 0x5, 0x1, 0x20, 0xc4, 0x0, 0x0, 0x0, 0x3, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x3, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}
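The byte order can be reproduced independently of the struct with the encoding/binary package. This is a small sketch; the helper name littleEndianBytes is made up for illustration.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// littleEndianBytes encodes aShort followed by anInt32,
// each with its least-significant byte first.
func littleEndianBytes(aShort uint16, anInt32 uint32) []byte {
	buf := make([]byte, 6)
	binary.LittleEndian.PutUint16(buf[0:2], aShort)
	binary.LittleEndian.PutUint32(buf[2:6], anInt32)
	return buf
}

func main() {
	// Prints: 03 02 07 06 05 04
	fmt.Printf("% x\n", littleEndianBytes(0x0203, 0x04050607))
}
```

These six bytes are the same as the bytes at offsets 2 to 7 of the dump above, which is what we expect on a little-endian machine.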

4. The mysterious slice

Now what do we see? This is aSlice = []byte{0x08, 0x09, 0x0a}, which takes 24 bytes at offset 8. I don't see my sequence 0x08, 0x09, 0x0a anywhere in those bytes. What's going on here?


Bytes are &[32]uint8{0x1, 0x0, 0x3, 0x2, 0x7, 0x6, 0x5, 0x4, **0x5a, 0x5, 0x1, 0x20, 0xc4, 0x0, 0x0, 0x0, 0x3, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x3, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0**}

The Go reflect package has the answer. A slice in Go is represented by the following structure. It begins with a pointer, Data, to the memory that holds the data in the slice; then comes Len, the length of the useful data in that memory, and Cap, the size of that memory.


type SliceHeader struct {
        Data uintptr
        Len  int
        Cap  int
}

If we give this structure to our reflection code, we get the following offsets and sizes. The Data pointer and the two lengths are each 8 bytes long, with 8-byte alignment.


Struct is 24 bytes long
Data at offset 0, size=8, align=8
Len at offset 8, size=8, align=8
Cap at offset 16, size=8, align=8

If we look again at the memory behind our structure, we can see that the data is at address 0x000000c42001055a. After that, we see that Len and Cap are both 3, which is the length of our data.


Bytes are &[32]uint8{0x1, 0x0, 0x3, 0x2, 0x7, 0x6, 0x5, 0x4, **0x5a, 0x5, 0x1, 0x20, 0xc4, 0x0, 0x0, 0x0**, 0x3, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x3, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}
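To make the reading of those 24 bytes explicit, here is a sketch that decodes the three words of the slice header straight from the raw bytes of the structure. It assumes a 64-bit little-endian machine, as in the dump above; the function name decodeHeader is made up for illustration.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"unsafe"
)

type MyData struct {
	aByte   byte
	aShort  int16
	anInt32 int32
	aSlice  []byte
}

// decodeHeader reads the three 8-byte words of the slice header from
// the raw bytes of a MyData value, starting at offset 8.
// Assumption: 64-bit little-endian platform, so MyData is 32 bytes.
func decodeHeader(data *MyData) (ptr, length, capacity uint64) {
	raw := (*[32]byte)(unsafe.Pointer(data))
	ptr = binary.LittleEndian.Uint64(raw[8:16])
	length = binary.LittleEndian.Uint64(raw[16:24])
	capacity = binary.LittleEndian.Uint64(raw[24:32])
	return ptr, length, capacity
}

func main() {
	data := MyData{
		aByte:   0x1,
		aShort:  0x0203,
		anInt32: 0x04050607,
		aSlice:  []byte{0x08, 0x09, 0x0a},
	}
	ptr, length, capacity := decodeHeader(&data)
	// The address differs from run to run; Len and Cap should both be 3.
	fmt.Printf("Data=%#x Len=%d Cap=%d\n", ptr, length, capacity)
}
```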

We can access these data bytes directly with the following code. First let's access the slice header directly, and then print out the memory that the data points to.


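// View the slice's own header in place (a pointer, not a copy), as the
// unsafe package documentation recommends for header conversions.
// Note: reflect.SliceHeader is deprecated as of Go 1.20.
dataslice := (*reflect.SliceHeader)(unsafe.Pointer(&data.aSlice))
fmt.Printf("Slice data is %#v\n",
        (*[3]byte)(unsafe.Pointer(dataslice.Data)))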

Here is the output:


Slice data is &[3]uint8{0x8, 0x9, 0xa}
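On newer Go versions the same three pieces of information are available without reflect.SliceHeader. The sketch below assumes Go 1.20 or later, where unsafe.SliceData was added; the helper name headerParts is made up for illustration.

```go
package main

import (
	"fmt"
	"unsafe"
)

// headerParts returns the three parts of a slice header: the pointer to
// the backing memory, the length and the capacity.
// Assumes Go 1.20+ for unsafe.SliceData.
func headerParts(s []byte) (data *byte, length, capacity int) {
	return unsafe.SliceData(s), len(s), cap(s)
}

func main() {
	s := []byte{0x08, 0x09, 0x0a}
	p, n, c := headerParts(s)
	fmt.Printf("Data=%p Len=%d Cap=%d\n", p, n, c)

	// Rebuild a slice over the same memory from the pointer and length.
	fmt.Printf("Slice data is %#v\n", unsafe.Slice(p, n))
}
```

This keeps the pointer typed as *byte rather than uintptr, so the garbage collector continues to see it as a reference to the backing memory.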

Conclusion

That's the end of this look at Go memory layout. I hope this article helps you learn or use Go. If you have any questions, please leave a message.

