[Abandoned post] Modern memory management (what are structs in Crystal?)

I can never get around to writing this blog post. It's an ambitious one, with great examples and schematics planned. Anyway, maybe someone will still find the beginnings of it useful.


This is a writeup in layman's terms about how memory is allocated and managed in most modern programming languages. Its target audience is users of the Crystal programming language, but the parts specific to that language will be on a purple background. Expect some omissions and inaccuracies for the purpose of explanation.


Let's start with the stack. This is a region within the big block of memory owned by each program. It has a limited size, and memory there is allocated sequentially and very neatly. The main reason it's called a stack is that it stores the current chain of nested function calls, with main() at the start, all the way to the current function at the end (yes, I want to use "end" instead of "top" of the stack, otherwise it's confusing). Ever heard of a stack overflow? That typically happens when there has been too much recursion: too many nested function calls, whose stack entries at some point bust through the allotted size of the stack.

Anyway... whenever a function is called, some information is written to the end of the stack: the information on how to return to the previous function, the function's arguments, and its local variables (the exact order depends on the platform, and some of these may live in CPU registers instead). The memory layout of each function's stack entry is predefined at compilation. So the function immediately knows where every relevant variable is, and it can execute correctly using the memory as needed. When it's done, this whole stack entry is popped, and execution goes back to the previous function, with its data still sitting there.
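
As a rough illustration, here is the classic recursive Fibonacci function, sketched in C rather than Crystal because C keeps the stack behaviour easy to see. `fib_max_depth` is a hypothetical helper added only for this example; it counts how many stack entries are alive at the deepest point of the computation.

```c
#include <assert.h>

/* Naive recursive Fibonacci. Each call gets its own stack entry holding
   its argument n, its locals a and b, and the place to return to. */
unsigned long fib(unsigned int n) {
    assert(n <= 40);              /* keep the naive version fast and overflow-free */
    if (n < 2) {
        return n;                 /* innermost call: nothing more is pushed */
    }
    unsigned long a = fib(n - 1); /* pushes a new entry, popped when it returns */
    unsigned long b = fib(n - 2); /* typically lands in the space just freed */
    return a + b;
}

/* Hypothetical helper: the number of stack entries alive at the deepest
   point of computing fib(n), counting the outermost call. */
unsigned int fib_max_depth(unsigned int n) {
    if (n < 2) {
        return 1;
    }
    unsigned int left = fib_max_depth(n - 1);
    unsigned int right = fib_max_depth(n - 2);
    return 1 + (left > right ? left : right);
}
```

With these definitions, fib(10) is 55. Computing it takes 177 calls in total, but never more than 10 stack entries exist at the same time, because each entry is popped as soon as its call returns.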

TODO: fibonacci

So... the stack is everything that one needs to make great things, right? Well, not really. Let's say you are working with a huge array of complex data entries. First of all, you probably won't have a fixed, predefined number of entries, so you need to be able to grow the array. And maybe you also need to pass it to another function. The stack does not work well with these use cases. Passing the array as an argument would mean copying all the data onto the end of the stack again so that the called function can work with it from its own stack entry. Even without involving arrays, you may not want to copy even one of the aforementioned "complex data entries" when passing it to functions. These copy operations add up!
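
To make the copying concrete, here is a small sketch in C. The `Entry` type and both functions are made up for this example; the only point is that an argument passed by value is copied in full for every call.

```c
#include <assert.h>

/* Hypothetical "complex data entry": 1000 ints stored inline. */
typedef struct {
    int values[1000];
} Entry;

/* Takes its argument by value: the caller's Entry is copied in full,
   and the function only ever touches that private copy. */
int bump_first(Entry e) {
    e.values[0] += 1;
    return e.values[0];
}

/* Hypothetical helper that nests calls. Every level receives yet
   another full copy of the Entry in its own stack entry. */
int bump_nested(Entry e, int levels) {
    assert(levels >= 0);
    if (levels == 0) {
        return e.values[0];
    }
    e.values[0] += 1;
    return bump_nested(e, levels - 1);
}
```

Calling `bump_first(e)` returns the bumped value but leaves the caller's `e` untouched, which shows that the function was working on a copy. With `bump_nested(e, 3)`, four such copies sit on the stack at the deepest point.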

This is where the heap comes in! The heap is basically the rest of the program's memory. It can keep growing as long as the system has memory to give, and it does not need to have all data tightly packed or have the size of the data known in advance. For all we know, data there could be randomly scattered across the whole RAM module. Of course, in reality the runtime and the OS keep track of everything and have to keep the data in some order, but we don't care about that. At any time, the program can request a contiguous memory region of a particular size on the heap, and the runtime will provide it. When the program is done with that memory, it needs to tell the runtime to reclaim it so it can be used for something else.

But the program also needs to be told where that memory region is. It receives only the memory address of the start of that memory region. That's a pointer. The program only needs to remember the pointer and keep in mind the size of the provided memory region.
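
Here is a minimal sketch of that request/remember/release cycle in C, using the standard `realloc` and `free` functions. The `IntList` type and its two functions are hypothetical names for this example, not something from a real library.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical growable array: the program remembers the pointer
   and keeps the size of the region in mind. */
typedef struct {
    int   *items;     /* address of the start of the heap region */
    size_t length;    /* entries currently in use */
    size_t capacity;  /* entries the region has room for */
} IntList;

/* Appends a value, requesting a bigger region when the current one is
   full. Returns 0 on success, -1 if no memory could be provided. */
int int_list_push(IntList *list, int value) {
    assert(list != NULL);
    if (list->length == list->capacity) {
        size_t new_capacity = list->capacity == 0 ? 4 : list->capacity * 2;
        /* realloc(NULL, n) behaves like malloc(n) */
        int *grown = realloc(list->items, new_capacity * sizeof *grown);
        if (grown == NULL) {
            return -1;               /* the old region is still valid */
        }
        list->items = grown;         /* the region may have moved */
        list->capacity = new_capacity;
    }
    list->items[list->length++] = value;
    return 0;
}

/* Tells the runtime the region can be reused for something else. */
void int_list_free(IntList *list) {
    assert(list != NULL);
    free(list->items);
    list->items = NULL;
    list->length = 0;
    list->capacity = 0;
}
```

Starting from an empty `IntList list = {0};`, pushing ten values grows the region from 4 to 8 to 16 entries. Passing `&list` around costs only the size of one pointer, no matter how many entries the list holds.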

TODO: snake

As mentioned, the program would also need to pay a lot of attention to when to release that memory, which is why garbage collectors are being used more and more often. They keep track of those memory regions and automatically release them when there are no more pointers to them.
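
The idea of "release it when there are no more pointers to it" can be sketched by hand with a reference count. This is only a toy under loud assumptions: `Counted` and its functions are hypothetical, and a real garbage collector does this bookkeeping for you. Many collectors, including the Boehm collector that Crystal uses, don't count at all and instead scan memory to find which regions are still reachable.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical heap block that counts how many pointers refer to it. */
typedef struct {
    int refs;
    int payload;
} Counted;

static int live_blocks = 0;   /* bookkeeping for this demo only */

/* Allocates a block with one pointer to it (the returned one). */
Counted *counted_new(int payload) {
    Counted *c = malloc(sizeof *c);
    if (c == NULL) {
        return NULL;
    }
    c->refs = 1;
    c->payload = payload;
    live_blocks++;
    return c;
}

/* Records that one more pointer to the block exists. */
Counted *counted_retain(Counted *c) {
    assert(c != NULL && c->refs > 0);
    c->refs++;
    return c;
}

/* Records that a pointer went away; frees the block with the last one. */
void counted_release(Counted *c) {
    assert(c != NULL && c->refs > 0);
    if (--c->refs == 0) {
        live_blocks--;
        free(c);
    }
}

int counted_live_blocks(void) {
    return live_blocks;
}
```

If two pointers share one block, releasing the first leaves the block alive, and only releasing the second hands the memory back.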


Think of it like this. By default, data is stored contiguously with whatever other data is in the type. Value types (e.g. structs) start off on the stack. A reference type (e.g. a class) is always a pointer to the heap.

If a struct is inside a class, it means the class is a pointer to the heap, and the struct is stored on the heap, contiguously with the other data of the class.
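
A rough C analogy of the notes above, assuming a plain C struct stands in for a Crystal struct and a pointer to a heap block stands in for a Crystal class instance. `Point`, `Player` and the functions are hypothetical names for this sketch; Crystal itself handles the allocation and the pointer for you.

```c
#include <assert.h>
#include <stdlib.h>

/* Plays the role of a Crystal struct: a plain value. */
typedef struct {
    int x;
    int y;
} Point;

/* Plays the role of the data behind a Crystal class. It is only ever
   reached through a pointer to the heap, and the Point is embedded. */
typedef struct {
    int   health;
    Point position;   /* stored inline, contiguous with health */
} Player;

/* Allocates a Player on the heap and returns the pointer to it. */
Player *player_new(int x, int y) {
    Player *p = malloc(sizeof *p);
    if (p == NULL) {
        return NULL;
    }
    p->health = 100;
    p->position.x = x;
    p->position.y = y;
    return p;
}

/* Returns 1 if the embedded Point lies inside the Player's own heap
   region, 0 otherwise. */
int position_is_inline(const Player *p) {
    assert(p != NULL);
    const char *start = (const char *)p;
    const char *field = (const char *)&p->position;
    return field >= start && field + sizeof(Point) <= start + sizeof(Player);
}
```

Copying `hero->position` into a local `Point` variable makes an independent value on the stack, so changing the copy leaves the player alone. Copying the `Player *` itself only copies the address, so both pointers see the same heap data.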

Structs are stored "right here", and classes only store the pointer "right here"; their data is always on the heap.

"Right here" is the stack by default.

But once you've made the jump to the heap, there's no going back to the stack.
