The use of a structure of arrays for storing dynamic information in games programming.

Joshua Ellis
Feb 18, 2020
7 min read

Updated: Mar 4, 2020

A change of topic.

This topic is going to be quite different from the norm today in that I'm going to start talking about more technical aspects of my work. I've wanted to diversify my article content for a while and today's topic is something I've had to think about this week in relation to the IsoEngine (our game project for those out of the loop).

Specifically, I'll be talking about the use of static data storage techniques (arrays in this case) to create, store and destroy dynamic game information, and how while it may seem counter productive, is beneficial for certain games in many ways. As a bonus I will also be pairing this with a look at a data storage pattern, which I believe is more efficient, called a structure of arrays (SoA), to hopefully improve the speed of your game or game engine.

Ditching lists.

So this might seem like strange advice. It seems obvious that the structure you should use for a collection of dynamically created and destroyed objects, that need to be stored and accessed sequentially should be a list. The argument for this of course being that not only do you have handy creation functions bundled in with the structure, but you are also only allocating memory on a need to store basis. Great right? Well... not always... You see, adding and removing things from dynamic data storage does incur a small performance cost, and when performing hundreds of actions per frame, that cost can be important to the performance of your game.

To mitigate this I employ something that does have some draw-backs, but is beneficial where it matters for games in particular. I use arrays as an alternative, because for my purposes the pros outweigh the cons. I'm working on a voxel engine, which means working on thousands of points of information for the dynamic environment we've created. As such, while it would be nice to have more flexible environment code, I need my batch operations for creating terrain to be incredibly fast.

One of the downsides of this implementation is mainly that due to the fixed nature of the data structure, there is an object limit, and also memory for the objects I want to spawn in has to be allocated at all times. This can be bad for low spec machines if you intend for the user to decide how many game objects are active at any one time. I argue however, that in some ways this can be beneficial because it forces you to stick to an object target and base other performance decisions around it, which is great for stability. This can also be worked around to some extent by using a hybrid approach, such as saving data into smaller arrays that are then referenced by a container list, therefore getting some benefits of both.

You might also ask how you decide if a space is being used or how an object might get spawned/deleted, and the answer is simple. Have a Boolean flag in the first array for each of your game objects, that can be used to tell other areas of the code if it's in use or not. A technique I employ regularly, because when deleting objects it simply requires me to set this flag to false and for other areas of the code to not interact with that slot. No data removal necessary. Fast! Just remember to overwrite every data element for that row when spawning new objects in and to set the flag to in use (true).

These speedups for spawning and deletion make it more than worth it for me. Plus having a hard object cap really does force me to make concrete design decisions for certain systems, but obviously you have to carefully evaluate this system for your own uses and use it selectively. So now that I've made my case for using static data structures, let me help take it a step further and make the case of how you should organise your new object data system.

Structure of arrays (SoA).

When looking at how we're actually going to store our game data there are two common storage patterns I'll be comparing. We'll start with the more common and conventional array of structures (AoS), which is thought of as generally more intuitive and better practice in software design. Then we'll look at the alternative, the structure of arrays (SoA), which I believe while less common is better suited to games development patterns.

As the name suggests, the array of structures is a pattern a sequence of struct containers within an array, which is a very common way of organising data about game objects. Each space acts like a game object container with data about an object being stored together. This pattern of storing object data in one collection is useful when it comes to good software engineering principles and in helping us understand our code and what it does. It makes sense that intuitively an object and its properties are grouped and accessible from the same space. This pattern does however come with some drawbacks, which I'll explore.

AoS drawback 1.

One is concerning how we typically design external game code in relation to this object array. A common practice in engines is to iterate through data of the same type for all game objects, such as changing position of objects due to gravity on your physics cycle. In this case we would have to access this data by looping through an array containing our objects, then access the struct and finally access the variable itself. Three levels of access for one piece of data across all of our objects can add up fast, especially when dealing with potentially hundreds of them.

AoS drawback 2.

Another issue is storing data in this way can create a bit of additional memory allocation overhead, especially when creating lots of objects of the same type. Each struct wrapper within the array incurs a memory cost, which again for hundreds of objects can add up quickly. Not ideal when our game/engine could hit well into the GB in RAM usage.

AoS drawback 3.

Lastly is more of a hardware optimisation concern. Storing data that is commonly used together (such as a list of health values) in memory variables that are not contiguous can lead to longer read times in CPU cache memory. It can also lead to inefficient use of RAM and storage space. Look up disk thrashing, cache line optimisations and memory fragmentation for more information on the subject (as each are topics I could write about in great detail on their own). You don't need to know the details of these, but just be conscious of the fact that these issues exist and there are ways to mitigate them.

So this all brings me on my solution to some of these issues, which may seem a little counter intuitive at first, and again, seems to fly in the face of good code design principles. I first was exposed to the concept I'm about to talk about when doing research for a primitive game engine I built from scratch. Performance was obviously a major concern, as you really have to push the hardware to its limits.

Array of structures.

In essence an array of structures, is simply taking the structure of arrays concept and flipping it on it's head. Put simply, we have a single structure that encapsulates all data for the same type of game object, such as all characters, and within that have an array for each unique piece of data relating to that type of object, such as health. So long as the sizes of each array are consistent, we can use the same index across the arrays to access data about one game object, with this index being akin to a game object ID.

SoA benefit 1.

Now you'd be right in asking why bother taking this approach. Well taking the issues I highlighted before I'll go through each one and specify how it solves the issue. First of all is the problem of accessing data values. Here we only have one struct, and a set of arrays. Now this keeps the amount of levels we have to drill through to get to the data we want the same; but notice how since there's only one struct, we technically don't need it. It's a formality to keep our code organised at this point, and if performance was a serious issue we could reduce the complexity of accessing our data down to just an array access. There is no right or wrong in code design, despite what best practice is taught, as at the end of the day, all that matters is your use case and specific needs in relation to your project. Don't be afraid of flexibility like this.

SoA benefit 2.

A simple benefit to explain, is that due to the lack of game object wrappers, our memory overhead for game object data has become a whole lot smaller. For a hundred objects we now only have one struct, and that's if we decide to keep it at all. Arrays on their own are the most memory efficient organisational structure we can have besides a primitive data type such as an integer, so having more of these in relation to an AoS is not a big deal. Now the difference for individual objects is tiny, but again with potentially hundreds, it is noticeable.

SoA benefit 3.

The final problem we initially had with data storage is also solved. Since we are now storing data that is accessed in bulk on the games systems side sequentially, we've essentially eliminated issues regarding data fragmentation. This benefits long-term storage, memory accesses, cache-line performance, and can improve speeds of your game in many hardware areas. Great if trying to target low-spec systems or laptops with limited capacities or speeds in these areas.

SoA bonuses.

Just like with the previous technique, we can still iterate across common variables in bulk, which makes system and engine programming a breeze. All we have to get used to is iterating over the containers contents rather than the container. Actually quite simple once you get used to it. There's also a nice little bonus, which has just occurred to me, in that you can also easily access shared information as part of this container, such as object counts. This would be similar to the static keyword in C++ programming, which is an extremely useful thing to have.

Summary.

So that about wraps it up! Hopefully this brief overview has given you a solid idea of why to look at this implementation as an alternative. It's not revolutionary by any means, but I believe it's something that is very often overlooked, and should definitely be something every beginner looks into as an alternative to convention. It would be great to know if you have had any success implementing this pattern with your projects, and I'd love to hear from you. See you next week all!

The use of a structure of arrays for storing dynamic information in games programming.

Recent Posts

Comments