---
.title = "Book - Data Oriented Design",
.author = "Martin Ashby",
.date = @date("2024-01-26T20:19:31Z"),
.layout = "single.shtml",
.custom = {"comments": true},
---

I recently read [Data Oriented Design](https://www.dataorienteddesign.com/dodbook/) by Richard Fabian.

The book is about software design, specifically in games, but its principles are partially relevant to other domains as well.

The main takeaway seems to be that 'Object Oriented Design' fails to deliver some of the properties that software engineers expect it to give their projects, and that it generally hurts performance because of its inherent incompatibility with how modern CPUs work. The remedy is 'Data Oriented Design', which advocates _separating_ data from behaviour, normalizing your data in the same way that you would in a relational database system, and processing in bulk rather than jumping back and forth between tasks.
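
As a rough illustration of the difference (my own sketch, not code from the book), here is the same particle data laid out both ways in C++. The `Particle` and `Particles` types and the `integrate` function are made-up names for this example only.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Object-oriented layout: every entity bundles all of its fields together,
// so a loop that only needs positions still drags `alive` through the cache.
struct Particle {
    float x, y;
    float vx, vy;
    bool alive;
};

// Data-oriented layout: one contiguous array per field, much like columns
// in a database table. Index i identifies the same particle in every array.
struct Particles {
    std::vector<float> x, y;
    std::vector<float> vx, vy;
};

// Behaviour lives in a free function that processes the whole table in bulk.
void integrate(Particles& p, float dt) {
    // All columns must describe the same number of particles.
    assert(p.x.size() == p.y.size());
    assert(p.x.size() == p.vx.size());
    assert(p.x.size() == p.vy.size());

    for (std::size_t i = 0; i < p.x.size(); ++i) {
        p.x[i] += p.vx[i] * dt;
        p.y[i] += p.vy[i] * dt;
    }
}
```

The loop in `integrate` touches only the four arrays it needs and walks them linearly, which is the kind of access pattern the book argues modern CPUs handle well. Whether it is actually faster for a given workload is something to measure rather than assume.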

There were some paragraphs on _why_ object oriented programming is bad for performance. The answer is that it's bad for pipelining and branch prediction, mainly because of C++ virtual functions: each call is an indirect branch, and the CPU has to follow pointers (typically from the object to its vtable) to find out which code to execute, which makes pipelining ineffective. The same problem exists in theory in other programming languages which use inheritance.
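
To show what that looks like in practice, here is a minimal sketch of my own (again, not from the book) that computes the same total two ways: through virtual calls on a mixed container, and through one plain array per kind of shape. All of the type and function names are hypothetical.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Object-oriented version: a base class with a virtual function.
struct Shape {
    virtual ~Shape() = default;
    virtual double area() const = 0;
};

struct Square : Shape {
    double side;
    explicit Square(double s) : side(s) {}
    double area() const override { return side * side; }
};

struct Rect : Shape {
    double w, h;
    Rect(double w_, double h_) : w(w_), h(h_) {}
    double area() const override { return w * h; }
};

// Each iteration follows a pointer to the object and then makes an
// indirect call whose target depends on the object's dynamic type.
double total_area_virtual(const std::vector<std::unique_ptr<Shape>>& shapes) {
    double total = 0.0;
    for (const auto& s : shapes) {
        assert(s != nullptr);
        total += s->area();
    }
    return total;
}

// Data-oriented version: plain data, one table per kind of shape.
struct SquareData { double side; };
struct RectData { double w, h; };

struct ShapeTables {
    std::vector<SquareData> squares;
    std::vector<RectData> rects;
};

// Each loop runs the same code for every element, so there is no
// per-element dispatch for the CPU to predict.
double total_area_tables(const ShapeTables& t) {
    double total = 0.0;
    for (const auto& s : t.squares) total += s.side * s.side;
    for (const auto& r : t.rects) total += r.w * r.h;
    return total;
}
```

Both functions return the same result for the same shapes. The second trades the flexibility of a single mixed container for loops whose control flow is fixed up front, which is the trade the book is advocating.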
 
Although it was an interesting read, I don't think it'll have much impact on how I think about the software that I work on. At my company we are working on a web application, which does not have the same stringent low-level performance requirements as video games do. We also use an external relational database rather than holding local application state, and the code we write keeps local state to a minimum. There were some interesting points about database normal forms and how they are useful for extensibility which are relevant to me, and I'll be taking those on board. In future my work might involve more data processing, where performance is more of a concern, so I might end up revisiting this book.