
I think the focus on ECS when talking about data-oriented design largely misses the point of what data-oriented design is all about. Focusing on a _code_ design pattern is the antithesis of _data_-oriented design. Data-oriented is about breaking away from the obsession with taxonomy, abstraction and world-modeling and moving towards the understanding that all software problems are data transformation problems.

It's that all games essentially (and most software in general) boil down to: transform(input, current_state) -> output, new_state

Then, for some finite set of platforms and hardware there will be an optimal transform to accomplish this and it is our job as engineers to make "the code" approach this optimal transform.
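That shape can be sketched in a few lines of C++ (types here are hypothetical stand-ins for a real game's data, purely for illustration):

```cpp
#include <utility>

// Hypothetical types standing in for a real game's input, state and output.
struct Input  { int move_x; };
struct State  { int player_x; };
struct Output { int draw_x; };

// The whole game as one pure transform: (input, state) -> (output, new state).
std::pair<Output, State> transform(const Input& in, const State& s) {
    State next{s.player_x + in.move_x};
    return {Output{next.player_x}, next};
}
```

Everything else — rendering, physics, AI — is then just the internals of this one function, and the engineering question becomes how close the code gets to the optimal version of it for the target hardware.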



> Data-oriented is about breaking away from the obsession with taxonomy, abstraction and world-modeling

Something about this does not sit well with me.

Data is absolutely worthless if it is generated on top of a garbage schema. Having poor modeling is catastrophic to any complex software project, and will be the root of all evil downstream.

In my view, the principal reason people hate SQL is because no one took the time to "build the world" and consult with the business experts to verify if their model was well-aligned with reality (i.e. the schema is a dumpster fire). As a consequence, recursive queries and other abominations are required to obtain meaningful business insights. If you took the time to listen to the business explain the complex journey that - for instance - user email addresses went down, you may have decided to model them in their own table rather than as a dumb string fact on the Customers table with zero historization potential.

Imagine if you could go back in time and undo all those little fuck-ups in your schemas. With the power of experience and planning ahead, you can do the next best thing.


You're right, when I mentioned "taxonomy, abstraction and world-modeling" I meant it as it pertains to code organization in the traditional OOP/OOD sense, where it's generally about naming classes, creating inheritance hierarchies, etc. Data-oriented design is _absolutely_ concerned with the data schema. I would, however, disagree that the focus should be on "building the world" with your schema. To me this means creating the schema based off of some gut/fuzzy feeling you get when the names of things all end up being real-world nouns. To me creating a good schema is less about world building than it is about having the exact data that you need, well normalized and in a format that works well with the algorithm you want to apply to it.


I don't think ECS necessarily means a "build the world" approach. I think it's best kept at the level of a data structure with some given set of operations: create entity, destroy entity, add component to entity, remove component from entity, get component on entity, and the big one -- query / iterate through component combinations on all entities that have them.
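That operation set fits in a page of code. A toy sketch (the `Registry` name and type-keyed storage are illustrative assumptions, not any particular library; real implementations store components in contiguous arrays for cache locality rather than per-entity maps):

```cpp
#include <memory>
#include <typeindex>
#include <typeinfo>
#include <unordered_map>

using Entity = int;

// Toy ECS registry: each entity maps component type -> component instance.
class Registry {
    std::unordered_map<Entity,
        std::unordered_map<std::type_index, std::shared_ptr<void>>> entities_;
    Entity next_ = 0;
public:
    Entity create() { entities_[next_]; return next_++; }
    void destroy(Entity e) { entities_.erase(e); }

    template <class C> void add(Entity e, C c) {
        entities_[e][typeid(C)] = std::make_shared<C>(std::move(c));
    }
    template <class C> void remove(Entity e) { entities_[e].erase(typeid(C)); }
    template <class C> C* get(Entity e) {
        auto it = entities_[e].find(typeid(C));
        return it == entities_[e].end() ? nullptr
                                        : static_cast<C*>(it->second.get());
    }
    // The big one: iterate every entity that has both A and B.
    template <class A, class B, class F> void each(F f) {
        for (auto& [e, comps] : entities_)
            if (auto* a = get<A>(e))
                if (auto* b = get<B>(e)) f(e, *a, *b);
    }
};
```

Systems then become plain functions built on `each`, which is what keeps the whole thing feeling like "arrays, structs and for loops" rather than a framework.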

Just like arrays and structs, it's yet another data structure to be used in the general data-oriented approach, one that becomes useful because those creation / destruction patterns come up in games and adding and removing components is a great way to express runtime behavior as well as explore gameplay.

The "focus" on ECS may just come from it being an interesting space as of late vs. arrays, structs and for loops that have been around for ever, but it's mostly just an acknowledgement of common array, struct and for loop patterns that arise. There's also a lot out there about the systems part and scheduling and event handling but I think it's almost best to start out with simple procedural code (that then has access to the aforementioned data structure) and let patterns collect pertinent to the game in question.

One big aspect I personally dig is if you establish an entity / data schema you get scene saving, undo / redo, blueprint / prefab systems that are all quite useful and basically necessary if you want to collaborate with artists and game designers on a content-based game, and empowers them to express large spaces of possibilities without editing the code.


People love SQL because it is truly an incredibly bad language. Poor to no ability to abstract, no composability, a grammar so convoluted it makes C++ look logical, and so on. The relational model is a beautiful thing but its power is obscured by how awful the main gateway to it is.


Schema design is THE problem data-oriented programming is focused on. It's saying: let's design our data structures in memory and on disk such that they exist to solve the problem at hand. I think you're talking about the same thing.


Or to circle back, yet again, to Fred Brooks:

"Show me your flowcharts and conceal your tables, and I shall continue to be mystified. Show me your tables, and I won't usually need your flowcharts; they'll be obvious."

- Fred Brooks


> moving towards the understanding that all software problems are data transformation problems.

But this understanding is fundamentally, deeply wrong, in the same way that civil engineering based approaches to software engineering are wrong for most software applications.

That is: yes, all software systems are data transformation systems, but most software problems are not “how do I produce the system most narrowly tailored to the present requirements” but more often “how to engineer a system for success with the pace and kind of change that we can expect over time in this space”.

(Now, games, particularly, are both pushing the limits of hardware and fairly static, so making them narrowly-tailored, poorly adaptable static works is often not wrong. But that doesn't generalize to all, or even most, software.)


That is how you think about the software you write. System evolution is only one aspect. Most patterned OO codebases I have come across were *not* engineered for evolution. Sure there were some classes you could implement or replace, but the complexity was not paid back later.

Design principles can be applied to all implementation mechanisms.


This fits my gut feeling, every time I see an ECS system, that videogame design has gotten stuck in a local-maximum abstraction/pattern. Often what they really want is a Monadic abstraction of a data/state transformation process, but they are often stuck in languages ("for performance reasons") that make it hard to impossible to get good Monadic abstractions. So instead they use the hammers they have to make nails of the abstractions they can get. ECS feels to me like a strange attempt to build Self-like OO dynamic prototypes in a class-based OO language, and that's almost exactly what you would expect for an industry only just now taking baby steps outside of C/C++.

C# has some good tools to head towards that direction (async/await is a powerful Monadic transformer, for instance; it's not a generic enough transformer on its own of course, but an interesting start), but as this article points out most videogames work in C# today still has to keep the back foot in C/C++ land at all times and C/C++ mentalities are still going to clip the wings of abstraction work.

(ETA: Local maxima are still useful of course! Just that I'd like to point out that they can also be a trap.)


>"for performance reasons"

The quotes imply that this is a bad reason, but in soft realtime systems you often want complete control of memory allocation.

Even in the case of something like Unity--in order to give developers the performance they want--they've designed a subset of C# they call high performance C# where memory is manually allocated.

In most cases if you're using an ECS, it's because you care so much about performance that you want to organize most of your data around cache locality. If you don't care about performance, something like the classic Unity Game Object component architecture is a lot easier to work with.


Yea, you're right, I think the previous poster seriously underestimates videogames as performance-critical (and performance-consistent!) apps. In the modern days of desktop Java and C# (and even more in web dev) the vast majority of coders just don't come across the need to "do everything you need to do" in 33ms or less, consistently.


That's 16.66ms, if you want to hit 60FPS. Or 11.11ms if you're targeting 90FPS for VR applications.

Big respect to the work (and the people behind that work) that goes into getting modern AAA games to hit these targets.


I'm not implying it is a bad reason with the quotes, I'm trying to imply that it is a misguided reason (even if it has good intentions).

The "rule" that C/C++ is always "more performant" is just wrong. It's a bit of a sunk cost fallacy that because the games industry has a lot of (constantly reinvented) experience in performance optimizing C/C++ that they can't get the same or better benefits if they used better languages and higher abstractions. (It's the exact same sunk cost fallacy that a previous games industry generation said C/C++ would never beat hand-tuned Assembly and it wasn't worth trying.)

In Enterprise day jobs I've seen a ton of "high performance" C# with regular garbage collection. Performance optimizing C# and garbage collection is a different art than performance optimizing manually allocated memory code, but it is an art/science that exists. I've even seen some very high performance games written entirely in C# and not "high performance C#" but the real thing with honest garbage collection.

(It's a different art to performance optimize C# code but it isn't even that different, at a high level a lot of the techniques are very similar like knowing when to use shared pools or deciding when you can entirely stack allocate a structure instead of pushing it elsewhere in memory, etc.)

The implication in the discussion above is that a possible huge sweet spot for a lot of game development would actually be a language a lot more like Haskell, if not just Haskell. A lot of the "ECS" abstraction boils away into the ether if you have proper Monads and a nice do-notation for working with them. You'd get something of the best of both worlds that you could write what looks like the usual imperative code games have "always" been written in, but with the additional power of a higher abstraction and more complex combinators than what are often written by hand (many, many times over) in ECS systems.

So far I've not seen any production videogame even flirt with a language like Haskell. It clearly doesn't look anything like C/C++ so there's no imagination for how performant it might actually be to write a game in it (outside of hobbyist toys). But there are High Frequency Trading companies out there using Haskell in production. It can clearly hit some strong performant numbers. The art to doing so is even more different from C/C++ than C#'s is, but it exists and there are experts out there doing it.

Performance is a good reason to do things, but I think the videogames industry tends to especially lean on "performance" as a crutch to avoid learning new things. I think as an industry there's a lot of reason to avoid engaging more experts and expertise in programming languages and their performance optimization methodologies when it is far easier to train "passionate" teens extremely over-simplified (and generally wrong) maxims like "C++ will always be more performant than C#" than to keep up with the actual state of the art. I think the games industry is happiest, for a number of reasons, not exploring better options outside of local maxima and "performance" is an easily available excuse.


I think you may be indexing a bit into your exposure and painting the industry in a broad brush.

I've seen impressive things done with Lua, from literate AI programming with coroutines to building composable component-based language constructs instead of standard OOP. You have things like GOAL[1] which ran on crazy small systems (the Lua I saw ran in a 400kb block as well).

On performance, data oriented design and efficient use of caches is the way you get faster. I've done it in Java, I've done it in C#, I've done it in Rust and C++. Certain languages have better primitives for data layout and so you see gamedev index into them. We used to do things like "in-place seek-free" loading where an object was directly serialized to disk and pointers were written as offsets that were fixed up post load. Techniques like this easily net 10-30x performance benefits. It's the same reason database engines run circles around standard language constructs.

[1] https://en.m.wikipedia.org/wiki/Game_Oriented_Assembly_Lisp


You are correct that I am using a broad brush, but so far it is a broad brush sort of conversation. I realize I'm a (bitter?) cynic at this point and don't have a lot of respect for the videogames industry as a whole from a technical perspective, because it is an industry that prides itself on reinventing wheels, not sharing a lot of efforts between projects, and not trusting nor retaining expertise in the long run. I realize there are a lot of great and novel approaches such as the ones you mention (I appreciate that), but so much of the novelty is siloed and a lot of what I see as a certain kind of outsider is the tiniest slices of things that escaped the siloes such as Unity and Unreal. I realize they aren't accurate to the state of the technical art in some siloes, but these days given the number of games using one or the other of those two common engines today it certainly reflects the "state of the technical median".


Which companies use Haskell for HFT? I know someone who works at a HFT company that switched away from Haskell specifically for performance reasons.


> It's the exact same sunk cost fallacy that a previous games industry generation said C/C++ would never beat hand-tuned Assembly and it wasn't worth trying.

C/C++ hasn't beaten the performance of hand-tuned assembly - it has simply gotten close enough that the cost of hand-tuned assembly is not worth it in most cases.


Seconding this - some esoteric JVM garbage collector tuning is required to build a high performance Java (or Clojure, etc.) system, but it can be done.

It's arguably significantly less work to learn how to tune the GC and then optimizing it for your situation than it is to deal with manual memory allocation and all of its fallout.


If memory serves, a GC that works well (low pause) only needs two or three times the memory of manual allocation, so I'm surprised that developers of PC games didn't switch to GCs 'en masse'.

And no, I'm not joking: I work in C++ and I know exactly how annoying memory errors can be.. Thanks a lot valgrind|ASAN developers!


If your optimization goal is "use the least memory possible," then sure, manual memory allocation is the way to go. I was addressing a different optimization goal: the "high performance" case, meaning approximately "high throughput, low latency operation."

There is a common misconception that GC invariably precludes the construction of a "high performance" system, which is not true. If your use case allows you to not care as much about larger memory consumption -- 2x to 3x does seem like a reasonable first approximation of "larger" -- then GC is indeed a viable option for building "high performance" systems.

This case is not uncommon. Not everyone is targeting a memory constrained console or embedded system.

In many (though of course not all) cases, the tradeoff is well worth it -- consume more memory at runtime, spend some time tuning the GC, and in exchange developers can ship a product faster, by having to spend significantly less time dealing with manual memory allocation.


>If your optimization goal is "use the least memory possible," then sure, manual memory allocation is the way to go. I was addressing a different optimization goal: the "high performance" case, meaning approximately "high throughput, low latency operation."

Ignoring the amount of memory used, GC tuning a managed language doesn't give you the flexibility to control memory layout needed for maximum cache locality.
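Concretely, "control of memory layout" usually means choosing structure-of-arrays over array-of-structures, so the hot loop only pulls the fields it touches through the cache. A minimal sketch (names illustrative):

```cpp
#include <cstddef>
#include <vector>

// Array-of-structures: updating x drags y, dy and any cold fields
// through the cache alongside the data the loop actually needs.
struct ParticleAoS { float x, y, dx, dy; /* ...cold fields... */ };

// Structure-of-arrays: each loop streams one or two contiguous arrays,
// so every cache line fetched is full of useful data.
struct ParticlesSoA {
    std::vector<float> x, y, dx, dy;
    void update(float dt) {
        for (std::size_t i = 0; i < x.size(); ++i) x[i] += dx[i] * dt;
        for (std::size_t i = 0; i < y.size(); ++i) y[i] += dy[i] * dt;
    }
};
```

In a managed language the runtime largely decides object layout for you; this kind of field-by-field control is exactly what it takes away (modulo escape hatches like C# structs or Java's newer value-type work).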

>If your use case allows you to not care as much about larger memory consumption -- 2x to 3x does seem like a reasonable first approximation of "larger" -- then GC is indeed a viable option for building "high performance" systems.

Not ignoring amount of memory used. In the context of this thread--video games specifically "high performance" video games--2x to 3x is almost never going to be acceptable.


I can tell you why this switch didn't happen: 2x to 3x the memory usage is just absolutely abysmal for a process that is barely fitting into memory as it is. Most of the games that run up against these constraints are multiplatform titles targeting consoles that are notoriously stingy with main memory to reduce cost.


>The "rule" that C/C++ is always "more performant" is just wrong.

Who said it's a rule? What C/C++ gets you is the ability to manually allocate memory without jumping through hoops.

> Performance optimizing C# and garbage collection is a different art than performance optimizing manually allocated memory code, but it is an art/science that exists. I've even seen some very high performance games written entirely in C# and not "high performance C#" but the real thing with honest garbage collection.

Performance optimizing C# with garbage collection for high performance soft realtime systems (I've done it) relies on tricks like object pooling to avoid triggering GC along with avoiding many of the more advanced language features. Even then you don't get the same level of control. I'm also almost completely certain that the high performance C# games you're talking about aren't using C# for the engine, but feel free to provide examples so I can take a look.
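The object-pooling trick is the same pattern in any language; a C++ sketch of the idea (in a GC'd language the point is that `acquire`/`release` produce no garbage on the hot path):

```cpp
#include <cstddef>
#include <memory>
#include <vector>

// Free-list object pool: objects are recycled instead of allocated and
// freed per frame, so the allocator (or the GC) is never hit mid-game.
template <class T>
class Pool {
    std::vector<T*> free_;                       // returned objects, ready for reuse
    std::vector<std::unique_ptr<T>> storage_;    // owns every object ever created
public:
    T* acquire() {
        if (free_.empty()) {
            storage_.push_back(std::make_unique<T>());
            return storage_.back().get();
        }
        T* obj = free_.back();
        free_.pop_back();
        return obj;
    }
    void release(T* obj) { free_.push_back(obj); }
    std::size_t allocated() const { return storage_.size(); }
};
```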

If your game (or parts of your game) doesn't need the performance that comes with a higher degree of memory layout control, then by all means use whatever tools you want to.

I've written game logic in C#, F#, Ruby, Haxe, Python, Lua, Java, JavaScript and Elixir.

>The implication in the discussion above is that a possible huge sweet spot for a lot of game development would actually be a language a lot more like Haskell, if not just Haskell.

There almost certainly is for game logic. Many modern game engines provide higher level scripting languages.

However, if what you are working on is in that sweet spot, you likely didn't need an ECS to begin with and a classic component architecture would have probably been a lot easier to deal with.

>But there are High Frequency Trading companies out there using Haskell in production.

HFT is not game dev. "Performance" in HFT doesn't mean the same thing as performance in games.

I haven't used Haskell specifically, but I've toyed with using Elixir for gamedev. Its reliance on linked lists makes it extremely difficult to iterate quickly enough. There are workarounds of course, but the workarounds remove most of what is nice about Elixir in the first place.

>Performance is a good reason to do things, but I think the videogames industry tends to especially lean on "performance" as a crutch to avoid learning new things. I think as an industry there's a lot of reason to avoid engaging more experts and expertise in programming languages and their performance optimization methodologies when it is far easier to train "passionate" teens extremely over-simplified (and generally wrong) maxims like "C++ will always be more performant than C#" than to keep up with the actual state of the art. I think the games industry is happiest, for a number of reasons, not exploring better options outside of local maxima and "performance" is an easily available excuse.

The average engine coder writing high performance code in C++ isn't a "passionate teen". They are experienced software engineers who want to stick as close to the metal as they feasibly can.

The games industry (outside of AAA games) also has an extremely low barrier to entry, and it's something that nearly every programmer has thought about doing at some point--if Haskell turns out to be a fantastic language for making games, it will almost certainly happen sooner or later.


> The average engine coder writing high performance code in C++ isn't a "passionate teen". They are experienced software engineers who want to stick as close to the metal as they feasibly can.

Statistically the median age in the games industry is 25 and always has been. It's a perpetually young industry not known for retaining experienced talent. I know that statistically the median doesn't tell you a lot about how long of a tail there is of senior talent, you need the standard deviation for that, but given what I've seen as mostly an outside observer with a strong interest the burn out rate in the industry remains as high as ever and senior developers with decades of experience are most likely to be an anomaly and an exception that proves the rule than a reality. In terms of anecdata all of the senior software developers I've ever followed the careers of on blogs and/or LinkedIn are all in management positions or entirely different industries after 30. I realize my sample size is biased by the people I chose to follow (for whichever reason) and anecdata is not data, but statistically it's really hard for me to square "experienced software engineers" with "in practice, it looks like no one over 30".


>Statistically the median age in the games industry is 25 and always has been.

Where are you getting this information from? The only hard data I can find is from self selected survey responses, but this survey from IGDA shows only 10% of employed game developers are under 25 [1]. My guess is that (as you've acknowledged is possible) there's some serious selection bias going on. You said you have an interest in burn out rate, so I'm guessing you're more likely to follow/notice game devs who discuss this topic. This group is more likely to be suffering from burn out I'd wager.

Another poster already mentioned that engine devs (the ones writing most of the C++) tend to be older than the industry average.

1. https://s3-us-east-2.amazonaws.com/igda-website/wp-content/u...


In game dev, there has been a really serious split between engine development and game (logic and content) development. Most of the talented and experienced programmers seem to drift towards engine development. That's where the hard problems are and where these guys can have the most impact. As a bonus, engine development cycles are not so closely coupled to game release dates anymore, so crunch is less of an issue in engine teams.


> Focusing on a _code_ design pattern is the antithesis of _data_-oriented design

Doesn't the former enable the latter? Ideally, language (both human and machine) would have the semantics needed to represent all transforms, but that's not the case. Code you rely on, since none of it is written in isolation, needs to enable you to implement data-oriented design should you so choose.

Also, I don't think pointing out that 'all games are essentially...' is particularly useful. It's true, no question, but that doesn't mean it's the most useful mental model for people to use when developing software. Our job as engineers is to make software that functions according to some set of desires, and those desires may directly conflict with approaching an optimal transform.


> Doesn't the former enable the latter?

Not necessarily. ECS is a local maximum when developing a general-purpose game engine. Since it's general purpose, it can do nothing more than provide a lowest-common-denominator interface that can be used to make any game. If you are building a game from scratch why would you limit yourself to a lowest-common-denominator interface when there's no need? Just write the exact concrete code that needs to be there to solve the problem.

> Our job as engineers is to make software that functions according to some set of desires, and those desires may directly conflict with approaching an optimal transform.

All runtime desires of the software must be encoded in the transform. So no software functionality should get in the way of approaching the optimal transform. What does get in the way of approaching the optimal transform is code organization, architecture and abstraction that is non-essential to performing the transform.


> Just write the exact concrete code that needs to be there to solve the problem.

Good luck with that when the exact code to solve the problem is not the exact code the next week, because the problem has changed or evolved.

Not to suggest an ECS is the answer, but this line of thinking is reductive to the realities of creating a piece of art. It's not a spec you can draw a diagram for and trust will be basically the same. It's a creature you discover, revealing more of itself over time. The popularity of the ECS is because it provides accessible composition. It's not the only way of composing data but being able to say "AddX", "RemoveX" without the implementation details of what struct holds what data and what groupings might matter is what makes it appealing.


I think there are two orthogonal things being conflated by you: the flexibility of a solution and how general the solution is.

What you’re basically saying is a solution should be flexible to change because making a game requires trial and error. I totally agree with that.

Using a general solution is one path to flexibility but it does come with a cost associated. It’s flexibility built on a tower of complexity, and if you look at a modern ECS implementation that is performant it’s actually quite a lot of complexity. You’re also reducing flexibility in the sense that these sorts of solutions generally have preferred patterns you need to fit your game design into. So you end up introducing a learning, maintenance and conceptual burden into the project you might not need.

OTOH if you have a specific problem you can write a specific solution for you will end up with less code, hopefully in a conceptually coherent form. That in itself offers flexibility. Simple code you can easily replace is often more flexible than complex code you need to coax into a new form.

The key is to recognise whether the problem you need flexibility for is specific or general.

These architectural patterns are fun to argue over and obsessed over by armchair game developers but are a trap if you’re trying to make a game rather than a general purpose game engine.

Which isn’t to say you don’t want some framework underlying things for all sorts of mundane reasons. But most games could get away with that being an entity type that gets specialised rather than anything more complex.


> I think there’s two orthogonal things being conflated by you.

> It’s flexibility built on a tower of complexity

Agreed with the 'flexibility on a tower of complexity', 100%! :) I was trying to not appear too dogmatic by describing it as 'accessible composition'; generally any solution that is 'accessible' is also broad enough that it has as many flaws as benefits, and an ECS definitely isn't an exception.

> These architectural patterns are fun to argue over and obsessed over by armchair game developers but are a trap if you’re trying to make a game rather than a general purpose game engine.

Again, agreed. Speaking from experience as an iterator and rapid prototyper who has used an ECS for years, and has been bitten by the complexity but hasn't been able to beat the flexibility of being able to just write something like `entity->Add<ScaleAnimation>(...)`, `entity->Add<DestroyAfter>(...)`, `entity->Add<Autotranslate>(...)`, `entity->Add<Sprite>(...)` to quickly and easily create a thing that looks nice, pops in smoothly, moves effortlessly, destroys itself thoughtlessly. It lets you move between ideas quickly and then you can pivot to addressing concerns if any show up.


Yeah for sure, I love a good composable approach to entity creation as well particularly when it’s specified in data rather than code. The basic framework for getting that going is extremely lightweight which is fantastic.


> If you are building a game from scratch why would you limit yourself to a lowest common denominator interface when there's no need?

There is a need: the limits of the human mind. Nobody can model an entire (worthwhile) game in their head, so unless you plan on recursively rewriting the entire program as each new oversight pops up, you aren't going to get anywhere near optimal anyway.


> If you are building a game from scratch why would you limit yourself to a lowest common denominator interface when there's no need? Just write the exact concrete code that needs to be there to solve the problem.

Coming from the realm of someone who has mostly swum in the OO pool their career, I struggle to understand how a concrete implementation of something like a video game wouldn't spiral out of control quickly without significant organization and some amount of abstraction overhead. That said, I have found ECS-type systems to be so general purpose that you end up doing a lot of things to please the ECS design itself rather than focusing on the implementation.

Do you have any examples of games and/or code that are written in more of a data oriented way? I'd really love to learn more about this approach.


While stylistically I don't necessarily agree with him all the time, Casey Muratori's Handmade Hero (https://handmadehero.org/) is probably the most complete resource in terms of videos and access to source code as far as an 'example' goes.


Thank you!


The archetypal 'talk' on Data Oriented Design (in the way GP is talking about it) is Mike Acton's 2014 CppCon keynote.

[1] https://www.youtube.com/watch?v=rX0ItVEVjHc


Thanks!


It's really annoying how many people misunderstand the term 'data oriented design'. Usually to mean something like 'not object oriented programming'. If your data was inherently hierarchical and talking about animals that meow or moo, go ahead and implement the textbook OO modeling.

This Mike Acton post describes it accurately: http://www.macton.ninja/home/onwhydodisntamodellingapproacha...


I do think that in practice a hierarchical ontology like that is still best not modeled as a language-level hierarchy because the language inheritance / hierarchy concepts are often not the exact semantic you want, IME. Esp. if it then ties to static types, since you then can't change the hierarchy at runtime. I think even with a data-oriented approach to a hierarchy -- your code isn't necessarily hierarchically organized, it just handles data that happens to express a hierarchy. And you want to be in control of the semantics of said hierarchy with more freedom (and explicitness) than the language-level hierarchy gives you -- so you want your own code that interprets the hierarchy expressed in the data and performs your desired semantics. This also allows artists and narrative or gameplay / level designers to go see and edit the hierarchy and add elements to it.

An example is the prefab hierarchy you get in Unity, which is expressed through the data (prefabs and their relationships). (Note: I mean specifically the prefab inheritance hierarchy, not the transform spatial hierarchy -- the former has more overlap with the "is a" relationships). The code processing this hierarchy could've just been plain C code that parses the files and maintains an in-memory set of structures about them, even. You then get to define how properties inherit, what overriding means, etc. yourself.
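Interpreting such a data-expressed hierarchy takes very little code. A sketch of property lookup with parent fallback (all names hypothetical, not Unity's actual implementation):

```cpp
#include <optional>
#include <string>
#include <unordered_map>

// A prefab is just data: its local property overrides plus an optional parent.
struct Prefab {
    std::unordered_map<std::string, std::string> props;
    const Prefab* parent = nullptr;
};

// You define the override semantics yourself -- here, nearest ancestor wins --
// instead of inheriting whatever the language's static class hierarchy imposes.
std::optional<std::string> lookup(const Prefab& p, const std::string& key) {
    if (auto it = p.props.find(key); it != p.props.end()) return it->second;
    if (p.parent) return lookup(*p.parent, key);
    return std::nullopt;
}
```

Because the hierarchy lives in data, designers can rewire it at runtime or in an editor without touching this code.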


Totally agreed. The fact that some of these concepts are embedded into the language design (say for C++) are minor conveniences at best - when they almost perfectly line up with the data you have - but just get in the way most of the time.



