Overcoming the flaws of object-oriented programming - data-oriented design
-
In typical object-oriented practice it is not rare for objects to have multiple unrelated member properties, and it is also not rare for those objects to be processed in separate passes, each of which touches only a subset of those properties.
In this light, the typical approach of creating collections of whole objects doesn't seem very efficient. Given the way computers access memory and the typical size of a cache line, there is a very good chance the cache is being filled with data that is not needed but simply happened to be adjacent, wasting cache capacity and adding stalls and latency to execution.
Even worse is the practice of using polymorphism and allocating objects dynamically, without memory pools or custom allocators. In that case, not only does the cache fill with unneeded data, but because dynamic allocation hands out arbitrary addresses, the prefetchers fail to work adequately as well.
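As a minimal sketch of the pool idea mentioned above (the `Pool` class and its interface are my own hypothetical names, not a standard facility): objects are constructed into one contiguous, pre-reserved block, so their addresses stay predictable for the prefetcher instead of being scattered by individual `new` calls.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// A toy object pool: all instances live in one contiguous buffer.
// Capacity is reserved up front so pointers stay valid as long as
// the pool is not grown past its capacity.
template <typename T>
class Pool {
public:
    explicit Pool(std::size_t capacity) { storage.reserve(capacity); }

    // Construct a new object in place at the end of the buffer.
    template <typename... Args>
    T* create(Args&&... args) {
        storage.emplace_back(std::forward<Args>(args)...);
        return &storage.back();
    }

    std::size_t size() const { return storage.size(); }

private:
    std::vector<T> storage; // contiguous backing store
};
```

The point of the sketch is only the layout: sequentially created objects end up at sequential addresses, which is exactly what a hardware prefetcher can exploit.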
The salvation is to go back to pre-OOP times and opt for data orientation, which seems to be the preferred choice for developing performance-critical applications, operating systems and the like.
Instead of objects containing their own data members, they store only a reference into collections where those data members are laid out sequentially in their own containers, and their member methods return data from those containers. This way the odds of unneeded data ending up on its way to the CPU are reduced, and the odds that data needed in the near future is already on its way are increased. The logical assumption is that this approach will improve prefetcher efficiency, cache hit rates and cache utilization, and will also reduce the latencies involved in both automatic and manual parallelization.
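To make the layout difference concrete, here is a small sketch contrasting the two arrangements (the particle fields are a made-up example): the usual array-of-structures versus the structure-of-arrays that data orientation favors, where each pass streams only the arrays it needs.

```cpp
#include <cstddef>
#include <vector>

// Array-of-structures (typical OOP layout): a pass over positions
// still drags velocities and health through the cache, because they
// share cache lines with the fields the pass actually uses.
struct ParticleAoS {
    float x, y, z;
    float vx, vy, vz;
    int   health;
};

// Structure-of-arrays (data-oriented layout): each field lives in its
// own contiguous container, so every fetched cache line is full of
// data the current pass actually needs.
struct ParticlesSoA {
    std::vector<float> x, y, z;
    std::vector<float> vx, vy, vz;
    std::vector<int>   health;

    // "Pass A": touches positions and velocities only; health arrays
    // never enter the cache.
    void integrate(float dt) {
        for (std::size_t i = 0; i < x.size(); ++i) {
            x[i] += vx[i] * dt;
            y[i] += vy[i] * dt;
            z[i] += vz[i] * dt;
        }
    }
};
```

Both layouts hold the same data; only its arrangement in memory changes, which is the whole point of the approach described above.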
According to "this slide deck":http://harmful.cat-v.org/software/OO_programming/_pdf/Pitfalls_of_Object_Oriented_Programming_GCAP_09.pdf that I found on the subject, the performance improvement ranges from SIGNIFICANT to TREMENDOUS, which is nothing any performance-conscious programmer should overlook.
-
At the risk of fanning the fire, I'll ask... Are you asserting that oop is bad in all cases? I can see where this might apply in some cases of extremely high-performance applications, but I fail to see this as a generalization which applies in most common situations that we as developers run into on a daily basis.
[Edit: ok, I just now saw your last paragraph, which clarified this some]
-
bq. Are you asserting that oop is bad in all cases?
Not by a long shot. But in its current implementation it does appear to have significant drawbacks. It is understandable that the industry doesn't mind, since less efficient execution helps sell all those faster CPUs with more cache, premium RAM and whatnot.
But the problem lies entirely in the implementation of OOP, not in the paradigm itself, which can in fact be implemented in a more execution- and cache-friendly way, the same way C++ already generates a lot of boilerplate behind the scenes for you: virtual functions, function overloading, templates, dynamic memory allocation.
My idea is to keep the OOP paradigm at the programmer-facing frontend but create a flat-model implementation of it behind the scenes, hidden and done automatically for you. You still declare your member variables in your class, so the concept of encapsulation is not broken, and it doesn't really matter where member variables physically reside: they can still be private and well protected. After all, member methods don't exist in the actual class instance either; that is just an illusion. Behind the scenes each one is a global function, and the compiler passes the invoking instance as a hidden parameter.
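The idea of keeping the OOP surface while relocating the storage can be sketched by hand today (the `Enemy` class and its single `hp` field are hypothetical names of mine): the instance holds nothing but an index, the actual member data lives in a shared sequential container, and accessors keep it just as encapsulated as a regular private field.

```cpp
#include <cstddef>
#include <vector>

// The object is just a handle; its "member variable" lives in a
// shared contiguous container. Callers use ordinary accessors and
// never learn where the data actually resides.
class Enemy {
public:
    static Enemy create(float hp) {
        s_hp.push_back(hp);
        return Enemy(s_hp.size() - 1);
    }

    float hp() const        { return s_hp[m_index]; }
    void  damage(float amt) { s_hp[m_index] -= amt; }

private:
    explicit Enemy(std::size_t i) : m_index(i) {}

    std::size_t m_index;            // the only per-instance state
    static std::vector<float> s_hp; // hp for ALL enemies, contiguous
};

std::vector<float> Enemy::s_hp;
```

A compiler-generated version of this pattern is essentially what the post proposes: the programmer writes an ordinary class, and the flat storage plus the index indirection are produced automatically.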
Naturally, there are plenty of scenarios where the overhead is irrelevant, GUI code for example, where performance is not critical. But cache pollution affects the entire system, not just the performance of a single process. WE ARE LUCKY that the operating systems we all use are NOT written in the OOP paradigm. But we all run plenty of software that is (in fact, most of it is), and so the small CPU cache fills with data that is not needed, leaving no room for the data that is, while cache misses and RAM accesses are expensive. Those are resources shared by all running processes and threads, so imagine that same 2-4x performance improvement applied across the entire system: nothing to sneeze at...
I mean, there is a way to eliminate all that overhead while still retaining the concept of OOP. As a C++ fan I've always been skeptical of OOP criticism, even when it comes from big names like Linus Torvalds:
bq. C++ is a horrible language. It's made more horrible by the fact that a lot of substandard programmers use it, to the point where it's much much easier to generate total and utter crap with it.

It is true that DOD is hard and complex, requiring at least somewhat detailed knowledge of how the machine's underlying hardware operates, and still it shocks me to realize that the many books and DVDs on programming I've been watching do not even touch the subject, considering the tremendous performance and efficiency it brings. Performance and efficiency are why I chose C++, and it wasn't until I got curious about how all the OOP concepts were implemented back in the days of good old plain C that I realized that, as intuitive as OOP is, it goes against the way the machine works.
BUT that doesn't mean we can't have the best of both worlds. We just put the extra work into the compiler: we still declare our classes and put our member variables inside; the only difference is that we specify which ones we want stored with the actual object and which ones we want in a flat, sequential memory model shared by all objects. Naturally, that would require some rethinking and reworking of polymorphism, but considering the improvement this can bring, I think it is well worth the investment.
Considering that most member data is retrieved through accessors, it wouldn't even make much of a difference to client code. The difference appears when you start processing your objects in passes: pass A will feed only pass-A-related data to the CPU cache, and passes B, C and D will do the same, whereas with OOP as it is today, every pass loads data it doesn't need, since cache lines are considerably bigger than a primitive. Not to mention it makes auto-vectorization much easier, and the contiguous nature of the data results in less latency when preparing SIMD instructions.
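A small sketch of the pass-per-field point above (the `World` fields and the two pass functions are invented for illustration): each pass streams exactly one array, so every fetched cache line is fully used, and the trivial loop bodies are exactly the shape compilers auto-vectorize.

```cpp
#include <cstddef>
#include <vector>

// Each logical "pass" gets its own contiguous array of data.
struct World {
    std::vector<float> heat;  // touched only by pass A
    std::vector<int>   score; // touched only by pass B
};

// Pass A: streams heat[] and nothing else; score[] never pollutes
// the cache while this pass runs.
void cool_down(World& w, float factor) {
    for (float& h : w.heat) h *= factor;
}

// Pass B: streams score[] and nothing else.
long long total_score(const World& w) {
    long long sum = 0;
    for (int s : w.score) sum += s;
    return sum;
}
```

With the members interleaved inside whole objects instead, both passes would pull both fields through the cache on every iteration, which is exactly the waste described above.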
All of this can already be done by hand in C++, or in any other OOP language for that matter, but it could also be done automatically, like so many other things the compiler handles for us.
I even came up with a catchy acronym for this paradigm - DOOP - Data Oriented Object Programming :)