Qt's heap and new-"fetishism" (and the consequences for safety relevant software)



  • So Qt uses the heap (free store) and new a lot. Even if you, as a user, try to avoid it (example of user application), you cannot avoid it completely, since it's deeply ingrained in the library itself (example). Just look at low-level events and D-pointers, for example.

    For a black-on-white proof just check out http://doc.qt.io/qt-5/search-results.html?q=heap
    for example: http://doc.qt.io/qt-5/qcoreapplication.html#postEvent
    which states: "The event must be allocated on the heap since the post event queue will take ownership of the event and delete it once it has been posted."

    Why does this matter?
    As an example: in some embedded systems and safety-relevant software, new is frowned upon.
    In safety-critical coding guidelines you may find guidelines such as these:

    • "new shall be used only during startup"
    • "delete shall not be used"

    (The obvious consequence being: unfortunately Qt cannot be used in domains with such guidelines, e.g. safety-critical, or real-time).

    So my question: Why is Qt so heap-"fetishist"?

    • Is it because of D-Pointer?
      There I've read: "The real reason to use d-pointers in Qt is for binary compatibility and the fact that Qt started out closed source" (So heck: is it because of its proprietary roots??)
    • Or was it from the original design decisions (from way-back-when) where it was to be a Desktop Toolkit? (Or particularly a GUI toolkit)

    What would it mean, for Qt to be redesigned
    without new (for the library; -- of course the user can still opt to use new in his own application): possibly it would break binary compatibility (no more D-Pointer), but increase startup performance (and even performance in general)!
    But the cost in terms of changing the code-base would be huge and complex. (well probably).

    I'd be interested in hearing your general thoughts, particularly from long-time Qt'ers who know the library better than me.
    Thanks.


  • Lifetime Qt Champion

    Hi,

    That's a question you should bring to the development mailing list. It's the mailing list about Qt's own development, as opposed to the interest mailing list, which is for people using Qt. However, take the time to search through the archives first; IIRC there was a thread about that (but I can't be 100% sure).


  • Moderators

    Hi @nicesw123,

    There are many reasons for using the heap. Here, I present three:

    1. The stack is tiny

    Visual Studio, for example, provides a default stack size of 1 MB.

    If Qt allocates everything on the stack instead of the heap, programs that use Qt will encounter a stack overflow very quickly. Yes, liberal use of the heap makes Qt unsuitable for resource-starved embedded systems and safety applications. However, if we avoid the heap, Qt will become unsuitable for large and complex applications, scientific computation, and all sorts of other use cases that Qt is currently popular for.

    2. We want to update libraries without breaking binary compatibility

    If you don't use D-pointers, then your data structures are stuck. If you add or remove a single member variable to an exported class/struct, you will break binary compatibility -- this makes it extremely hard to add features to a library, or to refactor the library code. However, D-pointer objects are not exported, so you can add/remove as many members as you want.
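A minimal sketch of the d-pointer (pimpl) idea in plain C++ may make the binary-compatibility point concrete. The names `Counter` and `CounterPrivate` are hypothetical, and this is not how Qt's own macros implement it; it only illustrates why the exported class keeps the same layout when private members change.

```cpp
#include <cassert>
#include <memory>
#include <string>

class CounterPrivate;   // hypothetical private class, only forward-declared in the header

// Public, "exported" class: its only data member is one pointer,
// whatever the private data looks like.
class Counter
{
public:
    Counter();
    ~Counter();                          // defined where CounterPrivate is complete
    void increment();
    int value() const;
private:
    std::unique_ptr<CounterPrivate> d;   // the d-pointer; the data itself lives on the heap
};

// ---- everything below would live in the library's .cpp file ----
class CounterPrivate
{
public:
    int count = 0;
    std::string label;   // members added here later do not change sizeof(Counter)
};

Counter::Counter() : d(new CounterPrivate) {}
Counter::~Counter() = default;

void Counter::increment()
{
    assert(d != nullptr);   // d is set in the constructor and never reset
    ++d->count;
}

int Counter::value() const
{
    return d->count;
}
```

The price of this layout stability is exactly the heap allocation discussed in this thread: every `Counter` costs one `new` for its private data.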

    3. We want a high-level API design

    First, familiarize yourself with the concept of value objects vs. identity objects. Second, keep a clear distinction between heap-allocation of an object's internal data vs. heap-allocation of the object itself. Notice that the only objects Qt users need to heap-allocate are identity objects (e.g. QDialog, QSerialPort). Value objects (e.g. QImage, QPoint) should not be created using new.

    It does not make sense to copy an identity object (can you think of why?). In this case, how would you pass these objects to other functions? The answer is: Pointers. Ok then, now how do you make sure the pointer remains valid after the function that creates them returns? The answer is: Allocate the object on the heap.
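The distinction can be sketched in plain C++ as follows. `Point`, `Connection`, `openConnection` and `nameLength` are hypothetical names chosen for illustration, not Qt classes; the sketch assumes the identity object has to outlive the function that creates it.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>

// Value object: copying is cheap and meaningful, so it can live on the stack.
struct Point { int x; int y; };

// Identity object: it stands for one specific resource, so copying is disabled.
class Connection
{
public:
    explicit Connection(std::string name) : m_name(std::move(name)) {}
    Connection(const Connection &) = delete;
    Connection &operator=(const Connection &) = delete;
    const std::string &name() const { return m_name; }
private:
    std::string m_name;
};

// The object must stay valid after this function returns, so it is allocated
// on the heap and handed out by (owning) pointer.
std::unique_ptr<Connection> openConnection(const std::string &name)
{
    return std::unique_ptr<Connection>(new Connection(name));
}

// Other functions receive the identity object by pointer, never by copy.
std::size_t nameLength(const Connection *c)
{
    assert(c != nullptr);
    return c->name().size();
}
```

A caller would write something like `auto c = openConnection("serial0");` and pass `c.get()` around, while a `Point` is simply copied.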


    There are many other reasons we can go into if you want. Like @SGaist said, post to the Development mailing list; you should get many more perspectives there.

    without new... increase startup performance (and even performance in general)!

    How did you arrive at this conclusion? What numbers have you measured, and what assumptions have you made?



  • Hi JKSH,

    thanks for your thoughts: but to be honest, they feel a bit too status quo for my taste. ;)

    Yes, I'm eyeing embedded safety-critical applications (just for kicks). Who says we can't make Qt work for it? (well, with all the known hiccups that SOUP may cause, and the methods of dealing with them [e.g. using only a subset of Qt, and having it audited])

    One very nice solution would be to have better control over the heap.
    Standard new is often not well-tuned for specific predictable safety-uses. One could instead use

    • placement new (link1, link2) (***)
      or
    • overload operator new; and implement predictable memory pools or other memory algorithms.
      (In safety-critical operations all new allocations have to be predictable [pools etc.], and delete would never be called).

    (***)
    You may be interested to know, that the guideline in the top post reading

    • "new shall be used only during startup"

    has an exception:
    "Exception: Placement-new (with the standard meaning) may be used for memory allocated from stacks."

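For readers unfamiliar with placement new, here is a minimal sketch of what that exception permits: constructing an object in storage that lives on the stack, with no heap allocation involved. `Sample`, `constructIn` and `placementDemo` are hypothetical names used only for this illustration.

```cpp
#include <cassert>
#include <new>       // placement new

struct Sample { int id; double value; };   // hypothetical payload type

// Constructs a Sample inside caller-provided storage; nothing is taken from the heap.
Sample *constructIn(void *storage, int id, double value)
{
    return new (storage) Sample{id, value};
}

double placementDemo()
{
    // Storage on the stack, correctly sized and aligned for one Sample.
    alignas(Sample) unsigned char buffer[sizeof(Sample)];

    Sample *s = constructIn(buffer, 7, 2.5);
    assert(static_cast<void *>(s) == static_cast<void *>(buffer));

    const double result = s->id * s->value;
    s->~Sample();    // explicit destructor call takes the place of delete
    return result;
}
```

The object's lifetime is bounded by the buffer's scope, which is what makes this form of new predictable.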
    Here's a very rudimentary example for overloading operator new

    // c++11

    #include <cstddef>    // std::size_t
    #include <cstdint>    // uintptr_t
    #include <iostream>
    #include <new>        // std::bad_alloc
    
    constexpr uintptr_t alignNum = 128;           // nice fat power of 2  ;)
    alignas(alignNum) char membuf[alignNum*10];   // good thing c++ now has alignas!
    
    char *p = membuf;
    
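    std::size_t offsetToNextAddress(std::size_t n)
    {
      // return smallest number x, such that (x >= n) and (x % alignNum == 0)
      constexpr uintptr_t alignNumM1 = (alignNum-1);
      constexpr uintptr_t allOnes    = uintptr_t(-1);
      return (n + alignNumM1) & (allOnes - alignNumM1);
    }

    // shared bump allocation: hand out the next aligned chunk of membuf,
    // or throw std::bad_alloc once the buffer is exhausted
    void* bumpAllocate(std::size_t n, const char *who)
    {
      const std::size_t left = static_cast<std::size_t>((membuf + sizeof(membuf)) - p);
      if (n > left)
        throw std::bad_alloc();
      const std::size_t step = offsetToNextAddress(n ? n : 1);  // new(0) still needs a unique address
      if (step > left)
        throw std::bad_alloc();
      void *ret = p;
      std::cout << "greetings from custom " << who << ": " << ret << " (" << uintptr_t(ret) << ')'<< std::endl;
      p = p + step;
      return ret;
    }

    // note: dynamic exception specifications such as throw(std::bad_alloc)
    // are deprecated in C++11 and removed in C++17, so they are left out here
    void* operator new(std::size_t n)
    {
      return bumpAllocate(n, "new");
    }

    void  operator delete(void*) noexcept
    {
      // memory is never handed back to the buffer
      std::cout << "greetings from a rather \"lazy\" custom delete" << std::endl;
    }

    void* operator new[](std::size_t n)
    {
      return bumpAllocate(n, "new[]");
    }

    void  operator delete[](void*) noexcept
    {
      std::cout << "greetings from a rather \"lazy\" custom delete[]" << std::endl;
    }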
    
    
    int main()
    {
      {
        int *i = new int{ 3 };
        int *j = new int{ 4 };
        std::cout << *i << std::endl;
        std::cout << *j << std::endl;
        
        delete j;
        delete i;
      }
    
      std::cout << "\n\n";
      
      {
        char *arr1 = new char[129];
        char *arr2 = new char[129];
        
        delete[] arr1;
        delete[] arr2;
      }
    
      return 0;
    }
    
    

    While I'm at it... I can't seem to find std::allocator in the Qt sources. Why on earth not?
    From a brief look, allocators seem to be a fantastic (albeit advanced) mechanism for handling custom memory allocation. (std::allocator is just the one that your compiler ships with, but you can provide your own!)
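As a rough illustration of "providing your own", the following is a minimal sketch of a C++11 allocator that hands out memory from a fixed static arena and never returns it. `ArenaAllocator`, `g_arena`, `g_arenaUsed`, `kArenaSize` and `arenaDemo` are hypothetical names, and the 4096-byte arena size is an arbitrary assumption; a production allocator would need more care (thread safety, reuse of freed blocks).

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <vector>

// Fixed arena that every ArenaAllocator<T> draws from.
constexpr std::size_t kArenaSize = 4096;
alignas(std::max_align_t) unsigned char g_arena[kArenaSize];
std::size_t g_arenaUsed = 0;

template <typename T>
struct ArenaAllocator
{
    using value_type = T;

    ArenaAllocator() = default;
    template <typename U>
    ArenaAllocator(const ArenaAllocator<U> &) {}   // needed for rebinding

    T *allocate(std::size_t n)
    {
        // Round the current offset up to T's alignment, then bump it.
        const std::size_t align = alignof(T);
        const std::size_t start = (g_arenaUsed + align - 1) / align * align;
        if (start > kArenaSize || n > (kArenaSize - start) / sizeof(T))
            throw std::bad_alloc();
        g_arenaUsed = start + n * sizeof(T);
        return reinterpret_cast<T *>(g_arena + start);
    }

    void deallocate(T *, std::size_t) {}   // memory is never handed back
};

// All instances share one arena, so they always compare equal.
template <typename T, typename U>
bool operator==(const ArenaAllocator<T> &, const ArenaAllocator<U> &) { return true; }
template <typename T, typename U>
bool operator!=(const ArenaAllocator<T> &, const ArenaAllocator<U> &) { return false; }

// Usage: a std::vector whose storage comes from the arena instead of the heap.
int arenaDemo()
{
    std::vector<int, ArenaAllocator<int>> v;
    v.reserve(16);                 // one allocation up front
    int sum = 0;
    for (int i = 0; i < 16; ++i) {
        v.push_back(i);
        sum += v.back();
    }
    assert(v.size() == 16);
    return sum;
}
```

Because the allocator is a template parameter of the container, the choice of memory source is made by the application, not baked into the library.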

    It does not make sense to copy an identity object (can you think of why?). In this case, how would you pass these objects to other functions? The answer is: Pointers. Ok then, now how do you make sure the pointer remains valid after the function that creates them returns? The answer is: Allocate the object on the heap.

    Well, partially yes, true. That's what got me thinking about these methods of exercising more control over one's heap. (Ultimately one could use (placement-)new, and have it use a predictable "stack"- or "pool"-based algorithm)

    But pointers and new are not a prerequisite: Most automotive and aviation software does not use new (or malloc) at all. Then how does it work? Well, preallocate all the storage ahead of time (statically), and then pass it to the function, which fills it with data.
    In this light (safety-critical, predictable, efficient, real-time), I find event-posting somewhat 'designed-for-a-different-domain' (desktop). How to do it without new? I can imagine using a large enough ring-buffer, since events are "throw-away objects" anyway (just consumed).
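A minimal sketch of such a ring buffer, assuming every event has the same fixed size and the capacity is chosen at compile time. `Event` and `EventRing` are hypothetical names; this is not how Qt's event queue works, only an illustration of posting and consuming events without new.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

struct Event { int type; int payload; };   // hypothetical fixed-size event

// Fixed-capacity FIFO whose storage is part of the object itself.
template <std::size_t Capacity>
class EventRing
{
    static_assert(Capacity > 0, "the ring needs at least one slot");
public:
    // Copies the event into the next free slot; returns false when the ring is full.
    bool post(const Event &e)
    {
        if (m_count == Capacity)
            return false;
        m_slots[(m_head + m_count) % Capacity] = e;
        ++m_count;
        return true;
    }

    // Copies the oldest event into 'out'; returns false when the ring is empty.
    bool take(Event &out)
    {
        if (m_count == 0)
            return false;
        assert(m_head < Capacity);
        out = m_slots[m_head];
        m_head = (m_head + 1) % Capacity;
        --m_count;
        return true;
    }

    std::size_t size() const { return m_count; }

private:
    std::array<Event, Capacity> m_slots{};
    std::size_t m_head = 0;    // index of the oldest event
    std::size_t m_count = 0;   // number of events currently queued
};
```

An `EventRing<64>` declared as a static object reserves all its memory before the program starts. The trade-off is that a full ring rejects the event, so the caller has to decide what happens then.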

    However, if we avoid the heap, Qt will become unsuitable for large and complex applications, scientific computation, and all sorts of other use cases that Qt is currently popular for.

    No, disagree. You don't need to argue against avoiding (or even removing) heap-usage in the library. The most important point is this: You can still use new in your own application if you like (particularly in dynamic applications, where you don't want to limit yourself with calculations of upper limits of preallocated static storage).

    (Oh and regarding binary compatibility... Qt is freedom-respecting, open-source software. This makes binary compatibility less of an issue, than it once was. The code does thankfully no longer need to be kept proprietary.) The benefits of performance-increase and alignment to other domains (safety, embedded, i.e. away from "typical desktop application") would far outweigh longer compilation times, in my opinion.

    without new... increase startup performance (and even performance in general)!

    How did you arrive at this conclusion? What numbers have you measured, and what assumptions have you made?

    See this https://github.com/mzimbres/rtcpp for some nifty stats and examples (which uses a custom allocator, rt::allocator).


  • Moderators

    @nicesw123 said:

    thanks for your thoughts: but to be honest, they feel a bit too status quo for my taste. ;)

    You're welcome. I'm an engineer, so if it ain't broke, I don't fix it ;)

    Yes, I'm eyeing embedded safety-critical applications (just for kicks). Who says we can't make Qt work for it? (well, with all the known hiccups that SOUP may cause, and the methods of dealing with them [e.g. using only a subset of Qt, and having it audited])

    To clarify: Are you eyeing specific parts of Qt only, or the entire Qt framework?

    I could see some parts of Qt getting fine-tuned for tighter control on memory allocation, but there's no feasible way to move all of Qt to the stack or allocate all its internal memory at startup. Examples:

    • Strings. QString is a core part of Qt, used everywhere by Qt's internal classes as well as by Qt users. The internal data of classes like QDomDocument and QJsonDocument could grow massive quite reasonably.
    • Databases. A query could reasonably grab hundreds of megabytes.
    • Web apps. (Chromium without standard new/delete? Hmm…)

    I don't want Qt to maintain a lifetime, fixed-size pool for these purposes, because the pool could easily be too big or too small.

    P.S. Qt is apparently a well-received SOUP. (Search for the word "soup" on the page. In this case, Kurt Pattyn is using Qt to interface with an X-ray driver)

    One very nice solution would be to have better control over the heap.
    Standard new is often not well-tuned for specific predictable safety-uses. One could instead use

    I'm familiar with placement new. You might be interested to know that QVarLengthArray uses it for low-level, fast allocation, and Qt uses QVarLengthArray internally where the array size is known in advance. Not when the array size is indeterminate, though.

    or

    • overload operator new; and implement predictable memory pools or other memory algorithms.
      (In safety-critical operations all new allocations have to be predictable [pools etc.], and delete would never be called).

    I've heard of this but haven't used it myself. Apparently it's not suitable for libraries?

    "Exception: Placement-new (with the standard meaning) may be used for memory allocated from stacks."

    The keyword here is "stack". Back to Point No. 1 (TM) in my previous post (in huge font). In an embedded system, the (sole) application could essentially dedicate all memory to the stack, but desktop applications get peanuts.

    I can't seem to find std::allocator in the Qt sources. Why on earth not?

    I don't have any experience with allocators, sorry. If you are serious about exploring possibilities, I suggest you discuss this with Qt engineers.

    But pointers and new are not a prerequisite: Most automotive and aviation software does not use new (or malloc) at all. Then how does it work? Well, preallocate all the storage ahead of time (statically), and then pass it to the function, which fills it with data.
    In this light (safety-critical, predictable, efficient), I find event-posting somewhat 'designed-for-a-different-domain' (desktop). How to do it without new? I can imagine using a large enough ring-buffer, since events are "throw-away objects" anyway (just consumed).

    Keep in mind that automotive and aviation software are highly specialized applications, and each addresses one use case extremely efficiently (and safely). They know their own allocation patterns, and how much memory they need. On the other hand, Qt is a general-purpose library that caters for a wide variety of applications, from small standalone embedded systems to sprawling cloud servers.

    So:

    1. As a library author, how would you determine how large is "large enough", and how do you avoid allocating "too much" at startup?
    2. QEvents can be subclassed by users, to create custom events. If I've understood correctly, your proposal would require them to write custom allocators too, is that correct? (And if so, how do you justify the extra effort imposed on your library users?)

    However, if we avoid the heap, Qt will become unsuitable for large and complex applications, scientific computation, and all sorts of other use cases that Qt is currently popular for.

    No, disagree. You don't need to argue against avoiding (or even removing) heap-usage in the library. The most important point is this: You can still use new in your own application if you like (particularly in dynamic applications, where you don't want to limit yourself with calculations of upper limits of preallocated static storage).

    Is that a typo, or did I misunderstand you? As I understand it, you want to remove heap-usage in the library; you want Qt's classes to stop using standard new and delete internally.

    (Oh and regarding binary compatibility... Qt is freedom-respecting, open-source software. This makes binary compatibility less of an issue, than it once was. The code does thankfully no longer need to be kept proprietary.) The benefits of performance-increase and alignment to other domains (safety, embedded, i.e. away from "typical desktop application") would far outweigh longer compilation times, in my opinion.

    Compilation time is not the primary reason for enforcing binary compatibility. Being proprietary vs. free has nothing to do with it either.

    Picture this: A home user clicks "Update" in the Ubuntu Software Centre, and one of the things that happens is that the system Qt libraries get updated. As a result, his photo album app, which relies on Qt, stops working. How do you plan to explain to this user that the reliability of his ("unimportant") app has been reduced to cater for the development of devices that he's unlikely to use?

    If you can think of a way to do away with binary compatibility without inconveniencing end-users, app publishers, distro packagers and other stakeholders, then we can discuss the benefits of breaking it. Otherwise, binary compatibility is here to stay.

    P.S. Whatever you do, don't broach this topic with Linus Torvalds, the father of Linux.

    without new... increase startup performance (and even performance in general)!

    How did you arrive at this conclusion? What numbers have you measured, and what assumptions have you made?

    See this https://github.com/mzimbres/rtcpp for some nifty stats and examples (which uses a custom allocator, rt::allocator).

    Ok, I see that he has almost halved allocation time for populating a set with a million integers, which is impressive. I wonder how this approach fares for other kinds of containers, or non-containers (e.g. internal data structure for 1 object).

