Storing QString constants without global static non-pod values



  • In my library I have many string constants (100+), at least some of which are used frequently at runtime. Currently all of these strings are stored like this:

    // Constants_p.h:
    extern const char* XMLNS_SASL;
    
    // Constants.cpp:
    const char* XMLNS_SASL = "urn:ietf:params:xml:ns:xmpp-sasl";
    

    This is nice because each constant is stored only once in the whole library, but when serializing XML using QXmlStreamWriter I always need to convert the constants to QStrings. This causes unnecessary runtime overhead since every time they're used the constants are converted from UTF-8 to UTF-16.

    So it would be better to directly store the strings as QStrings. I could also use QStringLiteral(), so they're converted to UTF-16 at compile-time already. This would look like this:

    // Constants_p.h:
    extern const QString XMLNS_SASL;
    
    // Constants.cpp:
    const QString XMLNS_SASL = QStringLiteral("urn:ietf:params:xml:ns:xmpp-sasl");
    

    The problem now is that the CTORs of the constants are all called at library load time, before main(), because QString has no trivial CTOR.
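
    To make the load-time cost concrete, here is a minimal sketch in plain C++ without Qt. Text is a hypothetical stand-in for QString (any type with a non-trivial constructor behaves the same way), and the counter exists only to make construction observable. The namespace-scope constant is constructed at start-up, while the function-local static is constructed on first use:

    ```cpp
    #include <cassert>
    #include <string>

    // Counts how many Text objects have been constructed so far.
    // A plain int with a constant initializer is set up before any constructor runs.
    int g_constructed = 0;

    // Hypothetical stand-in for QString: a type with a non-trivial constructor.
    struct Text {
        std::u16string data;
        explicit Text(const char16_t *s) : data(s) { ++g_constructed; }
    };

    // Namespace-scope constant: in practice its constructor runs at start-up,
    // before main() (or when the shared library is loaded).
    const Text XMLNS_SASL_EAGER(u"urn:ietf:params:xml:ns:xmpp-sasl");

    // Function-local static: constructed the first time the function is called,
    // and only once (thread-safe since C++11).
    const Text &xmlnsSaslLazy()
    {
        static const Text value(u"urn:ietf:params:xml:ns:xmpp-sasl");
        return value;
    }
    ```

    The lazy variant moves the cost from load time to the first call and adds a small guard check on every later call, so it trades one overhead for another rather than removing it.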

    What would be the right way to solve this problem?

    Things I thought about:

    1. Clazy instructs you to use Q_GLOBAL_STATIC in such a case. This makes sense for constructing QObject-based stuff, but would be total overkill for QStrings, I'd guess.
    2. You could store the strings as char16_t * using UTF-16 literals (u"test"). When using such a constant you could construct the QString using QString::fromUtf16(). The documentation says that this function is slow and makes a deep copy, though. Not sure if that's even better than the current way.
    3. Alternatively, QString(const QChar *unicode) or QString::fromRawData() could be used instead. The problem here is that defining a QChar * isn't very nice, because you need to create it QChar by QChar: { u'A', u'B', ... }. Or is there something I'm overlooking here?
    4. Finally, you could probably do something like what QStringLiteral does internally using QArrayData. The QString would need to be created from it on every use (AFAIK this would use implicit sharing and wouldn't perform a deep copy?).
    5. A totally different approach would be to use #define XMLNS_SASL QStringLiteral("..."), but this would probably cause the binary size to grow a lot, because the deduplication in most compilers is probably not smart enough for QStringLiterals.
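
    As a sketch of options 2 and 3: the UTF-16 data itself needs no constructor, because a u"..." literal can be stored as a constexpr char16_t array instead of being spelled out QChar by QChar. The block below uses only the standard library (C++17); the names XMLNS_SASL_UTF16 and xmlnsSasl() are made up for illustration. Wrapping the data in a QString at the point of use is only shown as a comment, and is an assumption about how this could be wired into the library, not something measured in this thread:

    ```cpp
    #include <cassert>
    #include <string_view>

    // UTF-16 data stored once and constant-initialized: no constructor runs at load time.
    constexpr char16_t XMLNS_SASL_UTF16[] = u"urn:ietf:params:xml:ns:xmpp-sasl";

    // Non-owning view over the literal; the length excludes the terminating NUL.
    constexpr std::u16string_view xmlnsSasl()
    {
        return std::u16string_view(XMLNS_SASL_UTF16,
                                   sizeof(XMLNS_SASL_UTF16) / sizeof(char16_t) - 1);
    }

    // Both properties are checked at compile time.
    static_assert(xmlnsSasl().size() == 32);
    static_assert(xmlnsSasl().front() == u'u');

    // With Qt, a QString could be created from the same data without copying, e.g.
    //   QString::fromRawData(reinterpret_cast<const QChar *>(XMLNS_SASL_UTF16), 32);
    // Assumption: the array outlives every QString created this way, which holds
    // for a constant with static storage duration.
    ```

    Whether creating the QString wrapper on every use is cheaper than one load-time constructor per constant would still need to be measured.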

    Maybe I'm over-complicating things here or just haven't found the "correct" way.


  • Lifetime Qt Champion

    @Linus-Jahn Hi,

    This causes unnecessary runtime overhead since every time they're used the constants are converted from UTF-8 to UTF-16.

    Indeed. Does it matter? Well, that depends. You should profile your app to see what you gain with any of the other steps. Nobody can tell you without this measurement data.

    Regards


  • Lifetime Qt Champion

    Hi
    I don't think it can get better than
    const QStringLiteral
    and have it be constructed once at load time.


  • Qt Champions 2019

    @mrjj said in Storing QString constants without global static non-pod values:

    and have it be constructed once at load time.

    There is nothing to construct in a QStringLiteral - at least not really.
    One disadvantage is that two occurrences of e.g. QStringLiteral("foo") will not be merged into a single memory location, as is done for QLatin1String (or const char*).

    But as @aha_1980 already said, profiling is the better first step here.



  • Thanks for your replies.

    It probably doesn't matter at all, because the only result will be that the library is a few microseconds faster at runtime / at load time.

    The reason is probably more psychological than a real performance issue. I'd like to write clean code, do things the ideal way and not see warnings when running e.g. clazy on my code. When it comes to performance, there are probably other issues with a greater impact.


    I actually tested it now:

    Test 1: Runtime usage

    #include <QString>
    
    const char *xmlns = "urn:ietf:xml:ns:xmpp-sasl";
    // OR (use only one of the two definitions per build):
    // const QString xmlns = QStringLiteral("urn:ietf:xml:ns:xmpp-sasl");
    
    int main()
    {
            for (int i = 0; i < 1e9; i++) {
                    // Option A:
                    QString text(xmlns);
                    // Option B: appending 'a'
                    QString xmlnsPlusA = QString(xmlns) + 'a';
            }
    }
    
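    |                                                   | QStringLiteral | char * |
    |---------------------------------------------------|----------------|--------|
    | A: QString(), QStringLiteral is implicitly shared | 0.5 s          | 52 s   |
    | B: appending 'a'                                  | 27 s           | 91 s   |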

    -> The relative difference is large, but the absolute numbers are not relevant: these are totals for 10^9 iterations, so you would need to convert about 500 kB (about 20k comparable strings) before the conversion takes more than 1 ms and becomes a performance problem in a GUI application.
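
    The 500 kB / 20k figures follow from the Option A measurement for char * above. A small sketch of the arithmetic, assuming the 52 s are spread evenly over the 10^9 iterations and that the cost scales linearly with the amount of text (both are simplifications):

    ```cpp
    #include <cassert>
    #include <string_view>

    // Measured above: 1e9 conversions of the test string from char * took 52 s.
    constexpr double totalSeconds = 52.0;
    constexpr double iterations = 1e9;
    constexpr std::string_view testString = "urn:ietf:xml:ns:xmpp-sasl"; // 25 characters

    // Cost of one conversion in nanoseconds (about 52 ns).
    constexpr double nsPerConversion = totalSeconds / iterations * 1e9;

    // Number of such conversions that fit into a 1 ms budget (roughly 19,000).
    constexpr double conversionsPerMs = 1e6 / nsPerConversion;

    // Amount of 8-bit input converted within that budget, in kB (roughly 480 kB).
    constexpr double kBPerMs = conversionsPerMs * testString.size() / 1000.0;
    ```

    Rounded up, that gives the "about 20k strings" and "about 500 kB" quoted above.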

    Test 2: Start-up time

    I also created a test binary with 10k global static constants (all with the same content):

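    |                         | QString (cast from char *) | QStringLiteral | char * |
    |-------------------------|----------------------------|----------------|--------|
    | compile time            | 3 s                        | 3.5 min        | < 1 s  |
    | binary size (stripped)  | 410 kB                     | 1270 kB        | 320 kB |
    | time to run (10k times) | 33 s                       | 23 s           | 19 s   |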

    I guess the QStringLiteral binary takes longer to run than the char * one mainly because it is larger, and only partly because of the non-POD CTORs.

    -> Using QStringLiteral has a slightly larger impact on the execution time than I expected, but this is still not relevant, since you usually have fewer than 10k global static strings and you only load the binary once instead of 10,000 times. As long as you're not developing for an Arduino, this probably has no relevance.
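
    Dividing the totals from the start-up table by the 10k runs and the 10k constants gives per-launch and per-constant numbers. This is only a rough sketch: it attributes the whole difference to the constants themselves, although the binaries also differ in size, and the helper names are made up for illustration:

    ```cpp
    #include <cassert>

    constexpr int runs = 10000;       // each binary was started 10k times
    constexpr int constants = 10000;  // global constants per binary

    // Total wall time for all runs, in seconds (from the start-up table).
    constexpr double secQStringFromChar = 33.0;
    constexpr double secQStringLiteral = 23.0;
    constexpr double secCharPtr = 19.0;

    // Start-up time of a single run, in milliseconds.
    constexpr double msPerRun(double totalSeconds)
    {
        return totalSeconds * 1000.0 / runs;
    }

    // Extra cost per constant relative to the plain char * binary, in nanoseconds.
    constexpr double extraNsPerConstant(double totalSeconds)
    {
        return (msPerRun(totalSeconds) - msPerRun(secCharPtr)) * 1e6 / constants;
    }
    ```

    With these inputs a single launch takes about 2.3 ms with QStringLiteral versus about 1.9 ms with char *, i.e. roughly 40 ns of extra start-up time per constant.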


    As expected, using QStringLiteral makes the binary a bit larger and the start-up time slightly longer, but it is faster than converting from ASCII every time at runtime.

    I'll see if I can do anything with QArrayData, not because it's important, just for fun.