How can QChar::LastValidCodePoint fit inside an unsigned short int?



  • I was just looking through the QChar documentation and saw this member of QChar::SpecialCharacters. Its value is, not unreasonably, 0x10FFFF, but then I remembered that, and I quote:

    "The QChar class provides a 16-bit Unicode character."

    "In Qt, Unicode characters are 16-bit entities without any markup or structure. This class represents such an entity. It is lightweight, so it can be used everywhere. Most compilers treat it like a unsigned short."

    However, an unsigned short integer only has to be at least 16 bits wide according to Wikipedia, which directs the curious to the ISO/IEC 9899:1999 specification, TC3 (PDF), p. 22, § 5.2.4.2.1 Sizes of integer types <limits.h>. So how can Qt guarantee that a (lightweight) QChar can contain 0x10FFFF when the underlying type may only be big enough to hold 0xFFFF?

    I thought the whole reason QString has to be handled so carefully with non-BMP characters was the High and Low Surrogates in there, the worst of both worlds that is UTF-16. If a QChar can hold all 21 bits needed to represent a Unicode code point such as the highest legitimate one, why do we not have a QWideString class that works on simple arrays of them to work directly with UTF-32?
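For readers unfamiliar with the surrogate mechanism mentioned above, here is a minimal sketch of the standard UTF-16 arithmetic in plain C++ (no Qt needed). The function names are hypothetical and chosen for illustration; they are not Qt API, although QChar offers static helpers with a similar purpose.

```cpp
#include <cstdint>

// Hypothetical helpers (not Qt API) illustrating the UTF-16 surrogate arithmetic.

// Code points above the BMP (U+10000..U+10FFFF) do not fit in one 16-bit unit.
constexpr bool needsSurrogatePair(char32_t cp) {
    return cp >= 0x10000;
}

// Upper 10 bits of (cp - 0x10000), offset into the high-surrogate range.
constexpr std::uint16_t highSurrogateOf(char32_t cp) {
    return static_cast<std::uint16_t>(((cp - 0x10000) >> 10) + 0xD800);
}

// Lower 10 bits of (cp - 0x10000), offset into the low-surrogate range.
constexpr std::uint16_t lowSurrogateOf(char32_t cp) {
    return static_cast<std::uint16_t>(((cp - 0x10000) & 0x3FF) + 0xDC00);
}

// Inverse operation: rebuild the code point from a valid surrogate pair.
constexpr char32_t fromSurrogatePair(std::uint16_t hi, std::uint16_t lo) {
    return 0x10000
         + ((static_cast<char32_t>(hi) - 0xD800) << 10)
         + (static_cast<char32_t>(lo) - 0xDC00);
}

// The highest legitimate code point needs two 16-bit units.
static_assert(highSurrogateOf(0x10FFFF) == 0xDBFF, "high surrogate of U+10FFFF");
static_assert(lowSurrogateOf(0x10FFFF) == 0xDFFF, "low surrogate of U+10FFFF");
```

So 0x10FFFF is stored in a QString as the two code units 0xDBFF and 0xDFFF, never as a single 16-bit value. For working with whole code points, QString::toUcs4() and QString::fromUcs4() convert to and from UTF-32.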


  • Moderators

    @SlySven I didn't check the QChar implementation, but I guess QChar uses int/uint internally. Many methods in QChar expect or return uint.



  • @jsulm said in How can QChar::LastValidCodePoint fit inside an unsigned short int?:

    but I guess QChar uses int/uint internally

    it does not: https://code.woboq.org/qt5/include/qt/QtCore/qchar.h.html#87

    looks like it just gets truncated to 0xFFFF
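The truncation can be reproduced without Qt. This is only a sketch under the assumption that the constructor narrows the enum value to a 16-bit member, as the linked header suggests; `Char16Sketch` and `SpecialCharacterSketch` are hypothetical stand-ins, not the real Qt types.

```cpp
#include <cstdint>

// Hypothetical stand-in for QChar::SpecialCharacter (only one value shown).
enum SpecialCharacterSketch : std::uint32_t {
    LastValidCodePointSketch = 0x10FFFF
};

// Hypothetical stand-in for a QChar-like class with a single 16-bit member.
struct Char16Sketch {
    std::uint16_t ucs;

    // Assumption: the enum value is narrowed to 16 bits on construction,
    // so only the low 16 bits of 0x10FFFF survive.
    constexpr explicit Char16Sketch(SpecialCharacterSketch s)
        : ucs(static_cast<std::uint16_t>(s)) {}
};

// 0x10FFFF & 0xFFFF == 0xFFFF
static_assert(Char16Sketch(LastValidCodePointSketch).ucs == 0xFFFF,
              "narrowing keeps only the low 16 bits");
```

In other words, the enum constant itself can hold 0x10FFFF because it is not stored in 16 bits; only an object built from it is limited to 16 bits.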

